I have been longing to write this blog entry since October last year, after it was announced that the latest Nobel Memorial Prize in Economic Sciences went to Abhijit Banerjee, Esther Duflo, and Michael Kremer. In a previous blog entry where I shared my immediate reactions to this announcement, I briefly explained how the latest award felt the most relatable to me in many years, as well as how it may help to bring development economics back into academic and public relevance. With this blog entry, I intend to draw on my own work and experiences to reflect on the practical aspects of primary data collection in the policy world.
What prompted me the most to share my reflections was reading “The Economist as Plumber” (2017) by Esther Duflo. Drawing on her own experiences of facing the practical challenges of evaluating development programs with governments and NGOs around the world, from the perspective of an academic economist, Duflo makes four main points in her paper that I found the most relatable to my own experiences: 1) plumbing: attention to seemingly irrelevant, yet crucial details of policy implementation; 2) data: the importance of high-quality primary data for guiding policy designs and solutions; 3) inertia among government officials and bureaucrats; and 4) experimentation: never stop experimenting and pushing for more rigorous evidence in the face of resistance, mistakes, and failures. While “policy” and “program” may be used interchangeably, the term “program” here refers to the specific interventions that take place on the ground under a policy, namely the policy implementation.
For many people, the observations above may come across as so obvious and general that they do not require much further discussion. Such issues, one might assume, will resolve themselves without anyone needing to be extra attentive to what is truly happening on the ground. But such confidence is precisely the overall point of the paper: these issues are obvious, yet they are still often neglected while programs are being implemented, with devastating implications for their potential success.
The first main point is the call for economists to adopt the mindset of a plumber. There are two different kinds of plumbing in policy design, according to Duflo. The first is the “design of the tap” work: taking care of apparently irrelevant details, such as the way the policy is communicated or the default options offered to customers. The second is the “layout of the pipes” work: important logistical decisions that are fundamental to the policy’s functioning, but often treated as purely mechanical, such as the way money flows from point A to point B, or which government worker has sign-off on which decisions. In other words, the fact that simple “plumbing measures” can work wonders demands, in turn, careful attention to the details of how policy interventions are carried out in reality:
“Details that we economists might consider relatively uninteresting are in fact extraordinarily important in determining the final impact of a policy or regulation, while some of the theoretical issues we worry about most may not be that relevant. (…) So an economist who cares about the details of policy implementation will need to pay attention to many details and complications, some of which may appear to be far below their pay grade (e.g. the font size on posters) or far beyond their competence level (e.g. the intricacy of government budgeting in a federal system).”Duflo (2017), p. 2-3
On the surface, the call for economists to adopt a “plumbing mindset” is reminiscent of other similar calls, for example Alvin Roth’s call for economists to adopt an “engineering approach” to market design in a lecture in 1999:
“Market design involves a responsibility for detail, a need to deal with all of a market’s complications, not just its principal features. Designers therefore cannot work only with the simple conceptual models used for theoretical insights into the general working of markets. Instead, market design calls for an engineering approach”.Duflo (2017), p. 4
Both the plumber and the engineer economist emphasise the need to pay attention to the details of the environment in which general principles and assumptions are being applied. Both types of economists are the antithesis of what far too many people associate “economists” with: instead of modelling the agents’ preferences for the sake of the assumptions of a theorem, models are to be modified according to the empirically observed preferences of the agents. The “credibility revolution” in economics has been helpful to this change.
But while both types are typically known to be more open than others to unconventional tools for solving and navigating through – as opposed to dismissing the existence of – the idiosyncratic characteristics of an environment in which general principles and assumptions do not perfectly hold, the plumber economist usually has to navigate a far more uncertain environment: the real world. Duflo argues that “unknown unknowns” are particularly rife in the real world, where solutions often depend on a range of factors that cannot be easily identified, quantified and dealt with analytically, but which will arise anyway during implementation.
“The fundamental difference between an engineer and plumber is that the engineer knows (or assumes she knows) what the important features of the environment are, and can design the machine to address these features in the abstract, at least. (…) When the plumber fits the machine, there are many gears and joints, and many parameters of the world that are difficult to anticipate and will only become known once the machine grinds into motion”.Duflo (2017), p. 5
What are the typical “plumbing” issues occurring in the field, and what are the gains from being attentive to such issues? Among other programs, the paper highlights two where the neglect of “obvious” implementation details on the ground almost caused these expensive programs to fail utterly. In 2007, a program aimed at improving home water connections and sewage infrastructure in the two Moroccan cities of Tangiers and Tetouan was launched by Amendis, a local firm. One “carrot” measure to attract as many households as possible into the program was to offer interest-free loans enabling the households to cover the marginal costs of newly installed water connections. But for some time, the program struggled with a take-up rate below 10%. A major, unresolved plumbing issue was the administrative and logistical barrier households faced in registering for the program: for many of them, the nearest municipal office to complete the registration was located too far from their homes.
The take-up rate rose to 69% only after the program’s field staff took the initiative to directly help the households with photocopying and delivering the required registration documents to the municipal offices. From my own experience, not many programs in a similar situation would have been willing to allocate extra budget for such vital, yet simple procedural assistance. And this is where programs start to become paralysed, and in the end fail. One can speculate that large sums of foreign aid poured into various programs around the world may have failed to make any significant impact due to the lack of attention to the details of implementation, in this case the administrative steps required for the households to sign up to the program. I will come back to this point and argue that this partly stems from inertia among program implementers.
In 1998, the Indonesian government launched its largest rice subsidy program yet (“Beras untuk Rumah Tangga Miskin (Raskin)” or “Rice for the Poor”), as a social policy response to the Asian financial crisis. In a single year, the program could spend up to US$1.7 billion of the government budget, yet less than half of the rice reached eligible households. In the early years of this program, the plumbing issue was: how to make eligible households truly aware of their actual entitlements under this program? It was a matter of informational and legal empowerment. But while such terms sound grandiose, the plumbing measure was simpler: delivering physical identification cards directly to the households. This helped mitigate the problems of i) low take-up and low trust toward the authorities among the households, and ii) leakages and embezzlement among local officials, who are often in charge of implementing and overseeing programs vetted by the central government.
The program has been scaled up ever since, with physical identification cards so far having been delivered to 60 million Indonesian households. According to an evaluation study from 2012, eligible households in the treatment villages were found to have received a 26% increase in rice subsidy, driven by both higher quantities received and lower copay prices, after they were made more aware of their entitlements by the physical identification cards.
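To make the arithmetic behind such an evaluation concrete, here is a minimal sketch of a treatment-vs-control comparison of subsidy received per household. All numbers, and the framing as a simple difference in means, are illustrative assumptions of mine, not the actual data or estimation strategy of the 2012 study:

```python
# Minimal sketch of a treatment-vs-control comparison of rice subsidy
# received per eligible household. All figures are made-up illustrations,
# NOT the actual Raskin evaluation data.

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

# Hypothetical monthly subsidy value received per eligible household
# (in thousands of rupiah), by village group.
control_subsidy = [20, 25, 18, 22, 24, 21]    # villages without ID cards
treatment_subsidy = [26, 30, 25, 28, 29, 27]  # villages with ID cards

# Difference in means between treatment and control villages.
effect = mean(treatment_subsidy) - mean(control_subsidy)
pct_increase = 100 * effect / mean(control_subsidy)

print(f"Average treatment effect: {effect:.1f}")
print(f"Relative increase: {pct_increase:.0f}%")
```

A real evaluation would of course add controls, standard errors and checks for differential attrition, but the core comparison between randomly assigned treatment and control villages is exactly this simple.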
Many programs in Indonesia are not only highly exposed to plumbing issues, but also (and perhaps most gravely) threatened by negligence and a lack of willingness among implementers to recognise and solve plumbing issues down to the very bottom. Indonesia is a vast archipelago, and hence has to be run by a vast bureaucracy and public administration. Government decentralization is a huge topic, as the country struggles with weak coordination across its central and local governments and ministries. One of the most prevalent plumbing issues in Indonesia is therefore, for instance, how money is to be transferred from the center, where development financing mostly comes from, to the local outposts while minimising the costs of red tape and leakages.
The second main point of the paper is the importance of (primary) data collection and evaluation. For many empirical development economists, especially those who predominantly run randomized controlled experiments to evaluate programs, large sums of money and effort are usually allocated to primary data collection alone. It is typically in this activity that the plumber mindset truly comes into effect, given the need to notice details and monitor what is really happening in the field during implementation. It is a highly expensive and resource-exhausting enterprise, not just in terms of time and money, but also psychologically, in terms of retaining your own motivation and that of the people in the field.
As part of broader calls to end “data deprivation”, the now-former Chief Economist of the World Bank, Pinelopi Goldberg, an academic development economist by profession, recently announced that the bank’s World Development Report in 2021 would be dedicated to the role of “Data for Development”: how can high-quality development data help inform development policy and improve the lives of poor people? The bank claims that more data was created in 2015 and 2016 than in all previous years combined, largely as a result of rapid changes in how data is created, collected, managed, curated, analysed and used from an expanding variety of sources: mobile phones, social media, data-driven companies, higher-quality government statistics and so forth.
In a working paper explaining the ongoing economic slowdown in India (“India’s Great Slowdown: What Happened? What’s the Way Out?”), the economists Arvind Subramanian and Josh Felman called for a “big data revolution” in India’s data collection and accounting practices so as to raise the quality and reliability of government statistics. Only in this way can the slowdown be properly understood and inform future crisis-preventive policies. A recent opinion piece in the Financial Times depicted the situation of many African governments today not knowing enough about their own citizens: “How many people are there in Nigeria? What is the unemployment rate in Zimbabwe? How many people in Kibera, a huge informal settlement in Nairobi, have access to healthcare? The answers to such basic questions are: we don’t really know”. In this regard, Mo Ibrahim, a Sudanese billionaire, believes precisely that data is the missing SDG.
While data alone cannot fix fundamental development policy issues, it nevertheless plays an essential role in pushing for greater policy transparency and accountability of programs and foreign aid that are wasteful and, truthfully, not really working well. I particularly share the following concerns expressed by Michael Kremer, one of last year’s Nobel laureates:
“A new medication goes through the rigor of an RCT even when it only affects a few hundred thousand people. So, it is crazy that we spend billions of dollars on policies and programs affecting hundreds of millions of people with nothing close to the same level of evidence on effectiveness or lack thereof”.Michael Kremer in Hindustan Times (2019)
Nevertheless, and indisputably so, we must constantly remind ourselves that primary data collection is a worthwhile investment for generating more rigorous evidence and guiding better program practices. But as we confidently remind ourselves, another big challenge is to convince other people, especially government officials, bureaucrats, and other development practitioners, that it is after all worth the huge effort. Why? Put simply, the opportunity costs of not making such efforts are greater: without high-quality data and evidence, how would we truthfully know whether a program has succeeded?
One debate concerns what counts as “rigorous” methods for primary data collection and evaluation, depending on the nature of programs. But that debate already assumes that everyone steadfastly agrees that primary data collection and evaluation are absolutely necessary. A more alarming question is whether everyone in the real world actually believes so. In fact, I have come across far too many implementers trying to convince me of program success based on loads of unverifiable individual stories, in ways that rather come across as attempts to cover up for meager program success. While “anti-intellectual, evidence-free” sentiments are nothing new in society, I was naïve enough in the past not to anticipate that such sentiments exist even in the policy world.
In Indonesia, much of my own work has been dedicated to pushing vigorously for primary data collection and evaluation to earn greater priority in terms of program funding and planned activities. Similar to the depiction of the state of data and statistics in many African countries today, even some international organisations themselves do not always have the answers to the most basic questions about their programs. An increasingly common answer is: “We don’t know. There is limited data”. Instead of doing anything about it, they would prefer to sit back and rely on whatever anecdotal evidence they manage to put forward, without having made any respectable effort to produce their own, more convincing evidence.
This is a classic example of inertia: an astonishing lack of enthusiasm and an unwillingness to at least try to solve a problem, complemented by a disdain for evidence that may go against the good intentions or pre-determined outcome targets behind programs (e.g. see “Development Projects and Economic Networks: Lessons from Rural Gambia“). The recent internal row within the World Bank, in which its research staff were allegedly not allowed to publicise a working paper revealing that foreign aid may have inadvertently led to increased deposits in offshore financial havens by recipient countries (“Elite Capture of Foreign Aid: Evidence from Offshore Bank Accounts”), is an example of how researchers engaging with the policy world constantly have to justify their pursuit of evidence and truth.
Surely, data deprivation is very common in countries where the informal sector occupies a large share of the economy. But again, that should not be an excuse to do nothing about it. No matter how large a commitment and how tedious an exercise primary data collection can be, we need to invest in it. There is so much untapped social and economic potential in the way that development policies, through better designs and evaluation, can make greater and more tangible impacts on people’s livelihoods.
A program to stimulate local economic development in rural Indonesia and Timor-Leste was already under implementation before I arrived. The program is small in scale, comprising three villages in the border areas of both countries. However, basic figures set against measurable indicators to demonstrate the ongoing progress of the program were missing. No baseline survey had been conducted prior to implementation either. Surprisingly, there were no planned activities or funding allocated to field data collection, either by the program implementers themselves or by “outsourced” experts, for the entire program duration spanning several years. They must initially have intended to rely on anecdotal individual success stories reported by some field staff, regardless of the level of objectivity in their reporting. Nevertheless, this turned out to be the perfect situation in which to apply the mindset of a plumber and scrutinise the program implementation.
“The relative indifference of governments to plumbing questions often translates into a willingness to give the plumbers a relatively free rein on the design.”Duflo (2017), p. 26
After several uneasy months of trying to convince the program implementers of the merits of venturing into the field to collect primary data, they finally agreed to allocate some funding to such activities. The funding mostly covered transportation and accommodation. Many of the basic resources needed for the data collection itself were lacking. These resources can be divided into three categories: physical, brainpower and psychological.
In terms of physical resources, field surveys these days are increasingly conducted through digital means, such as mobile phones, phone calls, tablets and (or) laptops. Field enumerators, the people who interview the targeted respondents, will usually have to go through a week-long training on how to conduct the survey interviews properly, as well as how to report the data to the people responsible for the subsequent data cleaning and analysis. That was the basic procedure and division of labour in my previous jobs, so I had initially anticipated the same level of rigor and commitment from international organisations, especially given the nature of the program.
But there were no trained “field enumerators” in place, only three to four local “field monitors” tasked with providing monthly reports that were not always verifiable. They had once conducted some basic paper-based surveys. But those in charge of the program, based thousands of kilometers away from the field, were themselves unaware of what questions had been asked, how many farmers had been included in those surveys, or how the farmers had been surveyed by the field monitors. These were the most basic plumbing issues, and they had not been taken seriously enough. Hence, the reported figures from the field were not always reliable. But the difficult circumstances of the field monitors explained much of the sluggish practices witnessed: no tablet, no laptop, no internet, and no guidance had been offered to them. Just to report from the field, the field monitors had to commute to a nearby border town to get access to a laptop, internet and printing machines. Two of the field monitors lived in one of the villages, around an hour’s drive – through mountainous terrain – from the border town. One lived in the border town itself. Another commuted between the border town and a different city, a five-hour drive one way. Hence, the logistical constraints for coordinating the field monitors were immense.
In terms of brainpower, the field monitors had never received any training or guidance on how to conduct field data collection. They had all initially been left free to decide on their own approaches to the task. For many people, field data collection is quite an intimidating endeavor: Where do we start? What information do we particularly want to know and collect? What survey methods and code of conduct are suitable for the program and its beneficiaries? And that intimidation can hurt their self-confidence and breed inertia within them. On the one hand, as locals, the field monitors were very familiar with the people living in the villages, so they carried some valuable local wisdom on how to approach the people participating in the program. But local wisdom can go uncontrolled and accidentally lead to favoritism and disputes. This presented another plumbing issue, a key detail of implementation that had also gone unnoticed for too long. So, the field monitors had to be guided more properly and closely.
Those in charge of the program were themselves not familiar with field data collection, and initially even averse to the idea. Hence, the task of training and guiding the field monitors felt like an exceptionally heavy and solitary exercise. The field survey questionnaire forms and content were revamped and designed from scratch. The procedures for collecting, entering, cleaning, analysing and reporting the data were also written down and taught to the field monitors from scratch. There was an immense language barrier as well, as the field monitors could only communicate in either Bahasa Indonesia or Dawan (a local language spoken on the island of Timor). All the procedures and instructions, written and explained in English, had to be translated in some way. In this case, the most worrisome plumbing issue was: how to make sure that the survey questions are translated correctly and with clarity? The survey data itself, consisting of large amounts of quantitative and qualitative responses, could understandably only be collected in Bahasa Indonesia. Yet even finding someone willing to help out with the translation was a struggle, which perfectly revealed the quite unacceptable levels of inertia among the program implementers based in the capital. Some of them simply had no interest in evidence and reporting. Travelling to the border town myself, in order to work with the field monitors more directly, usually took an entire day of airborne and on-the-road travel through very challenging terrains and landscapes. But those travels were definitely worth it, given the growing motivation seen among the field monitors.
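To give a flavour of what the written cleaning procedures amounted to, here is a minimal sketch of the kind of basic validation rules that can catch entry errors early. The field names and plausible ranges are hypothetical illustrations of mine, not the program’s actual questionnaire:

```python
# Minimal sketch of basic validation checks for entered survey records.
# Field names and plausible ranges are hypothetical illustrations, not
# taken from the program's actual forms.

REQUIRED_FIELDS = {"household_id", "village", "interview_date"}
PLAUSIBLE_RANGES = {
    "household_size": (1, 30),          # people per household
    "monthly_income": (0, 50_000_000),  # rupiah
}

def validate_record(record):
    """Return a list of human-readable problems found in one survey record."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            problems.append(f"missing required field: {field}")
    for field, (lo, hi) in PLAUSIBLE_RANGES.items():
        value = record.get(field)
        if value is not None and not (lo <= value <= hi):
            problems.append(f"{field}={value} outside plausible range [{lo}, {hi}]")
    return problems

# Example: one clean record and one with typical entry errors.
clean = {"household_id": "A-01", "village": "X", "interview_date": "2020-02-01",
         "household_size": 5, "monthly_income": 1_200_000}
dirty = {"household_id": "", "village": "X", "interview_date": "2020-02-01",
         "household_size": 42}

print(validate_record(clean))   # expect: no problems, []
print(validate_record(dirty))   # expect: two problems flagged
```

Rules this simple, written down once and shared, are exactly the kind of unglamorous plumbing that makes figures reported from the field verifiable.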
In terms of psychological resources, what matters is the support, assistance, and commitment of the people in charge of the program. Usually, field data collection and analysis are done by entire teams. If a team member is unsure about something or faces an analytical problem, there will be at least one or more team members who can compensate for the limitations of the others’ skills. In other words, field data collection is not supposed to be a lonely enterprise. Hence, for some time my anxiety ran high, as I was constantly unsure whether I was on the right track methodologically, without many people around me to ask for advice. Even though I had earned the trust and support of those at the highest levels in charge of the program, the majority of those below were initially resistant to the whole idea of more rigorous field data collection and evaluation. It required more work and effort from their side, and that was a natural trigger of resistance. This presented another painstaking challenge: how to keep up the motivation of not just the field monitors, but also the program implementers, while trying to get around the internal resistance.
This brings into relevance the third main point from the paper, namely inertia: a tendency to do nothing or to remain unchanged. In the context of programs, Duflo describes inertia as “complete ignorance of the reality of the field”. To do plumbing necessitates containing inertia. This is perhaps the main point that I could relate to the most from the paper. While Duflo describes how inertia is deeply ingrained in governments, I would extend this observation to international organisations as well. Following several rounds of self-contemplation, I have concluded that the internal resistance came from sheer inertia. In the policy world, I have conceded that not everyone is interested in details and evidence, as they require extra time, effort and dedication. Understandably, leaders in the policy world have to take many large and important decisions under tight time constraints. But if not government officials or bureaucrats, then I had certainly expected people working in international organisations to take a greater interest in details and evidence.
“It turns out that most policy makers, and most bureaucrats, are not very good plumbers. Part of this is because, contrary to most generous beliefs, agents in general are not necessarily very skilled at what they do.”Duflo (2017), p. 11
For a long time, I have been thinking very deeply about what factors either mitigate or exacerbate government and bureaucratic inertia. Why? Because inertia can drive away a tremendous amount of civil service talent and demoralise many of those who remain. This surely also applies to international organisations. Once inertia becomes ingrained, it can become the most destructive barrier to better public service delivery and performance (e.g. see “The U.S. needs to upgrade its civil service”, discussing the relationship between state capacity and emergency disease prevention). Moreover, well-intended programs often end up with not-so-well-intended outcomes, in which “obvious” improvements did not turn out to be improvements after all, precisely because the details of implementation were either neglected or not gotten right. Recognising the role of structural factors and systems behind program failures does not have to mean negating the role of behaviour and organisational culture:
“Large-scale waste and policy failures often happen not because of any deep structural problem, but because of lazy thinking at the stage of policy design. Good politics may or may not be necessary for good policies; it is certainly not sufficient.”In Poor Economics by Banerjee & Duflo (2011), p. 261
While not entirely agreeing with some parts of the quote above, it is the phrase “lazy thinking at the stage of policy design” that I am most interested in emphasising. “Lazy thinking” here means having minimal regard for the details of how programs are implemented, while at the same time believing that any program with good ideas and intentions behind it is destined to work out magically. If the “grand strategy” has been laid out, everything will work out, they often believe. Recalling a discussion meeting with a few bureaucrats, who were asked about their plans for reforming the public primary healthcare system, Duflo writes:
“What was striking was, not only did they not have any answer to these questions, but they showed no real interest in even entertaining them. Whenever we asked them to spell out what they thought their policy lever was (as opposed to their aspiration), the stock answer was that they did not really have one, that the local governments and medical officers could not be forced to do anything. (…) Their position oscillated between presenting the illusion of the perfect system and presenting the illusion of complete powerlessness in the face of local power and initiative.”Duflo (2017), p. 10-11
The fourth and final main point is experimentation. Duflo argues that, in the face of policy makers’ inattention to details, empty policy discussions or weak institutions, the room for policy improvements that these circumstances offer is a source of hope and motivation for the plumber in the pursuit of better ways to implement and evaluate programs. While experiments are usually confined to laboratories for natural scientists, many social scientists tend to conduct experiments by treating the real world as the laboratory itself: field experimentation.
Most of the time, the people in charge of programs are not present in the field. They rely heavily on field monitors. Combining the three main points above (applying the mindset of a plumber, investing in primary data collection, and containing inertia), field experimentation brings all program stakeholders closer to the field, to observe collectively what truly unfolds on the ground during implementation.
“In the service of fitting policies for the real world, field experiments are a natural complement to theory and economic intuition. Evaluation in the field is necessary for the plumbers because intuition, however sophisticated and however well-grounded in existing theory and prior related evidence, is often a very poor guide of what will happen in reality.”Duflo (2017), p. 21
Mistakes, or even failures, are bound to happen during field experimentation. And that is the whole point: ultimately, the desire to experiment is driven by a perseverance to overcome past or existing shortfalls. Fortunately, the local field monitors have been showing an admirable determination to keep learning and to overcome both their own personal disadvantages and the general challenges of doing field data collection. For this reason, my time is best spent whenever I am in the field working together with the hardworking field monitors. The eagerness and humility of people living in the countryside just tend to be very genuine and, quite politely put, different from that of those based in the urban metropolises.
I recently read two opposing articles debating the true levels of wealth and income inequality around the world. The first article, an interview with the economist Emmanuel Saez by ProMarket (a blog run by the Chicago Booth School of Business), quotes Saez as stating: “Saying inequality has not increased in the US is the equivalent of being a climate change denier”. The second article, titled “The inequality illusion” by The Economist (the printed version), summarised a series of studies using different methodologies for measuring wealth and income inequality, only to conclude: “We don’t really know about the true extent of inequalities, and therefore should not overestimate the issue”.
I found The Economist article hugely disappointing, but not because it issued precautionary warnings against overstating the true levels of inequality. Many scholars have done that, and the methodological debate will most likely never end. What makes The Economist article stand out, in a bad way, is its overall message that “since we don’t really know the true extent of inequalities, we shouldn’t be trying to do too much, policy-wise”. This is not plumbing, where the concern would be that well-intended policies may backfire; this is inertia 101. If something is only partially understood, then at least try to create, mobilise or improve whatever data is available as much as possible. The grim reality of limited data does not mean that a phenomenon is non-existent. A phenomenon can exist regardless of the data currently available to us. It is just waiting for us to shed light on it.
“Of course, as we’ve emphasised in our work, we’d like to get more data from governments. We think it should be the obligation of governments to provide more information. That’s also what sets us apart from the inequality deniers. If you read their studies, they essentially say: “we can’t really know that well”, but they never do the hard work that we’ve been doing, trying to improve things so we’d know better.”Emmanuel Saez in ProMarket (2020)
Experimentation is also driven by a high spirit of pragmatism. For Duflo, pragmatism means placing more focus on what is important for the world, and not strictly on subtle theoretical issues alone. Of course, there is a whole other debate about the extent to which empirical economics can replace economic theory (e.g. see “Understanding and Misunderstanding Randomized Controlled Trials” by Cartwright & Deaton 2017).
I would add another important dimension to this view of pragmatism: avoiding methodological rigidity. While experimentation forces us all to be receptive to surprising information from the field, pragmatism should also entail being receptive to a diversity of methodologies beyond randomized controlled experiments, nowadays claimed to be the “gold standard” of evaluation. Such triumphalism risks us falling back into the mindset, or even ideology, that potentially more suitable approaches and solutions are not worth exploring. The rise of empirical economics, in which the assumptions behind theoretical models were no longer taken as given in the face of more abundant empirical data, has been driven by exactly this kind of pragmatism, in which we are willing to explore methodologies beyond our disciplinary, self-imposed boundaries.
My overall message is that governments and international organisations should realise that there is so much untapped potential – in terms of evidence and lessons – to be generated from either 1) investing in more data-driven and rigorous approaches to implementing and evaluating programs, and (or) 2) partnering with other specialist organisations or NGOs to generate stronger evidence, partly as an assurance that the money behind programs is well spent and making genuine impacts on people.
All the above commitments ultimately demand that we become more attentive to the details of implementation, as well as attract civil servants who do not give in to the temptations of inertia. The challenges and limitations of plumbing and experimentation do exist: Will we forget the bigger picture if we focus too much on details, on smaller questions and solutions? How do we replicate or scale up a program that has so far only worked well in one locality, with solutions tailored specifically to that locality? But as I come from a perspective in which regulatory reforms have often received most of the attention compared to primary data collection, my overall view is that plumbing and experimentation can complement the typical “bigger picture” and “structural systems” strategies and solutions behind programs.
The feeling of insecurity over decisions or directions taken for my own work will most likely never go away. But those feelings are part of the learning path. With an open mind to explore new ways to improve programs, we assure ourselves that each and every one of us – the bureaucrat, the lawmaker, the statistician, the data scientist, the development practitioner and so on – will continue to learn together in the process of making programs more impactful.