Primary data collection during a crisis

On 5 June, the US Bureau of Labor Statistics (BLS) announced that the US unemployment rate stood at 13.3%, with a net gain of 2.5 million (non-farm) jobs in May. Given the astronomical net loss of over 20 million jobs recorded in April, which pushed the US unemployment rate to 14.7%, its highest level since WW2, this outcome came as a surprise to many people, especially politicians.

Many politicians and parts of the general public usually have little interest in scrutinizing the data collection methods behind these reports, or the circumstances in which the data were collected. Unfortunately, this often leads to meaningless celebratory tales and to short-sighted policy decisions that can prove fatal in the long term. US President Donald J. Trump was quick to joyously embrace the latest employment report, going as far as claiming that “this is the greatest, greatest, greatest comeback in US history, and it’s only going to be better” during a recent White House press briefing. In addition, Republican lawmakers are now seriously considering voting to roll back stimulus packages for people recently laid off and for struggling small businesses hit by the pandemic, believing that the latest employment report justifies more austere government spending.

The May report was produced in the middle of a pandemic, with millions of jobless claims and benefits to administer at once. The data collection methods and the chaotic circumstances behind this report perfectly illustrate the trade-offs that survey staff face these days, especially when they are under extraordinary pressure to quickly deliver information that policy makers see as critical for deciding on further stimulus measures. In turn, the way these trade-offs are managed reflects the integrity and leadership of the staff behind these reports.

The US employment report is based on monthly data collected from two types of surveys. The first one is a household survey, which interviews a sample of 60,000 households meant to represent the entire civilian non-institutional population. The second is an establishment survey, which interviews a sample of around 145,000 businesses and government agencies meant to represent approximately 657,000 individual worksites in the country.

In the May report, the BLS attached a notice (see below) on how the data collection and estimates were affected by Covid-19. I marked in yellow the parts that I believe are important to discuss further. The Financial Times also ran an article that succinctly summarized the various problems that arose during the BLS’s latest survey round.

Many have (rightfully) emphasised the parts of the notice about the misclassification of absent or temporarily laid-off people as “employed” this time. In response, officials from the US BLS and Census Bureau stated that they still can’t understand why this error keeps occurring, in spite of the recent increase in training for field staff on the criteria for the different classifications, and that they will aim to ramp up that training further. They disclosed that the unemployment estimates for April were in fact exposed to this error as well: while the US unemployment rate was officially estimated at 14.7% in that month, correcting the error would’ve raised the estimated rate to a staggering 19.7%. That is almost 1 out of 5 Americans in the labour force out of work, a rate that doesn’t even include people who are no longer actively looking for a job, often out of despair (that is, people who have left the labour market).
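The arithmetic behind such a correction can be sketched in a few lines of Python. This is only an illustration, not the BLS's own procedure: the function names and the labour force of 1,000 people are hypothetical, chosen so that the April figures quoted above (14.7% official, 19.7% corrected) come out in round numbers. The key point is that moving someone from “employed” to “unemployed” leaves the size of the labour force unchanged, so the gap between the two rates equals the share of the labour force that was misclassified.

```python
def adjusted_unemployment_rate(unemployed, misclassified, labour_force):
    """Unemployment rate after moving misclassified workers from
    'employed' to 'unemployed'. The labour force size does not change."""
    return (unemployed + misclassified) / labour_force


def implied_misclassified_share(official_rate, adjusted_rate):
    """Share of the labour force that must have been misclassified
    to explain the gap between the official and the adjusted rate."""
    return adjusted_rate - official_rate


# Hypothetical labour force of 1,000 people (an assumption for illustration).
labour_force = 1000
unemployed = 147      # matches the official April rate of 14.7%
misclassified = 50    # counted as employed, but absent or laid off

rate = adjusted_unemployment_rate(unemployed, misclassified, labour_force)
print(f"Adjusted April rate: {rate:.1%}")

# May figures from the BLS notice: 13.3% official, about 16.3% adjusted.
share = implied_misclassified_share(0.133, 0.163)
print(f"Implied misclassified share in May: {share:.1%}")
```

Read this way, the April correction implies that roughly 5 in every 100 people in the labour force were recorded in the wrong category, and the May correction roughly 3 in every 100.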

This leads us to another highlighted part of the notice, which I am personally more concerned about: the drop in survey response rates. More technically, this is a concern about sampling error, that is, the risk that unemployment estimates drawn from a sample do not reflect the entire population, a common cause being a sample that is too small. Politicians might interpret the latest fall in the US unemployment rate as sufficient evidence that “the impact of the pandemic on the national economy isn’t as bad as widely predicted”. But it may well be that the lower response rate, and consequently the smaller sample of people participating in the monthly surveys, has failed to capture the true magnitude of unemployment.
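To get a feel for how a shrinking sample widens the uncertainty around an estimate, here is a minimal sketch using the textbook formula for the standard error of a proportion. It rests on strong assumptions that I should state loudly: it treats the survey as a simple random sample (the real household survey has a more complex design), it treats each responding household as one observation, and the response rates in the loop are hypothetical values picked for illustration, not the actual rates reported by the BLS.

```python
import math


def standard_error(p, n):
    """Standard error of an estimated proportion p from a simple
    random sample of size n (textbook formula, an assumption here)."""
    return math.sqrt(p * (1 - p) / n)


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for the estimated proportion."""
    return z * standard_error(p, n)


SAMPLE_HOUSEHOLDS = 60_000   # household survey sample size cited above
ESTIMATED_RATE = 0.133       # official May unemployment rate

# Hypothetical response rates, for illustration only.
for response_rate in (0.85, 0.70, 0.55):
    respondents = int(SAMPLE_HOUSEHOLDS * response_rate)
    moe = margin_of_error(ESTIMATED_RATE, respondents)
    print(f"response rate {response_rate:.0%}: "
          f"{respondents:,} respondents, margin of error ±{moe:.2%}")
```

Note that this sketch only captures the loss of precision from a smaller sample. If the households that stop responding differ systematically from those that keep responding, for instance because the newly unemployed are harder to reach, the estimate can also be biased, which no margin of error will reveal.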

As shown in the graph below, made by former US Treasury official Ernie Tedeschi, the trend of decreasing response rates preceded the pandemic. But the pandemic is currently exacerbating it, with potentially permanent implications for the conduct of in-person surveys in particular. Employment is a question of dignity and belonging for most respondents. Conducting surveys on such questions less and less through direct in-person interactions with survey staff is an optimistic prospect neither for future response rates, nor for the wellbeing of people who need to talk to, and be reached out to by, civil servants with a conscience.

Amid declining morale among US civil servants, nowadays fuelled further by a political climate that is hostile to truth-seeking civil servants at the federal level, the BLS still went on to publicly disclose that the true unemployment rate in May was likely at least 16.3% rather than the official rate of 13.3%. What I admire further about the BLS is the following part at the bottom of their notice:

“(…) However, according to usual practice, the data from the household survey are accepted as recorded. To maintain data integrity, no ad hoc actions are taken to reclassify survey responses.”

This is exemplary of how to manage the trade-off between the reality of more limited sample data and the aim of producing precise estimates of the magnitude of unemployment, especially in the middle of an epic public health disaster without much precedent in modern times. To be clear, I am less interested in debates about how allegedly “politicized” the work behind the latest estimates has been.

For someone like myself, presently involved in collecting primary data from farm households while navigating the practical and ethical challenges this pandemic presents in the field, maintaining professional integrity in intense times like these leaves no room for political bickering. Unlike in the US, many households in rural Indonesia and Timor-Leste do not own mobile phones through which our survey staff could reach them, while we all adhere to strict physical distancing guidelines in tandem with the rest of the world. Internet connection was extremely unstable even in the heart of the capital Jakarta during the first two months of partial lockdown, so imagine how it must have been for our survey staff and farm households on the border between Indonesia and Timor-Leste.

Recently, a country director asked me how I intend to overcome the emerging barriers in the field to conducting surveys on how farm households are struggling through this pandemic. I simply said: “At this point, any available data is good data. We will not give up”. In spite of the extreme logistical limitations in the field right now, our survey team has been supportive of one another and hard-working ever since the pandemic hit us. It will take some time until we can gather in the field again, which is the likely prospect for primary data collection facing any development economist in the years ahead. On a positive note, the pandemic will hopefully accelerate development financing toward increasing the rural coverage of mobile phones and the use of electronic survey devices, resources that remain a distant luxury for ordinary people (and even researchers) living in the most remote parts of the world.

Photo taken with survey staff during a visit to a farm household in a border village in Oe-Cusse Special Administrative Region, Timor-Leste in December 2011.

The usual trade-offs in the conduct of high-quality surveys have surely been amplified by the pandemic. This can easily make survey staff feel that they are not managing the trade-offs in the right way, and hence feel less confident about themselves: the imposter syndrome. Throughout this pandemic, that has been my case. But in the end, in my view, now is not the time to dismiss and throw away even the most limited data available. Try to make the best use of it in your reports and stories. As long as you keep a sense of professional integrity and the ability to take clear decisions during a crisis, you will do fine. Above anything else, stay healthy and protect the safety of everyone else to the best of your capacity. We are in this difficult time together.
