4.8 Improving the interpretation and reporting of data

4.8.1 Problems and cautions with interpretation of data

Extrapolation beyond the scope of the data source
Generalizing beyond the data is a frequent methodological error. When a study covering a narrow geographic area or period of time finds excess mortality or excess malnutrition, there is often a temptation to extrapolate that finding as if it were representative of a larger, surrounding population. As a rough guess such an extrapolation has some value, but it should not be presented as if it proves anything about the larger population. This is a very common way in which information in a report comes to be misrepresented: others claim that a scientific study has proven such projections to be accurate, forgetting the caveats or limitations stated by the authors. It is best to say that, “It cannot be determined with the information at hand how many children have died overall, but the evidence, in one study, suggests that the rate has increased”.

Extrapolating beyond the time frame
Another common mistake in analysing the humanitarian consequences of crises is to extrapolate a single data point over a longer period of time. Where very high excess mortality is seen in emergencies, it is usually documented only for a short period. The very high rates are then cited over and over, and in the process the media and professionals come to understand them as referring not to a narrow point in time but to the entire period of crisis. For example, in Biafra in 1968, an analyst reported 2,000 deaths in one day in a famine zone. Based on that observation, others multiplied the figure by 365 days and concluded that 1 million Biafrans died that year from famine. No data were available to cross-check this estimate. It would have been just as reasonable to extrapolate from rates at the beginning or end of the famine as from a single worst-case observation. A reasonable solution is to interpolate, not extrapolate: that is, to estimate that the true average rate over a period lies halfway between the rates found at different points in time, or at some other weighted value between them.
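The difference between the two approaches can be illustrated with a minimal sketch. The figures of 2,000 deaths per day and 365 days come from the example above; the start and end rates of 200 deaths per day, and the function names, are hypothetical and chosen only for illustration. Interpolation here means taking the midpoint of the rates at consecutive observations, which is one of several possible weightings.

```python
def extrapolate_total(daily_deaths, days):
    """Naive extrapolation: assume one observed daily count holds every day."""
    return daily_deaths * days


def interpolate_total(observations):
    """Estimate total deaths by linear interpolation between observations.

    observations: list of (day, daily_deaths) pairs sorted by day.
    Between consecutive observations the average daily rate is taken as the
    midpoint of the two observed rates (the trapezoidal rule).
    """
    total = 0.0
    for (day0, rate0), (day1, rate1) in zip(observations, observations[1:]):
        total += (rate0 + rate1) / 2 * (day1 - day0)
    return total


# 2,000 deaths/day and 365 days come from the text; the start and end
# rates of 200 deaths/day are hypothetical, for illustration only.
peak_only = extrapolate_total(2000, 365)
three_points = interpolate_total([(0, 200), (180, 2000), (365, 200)])
print(peak_only, three_points)
```

With these assumed inputs, the single worst-case observation yields 730,000 deaths, while interpolating across three observations yields a substantially lower figure. Neither number is an estimate of what actually happened; the point is only how strongly the result depends on which observations are used.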

Extrapolating from selective populations
Much data comes from organizations working with particular groups in particular areas, for example persons who attend a particular church or children in a given orphanage. Beyond the limitations of service-delivery data already mentioned, there are limits to how much can be extrapolated from an atypical group to a larger population.

Extrapolating from self-selected populations
When an NGO reports, as NGOs very commonly do, that the populations seen in its emergency-feeding programmes have high malnutrition rates, that is to be expected because (a) people with malnutrition go out of their way to seek out these programmes, and (b) the criteria for entry into the programme require that they exhibit malnutrition. Thus, the rates seen at these self-selected sites have almost no value in revealing the rates of malnutrition in the larger population. Unfortunately, much of this kind of data gets repeated, and the limitations on its meaning are lost along the way.

Evidence of change
Often data suggests that the status of a population has changed because of a change in the use of some service. For example, in many emergencies, there is a reported increase in the number of persons seeking food and employment through public works projects that scale up when demand increases. An increase in the number of persons who come seeking work at a food-for-work project might or might not indicate a real increase in the price of food, reduced availability of food due to a failed harvest, the closing of certain markets, an increase in the size of the local population, an increase in the rate of unemployment, or all of the above. It would be inappropriate, without other corroborating information, to conclude that any one of these factors was the sole or main cause.

4.8.2 Improving data reporting

Where quantitative indicators are used, the information is almost always presented as a single number, e.g. “A death rate of 100/1000”. This form of data presentation fails to communicate the relative level of precision available for the numbers presented. More accurate would be the inclusion of a 95 per cent statistical confidence interval, e.g. 100/1000 plus or minus 10/1000. This requires some mathematical calculations.
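As a rough sketch of the calculation involved, the following computes a 95 per cent confidence interval for a rate using the normal approximation. It assumes a simple random sample, which many field surveys are not (cluster samples generally produce wider intervals), and the function name and the survey figures are hypothetical.

```python
import math
from statistics import NormalDist


def rate_confidence_interval(events, sample_size, confidence=0.95):
    """Normal-approximation (Wald) interval for a proportion.

    Assumes simple random sampling. Returns (rate, lower, upper),
    with the bounds clipped to the range 0 to 1.
    """
    rate = events / sample_size
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * math.sqrt(rate * (1 - rate) / sample_size)
    return rate, max(0.0, rate - margin), min(1.0, rate + margin)


# Hypothetical survey: 100 deaths observed among 1,000 people.
rate, lower, upper = rate_confidence_interval(100, 1000)
print(f"{rate * 1000:.0f}/1000 "
      f"(95% CI {lower * 1000:.0f} to {upper * 1000:.0f} per 1000)")
```

Under these assumptions, a rate of 100/1000 measured on 1,000 people carries a margin of roughly plus or minus 19/1000. A margin as narrow as 10/1000 would require a sample of about 3,500.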

Data sets should also always be recorded, maintained and presented with answers to the following four questions:

  1. What was the underlying population being surveyed—the catchment population from which the sample was drawn or intended to be drawn?
  2. What was the time frame (which months or dates) that the data referred to? Where recall or retrospective analysis is being conducted, what intervals of time were respondents asked about?
  3. What was the sampling method? If randomization was used, or stratified sampling, what was the strategy? What was the sample size (referred to as “N”)?
  4. What operational definitions were used by those generating the original data? If diseases are used, for example, what “case definitions” applied in that situation? If unemployment statistics are generated, what do the categories mean—full or partial unemployment, among the total adult population or among those “seeking work”?28

Researchers should also describe their impressions of the imperfections in the data used, and the biases inherent in them, in order to communicate the level of uncertainty associated with the numbers reported. In addition, researchers should give the reader a sense of the level of precision implied by numerical estimates.

Indicators of INPUT (such as food distributed or the value of medicines imported) or PROCESS (such as the number of medical visits, the number of diarrhoea or measles cases reported, or the number of children out of school) are easier and quicker to collect, and can be more timely and detailed, than OUTCOME indicators such as mortality rates. Other OUTCOME indicators, such as the percentage of children malnourished or the percentage of homes with access to clean water, are only partial expressions of the overall health situation, but they are relatively easy to collect in special surveys and are very useful for monitoring humanitarian conditions. By contrast, a small increase in the risk of death is far more difficult to establish with adequate statistical confidence, because death is a rare event even at relatively high rates. A small change may be very important for assessment purposes, but precisely because it is small it may be very hard to observe or document conclusively. This is why, for instance, there is frequently a great deal of confusion and controversy over reports on infant mortality rates.
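The difficulty of documenting a change in a rare event can be illustrated with a standard sample-size approximation for comparing two proportions. This is a sketch only: it assumes simple random samples, a two-sided 5 per cent significance level and 80 per cent power, and the rates used (malnutrition rising from 20 to 24 per cent, mortality rising from 10 to 12 per 1000) are hypothetical figures chosen so that both represent the same relative increase.

```python
import math
from statistics import NormalDist


def sample_size_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate sample size per survey needed to detect a change
    from proportion p1 to proportion p2 (normal approximation,
    two-sided test, simple random sampling assumed)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (p1 - p2) ** 2)


# Hypothetical rates: the same 20 per cent relative increase in each case.
n_malnutrition = sample_size_per_group(0.20, 0.24)   # common outcome
n_mortality = sample_size_per_group(0.010, 0.012)    # rare outcome
print(n_malnutrition, n_mortality)
```

Under these assumptions, detecting the change in mortality requires a sample some twenty-five times larger than detecting the proportionally equal change in malnutrition, which is why small changes in death rates so often remain unconfirmed.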

Analysis of the data, the inferences that may be drawn from it, and what it is felt to demonstrate should be presented only in a section that follows the data. Editorial terms should not be mixed in with the summarization of information. In this way, the reader is able first to make his or her own judgement about what the data says.

 
