
Method of Gap-Filling

When data for a variable is missing for one or several countries in a certain period, the Bank uses a method for filling the data gap, which is based on the assumption that the growth of the variable from a period for which data exists has been the same as the average growth for those other countries in the same regional or income grouping, where data exists for both periods.

Given this simple approach, the countries with missing data should not be too large relative to the related country group. The Bank's gap-filling method requires that the relative size of the countries with missing data be less than 1/3 of the group total in some base period.
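The rule described above can be sketched in a few lines of Python. This is only an illustrative reading of the prose, not the Bank's actual implementation: the function name `estimate_group_total`, the country labels and all figures are hypothetical, and the sketch assumes that "average growth" means the growth of the combined total of the countries reporting in both periods.

```python
from typing import Dict, Optional


def estimate_group_total(
    base: Dict[str, float],
    current: Dict[str, Optional[float]],
    max_missing_share: float = 1 / 3,
) -> Optional[float]:
    """Estimate a group total for one period, filling gaps with group growth.

    base    -- base-period values, available for every country in the group
    current -- values for the period of interest; None marks a data gap
    Returns None when the countries with gaps are too large a share of the group.
    """
    reporting = [c for c in base if current.get(c) is not None]
    missing = [c for c in base if current.get(c) is None]

    # Size test: countries with gaps must be under 1/3 of the base-period total.
    missing_share = sum(base[c] for c in missing) / sum(base.values())
    if missing_share >= max_missing_share:
        return None

    # Assumed growth factor: combined growth of the countries that report
    # in both periods.
    reported_total = sum(current[c] for c in reporting)
    growth = reported_total / sum(base[c] for c in reporting)

    # Countries with gaps are assumed to have grown at the same rate.
    return reported_total + growth * sum(base[c] for c in missing)


# Hypothetical three-country group; country C has no data in the current period.
base_values = {"A": 100.0, "B": 50.0, "C": 50.0}
current_values = {"A": 110.0, "B": 55.0, "C": None}
total = estimate_group_total(base_values, current_values)  # roughly 220
```

In this made-up case country C accounts for 25 percent of the base-period total, so the size test passes and C is assumed to have grown by the same 10 percent as A and B combined.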

 gap1

The procedure followed in the examples above is quite straightforward, but it depends on benchmark data, that is, data for every country in the base year. The Bank has established benchmark data for a set of key indicators: Gross National Income (earlier termed Gross National Product), Exports, Imports and Value added by industry (agriculture, industry, manufacturing and services).

However, benchmark data are also required for other indicators, such as final consumption and gross fixed capital formation, in order to use the gap-filling method described above to derive group or world totals. Thus, a method for estimating benchmark (base-year) data is required.


Estimation of base year data for additional indicators


The estimation of base year data for additional indicators relies on the assumption that, at an aggregated level, the same relationship between (specific) benchmark variables and other variables holds for the countries without data in the base year as for the countries for which base year data exist.
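A minimal sketch of this assumption follows. It is a hedged interpretation rather than the Bank's published formula: it assumes the "relationship" is the ratio of the indicator to a benchmark variable, taken over the combined totals of the countries that report both. The function name `estimate_base_year` and all figures are hypothetical.

```python
from typing import Dict, Optional


def estimate_base_year(
    x0: Dict[str, Optional[float]],
    z0: Dict[str, float],
) -> Dict[str, float]:
    """Fill missing base-year values of indicator x using benchmark z.

    x0 -- base-year values of the additional indicator; None marks a gap
    z0 -- base-year benchmark values, available for every country
    """
    reporting = [c for c in z0 if x0.get(c) is not None]

    # Aggregate ratio of x to z among the countries that report x.
    ratio = sum(x0[c] for c in reporting) / sum(z0[c] for c in reporting)

    # Apply the same ratio to the benchmark value of each country with a gap.
    return {
        c: x0[c] if x0.get(c) is not None else ratio * z0[c]
        for c in z0
    }


# Hypothetical data: the indicator is 60 percent of the benchmark where reported.
benchmark = {"A": 200.0, "B": 100.0, "C": 100.0}
indicator = {"A": 120.0, "B": 60.0, "C": None}
filled = estimate_base_year(indicator, benchmark)  # C is estimated at about 60
```

Once every country has a base-year value, the indicator can be treated like a benchmark variable in the growth-based procedure.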


gap2 

These gap-filling procedures are run automatically, with no human intervention, when tables for WDI, At A Glance, etc. are created. This means that general rules are set and applied to all countries and all variables when aggregates are generated. However, such an automatic procedure has one significant drawback: the sum of the estimated sub-group totals will not necessarily match the separately estimated group total. This is because different growth assumptions are used to fill the gaps depending on the level of aggregation.
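The mismatch can be illustrated with a small, entirely hypothetical example. The helper `group_total` below is an assumed simplification of the growth-based rule (it omits the 1/3 size test, which the made-up figures satisfy at both levels of aggregation), and the country and sub-group names are invented.

```python
from typing import Dict, List, Optional


def group_total(base: Dict[str, float], current: Dict[str, Optional[float]]) -> float:
    """Group total with gaps filled at the growth rate of the reporting countries."""
    reporting = [c for c in base if current[c] is not None]
    missing = [c for c in base if current[c] is None]
    reported_total = sum(current[c] for c in reporting)
    growth = reported_total / sum(base[c] for c in reporting)
    return reported_total + growth * sum(base[c] for c in missing)


def subset(values: dict, names: List[str]) -> dict:
    """Restrict a country-keyed dictionary to the named countries."""
    return {c: values[c] for c in names}


# Hypothetical data: country B is missing in the current period.
base = {"A": 100.0, "B": 40.0, "C": 100.0, "D": 100.0}
current = {"A": 120.0, "B": None, "C": 100.0, "D": 100.0}
sub_group_1 = ["A", "B"]  # B is filled with A's growth of 20 percent
sub_group_2 = ["C", "D"]  # no gaps, no growth

sum_of_sub_groups = (
    group_total(subset(base, sub_group_1), subset(current, sub_group_1))
    + group_total(subset(base, sub_group_2), subset(current, sub_group_2))
)
# At the whole-group level, B is filled with the growth of A, C and D combined.
whole_group = group_total(base, current)
difference = sum_of_sub_groups - whole_group
```

Here the sub-group estimate fills B with the 20 percent growth seen in A alone, while the whole-group estimate fills B with the lower combined growth of A, C and D, so the two totals differ.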


Annex – Methodology


Let x_it be indicator x for country i in year t. For the base year, t = 0, a selected number of benchmark indicators z_i (e.g., GNP, exports, imports, population) are available for all countries. This is generally not true for non-benchmark indicators such as value added by industry or private consumption; thus, some x_i may be unavailable for some countries in t = 0.

Assume a case with n countries, where a relevant indicator z_i is unavailable for the first k countries (k < n) in year t ≠ 0.

First, check if the following is satisfied for year t = 0:

gap3 

If condition (1) above is not satisfied, no group total should be estimated.

Second, if condition (1) is satisfied, check if the following is true for year t = 0:

gap5 

If so, group totals can be estimated for year t = ..., -2, -1, 1, 2, ... (that is, t ≠ 0) as shown in (3) below:

gap4


if t > 0, and

gap6

if t < 0.

If condition (2) is not satisfied, check whether the following is true for year t = 0:

gap7

If the particular indicator x_i is not available for certain countries for t = 0, the missing x_i value will be estimated for t = 0 as shown in (5) below:

gap8 

where x_j0 is missing for country j in year t = 0.

Once an estimate of x_j0 has been made, x_i can take the role of z_i in t = 0 (x_i = z_i), and group totals can be estimated for any year following the procedures shown above in (1) and (3).




Permanent URL for this page: http://go.worldbank.org/RZVE9KDGT0