I have spent a pleasant day or so researching how much spending data national governments publish on the web.

A few observations:

There are about ninety countries with usable budgetary data.

A general ability to guess one's way in Romance and Germanic languages lets you read the majority of national government websites: English, Spanish, Portuguese, French and Dutch collectively give you the Americas, the Caribbean, Oceania and almost all of Africa. Arabic gets you another dozen countries.

The international language of maladministration turns out to be French.

Monarchies and socialist states take a less relaxed attitude towards brokenness on official websites. Republics are more laid back. This seems to be independent of national income. It's not clear whether authoritarianism discourages transparency: the right to reuse and redistribute a full breakdown of how much the state spends on torturing political opponents would presumably have a strong chilling effect.

As colleagues had already surveyed Europe, I started with Commonwealth countries, then mopped up the rest, circling the globe east from South America, which is roughly in descending order of fiscal transparency.

For a country's fiscal data to be maximally useful, it needs to satisfy eight conditions. It must be:

  • discoverable
  • open data (reusable)
  • structured data (in tables and columns, not in words)
  • machine readable (spreadsheet not PDF or Word)
  • disaggregated
  • up to date
  • timely
  • in an open format

Some of these are absolute necessities; others are part of the cost function for obtaining useful results from the data.
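The eight conditions can be treated as a simple checklist. The sketch below is illustrative only: the class and function names are made up for this post, and the split between "absolute necessities" and "costs" in ASSUMED_NECESSITIES is my own guess for the sake of the example, not a settled classification.

```python
from dataclasses import dataclass, fields
from typing import List, Tuple


@dataclass
class FiscalDataChecklist:
    """One boolean per condition in the list above (names are illustrative)."""
    discoverable: bool
    open_data: bool
    structured: bool
    machine_readable: bool
    disaggregated: bool
    up_to_date: bool
    timely: bool
    open_format: bool

    def failed(self) -> List[str]:
        """Return the names of the conditions that are not met, in list order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# ASSUMPTION: which conditions count as absolute necessities is a guess
# made for this example; adjust it to taste.
ASSUMED_NECESSITIES = {"discoverable", "open_data"}


def assess(checklist: FiscalDataChecklist) -> Tuple[List[str], List[str]]:
    """Split failed conditions into blockers and mere extra costs."""
    failed = checklist.failed()
    blockers = [name for name in failed if name in ASSUMED_NECESSITIES]
    costs = [name for name in failed if name not in ASSUMED_NECESSITIES]
    return blockers, costs


# A hypothetical country: an aggregated budget table in a PDF, easy to find
# and recent, but with no statement that re-use is permitted.
pdf_budget = FiscalDataChecklist(
    discoverable=True,
    open_data=False,
    structured=True,
    machine_readable=False,
    disaggregated=False,
    up_to_date=True,
    timely=True,
    open_format=False,
)

blockers, costs = assess(pdf_budget)
print("blockers:", blockers)
print("costs:", costs)
```

With this assumed split, the missing re-use permission is the only blocker; the PDF format and the aggregation just make the data more expensive to use.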

Examples of where these conditions didn't obtain:

About fifty countries seem to publish no data at all.

Most publications do not explicitly state that re-use of the data is permitted.

A very large amount of the data is in tables in PDFs, from which it is expensive and unreliable to extract.

Some of the data is in image files such as GIFs. These were clearly generated from an actual spreadsheet, so this may be a deliberate attempt at obfuscation. In the opposite direction, some of the data is in the form of scanned paper documents, which suggests attempts by public officials to counteract closed practices in other parts of the bureaucracy.

Quite a few African countries gave up publishing budgetary data in 2008. A lot of the material in the developing world is a few years out of date.

The sorts of problems above are the general problems of obtaining open knowledge from government data. In the case of spending data, there is an additional problem that the data should ideally be disaggregated: the absence of a standard multidimensional financial data format has left government publishers either providing a subset of possible aggregations or not bothering at all.
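To see why disaggregation matters, here is a minimal sketch in Python. The records, the dimension names (ministry, category, year) and the amounts are all invented for illustration; no real government format is implied. Given line-item records, any aggregation can be derived on demand, whereas a publisher who releases only one pre-computed aggregation has thrown the others away.

```python
from collections import defaultdict

# Made-up disaggregated records: one row per line item.
RECORDS = [
    {"ministry": "Health", "category": "Salaries", "year": 2010, "amount": 120},
    {"ministry": "Health", "category": "Capital", "year": 2010, "amount": 80},
    {"ministry": "Education", "category": "Salaries", "year": 2010, "amount": 200},
    {"ministry": "Education", "category": "Capital", "year": 2011, "amount": 50},
]


def aggregate(records, dimensions):
    """Sum 'amount' over the records, grouped by the given dimensions.

    The result maps a tuple of dimension values to a total. An empty
    list of dimensions yields the grand total under the key ().
    """
    totals = defaultdict(int)
    for record in records:
        key = tuple(record[d] for d in dimensions)
        totals[key] += record["amount"]
    return dict(totals)


# Every one of these views comes from the same underlying records.
print(aggregate(RECORDS, ["ministry"]))
print(aggregate(RECORDS, ["category"]))
print(aggregate(RECORDS, ["ministry", "year"]))
print(aggregate(RECORDS, []))
```

Going the other way is not possible: from the by-ministry totals alone there is no way to recover the by-category totals.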

The output of all this is linked from the Countries page of the OpenSpending wiki.