Project Ronald, an introduction

I live in a society in which the Government is restrained from taking any action which can be denounced in a 200-word article in The Sun. How British democracy has attained this pitch of excellence need not concern us here; rather, we must ask: how can we increase the explanatory power of articles in The Sun? To do otherwise is to favour the obstruction of public policies exclusively on grounds of simplicity rather than demerit, with the result that any bad policy may be adopted if its proponents first trouble to complicate it unnecessarily.

Newspaper articles and narratives are but the finished product in a long supply chain of knowledge. The Open Knowledge movement should conceive of itself as like unto the resourceful colonial railway companies of old, which opened up vast new territories and connected raw materials with factories.

Project Ronald, named for Ronalds Coase and McDonald, is my effort to connect up all the world’s tabular data. The aim is to enable speedy answers to the question “are these things correlated?”, where this involves extracting information from public datasets published by different organisations. I take “speedy” to mean “in a couple of seconds”. I have created a short video demonstrating the process, from the moment someone selects successful search terms to find a dataset to the point of being able to do regressions in R, a statistics system; the video is heavily padded to show I’m not cheating.
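To make the “are these things correlated?” question concrete, here is a minimal sketch of the underlying idea: take two tables from different publishers, join them on a shared key, and fit a line. It is written in Python rather than R and is not the project's actual implementation. The region names, column names and figures are all invented for illustration, as are the helper names `read_table`, `join_on_key`, `pearson` and `ols`.

```python
import csv
import io
import math

# Invented stand-ins for two public datasets from different organisations.
UNEMPLOYMENT_CSV = """region,unemployment_rate
North,8.0
South,4.0
East,6.0
West,5.0
"""

CRIME_CSV = """region,crimes_per_1000
North,90
South,50
East,72
Wales,65
"""


def read_table(text, key, value):
    """Parse CSV text into a {key: numeric value} mapping."""
    reader = csv.DictReader(io.StringIO(text))
    return {row[key]: float(row[value]) for row in reader}


def join_on_key(left, right):
    """Inner join: keep only the keys present in both tables."""
    keys = sorted(left.keys() & right.keys())
    return keys, [left[k] for k in keys], [right[k] for k in keys]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ols(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope


unemployment = read_table(UNEMPLOYMENT_CSV, "region", "unemployment_rate")
crime = read_table(CRIME_CSV, "region", "crimes_per_1000")

# "West" and "Wales" appear in only one table each, so the join drops them.
regions, xs, ys = join_on_key(unemployment, crime)
intercept, slope = ols(xs, ys)

print("joined regions:", regions)
print("correlation:", round(pearson(xs, ys), 3))
print("fit: crimes_per_1000 =", round(intercept, 2), "+", round(slope, 2), "* unemployment_rate")
```

The join is the hard part in practice: real datasets rarely share a clean key such as `region`, which is why connecting tables matters more here than the statistics that follow.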

The overall approach of the project follows the New Jersey / UNIX philosophy: get a simple architecture of loosely connected components working together, and worry about obtaining the best implementation of each component later. Subsequent blog posts will relate the current implementation and future challenges.

