Last week I was invited to a preview of the new UK open data initiative announced by the Cabinet Office.
The site - http://data.hmg.gov.uk - is still in closed preview, but it looks good. It is built on Drupal and uses some third-party data management tools. It contains a list of 1123 data sources, powered by CKAN, of which 20 have been converted to a linked data format. It has taken three months to free up the data that forms the initial release, and much work has gone into converting data from PDF and other non-malleable formats into CSV and linked data. Some sources have SPARQL endpoints that will allow direct queries to return RDF or JSON.
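For developers who have not used a SPARQL endpoint before, here is a rough sketch of what a direct query might look like in Python. It is only an illustration: the endpoint URL, the query and the sample response are all made up for this example, since the real endpoints are not public yet. The only thing it relies on is the standard SPARQL protocol (a `query` parameter on the request) and the standard SPARQL JSON results layout.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint: a placeholder, not a real address on the preview site.
ENDPOINT = "http://example.org/sparql"

# Illustrative query: list ten things that have an rdfs:label.
QUERY = """
SELECT ?school ?name
WHERE { ?school <http://www.w3.org/2000/01/rdf-schema#label> ?name }
LIMIT 10
"""


def build_query_url(endpoint, query):
    """Encode a SPARQL query as a GET URL using the protocol's 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


def rows_from_results(json_text):
    """Flatten a SPARQL JSON results document into dicts of variable -> value."""
    doc = json.loads(json_text)
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in doc["results"]["bindings"]
    ]


# Canned response in the standard SPARQL JSON results format (invented data).
# A real request would typically ask for this format by sending the header
# "Accept: application/sparql-results+json".
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["school", "name"]},
    "results": {
        "bindings": [
            {
                "school": {"type": "uri", "value": "http://example.org/school/1"},
                "name": {"type": "literal", "value": "Example Primary School"},
            }
        ]
    },
})

if __name__ == "__main__":
    print(build_query_url(ENDPOINT, QUERY))
    for row in rows_from_results(SAMPLE_RESPONSE):
        print(row)
```

Nothing here touches the network, so it runs as-is; swapping the canned response for a real HTTP call is the only change needed once an actual endpoint address is known.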
Within the data sets are some that contain comprehensive lists of all farms, schools and other entities, so these could potentially be used as pivots for mapping other data and metadata. Others are built on edubase, which already contains 6m triples including performance data for schools and other educational establishments. Crucially, the data is open to all for both commercial and non-commercial use, in an attempt to stimulate innovation.
See Harry Metcalfe's summary for more details and Jeni Tennison for more on the data formats debate.
Andrew Stott, the Director of Digital Engagement, has led the project, working with Richard Stirling, Lisa James and others, relying also on help from people working across government departments, such as Emma Mulqueeny.
The project has four objectives:
- Transparency and accountability
- Empower citizens to drive public service reform
- Unlock social and economic value (taxpayers paid for the data)
- Help British research and technology take a leading role in the next generation Web
Another consideration will be to look at policy and standards issues in the public sector as a whole. For example, there are 400+ Local Authorities but no consistent standards for data. Andrew said they will also try to take the opportunity of contract renewals with third parties to give government full rights over the data produced by commissioned projects, but he wants to start involving people and get some feedback before they go much further:
"We want to see what people can do with it to shake some trees in Whitehall to free up more [data]"
I was impressed both by the vision for the project and the way Andrew's team have gone about realising it, and it was a nice touch to discuss their work face to face as well as just read the announcement. I would strongly urge developers to answer the call to get involved, and ideally to avoid obsessing about tiny details of data formats and licensing in favour of actually building something that is (a) useful and (b) communicable to normal people rather than just data geeks. It is easy to forget that what we are doing here is so utterly marginal in terms of both investment and impact that it barely registers on either government's or citizens' radar right now. But there is no doubt in my mind that this is where government should be headed if it is serious about engaging better, improving services and getting more value out of existing investment.

Hi Lee, great article, and helping to raise awareness (I found it via Twitter).
Very important to engage our developer community to contribute; geeks are finally getting the recognition they deserve. A meeting of two cultures and generations for the common good. Democracy rules. Well done to dirdigeng and team for all their hard work, and good luck to all who will help bring it to the public in an understandable, easy-to-access way.
Power to the People.
chris