Crowdsourcing improvements to government open data
Summary
Before joining Skylight, Kin Lane created Adopta.Agency, a civic crowdsourcing model that enabled developers and organizations to improve government open datasets and transform them into usable APIs. Backed by the Knight Foundation, the project demonstrated that distributed contributors could systematically increase the quality and accessibility of public data at a scale no agency could achieve alone.
The challenge
Following implementation of the White House’s Open Data Policy, federal agencies released thousands of datasets to the public. The mandate was a landmark step toward transparency. But releasing data and making it usable turned out to be two very different things.
Many datasets were incomplete, inconsistently formatted, or missing the structure needed for use in applications. Developers who wanted to build on public data often spent more time cleaning and interpreting it than using it. Much of the government’s open data investment sat underutilized: technically available, but hard to use in practice.
The core problem was one of scale. No single agency had the capacity to clean, structure, and maintain thousands of datasets on an ongoing basis. What was missing wasn’t more policy. It was a model that could harness outside contributors to improve public data systematically, the way open-source communities improve software.
The solution
Kin designed Adopta.Agency around a simple insight: the same open-source collaboration model that works for software could work for data. Rather than asking agencies to fix everything themselves, he created a framework that let anyone — developers, civic technologists, organizations — contribute structured improvements to public datasets.
The key design challenge was making contribution accessible. Most civic technologists were comfortable with code but not with the messy work of cleaning and structuring government datasets. Kin built the blueprint process to bridge that gap. The blueprint broke contribution into small steps, with JSON cleanup first and a full API last. Participants didn’t need to take on the entire dataset at once; each step added value on its own.
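As a rough illustration of what an early cleanup step might involve, the Python sketch below normalizes inconsistent field names and blank values, then writes the result as JSON. It is not Adopta.Agency's actual tooling: the function names and the sample rows are hypothetical, invented only to show the kind of small, self-contained improvement a contributor could make.

```python
import json
import re


def normalize_key(key: str) -> str:
    """Convert a header such as ' Agency Name ' to 'agency_name'."""
    return re.sub(r"[^a-z0-9]+", "_", key.strip().lower()).strip("_")


def clean_record(record: dict) -> dict:
    """Normalize keys, trim string values, and turn blank strings into None."""
    cleaned = {}
    for key, value in record.items():
        if isinstance(value, str):
            value = value.strip() or None
        cleaned[normalize_key(key)] = value
    return cleaned


# Hypothetical raw rows, invented for illustration only.
raw_rows = [
    {" Agency Name ": "Department of Education ", "Fiscal-Year": "2015", "Notes": ""},
    {"agency name": " Veterans Affairs", "Fiscal Year": "2015", "Notes": "  "},
]

cleaned_rows = [clean_record(row) for row in raw_rows]
print(json.dumps(cleaned_rows, indent=2))
```

After this step, both rows share the same field names, so a later contributor could build on the cleaned JSON without repeating the work.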
He chose GitHub as the collaboration platform for a deliberate reason: civic technologists already lived there. GitHub gave distributed contributors a transparent, version-controlled environment for working on shared datasets, the same workflow developers already used for code. Contributors didn’t have to learn a new tool; they only had to apply familiar skills to a new domain. Every improvement was visible, reviewable, and reusable.
With grant funding from the Knight Foundation, Kin built a working prototype and applied it to five federal datasets — the U.S. Federal Budget, the Veterans Affairs Open Data Portal, Department of Education Tech Data, My Brother’s Keeper, and ClinicalTrials.gov. Five different agencies, five different data types — the same blueprint worked across all of them. Each dataset produced a reusable pattern others could follow.
The results
- Secured grant funding from the Knight Foundation to develop and prototype the model
- Applied the approach to five federal datasets, producing a working proof of concept across diverse agencies and data types
- Created five reusable blueprint patterns that enabled others to replicate dataset improvements independently
- Demonstrated a scalable civic crowdsourcing model for improving government open data — without requiring additional agency resources