Moonsheep is a framework that lets organizations run crowdsourcing campaigns in which volunteers transcribe massive collections of scanned documents online. The end result is structured data available as spreadsheets, CSV files, or JSON APIs.
Engine Room's replication sprint
It all started for TransparenCEE when Product Manager Krzysztof Madejski was invited to participate in the next edition of The Engine Room’s replication sprint. These sprints are week-long offline events dedicated to adapting a civic tech tool to the needs of a local organization. This time the goal was to deploy a tool for coordinating a community of volunteers who would transcribe scans of public documents and turn them into structured open data. The beneficiaries were the Ukrainian organization OPORA, which needed to open up reports listing donations to political parties, and the Hungarian organization K-Monitor, which wanted to open up data from MPs’ asset declarations.
It soon turned out that the existing tool that was supposed to be replicated was not sufficient for the more complex cases we had to deal with.
Design sprint preparations
That’s how we ended up at a development sprint in Jahorina, Bosnia, just before the regional POINT conference, working on a more robust tool, codenamed Moonsheep. We engaged partners with past experience in the topic: The Engine Room, which had organized two replication sprints and performed a thorough evaluation of existing tools; Open Data Kosovo, which had supported The Engine Room in the Quien Compro implementation and had recently created the Decode Darfur microtasking website for Amnesty International; and K-Monitor, which had practical experience with transcribing and verifying data using Vagyonnyilatkozatok, a website developed at the previous sprint.
The sprint started with us agreeing that we wouldn’t evaluate any existing tool. Instead, we planned to design an ideal tool that low-tech organizations could replicate smoothly, and to see where that would take us.
It’s surprising how much we did in just three days:
We defined two beneficiary roles at the organization adapting the tool:
1) a product owner who is a subject matter expert
2) a techie who is eager to employ tech tools, but may have limited coding experience
We defined a replication process involving the two roles above as well as external experts
We sketched a needs assessment survey that an organization can use to prepare itself for the replication sprint. It checks preconditions such as organizational readiness, expected project impact, and data availability
We defined core features and broke them down into GitHub issues
We designed mockups for the most crucial parts of the tool
We assessed which existing codebases and tools could be used to build Moonsheep functionality
So far, most of the project participants had been engaged on a voluntary basis. What we needed was a team of people with enough time to deliver the ideas and features we were discussing. That’s how we came to contract the following people:
Alan Zard, who coded the frontend
Piotr Pęczek, who coded the backend
Now, on our shiny new landing page, we are onboarding others and showing them how they can use Moonsheep to open up data.