“The Goals paint a picture of the world as it could be.”
- Claire Melamed, CEO, Global Partnership for Sustainable Development Data (GPSDD)
Bloomberg’s Data for Good Exchange (D4GX): Data Science for SDGs event brought together data scientists, corporations, academics, practitioners, and civil society to discuss issues and explore opportunities related to data science and social good.
Workshop Framing and Goals
“Show of hands – who thought today’s workshop would cover how to use data science and administrative data to report on SDG indicators?”
Figure 1: Facilitating questions from participants on brainstorming uses of administrative data for the SDGs
When thinking about data science for the SDGs, many practitioners jump straight to reporting. This is unsurprising given the focus on using data science to transform administrative data into national official statistics – like those used to monitor the SDGs. We aimed to break this mindset for two reasons: 1) administrative data are often not sophisticated enough to support that transformation, and 2) even so, reporting on these indicators is only a means to an end.
Though national statistics can tell us a lot about SDG progress, they say little about how to achieve that progress. Administrative data’s value lies in its ability to capture service delivery-level information on how progress occurs – to inform effective service delivery and achieve the SDGs.
How could it be that these critical systems are so underutilized?
Administrative data systems are the single greatest area of underinvestment in national statistics strategies. Research by SDSN TReNDS and others has found that this can be attributed to both underuse and concerns over accuracy, methodology, sample size, interoperability, and privacy & security.
DG’s work on understanding data use, and administrative data use specifically, has shed light on a vicious cycle: administrative data are often not useful because data quality is poor, but the quality stays poor because the administrative data are not always used.
Figure 2: The current cycle of administrative data use
Across the workshop, we aimed to break that cycle – considering practical risks and barriers to using administrative data to respond to the SDGs, brainstorming how data science can mitigate those issues, and identifying priorities & next steps to catalyze use and incentivize quality. Our biggest takeaways were:
1. Administrative data is still the “black sheep” of data ecosystems.
We asked participants to brainstorm decision-making use cases “if they had access to high quality administrative data,” but that is a big “if.” It took some encouragement to get participants to imagine a data ecosystem in which administrative data quality, accuracy, and coverage issues do not hinder decision making.
2. People understand the value of administrative data, and see its potential.
People were very comfortable discussing the service delivery decisions they would make if they had access to high quality administrative data. But administrative data for decision-making does not exist in a vacuum. These data are part of a holistic ecosystem that includes national, “zoomed out” information that is key for providing denominators and context.
3. Interoperability is the #1 priority for increasing data quality and overcoming data quality issues.
On one hand, increasing interoperability establishes data standards and increases utility, both of which ultimately reinforce data quality. On the other hand, the ability to incorporate additional sources can mitigate data quality issues by boosting accuracy and coverage.
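As a rough illustration of the second point, the sketch below shows how a shared identifier lets a second source fill gaps in a first one and raise coverage. Everything in it is hypothetical: the facility IDs, the `births_attended` field, the two example sources, and the helper names `merge_sources` and `coverage` were invented for this example and do not come from the workshop.

```python
def merge_sources(primary, secondary):
    """Fill gaps in primary records with secondary values, matched on a shared ID."""
    merged = {}
    for key in primary.keys() | secondary.keys():
        # Start from the secondary record, if one exists for this ID.
        record = dict(secondary.get(key, {}))
        # Primary values take precedence whenever they are not missing.
        record.update(
            {k: v for k, v in primary.get(key, {}).items() if v is not None}
        )
        merged[key] = record
    return merged


def coverage(records, field):
    """Share of records that have a non-missing value for the given field."""
    if not records:
        return 0.0
    filled = sum(1 for r in records.values() if r.get(field) is not None)
    return filled / len(records)


# Hypothetical data: two sources keyed by the same (assumed) facility ID.
clinic_register = {
    "F001": {"births_attended": 120},
    "F002": {"births_attended": None},  # value missing in this source
    "F003": {"births_attended": 75},
}
civil_registry = {
    "F002": {"births_attended": 98},
    "F004": {"births_attended": 40},
}

combined = merge_sources(clinic_register, civil_registry)
print(coverage(clinic_register, "births_attended"))
print(coverage(combined, "births_attended"))
```

The merge only works because both sources use the same facility ID, which is the kind of common standard that interoperability efforts aim to establish.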
This workshop was a call to action to expand our thinking on what “data for the SDGs” really, practically means. To move beyond measuring the SDGs and towards achieving them, we need to consider a more holistic, interoperability-focused view of country data ecosystems.
To harness the potential of administrative data, demanders and suppliers within the #Data4Dev community must take that leap of faith – shifting focus away from data quality barriers, and towards service delivery opportunities. “Data facilitators” such as DG are critical actors in advancing practical conversations on data quality and use. Moving forward, we facilitators should seek to host more creative events to engage the community in imaginative, innovative administrative data use.