Preparing a successful data request
The steps below are best practices for making data requests. Data owners are frequently overburdened with daily operations. You can make it as easy as possible for them to fulfill your request by planning carefully and addressing all of the relevant questions up front.
Design questions that can be answered with data.
Before you can make an effective data request, you need to know exactly what you're looking for and why. Below are some questions to answer before moving forward.
"What is the overarching goal or objective that I want to accomplish?"
Knowing what you want before turning to data is a prerequisite to effective data use. For tips on writing goals and objectives that map clearly to specific types of data, see Appendix A: Setting SMART goals.
"What do I want to find out or measure?"
If you're conducting an experiment, then the data might be response statistics. If you're looking through financial or budgetary data, it might be spending trends, projections, and forecasts. Other situations will call for other kinds of measurements, but thinking about what you expect to find — or even the nature of the answer you're looking for — will help you identify the most relevant metrics. Some examples of suitable research questions for your data include:
- To what extent is the service provided by [Program X] reaching constituents?
- Do budget allocations meet needs? Are budget allocations used by the populations in need?
- How has enrollment in [Program X] changed over the last five years?
- What supportive and constructive feedback have our constituents given us?
"How much data do I actually need?"
In the age of "big data," it may be tempting to collect as much information as possible and sort through it later. But this approach is counterproductive. Looking at too many metrics may overwhelm the analysis or distract you with red herrings that don't actually address your question. And requesting more data than you need from a sister agency could make responding to your request more time-consuming than it needs to be. Some ways to reduce the amount of data you have to look at include:
- Look at data over a shorter timeframe than the full period you want to study. Smaller windows can provide clues about how best to dissect the bigger picture without having to sort through as much information.
- Identify the information you need for your data analysis. You don't have to use every field present in a dataset to answer your question. That's why it's imperative to know what you'd like to measure so that you can decide which fields are likely to contain relevant information.
- Build some redundancy into the data you request. Once you've identified the information you need, consider how you'll work around a record that's missing data in a field. Are there other fields you could use as a proxy? Add these fields to your request.
- Use derived statistics whenever possible. If you know what question you'd like to answer, then you should be able to define specific stats that provide an answer. Using a limited selection of fields, you can compute averages, differences, percentages, and other derived statistics that construct these measurements out of less data while still answering your question.
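The methods above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed workflow: the records, the field names (`enrolled_on`, `zip_code`, `district`), and the helper functions are all hypothetical. It narrows the data to a smaller window, uses a second field as a proxy when the first is missing, and computes a derived statistic from a single date field.

```python
from datetime import date

# Hypothetical enrollment records. Field names and values are illustrative
# assumptions, not a real agency schema.
RECORDS = [
    {"enrolled_on": date(2023, 1, 15), "zip_code": "00001", "district": "A"},
    {"enrolled_on": date(2023, 2, 3), "zip_code": None, "district": "B"},
    {"enrolled_on": date(2024, 1, 20), "zip_code": "00002", "district": "B"},
    {"enrolled_on": date(2024, 3, 9), "zip_code": "00001", "district": None},
    {"enrolled_on": date(2024, 5, 30), "zip_code": "00003", "district": "C"},
]


def in_window(records, start, end):
    """Keep only records enrolled between start and end, inclusive."""
    return [r for r in records if start <= r["enrolled_on"] <= end]


def location(record):
    """Prefer zip_code; fall back to district as a proxy when it is missing."""
    return record["zip_code"] or record["district"]


def enrollments_per_year(records):
    """Count enrollments by calendar year."""
    counts = {}
    for r in records:
        year = r["enrolled_on"].year
        counts[year] = counts.get(year, 0) + 1
    return counts


def percent_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100


# A smaller window: only the most recent year.
recent = in_window(RECORDS, date(2024, 1, 1), date(2024, 12, 31))

# A derived statistic built from one field (the enrollment date).
per_year = enrollments_per_year(RECORDS)
change = percent_change(per_year[2023], per_year[2024])

print(len(recent), location(RECORDS[1]), per_year, change)
```

Only three fields are needed here, which suggests how short a field list in a request can be once the measurement is defined.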
"How can I break my analysis into steps?"
Solving a series of subproblems is almost always easier than solving a whole problem at once, and data analysis is no different. Can you segment your problem into smaller steps and use different facets of the data to answer sub-questions?
In particular, what's a good "first step" to tell quickly whether or not you're on the right track? (This is called "failing fast," which is a good practice to save time and energy by weeding out solutions that won't be productive.)
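One possible "first step" is to check a small sample for completeness before requesting or analyzing the full dataset. The sketch below assumes a hypothetical sample with `program_id` and `exit_date` fields and an arbitrary 80 percent threshold. It reports which fields are too sparse to rely on.

```python
# Hypothetical sample records; field names are assumptions for illustration.
SAMPLE = [
    {"program_id": "P1", "exit_date": None},
    {"program_id": "P1", "exit_date": "2024-02-01"},
    {"program_id": "P2", "exit_date": ""},
    {"program_id": "P2", "exit_date": None},
]


def fill_rates(sample, fields):
    """Return the share of records with a non-missing value for each field."""
    if not sample:
        raise ValueError("sample is empty")
    total = len(sample)
    return {
        f: sum(1 for r in sample if r.get(f) not in (None, "")) / total
        for f in fields
    }


def sparse_fields(sample, fields, minimum=0.8):
    """List fields whose fill rate falls below the chosen minimum."""
    rates = fill_rates(sample, fields)
    return [f for f in fields if rates[f] < minimum]


rates = fill_rates(SAMPLE, ["program_id", "exit_date"])
too_sparse = sparse_fields(SAMPLE, ["program_id", "exit_date"])
print(rates, too_sparse)
```

If a field you planned to depend on turns out to be mostly empty, you learn that before investing in the full analysis.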
Identify the data.
- Before you can request data, you'll need to identify the agency and program that's likely to have the data you're looking for.
- Search for any public data dictionaries or reports that can tell you what data is available.
- Find the data owner for the agency or program.
Once you've identified the data source, reach out to ask about any standardized process the agency has for making data requests. Each agency will have its own process.
Data owners are accountable for the proper use and security of their data. In order to evaluate the risks and benefits of sharing data, they'll require an explanation of how you plan to protect and use it. The data owner's agency will have its own privacy and security processes, and your practices will need to comply with them. When requesting data, you should be prepared to explain:
- What your objective is (that is, the question you would like to answer)
- How a partnership with the sister agency can help you answer that question, and how that question is a shared inquiry for both agencies
- What data you need to study your objective (including as much specific discussion of the fields of interest as possible)
- How you'll account for potential sources of implicit bias in the data you're requesting (see Review data for implicit biases for more information)
- Why you need that data and what you hope to gain or analyze from the data
- Who'll have access to the data if you receive it
- How you'll ensure that the data is handled ethically, safely, and securely, particularly in reference to the sister agency's data practices
- The timeline on which you hope to analyze the data and answer the question
- How communication between your agency and your partner agency will take place (for example, weekly check-ins about data usage or regular reports on data activity)
Make the case for data sharing.
An effective data request starts with a clear description of why the requested data is important to both parties' missions and what you plan to achieve with the data. Understand the data owner's possible motivation to take on the data sharing project:
- Will this request improve services or data quality for the partner?
- How will residents or staff be better served?
To ensure full and sustained participation, each party involved in the data sharing effort should be able to see direct benefits from their involvement.
It may also help to think about how you will respond if your first request is denied. According to the National Neighborhood Indicators Partnership, common reasons data providers deny requests include:
- "Preparing the file will burden our already-overworked staff."
- "I'm afraid of being burned by bad publicity."
- "I'm worried about mishandling or improper release of the data."
- "The data are a mess."
Specify the parameters of the request.
The data owner needs to understand exactly how to fulfill the request. Some useful parameters or filters to consider include:
- The date range
- Specific fields or columns
- Specific datasets or databases
- Filters such as age, people tied to a specific program, or geography
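One way to keep these parameters explicit is to write them down in a structured form before drafting the request. The sketch below is only an illustration: the `DataRequest` class, its field names, and the example values are hypothetical and not part of any agency's process. It records the date range, datasets, fields, and filters, flags obvious gaps, and produces a plain-text summary you could paste into a request.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DataRequest:
    """Hypothetical container for the parameters of a data request."""
    start: date
    end: date
    datasets: list
    fields: list
    filters: dict = field(default_factory=dict)

    def validate(self):
        """Return a list of problems; an empty list means none were found."""
        problems = []
        if self.start > self.end:
            problems.append("start date is after end date")
        if not self.datasets:
            problems.append("no dataset named")
        if not self.fields:
            problems.append("no fields listed")
        return problems

    def summary(self):
        """Render the parameters as plain text for the data owner."""
        lines = [
            f"Date range: {self.start.isoformat()} to {self.end.isoformat()}",
            f"Datasets: {', '.join(self.datasets)}",
            f"Fields: {', '.join(self.fields)}",
        ]
        for name, value in self.filters.items():
            lines.append(f"Filter: {name} = {value}")
        return "\n".join(lines)


# Example values are made up for illustration.
request = DataRequest(
    start=date(2020, 1, 1),
    end=date(2024, 12, 31),
    datasets=["program_enrollment"],
    fields=["enrolled_on", "zip_code"],
    filters={"age": "18 and over", "program": "Program X"},
)

print(request.validate())
print(request.summary())
```

Running `validate()` before sending the request is a cheap way to catch a missing field list or a reversed date range.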
Keep the scope and timeframe realistic.
Ensure that the data owner can fulfill the data request in a reasonable timeframe and with their available resources. If a data request is too taxing on the data owner, they may reject the request until the parameters change or the requester offers additional resources. Also account for the time required to draft and sign a data sharing agreement.