Frequently Asked Questions

Add your US City to the Census

How can I add a US City to the Census?

Reach out to usopendatacensus@gmail.com and we will work to include your city in the template. Each city should also choose a community point person to act as an Open Data Census Librarian (see Submitting Information).

About the Open Data Census

What is the US City Open Data Census?

The US City Open Data Census is an ongoing, crowdsourced measure of the current state of access to a selected group of datasets in municipalities across the United States. Any community member can contribute an assessment of these datasets in their municipality at any time. Census content will be peer-reviewed periodically by a volunteer team of Open Data Census Librarians led by the Code for America Brigade and the Sunlight Foundation.

How can the results of the US City Open Data Census be used?

The US City Open Data Census does not aim to create a comprehensive list of open datasets around the United States for data users, nor does it aim to define what datasets are the most important to open. Instead, the US City Open Data Census seeks to be a benchmarking tool, which people can use to ignite conversations with their government about open government data.

What's the current state of the US City Open Data Census?

The US City Open Data Census was launched on Open Data Day (February 22, 2014) in conjunction with CodeAcross 2014.

Another event-push for US City Open Data Census contributions took place during the National Day of Civic Hacking (May 31 - June 1, 2014), but contributions may be made at any time.

Who created the US City Open Data Census?

The US City Open Data Census was initiated in partnership by Code for America, the Sunlight Foundation, and Open Knowledge Foundation. It is maintained by Code for America, Sunlight Foundation, and Open Knowledge Foundation staff members, Code for America Brigade, and the Open Government Data working group with contributions from many members of the wider community.

What is the history of the US City Open Data Census?

The international Open Data Census was created by the Open Knowledge Foundation in 2012, and provides a clear measure of available open data -- not what is claimed, but what data is actually available and how open it is. The original Census was designed by open data experts, including the Open Knowledge Foundation Open Government Working Group, and undergoes a process of peer review and evidence checking to ensure high quality results. In early 2014, Open Knowledge Foundation announced the bespoke Local Open Data Census. The US City Open Data Census was launched on Open Data Day in conjunction with CodeAcross 2014 in a partnership between Code for America, the Sunlight Foundation, and Open Knowledge Foundation. Ongoing reporting on local open data access by Code for America and the Sunlight Foundation creates a focus for debate and review.

What datasets are included in the city Census?

There are 19 initial datasets currently considered in the 2014 US City Open Data Census:

Dataset | Details
Asset Disclosure | Top-level government officials’ financial assets, including: name of top-level government officials, title, investment information, prior and current business relationships, real estate interests, and personal income (including gifts and travel or speaking payments).
Budget | Municipal budget at a high level (e.g. planned budget by unit of appropriation with a programmatic description of each unit of appropriation). This category is about budgets which are plans for expenditure (not actual expenditure in the past).
Business Listings | A directory of all licensed businesses in the municipal area, including key information such as: name, address, contact information, business type.
Campaign Finance Contributions | Amount contributed to each candidate and by whom.
Code Enforcement Violations | Building code inspection data surfacing reports on particular properties from code enforcement officials.
Construction Permits | Locations of issued construction permits.
Crime | City crime report data, preferably at a reasonably disaggregated level.
Lobbyist Activity | Actions of named registered lobbyists.
Parcels | Property parcel boundary data.
Procurement Contracts | The full text of municipal contracts with vendors, including amount, awardee (name, address), date awarded, etc.
Property Assessment | Data about assessed property values.
Property Deeds | The recording of property sales, mortgages, and foreclosures. See your local Registry/Recorder of Deeds.
Public Buildings | Locations of city-owned buildings.
Restaurant Inspections | Outcomes of food safety inspections of restaurants and other similar providers of food to the public.
Service Requests (311) | Non-emergency service requests (which residents of some cities make by dialing 3-1-1), such as: graffiti, non-working traffic lights, noise complaints, parking law enforcement, and potholes. Data should be at a granular (per request) level.
Spending | A complete list of city expenditures at a detailed transactional level (including: tax breaks, loans, contracts, grants, and operational spending). Records of actual (past) municipal spending at a detailed transactional level, for example, at the level of month to month expenditure on specific items (usually this means individual records of spending amounts at a fairly granular level - e.g. $5-50k rather than at the $1m+ level). Note: a database of contracts awarded or similar is not considered sufficient. This data category refers to detailed ongoing data on actual expenditures.
Transit | Timetables (schedules), locations of stops, and real-time location information of all municipally run or commissioned transit services (buses, subway, rail, tram, etc.).
Zoning (GIS) | The mapped zone (GIS) shapefiles of designated permitted land use in your city.
Web Analytics | Overall traffic stats, page-level traffic stats, site search logs, and browser-agent breakdowns from your city’s primary web property.

Provide feedback and suggestions for additional datasets to include in future censuses at the Census discussion list.

Detailed discussions of the data categories relating to submissions and review

Issues and challenges for submitters and Open Data Census Librarians (reviewers) are discussed on the Census discussion list. See also the full Census discussion list archive.

How reliable is the US City Open Data Census?

The information in the Census is collected by open data experts and enthusiasts around the world including the Code for America Brigade, the Sunlight Foundation Local Policy Team, and the Open Knowledge Foundation Open Government Working Group. The Census data undergoes a process of peer review and evidence checking to improve the quality of results. That said, we rely on the contributions of local community users of government datasets, so if you see a problem please submit a comment. Contributors and Editors are also cited on each dataset submission.

Submitting information to the Census


At the moment we are only collecting information on data that is available in 2014. The US City Open Data Census is a survey of the state of open data around the United States, focusing on the availability and openness of a specific set of key datasets.

What's the US City Open Data Census data collection and review process?

It works like this:

  1. Contributors submit information about the availability (or not) of key datasets in their city (for example Budgets in San Francisco).
  2. For edits to submissions, contributors may Propose Revisions.
  3. Open Data Census Librarians either approve (with or without amendments) or reject the Proposed Revisions.
  4. If approved, the Submission becomes an official entry in the Census and is displayed on the website.
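
As a rough illustration only, the review steps above can be sketched as a small state machine. This is not the Census site's actual code: the `Status`, `Submission`, and `review` names, and the idea that amendments are merged into the submitted answers, are assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    AWAITING_REVIEW = "awaiting review"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Submission:
    """A contributor's answers about one dataset in one city (hypothetical model)."""
    city: str
    dataset: str
    answers: dict
    status: Status = Status.AWAITING_REVIEW


def review(submission: Submission, approve: bool,
           amendments: Optional[dict] = None) -> Submission:
    """Librarian decision: approve (optionally with amendments) or reject."""
    if submission.status is not Status.AWAITING_REVIEW:
        raise ValueError("only submissions awaiting review can be reviewed")
    if approve:
        if amendments:
            # Assumption: amendments simply overwrite the submitted answers.
            submission.answers = {**submission.answers, **amendments}
        submission.status = Status.APPROVED
    else:
        submission.status = Status.REJECTED
    return submission


entry = Submission("San Francisco", "Budget", {"exists": True})
review(entry, approve=True, amendments={"online": True})
print(entry.status.value)
```

Only an approved submission would then be shown as an official entry; a rejected one stays out of the published table.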

How can I improve the Census information about a US City?

If you have information about a dataset which isn't in the Census yet, you can add it! Anyone can submit new information to the Census.

  1. Select your city in the list and click on it.
  2. Click the blue “Submit Information” button on the right next to the appropriate category.
  3. Fill in the form based on the dataset you have found (there are detailed instructions on the page).
  4. Click Submit. Your submission is now waiting for review, and will be visible in the table as 'awaiting review' after a few minutes.

How can I correct an existing entry in the Census?

We welcome corrections to the US City Open Data Census. Anyone can submit corrections to the Census.

  • Select your city in the list and click on it.
  • On the city overview page, click the blue “Submit Information” button on the right next to the appropriate category.
  • Fill in the form based on the changes you want to make to the existing data.
  • Click Submit. Your submission is now waiting for review, and will be visible in the table as 'awaiting review' after a few minutes.

How can I add a US City to the Census?

Reach out to usopendatacensus@gmail.com and we will work to include your city in the template. Each city should also choose a community point person to act as an Open Data Census Librarian (see below).

How do I become my city’s Open Data Census Librarian?

Open Data Census Librarians are the reviewers and point persons for the Census assessment in their community. They are responsible for filling out a profile page, becoming familiar with this FAQ and the Data Explainers, and reviewing open data periodically.

Open Data Census Librarian Basic Responsibilities

By adopting a US City Census page, you agree to do the following:

  • Become familiar enough with the US City Open Data FAQ and Data Explainers that you can serve as a front-line resource on these materials for your community.
  • Periodically review community datasets to keep your community up to date, i.e. review new datasets monthly (and/or after coordinated national hackathons), and all datasets annually.
  • Review datasets assessed by your fellow community members, checking for missing or dead links and mis-assessed data availability criteria (is the data available online, publicly accessible, cost-free, machine-readable, available in bulk, license-free, and up to date?).

Getting Started
  • To adopt a city page in this project, email usopendatacensus@gmail.com and let us know you'd like your city added to the page and who your Open Data Census Librarian community reviewer will be.
  • Dig in! Review datasets. Open Data Census Librarians should also vet previously reviewed datasets.
  • Stay in touch: join the Open Data Census discussion list.

What do all the questions about the datasets mean?

When filling in information about a dataset, there's a list of questions to answer about the availability and openness of the datasets. The answers then appear in the City overview page for the Census.

Question | Details | Weighting
Does the data exist? | We are looking for government data from an official resource either issued directly by the government or by a third party representing the government. Note that data offered by companies, citizen initiatives or any non-governmental organisation are not valid for submission. This means that we are looking only for what is officially published by governments. In some cases, the government will give the right to publish the data to third parties and in such cases submissions linking to third party sites are valid. It must, however, be explicitly stated on the data site that it is commissioned by the government (if the data is to be found on a company domain or on a domain of a civil society organisation or initiative, check if the organization has an agreement with the government to be the official source). | 5
Is data in digital form? | Can the data be accessed from a computer or is it in the old paper form? If you can find it online, then it’s digital, even if it is just a scan of the paper the dataset is on. Note that data can be digital, but not accessible online. For example: a country budget can be stored on a private network in the government, but not on the Internet. This means that the data is digital, but not publicly accessible (which is our next criterion, see further below). Still, if you have knowledge that the data is digital somewhere inside the government (for instance, if the government official you speak to tells you so), then you should mark this one “YES” (and add in the notes that you have the information from a phone call with a certain department or official person). | 5
Publicly available? | This means the data can be accessed by the public and is not restricted. The data can be considered public when: 1. It can be accessed online without the need for a password or permissions. 2. If the data is in paper form, it can be accessed by the public and there are no restrictions on the number of photocopies that can be made. Data is NOT publicly available in the following cases: 1. It can be made available only after a freedom of information (FOIA) request is sent. 2. It can be accessed only by government officials. | 5
Is the data available for free? | If you have to pay for it, it is not free. | 15
Is the data available online? | If it’s on the Internet and you can access it, it’s online. Note that if the government emailed you the dataset but didn’t upload it to any webpage, it is not to be considered available online. | 5
Is the data machine readable? | Files are digital, yes, but not all can be processed or parsed easily by a computer. To answer this question, look at the dataset's file type. As a rule of thumb, the following file types are machine readable: XLS, CSV, JSON, XML. Files in the following formats are NOT machine readable: HTML, PDF, DOC, JIF, JPEG, PPT. If you have a different file type and you don’t know whether it’s machine readable, send an email to the Open Data Census list. | 15
Available in bulk? | Data is available in bulk if the whole dataset can be downloaded easily. It is considered non-bulk if citizens are limited to getting parts of the dataset through an online interface, for example, if they are restricted to querying a web form and retrieving a few results at a time from a very large database. | 10
Openly licensed? | This means the data's terms and conditions follow The Open Definition, which stipulates that in order for data to be open, it needs to be free to use, reuse and redistribute. Which licenses are open? On the site of The Open Definition you can also see a directory of licenses that are certified open. How can you find the license? That can sometimes be the tricky part. Usually, the license or Terms & Conditions can be found at the bottom of the website (linked in the footer) or under the website's “About” section. If there is no visible license, or the license is under the country's name (for example, “Copyrighted under the state of Lebanon”) and there are no terms and conditions or any other information on the site, the information is not open and you should answer “No”. If the information is licenced under Creative Commons licences, as in the case of Australia for instance, then the data is openly licensed (except if they are using Creative Commons’ non-commercial licenses - the ones with “NC” or “ND” in them - these are partially open, but not fully open according to The Open Definition - and therefore in this case to be considered not open). Note that sometimes countries do not license under Creative Commons, but the terms and conditions do allow use, re-use and distribution. In that case we suggest you write to the Open Data Census list and get feedback from the community. | 30
Is the data provided on a timely and up to date basis? | This asks whether the data is relevant to the year/time it is supposed to represent. For instance, if the latest national budget made available is from 2010, then it is not up-to-date. However, not all datasets need to be updated at the same frequency. Transportation data can be updated on a daily basis while postal codes might not change for many years. Refer to the tutorial if you have difficulty deciding the best time option for this dataset. How to find when the data was published? If the dataset was found on a government portal, there will most often be a timestamp attached to it. If the data was on a government site, sometimes the date will be written next to the dataset link or in the news section. Sometimes the date stamp is in the dataset itself, for example, a tab that is named after the date it is supposed to represent (you can also try to download the data and see the creation date of the file, although that might not represent the right date - use your judgement and again, leave good comments about your assessment). Lastly, sometimes there are no timestamps at all. In that case, it might be most fair to mark it not timely or up-to-date. | 10
URL of data online? | The link to the specific dataset if that is possible; otherwise, the link to the home page for the data. If that is not possible, then the link to the main page of the site on which the data is located. Only links to official sites are eligible, not third party sites. When it is necessary for submitters to provide third party links, they are put in the comments section.
Format of data? | This question describes the form that the data is available in. For example, for tabular data it might be: Excel, CSV, HTML or even PDF. For geodata it might be shapefiles, geojson or something else. If available in multiple formats, the format descriptors are listed separated with commas. Any further information is put in the comments section.
URL to license or terms of use? | Please provide the URL to the license or terms of use governing access and use of this data (if known). If there is more than one URL you would like to list, please just list the primary one in this field and add further information in the comments box below.
Date data first openly available? | This question describes when the data first became openly available (online, in digital form, openly licensed etc). Sometimes this is approximate, for example, "2012" or "Jan 2012". If there is a precise date, it is entered in yyyy-mm-dd format. If the data is not open, then this question will instead describe the date the data first became available at all. (Note: Obviously some open data was available in other forms previously, so the date specified here is the date it became openly available.)
Title and short description? | Please enter the title(s) and excerpted short description(s) of the dataset(s) as provided by the publisher. The description should be kept to a few sentences (max 1 paragraph).
Data Publisher? | If known, please enter the department / organisation responsible for publishing this dataset along with contact email (if known). If the specific person responsible for this is known please also list them.
Rate Quality of the Data (Content)? | Rate the quality of the data in terms of its actual content (ignoring structure) - is the data accurate, is it provided at a detailed, granular level, etc (and ignoring whether the data is in PDF or Excel or whether it is just human-readable rather than machine-readable). 1 is worst, 10 is best. Please justify your rating in the details and comments section.
Rate Quality of Data (Structure)? | Rate the quality of the data in terms of how well it is structured, and how easy it is to use. For example, is the data provided in a good data format (CSV vs PDF), are the files well structured and easy to process programmatically (or do you have to clean them before use), is there an API? 1 is worst, 10 is best. Please justify your rating in the details and comments section.
Further Details and Comments (optional but strongly encouraged)? | Please add detail here to expand on and support your answers above. Information on data availability is especially useful, for example, is the data partially available, are there plans to make it available in the future, is the data available from an unofficial source?
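
For anyone tidying submissions in bulk, the three date styles mentioned under "Date data first openly available?" ("2012", "Jan 2012", or yyyy-mm-dd) could be normalised along the following lines. This is a minimal sketch, not part of the Census tooling: the function name and the "precision" labels are made up for illustration, and the month abbreviation is assumed to be English (`%b` depends on the locale).

```python
from datetime import datetime

# Tried in order, from most to least precise.
FORMATS = (
    ("%Y-%m-%d", "day"),    # precise date, e.g. 2014-02-22
    ("%b %Y", "month"),     # approximate, e.g. Jan 2012
    ("%Y", "year"),         # approximate, e.g. 2012
)


def parse_first_available(text):
    """Return (date, precision) for a 'date first openly available' answer."""
    text = text.strip()
    for fmt, precision in FORMATS:
        try:
            return datetime.strptime(text, fmt).date(), precision
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {text!r}")


for answer in ("2014-02-22", "Jan 2012", "2012"):
    print(answer, "->", parse_first_available(answer))
```

Approximate answers are pinned to the first day of the month or year, so keeping the precision label alongside the date avoids implying more accuracy than the submitter gave.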

How should I use the comments/details field when submitting and reviewing?

Comparing datasets between local governments is, as mentioned, a complex and often difficult task. This is why the comments/details field is public, so that submitters and Open Data Census Librarians can explain the reasoning for their choices. In other words, the comments/details field is your main tool to ensure that your city’s entries and scores can be compared to other cities’. We therefore strongly encourage you to be thorough in your comments, as that will reflect on how your city is perceived and compared.

Tip: Look at the comments of cities with a similar score in the given category, or at cities whose data systems and governance structure may be similar to your city’s.

Questions about the assessment of openness

Are data to be considered publicly available if a right-to-know request, such as a freedom of information (FOI) or public records request, is needed to retrieve them?

For Census purposes, publicly available means without having to put in a right-to-know request -- so it should be available online without further ado.

What about cities where there is no official mention of licensing attached to the data in question?

Licensing of online datasets can be found in the dataset's metadata or, if the dataset is available via an online open data portal or database, sometimes within that portal's or database's Terms of Service. Licensing might also be articulated in your city’s open data policy. For maximal legal re-use, open government data should have a worldwide public domain designation, such as the Creative Commons CC0 statement or an Open Data Commons Public Domain Dedication and License (PDDL).
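
When a Creative Commons licence is stated, the rule from the question table (Creative Commons licences count as open unless they carry "NC" or "ND") can be checked mechanically. The sketch below is illustrative only: the function name is made up, and it assumes the licence is written as a short identifier such as "CC-BY-4.0" or "CC0-1.0". Anything that is not a Creative Commons identifier still needs a human to read the terms against The Open Definition.

```python
def is_open_cc_license(identifier):
    """True if a Creative Commons short identifier has no NC or ND element.

    Assumes identifiers like 'CC-BY-4.0', 'CC BY-NC-SA 3.0' or 'CC0-1.0'.
    """
    parts = identifier.upper().replace("_", "-").replace(" ", "-").split("-")
    if not parts[0].startswith("CC"):
        raise ValueError(f"not a Creative Commons identifier: {identifier!r}")
    # NC (non-commercial) and ND (no derivatives) are treated as not open.
    return not ({"NC", "ND"} & set(parts))


for licence in ("CC0-1.0", "CC-BY-4.0", "CC-BY-NC-4.0", "CC BY-ND 4.0"):
    print(licence, "->", "open" if is_open_cc_license(licence) else "not open")
```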

What formats can generally be considered machine readable?

Since machine readability is not strictly a matter of data format, here are some further considerations: HTML, even well structured, will only sometimes count as machine-readable and is, by default, not machine-readable, because it most often needs parsing and thereby is not directly reusable.

CSV, XML and XLS would usually count as machine readable, but not always. Consult Sunlight Foundation’s Open Data Guidelines if you’re in doubt, or if there is a dispute, reach out to the Census discussion list.

In general we suggest treating machine readability as a combination of fact and objective judgement, rather than saying that a particular format is automatically machine-readable or not machine-readable. Machine-readable is to be understood in the sense that you could extract the data and directly reuse it.

This issue is discussed in more detail in the Sunlight Foundation’s Guideline on mandating data formats for maximal technical access and on the Census discussion list.
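
As a first pass before applying that judgement, the rule of thumb from the question table (XLS, CSV, JSON, XML versus HTML, PDF, DOC, JIF, JPEG, PPT) can be applied to a file name or URL. This is only a sketch with a made-up function name; it returns None for any file type the FAQ does not list, and its answer is a default to be checked, not a verdict.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# File types listed in the FAQ's rule of thumb.
MACHINE_READABLE = {"xls", "csv", "json", "xml"}
NOT_MACHINE_READABLE = {"html", "pdf", "doc", "jif", "jpeg", "ppt"}


def rule_of_thumb(url_or_filename):
    """Return True, False, or None (unlisted type: ask the discussion list)."""
    path = urlparse(url_or_filename).path
    extension = PurePosixPath(path).suffix.lower().lstrip(".")
    if extension in MACHINE_READABLE:
        return True
    if extension in NOT_MACHINE_READABLE:
        return False
    return None


# The example URL is hypothetical.
print(rule_of_thumb("https://example.gov/data/budget.CSV?year=2014"))
print(rule_of_thumb("inspection-report.pdf"))
print(rule_of_thumb("zoning.geojson"))
```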

I want to help, but I'm not sure where to start!

On the US City Open Data Census, you can see more about the 17 categories of data that we are focusing on in the About section of the site. Each of the entries for each city has been sent in by community members, who have simply used Google or other search engines to find out what datasets are available (simply finding the URL) and under which circumstances (are the data openly licensed, can they be downloaded in bulk etc.) -- and then made a submission via the form on the site, where they simply fill out a handful of questions and put in the URL for the data. All in all it is a really easy (and fun) task that helps to put a city on the open data map -- and it's easy to get started!

You can do some research yourself! Pick a city where the Census shows there is data missing or where there are comments showing that there's uncertainty (perhaps the licence hasn't been specified, for example) or a city that you know well. A targeted search or working with others is most fun and helpful. You could ask on the US City Open Data Census discussion list whether there's a city or area you could help with. Get together with friends, colleagues, your local open data community, Code for America Brigade, or Open Knowledge Foundation local group and dig into data on a given topic or for a given city together.

Where can I discuss the US City Open Data Census with others?

Join the Open Data Census discussion list.

I'm confused! How can I get help?

There are lots of people who can help on the Open Data Census discussion list, and there are no silly questions, so we encourage you to post there.

Understanding the US City Open Data Census results


How does the scoring system work?

The US City Open Data Census measures the state of openness of 17 datasets for each city. The overall score for a dataset is based on the response to specific questions with varying weightings -- the weighting for each question is listed in the question table above. The overall city score is then calculated from the score on each dataset.

The score algorithm is:

  1. If the answer to a question is "yes", add that question's weighting to the score for that dataset.
  2. Add up the scores for each dataset to get the city score.

As the weightings indicate, timeliness is now included, with a weighting of 10.

One of the aims of the questions for each dataset is to provide an increasing set of requirements leading up to full openness (excluding ‘timely’, which is important but not a requirement for open data). It should be noted that this does not mean each question directly builds on the previous one, since some of them are parallel (e.g. digital form and publicly available), but in general there is a progression, so “No” on an earlier question may well imply “No” on a later question.
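
To make the arithmetic concrete, the scoring described above can be sketched in a few lines. The weightings are copied from the question table in this FAQ; the key names and function names are invented for this example and are not the site's actual code.

```python
# Weightings from the question table; key names are illustrative only.
WEIGHTS = {
    "exists": 5,
    "digital": 5,
    "public": 5,
    "free": 15,
    "online": 5,
    "machine_readable": 15,
    "bulk": 10,
    "open_license": 30,
    "timely": 10,
}


def dataset_score(answers):
    """Sum the weightings of every question answered 'yes' (True)."""
    return sum(weight for question, weight in WEIGHTS.items()
               if answers.get(question, False))


def city_score(datasets):
    """Add up the dataset scores; `datasets` maps dataset name -> answers."""
    return sum(dataset_score(answers) for answers in datasets.values())


example_city = {
    "Budget": {question: True for question in WEIGHTS},
    "Crime": {"exists": True, "digital": True, "public": True, "online": True},
}
print(dataset_score(example_city["Budget"]))
print(dataset_score(example_city["Crime"]))
print(city_score(example_city))
```

With these weightings a fully open, up-to-date dataset scores 100, and the open licence question alone accounts for 30 of those points.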

If you are intrigued by Open Data...


Learn more

If the US City Open Data Census has caught your interest, there's lots more open data and open government to learn about.