Data Reference: Citing Data
We cite data for the same reasons we cite anything else: to give credit to those who made the data available and to help our readers find the data we used. A good citation answers four basic questions:
- Who collected, produced, or provided this resource?
  - Authors, researchers, data collectors, and/or organizations that sponsored the research.
- What is this resource?
  - A title, the data's version or edition, the format or resource type.
- When was this resource collected, created, or made available?
  - A dataset may have a date-of-collection as well as a publication date.
- Where can someone find this resource?
  - The website, organization, and/or publisher that provides access to the data. Whenever possible, this should include an identifier like a DOI or a URL for the data's website. Make sure your citation includes enough information for a reader to find the data easily!
The details you need to answer these questions will be different, depending on the data you are using. It's not an exact science, but as a general rule, a good citation will answer these questions accurately and thoroughly. It's better to have too much information than too little!
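As a rough illustration of how the four questions map onto the parts of a citation, here is a minimal Python sketch. The `DataCitation` class, the `format_citation` function, the output layout, and the sample dataset are all hypothetical and made up for this example. They do not reproduce any official citation style, so check your style guide or the data provider's recommended citation for the exact format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataCitation:
    """Hypothetical container for the four parts of a data citation."""
    creators: List[str]                    # Who: authors, collectors, organizations
    title: str                             # What: title of the dataset
    year: str                              # When: publication or release year
    publisher: str                         # Where: repository or publisher
    identifier: str                        # Where: DOI or URL
    version: Optional[str] = None          # What: version or edition, if any
    resource_type: Optional[str] = "Data set"  # What: format or resource type


def format_citation(c: DataCitation) -> str:
    """Join the parts into one string (an illustrative layout, not an official style)."""
    who = "; ".join(c.creators)
    what = c.title
    if c.version:
        what += f" (Version {c.version})"
    if c.resource_type:
        what += f" [{c.resource_type}]"
    return f"{who} ({c.year}). {what}. {c.publisher}. {c.identifier}"


# Placeholder values only; this dataset does not exist.
example = DataCitation(
    creators=["Example, A.", "Sample, B."],
    title="Hypothetical Survey of Placeholder Values",
    year="2020",
    publisher="Example Data Repository",
    identifier="https://example.org/dataset/123",
    version="2.1",
)
print(format_citation(example))
```

Keeping each part in its own field makes it easy to spot which question a citation leaves unanswered before you format it.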
Data Citation Instructions
Many datasets, databases, and data resources will give you a recommended citation, or suggest how you should cite data from that source.
Sometimes this information appears on individual dataset pages. More often, the website or database where you found your data explains how to cite it in its FAQs, "About" page, or "How to Use" information.
- Dryad: each Dryad data package includes a recommended citation for both the data package itself and the academic article associated with it.
- The National Snow and Ice Data Center provides citation guidelines in their "Use and Copyright" page.
- The Roper Center's "Cite Data" link is visible at the bottom of each page in their databases. This guide to citing Roper Center data offers a recommended citation style and tells you where to find individual citations in their dataset abstracts.
- The U.S. Census Bureau's FAQs include information on how to cite data from American FactFinder.
Data Citation Tools
The APA Style Guide provides a recommended citation format for databases, with examples. Other style guides, including MLA and Chicago, do not, so you'll have to create your own.
For citation styles that do not have a specific dataset format, you can base your citation on the closest equivalent format: if the dataset is online, use a format for online items, and if it was created by multiple "authors," editors, or researchers, use a format for edited works or items with more than one author. Or simply base your citation on your style guide's general reference format.
Copyright & Use Restrictions
Copyright and Data
Under U.S. law, facts (and many collections of facts) are not protected by copyright; only original creative expressions are copyrightable. Even if someone claims that their data are copyrighted, the data themselves--the numbers or records or points of data--are not copyrightable. You can reuse those data in your own research without worrying about copyright infringement.
Copyright and Data Visualizations
Graphs, charts, and data visualizations can be more complicated. U.S. copyright law protects original works of authorship that are fixed in a tangible medium of expression. If a graph, chart, or data visualization doesn't have any originality--if the data are displayed in the only obvious way, or there are no creative choices beyond arbitrary colors in a graph--then it might not be protected by copyright at all. The more creative and original a visualization is, the more likely it is to be protected by copyright.
License and Use Agreements