It is just as important to give credit to data as to other types of publications. Providing attribution to research data makes the data easier to find and allows results to be verified and repurposed for future study.
How do I cite data and software (code)?
- If the data or code is part of a paper's supplementary material, cite the paper.
- If the data or code is associated with a paper but also exists separately (e.g., in a data repository or website), cite both the paper and the data/code separately.
- Always include enough detail in the citation to identify the exact dataset or software used, so that the work can be reproduced and credit properly assigned.
In all cases, check the source to see if the authors have indicated a preferred citation.
- Preferred citations are commonly placed in readme files or CITATION.cff files.
- If the data or code is archived in a data repository, the repository may be able to auto-generate a citation for you based on the information contained in the entry.
- Some programming environments can also auto-generate citations for you. For example, R has a citation() command that generates citations for packages.
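Checking a downloaded project for a preferred citation can be scripted. The following is a minimal Python sketch, not a standard tool: the function name `find_citation_files` and the list of file-name variants are assumptions made for illustration, and only the readme and CITATION.cff names come from the guidance above.

```python
from pathlib import Path

# File names that often carry a preferred citation. Readme and CITATION.cff
# files are named in the guidance above; the other variants listed here are
# assumptions for illustration and can be adjusted freely.
CANDIDATE_NAMES = (
    "CITATION.cff",
    "CITATION",
    "CITATION.md",
    "README",
    "README.md",
    "README.txt",
)


def find_citation_files(project_dir):
    """Return files directly inside project_dir whose names match a
    candidate name, compared case-insensitively."""
    wanted = {name.lower() for name in CANDIDATE_NAMES}
    root = Path(project_dir)
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.name.lower() in wanted
    )


if __name__ == "__main__":
    # List candidate files in the current directory for manual review.
    for path in find_citation_files("."):
        print(path)
```

The sketch only locates files that are worth reading. Whether a readme actually states a preferred citation still has to be checked by hand.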
If none of these methods yields a citation, include the following information in a data or software citation where appropriate:
- Author(s) or creator(s)
- Title
- Publisher or data repository
- Publication Year (date dataset or software was released or published)
- Identifier (DOI or other unique identifier)
- Version
- Availability or access (URL, company that can provide data or software, etc.)
- Date accessed
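As a rough illustration of how these elements combine, the following Python sketch assembles a citation string from whichever fields are available. The class `DataCitation`, the function `format_citation`, and the punctuation and ordering they use are assumptions made for this example. They do not implement any formal citation style, so follow the style your publisher or discipline requires.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataCitation:
    """Hypothetical container for the citation elements listed above."""
    authors: str
    title: str
    year: Optional[str] = None          # publication or release year
    version: Optional[str] = None
    publisher: Optional[str] = None     # publisher or data repository
    identifier: Optional[str] = None    # DOI or other unique identifier
    availability: Optional[str] = None  # URL or provider of the data/software
    accessed: Optional[str] = None      # date accessed


def format_citation(c):
    """Join the available elements into one string, skipping missing ones.
    The layout is an illustrative choice, not a formal style."""
    parts = [f"{c.authors} ({c.year})." if c.year else f"{c.authors}."]
    title = c.title
    if c.version:
        title += f" (Version {c.version})"
    parts.append(title + ".")
    for value in (c.publisher, c.identifier, c.availability):
        if value:
            parts.append(value + ".")
    if c.accessed:
        parts.append(f"Accessed {c.accessed}.")
    return " ".join(parts)


if __name__ == "__main__":
    example = DataCitation(
        authors="Sidlauskas B",
        title="Data from: Testing for unequal rates of morphological "
              "diversification in the absence of a detailed phylogeny: "
              "a case study from characiform fishes",
        year="2007",
        publisher="Dryad Digital Repository",
        identifier="doi:10.5061/dryad.20",
        accessed="August 15, 2011",
    )
    print(format_citation(example))
```

Keeping every element except author and title optional reflects the "where appropriate" wording: a citation for non-versioned software, for instance, simply omits the version.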
Examples
| Type | Citation example |
|---|---|
| Dataset | Sidlauskas B (2007) Data from: Testing for unequal rates of morphological diversification in the absence of a detailed phylogeny: a case study from characiform fishes. Dryad Digital Repository. doi:10.5061/dryad.20. Accessed August 15, 2011. |
| Tables, charts, graphs, maps or figures appearing in a publication | United States. Bureau of the Census. "Table 6. People with Income below Specified Ratios of their Poverty Thresholds by Selected Characteristics: 2009." Income, Poverty, and Health Insurance Coverage in the United States: 2009. http://www.census.gov/prod/2010pubs/p60-238.pdf. Accessed: 8/16/2011. |
| Interactive database | U.S. Geological Survey. "Geology of Colorado". Parameters: Geologic Map, Quaternary Faults, Cities and Towns. Scale 1"=75 miles. Dataset: National Atlas of the United States http://nationalatlas.gov. Accessed August 15, 2011. |
| Specific version of software w/DOI in a repository | Lewis John McGibbney, Omkar Reddy, Ibrahim Jarif, Noah Spahn, & Alex Goodman. (2018, November 30). nasa/podaacpy: Podaacpy v2.2.1 (Version 2.2.1). Zenodo. http://doi.org/10.5281/zenodo.1751973 |
| Non-versioned software, citation date corresponds to commit date | Klimowsky, K. (2018). Datahog. Accessed May 5, 2019. |
| A piece of software in general, no DOI available, w/online link | Boscher, D., Bourdarie, S., Brien, P., & Guild, T. (2008). IRBEM-LIB download. https://sourceforge.net/projects/irbem/. Accessed March 3, 2014. |
| Software available in a data archive | Lisa, M., & Bot, H. (2017). My Research Software (Version 2.0.4) [Computer software]. https://doi.org/10.5281/zenodo.1234 |
| Software not available for download | MATLAB (2018). version 9.4 (R2018a), The MathWorks Inc., Natick, Massachusetts. |
More information on citing data
- Inter-University Consortium for Political and Social Research (ICPSR)'s data citation best practices
- Data citation guidance from the publisher Taylor & Francis
Citation tools
- Citation managers comparison, including EndNote (free for UA affiliates), Zotero, and Mendeley
- CiteAs: Generate citations for data, software and many kinds of non-traditional research outputs.