Last week DataCite, the international registry of data citations, released a new tool that lets users create metadata by typing text into a quick and easy HTML form. What’s great about this tool is that it doesn’t require any software installation, and it implements the most recent version of DataCite’s metadata schema, version 3. I tried out the tool myself and found it to be quite useful. Getting started is simple: DataCite’s description page for the metadata generator links to a GitHub page. From there, you find the download option and save the linked file with a .html extension. Then you can open the HTML file in a browser and start generating metadata. I’ve included some screenshots of the tool below to give a clearer picture:
DataCite Mandatory Metadata
These are the minimum metadata elements that DataCite requires. Many libraries working with repositories or catalogues should find this useful, because identifying a standard for describing data is often difficult. Using this tool lets you develop metadata quickly and easily, without having to worry about developing your own standards. Notice too that there are “+” signs next to many of the fields, allowing for multiple entries.
DataCite Recommended Metadata Elements
These elements get into the more descriptive components of a dataset. They can be a useful starting point for describing datasets held by an institution or other organization. That being said, I believe that more metadata is necessary to adequately describe datasets. In the biomedical field, for example, many more descriptors could be added, such as the subject of study, the species, and the type of data (e.g., genetic, clinical measures). This is important to keep in mind, because the tool is limited in how well it can supply suitable data descriptors for a given discipline (e.g., engineering, social sciences, biomedicine).
DataCite Other Metadata Elements
Finally, the other elements provide a useful way to describe the format and version of a dataset. They are straightforward to fill in and round out the description of the dataset.
Now that we know what types of fields are included in the metadata tool, it is important to understand how the tool actually works. What’s great about this new tool is that as soon as you start to enter content into the fields, it automatically generates the corresponding XML metadata for you. See the figure below for an example:
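To make the mapping from form fields to XML more concrete, here is a minimal Python sketch of the same idea. It is not the generator's own code (the tool itself is an HTML form), and it covers only the five mandatory properties. The function name build_record, the sample values, and the namespace URI used for version 3 of the schema are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for version 3 of the DataCite metadata schema.
DATACITE_NS = "http://datacite.org/schema/kernel-3"


def build_record(doi, creators, title, publisher, year):
    """Build a minimal DataCite-style XML record from form-like inputs.

    Hypothetical helper: it mirrors what the form does for the mandatory
    fields only and does not validate anything against the schema.
    """
    # Serialize the DataCite namespace as the default (unprefixed) namespace.
    ET.register_namespace("", DATACITE_NS)

    def tag(name):
        return "{%s}%s" % (DATACITE_NS, name)

    resource = ET.Element(tag("resource"))

    identifier = ET.SubElement(resource, tag("identifier"),
                               {"identifierType": "DOI"})
    identifier.text = doi

    # Repeatable field: one <creator> per name, like the "+" button in the form.
    creators_el = ET.SubElement(resource, tag("creators"))
    for name in creators:
        creator = ET.SubElement(creators_el, tag("creator"))
        ET.SubElement(creator, tag("creatorName")).text = name

    titles = ET.SubElement(resource, tag("titles"))
    ET.SubElement(titles, tag("title")).text = title

    ET.SubElement(resource, tag("publisher")).text = publisher
    ET.SubElement(resource, tag("publicationYear")).text = str(year)

    return ET.tostring(resource, encoding="unicode")


if __name__ == "__main__":
    # Made-up sample values, not a real dataset or DOI.
    print(build_record("10.1234/example", ["Doe, Jane"],
                       "Example Dataset", "Example University", 2013))
```

Running the script prints a single resource element containing the identifier, creators, titles, publisher, and publicationYear children, which is roughly the shape of what the form produces as you type.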
Once you’ve populated the form, you can either select all of the generated XML and paste it into an existing XML document, or save it as an XML file of its own. This is a very handy way to generate metadata quickly and load it into a repository or catalogue with relatively little friction.
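Before loading a saved record into a repository, it can be worth checking that it is well-formed and that none of the mandatory elements were left blank. The sketch below is one way to do that in Python. The helper names, the sample record, and the namespace URI are assumptions for illustration, and the check is deliberately simple: it is not a substitute for validating against DataCite’s published schema.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for version 3 of the DataCite metadata schema.
DATACITE_NS = "http://datacite.org/schema/kernel-3"

# Top-level elements that carry the five mandatory properties.
MANDATORY = ["identifier", "creators", "titles", "publisher", "publicationYear"]


def missing_mandatory(xml_text):
    """Return the names of mandatory elements that are absent or empty."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    missing = []
    for name in MANDATORY:
        element = root.find("{%s}%s" % (DATACITE_NS, name))
        # Treat an element with no text anywhere inside it as missing.
        if element is None or not "".join(element.itertext()).strip():
            missing.append(name)
    return missing


def save_record(xml_text, path):
    """Write the record to disk only if no mandatory element is missing."""
    missing = missing_mandatory(xml_text)
    if missing:
        raise ValueError("missing mandatory elements: " + ", ".join(missing))
    with open(path, "w", encoding="utf-8") as handle:
        handle.write('<?xml version="1.0" encoding="UTF-8"?>\n' + xml_text)


# Made-up sample record, standing in for XML copied out of the form.
SAMPLE = (
    '<resource xmlns="http://datacite.org/schema/kernel-3">'
    '<identifier identifierType="DOI">10.1234/example</identifier>'
    '<creators><creator><creatorName>Doe, Jane</creatorName></creator></creators>'
    '<titles><title>Example Dataset</title></titles>'
    '<publisher>Example University</publisher>'
    '<publicationYear>2013</publicationYear>'
    '</resource>'
)

if __name__ == "__main__":
    save_record(SAMPLE, "example-record.xml")
```

A record with an empty or missing publisher, for instance, would cause save_record to raise an error naming that element instead of writing the file.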
Academic and research libraries should especially take note of the metadata generator because it serves two important purposes: 1) if institutional repositories are interested in publishing datasets, the tool provides a quick and easy way to generate metadata and streamline the process, and 2) librarians can use it to assist with data description when researchers want to publish their datasets on platforms such as Figshare or F1000 Research. As Canadian libraries move towards providing data-related services at their institutions, it is important that we become more aware of, and proficient with, these types of tools. Something as straightforward as the DataCite metadata generator is an excellent way to become familiar with how datasets are described, and to understand one way librarians can contribute to the research data management process.