5.2 Metadata I: quality through standards
5.2.1 Suggested training topics
Introduction to metadata
Metadata standards
Dublin Core (simple and qualified)
MARC family (MARC21 and MARC XML)
MODS XML
ONIX XML
JATS XML, TEI XML, Crossref XML
KBART files
OpenAIRE Guidelines
Europeana Data Model
DataCite Metadata Schema
Metadata harvesting, depositing and export
OAI-PMH: protocol and infrastructure
SWORD protocol
Mass metadata export
Controlled vocabularies
Introduction to controlled vocabularies
COAR Resource Types Vocabulary
5.2.2 Notes on the training topics
The goal of this training block is to provide an overview of the common metadata standards and to explain the methods of metadata provisioning, depositing and export. In addition, an introduction to the common controlled vocabularies and the metadata fields to which they are applied can be given. The block is thus split into the following training sections:
“Introduction to metadata” is conceived as a foundational-level topic that introduces the existing metadata standards and compares them. It focuses on the common semantic fields, such as “Creator of the resource”, “Identifier of the resource” and “License”, which all standards cover but label differently, e.g.:
Table 5: The comparison of metadata standards, based on “TRIPLE Deliverable: D2.5 - Report on Data ".
The comparison is warranted because mapping between data models is a common task for many projects that wish to share their data under an OA licence.
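The mapping task described above can be illustrated with a minimal crosswalk sketch. The element names for the three semantic fields are taken from Dublin Core, the DataCite Metadata Schema, MARC 21 and MODS; the flat path notation, the `relabel` helper and the sample record (including its DOI) are assumptions made purely for illustration, and a real crosswalk would also have to handle repeatable fields, sub-properties and value conversion.

```python
# Illustrative crosswalk: one semantic field -> its label in each standard.
# The slash/dollar path notation is a simplification chosen for this sketch.
CROSSWALK = {
    "creator": {
        "dublin_core": "dc:creator",
        "datacite": "creators/creator/creatorName",
        "marc21": "100 $a",
        "mods": "name/namePart",
    },
    "identifier": {
        "dublin_core": "dc:identifier",
        "datacite": "identifier",
        "marc21": "024 $a",
        "mods": "identifier",
    },
    "license": {
        "dublin_core": "dc:rights",
        "datacite": "rightsList/rights",
        "marc21": "540 $a",
        "mods": "accessCondition",
    },
}


def relabel(record, source, target):
    """Rename the keys of a flat record from one standard's labels to another's.

    Returns the relabelled record and the list of source keys that the
    crosswalk does not cover, so that nothing is dropped silently.
    """
    lookup = {labels[source]: labels[target] for labels in CROSSWALK.values()}
    mapped = {lookup[key]: value for key, value in record.items() if key in lookup}
    unmapped = [key for key in record if key not in lookup]
    return mapped, unmapped


# Hypothetical record with placeholder values.
dc_record = {
    "dc:creator": "Doe, Jane",
    "dc:identifier": "https://doi.org/10.1234/example",
    "dc:rights": "CC BY 4.0",
    "dc:title": "An example article",
}

mapped, unmapped = relabel(dc_record, "dublin_core", "datacite")
print(mapped)
print("Not covered by the crosswalk:", unmapped)
```

Reporting the uncovered keys separately is a deliberate choice in this sketch: in training exercises it makes visible which fields still need a mapping decision.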
The second section offers a deeper understanding of one of the descriptive metadata standards, such as Dublin Core (simple and qualified), the MARC family (MARC 21 and its expression in XML), MODS XML, ONIX XML (used by publishers and specialised in book metadata), JATS XML and TEI XML (used mostly for article contents, but containing metadata in the header), or KBART (used by knowledge bases and libraries to find out which content they have access to from a ). The OpenAIRE Guidelines and the Europeana Data Model are suggested as training topics as well: the former are recommendations for publication repositories and data archives, utilising Dublin Core, the DataCite Metadata Schema and the OAI-PMH v2.0 protocol; the latter maps its own fields to Dublin Core. The DataCite Metadata Schema is used for the correct identification of a resource for its citation and . Trainers may choose to introduce the standard that is most important in the given context.
The third thematic area is devoted to metadata harvesting, depositing and export. This advanced training will explain the key definitions of the OAI-PMH protocol for data harvesting: harvester, repository, resource, item, record, identifier, set and typical HTTP . The implementation of OAI-PMH should be part of the overall technical infrastructure set-up, including endpoints for journals and articles. The SWORD protocol works in the other direction: it enables the remote depositing of resources into Open Access repositories. Another area that can be covered is mass metadata export in one of the established formats. It can serve different purposes: on the one hand, providing a back-up for publishing software migration or updates; on the other, serving as a useful source for bibliometric studies and data mining.
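To make the OAI-PMH terms concrete, the following sketch builds a `ListRecords` request URL and extracts the identifier, set and Dublin Core fields from a response. It is a minimal illustration under stated assumptions: the endpoint `https://repository.example.org/oai`, the set name `journal-a` and the sample response are invented for this example, and no network request is made. The XML namespaces and the `verb`, `metadataPrefix` and `set` parameters are those defined by OAI-PMH v2.0.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical repository endpoint, used only for illustration.
BASE_URL = "https://repository.example.org/oai"

# Namespaces defined by OAI-PMH v2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build the URL a harvester would request to list records."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Invented sample of what a repository might return for the request above.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:article/1</identifier>
        <datestamp>2024-01-15</datestamp>
        <setSpec>journal-a</setSpec>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example article</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def parse_records(xml_text):
    """Extract header information and Dublin Core fields from each record."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        header = rec.find("oai:header", NS)
        records.append({
            "identifier": header.findtext("oai:identifier", namespaces=NS),
            "datestamp": header.findtext("oai:datestamp", namespaces=NS),
            "sets": [s.text for s in header.findall("oai:setSpec", NS)],
            "titles": [t.text for t in rec.iterfind(".//dc:title", NS)],
            "creators": [c.text for c in rec.iterfind(".//dc:creator", NS)],
        })
    return records


print(list_records_url(BASE_URL, set_spec="journal-a"))
for record in parse_records(SAMPLE_RESPONSE):
    print(record)
```

A production harvester would additionally need to follow `resumptionToken` elements for paged responses and to handle deleted records and error responses, which this sketch leaves out.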
The fourth section of this block touches on controlled vocabularies, such as the COAR Resource Types Vocabulary, which allow resources to be described uniformly. Depending on the context of application, this may be taught at a foundational level or at an in-depth advanced level.
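How a controlled vocabulary enforces uniform description can be shown with a small sketch that normalises free-text resource types to COAR concept URIs. Only three concepts from the COAR Resource Types Vocabulary are included; the `LOCAL_SYNONYMS` table and the `normalise_type` helper are hypothetical and stand in for whatever local labels a journal platform might use.

```python
# A small excerpt of the COAR Resource Types Vocabulary: label -> concept URI.
COAR_RESOURCE_TYPES = {
    "journal article": "http://purl.org/coar/resource_type/c_6501",
    "book": "http://purl.org/coar/resource_type/c_2f33",
    "dataset": "http://purl.org/coar/resource_type/c_ddb1",
}

# Hypothetical local labels mapped to vocabulary labels (an assumption of
# this sketch, not part of the COAR vocabulary itself).
LOCAL_SYNONYMS = {
    "article": "journal article",
    "research data": "dataset",
}


def normalise_type(value):
    """Return the COAR concept URI for a free-text type, or None if unknown."""
    # Lower-case and collapse whitespace so that input variants compare equal.
    label = " ".join(value.lower().split())
    label = LOCAL_SYNONYMS.get(label, label)
    return COAR_RESOURCE_TYPES.get(label)


for raw in ["Journal  Article", "article", "Research data", "poster"]:
    print(repr(raw), "->", normalise_type(raw))
```

Returning `None` for unknown values, rather than guessing, keeps unmapped types visible so that they can be reviewed by a person.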
5.2.3 Modules build-up: Metadata I - quality through standards
Table 5: Modules for the training block “Metadata I: quality through standards”
5.2.4 Training materials
Existing training materials
Table 6: Existing materials for the training block “Metadata I: quality through standards”
Training materials planned to be produced by CRAFT-OA
Table 7: Potential materials for the training block “Metadata I: quality through standards”