5.2 Metadata I: quality through standards
5.2.1 Suggested training topics
● Introduction to metadata
● Metadata standards
    ○ Dublin Core (simple and qualified)
    ○ MARC family (MARC21 and MARC XML)
    ○ MODS XML
    ○ ONIX XML
    ○ JATS XML, TEI XML, Crossref XML
    ○ KBART files
    ○ OpenAIRE guidelines
    ○ Europeana Data Model
    ○ DataCite metadata schema
● Metadata harvesting, depositing and export
    ○ OAI-PMH: protocol and infrastructure
    ○ SWORD protocol
    ○ Mass metadata export
● Controlled vocabularies
    ○ Introduction to controlled vocabularies
    ○ COAR Resource Types Vocabulary
5.2.2 Notes on the training topics
The goal of this training block is to provide an overview of the common metadata standards and to explain the methods of metadata provisioning, depositing and export. In addition, an introduction to the common controlled vocabularies and the metadata fields to which they are applied can be given. The block is thus split into the following training sections:
“Introduction to metadata” is conceived as a foundational-level topic that introduces the existing metadata standards and compares them. It focuses on the common semantic fields, such as “Creator of the resource”, “Identifier of the resource” and “License”, which all standards cover but label differently, e.g.:
| Description | Dublin Core | Qualified Dublin Core | OpenAIRE Format (JSON) | Europeana Data Model | TRIPLE data model |
| --- | --- | --- | --- | --- | --- |
| Creator of the resource | dcterms:creator | dc:creator | author/fullname | dc:creator | schema:author |
| Identifier of the resource | dcterms:identifier | dc:identifier | id | dc:identifier | schema:identifier |
| License | dcterms:license OR dcterms:rights | dc:rights | instance/license | odrl:inheritFrom, edm:rights, dc:rights | schema:license |
Table 5: The comparison of metadata standards, based on “TRIPLE Deliverable: D2.5 - Report on Data ".
The comparison is motivated by the fact that mapping data models is a common task for many projects that wish to share their data under an OA licence.
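As a minimal sketch of such a mapping, the Python snippet below renames the fields of a record from one model's labels to another's. The labels for “Creator of the resource” and “Identifier of the resource” are taken from the comparison table; the `CROSSWALK` structure, the `relabel` helper, the schema keys and the sample record are illustrative assumptions and not part of any of the standards. The licence field is left out because several models offer more than one candidate label, which a real crosswalk would have to resolve with an explicit rule.

```python
# Labels per semantic field, copied from the comparison table.
# The dictionary keys ("edm", "triple", ...) are hypothetical names chosen for this sketch.
CROSSWALK = {
    "creator": {
        "dublin_core": "dcterms:creator",
        "qualified_dublin_core": "dc:creator",
        "openaire_json": "author/fullname",
        "edm": "dc:creator",
        "triple": "schema:author",
    },
    "identifier": {
        "dublin_core": "dcterms:identifier",
        "qualified_dublin_core": "dc:identifier",
        "openaire_json": "id",
        "edm": "dc:identifier",
        "triple": "schema:identifier",
    },
}


def relabel(record, source, target):
    """Rename the fields of `record` from the `source` model's labels to the `target` model's.

    Fields that have no entry in CROSSWALK are dropped, so a real mapping
    would need to decide how to handle them.
    """
    lookup = {labels[source]: labels[target] for labels in CROSSWALK.values()}
    return {lookup[field]: value for field, value in record.items() if field in lookup}


if __name__ == "__main__":
    # Invented sample record using Europeana Data Model labels.
    edm_record = {"dc:creator": "Doe, Jane", "dc:identifier": "doi:10.1234/example"}
    print(relabel(edm_record, source="edm", target="triple"))
```

Keeping the labels in a single table-like structure means that a new model can be added as one more key per field, without touching the renaming logic.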
The second section offers a deeper understanding of one of the descriptive metadata standards, such as Dublin Core (simple and qualified), the MARC family (MARC 21 and its expression in XML), MODS XML, ONIX XML (used by publishers and specialised in book metadata), JATS XML and TEI XML (used mostly for article contents, but containing metadata in the header), and KBART (used by knowledge bases and libraries to find out which content they have access to from a ). The OpenAIRE guidelines and the Europeana Data Model are suggested as training topics as well: the former are recommendations for publication repositories and data archives that utilise Dublin Core, the DataCite metadata schema and the OAI-PMH v2.0 protocol; the latter maps its own fields to Dublin Core. The DataCite metadata schema is used for the correct identification of a resource for its citation and . Trainers may choose to introduce the standard that is most important in the given context.
The third thematic area is devoted to metadata harvesting, depositing and export. This advanced training will explain the key definitions of the OAI-PMH protocol for data harvesting: harvester, repository, resource, item, record, identifier, set and typical HTTP . The implementation of OAI-PMH should be part of the overall technical infrastructure set-up, including endpoints for journals and articles. The SWORD protocol works in the other direction: it enables the remote depositing of resources into Open Access repositories. Another area that can be covered is mass metadata export in one of the established formats. It can serve different purposes: on the one hand, it provides a back-up for publishing software migrations or updates; on the other, it is a useful source for bibliometric studies and data mining.
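To make the OAI-PMH terms more concrete, the following Python sketch shows what a harvester does in its simplest form: it builds a `ListRecords` request URL and reads the identifier, datestamp, set and Dublin Core fields from each record of the response. The verb, the `metadataPrefix` and `set` arguments and the XML namespaces come from OAI-PMH v2.0; the base URL, the helper names and the abridged sample response are assumptions made for illustration. The sketch makes no network call and does not handle resumption tokens, deleted records or errors.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespaces used by OAI-PMH v2.0 responses carrying unqualified Dublin Core.
OAI_NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build the URL of a ListRecords request; `base_url` is the repository endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # restrict harvesting to one set
    return f"{base_url}?{urlencode(params)}"


# Abridged, invented response (a real one also carries responseDate and request elements).
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:journal.example.org:article/1</identifier>
        <datestamp>2024-01-15</datestamp>
        <setSpec>articles</setSpec>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example article</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def parse_records(xml_text):
    """Extract header fields and a few Dublin Core fields from a ListRecords response."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for record in root.iterfind(".//oai:record", OAI_NS):
        header = record.find("oai:header", OAI_NS)
        records.append({
            "identifier": header.findtext("oai:identifier", namespaces=OAI_NS),
            "datestamp": header.findtext("oai:datestamp", namespaces=OAI_NS),
            "sets": [s.text for s in header.findall("oai:setSpec", OAI_NS)],
            "titles": [t.text for t in record.iterfind(".//dc:title", OAI_NS)],
            "creators": [c.text for c in record.iterfind(".//dc:creator", OAI_NS)],
        })
    return records


if __name__ == "__main__":
    print(list_records_url("https://journal.example.org/oai", set_spec="articles"))
    print(parse_records(SAMPLE_RESPONSE))
```

In a training session, the sample response can be replaced by the output of a real repository endpoint so that participants can relate each term (record, identifier, set) to the XML they see.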
The fourth section of this block touches on controlled vocabularies, such as the COAR Resource Types Vocabulary, which allow resources to be described uniformly. Depending on the context of application, this may be taught at a foundational level or at an in-depth advanced level.
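The effect of a controlled vocabulary can be illustrated with a short Python sketch that replaces free-text resource types with concept URIs and rejects anything outside the vocabulary. The three entries are a small illustrative excerpt of the COAR Resource Types Vocabulary and should be checked against the published vocabulary before use; the `normalise_resource_type` helper and its matching rule (case-insensitive, whitespace-tolerant) are assumptions made for this example.

```python
# Illustrative excerpt only; verify labels and URIs against the published COAR vocabulary.
COAR_RESOURCE_TYPES = {
    "journal article": "http://purl.org/coar/resource_type/c_6501",
    "book": "http://purl.org/coar/resource_type/c_2f33",
    "dataset": "http://purl.org/coar/resource_type/c_ddb1",
}


def normalise_resource_type(free_text):
    """Return the concept URI for a free-text resource type, or raise ValueError."""
    key = " ".join(free_text.lower().split())  # lower-case and collapse whitespace
    uri = COAR_RESOURCE_TYPES.get(key)
    if uri is None:
        raise ValueError(f"'{free_text}' is not a term of the controlled vocabulary")
    return uri


if __name__ == "__main__":
    print(normalise_resource_type("Journal Article"))
```

Rejecting unknown terms instead of storing them as entered is what keeps the resource type field uniform across records.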
5.2.3 Modules build-up: Metadata I - quality through standards
| Topic | Level | Audience | F-A-I-R |
| --- | --- | --- | --- |
| Introduction to metadata | Foundational | Editors, Reviewers, Researchers | F-A-I-R |
| Metadata standards | Advanced | Software developers, Technical professionals | F-A-I-R |
| Metadata harvesting: OAI-PMH | Advanced | Software developers, Technical professionals | F-A-I-R |
| Metadata depositing: SWORD | Advanced | Software developers, Technical professionals | F-A-I-R |
| Mass metadata export | Advanced | Software developers, Technical professionals | F-A-I-R |
| Introduction to controlled vocabularies | Foundational | Editors, Reviewers, Researchers | F-A-I-R |
| COAR Resource Types Vocabulary | Advanced | Software developers, Technical professionals | F-A-I-R |
Table 5: Modules for the training block “Metadata I: quality through standards”
5.2.4 Training materials
Existing training materials.
| Title | Creator | Comment |
| --- | --- | --- |
| OA Journals Toolkit (https://www.oajournals-toolkit.org/) | DOAJ, OASPA | The toolkit includes a variety of short articles about common issues in OA publishing. A PDF version of the toolkit is also available. It may, for example, act as a starting point and a resource for training. |
| OA Journals Toolkit: Article and journal level metadata (https://www.oajournals-toolkit.org/infrastructure/article-and-journal-metadata) | DOAJ, OASPA | This article from the OA Journals Toolkit briefly introduces different metadata standards and typical examples of collected metadata. |
| Metadata quality for publication: standards, practices, tools and actors | OPERAS/CO-OPERAS | The slides are from two workshops organised jointly by CO-OPERAS, the OPERAS Special Interest Group on standards and FAIR principles, and the OpenEdition Center. The training sessions were offered to open access journal publishers who had recently joined OpenEdition's publishing platform. |
Table 6: Existing materials for the training block “Metadata I: quality through standards”
Training materials planned to be produced by CRAFT-OA
| Title | Created in context of | Institution |
| --- | --- | --- |
| Understanding bibliographical models in journal publishing | WP3 / T3.3 | UGOE |
| Training for scientific journals on technical standards and visibility | WP3 / T3.3 | IBL PAN |
| FAIR publishing self-assessment tool | WP3 / T3.3 | AMU |
Table 7: Potential materials for the training block “Metadata I: quality through standards”