5.2 Metadata I: quality through standards

5.2.1 Suggested training topics

  1. Introduction to metadata

  2. Metadata standards

    1. Dublin Core (simple and qualified)

    2. MARC family (MARC21 and MARC XML)

    3. MODS XML

    4. ONIX XML

    5. JATS XML, TEI XML, Crossref XML

    6. KBART files

    7. OpenAire guidelines

    8. Europeana Data model

    9. Datacite metadata schema

  3. Metadata harvesting, depositing and export

    1. OAI-PMH: protocol and infrastructure

    2. SWORD protocol

    3. Mass metadata export

  4. Controlled vocabularies

    1. Introduction to controlled vocabularies

    2. COAR Resource Types Vocabulary

5.2.2 Notes on the training topics

The goal of this training block is to provide an overview of the common metadata standards and to explain the methods of metadata provisioning, depositing and export. In addition, an introduction to the common controlled vocabularies and the metadata fields to which they are applied can be given. The block is thus split into the following training sections:

  1. “Introduction to metadata” is conceptualised as a foundational level topic, which will introduce the existing metadata standards and compare them. It will focus on the common semantic fields, such as “Creator of the resource”, “Identifier of the resource”, “License”, etc., which all standards cover but provide different labels to them, e.g. :

Description

Dublin Core

Qualified Dublin Core

OpenAire Format (JSON)

Europeana Data Model

TRIPLE data model

Creator of the

resource

dcterms:creator

dc:creator

author/fullname

dc:creator

schema:author

Identifier of the resource

dcterms:identifier

dc:identifier

id

dc:identifier

schema:identifier

License

dcterms:license OR

dcterms:rights,

dc:rights

instance/license

odrl:inheritFrom

edm:rights, dc:rights

schema:license

Table 5: The comparison of metadata standards, based on “TRIPLE Deliverable: D2.5 - Report on Data ".

The comparison is substantiated by the fact that the mapping of data models is a common task for many projects, willing to share their data under the OA licence.

  1. The second section offers a deeper understanding of one of the descriptive metadata standards - like Dublin Core (simple and qualified), the MARC family (MARC 21 and its expression in XML), MODS XML, ONIX XML (used by publishers and specialised on book metadata), JATS XML and TEI XML (used mostly for article contents, but containing metadata in the header), KBART (used by knowledge bases and libraries to find out which content they have access to from a ). OpenAIRE guidelines and Europeana Data model are suggested as training topics as well: the former are recommendations for the publications repositories and data archives, utilising Dublin Core, Datacite metadata schema and OAI-PMH v2.0 protocol; the latter maps its own fields to Dublin Core. DataCite metadata schema is used for a correct identification of a resource for its citation and . Trainers may choose to introduce the standard that is most important in the given context.

  2. The third thematic area is devoted to metadata harvesting, depositing and export. This advanced training will explain the key definitions of OAI-PMH protocol for data harvesting: harvester, repository, resource, item, record, identifier, set and typical HTTP . The implementation of OAI-PMH should be a part of the overall technical infrastructure set-up, including journals and articles endpoints. The SWORD protoll works in the other direction - it enables the remote depositing of resources into the Open Access repositories. Another area which can be covered is mass metadata export in one of the established formats. It can serve different purposes: on the one hand providing back-up for the publishing software migration or updates, on the other being a useful source for the bibliometric studies and data mining.

  3. The fourth section of this block touches on controlled vocabularies, such as the COAR Resource Types Vocabulary, which allow the uniform description of the resources. Depending on the context of application this may be taught on a foundational level or on an in-depth advanced level.

5.2.3 Modules build-up: metadata i - quality through standards

Topic

Level

Audience

F-A-I-R

Introduction to metadata

Foundational

● Editors

● Reviewers

● Researchers

F-A-I-R

Metadata standards

Advanced

● Software developers

● Technical professionals

F-A-I-R

Metadata harvesting: OAI-PMH

Advanced

● Software developers

● Technical professionals

F-A-I-R

Metadata depositing: SWORD

Advanced

● Software developers

● Technical professionals

F-A-I-R

Mass metadata export

Advanced

● Software developers

● Technical professionals

F-A-I-R

Introduction to controlled vocabularies

Foundational

● Editors

● Reviewers

● Researchers

F-A-I-R

COAR Resource Types Vocabulary

Advanced

● Software developers

● Technical professionals

F-A-I-R

Table 5: Modules for the training block “Metadata I: quality through standards”

5.2.4 Training materials

  1. Existing training materials.

Title

Creator

Comment

DOAJ, OASPA

The toolkit includes a variety of short articles about common issues in OA publishing. There is also a PDF version of the toolkit available. It may, e.g. act as a starting point and a resource for training.

OA Journals Toolkit: Article and journal level metadata (https://www.oajournals-toolkit.org/infrastructure/article-and-journal-metadata)

DOAJ, OASPA

This article from the OA Journals Toolkit briefly introduces different metadata standards and typical examples of collected metadata.

Metadata quality for publication: standards, practices, tools and actors

(https://doi.org/10.5281/zenodo.7464507)

OPERAS/CO-OPERAS

The slides are from two workshops organized jointly by CO-OPERAS, the OPERAS Special Interest Group on standards and FAIR principles and the OpenEdition Center. The trainings were proposed to OpenEdition's open access journal publishers who were newly integrated on OpenEdition's publishing platform.

Table 6: Existing materials for the training block “Metadata I: quality through standards”

  1. Training materials planned to be produced by CRAFT-OA

Title

Created in context of

Institution

Understanding bibliographical models in journal publishing

WP3 / T3.3

UGOE

Training for scientific journals on technical standards and visibility

WP3 / T3.3

IBL PAN

FAIR publishing self-assessment tool

WP3 / T3.3

AMU

Table 7: Potential materials for the training block “Metadata I: quality through standards”

Last updated