Metadata guidelines

To access resources efficiently and accurately, the structured data describing a resource, i.e. its metadata, must conform to a minimum set of requirements. Although there are well-established metadata descriptions for various types of resources, the volume of literature and the variety of metadata structures can be intimidating. This document provides guidelines for creating basic metadata for digital language resources. The proposed guidelines aim only to provide a minimal set of required metadata items that makes the retrieval and browsing of digital resources possible, while keeping it relatively simple for individuals and organisations with limited knowledge and resources to create metadata records.

Because of the specialised nature of the various types of language resources that are relevant in the context of SADiLaR, it is not possible to provide a single metadata structure that applies to all language resources. The metadata fields proposed in this document combine elements from the Dublin Core Metadata Initiative (DCMI), the Text Encoding Initiative (TEI), the Open Language Archives Community (OLAC), DSpace, META-SHARE, ISOcat, and the Common Language Resources and Technology Infrastructure (CLARIN). If a metadata item is not addressed in this document, consult the resource documents and sites in the final section for more detailed metadata fields.

Metadata fields

The following tables provide the minimum set of mandatory fields that should be included in a metadata record to make language resources easily accessible and searchable. The field descriptions are mostly sourced from the NRF and DSpace, with additional information from META-SHARE and Dublin Core.

Mandatory fields

Field                 Short description
Title                 Title statement/title proper.
Author/Creator        Author(s) of the work.
Date issued           Date of publication or distribution.
Subject/Keywords      The topic of the resource.
Language              ISO 639-1/2 standard code for language of intellectual content.
Publisher             Entity responsible for publication, distribution, or imprint.
Description           A short description of the resource, which could be the abstract or table of contents.
Contact person name   Name of person with more information on the resource.
Contact person email  Contact person’s email address.
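
As an illustration of how the mandatory fields might be checked before a record is submitted, the following Python sketch represents a record as a plain dictionary and reports any mandatory field that is missing or empty. The key names (such as `date_issued`), the function name `missing_mandatory_fields`, and the sample record are assumptions made for this example; they are not prescribed by SADiLaR or by any of the standards mentioned above.

```python
# Hypothetical key names for the mandatory fields listed above.
MANDATORY_FIELDS = [
    "title",
    "creator",
    "date_issued",
    "subject",
    "language",
    "publisher",
    "description",
    "contact_name",
    "contact_email",
]


def missing_mandatory_fields(record):
    """Return the mandatory field names that are absent or empty in record."""
    missing = []
    for field in MANDATORY_FIELDS:
        value = record.get(field)
        # None, blank strings and empty lists all count as missing.
        if value is None:
            missing.append(field)
        elif isinstance(value, str) and not value.strip():
            missing.append(field)
        elif isinstance(value, (list, tuple)) and len(value) == 0:
            missing.append(field)
    return missing


# Illustrative record with invented values; the contact email is omitted on purpose.
example_record = {
    "title": "Example isiZulu text corpus",
    "creator": ["Example Author"],
    "date_issued": "2020-01-31",
    "subject": ["corpus", "isiZulu"],
    "language": "zu",
    "publisher": "Example Publisher",
    "description": "A small example corpus used to illustrate the record layout.",
    "contact_name": "Example Contact",
}

print(missing_mandatory_fields(example_record))
```

Run as written, the sketch prints `['contact_email']`, since that is the only mandatory key absent from the example record.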

Additional common metadata fields

The following is a selection of commonly used metadata fields that make the metadata more useful and make it easier to search and filter items. Where possible, these fields should also be included in metadata records.

Field            Short description
Contributor(s)   A person, organization, or service responsible for the content of the resource.
Format           The format of the resource, such as XML, text, or docx.
Medium           Physical medium of the resource.
Size/Extent      Size or duration of the resource.
URL              A URL used as the homepage of an entity (e.g. a person, organization, or resource) and/or where an entity (e.g. a language resource or document) is located.
Date created     Date of creation or manufacture of intellectual content if different from Date issued.
Date copyright   Date of copyright.
License/Rights   Terms governing use and reproduction.
Identifier       A standard identifier such as ISBN, ISSN, ISMN, or ISLRN.
Citation         Human-readable, standard bibliographic citation.
Rights holder    A person or organization owning or managing rights over the resource.
Version/Edition  The specific version or edition of the original resource.
Description      A short description of the resource, which could be the abstract or table of contents.
Location         The place where the resource was produced.
Country          The country where the resource was produced.
Region           The state/province where the resource was produced.
City             The city where the resource was produced.
Coverage         The spatial or temporal coverage of the content.
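
Several of the fields above correspond to simple Dublin Core elements, so a record can be serialised in a form that other repositories recognise. The sketch below is one possible way to do this with Python's standard library, and it makes several assumptions: the dictionary keys, the `FIELD_TO_DC` mapping, the function name `record_to_dc_xml`, and the wrapping `metadata` element are all chosen for illustration only. Fields such as Date created or Rights holder correspond to qualified Dublin Core terms rather than the simple element set and are left out of this sketch.

```python
import xml.etree.ElementTree as ET

# Namespace of the simple Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Hypothetical record keys mapped to simple Dublin Core element names.
FIELD_TO_DC = {
    "title": "title",
    "creator": "creator",
    "contributor": "contributor",
    "subject": "subject",
    "language": "language",
    "publisher": "publisher",
    "description": "description",
    "format": "format",
    "rights": "rights",
    "identifier": "identifier",
    "coverage": "coverage",
}


def record_to_dc_xml(record):
    """Serialise the mapped fields of record as Dublin Core XML (a sketch)."""
    root = ET.Element("metadata")  # wrapper element name is an assumption
    for key, dc_name in FIELD_TO_DC.items():
        if key not in record:
            continue
        values = record[key]
        # Accept either a single string or a list of strings per field.
        if isinstance(values, str):
            values = [values]
        for value in values:
            element = ET.SubElement(root, "{%s}%s" % (DC_NS, dc_name))
            element.text = value
    return ET.tostring(root, encoding="unicode")


# Invented values, for illustration only.
print(record_to_dc_xml({
    "title": "Example isiZulu text corpus",
    "contributor": ["Example Annotator", "Example Editor"],
    "format": "XML",
    "language": "zu",
}))
```

Repeatable fields such as Contributor(s) are written as one element per value, which is the usual Dublin Core convention for multiple values.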

 

Language codes

Language name     ISO 639-1  ISO 639-2
Afrikaans         af         afr
English           en         eng
isiNdebele        nr         nbl
isiXhosa          xh         xho
isiZulu           zu         zul
Sesotho           st         sot
Sesotho sa Leboa  (none)     nso
Setswana          tn         tsn
SiSwati           ss         ssw
Tshivenḓa         ve         ven
Xitsonga          ts         tso
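
The codes above can be kept in a small lookup table so that the Language field is filled in consistently. The following Python sketch is one way to do this; the names `LANGUAGE_CODES` and `normalise_language_code` are assumptions made for the example. Sesotho sa Leboa is given `None` in the two-letter position because ISO 639-1 does not assign it a two-letter code; `nso` is its three-letter ISO 639-2 code.

```python
# Language name -> (ISO 639-1 code, ISO 639-2 code).
LANGUAGE_CODES = {
    "Afrikaans": ("af", "afr"),
    "English": ("en", "eng"),
    "isiNdebele": ("nr", "nbl"),
    "isiXhosa": ("xh", "xho"),
    "isiZulu": ("zu", "zul"),
    "Sesotho": ("st", "sot"),
    "Sesotho sa Leboa": (None, "nso"),  # no two-letter ISO 639-1 code
    "Setswana": ("tn", "tsn"),
    "SiSwati": ("ss", "ssw"),
    "Tshivenḓa": ("ve", "ven"),
    "Xitsonga": ("ts", "tso"),
}


def normalise_language_code(code):
    """Return the ISO 639-2 code for a listed two- or three-letter code, else None."""
    code = code.strip().lower()
    for iso1, iso2 in LANGUAGE_CODES.values():
        if code == iso2 or (iso1 is not None and code == iso1):
            return iso2
    return None


print(normalise_language_code("zu"))   # zul
print(normalise_language_code("NSO"))  # nso
print(normalise_language_code("xx"))   # None
```

Normalising to the three-letter code is a design choice for this sketch, made because every language in the table has one.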

Established metadata initiatives

More information, including extended descriptions of fields and additional possible fields, is available from the following resources: