Tuesday, 27 October 2009

Consistency of Metadata

I am a strong believer in the sharing and interoperability of metadata, so I advocate descriptive standards for both the syntax of elements and their content. I decided to use controlled vocabularies for my collection; most of them are well established and time-tested within cultural and heritage institutions, such as the Thesaurus for Graphic Materials and the Library of Congress Subject Headings. However, for the style element, which is a VRA-inspired extension of the DCTerms set, I use a short local list that was prepared for the collection.

Subject headings are not the only thing I try to control in order to minimize errors and typos. In all tested applications I tried to set up a drop-down menu for the languages used within my collection, because terms like Yiddish or Lithuanian can cause difficulties even for experienced catalogers.
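Where an application does not offer a drop-down menu, the same control can be approximated by checking each value against the list before it is saved. The following is only a minimal sketch: the function name and the list itself are hypothetical (only Yiddish and Lithuanian come from my collection as mentioned above; the other entries are placeholders).

```python
# Hypothetical controlled list; only "Yiddish" and "Lithuanian" are taken
# from the post, the remaining entries are placeholders for illustration.
ALLOWED_LANGUAGES = ("English", "Hebrew", "Lithuanian", "Polish", "Yiddish")


def normalize_language(value, allowed=ALLOWED_LANGUAGES):
    """Return the canonical spelling of a language term.

    Matching ignores case and surrounding whitespace. A value that is not
    in the controlled list raises ValueError instead of being stored.
    """
    lookup = {term.casefold(): term for term in allowed}
    key = value.strip().casefold()
    if key not in lookup:
        raise ValueError(
            f"{value!r} is not in the controlled list: {', '.join(allowed)}"
        )
    return lookup[key]
```

A call such as normalize_language(" yiddish ") returns the canonical "Yiddish", while a misspelling such as "Lithuanain" is rejected rather than silently saved.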

In order to make the retrieval of metadata functional and effective, I try to keep the values of metadata fields simple and relatively short, so that the data displays properly and does not conflict with the layout of the page. Shorter entries are also easier for users to scan on the result screen.

One of the difficulties I face is that the collection is relatively small and thematically dispersed, so it is hard to come up with good browsing categories. I therefore tried to choose broader access terms rather than specific ones.

Tuesday, 13 October 2009

OAI-PMH and Benefits of DC

For a while now I have been wondering about metadata interoperability. Günter Waibel and Mary W. Elings demonstrated* that interoperability is possible even if different communities use different metadata standards or, more in the spirit of the article, even if different materials are described by different standards. OAI-PMH is essential for this type of interoperability, but OAI-PMH is just a tool, a protocol for exchange or sharing; what actually makes the exchange possible is Dublin Core.

I was never a big fan of Dublin Core, whether in its qualified or unqualified form. I always suspected that the effort to generalize the concept of description and detach it from the material being described did not bode well for practices in cultural and heritage institutions. However, a title is undeniably a title, whether it is the title of a book, of a painting or of an archival artefact. Depending on the descriptive standard, the title does not always have to be constructed the same way and look the same, but basically the concept is understandable - the title is that property under which an object is usually known. Once I accepted this truism, my opposition to DC as an intermediary layer became less intense.

OAI-PMH was one reason I changed my mind, but the other was DigiTool and its mapping file harvesting_schema, which is based on a modified, extended qualified DC and which effectively manages to channel data from various metadata standards into descriptive facets that are then presented to the user in resource discovery. There is little chance that the user will recognize the native format of the metadata. Some residual delimiters may give away a MARC record, but content-wise the harvesting_schema allows for a lot of flexibility. It is also extensible, so one is not bound by the DCTerms set.
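To illustrate the general idea of channelling different native formats into shared DC-style facets, here is a rough sketch. It is not the content or syntax of DigiTool's harvesting_schema, which I am not reproducing here; the crosswalk tables, the flat record structure, the function name and the sample records are all made up for the example. The MARC tags and MODS paths follow the commonly used pairings (245 and titleInfo/title for title, and so on).

```python
from collections import defaultdict

# Illustrative crosswalks only; NOT the contents of harvesting_schema.
CROSSWALKS = {
    "marc": {"245": "title", "100": "creator", "650": "subject", "260": "publisher"},
    "mods": {
        "titleInfo/title": "title",
        "name/namePart": "creator",
        "subject/topic": "subject",
    },
}


def to_dc(native_format, record):
    """Map a flat {field: [values]} record onto DC element names."""
    mapping = CROSSWALKS[native_format]
    dc = defaultdict(list)
    for field, values in record.items():
        element = mapping.get(field)
        if element is None:
            continue  # fields without a mapping are dropped in this sketch
        dc[element].extend(values)
    return dict(dc)


# Made-up sample records describing the same hypothetical item.
marc_record = {
    "245": ["Street scene"],
    "650": ["Streets", "Markets"],
    "300": ["1 photograph"],  # no mapping above, so it is ignored
}
mods_record = {"titleInfo/title": ["Street scene"], "subject/topic": ["Streets"]}

marc_as_dc = to_dc("marc", marc_record)
mods_as_dc = to_dc("mods", mods_record)
```

Both records end up with the same "title" facet, and nothing in the result reveals which native format a value came from.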

When it comes to description, I am in favour of MODS, and I can live with MARC as well, but I have come to appreciate that there is a lightweight DC somewhere out there. And I am glad that we can make metadata available through OAI-PMH in both formats in addition to DC elements or, more precisely, OAI_DC.
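From the harvester's side, choosing a format comes down to the metadataPrefix argument of an OAI-PMH request. The sketch below only builds the request URLs. The base URL is a hypothetical placeholder, and the prefixes "mods" and "marc21" are assumptions: oai_dc is the one prefix every repository must support, while the others are defined by each repository and can be discovered with the ListMetadataFormats verb.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; replace with the repository's real OAI-PMH base URL.
BASE_URL = "https://repository.example.org/oai"


def list_records_url(metadata_prefix, base_url=BASE_URL, set_spec=None):
    """Build a ListRecords request URL for the given metadata format."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # optional selective harvesting by set
    return f"{base_url}?{urlencode(params)}"


# "oai_dc" is mandated by the protocol; "mods" and "marc21" are assumed
# prefixes that a particular repository may or may not expose.
urls = [list_records_url(prefix) for prefix in ("oai_dc", "mods", "marc21")]
```

The same records are requested each time; only the format in which the repository serializes them changes.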

*Metadata for All: Descriptive Standards and Metadata Sharing across Libraries, Archives and Museums by Mary W. Elings and Günter Waibel. First Monday, volume 12, number 3 (March 2007), URL: http://firstmonday.org/issues/issue12_3/elings/index.html (Accessed on 2009-10-13)