MMA Version of MARC21 Elements and Vocabularies in RDF

Just posted by Diane Hillmann over at Metadata Matters.

Metadata Management Associates (MMA) is pleased to announce a new resource for the library community: The MMA version of MARC 21 elements and vocabularies in RDF, hosted by the Open Metadata Registry (OMR). Given the need for libraries to move beyond the MARC standard, and the desire for increased innovation as we move toward a successor, we felt this was a good time to make this data available to all. The MMA version of MARC 21 in RDF has been exclusively developed by Metadata Management Associates, and has not, as yet, been shared with the Library of Congress, although the recent announcements on MARC made it even more desirable to move in this direction.

So why did we spend the time to do this, given the intention to leave MARC behind? Most importantly, we want to make it easier for innovators to ‘play’ with this data, and to have URIs to use when they do so. We ourselves want to include MARC21 elements in our research into semantic mapping between bibliographic namespaces, and to inform proposed work on representing UNIMARC in RDF. We also want to provide inspiration for those eager to experiment with the issues around the transition from MARC 21 to the new environment of RDA and other bibliographic standards, and trust that the reassurance that it WILL happen–and is not rocket science–will help everyone interested in participating in that future.

What you see in the OMR now is what we’re calling ‘Level0’, the most basic loss-less way to transition MARC 21 data into the Resource Description Framework (RDF). It does not reflect the layers we think ought to be added on top:

  • Level1 may contain properties that gather sub-properties with similar semantics from Level0.
  • Level2 may contain properties and classes that represent aggregated statements composed of sub-properties from Level0 and Level1.
  • Level3+ can contain properties and classes that represent broader-level semantics and provide equivalence mappings to other namespaces for bibliographic metadata, such as Dublin Core terms, ISBD, and RDA.
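
To make the layering idea a little more concrete, here is a minimal Python sketch of how a Level1 property might gather Level0 sub-properties. It is only an illustration under stated assumptions: the `example.org` namespaces, the `title` property, and the sample record are all hypothetical and are not the URIs registered in the OMR. The only real identifier is `rdfs:subPropertyOf`. Triples are modeled as plain tuples so that no RDF library is needed.

```python
# Hypothetical namespaces for illustration only; these are NOT the OMR URIs.
LEVEL0 = "http://example.org/marc21/level0/"
LEVEL1 = "http://example.org/marc21/level1/"
RECORD = "http://example.org/record/1"  # made-up record identifier

# The real RDFS property used to relate a property to a broader one.
SUB_PROPERTY_OF = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"

# Level0: one property per MARC tag/subfield, so no detail is lost.
# The values are invented sample data.
level0_triples = [
    (RECORD, LEVEL0 + "245a", "Example title"),          # 245 $a: title
    (RECORD, LEVEL0 + "246a", "Example variant title"),  # 246 $a: varying form of title
    (RECORD, LEVEL0 + "260b", "Example publisher"),      # 260 $b: name of publisher
]

# Level1 (assumed): a broader "title" property gathering similar Level0 properties.
schema_triples = [
    (LEVEL0 + "245a", SUB_PROPERTY_OF, LEVEL1 + "title"),
    (LEVEL0 + "246a", SUB_PROPERTY_OF, LEVEL1 + "title"),
]


def infer_super_properties(data, schema):
    """Return the extra triples implied by one level of rdfs:subPropertyOf."""
    parents = {}
    for child, predicate, parent in schema:
        if predicate == SUB_PROPERTY_OF:
            parents.setdefault(child, []).append(parent)

    inferred = []
    for subject, prop, value in data:
        for parent in parents.get(prop, []):
            inferred.append((subject, parent, value))
    return inferred


level1_triples = infer_super_properties(level0_triples, schema_triples)
for triple in level1_triples:
    print(triple)
```

The Level0 statements stay untouched; the broader statements are derived on top of them, which is one way to read "layers added on top". The sketch follows only a single step of `rdfs:subPropertyOf`; a real reasoner would also follow chains of sub-properties.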

Special thanks to Karen Coyle, whose work on the analysis of MARC 21 (most recently seen in Code4Lib Journal) inspired us greatly and whose questions pushed us in some important directions. Karen’s insights and ideas will be far more visible in Level1 and subsequent levels, when we build on the basics.

This initiative is also intended to inform the proposal to develop an RDF representation of MARC 21’s cousin, UNIMARC, presented at IFLA this year.

Please note that not all MARC21 elements are currently represented in the OMR. Some of the lesser-used tags in 00X-8XX have not yet been registered, and we are still looking into 76X-78X Linking entry fields.

We welcome feedback on this effort. Information on errors and specific issues is best communicated via the “FEEDBACK” links on all OMR pages. We’re happy to participate in discussions on the DCMI/RDA Task Group discussion list (DC-RDA@jiscmail.ac.uk), or anywhere else–just be sure we know that you’ve posted something, and we’ll respond.

Those of you headed for the DC-2011 conference are likely to see us presenting information on our new mapping initiative, and demonstrating how the MARC21 data in RDF supports useful mapping from MARC21 to RDA. We’ll post slides and links after that conference.

Metadata Management Associates
Diane Hillmann
Jon Phipps
Gordon Dunsire

Linked Data In Action

Richard Wallis recently posted a presentation he gave on linked data on his blog, Nodalities. I’ve heard Richard speak at Code4Lib, and he and Talis are doing some extraordinary work with linked data and Web 2.0 technologies. You can view the slideshow, or go to Nodalities to listen to Richard talk about linked data in action.

RDA’s Controlled Vocabulary

Lynne LeGrow recently posted a piece on her blog, Cataloging Aids, about the controlled vocabulary that RDA uses for carrier type. The number of options involved with RDA can seem daunting at first.

Some options for controlled vocabularies are:

The best place I have found so far to get a list of all these different types is the Metadata Registry from the National Science Digital Library (NSDL).

From their webpage, NSDL explains that:

The NSDL Metadata Registry is a fundamental piece of technical infrastructure for the Semantic Web. While originally built to support the National Science Digital Library (NSDL), the Registry is available openly and to all who wish to use its services.

The Registry provides a means to identify, declare and publish through registration metadata schemas (element/property sets), schemes (controlled vocabularies) and Application Profiles (APs). In addition to supporting registration of schemes, schemas and APs for consumption and use by human and machine agents, the NSDL Registry will support the machine mapping of relationships among terms and concepts in those schemes (semantic mappings) and schemas (crosswalks). Thus, the Registry will support the key goals of metadata discovery, reuse, standardization and interoperability locally and globally.

The Registry used as its inspiration the open-source Dublin Core Metadata Initiative (DCMI) Registry. The Registry extended the original DCMI goals to support: (1) the automated creation and maintenance of schemas and application profiles; and (2) the submission of schemas and schemes to a registry workflow for review and publication. All of the development work leverages the latest knowledge and standards for networked knowledge organization systems, schema and application profile declaration, and registry development.

The NSDL Metadata Registry project was funded by the National Science Foundation for its first three years. It is currently managed by Metadata Management Associates, a consulting partnership committed to maintaining the Registry as an open system.

This project goes beyond just providing lists. You can subscribe to changes and additions to the Registry by adding its feed to your preferred feed reader. The latest changes all deal with RDA. Each list shows its namespace, the controlled vocabulary terms and their URIs, its history, and who maintains it. The Registry also offers views of each list in RDF or as an XML schema. Here’s an example, the schema for media type:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:annotation>
        <xs:documentation xml:lang="en">
            RDA Media Type XML Schema
            XML Schema for http://RDVocab.info/termLIst/RDAMediaType namespace
            Date created: 2008-05-25 06:35:45
            Date of last update: 2008-05-25 06:35:45
            Based on RDA Media Type table 3.1 in the final draft of RDA dated Oct. 31, 2008.
            Further information about this Vocabulary is available at http://RDVocab.info/termLIst/RDAMediaType.html
        </xs:documentation>
    </xs:annotation>

    <xs:simpleType name="DCMIType">
        <xs:restriction base="xs:string">
            <xs:enumeration value="Audio"/>
            <xs:enumeration value="Computer"/>
            <xs:enumeration value="Microform"/>
            <xs:enumeration value="Microscopic"/>
            <xs:enumeration value="Projected"/>
            <xs:enumeration value="Stereoscopic"/>
            <xs:enumeration value="Unmediated"/>
            <xs:enumeration value="Video"/>
        </xs:restriction>
    </xs:simpleType>
</xs:schema>
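
As a rough illustration of how a schema view like this could be used, the following Python sketch reads the enumeration out of a trimmed copy of the media type schema and checks a term against it. It is a sketch under assumptions, not something the Registry itself provides: the function names `allowed_values` and `is_valid_media_type` are made up for this example, and the embedded schema text keeps only the enumeration part and declares the standard XML Schema namespace so that it parses on its own.

```python
import xml.etree.ElementTree as ET

# The standard XML Schema namespace, needed to resolve the "xs:" prefix.
XS = "http://www.w3.org/2001/XMLSchema"

# Trimmed copy of the media type schema: only the enumeration is kept.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="DCMIType">
    <xs:restriction base="xs:string">
      <xs:enumeration value="Audio"/>
      <xs:enumeration value="Computer"/>
      <xs:enumeration value="Microform"/>
      <xs:enumeration value="Microscopic"/>
      <xs:enumeration value="Projected"/>
      <xs:enumeration value="Stereoscopic"/>
      <xs:enumeration value="Unmediated"/>
      <xs:enumeration value="Video"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>"""


def allowed_values(schema_text):
    """Return the enumeration values in document order."""
    root = ET.fromstring(schema_text)
    return [e.get("value") for e in root.iter(f"{{{XS}}}enumeration")]


def is_valid_media_type(term, schema_text=SCHEMA):
    """Check a term against the controlled vocabulary (case-sensitive)."""
    return term in allowed_values(schema_text)


print(allowed_values(SCHEMA))
print(is_valid_media_type("Audio"))      # a term from the list
print(is_valid_media_type("Cassette"))   # not a media type term
```

The check is deliberately case-sensitive, matching how `xs:enumeration` compares string values; a cataloging tool might prefer to normalize input before checking.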


Even though the number of RDA lists can seem overwhelming, take a look anyway. This is a powerful tool with loads of useful information about RDA controlled vocabularies.

Linked Data and the Cloud

Paul Miller recently posted on whether linked data needs RDF (the Resource Description Framework). Basically, Paul says that RDF is a good thing and that linked data is more powerful with it.

However, he writes:

The problem, I contend, comes when well-meaning and knowledgeable advocates of both Linked Data and RDF conflate the two and infer, imply or assert that ‘Linked Data’ can only be Linked Data if expressed in RDF.

Paul goes on to summarize his point of view and explain why he sees the insistence that linked data be expressed only in RDF as a problem. He also covers the controversies surrounding linked data and RDF in some detail, and lists resources at the end of his post.

Following the linked data discussion from the sidelines, it sounds like an amazing stage in the development of the web. This article highlights the disagreements over how to reach that next stage, and I found it helpful in learning about the controversies surrounding RDF and linked data.

Leigh Dodds on Linked Data

Over at the Nodalities blog, Leigh Dodds posted highlights from his presentation on linked data.

I opened by speaking about the fundamental idea behind Linked Data: that data be put online, in a very fine-grained way. This takes us beyond having stable links for datasets or just articles, and yields web identifiers for the Who, Why, What, Where and When of the content: every person; place; category; and event can each be identified, annotated and ultimately linked together into a navigable whole. RDF, as the core technology for Linked Data, is very simple to get to grips with, with the notion of resources and their connections being something that anyone can intuitively grasp in a few minutes.
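
To picture the "resources and their connections" idea from the quote above, here is a small Python sketch that treats each statement as a (subject, predicate, object) triple and then follows the links between resources. Everything in it is invented for illustration: the `example.org` identifiers, the property names, and the helper functions `build_index` and `reachable` are assumptions, not part of Leigh's presentation or of any RDF library.

```python
from collections import defaultdict

EX = "http://example.org/"  # hypothetical namespace for the example

# Each triple links one identified resource to another: who, where, what.
triples = [
    (EX + "event/launch", EX + "prop/organisedBy", EX + "person/alice"),
    (EX + "event/launch", EX + "prop/heldAt", EX + "place/london"),
    (EX + "person/alice", EX + "prop/livesIn", EX + "place/london"),
]


def build_index(statements):
    """Group outgoing links by subject so they can be followed quickly."""
    index = defaultdict(list)
    for subject, predicate, obj in statements:
        index[subject].append((predicate, obj))
    return index


def reachable(index, start):
    """Return every resource that can be reached by following links from start."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        for _predicate, obj in index.get(node, []):
            stack.append(obj)
    return seen


index = build_index(triples)
for resource in sorted(reachable(index, EX + "event/launch")):
    print(resource)
```

Because every person, place, and event has its own identifier, starting from the event is enough to navigate to everything connected to it.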

He also makes a brief foray into verifying sources and checking the quality of data.

The ability to identify and ignore questionable sources, or identify stories that are drawn from inaccurate data or analyses, is something that has previously been very hard to do.

It will be interesting to follow the progress of linked data to see if it can live up to this promise of data quality control.

This is a good detour, and Leigh has included links to his PowerPoint presentation and some other linked data resources.

RDFa Podcast

Paul Miller over at Talis recently released a podcast of his interview with Mark Birbeck about RDFa.

Topics discussed include Drupal, Dublin Core, and Linked Data, among many others. The complete interview runs about 52 minutes.

Don’t know what RDF and RDFa are? This is a good place to learn about them and about how to get metadata out onto the web in user interfaces.
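
For readers who have not seen RDFa, the rough idea is that metadata rides along in attributes of ordinary HTML. The Python sketch below shows this with a made-up page fragment and a very simplified reader that collects `property` attributes and their values. It is an assumption-laden illustration, not a conforming RDFa processor: the sample HTML, the `example.org` address, and the `PropertyCollector` and `extract_properties` names are all invented here, and prefix resolution, nesting, and most of the RDFa rules are ignored.

```python
from html.parser import HTMLParser

# Made-up page fragment: Dublin Core terms carried in RDFa-style attributes.
SAMPLE_HTML = """
<div xmlns:dc="http://purl.org/dc/elements/1.1/" about="http://example.org/book/1">
  <span property="dc:title">Example Title</span>
  <span property="dc:creator">Example Author</span>
  <meta property="dc:date" content="2009-01-01" />
</div>
"""


class PropertyCollector(HTMLParser):
    """Collect (property, value) pairs; a simplification of real RDFa parsing."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._open_property = None  # property whose text is being collected
        self._open_tag = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        prop = attributes.get("property")
        if prop is None:
            return
        if "content" in attributes:
            # The value is given explicitly in the content attribute.
            self.pairs.append((prop, attributes["content"]))
        else:
            # The value is the element's text; collect it until the end tag.
            self._open_property = prop
            self._open_tag = tag
            self._text = []

    def handle_data(self, data):
        if self._open_property is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._open_property is not None and tag == self._open_tag:
            self.pairs.append((self._open_property, "".join(self._text).strip()))
            self._open_property = None
            self._open_tag = None


def extract_properties(html):
    parser = PropertyCollector()
    parser.feed(html)
    parser.close()
    return parser.pairs


for name, value in extract_properties(SAMPLE_HTML):
    print(name, "=", value)
```

The same markup still renders as a normal page for people, while a program can pull out the title, creator, and date without scraping the visible text.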

