Linked Data In Action

Richard Wallis recently posted on his blog, Nodalities, a presentation he gave on linked data. I’ve heard Richard speak at Code4Lib, and he, along with Talis, is doing some extraordinary work in linking data and web 2.0 technologies. You can view the slideshow below or go to Nodalities to listen to Richard talk about linked data in action.

NGC4LIB on Tim Berners-Lee and the Semantic Web

The Next Generation Catalog for Libraries listserv (NGC4LIB) has been extremely busy this last month. Three discussions really stand out: FRBR’s Group 1 entities and what types of identifiers are associated with them, in particular works, expressions, and manifestations; Tim Berners-Lee and the Semantic Web; and FRBR’s user tasks and their continued relevance.

Unlike those of some listservs, these threads can be read in their entirety online. William Denton’s FRBR blog, as well as some others, has already advertised this to the community. I would like to re-advertise these discussions because of their importance in understanding FRBR and RDA, among other things. In doing so, I would like to highlight some points from these threads. I will do this in a series of three posts: one on identifiers, a second on user tasks, and a last one on the Tim Berners-Lee thread, which is still very active on NGC4LIB. I posted last week on FRBR and identifiers. This post will be about the thread on Tim Berners-Lee and the Semantic Web.

Don’t let the title of the thread fool you. This discussion went everywhere! Replies continuing the discussion were still arriving on the listserv this past week, so remember to look at the archives for both October and November. The thread began with the posting of a recent talk by Sir Tim Berners-Lee at: http://fora.tv/2009/10/08/Next_Decade_Technologies_Changing_the_World-Tim-Berners-Lee. This 38-minute interview was the starting point for a very rich discussion about data, how to identify data, and how to get data out on the web to be used and re-used.

Here are some points that I found of particular interest:

  • How do we get data out on the web? Will RDA help get the data out on the web?
    • This is an excellent question that comes up several times in this thread. It also came up in the recent OCLC webinar on what they are doing about RDA. For the most part, library data is stored away in catalogs that are not really being mined or searched by search engines. The wealth of information is there, but it is not in a web-friendly format. As Jim Weinheimer and others pointed out, it is essential to get that information out there. However, it was also mentioned that the Library of Congress’s library data and the Internet Archive’s data are already out on the web. So why aren’t people looking at it? Isn’t RDA, through its relation to the Semantic Web, supposed to help not only get library data out there but also get people looking at it? These questions never really received formal answers. Yet it was interesting to follow the trail of people’s thoughts. Yes, we need to get library data out there. RDA will theoretically help us do this because of its relationship to the Semantic Web. However, will this incite users to come and look at this data?
  • RDA and user tasks
    • The discussion about RDA and the web led to the question of the relevance of the user tasks that RDA brought over from FRBR. These user tasks are: find, identify, select, obtain. FRBR was published in 1998, more than a decade ago. Do these tasks represent what users do when searching for information? Remember that FRAD, FRBR’s sibling for authority work, has slightly different user tasks. What does this say about user tasks?
  • The idea of a domain model (RDFS, OWL ontology, RDF)
    • According to some on the thread, one of the reasons that RDA will be useful for the Semantic Web is that it is based on a domain model that can be expressed as an RDFS/OWL ontology. From the thread, this domain model is important because it provides a framework to work from. Even if this framework contains flaws, it is still a helpful one that can evolve as the web evolves, since it is tied to the language used on the web. This also means that RDA will evolve with the web.
  • Multiple meanings of FRBR and RDA: FRBR, RDA, RDAonline
    • This was a very interesting post by Karen Coyle. I think she highlighted a huge problem. With all the discussions about RDA, FRBR, and the RDA product that is going to be published sometime in the future, a multitude of interpretations of these concepts has arisen. Karen was right to point out that we will have 3 things: FRBR, RDA, and RDAonline. These are 3 different things that serve 3 different purposes. In addition, we have to remember that RDA has inherited many legacy issues from AACR2. This is one of the reasons why RDA is criticised by some as not going far enough. To make matters more confusing, there is also the Metadata Registry, which is related to RDA, RDAonline, and FRBR but is its own enterprise with its own mission.
  • OLAC and WEMI
    • The audio-visual community has had trouble with the notions of work, expression, manifestation, and item for quite some time now. Until this thread, I really hadn’t found a good explanation of why, or of what OLAC planned to do about it. Kelly C. McGrath wrote about OLAC’s position on Wed., Oct. 21, 2009. She was responding to a point about the importance of the WEMI (work, expression, manifestation, and item) model as a good starting point. Kelly writes: “We are trying to take a practical approach. At a theoretical level, the four levels as defined by FRBR make a lot of sense (although if you, for example, include expressions of expression, it would seem you could have even more levels). However, when we came to recording things on different records, it quickly became apparent that the split between Work and Expression (e.g., things like color, aspect ratio, and costume designer only at the Expression level) was not very workable for us. We therefore settled on a model that uses primarily a Work/Primary Expression (usually the original public release if applicable) record and a Manifestation record. Information like color and aspect ratio of a DVD in hand are meaningless unless you know the original, intended values. So a 1:33 (full screen) DVD Manifestation of a TV program that was originally broadcast in 1:33 and of a film that was in 2.66:1 don’t mean the same thing to the purist. The purist would be happy with the former and not with the latter modified version. So we want to record the original, intended value at the Work level so that it can be compared with particular Expressions. We thought it was most practical to have the information that we intended to re-use for all instances in a single record.

      It is also in line with the way film reference sources and online databases like IMDB display information. We also think, from a practical perspective, that most Expression information can be coded in machine-interpretable form in the Manifestation record and a display of Expressions could be generated automatically. Every time a cataloger gets a new Manifestation, this information has to be reevaluated again. Moving image expressions tend to be multi-faceted so looking for an Expression record for the exact combination in hand could be time-consuming and finding expression records for each individual aspect is no better than just encoding the characteristics in the manifestation record.

      We don’t think a colorized version of a film is a new Work. Rather we would call it a new Expression and record it in the Manifestation record in such a way that it will be obvious to the user that the color of this version has been modified.

      It is also not clear to me that the hierarchical approach of choosing a work, then an expression, then a manifestation is always the order that users need. For moving images, for example, users might want to limit to those works available on DVD or usable in English up front.

      One way this might be displayed to users can be seen in Figure 8 (near the bottom) at http://kmcgrath.iweb.bsu.edu/MIWgrant.htm. The top facets are the WPE facets and the left facets come primarily from Manifestation records. So the original color or aspect ratio might be at the top and the ones for the available manifestations on the left. These comparison might be more useful in the WPE record view in Figure 9 (very bottom) where the original aspect ratio is given in the body of the WPE record and the available aspect ratios are given on the left. It might also be useful to label the non-original aspect ratio(s) as “modified.”

      FWIW, CEN (European Committee for Standardization) has also come to the conclusion that it is meaningless to talk of a Cinematographic Work outside of its realization. “The concept of cinematographic work comprises both the intellectual or artistic content and the process of realisation in a cinematographic medium.” (http://www.filmstandards.org/dokuwiki/lib/exe/fetch.php?id=start&cache=cache&media=cen-tc372_n0167_4th_wd_csh00102-r3_2008-12-03.pdf.)”

  • How do we share information and metadata?
  • Does FRBR work best with an already large database of bibliographic data? Does it require catalogers to search for information they might not know or have access to? (Linking data, sharing data, …)
  • Identity management
  • Are libraries outdated? Why aren’t people going to libraries?
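
To make the domain model and WEMI points above a little more concrete, here is a minimal sketch in Python of how work, expression, and manifestation records might be written as subject-predicate-object statements, the basic shape of RDF data. Everything in it is an assumption made for illustration: the http://example.org/ namespace, the ex:realizes, ex:embodies, and ex:carrier property names, and the sample records are all hypothetical and are not taken from RDA, FRBR, or OLAC. The lookup starts from the manifestation and walks up to the work, which is one way to read Kelly's remark that users might want to limit to works available on DVD up front.

```python
# Illustration only: the namespace, property names, and records below are
# hypothetical and do not come from any published RDA or FRBR vocabulary.
EX = "http://example.org/"

# Each statement is a (subject, predicate, object) triple.
TRIPLES = {
    (EX + "work/1", "rdf:type", "ex:Work"),
    (EX + "expr/1", "rdf:type", "ex:Expression"),
    (EX + "expr/1", "ex:realizes", EX + "work/1"),
    (EX + "manif/1", "rdf:type", "ex:Manifestation"),
    (EX + "manif/1", "ex:embodies", EX + "expr/1"),
    (EX + "manif/1", "ex:carrier", "DVD"),
    (EX + "work/2", "rdf:type", "ex:Work"),
    (EX + "expr/2", "rdf:type", "ex:Expression"),
    (EX + "expr/2", "ex:realizes", EX + "work/2"),
    (EX + "manif/2", "rdf:type", "ex:Manifestation"),
    (EX + "manif/2", "ex:embodies", EX + "expr/2"),
    (EX + "manif/2", "ex:carrier", "VHS"),
}


def objects(subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def subjects(predicate, obj):
    """Return every subject that has the given predicate and object."""
    return {s for s, p, o in TRIPLES if p == predicate and o == obj}


def works_on_carrier(carrier):
    """Walk manifestation -> expression -> work for one carrier type."""
    works = set()
    for manifestation in subjects("ex:carrier", carrier):
        for expression in objects(manifestation, "ex:embodies"):
            works |= objects(expression, "ex:realizes")
    return works


print(sorted(works_on_carrier("DVD")))
```

Because every record is just a set of statements that point at identifiers, nothing forces the work, then expression, then manifestation order on the user; the same data can be entered from whichever end is useful.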

I could list so many more topics from this discussion. Even though this thread is long and can be found in the archives for both October and November 2009, the discussions are well worth the detour.

