Metadata experts

Have you noticed that everyone these days seems to be a metadata expert? About two years ago, several people at my institution participated in the ARL eScience Institute. The institute was a chance to develop a strategic agenda for how librarians can respond to the growing needs of researchers who work in computationally intensive science and who have to write a two-page data management plan for their grants (which happens to include a section on metadata and formats). Since then, I have seen a number of webinars on how librarians can jump on the eScience train, including several “Metadata for eScience” workshops and webinars. My liaison colleagues are all excited about this. They come back and tell me, “Did you know that creating a document where your decisions are recorded is metadata?” Besides the interesting point that metadata appears to be reduced to creating a document (preferably a text file, by the way), it seems that metadata is once again a trendy word. But how is it trendy?

I think this is a complex question with a number of different answers. I’ve been thinking about the “coolness” of metadata on two fronts: automatically generated metadata and the idea of metadata. Automatically generated metadata is definitely trendy, and rightfully so. Every time we create a document, take a picture, or do just about anything else with a device that has a computer in it, metadata is created. We’ve also all heard stories of how our habits are being tracked. That tracking relies on metadata, which in this case is information about us and how we behave on the Internet. The interesting phenomenon here is that automatically generated metadata is seen in some circles as the cure-all for discovery and access. In other words, all you need is automatically generated metadata. This is akin to the argument that all you need is Google because everything’s on the Internet… right?

Let’s remember another feature of automatically generated metadata with the help of iTunes. I once put in one of my CDs, one I had bought overseas around 10 years ago, to import the songs into my library. Nothing extraordinary there, and everything went smoothly, except that the song names weren’t recognized and everything was named “Track” followed by a number. This automatically generated metadata certainly wasn’t helpful.

Let’s take another example: my pictures. I love to take pictures on my hikes, in particular of trees and flowers. Corny… yes. Fun… totally! When I download these pictures from my camera, I typically have a lot of pictures of trees and flowers. The automatically generated metadata from my camera gives me the date and time of each picture. I also have a file name, typically something as uninspiring as 000987image.jpg. If you have 5 pictures of trees and flowers, then this perhaps isn’t a problem. But what if I have a couple hundred and I want to share my pictures with other people? How is someone going to recognize 000987image.jpg as the picture of the Japanese Maple in the Boston Arboretum taken on a sunny, crisp fall morning? The automated metadata isn’t going to help much in this case.

Automated metadata is great. However, just like Google, it is not the be-all and end-all of metadata. Not all discovery and access can be accomplished through automatically generated metadata. Much has to come through human intervention. This is true in the realm of eScience as well, where a person has to provide some of the metadata if other people are going to discover and access the research. What’s trendy about automatically generated metadata? Perhaps it’s the hope of faster, cheaper, and more efficient routes to discovery and access (or data tracking). However, there are some downsides, as in the examples I gave. I guess the question becomes how to find a balance between good metadata that is automatically generated and good metadata that is human generated… and then how to convince your library administration that human-generated metadata is as trendy and as needed as the automatically generated kind.
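
For readers who like to see this concretely, here is a minimal sketch of the photo example in Python. Everything in it is hypothetical and only for illustration: the PhotoRecord structure, the search helper, the second file name, and the timestamps are my own inventions, not the output of any real camera or photo tool. It simply shows that a search over camera-generated fields alone cannot find the maple, while the same search succeeds once a person has added a title and keywords.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class PhotoRecord:
    # Automatically generated by the camera (values below are made up).
    filename: str
    taken_at: datetime
    # Supplied by a person; empty until someone describes the photo.
    title: str = ""
    keywords: List[str] = field(default_factory=list)


def search(records, term):
    """Return records whose filename, title, or keywords contain the term."""
    term = term.lower()
    return [
        r for r in records
        if term in r.filename.lower()
        or term in r.title.lower()
        or any(term in k.lower() for k in r.keywords)
    ]


photos = [
    PhotoRecord("000987image.jpg", datetime(2012, 10, 20, 8, 15)),
    PhotoRecord("000988image.jpg", datetime(2012, 10, 20, 8, 17)),
]

# With only the camera's metadata, a search for "maple" finds nothing.
before = search(photos, "maple")

# A person adds the description the camera could not know.
photos[0].title = "Japanese Maple in the Boston Arboretum"
photos[0].keywords = ["tree", "fall", "morning"]

# The same search now finds the photo through the human-supplied title.
after = search(photos, "maple")
print(len(before), [r.filename for r in after])
```

The date and time still come for free from the camera; the person only supplies what the machine has no way of knowing, which is the balance between the two kinds of metadata described above.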

The other interesting aspect is the “idea” of metadata. By this, I mean the general, fuzzy notion of Metadata as “data about data.” Metadata are data that help you discover and access resources. This is the big picture of metadata. It’s Metadata viewed from space. There’s nothing wrong with this. Indeed, it’s important to have a general understanding of metadata. Where it complicates things is when these generalities are used to describe the actual work done by catalog/metadata librarians. Go back to the example from my colleague: metadata is creating a document, preferably as a text file. Certainly, documenting the decisions made about your metadata is important. But metadata consists of more than just documentation. In fact, catalog/metadata librarians create, edit, and maintain metadata of all different types. They create documentation that outlines best practices, guidelines, etc. They create crosswalks between metadata standards, work on ontologies and controlled vocabularies, provide consultation services, do name authority work, and the list goes on. The work is at once complex, varied, and detailed. What is trendy about the “idea” of metadata? Perhaps it’s the ability to say that… well, metadata is good for discovery and access. It’s like eating your favorite dessert without the hassle of having to make it by hand every time you want it. The only thing is that the devil is in the details. The more I work with metadata, the more this phrase rings true. The big picture is important. But it’s really in the details where the action happens. The trendy focus on the big picture runs the risk of minimizing these details and the day-to-day work of catalog/metadata librarians. I guess the question then becomes how to promote the details as cool.

In all of this, I realized one important thing. I am NOT a metadata expert. Why? It’s because this is a continually changing and evolving field. There’s always a learning curve. Being a catalog/metadata librarian means continually learning about metadata and gaining new skills.