From ingest, let me turn to another step in the DCC life cycle: access, including issues of use, reuse, and transformation.
- Access, use and reuse: Ensure that data is accessible to both designated users and reusers, on a day-to-day basis. This may be in the form of publicly available published information. Robust access controls and authentication procedures may be applicable.
- Transform: Create new data from the original, for example by migration into a different format, or by creating a subset, by selection or query, to create newly derived results, perhaps for publication.
It is to help access, use, reuse and transformation that we use standards. If the metadata are inaccurate, inconsistent, encoded in a non-standard way, or don’t follow any visible rules, it is almost impossible to access the information correctly, or to reuse or transform it. It doesn’t matter if you work with an ILS, a digital repository or another platform or solution (even Google Drive). Use standards. If you invent your own standard for a small project, then write down what this standard is and why it was created in a README.txt file.
I’ve heard the argument that the emphasis on standards might be overrated, especially if you use Dublin Core as an encoding standard. To date, I’ve seen more than 31 flavors of Dublin Core in terms of how each field is interpreted. The problem isn’t so much the local variations in what content goes into each field. There will always be local variations. But it is important to explain these variations, and in general how the standard is being implemented, so that others know how you’re using dc:date: for example, whether it holds the date the image was created, the date the data set was established as an official data set, or the publication date. I try to emphasize this with colleagues, and especially with those doing research who need metadata to mark up their data. Standards are awesome. But sometimes you need something local. That’s fine too, as long as you explain it and ensure that people other than you and your team know how you’re using the metadata and data and have a “map” to access, reuse and transform it.
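As a rough illustration of such a “map”, here is a minimal Python sketch. The profile text, the field interpretations, and the helper name `undocumented_fields` are all hypothetical, made up for this example; the point is only that each field's local meaning is written down and that records can be checked against it.

```python
import json

# Hypothetical local profile: how this (imaginary) project interprets
# each Dublin Core field it uses. The wording is illustrative only.
LOCAL_PROFILE = {
    "dc:title": "Title as it appears on the resource; supplied titles go in square brackets.",
    "dc:date": "Date the image was created (not the publication date), as YYYY-MM-DD.",
    "dc:creator": "Photographer or author, in 'Family name, Given name' order.",
}


def undocumented_fields(record, profile):
    """Return the fields used in a record that the profile does not explain."""
    return sorted(field for field in record if field not in profile)


# A sample record that uses one field the profile never mentions.
record = {
    "dc:title": "[Harbor at dawn]",
    "dc:date": "1998-06-14",
    "dc:coverage": "Lisbon",
}

print(undocumented_fields(record, LOCAL_PROFILE))  # ['dc:coverage']

# The same profile can be serialized and kept next to the data,
# alongside the README that explains why it exists.
profile_text = json.dumps(LOCAL_PROFILE, indent=2)
```

A check like this won't tell you whether the values are right, but it does flag fields that nobody has explained yet.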
Another reason behind the really absurd notion that standards are overrated is what is sometimes called the full text search solution of solutions. Here, full text searching is expected to find any resource based on the text in a document. As with most things, one size does not fit all. Think of an image. Let’s say the image has a substantial amount of technical metadata but lacks a title, a description, or any other descriptive metadata. How would it be possible to discover this image? What of a resource that is a text that never mentions what it is about? Perhaps this text is a poem, or a surrealist novel that is a play on the art form of the novel. Suffice it to say that there is more than one way to discover and access resources. Full text is one way. Others are keywords, subjects, titles, authors, dates, and much more. It is through this variety that more can be discovered. It is really a disservice to our users to present only one way. Thanks to this variety, the description of a resource can be fuller. This fullness beyond what is in the text itself (i.e. associated metadata that uniquely describe the resource) provides context for the resource. Thanks to this context, people can better know how to reuse the resource and its metadata. It can also offer clues about what is important for migration and transformation (especially if the people who created the metadata are no longer around).
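To make the contrast concrete, here is a small Python sketch with two made-up resources: an image that has only technical metadata, and a poem whose text never names its subject. The records, field names, and both search functions are assumptions for illustration, not any particular system's API.

```python
# Hypothetical resources: an image with only technical metadata, and a
# poem whose subject appears in its metadata but not in its text.
resources = [
    {
        "id": "img-001",
        "title": "",
        "subjects": [],
        "full_text": "",
        "technical": {"width": 4000, "height": 3000},
    },
    {
        "id": "poem-002",
        "title": "Untitled",
        "subjects": ["grief", "sea"],
        "full_text": "the salt remembers what the shore forgets",
    },
]


def full_text_search(term, items):
    """Match the term against the text of each resource only."""
    term = term.lower()
    return [item["id"] for item in items if term in item["full_text"].lower()]


def metadata_search(term, items):
    """Match the term against title and subject fields."""
    term = term.lower()
    hits = []
    for item in items:
        fields = [item["title"]] + item["subjects"]
        if any(term in value.lower() for value in fields):
            hits.append(item["id"])
    return hits


print(full_text_search("grief", resources))  # []
print(metadata_search("grief", resources))   # ['poem-002']
```

Note that neither search finds `img-001`: with no title, subjects, or text, the image stays invisible until someone describes it.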
Access is a continual process. When you first begin a project and are deciding on metadata standards and implementation, access has to figure into the discussion. If not, you risk losing context, and losing users who are unable to search and discover your resources.