News

PBCore 2.1 Schema pre-release: Comments wanted!

The PBCore Schema and Documentation teams have completed their revision of the 2.1 schema and definitions, and have now opened a formal call for comments period before launch.  You can see the final draft of the XSD with updated documentation in our brand-new PBCore 2.1 GitHub repository: https://github.com/WGBH/PBCore_2.1. We really want feedback before the official release, so help us make sure we’ve nailed everything! Please submit any comments as issues in this new repo, so that we can all see and track them.

Our goal is to launch the updated schema on Friday, August 14, so we really need comments to be submitted by Friday, August 7 so we can fully address any crucial changes.

Process and rationale for the 2.1 Schema

In deciding what changes to implement for PBCore 2.1, the PBCore Schema Team considered the following criteria:

  • What problems and challenges with the PBCore 2.0 schema were brought up during our open call for PBCore users to submit issues on GitHub as of September 30, 2014?
  • What issues required a change to the schema, and what issues could be resolved by improving the documentation around PBCore elements and attributes to clarify their usage?
  • What changes would allow the 2.1 schema to remain backwards compatible with PBCore 2.0, so that current users could continue to validate their metadata? Keep in mind that PBCore 2.1 is an incremental version, not a major release.

After balancing these considerations, we decided to implement the following schema changes for PBCore 2.1:

1) In 2.0, the collection of attributes that includes ‘@source, @ref, @version, @annotation’ — which is designed to allow catalogers to provide accurate information about the source of their metadata — was available to most elements, but not all of them.

The updated schema provides the option to include ‘@source, @ref, @version, @annotation’ information to all elements. This change affects:

  • pbcoreDescription
  • pbcoreAssetDate
  • creator
  • contributor
  • publisher
  • instantiationLocation
  • instantiationDimensions
  • instantiationDataRate
  • instantiationFileSize
  • instantiationTimeStart
  • instantiationDuration
  • instantiationDate
  • instantiationTracks
  • instantiationChannelConfiguration
  • instantiationAlternativeModes
  • essenceTrackType
  • essenceTrackDataRate
  • essenceTrackFrameRate
  • essenceTrackPlaybackSpeed
  • essenceTrackSamplingRate
  • essenceTrackBitDepth
  • essenceTrackTimeStart
  • essenceTrackDuration

 

In all of these cases, these attributes are optional, but they will allow users to document their metadata in greater detail if they so choose. The increased ability to provide URIs for PBCore XML data elements will benefit users who wish to convert their PBCore XML records to Linked Data. Discussions are ongoing with EBUCore to provide a common RDF ontology for this purpose.
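As a rough illustration of the change described above, the following Python sketch attaches the four attributes to a pbcoreDescription element. Only the element and attribute names come from the schema; every value (including the example.org URI) is a made-up placeholder, and the namespace declaration a real record would carry is left out for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical values; only the element and attribute names come from PBCore.
description = ET.Element(
    "pbcoreDescription",
    {
        "source": "Station program guide",
        "ref": "http://example.org/guides/1985-03",
        "version": "1985",
        "annotation": "Transcribed from the printed guide",
    },
)
description.text = "A documentary about cattle drives."

# Serialize the element so the attributes can be inspected as XML text.
xml_text = ET.tostring(description, encoding="unicode")
print(xml_text)
```

Because all four attributes are optional, a record that leaves them out is unaffected by the change.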

2) Several PBCore elements include attributes (specifically, the @titleType attribute for pbcoreTitle, the @subjectType attribute for pbcoreSubject, and the @affiliation attribute for pbcoreCreator, pbcoreContributor, and pbcorePublisher) for which users also requested the ability to provide the source of the value used to express the type. In future releases of PBCore, the schema could be altered such that these attributes become elements in their own right. However, in order to comply with the goal of keeping PBCore 2.1 backwards compatible, this was not possible for 2.1. Therefore, we created several new optional attribute groups for inclusion with the following elements:

  • for pbcoreTitle:
    • @titleTypeSource
    • @titleTypeRef
    • @titleTypeVersion
    • @titleTypeAnnotation
  • for pbcoreSubject:
    • @subjectTypeSource
    • @subjectTypeRef
    • @subjectTypeVersion
    • @subjectTypeAnnotation
  • for creator, contributor and publisher:
    • @affiliationSource
    • @affiliationRef
    • @affiliationVersion
    • @affiliationAnnotation
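To make the new attribute groups a little more concrete, here is a small Python sketch that sets a few of them on a pbcoreTitle and a creator element. The attribute names are taken from the list above; the values, the vocabulary names, and the example.org URI are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# All attribute values below are hypothetical placeholders.
title = ET.Element("pbcoreTitle", {
    "titleType": "Episode",
    "titleTypeSource": "In-house title type list",
    "titleTypeRef": "http://example.org/vocab/title-types#episode",
})
title.text = "Cowboys of the High Plains"

creator = ET.Element("creator", {
    "affiliation": "Example Public Television",
    "affiliationSource": "In-house organization list",
})
creator.text = "Jane Doe"

# Print both elements as XML text.
for element in (title, creator):
    print(ET.tostring(element, encoding="unicode"))
```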

 

3) In PBCore 2.0, the element essenceTrackBitDepth did not include the option to add a @unitsOfMeasure attribute. PBCore 2.1 now includes this optional attribute.

4) In order to provide more flexibility in accommodating local metadata elements and values (e.g. from an in-house database), the requirement to use extensionAuthorityUsed when using the container extensionWrap has been removed. However, we still highly recommend using this element whenever possible to document the source system or schema of the element.

5) One newly suggested element, to define asset version, was approved by the Schema Team.  However, it was not explicitly added to the schema at this time due to the ongoing work to merge some efforts between PBCore and EBUCore (currently limited to a common RDF ontology).  This element does exist in EBUCore; therefore, the team suggests that this (and other similar elements) be considered for future releases of PBCore and/or a future merger with EBUCore. In the meantime, it should be expressed in PBCore using extensions, with the EBUCore element as the extensionElement and EBUCore as extensionAuthorityUsed, as follows:

  • version – The purpose of this element is to express the version of the intellectual content of the asset being described. In this case, version is specific to content, not to the instantiations of that content (e.g. UK edit, Hulu version, etc.). Use the EBUCore element version to express this information. In a PBCore extension, this could look like:

 

<extensionWrap>
  <extensionElement>version</extensionElement>
  <extensionValue>Hulu Version</extensionValue>
  <extensionAuthorityUsed>EBUCore</extensionAuthorityUsed>
</extensionWrap>
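If you generate records with a script, the extension above (and the relaxed extensionAuthorityUsed requirement from item 4) might be handled with a small helper like this one. It is only a sketch: the function name `extension_wrap` and the in-house field `shelfColor` are invented for illustration.

```python
import xml.etree.ElementTree as ET

def extension_wrap(element, value, authority=None):
    """Build an extensionWrap; the authority is optional, as in PBCore 2.1."""
    wrap = ET.Element("extensionWrap")
    ET.SubElement(wrap, "extensionElement").text = element
    ET.SubElement(wrap, "extensionValue").text = value
    if authority is not None:
        ET.SubElement(wrap, "extensionAuthorityUsed").text = authority
    return wrap

# The EBUCore version element from the example above.
version = extension_wrap("version", "Hulu Version", "EBUCore")
# A hypothetical in-house field with no documented authority.
local = extension_wrap("shelfColor", "blue")

print(ET.tostring(version, encoding="unicode"))
print(ET.tostring(local, encoding="unicode"))
```

Leaving the authority out is allowed in 2.1, but as noted above it is still worth recording whenever you know it.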

6) The schema team found that several of the issues raised on GitHub were caused by confusion over the definition or usage of an element or attribute. Many of these were addressed by changes to the documentation, specifically the element and attribute definitions, which have been completely revised.  Best practice guidelines for nearly all elements have also been added, and will appear on the website alongside definitions. Longer explanations addressing common use cases (e.g. when and how to use extensions) will be provided in blog posts on the updated PBCore website.

Several other changes were suggested over the course of this process. Many would require changes that may be implemented for the eventual release of PBCore 3.0, which will provide a broader revision of the PBCore data model. Please also note that this release does not include changes to the PBCore vocabularies. Suggested changes for these are forthcoming.

We welcome your questions and comments about PBCore 2.1!

Webinar Recap: Examples in Context

This is the fourth and final post in a series about the PBCore webinar that the Education Team presented in October. A recording of the webinar can be found here, and we’ll be recapping the event over the next few weeks.  Part one of the series is located here, part two is located here, and part three is located here.

The previous blog post written by Sadie Roosa described the basic elements and form of a PBCore document. Expanding on that, this post will demonstrate various ways that PBCore can be structured in order to meet specific objectives.

Use Cases

The following use cases will be discussed:

  • Archival Description

  • Asset Management

  • Digital Preservation

  • Sharing and Exchange

  • PBCore in METS

PBCore is very flexible, which allows it to be utilized in any of these use cases. In fact, it is possible to design PBCore files that apply to a number of these use cases. However, for the sake of readability, the sample PBCore XML files discussed in this post are kept as simple as possible. All of the sample files are available on the documentation page of the PBCore website. Keep in mind that these examples are not meant to be prescriptive, but are rather meant to inform you as to what PBCore can do for you. Don’t worry if your institution uses PBCore files that look different from these examples! The standard is designed to be flexible in order to accommodate the needs of many different types of institutions, and any deviation from these examples should not be seen as wrong or incorrect.

PBCore in XML

Sadie’s blog post mentioned that PBCore data is the most useful when held in an XML document. XML stands for eXtensible Markup Language. XML is used as a data storage and transmission format across a number of fields and disciplines. It is both human and machine readable, and the data structure of XML allows multi-dimensional data to be nested hierarchically. These characteristics make it particularly useful in the realm of A/V archiving, where we often deal with many versions, instances, or parts of a single intellectual unit.

Take, for example, the process of transferring a single analog video tape to the digital realm. The first step may be to create a single Preservation Master file to represent all of the content on the tape, with a one-to-one relationship. Next, the Preservation Master file may be broken into several Access Copies according to the programmatic content on the tape, creating a one-to-many relationship between the Preservation Master file and the Access Copies. An Excel spreadsheet can do a good job of describing a one-to-one relationship, but it cannot accurately describe a one-to-many relationship. XML, on the other hand, can describe complex relationships using a tree structure. It is for this reason that PBCore is typically held in XML files.

The Simplest PBCore

Sadie’s blog post also discussed which fields are required for a PBCore file to be valid. With these in mind, we’ve created two XML documents that represent the “simplest” PBCore files possible. These documents have just enough information to validate, though they do not actually tell us much about the assets they are describing.

The Simple Instantiation sample describes a single instantiation. It is possible to do this by using <pbcoreInstantiationDocument> as the root element. In this case, there are only two required elements: <instantiationIdentifier> and <instantiationLocation>.

The Simple Description Document sample describes an asset. The three fields required for this to be valid are <pbcoreIdentifier>, <pbcoreTitle>, and <pbcoreDescription>.
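For readers who prefer to see the two minimal shapes in code, here is a Python sketch that builds them. It is not the sample files themselves: the function names and all values are made up, the namespace declaration is omitted for brevity, and the `source` attribute on the identifiers reflects the schema's requirement that identifiers state where they come from.

```python
import xml.etree.ElementTree as ET

def minimal_description_document(identifier, source, title, description):
    """Asset-level record with only the three required fields."""
    root = ET.Element("pbcoreDescriptionDocument")
    ET.SubElement(root, "pbcoreIdentifier", {"source": source}).text = identifier
    ET.SubElement(root, "pbcoreTitle").text = title
    ET.SubElement(root, "pbcoreDescription").text = description
    return root

def minimal_instantiation_document(identifier, source, location):
    """Instantiation-level record with only the two required fields."""
    root = ET.Element("pbcoreInstantiationDocument")
    ET.SubElement(root, "instantiationIdentifier", {"source": source}).text = identifier
    ET.SubElement(root, "instantiationLocation").text = location
    return root

# Hypothetical values for illustration.
asset = minimal_description_document(
    "asset-0001", "Local ID", "Cowboy Stories", "A program about cowboys."
)
tape = minimal_instantiation_document("TAPE-0001", "Tape number", "Vault 2, Shelf 14")

print(ET.tostring(asset, encoding="unicode"))
print(ET.tostring(tape, encoding="unicode"))
```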

By examining these samples, we can see that the bare minimum fields required for validation do not tell us much about the assets they are describing. PBCore documents this bare would almost certainly never appear in the wild. These documents were designed to show what PBCore looks like in its most simple form, in order to give those of you that might be confused about the standard an idea as to what is actually going on in these XML files. The rest of the examples discussed will have far more information in them, but are still structured similarly to these two simple documents.

Archival Description

In this use case, the assets PBCore is describing are physical objects. However, before we begin looking at the example, I would like to take a quick aside to discuss some vocabulary.

In the parlance of PBCore, the term “Intellectual Content” is used to refer to what is typically known in the archival community as “Descriptive Metadata”. This includes information that can help to identify an asset, as well as describe the content of the asset. “Intellectual Property” is used to refer to what is typically known in the archival community as “Administrative Metadata”. This is information that concerns the creation, authorship, and ownership of the asset. For the purpose of this blog post, the terms DMD (Descriptive MetaData) and AMD (Administrative MetaData) will be used. Another type of metadata that will be discussed is Technical Metadata (TMD). This type of metadata refers to any attributes which describe the physical or digital properties of an object. The size of a book, the playback speed of a tape, and the frame width of a digital video are all examples.

The example PBCore document for Archival Description can be described with the following model:

archival_description

At the root level there is a <pbcoreDescriptionDocument> element. Within this element, we have DMD, AMD, and a Physical Instantiation.

The DMD is made up of a number of fields that describe the asset and its content. These fields include <pbcoreIdentifier>, <pbcoreTitle>, <pbcoreDescription>, <pbcoreGenre>, and <pbcoreCoverage>.

The AMD section is made up of a number of fields that describe who was involved with the creation of the asset. In this case, these fields include <pbcoreCreator> and <pbcoreContributor>. You may notice that these elements are repeated many times within this document. This is a perfectly valid way to use these elements, since many A/V and broadcast assets have a number of creators, contributors, and copyright holders.

The physical instantiation exists within the <pbcoreInstantiation> element. All of the information in this section refers to the actual physical instantiation, which in this case is a Master version of a VHS tape, about an hour and a half in duration, that resides in the McHale University Library.

The information contained in these three sections affords a number of interactions with the asset. It can aid users in finding this asset, it can help users determine what the rights associated with the asset may be, and it can help users play back the physical instantiation of the asset, among other things.

Asset Management

PBCore can also be used to aid in asset management. In this case, the PBCore document describes the physical and digital locations of A/V assets and their digital derivatives, as well as the relationships between the tapes and files. The example file can be illustrated with the following model:

 asset_management

In this model we see that like the last example, the asset is represented by a single pbcoreDescriptionDocument. However, unlike the earlier example, this asset is also described by three different pbcoreInstantiations. This is a many-to-one relationship, where each instantiation is related to the original content, but has a different physical or digital format.

The physical instantiation describes the original object, in this case, a reel of 1/4 inch audio tape. The first digital instantiation describes the 96kHz/24bit preservation master WAV file, and the second digital instantiation describes the MP3 access copy derived from that master file. In the PBCore file, each instantiation section includes DMD, AMD, and TMD associated with that specific object. This information describes the properties of the object, the content contained on the object, technical details about the object that aid in playback and discovery, and also how the objects relate to one another.

The <instantiationRelation> element is a parent element that contains the <instantiationRelationType> and <instantiationRelationIdentifier> elements. The <instantiationRelationIdentifier> element contains the identifier of the instantiation or object that the instantiation in question is related to, and the <instantiationRelationType> element describes that relationship. For example, the preservation master is “Derived From” the physical object, and the access copy is then “Derived From” the preservation master.

The purpose of this example is to show how a single asset can be described using as many instantiations as necessary. These instantiations contain information used for describing the objects, as well as relating them to one another, which aids greatly with asset management.
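The “Derived From” relationships described here can be followed programmatically. The sketch below parses a trimmed fragment (asset-level fields and the namespace declaration are left out, and all identifiers are invented) and walks from the access copy back to the original tape. The function name `derivation_chain` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical fragment: asset-level fields are omitted for brevity.
RECORD = """
<pbcoreDescriptionDocument>
  <pbcoreInstantiation>
    <instantiationIdentifier source="Tape number">REEL-001</instantiationIdentifier>
    <instantiationLocation>Vault 2</instantiationLocation>
  </pbcoreInstantiation>
  <pbcoreInstantiation>
    <instantiationIdentifier source="Filename">reel001_pres.wav</instantiationIdentifier>
    <instantiationLocation>/archive/masters</instantiationLocation>
    <instantiationRelation>
      <instantiationRelationType>Derived From</instantiationRelationType>
      <instantiationRelationIdentifier>REEL-001</instantiationRelationIdentifier>
    </instantiationRelation>
  </pbcoreInstantiation>
  <pbcoreInstantiation>
    <instantiationIdentifier source="Filename">reel001_access.mp3</instantiationIdentifier>
    <instantiationLocation>/archive/access</instantiationLocation>
    <instantiationRelation>
      <instantiationRelationType>Derived From</instantiationRelationType>
      <instantiationRelationIdentifier>reel001_pres.wav</instantiationRelationIdentifier>
    </instantiationRelation>
  </pbcoreInstantiation>
</pbcoreDescriptionDocument>
"""

def derivation_chain(root, identifier):
    """Follow "Derived From" relations from one instantiation back to its source."""
    parents = {}
    for inst in root.iter("pbcoreInstantiation"):
        inst_id = inst.findtext("instantiationIdentifier")
        for rel in inst.findall("instantiationRelation"):
            if rel.findtext("instantiationRelationType") == "Derived From":
                parents[inst_id] = rel.findtext("instantiationRelationIdentifier")
    chain = [identifier]
    # Stop at the original object, or if the relations loop back on themselves.
    while chain[-1] in parents and parents[chain[-1]] not in chain:
        chain.append(parents[chain[-1]])
    return chain

root = ET.fromstring(RECORD)
print(derivation_chain(root, "reel001_access.mp3"))
```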

Digital Preservation

Another powerful aspect of PBCore is that it allows the inclusion of fields from other existing metadata standards. In the example for the Digital Preservation use case, PREMIS data is embedded in the PBCore file in order to combine the descriptive power of PBCore with an existing standard for preservation metadata. The following model illustrates how the PREMIS fits into the PBCore conceptually.

 digital_preservation

The idea here is that each instantiation contains a PREMIS event, and that the PREMIS events that concern reformatting, transferring, or transcoding one instantiation into another link those two instantiations together.

Within the PBCore, this all occurs in the <pbcoreExtension> or the <instantiationExtension> field, depending on whether the PREMIS event is at the Description Document level or the Instantiation level. In a general sense, these fields can contain any number of fields from another existing metadata standard. The example XML files provided on the website demonstrate how to use both the <pbcoreExtension> and <instantiationExtension>. Which element you should use will depend on what kind of information you are gathering, and how you want your information structured. <instantiationExtension> should be used for metadata concerning the specific instantiations, and the <pbcoreExtension> element should be used for information that concerns the asset across all instantiations.
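The choice between the two extension elements can be sketched in Python as follows. This is an illustration only: it uses a simple extensionWrap with invented values rather than the embedded PREMIS XML found in the sample files, and the helper name `add_extension` is hypothetical.

```python
import xml.etree.ElementTree as ET

def add_extension(parent, element, value, authority):
    """Attach an extension at the level that matches the parent element."""
    if parent.tag == "pbcoreInstantiation":
        tag = "instantiationExtension"   # concerns one specific instantiation
    else:
        tag = "pbcoreExtension"          # concerns the asset as a whole
    extension = ET.SubElement(parent, tag)
    wrap = ET.SubElement(extension, "extensionWrap")
    ET.SubElement(wrap, "extensionElement").text = element
    ET.SubElement(wrap, "extensionValue").text = value
    ET.SubElement(wrap, "extensionAuthorityUsed").text = authority
    return extension

# Hypothetical record: one asset with one instantiation.
document = ET.Element("pbcoreDescriptionDocument")
instantiation = ET.SubElement(document, "pbcoreInstantiation")

# An event about this particular file goes on the instantiation...
add_extension(instantiation, "eventType", "migration", "PREMIS")
# ...while information about the asset across all copies goes on the document.
add_extension(document, "eventType", "ingestion", "PREMIS")

print(ET.tostring(document, encoding="unicode"))
```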

Sharing and Exchanging

PBCore can aid in publishing or transmitting your assets through the use of the <pbcoreCollection> element. An example of an XML file that does this can be found here, and the following model illustrates the overall structure of the XML.

sharing_and_exchanging

In this model, we see that the <pbcoreCollection> element is at the root-level. This element has a number of attributes (collectionTitle, collectionDescription, collectionSource, collectionRef, and collectionDate) which provide DMD and AMD for the overall collection of assets to be held within the document. The assets within the collection are represented by different <pbcoreDescriptionDocument> elements, one for each asset. In the model pictured above, the description documents look similar to the description document discussed in the Archival Description case study; however, these documents can contain any information as long as the required fields are included.

The purpose of using <pbcoreCollection> as the root element is that it allows the user to combine a number of description documents into a single XML file. From there, any sharing or transmission, such as publishing the collected assets as an RSS feed, can be enabled with the XML file.
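Combining description documents under one root might look like the following Python sketch. The collection attributes are taken from the list above; the helper names and all values are invented, and the namespace declaration is omitted for brevity.

```python
import xml.etree.ElementTree as ET

def description_document(identifier, title, description):
    """Minimal asset record; the 'Local ID' source is a placeholder."""
    doc = ET.Element("pbcoreDescriptionDocument")
    ET.SubElement(doc, "pbcoreIdentifier", {"source": "Local ID"}).text = identifier
    ET.SubElement(doc, "pbcoreTitle").text = title
    ET.SubElement(doc, "pbcoreDescription").text = description
    return doc

def build_collection(documents, title, description):
    """Group description documents under a single pbcoreCollection root."""
    collection = ET.Element("pbcoreCollection", {
        "collectionTitle": title,
        "collectionDescription": description,
    })
    collection.extend(documents)
    return collection

# Hypothetical assets for illustration.
docs = [
    description_document("asset-0001", "Cowboy Stories", "A program about cowboys."),
    description_document("asset-0002", "Trail Songs", "Raw footage of a campfire performance."),
]
collection = build_collection(docs, "Western Programs", "Programs shared with a partner station.")

# Write the whole collection out as one XML string.
collection_xml = ET.tostring(collection, encoding="unicode")
print(collection_xml)
```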

PBCore in METS

In the Digital Preservation example we saw that it is possible to embed other metadata standards in PBCore. In this example we’ll flip that around and look at embedding PBCore in another standard, in this case METS. METS (Metadata Encoding and Transmission Standard) is used by institutions to move and ingest files across content management systems. It was designed to have a malleable structure so that it could be used by a number of different institutions for a number of different uses. The model below shows that a single METS document can have the following sections: Header, Descriptive Metadata Section (dmdSec), File Section (fileSec), Structure Map (structMap), Structure Link (structLink), and Administrative Metadata Section (amdSec). Within the amdSec we typically see Technical Metadata (techMD) and Source Metadata (sourceMD) sections. Both of these can be seen as Technical Metadata, with the techMD referring to information about digital files, and sourceMD containing technical metadata about the physical source objects.

pbcore_in_mets

The METS guidelines suggest that these sections be filled with existing metadata standards, and PBCore is a good fit here, since it provides fields for the types of description used in the METS amdSec.

The XML files must be structured very specifically, and the example on the PBCore website can be referenced if you wish to see what this would actually look like in the XML. The example is actually based on METS XML delivered to Columbia University Libraries by George Blood Audio/Video/Film for a large video reformatting project.
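As a very rough sketch of the nesting involved (not a reproduction of the example on the website), the Python below places a PBCore instantiation document inside a METS techMD section. It is a skeleton only: a complete METS file needs further sections such as a structMap, the `OTHER`/`PBCORE` type labels are one conservative way to declare the wrapped metadata, and the function name, ID, and sample values are invented.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)

def wrap_in_mets(pbcore_record, section="techMD", section_id="TMD1"):
    """Nest a PBCore record inside amdSec/<section>/mdWrap/xmlData."""
    mets = ET.Element(f"{{{METS_NS}}}mets")
    amd_sec = ET.SubElement(mets, f"{{{METS_NS}}}amdSec")
    md = ET.SubElement(amd_sec, f"{{{METS_NS}}}{section}", {"ID": section_id})
    md_wrap = ET.SubElement(
        md, f"{{{METS_NS}}}mdWrap", {"MDTYPE": "OTHER", "OTHERMDTYPE": "PBCORE"}
    )
    xml_data = ET.SubElement(md_wrap, f"{{{METS_NS}}}xmlData")
    xml_data.append(pbcore_record)
    return mets

# Hypothetical digital file described with the two required instantiation fields.
record = ET.Element("pbcoreInstantiationDocument")
ET.SubElement(record, "instantiationIdentifier", {"source": "Filename"}).text = "tape001_pres.mov"
ET.SubElement(record, "instantiationLocation").text = "/archive/masters"

mets = wrap_in_mets(record)
print(ET.tostring(mets, encoding="unicode"))
```

Passing `section="sourceMD"` would place the same kind of record in the section used for the physical source object.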

What’s Next?

We hope that this blog post has helped you to understand some of the many ways that PBCore can be used to empower your assets. If you were on the fence about using PBCore at your institution, we hope that this blog post answered any questions you may have had about doing so. However, we want to stress that you should make sure that you have the IT and content management systems necessary to support PBCore before you embark upon a PBCore initiative. The examples discussed in this blog post are meant to be informative and not prescriptive, so please keep in mind that you may have to add or remove elements from these examples so that they meet the needs of your institution.

Webinar Recap: Getting Started with PBCore

This is the third post in a series about the PBCore webinar that the Education Team presented in October. A recording of the webinar can be found here, and we’ll be recapping the event over the next few weeks.  Part one of the series is located here, and part two is located here.

After going over a brief history of PBCore, Sadie moved on to a step-by-step guide to using PBCore to describe AV collections.

The first step to creating a PBCore record is to inventory your AV content,  using at least Identifiers, Titles, and Descriptions.  These are the only requirements to create a valid PBCore XML document.

There are many ways to go about collecting and storing PBCore data. If you use databases, like FileMaker, you can either revise an existing template or create a new template in line with the PBCore data model. This is pretty simple, as a lot of PBCore fields correspond to fields you might already have, like title, format, duration, etc.  You can also use content management systems like Omeka, CollectiveAccess, and Drupal, which already have plugins for PBCore, providing easy PBCore outputs, such as PBCore XML.

Although this is not ideal, a very simple way that you can start entering PBCore data is in a spreadsheet. PBCore can express complex relationships, so it isn’t naturally flat the way that a spreadsheet is; however, there are ways to record most of the data in a spreadsheet. Storing it in a spreadsheet is a good starting place, because later you can have someone write a script that will take that data and turn it into PBCore XML, where you can take advantage of the complex relationships that PBCore can handle. And considering how often most of us use spreadsheets in our work, it’s a very approachable first step.
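The kind of script mentioned here does not have to be elaborate. Below is a minimal Python sketch that turns rows exported from a spreadsheet as CSV into one description document per row. The column names and sample rows are assumptions made for this example, not a PBCore convention, and the namespace declaration is omitted for brevity.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical spreadsheet export; the column names are assumptions.
SHEET = """identifier,identifier_source,title,description
asset-0001,Local ID,Cowboy Stories,A program about cowboys.
asset-0002,Local ID,Trail Songs,Raw footage of a campfire performance.
"""

def rows_to_documents(csv_text):
    """Turn each spreadsheet row into a minimal pbcoreDescriptionDocument."""
    documents = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        doc = ET.Element("pbcoreDescriptionDocument")
        identifier = ET.SubElement(
            doc, "pbcoreIdentifier", {"source": row["identifier_source"]}
        )
        identifier.text = row["identifier"]
        ET.SubElement(doc, "pbcoreTitle").text = row["title"]
        ET.SubElement(doc, "pbcoreDescription").text = row["description"]
        documents.append(doc)
    return documents

documents = rows_to_documents(SHEET)
for doc in documents:
    print(ET.tostring(doc, encoding="unicode"))
```

A real spreadsheet will usually need extra handling for repeated fields (several titles or identifiers per asset), which is exactly where the flat format starts to strain.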

Now we’re going to review some important concepts about how PBCore is structured, which inform what you will be able to use your PBCore for.

Instantiations and Assets

As was mentioned in earlier posts in this series, PBCore has these things called instantiations.  An instantiation is an occurrence of an asset. If you have a master tape of a program about cowboys, that tape is an instantiation. If someone dubs that master tape onto a DVD to make a viewing copy, then that viewing copy is another instantiation. If you digitize that program about cowboys, the preservation quality file you create during digitization is another instantiation, and so on.  Each instantiation is an occurrence of the same content: the cowboy program, which we refer to as the asset. All of the information at the asset level is related to that content, rather than to the tape or file, which is the instantiation.

In addition to having data at both the asset and instantiation level, PBCore also allows you to structure your data with one of three root elements. These will be explained more fully in a later post; however, it’s useful to keep these structures in mind as you’re considering how you’ll store your data.

Root Elements

The simplest root element is the instantiation document. This describes a single occurrence, and can be used for things like capturing technical data in PBCore about a digital instantiation.

The most commonly used root element is the description document. These contain information on the asset level, and can contain one or more instantiations as well; however, to make valid PBCore description documents, you don’t actually have to have an instantiation at all. Most people I know do use instantiations in their description documents since they see no point in creating asset level data if it doesn’t relate to an object or file. Just know though, that a perfectly valid PBCore description document can consist only of an identifier, a title, and a description, which are all at the asset, not instantiation level.

PBCore also provides a root element, called PBCore collection, which allows you to group your description documents into one XML file, along with some data at the collection level. Again, this was just a brief overview, and future posts will go into detail about the ways you can take advantage of these structures.

Now, let’s delve into the nitty-gritty of gathering all of this data and figuring out which PBCore elements and attributes it fits into.

Required Fields

Identifier

The first required field is an Identifier.  Identifiers exist at the Asset and Instantiation levels, and are unique to the asset or instantiation.  PBCore also requires source information for all of the identifiers you use. Typical sources on the asset level are things like randomly generated ids for each piece of content, or sometimes codes used for specific programs or films. Typical sources for instantiation identifiers are things like barcodes, tape numbers, filenames, etc.

The Identifier field is repeatable, which makes it important for every identifier to have a source.  In the PBCore schema, you can include both the tape number that a production used on their tape and also the barcode that the archive added to the tape when they processed it into their collection.  Including source data makes these identifiers easier to differentiate in the future.

Title

Another PBCore-required field is title. Title is also repeatable, and the schema allows for noting the type of title (although type is not required like source is for identifiers). By doing so, you can add any and all relevant titles, such as the series and episode titles. You can create a title for raw footage that you might have in your collections. And if that raw footage was recorded for a specific film or program, then you can also add the title of that in a separate title field within the same record. And just to reinforce the idea of assets and instantiations: since title refers to the content, not the specific occurrence, title is contained in the asset-level record.

Description

The final required field on the asset level is the description. This is another repeatable field, although repetition of this field is used somewhat less frequently than repetition of the identifier and title fields.  Descriptions can include as much or as little information about the content of the assets as you wish, from summaries and shot logs to whatever descriptive data you can gather from the labels and other documentation available. Some users have employed various work-arounds, including just putting a space or some non-description like “description unavailable” into the description fields, so that there is a value in the field and the XML will validate. While I would encourage every effort to put actual data into this field, sometimes it is just impossible to do so.

Location

The final required field, location, is only at the instantiation level, and only required if you have an instantiation. This field records the physical (or virtual) location of the occurrence, so that after you describe it in PBCore, you can go back and find it when you need it again. Location data can be of various types. Sometimes you can make it as simple as the name of the holding institution, for example “WGBH Archives.” More specific location data can also be used. For physical items people often include which room, shelf, box, etc. that the item is stored in. It is just as important to provide locations for digital files.  Filepaths or hard drive names allow future curators to locate the digital file for future uses.

Now that we have the basics of how PBCore is structured and what is required in PBCore, let’s dive into how to get the data and put it in all of these fields. Let’s start things out with a physical piece of media. You’ve got this tape in your collection. What do you do?

A good place to start is with the physical format. If you have the tape (or other piece of physical media) in front of you, it should be easy to tell what format it is.  The following information can often be found on the object itself:

  • Format

  • Date.  Sometimes there won’t be a date right on the label, in which case you might be able to get the date information from related documentation, or by watching the content of the tape.

  • Generation

  • Duration

  • Descriptive information, such as:

    • Title

    • Content description

Best practices for spelling, capitalization, and other style choices can be found in the recommended controlled vocabulary on the PBCore website.

But, I’m sure you’re asking, what if it’s an instantiation that you can’t hold in your hand?

Now say instead of a tape, you have a file. How do you get data about this file into PBCore? The process of putting data for digital objects into PBCore is pretty similar, although the methods for finding that data are somewhat different.

Two of the first–and easiest–things to capture are the identifier and the format. These can come straight from the file name–use it as the instantiation identifier.  The extension leads you to the data you will put in the Digital Format field. PBCore does not encourage adding “dot mp4” as a digital format, rather, use the Internet Media or Content Type, such as  “video/mp4.” Just by looking at the data that your computer will give you on the file, you can also gather other information on the digital media instantiation, such as file size, duration, frame size, etc.
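The filename-to-format step can be automated with Python's standard `mimetypes` module, as in the sketch below. The function name and the sample path are made up, and the guessed type depends on the extension alone, so it is worth spot-checking the results against what the files actually contain.

```python
import mimetypes
from pathlib import Path

def identifier_and_format(path):
    """Return the file name (for the identifier) and its Internet media type."""
    name = Path(path).name
    # guess_type looks only at the extension; it never opens the file.
    media_type, _encoding = mimetypes.guess_type(name)
    return name, media_type

# Hypothetical file path for illustration.
identifier, digital_format = identifier_and_format("/media/access/cowboys_access.mp4")
print(identifier, digital_format)
```

The first value would go in the instantiation identifier (with a source such as "Filename"), and the second in the digital format field.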

Another good way to get this data about digital instantiations is by using tools that give you information about your files, such as ffprobe, mediainfo, and ExifTool (all of which are free). Using these tools, you can not only generate this information, but also take the data that the tools generate and put it straight into PBCore. One of the benefits of using tools like these is the automation: there are no human mistakes, no inconsistencies based on human judgment, such as different people using gigabyte or GB, and automation also saves on staff time.
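To give a feel for what "straight into PBCore" can mean, here is a Python sketch that maps ffprobe-style output onto an instantiation. The `PROBE` dictionary is a hand-written stand-in for the JSON that a command like `ffprobe -v quiet -print_format json -show_format FILE` prints; the values, the function names, and the choice to express duration as HH:MM:SS are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for ffprobe's JSON "format" section (values are invented).
PROBE = {
    "format": {
        "filename": "cowboys_pres.mov",
        "size": "1073741824",
        "duration": "5400.000000",
    }
}

def seconds_to_timecode(seconds_text):
    """Convert a duration in seconds (as text) to HH:MM:SS, dropping fractions."""
    total = int(float(seconds_text))
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"

def instantiation_from_probe(probe, location):
    """Map a few ffprobe fields onto PBCore instantiation elements."""
    fmt = probe["format"]
    inst = ET.Element("pbcoreInstantiation")
    ET.SubElement(inst, "instantiationIdentifier", {"source": "Filename"}).text = fmt["filename"]
    ET.SubElement(inst, "instantiationLocation").text = location
    ET.SubElement(inst, "instantiationFileSize", {"unitsOfMeasure": "bytes"}).text = fmt["size"]
    ET.SubElement(inst, "instantiationDuration").text = seconds_to_timecode(fmt["duration"])
    return inst

instantiation = instantiation_from_probe(PROBE, "/archive/masters")
print(ET.tostring(instantiation, encoding="unicode"))
```

Because the unit is written by the script, every record says "bytes" in the same way, which is the consistency benefit described above.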

I’ve gone through some of the easiest data to gather for both physical and digital instantiations. However, as you can see from this list, once you’re comfortable with PBCore, you can take advantage of the wide range of elements (and these are just on the instantiation level!). Using PBCore gives you a superb structure for describing your media assets as fully as you want: write rich descriptions, add subject headings and track the origin of those headings using a link to the authority’s URI, add genre information, and fill in the coverage field with information about the content’s place and time using place names, geospatial coordinates, date ranges, and other types and formats of data.  Don’t forget to use an attribute to note which type of data it is!  Finally, use the intellectual property fields to detail any rights information available, including the content’s creators, contributors, and publishers. Within each of these elements there is one subelement for the name of the person and another subelement for their role, so that the name and role are always associated.

Finally, the American Archive of Public Broadcasting is developing its own set of cataloging guidelines based on the PBCore elements we’re using.  These can be found here.
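The name-and-role pairing in the intellectual property fields can be sketched in Python like this. The helper name `add_creator` and the people and roles are invented for illustration; contributors and publishers follow the same pattern with their own element names.

```python
import xml.etree.ElementTree as ET

def add_creator(document, name, role):
    """Add one pbcoreCreator with its name and role kept together."""
    wrapper = ET.SubElement(document, "pbcoreCreator")
    ET.SubElement(wrapper, "creator").text = name
    ET.SubElement(wrapper, "creatorRole").text = role
    return wrapper

# Hypothetical credits for illustration.
document = ET.Element("pbcoreDescriptionDocument")
add_creator(document, "Jane Doe", "Producer")
add_creator(document, "John Roe", "Director")

# Each name stays attached to its own role because both live in one wrapper.
credits = [
    (c.findtext("creator"), c.findtext("creatorRole"))
    for c in document.findall("pbcoreCreator")
]
print(credits)
```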

Webinar Recap: What is PBCore and why should I use PBCore?

This is the second post in a series about the PBCore webinar that the Education Team presented in October 2014. A recording of the webinar can be found here, and we’ll be recapping the event over the next few weeks.  The webinar began with a brief history of PBCore, which is outlined here.

Hopefully, by now you know what PBCore is. Previous blog posts have tackled the what. Still not sure?? Allow me to tackle the why.

Your problem: 

You’ve got stuff. Lots of it. It’s on different servers, in different file formats, in different vaults, on different shelves, on reels and cassettes. You have multiple copies of the same recording hiding in different places; you might have masters and derivatives, or you might just have the same copied tape sitting in 10 satellite locations. You’re worried about reaching the character limit of your file system because the names in your most recent set of files all begin with: project_x_03012015_onlocation_raw_master_copy_1

Your solution:

PBCore!

How?

PBCore is built on several important principles which can untangle your mess of files, folders, tapes, servers, and shelves.

Specific, but standard

Metadata standards are like opinions – everybody’s got one – but having this one in your pocket gives you an advantage. PBCore is specifically built to support the workflows and descriptive needs of the audiovisual community, all the way from production to long-term archival storage and preservation. It’s got a lot in common with other metadata standards with which you might already be familiar, such as Dublin Core and EBUCore. However, it’s not so specific that different parts of the community can’t use it – PBCore can accommodate other (non-AV) materials as well.

And finally, here’s a long list of peer institutions that use it already:

The Smithsonian Channel

The Dance Heritage Coalition

Alliance for Community Media

International Criminal Tribunals, The Hague

University of Notre Dame

Rock and Roll Hall of Fame and Museum

American Archive of Public Broadcasting

… and more

Interoperable

As the audiovisual and archival communities become more digital-centric, the ability to seamlessly share information between organizations is paramount. Having a common language to describe not only what materials are, but where they are, and in what context they exist, is essential. And while your custom descriptive process might work right now (and bravo to you for developing your own!), we have all gone through trying to share our work with our peers, only to find that we’re speaking completely different languages. PBCore gives you the shared vocabulary you need.

Now that you know why, in future weeks, we’ll be exploring exactly how to implement PBCore in your own collections.

 

Webinar Recap: A Brief History of PBCore

This is the first post in a series about the PBCore webinar that the Education Team presented in October 2014. A recording of the webinar can be found here, and we’ll be recapping the event over the next few weeks.  The webinar began with a brief history of PBCore, which is outlined here.

PBCore began in 2001 with funding from the Corporation for Public Broadcasting. The PBCore metadata schema was created for the U.S. public broadcasting community, for use by anyone managing A/V assets, such as:

          – Librarians

          – Archivists

          – Independent producers

          – Broadcasters

Today, the PBCore XML schema is also used as a general data model for audiovisual collections.

The very first version of the PBCore XML schema — version 1.0 — was released in April 2005. Developed as a derivative of the Dublin Core metadata standard, it contained 48 metadata elements intended to describe a media asset or resource’s intellectual content, creation, creators, usage, permissions, constraints, use obligations, and its form or format.

In January 2007, v1.1 of PBCore introduced nesting into the schema. By August 2010, v1.3 offered 62 elements organized into 15 containers and 4 sub-containers. Then, a significant overhaul was completed, and version 2.0 of PBCore was released in November 2011.

In 2014, the American Archive of Public Broadcasting, an initiative managed by WGBH and the Library of Congress, took charge of further developments to PBCore. Since 2014, over 50 members from the audiovisual asset management community have been actively working as members of the PBCore Advisory Subcommittee of the Cataloging and Metadata Committee of the Association of Moving Image Archivists. The Subcommittee’s teams (Communications, Documentation, Education, Schema, and Website) are charged with reassessing the PBCore schema, continuing outreach to PBCore users and potential adopters, creating resources for the PBCore community, and improving the PBCore website. They engage in activities such as revising and documenting the PBCore schema (version 2.1 is scheduled for release in summer 2015), surveying the A/V community on an ongoing basis, and preparing and leading webinars, conference presentations, and blog posts about PBCore and best practices for describing and organizing audiovisual assets.

In 2015, as a precursor to the Code4Lib conference in Portland, Oregon, members of the PBCore Subcommittee and the AAPB met with representatives from EBUCore, a widely adopted European audiovisual metadata schema, to plan for future collaborations between the two standards (more on that coming soon!).

PBCore Controlled Vocabulary Recommendations

The PBCore Controlled Vocabularies Team has been working hard to review the existing state of the PBCore-maintained and recommended vocabularies. We have made recommendations about which asset-level elements and attributes PBCore should maintain its own vocabulary for, and for which it should instead recommend specific pre-existing vocabularies. Our recommendations can be reviewed in this Google Doc (https://docs.google.com/document/d/1a5oXzT7BuYSfwRvafHeb_e-Mx8qBQeJqzFD8wcOnknk/edit), where we encourage PBCore users to make comments and ask questions. Starting on Wednesday, February 18th, we will be taking your comments into consideration as we proceed to the next step of revising the controlled vocabularies. We greatly appreciate any input from PBCore users.

PBCore: A How-to and Why-to Webinar | Recording from 10/23/2014

On October 23, 2014, the AMIA PBCore Advisory Subcommittee’s Education Team offered a webinar titled “PBCore: A How-to and Why-to Webinar.” Geared toward archivists, librarians, and anyone who has audiovisual collections at their institutions, the presenters offered contextual background; explained the benefits and reasons why PBCore is perfectly suited for managing audiovisual collections; offered step-by-step guidance on inventorying av assets and getting started with PBCore; and described the use of PBCore in different settings, such as asset management, digital preservation, archival description, and use with other schemas such as PREMIS and METS.

Before we got started, I failed to mention why the PBCore Advisory Subcommittee felt it was important to host this webinar. PBCore is already pretty well established within the moving image and audio archival community, but not so much across the archival profession in general. Why? I think it’s because archivists have only recently realized that we are facing an imminent loss of our audiovisual cultural heritage if we don’t take steps to preserve these collections in the next 10-15 years. And for many years, institutions have not dealt with their audiovisual collections, often because these collections represent only a small part of the overall collection; because av collections are more expensive to preserve and manage; and because there aren’t enough people skilled in audiovisual preservation.

The PBCore Advisory Subcommittee is encouraged by the recent invigoration among archivists who are beginning to deal with their deteriorating av collections, as well as the digital video and audio collections, and we know that PBCore has a place in these efforts. PBCore is uniquely suited to provide a standard way for archivists to record metadata about their av collections.

We look forward to providing more opportunities like yesterday’s webinar in the future, as well as improving the schema over the next few months, clarifying and improving documentation, creating our new website, and generating new PBCore resources.

Many thanks to all of those who attended yesterday’s webinar, and if you have any questions, please don’t hesitate to reach out to the presenters whose email addresses I have listed below:

Casey E. Davis, WGBH | casey_davis [at] wgbh [dot] org
Maureen McCormick Harlow, PBS | mmharlow [at] pbs [dot] org
Sadie Roosa, WGBH | sadie_roosa [at] wgbh [dot] org
Morgan Oscar Morel | moran.morel [at] georgeblood [dot] com

Enjoy the recording and please feel free to share it among your colleagues and networks!
(The chat text is best readable when the video is viewed in full-screen.)

PBCore: A How-to and Why-to Webinar | Recording from 10/23/14 from American Archive on Vimeo.

This post was written by Casey E. Davis, Project Manager for the American Archive of Public Broadcasting at WGBH and Chair of the AMIA PBCore Advisory Subcommittee.

PBCore Handout & New PBCore XML Examples

The PBCore Advisory Subcommittee’s Communications Team has created a handout for people considering using PBCore at their institutions. Feel free to download the pdf and share it with your colleagues as you begin to consider options for managing metadata about audiovisual materials in your collections.

Additionally, Education Team member Morgan Oscar Morel has created several new examples of PBCore being used in different contexts, which are all now available on the PBCore website, including:

If you have any examples of PBCore in use at your institution and are willing to share them with the community, please send us an email at pbcoreinfo@wgbh.org

As always, don’t hesitate to reach out to members of the Advisory Subcommittee with any questions.