IC PUBS current usage

The current METS version (5.5) normalizes all incoming documents by converting them into XML. IC PUBS (v5 and v7) is one of the supported output XML formats. The document's metadata is identified as well as possible* and converted into that standard's PublicationMetadata elements. The document's body is converted into a PUBS DocBody containing Sections and Paras; in the future, we expect to identify and tag more of the structure, such as lists and tables. METS also reports the results of its extraction by inserting tags into the PUBS output that mark the boundaries of each extracted item, creating an "in-line" element for each.

* It is important to note that METS fares better at identifying metadata in some document types than in others. It also uses whatever metadata was provided along with the document. There are a handful of required string-valued elements, such as Title, where METS will default to **UNKNOWN** if necessary and include a Warning to that effect. And there are certain other required elements -- in particular, PublicationMetrics -- where METS must default to omitting the element and include a Warning to note the resulting invalidity.
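The defaulting behavior described in this note can be pictured with a minimal sketch. It is not METS code: the function name, the dictionary representation of the metadata, and the warning wording are all assumptions made for illustration. Only the **UNKNOWN** placeholder and the element names Title and PublicationMetrics come from the text.

```python
UNKNOWN = "**UNKNOWN**"  # placeholder value named in the note above


def apply_defaults(metadata,
                   required_strings=("Title",),
                   required_other=("PublicationMetrics",)):
    """Hypothetical helper: fill or omit required elements and collect warnings.

    `metadata` is a plain dict of element name -> value; this representation
    is an assumption for the sketch, not how METS stores metadata.
    """
    result = dict(metadata)
    warnings = []

    # Required string-valued elements fall back to the placeholder.
    for name in required_strings:
        if not result.get(name):
            result[name] = UNKNOWN
            warnings.append(f"{name} not identified; defaulted to {UNKNOWN}")

    # Other required elements cannot be defaulted, so they stay absent
    # and the warning records that the output will not validate.
    for name in required_other:
        if name not in result:
            warnings.append(f"{name} not identified; element omitted, output is invalid")

    return result, warnings


if __name__ == "__main__":
    filled, notes = apply_defaults({"Language": "en"})
    print(filled)
    for note in notes:
        print("Warning:", note)
```

The point of the sketch is the asymmetry: a missing string element still yields valid output with a placeholder, while a missing structured element yields output that is knowingly invalid and flagged as such.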

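PUBS allows a far smaller set of in-line tags than the set of ontology classes METS extracts, but the specification permits a 'qualifier' attribute on those tags, which METS uses to convey the often more specific class. For instance, a city is tagged with the general LocationOfInterest element and the qualifier "mets:City".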

Example (PUBS7):

<pubs:IntelDoc
  xmlns:pubs="urn:us:gov:ic:pubs"
  xmlns:ism="urn:us:gov:ic:ism"
  xmlns:ntk="urn:us:gov:ic:ntk" 
  xmlns:irm="urn:us:gov:ic:irm" 
  DESVersion="7" ism:DESVersion="7" ntk:DESVersion="5" irm:DESVersion="5"
  ism:resourceElement="true" ism:createDate="2011-07-21" ism:classification="U" ism:ownerProducer="USA" ism:compliesWith="ICDocument"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
  xsi:schemaLocation="urn:us:gov:ic:pubs ../Schema/PUBS/PUBS-XML.xsd">
  <pubs:PublicationMetadataList>
    <pubs:PublicationMetadata>
      <pubs:AdministrativeMetadata>
        ...
      </pubs:AdministrativeMetadata>
      <pubs:DescriptiveMetadata>
        <pubs:Title ism:classification="U" ism:ownerProducer="USA">Whatever</pubs:Title>
        <pubs:Language encoding="RFC1766">en</pubs:Language>
        <pubs:Description ism:classification="U" ism:ownerProducer="USA">Whatever</pubs:Description>
        ...
      </pubs:DescriptiveMetadata>
    </pubs:PublicationMetadata>
  </pubs:PublicationMetadataList>
  <pubs:DocumentBody ism:classification="U" ism:ownerProducer="USA">
    <pubs:Para ism:classification="U" ism:ownerProducer="USA"><pubs:LocationOfInterest qualifier="mets:Country">French</pubs:LocationOfInterest> Ambassador
 <pubs:Person qualifier="mets:Person" givenName="John" surname="Doe" position="Ambassador">John Doe</pubs:Person>
<pubs:Event qualifier="mets:Travel">flew</pubs:Event> to <pubs:LocationOfInterest qualifier="mets:City">Cleveland</pubs:LocationOfInterest>.   ...  </pubs:Para>
  ...
  </pubs:DocumentBody>
</pubs:IntelDoc>
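
A consumer of this output might recover the in-line elements and their qualifiers roughly as follows. This is a minimal sketch using Python's standard xml.etree.ElementTree, not part of METS. The sample is a trimmed copy of the paragraph from the example (ISM and name attributes omitted), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed, self-contained copy of the example paragraph. The namespace
# declaration is placed on Para here so the fragment parses on its own.
SAMPLE = """
<pubs:Para xmlns:pubs="urn:us:gov:ic:pubs"><pubs:LocationOfInterest qualifier="mets:Country">French</pubs:LocationOfInterest> Ambassador
<pubs:Person qualifier="mets:Person">John Doe</pubs:Person>
<pubs:Event qualifier="mets:Travel">flew</pubs:Event> to <pubs:LocationOfInterest qualifier="mets:City">Cleveland</pubs:LocationOfInterest>.</pubs:Para>
"""


def inline_elements(para_xml):
    """Return (PUBS tag, qualifier, text) for each in-line child of a Para."""
    para = ET.fromstring(para_xml.strip())
    found = []
    for child in para:
        # ElementTree reports tags as "{namespace-uri}LocalName".
        local_name = child.tag.split("}", 1)[-1]
        found.append((local_name, child.get("qualifier"), child.text))
    return found


if __name__ == "__main__":
    for tag, qualifier, text in inline_elements(SAMPLE):
        print(f"{tag:20} {qualifier:15} {text}")
```

Reading the general PUBS tag and the qualifier together gives the consumer both the schema-valid category and the more specific METS class for each extracted item.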

Future plans

The Common Metadata Standards Tiger Team (CMSTT) continues to produce new versions of the PUBS standard three times a year, with v10 soon to be approved. METS v5.5 offers a choice of PUBS v5, v7, and v9; we plan to add v10 in the next release.