Search Web Services Document

Tony Hammond – 2007 November 09

In Search

The OASIS Search Web Services TC has just put out the following document for public review (Nov 7 - Dec 7, 2007):

Search Web Services v1.0 Discussion Document

From the OASIS announcement:

“This document: “Search Web Services Version 1.0 - Discussion Document - 2 November 2007”, was prepared by the OASIS Search Web Services TC as a strawman proposal, for public review, intended to generate discussion and interest. It has no official status; it is not a Committee Draft. The specification is based on the SRU (Search Retrieve via URL) specification which can be found at http://www.loc.gov/standards/sru/. It is expected that this standard, when published, will deviate from SRU. How much it will deviate cannot be predicted at this time. The fact that the SRU spec is used as a starting point for development should not be cause for concern that this might be an effort to rubberstamp or fasttrack SRU. The committee hopes to preserve the useful features of SRU, eliminate those that are not considered useful, and add features that are not in SRU but are considered useful.”
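For orientation, SRU (the starting point named above) is a URL-based protocol: a searchRetrieve operation is an ordinary HTTP GET carrying a handful of query parameters. A hypothetical example (the endpoint is made up; the parameter names are those of SRU 1.1):

```
http://example.org/sru?operation=searchRetrieve&version=1.1&query=dinosaur&maximumRecords=10
```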

DC in (X)HTML Meta/Links

Tony Hammond – 2007 November 06

In Metadata

This message posted out yesterday on the dc-general list (with following extract) may be of interest:

“Public Comment on encoding specifications for Dublin Core metadata in HTML and XHTML

2007-11-05, Public Comment is being held from 5 November through 3 December 2007 on the DCMI Proposed Recommendation, “Expressing Dublin Core metadata using HTML/XHTML meta and link elements” «http://dublincore.org/documents/2007/11/05/dc-html/» by Pete Johnston and Andy Powell. Interested members of the public are invited to post comments to the DC-ARCHITECTURE mailing list «http://www.jiscmail.ac.uk/lists/dc-architecture.html», including “[DC-HTML Public Comment]” in the subject line. Depending on comments received, the specification may be finalized after the comment period as a DCMI Recommendation.”
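For a flavour of what the proposed recommendation covers, DC metadata in (X)HTML is carried by meta and link elements along these lines (the values and profile URI here are invented for illustration; see the specification itself for the exact profile and prefixing rules):

```html
<head profile="http://dublincore.org/documents/dc-html/">
  <!-- bind the "DC." prefix to the Dublin Core element set -->
  <link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />
  <meta name="DC.title" content="An Example Document" />
  <meta name="DC.creator" content="A. N. Author" />
  <!-- URI-valued properties use link rather than meta -->
  <link rel="DC.source" href="http://example.org/source-document" />
</head>
```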

STIX Fonts in Beta

Tony Hammond – 2007 November 06

In Standards

Well, Howard already blogged on Nascent last week about the STIX fonts (Scientific and Technical Information Exchange) being launched and now freely available in beta. And today the STM Association have also blogged this milestone. So, just for the record, I’m noting those links here on CrossTech for easy retrieval. As Howard says:

“I recommend all publishers download the fonts from the STIX web site at www.stixfonts.org today.”

(And for those who want to see more of Howard, he can be found in interview here on the SIIA Executive FaceTime Webcast Series.) 🙂

DCMI Identifiers Community

Tony Hammond – 2007 October 17

In Identifiers

Another DCMI invitation. And a list. Lovely.

See this message (copied below) from Douglas Campbell, National Library of New Zealand, to the dc-general mailing list.

(Continues)

Hybrid

Tony Hammond – 2007 October 17

In XMP

So, back on the old XMP tack. The simple vision from the XMP spec is that XMP packets are embedded in media files and transported along with them - and as such are relatively self-contained units, see Fig 1.

[Image: Hybrid - A.jpg]

Fig. 1 - Media files with fully encapsulated descriptions.

But this is too simple. Some preliminary considerations lead us to see why we might want to reference additional (i.e. external) sources of metadata from the original packet:

PDFs
PDFs are tightly structured and as such it can be difficult to write a new packet, or to update an existing packet. One solution proposed earlier is to embed a minimal packet which could then reference a more complete description in a standalone packet. (And in turn this standalone packet could reference additional sources of metadata.)
Images
While it is considerably simpler to write into web-delivery image formats (e.g. JPEG, GIF, PNG), only metadata pertinent to the image itself is likely to be embedded. Also of interest is the work from which the image is derived, which is most likely to be presented externally to the image as a standalone document. (And in turn this standalone document could reference additional sources of metadata.)
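One way such a minimal embedded packet might point outwards is sketched below. The framing (x:xmpmeta, rdf:RDF, rdf:Description) is standard XMP; the use of rdfs:seeAlso to reference an external description is my own illustration, not something the XMP spec defines, and the URL is made up:

```xml
<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
      <!-- minimal embedded description -->
      <dc:format>image/jpeg</dc:format>
      <!-- pointer to a fuller, standalone metadata document -->
      <rdfs:seeAlso rdf:resource="http://example.org/metadata/hybrid-a.rdf"/>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>
```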

(Continues)

NLM Blog Citation Guidelines

I’ve just returned from the Frankfurt Book Fair and noticed that there are some recent recommendations in The NLM Style Guide for Authors, Editors and Publishers concerning citing blogs.

Which reminds me of an issue that has periodically been raised here at Crossref: should we be doing something to try and provide a service for reliably citing more ephemeral content such as blogs, wikis, etc.?

OpenDocument Adds RDF

Tony Hammond – 2007 October 14

In Metadata

Bruce D’Arcus left a comment here in which he linked to a post of his: “OpenDocument’s New Metadata System”. Not everybody reads comments so I’m repeating it here. His post is worth reading on two counts:

  1. He talks about the new metadata functionality for OpenDocument 1.2 which uses generic RDF. As he says:
> “Unlike Microsoft’s custom schema support, we provide this through the standard model of RDF. What this means is that implementors can provide a generic metadata API in their applications, based on an open standard, most likely just using off-the-shelf code libraries.”

This is great. It means that description is left up to the user rather than being restricted by any vendor limitation. (Ideally we would like to see the same for XMP. But Adobe is unlikely to budge because of the legacy code base and documents. It’s a wonder that Adobe still wants XMP to breathe.)

  2. He cites a wonderful passage from Rob Weir of IBM (something I had been meaning to blog myself, but too late now) about the changing shape of documents. I can only say: go read [Bruce’s post][2] and then [Rob’s post][3]. But anyway, a spoiler here:

    > “The concept of a document as being a single storage of data that lives in a single place, entire, self-contained and complete is nearing an end. A document is a stream, a thread in space and time, connected to other documents, containing other documents, contained in other documents, in multiple layers of meaning and in multiple dimensions.”

    I think the ODF initiative is fantastic and wish that Adobe could follow suit. However, I do still hold out some hope for XMP. After all, nobody else AFAICT is doing anything remotely similar for multimedia. Where’s the W3C and co. when you really need them? (Oh yeah, [faffing][4] about with the new [Semantic Web logo][5].) 😉
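To give a flavour of the “off-the-shelf code libraries” point: with generic RDF, document metadata is just subject-predicate-object triples that any RDF toolkit (rdflib, say) can query. A toy, dependency-free Python sketch of the idea (the URIs are made up):

```python
# Document metadata as generic RDF-style triples, queryable without
# any vendor-specific schema. A real implementation would use an
# off-the-shelf RDF library rather than a bare set of tuples.
DC = "http://purl.org/dc/elements/1.1/"

doc = "http://example.org/docs/report.odt"  # hypothetical document URI
triples = {
    (doc, DC + "title", "Quarterly Report"),
    (doc, DC + "creator", "A. N. Author"),
}

def objects(subject, predicate):
    """Return all objects for a given subject/predicate pair."""
    return [o for (s, p, o) in triples if s == subject and p == predicate]

print(objects(doc, DC + "title"))  # ['Quarterly Report']
```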

I Want My XMP

Tony Hammond – 2007 October 13

In XMP

Now, assuming XMP is a good idea - and I think on balance it is (as blogged earlier), why are we not seeing any metadata published in scholarly media files? The only drawbacks that occur to me are:

  1. Hard to write - it’s too damn difficult, no tools support, etc.
  2. Hard to model - the rigid, “simple” XMP data model both complicates and constrains the RDF data model

Well, I don’t really believe that 1) is too difficult to overcome. A little focus and ingenuity should do the trick. I do, however, think 2) is just a crazy straitjacket that Adobe is forcing us all to wear but if we have to live with that then so be it. Better in Bedlam than without. (RSS 1.0 wasn’t so much better but allowed us to do some useful things. And that came from the RDF community itself.) We could argue this till the cows come home but I don’t see any chance of any change any time soon.

(Continues)

Metadata - For the Record

Tony Hammond – 2007 October 13

In XMP

Interesting post from Gunar Penikis of Adobe entitled “Permanent Metadata” (Oct. ’04).

He talks about the issues of embedding metadata in media and comes up with this:

“It may be the case that metadata in the file evolves to become a “cache of convenience” with the authoritative information living on a web service. The web service model is designed to provide the authentication and permissions needed. The link between the two provided by unique IDs. In fact, unique IDs are already created by Adobe applications and stored in the XMP - that is what the XMP Media Management properties are all about.”
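A toy sketch of the “cache of convenience” pattern Penikis describes: the embedded packet carries a unique ID; an authoritative service (stood in for here by a dict) is consulted first, with the embedded values as fallback. The property name loosely echoes XMP Media Management’s DocumentID, but the IDs and records below are invented:

```python
# Authoritative metadata service, stood in for by a dict keyed on
# unique ID. In the model Penikis sketches this would be a web
# service providing authentication and permissions.
AUTHORITATIVE = {
    "uuid:example-0001": {"title": "Figure 1 (corrected)",
                          "rights": "(c) Example Press"},
}

def resolve(embedded):
    """Prefer the authoritative record when the embedded ID resolves;
    otherwise fall back to the embedded cache of convenience."""
    record = AUTHORITATIVE.get(embedded.get("xmpMM:DocumentID"))
    return record if record is not None else embedded

packet = {"xmpMM:DocumentID": "uuid:example-0001", "title": "Figure 1"}
print(resolve(packet)["title"])  # 'Figure 1 (corrected)'
```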

DataNet

Tony Hammond – 2007 October 12

In Data

Last week, my colleague Ian Mulvany posted on Nascent an entry about the NSF’s recent call for proposals on DataNet (aka “A Sustainable Digital Data Preservation and Access Network”). Peter Brantley, of the DLF, has set up a public group, DataNet, on Nature Network, where all are welcome to join in the discussion on what the NSF is effectively viewing as the challenge of dealing with “big data”. As Ian notes in a mail to me:

“It seems that for a fully integrated flow of data then publisher involvement is going to be required, and it is clear from the proposal that the NSF are also interested in rights management or at negotiating that issue.”