About metadata and RDF

Metadata is information about a resource. In the context of HyperContent, that resource can be any file that lives in a repository - whether it is an XML file, an image file, a Microsoft Word document, or any other kind of file. The most common use of metadata is to indicate a title, description and keywords about a resource, but robust metadata support requires a framework that can be extended to store any type of specialized or custom metadata.

RDF, short for Resource Description Framework, is a metadata framework standard issued by the World Wide Web Consortium (W3C), the same group responsible for HTML, XML, HTTP, CSS, DOM and many other standards which define the interoperability of web tools. RDF, as a way of presenting structured information, bears some abstract resemblance to XML, and can in fact be represented in XML. At its core, however, RDF uses a directed graph model which, while lacking the greater flexibility of XML, makes it possible to make metadata declarations and queries on the basis of simple statements.

An RDF statement has three components: a resource, a property, and a value. For example, the RDF for the file which is used to generate this page contains the statement "The resource located at '/docs/manual/develop/rdf.xml' has a property 'title' with the value 'RDF: Extensible Metadata'". In this case, the value is a simple string; it is also possible for the value of a statement to be another resource, which can in turn have properties. For example, the file which represents the author of this document has two statements which combined would read "The resource located at '/contributors/alex.xml' has a property 'VCard' whose value is a resource that has a property 'Full Name' with the value 'Alex Vigdor'". This chaining of resources as the values of properties of other resources allows for an XML-like nesting of metadata information.
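The two examples above can be modelled as plain (resource, property, value) tuples. This is only a minimal sketch of the statement model, not how HyperContent stores RDF; the `_:vcard` label and the helper names `values` and `follow` are hypothetical.

```python
# Each RDF statement is modelled as a (resource, property, value) tuple.
page = "/docs/manual/develop/rdf.xml"
author = "/contributors/alex.xml"
vcard = "_:vcard"  # hypothetical label for a resource with no path of its own

statements = [
    (page, "title", "RDF: Extensible Metadata"),  # value is a simple string
    (author, "VCard", vcard),                     # value is another resource
    (vcard, "Full Name", "Alex Vigdor"),          # a property of that resource
]


def values(resource, prop):
    """Return every value stated for `prop` on `resource`."""
    return [v for (r, p, v) in statements if r == resource and p == prop]


def follow(resource, *props):
    """Follow a chain of properties, treating each value as the next resource."""
    current = [resource]
    for prop in props:
        current = [v for r in current for v in values(r, prop)]
    return current


print(follow(page, "title"))
print(follow(author, "VCard", "Full Name"))
```

Following the chain `VCard` then `Full Name` is the tuple-level equivalent of the nesting described above.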

A concept borrowed from XML that is critical to RDF is that of namespaces; a namespace represents the origin and specific intent of a set of names. These names might be the names of XML elements or RDF properties; the namespace is used to resolve the ambiguity that can result, for example, if two different metadata standards both use a property with the same name, but with a different intent or structure. A namespace is identified by a URI, which is mapped to a prefix for readability. For example, the 'title' property used in HyperContent is in a namespace declared by the Dublin Core Metadata Initiative for version 1.1 of its element set, which uses the namespace 'http://purl.org/dc/elements/1.1/'. If that URI is mapped to the prefix 'dc', the Dublin Core title property is notated as 'dc:title'.
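As a rough illustration of how a prefix resolves ambiguity, the sketch below expands a prefixed name into a full property URI by concatenating the namespace URI and the local name. The `book` prefix and its URI are invented for the example, and `expand` is a hypothetical helper rather than anything HyperContent provides.

```python
# Prefix-to-namespace mappings. The 'dc' URI is the Dublin Core namespace
# quoted in the text; the 'book' namespace is made up for illustration.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "book": "http://example.org/book-metadata#",
}


def expand(qname, prefixes=PREFIXES):
    """Expand 'prefix:name' into the full URI of the property."""
    prefix, _, local = qname.partition(":")
    return prefixes[prefix] + local


# Two properties share the local name 'title' but remain distinct.
print(expand("dc:title"))
print(expand("book:title"))
```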

RDF in XML

The W3C has issued a standard for representing RDF in XML, so that XML can be used as the language for transmitting RDF information without losing the critical semantics of RDF. This standard is fairly flexible in terms of the XML structure, but this flexibility threatens to make RDF/XML very difficult to use with XPath, the query language used in XSL. HyperContent has implemented an XML representation of RDF that is compliant with the standard, but adopts a more rigid subset of the spec which is especially tailored for querying with XPath. Specifically, the resource -> property -> value relationship is represented in nested XML tags, where a property is represented as a tag with the namespace and name of the property; if the value is a resource, its properties are represented as child tags. If the value is data, it is represented as a child text node of the property tag. This XML representation is used both for metadata includes, and for the .rdf files that represent metadata when a zip archive of content is downloaded from the Content Manager.

Here's what the RDF for this file looks like serialized in XML. Note that the root element rdf:RDF is the container for a collection of statements, and provides the mapping of namespaces to prefixes. Four namespaces are declared here: the namespace for rdf, the 'fs' namespace for the filesystem, the 'dc' namespace for Dublin Core, and the 'cms' namespace for HyperContent. The fs:File tag represents the file resource, whose properties are child tags.
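The listing that accompanied this paragraph is not reproduced here. As a stand-in, the following sketch embeds an illustrative document built only from the description above and walks it with Python's standard library parser. The namespace URIs are the ones given in this chapter (plus the standard RDF namespace); the `rdf:about` attribute, the choice of properties, and the placeholder values for `cms:editor` and `cms:comment` are assumptions, not the actual metadata of this file.

```python
import xml.etree.ElementTree as ET

# Illustrative reconstruction only; property values marked as placeholders
# are not taken from the real file.
RDF_XML = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:fs="http://www.ais.columbia.edu/sws/xmlns/cufs#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:cms="http://www.ais.columbia.edu/sws/xmlns/cucms#">
  <fs:File rdf:about="/docs/manual/develop/rdf.xml">
    <dc:title>RDF: Extensible Metadata</dc:title>
    <cms:editor>PLACEHOLDER EDITOR</cms:editor>
    <cms:comment>PLACEHOLDER COMMENT</cms:comment>
  </fs:File>
</rdf:RDF>
"""

root = ET.fromstring(RDF_XML)  # the rdf:RDF container
file_resource = root[0]        # the fs:File element representing the file

# ElementTree reports each tag as "{namespace-uri}local-name".
properties = [(child.tag, child.text) for child in file_resource]
for tag, text in properties:
    print(tag, "->", text)
```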



RDF Grammars

The next two chapters offer specific details about two RDF metadata grammars supported by custom editors in HyperContent: Dublin Core and VCard. Here we list the properties defined by two core grammars in HyperContent.

Filesystem grammar

xmlns:fs="http://www.ais.columbia.edu/sws/xmlns/cufs#"

The filesystem code which provides the repository layer for HyperContent has its own namespace; this namespace has a few properties which are used specifically for search indexing but do not appear in the RDF model generated by the filesystem. The principal property of interest for site developers is fs:File; the resource which describes a file in the repository has the property 'rdf:type' with the value 'fs:File'. When serialized in XML, this is reflected as an element 'fs:File' which represents the file resource; all the file's metadata properties are represented as children of this element. Thus in XPath a file's metadata properties are always located with a string like "rdf:RDF/fs:File/{ns:property}". See the example above for help visualizing the structure.
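The path pattern "rdf:RDF/fs:File/{ns:property}" can be tried outside XSL as well. The sketch below uses the limited XPath support in Python's `xml.etree.ElementTree`; the sample document and the helper `file_property` are hypothetical. Because the parsed root is already the rdf:RDF element, the path handed to ElementTree starts at fs:File.

```python
import xml.etree.ElementTree as ET

# Prefix mappings used when resolving the path expressions.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "fs": "http://www.ais.columbia.edu/sws/xmlns/cufs#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "cms": "http://www.ais.columbia.edu/sws/xmlns/cucms#",
}

# Hypothetical sample metadata for a single file.
SAMPLE = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:fs="http://www.ais.columbia.edu/sws/xmlns/cufs#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:cms="http://www.ais.columbia.edu/sws/xmlns/cucms#">
  <fs:File>
    <dc:title>RDF: Extensible Metadata</dc:title>
    <cms:hidden>true</cms:hidden>
  </fs:File>
</rdf:RDF>
"""


def file_property(root, qname):
    """Return the text of rdf:RDF/fs:File/{qname}, or None if it is absent."""
    # `root` is the rdf:RDF element, so the relative path begins at fs:File.
    return root.findtext(f"fs:File/{qname}", default=None, namespaces=NS)


root = ET.fromstring(SAMPLE)
title = file_property(root, "dc:title")
hidden = file_property(root, "cms:hidden") == "true"
replacement = file_property(root, "cms:replacement")  # not present in the sample
print(title, hidden, replacement)
```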

HyperContent grammar

xmlns:cms="http://www.ais.columbia.edu/sws/xmlns/cucms#"

HyperContent supports a number of built-in metadata properties which may be applied to a resource of type fs:File:

cms:comment
Text entered by the person who edited this file revision at the time they saved it.
cms:editor
The name of the person who saved this revision of the file.

cms:hidden
When present with a value of "true", indicates that this file should not be built or processed as an include.
cms:replacement
The value of this property specifies the path of a file or a URI for a resource from which data and metadata should be retrieved at runtime, to replace the contents of this file. Can be useful for creating a local placeholder for a network resource, or mirroring the contents of a file at different locations in the output filesystem.
cms:includes
HyperContent supports ad-hoc, user-specified file includes. These may be used to make connections between two files, such as a course file and a faculty file, to indicate that the faculty member teaches that course. The value of the cms:includes property is a resource of type rdf:Bag. rdf:Bag is a special collection type of resource in RDF, which will contain one or more rdf:li resources. Each rdf:li resource, in turn, will have a property of type cms:Include, which may have the following properties:

cms:data
A value of yes indicates the data of the specified resource should be included.
cms:metadata
A value of yes indicates the metadata of the specified resource should be included.
cms:location
Specifies the path in the repository or URI of the resource to be included.
cms:Thumbnail
The value of the cms:Thumbnail property is a resource which has the following properties.

cms:width
Width of the thumbnail image.
cms:height
Height of the thumbnail image.
cms:format
The MIME type representing the thumbnail image format.
cms:data
The binary thumbnail image data, encoded in Base64.
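To make the nesting of cms:includes and cms:Thumbnail easier to picture, here is a hedged sketch that reads both from a sample document. The exact element nesting (an rdf:Bag element wrapping rdf:li elements), the include locations, the integer width and height, and the thumbnail payload are all assumptions made for illustration; the payload is just a few placeholder bytes, not a complete image.

```python
import base64
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "fs": "http://www.ais.columbia.edu/sws/xmlns/cufs#",
    "cms": "http://www.ais.columbia.edu/sws/xmlns/cucms#",
}

# Hypothetical metadata: two includes and a thumbnail.
SAMPLE = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:fs="http://www.ais.columbia.edu/sws/xmlns/cufs#"
         xmlns:cms="http://www.ais.columbia.edu/sws/xmlns/cucms#">
  <fs:File>
    <cms:includes>
      <rdf:Bag>
        <rdf:li>
          <cms:Include>
            <cms:location>/faculty/example.xml</cms:location>
            <cms:data>yes</cms:data>
            <cms:metadata>yes</cms:metadata>
          </cms:Include>
        </rdf:li>
        <rdf:li>
          <cms:Include>
            <cms:location>http://example.org/feed.xml</cms:location>
            <cms:metadata>yes</cms:metadata>
          </cms:Include>
        </rdf:li>
      </rdf:Bag>
    </cms:includes>
    <cms:Thumbnail>
      <cms:width>16</cms:width>
      <cms:height>12</cms:height>
      <cms:format>image/gif</cms:format>
      <cms:data>R0lGODlh</cms:data>
    </cms:Thumbnail>
  </fs:File>
</rdf:RDF>
"""

root = ET.fromstring(SAMPLE)
file_node = root.find("fs:File", NS)

# Collect every cms:Include beneath cms:includes, however deeply it is nested.
includes = []
for inc in file_node.find("cms:includes", NS).findall(".//cms:Include", NS):
    includes.append({
        "location": inc.findtext("cms:location", namespaces=NS),
        # Only the value "yes" switches a flag on; a missing tag counts as off.
        "data": inc.findtext("cms:data", namespaces=NS) == "yes",
        "metadata": inc.findtext("cms:metadata", namespaces=NS) == "yes",
    })

# Read the thumbnail properties and decode the Base64 payload to bytes.
thumb = file_node.find("cms:Thumbnail", NS)
width = int(thumb.findtext("cms:width", namespaces=NS))
height = int(thumb.findtext("cms:height", namespaces=NS))
image_format = thumb.findtext("cms:format", namespaces=NS)
image_bytes = base64.b64decode(thumb.findtext("cms:data", namespaces=NS))

print(includes)
print(width, height, image_format, len(image_bytes))
```

Searching with `.//cms:Include` keeps the lookup working even if the serialized form nests the bag differently from this sample.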