Reference Annotations

C4DM publishes a number of reference annotations, listed at http://isophonics.net/datasets/, and plans to produce more as part of its ongoing work. Examples to date include annotations of harmony (chords and keys), rhythm (beats and bars), and structural boundaries (e.g. verse and chorus), covering the Beatles catalogue and a number of songs by Queen, Carole King, and Michael Jackson. These annotations are widely used in the research community for evaluating algorithms for automatic analysis of audio, including in the international MIREX evaluations (Music Information Retrieval Evaluation eXchange, www.music-ir.org/mirex).

The annotations are transcribed by expert listeners attending closely to musical recordings using interactive software such as Sonic Visualiser (www.sonicvisualiser.org), a process that takes a significant amount of time and care. Such data typically have an associated level of confidence, which depends both on how thoroughly the results have been checked and on the intrinsic ambiguity of the musical information being described. The data are associated with specific releases of musical recordings, although the recordings themselves often cannot be shared for legal reasons. (To use the data, researchers must purchase the matching recordings using the release information supplied, such as CD and track numbers, titles and artists.)
Annotation data may be updated and should be versioned; the existing published data have in some cases been updated multiple times as corrections and additions were made. This is a normal situation which must be allowed for, as human annotators cannot be expected to produce perfect work.
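
The properties described above (a link to a specific release, a level of confidence, and a history of versions) could be captured in a simple record structure. The following is a minimal sketch only, not a description of how C4DM stores its data: every class, field, and method name is hypothetical, and the 0.0 to 1.0 confidence scale is an assumption made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Release:
    """Identifies the recording an annotation refers to (hypothetical fields)."""
    artist: str
    album_title: str
    cd_number: int
    track_number: int


@dataclass(frozen=True)
class AnnotationVersion:
    """One published revision of an annotation file."""
    version: int
    confidence: float  # assumed scale: 0.0 (unchecked) to 1.0 (fully checked)
    change_note: str
    data_file: str


@dataclass
class ReferenceAnnotation:
    """An annotation of one kind (e.g. 'chords') for one release."""
    kind: str
    release: Release
    versions: List[AnnotationVersion] = field(default_factory=list)

    def publish(self, confidence: float, change_note: str,
                data_file: str) -> AnnotationVersion:
        """Append a new version; earlier versions are kept, never overwritten."""
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence must lie between 0.0 and 1.0")
        new_version = AnnotationVersion(
            version=len(self.versions) + 1,
            confidence=confidence,
            change_note=change_note,
            data_file=data_file,
        )
        self.versions.append(new_version)
        return new_version

    def latest(self) -> Optional[AnnotationVersion]:
        """Return the most recent version, or None if nothing is published."""
        return self.versions[-1] if self.versions else None
```

Keeping every version in the list, rather than replacing the file in place, means that results obtained against an earlier revision remain reproducible after corrections are published.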

Currently C4DM has only informal methods for describing and distributing reference annotations through its Isophonics website. Files are simply copied to the web server, with the descriptive information entered manually in the Drupal-based website. There is no policy for managing descriptive metadata, no properly maintained storage facility for such data, and no effective policy in place for backup or versioning.