XML
From Arnout Engelen
MIME-types
I'd like my Apache to serve up musicxml files with the proper MIME-type, 'application/vnd.recordare.musicxml+xml'.
Since those files share the generic '.xml' extension, extension-based mapping cannot tell them apart, so we must enable mod_mime_magic and configure it to detect MusicXML files by their content.
The '<!DOCTYPE score' seems to be a reasonable marker to detect MusicXML files by, except that we can't really predict at what byte offset it will appear. I figured it'll probably be somewhere between 38 and 57:
# xml
0    string  \<?xml                 text/xml
# musicxml
>38  string  \<\!DOCTYPE\040score   application/vnd.recordare.musicxml+xml
>39  string  \<\!DOCTYPE\040score   application/vnd.recordare.musicxml+xml
>40  string  \<\!DOCTYPE\040score   application/vnd.recordare.musicxml+xml
>41  string  \<\!DOCTYPE\040score   application/vnd.recordare.musicxml+xml
>42  string  \<\!DOCTYPE\040score   application/vnd.recordare.musicxml+xml
>43  string  \<\!DOCTYPE\040score   application/vnd.recordare.musicxml+xml
>44  string  \<\!DOCTYPE\040score   application/vnd.recordare.musicxml+xml
>45  string  \<\!DOCTYPE\040score   application/vnd.recordare.musicxml+xml
>46  string  \<\!DOCTYPE\040score   application/vnd.recordare.musicxml+xml
>47  string  \<\!DOCTYPE\040score   application/vnd.recordare.musicxml+xml
>48  string  \<\!DOCTYPE\040score   application/vnd.recordare.musicxml+xml
>49  string  \<\!DOCTYPE\040score   application/vnd.recordare.musicxml+xml
>50  string  \<\!DOCTYPE\040score   application/vnd.recordare.musicxml+xml
>51  string  \<\!DOCTYPE\040score   application/vnd.recordare.musicxml+xml
>52  string  \<\!DOCTYPE\040score   application/vnd.recordare.musicxml+xml
>53  string  \<\!DOCTYPE\040score   application/vnd.recordare.musicxml+xml
>54  string  \<\!DOCTYPE\040score   application/vnd.recordare.musicxml+xml
>55  string  \<\!DOCTYPE\040score   application/vnd.recordare.musicxml+xml
>56  string  \<\!DOCTYPE\040score   application/vnd.recordare.musicxml+xml
>57  string  \<\!DOCTYPE\040score   application/vnd.recordare.musicxml+xml
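To check whether the 38 to 57 guess holds for a given set of files, the offset can simply be measured. A minimal Python sketch follows; the function name `doctype_offset` and the 256-byte window are illustrative assumptions, not part of mod_mime_magic or any other tool mentioned here:

```python
# Marker taken from the magic rules above.
MARKER = b"<!DOCTYPE score"


def doctype_offset(head: bytes, window: int = 256) -> int:
    """Return the byte offset of the MusicXML DOCTYPE marker within the
    first `window` bytes of `head`, or -1 if it does not appear there."""
    return head[:window].find(MARKER)
```

For a declaration like `<?xml version="1.0" encoding="UTF-8"?>` followed by a newline, the marker lands at offset 39; adding `standalone="no"` pushes it to 55. Both fall inside the guessed range, but a comment or extra whitespace before the DOCTYPE would push it outside.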
This does not yet match MusicXML files encoded in UTF-16. I haven't figured out how to encode that elegantly, but this seems like a start:
0      beshort  0xFFFE
>2     belong   0x3C003F00   text/xml
>0x74  beshort  0x3C00
This doesn't quite seem to work yet: Apache still gives me 'application/xml', even when I remove that type from /etc/mime.types!
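Outside of the magic file, the intended detection logic is easy to state, which may help when debugging the rules. The bytes FF FE matched by the `beshort 0xFFFE` rule are the UTF-16 little-endian byte order mark, and 3C 00 3F 00 is '<?' in that encoding. A hedged Python sketch of the same decision follows; `sniff_xml_type` is a hypothetical helper, not something Apache provides:

```python
import codecs
from typing import Optional

MUSICXML = "application/vnd.recordare.musicxml+xml"


def sniff_xml_type(head: bytes) -> Optional[str]:
    """Guess a MIME type from the first bytes of a file.

    Returns the MusicXML type, 'text/xml', or None for non-XML data.
    """
    if head.startswith(codecs.BOM_UTF16_LE):
        text = head[2:].decode("utf-16-le", errors="ignore")
    elif head.startswith(codecs.BOM_UTF16_BE):
        text = head[2:].decode("utf-16-be", errors="ignore")
    else:
        # 'utf-8-sig' also strips a UTF-8 byte order mark if one is present.
        text = head.decode("utf-8-sig", errors="ignore")

    # Like the magic rules, require the XML declaration at the very start.
    if not text.startswith("<?xml"):
        return None
    # Searching the decoded text sidesteps the unpredictable byte offset.
    if "<!DOCTYPE score" in text:
        return MUSICXML
    return "text/xml"
```

Decoding first means one check covers every offset and both byte orders, which is exactly what the fixed-offset magic rules struggle to express.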
Definition
XML formats are usually defined using a DTD, a W3C XML Schema or a RELAX NG schema.
Converters:
- dtd to xschema
- w3c schema_hack
- lumrix dtd2xs
- thaiopensource trang
- some links
I briefly tried these, and trang seems the most promising.
Data Binding
When writing an importer/exporter for some specific XML format, it'd be nice to have a code generator which takes a DTD or XML Schema and generates both the datatypes and the parsing/translation code to get from XML to those datatypes and vice versa.
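To make the idea concrete, here is a hand-written Python sketch of the kind of code such a generator might emit for MusicXML's `pitch` element (with its `step` and `octave` children). The class and method names are illustrative assumptions; none of the generators listed on this page necessarily produce code of this shape:

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Pitch:
    """Datatype for a <pitch> element, reduced to two of its children."""
    step: str
    octave: int

    @classmethod
    def from_xml(cls, elem: ET.Element) -> "Pitch":
        # A real binding would validate here; this sketch assumes both
        # child elements are present and well-formed.
        return cls(step=elem.findtext("step"),
                   octave=int(elem.findtext("octave")))

    def to_xml(self) -> ET.Element:
        elem = ET.Element("pitch")
        ET.SubElement(elem, "step").text = self.step
        ET.SubElement(elem, "octave").text = str(self.octave)
        return elem
```

Writing this by hand for every element of a large schema is tedious and error-prone, which is the whole appeal of generating it.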
Projects that seem to do something like this:
- Java
  - PlainXML
- C++
  - hydra express (haven't looked in-depth yet), proprietary
  - lmx, proprietary
  - codesynthesis, free
xsd segfaulted on the MusicXML schema I generated from the DTD with trang. I'm rather losing faith in data binding (at least for C++); maybe I should just skip this step and use a DOM parser directly...
Note: I have a complete MusicXML binding in C++ via xsd working here, based on the official MusicXML W3C XML Schema and a few modifications implemented in XSLT. A timepart.cpp and parttime.cpp demonstrate the common transformation, and it all works. Contact mlang@delysid.org if you are interested in MusicXML bindings using xsd.
Parsing
C++
libxml++ (the C++ binding around libxml2) and Xerces seem to be the main choices if you want DTD validation.
Rosegarden and Canorus, however, already seem to use QtXml. Maybe I should use that for starters: it seems to be a rather straightforward SAX parser, so it shouldn't be too hard to port to another SAX parser later on.
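The reason porting between SAX parsers tends to be cheap is that the handler shape is much the same everywhere: callbacks for element start, character data and element end. The sketch below uses Python's standard `xml.sax` module as a stand-in, not QtXml itself, and the `StepCollector` class is a made-up example that gathers the text of every `step` element:

```python
import xml.sax


class StepCollector(xml.sax.ContentHandler):
    """Collect the text content of every <step> element."""

    def __init__(self):
        super().__init__()
        self.steps = []
        self._in_step = False
        self._buf = []

    def startElement(self, name, attrs):
        if name == "step":
            self._in_step = True
            self._buf = []

    def characters(self, content):
        # Text may arrive in several chunks, so buffer it until the end tag.
        if self._in_step:
            self._buf.append(content)

    def endElement(self, name):
        if name == "step":
            self.steps.append("".join(self._buf))
            self._in_step = False


def collect_steps(xml_bytes: bytes) -> list:
    handler = StepCollector()
    xml.sax.parseString(xml_bytes, handler)
    return handler.steps
```

Keeping the application logic inside the three callbacks, and free of parser-specific types, is what would make a later switch of SAX library mostly mechanical.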
