Extending MathML 2

[24 April 2014]

The other day I got an inquiry from a user having trouble getting their extensions to MathML 2 to work in their new XSD schema. I learned some things while working on their problem.

First, let’s be clear. MathML says that it is intended to be extensible. Section 7.3.2 of MathML 2 reads in full:

The set of elements and attributes specified in the MathML specification are necessary for rendering common mathematical expressions. It is recognized that not all mathematical notation is covered by this set of elements, that new notations are continually invented, and that sub-communities within mathematics often have specialized notations; and furthermore that the explicit extension of a standard is a necessarily slow and conservative process. This implies that the MathML standard could never explicitly cover all the presentational forms used by every sub-community of authors and readers of mathematics, much less encode all mathematical content.

In order to facilitate the use of MathML by the widest possible audience, and to enable its smooth evolution to encompass more notational forms and more mathematical content (perhaps eventually covered by explicit extensions to the standard), the set of tags and attributes is open-ended, in the sense described in this section.

MathML is described by an XML DTD, which necessarily limits the elements and attributes to those occurring in the DTD. Renderers desiring to accept non-standard elements or attributes, and authors desiring to include these in documents, should accept or produce documents that conform to an appropriately extended XML DTD that has the standard MathML DTD as a subset.

MathML renderers are allowed, but not required, to accept non-standard elements and attributes, and to render them in any way. If a renderer does not accept some or all non-standard tags, it is encouraged either to handle them as errors as described above for elements with the wrong number of arguments, or to render their arguments as if they were arguments to an mrow, in either case rendering all standard parts of the input in the normal way.

I don’t find this passage in MathML3, but the sample embedding of MathML into XHTML does extend the document grammar to include XHTML elements, so I believe that the design principle remains true.

It’s easy enough to extend the document grammar as expressed by the DTD: just provide new declarations of the appropriate parameter entities which include your new elements, something along the following lines. Let us say that we have concluded that we want our extension elements to be legal everywhere that mml:mspace is legal, and we don’t need them anywhere else. We can write:

<!ENTITY % my.mml.extensions "my:ext1 | my:ext2">
<!ENTITY % petoken "%mspace.qname; | %my.mml.extensions;" >
<!ELEMENT my:ext1 (#PCDATA) >
<!ATTLIST my:ext1
          id     ID    #IMPLIED
          flavor CDATA #IMPLIED
          tone   CDATA #IMPLIED >
<!ELEMENT my:ext2 EMPTY >
<!ATTLIST my:ext2
          gloss  IDREF #REQUIRED >

For XSD, it could in principle be even simpler. The simplest way to make an XSD schema easily extensible is to include wildcards at appropriate points in content models, to allow users’ extension elements to be included in valid documents. All the user has to do is supply a schema document with the declarations of their extension elements:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:my="http://example.com/nss/sample"
           targetNamespace="http://example.com/nss/sample"
           elementFormDefault="qualified">

  <xs:complexType name="extension-type-1" mixed="true">
    <xs:sequence/>
    <xs:attribute name="id" type="xs:ID"/>
    <xs:attribute name="flavor"/>
    <xs:attribute name="tone"/>
  </xs:complexType>
  <xs:complexType name="extension-type-2" mixed="true">
    <xs:sequence/>
    <xs:attribute name="gloss" type="xs:IDREF" use="required"/>
  </xs:complexType>

  <xs:element name="ext1" type="my:extension-type-1"/>
  <xs:element name="ext2" type="my:extension-type-2"/>

</xs:schema>

In the MathML 2 XSD, it turns out to be slightly more complicated, because despite explicitly expecting extensions to the document grammar, the designers didn’t put in the most obvious possible extensibility hook: the content models of MathML elements contain no wildcards, except in the case of the annotation element. So we have some more work to do.

Plan B is to use element substitution groups. Since we want our elements to be legal wherever mml:mspace is legal, we can just make our elements substitutable for mml:mspace. In the simplest case, we would then just write our schema document thus:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:my="http://example.com/nss/sample"
           xmlns:mml="http://www.w3.org/1998/Math/MathML"
           targetNamespace="http://example.com/nss/sample"
           elementFormDefault="qualified">

  <xs:import namespace="http://www.w3.org/1998/Math/MathML"/>

  <xs:element name="ext1" substitutionGroup="mml:mspace"/>
  <xs:element name="ext2" substitutionGroup="mml:mspace"/>

</xs:schema>

The wrinkle here is that when we write it this way, our extension elements get the same type as their substitution-group head, mml:mspace. If we just reinsert the declarations of my:extension-type-1 and my:extension-type-2 and the type attributes on the element declarations, the XSD validator will remind us firmly but politely (in most cases) that the types of my:ext1 and my:ext2 must be derived from that of mml:mspace. In the case of the XSD schema for MathML 2, that means they must be derived from type mml:mspace.type. For document-oriented schemas, this type-derivation requirement is a nuisance; it came from the database-oriented part of the working group that specified XSD.
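As a sketch of the combination that triggers the complaint (assuming the free-standing my:extension-type-1 from the earlier schema document is reinserted unchanged), the declaration would look something like this:

```xml
<!-- Sketch only: a validator is expected to reject this, because
     my:extension-type-1 as first defined is derived from xs:anyType,
     not from mml:mspace.type, the type of the substitution-group head. -->
<xs:element name="ext1"
            type="my:extension-type-1"
            substitutionGroup="mml:mspace"/>
```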

Fortunately, it’s only a nuisance, not a serious obstacle. All we need to do is to define our extension types in terms of changes to type mml:mspace.type. This will require a couple of intermediate types which we’ll call bridge types. The first step in the derivation is to clear away everything we don’t want in our extension types, by restricting away any unwanted content (we’re in luck: mml:mspace.type has no content at all) and any unwanted attribute (again we’re in luck: all attributes are optional). Since one of our extension types uses an id attribute and the other does not, we’ll define two bridge types.

<xs:complexType name="bridge-with-id">
  <xs:complexContent>
    <xs:restriction base="mml:mspace.type">
      <xs:attribute name="width" use="prohibited"/>
      <xs:attribute name="height" use="prohibited"/>
      <xs:attribute name="depth" use="prohibited"/>
      <xs:attribute name="linebreak" use="prohibited"/>
      <xs:attribute name="class" use="prohibited"/>
      <xs:attribute name="style" use="prohibited"/>
      <xs:attribute name="xref" use="prohibited"/>
      <xs:attribute ref="xlink:href" use="prohibited"/>
    </xs:restriction>
  </xs:complexContent>
</xs:complexType>

<xs:complexType name="bridge-no-id">
  <xs:complexContent>
    <xs:restriction base="my:bridge-with-id">
      <xs:attribute name="id" use="prohibited"/>
    </xs:restriction>
  </xs:complexContent>
</xs:complexType>

Now we can define our extension types in terms of these (perhaps biting our tongues at the verbosity and awkwardness of the XSD syntax):

<xs:complexType name="extension-type-1" mixed="true">
  <xs:complexContent>
    <xs:extension base="my:bridge-with-id">
      <xs:sequence/>
      <xs:attribute name="id" type="xs:ID"/>
      <xs:attribute name="flavor"/>
      <xs:attribute name="tone"/>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
<xs:complexType name="extension-type-2" mixed="true">
  <xs:complexContent>
    <xs:extension base="my:bridge-no-id">
      <xs:sequence/>
      <xs:attribute name="gloss" type="xs:IDREF" use="required"/>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

The reference to xlink:href requires that we import the XLink namespace (even though all we’re doing is saying we don’t want that attribute here), so we need to add another xs:import element as well as another namespace declaration.

The schema document as a whole now looks like this:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:my="http://example.com/nss/sample"
           xmlns:mml="http://www.w3.org/1998/Math/MathML"
           xmlns:xlink="http://www.w3.org/1999/xlink"
           targetNamespace="http://example.com/nss/sample"
           elementFormDefault="qualified">

  <xs:import namespace="http://www.w3.org/1998/Math/MathML"/>
  <xs:import namespace="http://www.w3.org/1999/xlink"/>

  <xs:complexType name="bridge-with-id">
    <xs:complexContent>
      <xs:restriction base="mml:mspace.type">
        <xs:attribute name="width" use="prohibited"/>
        <xs:attribute name="height" use="prohibited"/>
        <xs:attribute name="depth" use="prohibited"/>
        <xs:attribute name="linebreak" use="prohibited"/>
        <xs:attribute name="class" use="prohibited"/>
        <xs:attribute name="style" use="prohibited"/>
        <xs:attribute name="xref" use="prohibited"/>
        <xs:attribute ref="xlink:href" use="prohibited"/>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>

  <xs:complexType name="bridge-no-id">
    <xs:complexContent>
      <xs:restriction base="my:bridge-with-id">
        <xs:attribute name="id" use="prohibited"/>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>

  <xs:complexType name="extension-type-1" mixed="true">
    <xs:complexContent>
      <xs:extension base="my:bridge-with-id">
        <xs:sequence/>
        <xs:attribute name="id" type="xs:ID"/>
        <xs:attribute name="flavor"/>
        <xs:attribute name="tone"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  <xs:complexType name="extension-type-2" mixed="true">
    <xs:complexContent>
      <xs:extension base="my:bridge-no-id">
        <xs:sequence/>
        <xs:attribute name="gloss" type="xs:IDREF" use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:element name="ext1" type="my:extension-type-1" substitutionGroup="mml:mspace"/>
  <xs:element name="ext2" type="my:extension-type-2" substitutionGroup="mml:mspace"/>

</xs:schema>
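To make the goal concrete, here is a sketch of the kind of instance document the extension is meant to allow. The attribute values and text content are invented for illustration, and the sketch assumes that mml:mspace is allowed as a child of mml:mrow in the MathML 2 schema being extended; I have not run it through a validator.

```xml
<mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"
          xmlns:my="http://example.com/nss/sample">
  <mml:mrow>
    <mml:mi>x</mml:mi>
    <!-- my:ext1 appears where mml:mspace could appear (illustrative values) -->
    <my:ext1 id="note1" flavor="sweet" tone="informal">a remark</my:ext1>
    <mml:mo>+</mml:mo>
    <mml:mi>y</mml:mi>
    <!-- gloss is required and must point at an ID in the document -->
    <my:ext2 gloss="note1"/>
  </mml:mrow>
</mml:math>
```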

There are two easy ways a vocabulary designer can make this process simpler:

  • Include wildcards at points where you want your grammar to be extensible.

This is a bit of a blunt instrument, but it sometimes gets the job done.

If you want to give the extender more control (perhaps they want some extension elements to be legal in some contexts and others to be legal in other contexts), give them extension hooks in the form of abstract elements with a minimally constraining type (e.g. xs:anyType), so that they don’t need to play games with type derivations, as was necessary for the MathML 2 extension.

  • Include abstract elements with minimally constraining types at points where you want your grammar to be extensible in context-appropriate ways.

As a general rule: any important element class in your document grammar (e.g. phrase-level-element or list or paragraph-level-element) is a good candidate for an abstract element intended to allow users to add new elements simply by making their new elements substitutable for the appropriate abstract element. (We have a new phrase-level element? Fine: declare <xs:element name="new-phrase" substitutionGroup="target:phrase-level-element"/> and we’re done.)
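A minimal sketch of what the two kinds of hook might look like in a host schema follows. Everything in it is hypothetical: the namespace URI and the element names para, note, and emph are made up for illustration, and only phrase-level-element echoes the example above. The two hooks sit in different content models on purpose. Putting a wildcard for other namespaces in the same choice as a substitution-group head could itself run into the determinism rules once extension elements from another namespace join the group.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:target="http://example.com/nss/target"
           targetNamespace="http://example.com/nss/target"
           elementFormDefault="qualified">

  <!-- Hook: abstract head element with a minimally constraining type.
       It never appears in documents itself; members of its
       substitution group appear in its place. -->
  <xs:element name="phrase-level-element" abstract="true" type="xs:anyType"/>

  <!-- One of the vocabulary's own phrase-level elements -->
  <xs:element name="emph" type="xs:string"
              substitutionGroup="target:phrase-level-element"/>

  <!-- Paragraphs accept text and any member of the substitution group -->
  <xs:element name="para">
    <xs:complexType mixed="true">
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element ref="target:phrase-level-element"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>

  <!-- Hook: wildcard admitting any element from another namespace -->
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:any namespace="##other" processContents="lax"
                minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Because the head element has type xs:anyType, an extender's element can carry whatever type it likes without any bridge types.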

Of course, the determinism rules (aka the Unique Particle Attribution constraint) in XSD still make extending a complex document grammar harder than it needs to be. But by providing appropriate extension hooks, the designer of a document grammar can make things a lot simpler for the user with special needs.