Balisage 2012 – T minus 21 days

[16 July 2012]

Hard to believe, but Balisage 2012 is only three weeks away.

On Monday 6 August there is a pre-conference symposium on quality assurance and quality control in XML. I won’t list all the scheduled talks here, but the symposium program has a good balance of theory and practice, abstract rule and concrete application, and there are several case studies from organizations with major XML publishing programs (Ontario Scholars Portal, the U.S. National Library of Medicine’s National Center for Biotechnology Information, the American Chemical Society, and Portico).

Tuesday through Friday, the conference proper will take place. Among the many talks I am looking forward to, today I’ll mention just a few.

Mary Holstege opens the conference with a talk about type introspection in XQuery; as a principal engineer at MarkLogic, she has both a deep background in the technology of XQuery and related specifications and a good understanding of how real customers with large amounts of textual data actually use XML.

Later the same day, Steven Pemberton of W3C will speak on the relation between data abstractions and their serializations, with (passing) reference to work on XForms 2.0. Steven gives dynamite talks, and I want to hear how he describes the interplay of general design problems with the concrete work of spec development.

And at the other end of the week, Friday morning Liam Quin (also of W3C) will talk about work he has been doing to characterize the body of material served as XML on the Web, in particular that part of it which is not actually well-formed XML (and thus, in the strict sense, not XML at all). Since sometimes people use the existence of non-well-formed data on the Web to support arguments that XML’s well-formedness rules are too strict for practical use, I look forward to hearing Liam’s analysis.

Of course, there is a lot more to look forward to. I hope, dear reader, that I will see you in Montreal next month!

XSLTForms 1.0RC, subforms, and a 50% speedup

[9 July 2012]

A couple of weeks ago, I took some time to explore the use of sub-forms in XSLTForms, as a possible way to speed up an XForm I had written that was a little slower than I would have liked.

The short version of the story is: WOW! Well worth learning to use.

To understand the longer version, dear Reader, you should know that one of the most common performance issues in serious uses of XForms is that forms sometimes slow down when the instance documents they are working on get big. I assume this is because browsers are profligate with resources, perhaps because some aspects of the XML DOM force them to be, perhaps because profligacy pays off most of the time. But I can’t say I really know for sure.

So one of the things that sophisticated users of XForms spend a lot of time on is finding ways to avoid loading all the instance documents at once. (This is a lot easier when you’re using an XML database as a back end, of course.) Another is finding ways to avoid loading all of the form at once; that is where sub-forms come in. The word doesn’t occur in the XForms 1.1 spec, but a number of implementations provide experimental support for sub-forms as an extension. The basic idea is that whenever certain events occur in a form, the XForms implementation will load some appropriate resource specifying some XForms widgets and bind them into the current form. When other events occur, those widgets will be unloaded again. I first saw this in a demo on the BetterFORM site a few years ago, but I see that Mark Birbeck was talking about this as long ago as 2006. And more recently, Alain Couthures has added sub-form support to XSLTForms.

Making my form use a sub-form turned out to be simpler than I had feared. I already had a full working version of the form, and it was clear which part of it I wanted to load and unload dynamically. What I had to do was just:

  1. Move the part of the form that should load dynamically (which I’ll now call the subform) into a separate XHTML + XForms document. Give it a simple XForms model, and check to make sure that it works by itself. (It doesn’t actually have to work by itself, but it’s helpful to know the subform hasn’t got fatal errors on its own.)
  2. In the main form, put an XForms xf:group where the sub-form used to be; give that group an ID.
  3. Associate a load action with the appropriate event. (In my form, I had a trigger that toggled a switch, exposing the read/write view of some material. The sub-form now has that read-write view, and the trigger now throws a load action.)
  4. Associate an unload action with another appropriate event. (In my form, this was the trigger that formerly toggled the switch back to the read-only view.)

In principle, that’s it, though I had to fiddle a bit to make everything work right. In particular I ended up adding a ref="." attribute to the outermost xf:group in the sub-form. I’m not yet sure just when this is necessary and when it’s not.
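The four steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the actual form: the file name subform.xhtml, the id subform-host, and the trigger labels are invented for illustration, and the show="embed" and targetid attributes and the xf:unload action reflect the XSLTForms extension syntax as best I can reconstruct it from the sample on the XSLTForms site. Sub-forms are not part of XForms 1.1, so check the details against the release you are using.

```xml
<!-- Main form (fragment). Assumes the xf: and ev: prefixes are bound
     to the XForms and XML Events namespaces on the root element. -->

<!-- Step 3: a trigger whose action loads the sub-form
     into the placeholder group. -->
<xf:trigger>
  <xf:label>Edit</xf:label>
  <xf:load ev:event="DOMActivate" show="embed" targetid="subform-host">
    <!-- hypothetical file name for the separate sub-form document -->
    <xf:resource value="'subform.xhtml'"/>
  </xf:load>
</xf:trigger>

<!-- Step 4: a trigger whose action unloads the sub-form again. -->
<xf:trigger>
  <xf:label>Done</xf:label>
  <xf:unload ev:event="DOMActivate" targetid="subform-host"/>
</xf:trigger>

<!-- Step 2: the group, with an ID, that stands where the
     dynamically loaded material used to be. -->
<xf:group id="subform-host"/>
```

The sub-form itself (step 1) lives in its own XHTML + XForms document with its own simple model; in my case its outermost xf:group also carries ref=".".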

The simple example of sub-forms loading on the XSLTForms web site is very helpful here: it’s a very simple example and illustrates all the moving parts clearly. (But you will need to read the source and think about what is going on; there isn’t a lot of commentary or documentation around.)

What really impressed me were the effects of this change on the performance of the form.

Since sub-form support was added fairly recently to XSLTForms, I had to upgrade from an older release of XSLTForms to the recent release 1.0RC. I did some fairly tedious timings before and after I made the change, and I can say with some evidence that this change alone gave my form about a 25% increase in speed. Then I made the changes mentioned above, to use sub-forms. That gave me another 25% increase, so that on almost all actions version 1.0RC using sub-forms was about twice as fast as the older version Beta3 using a monolithic form.

Moral 1: If you are having performance issues with an XForm, and you can see how you might use a sub-form, then try it.

Moral 2: If you are having performance issues with an XForm, and you are using XSLTForms, then try moving to 1.0RC even if you can’t see how to use a sub-form in your context. Alain Couthures has done a lot of work on performance, and it clearly helps.

Bear in mind that the precise syntax and semantics of sub-forms are a topic of discussion in the XForms working group, so (a) they are subject to change, and (b) the working group is open to suggestions for making sub-forms (or any other part of XForms) work better.

XML, XForms, and XQuery courses in June 2012

[5 April 2012]

I’ll be teaching three courses / workshops this June.

XML for digital librarians

Recently the organizers of the ACM / IEEE-CS 2012 Joint Conference on Digital Libraries asked me to teach a pre-conference tutorial on “Making the most of XML in Digital Libraries”; JCDL will be hosted by George Washington University in Washington, DC, this year.

The tutorial description runs something like this:

The Extensible Markup Language (XML) was designed to help make electronic information device- and application-independent and thus give that information a longer useful lifetime. XML is thus a natural tool for constructing digital libraries. But where exactly does XML fit into the conceptual framework of digital libraries? Where can XML and related technologies help achieve DL goals?

This tutorial will provide participants with an introduction to basic concepts of XML and a DL-oriented overview of XML and related technologies (XML, XPath, XSLT, XQuery, the XML information set, XDM, XForms, XProc, and many more). The intent is to show how XML can be used to help digital libraries achieve their goals and to enable participants to know which XML technologies are most relevant for the work they are involved with.

The JCDL site doesn’t have a detailed schedule yet, so I don’t know the exact time and date of this tutorial.

XForms for XML users

I’ve also arranged with Mulberry Technologies of Rockville, Maryland, to use their training facilities to offer two two-day training courses immediately before and after the JCDL conference, so that people traveling to DC for JCDL can extend their trip at one or both ends to attend the courses.

One course provides an Introduction to XForms for XML users, covering a technology with huge (and largely unrecognized) potential for users of XML. XForms is built around the model / view / controller idiom, with a collection of XML documents playing the role of the model, and with the view represented (in most uses of XForms) by an XHTML document. The end result is that a few lines of XHTML, XForms, and CSS can suffice to build user interfaces that would take hundreds of lines of Javascript and thousands or tens of thousands of lines of Java or a similar language. XForms makes it feasible to write project-specific, workflow-specific, even task-specific XML editors; no serious XML project should be without XForms capabilities.
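To make the model / view split concrete, here is a minimal sketch of a complete form. It is an illustration only, not material from the course: the instance id greeting and the element names data and name are invented for the example.

```xml
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xf="http://www.w3.org/2002/xforms">
  <head>
    <title>Minimal XForms sketch</title>
    <!-- Model: an XML instance document holds the data.
         The names here are hypothetical. -->
    <xf:model>
      <xf:instance id="greeting">
        <data xmlns="">
          <name>World</name>
        </data>
      </xf:instance>
    </xf:model>
  </head>
  <body>
    <!-- View: ordinary XHTML with XForms controls, each bound
         to a node in the instance by an XPath expression. -->
    <p>
      <xf:input ref="instance('greeting')/name">
        <xf:label>Name: </xf:label>
      </xf:input>
    </p>
    <p>Hello, <xf:output ref="instance('greeting')/name"/>!</p>
  </body>
</html>
```

Both controls are bound to the same node, so when the user changes the input the output is refreshed by the XForms processor, with no script written by the form author.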

The XForms course will be held just before JCDL, on Friday and Saturday 8 and 9 June.

XQuery for documents

The other course provides an introduction to XQuery for documents. Much of the interest in XQuery has come from database vendors and database users, and not surprisingly much of the public discussion of XQuery has focused on the kinds of problems familiar to users of database management systems. Those who use XML for natural-language documents have, I think, sometimes gotten the impression that XQuery must be aimed primarily at other kinds of XML and other kinds of people. This course is designed to introduce XQuery in a way that underscores its relevance to the human-readable documents that are historically the core use case for XML, with examples that assume an interest in documents rather than an interest in database management systems. This course will be held just following JCDL, on Friday and Saturday 15 and 16 June.

Further information on the two Black Mesa Technologies courses is on the Black Mesa Technologies site at the pages indicated.

An XForms case study, part 2: triggers for setting values

[20 February 2012]

This is another in an ongoing series of posts describing the design and implementation of the evaluation forms we put together last year for Balisage: The Markup Conference. The first post in the series discussed the overall look and feel of the forms.

The conference has used printed feedback forms in the past, and the online forms ask essentially the same questions as the paper forms have always asked. Overall, what’s your impression of the registration process? the materials (the on-site guide to the conference, the proceedings), etc. etc. For each question, the paper forms ask for an overall judgement (Good, Okay, Bad) and provide a space for comments. In recent years, the space for the overall judgement has taken the form of three small graphic images, showing smiling, neutral, and frowning faces:

A first version of the form

The XML document created by the form has a sequence of elements for the various topics: registration, materials, presentations, etc. Each covers one question in the form. For most questions, the element has an attribute named overall to record the user’s overall judgement of how well we did in that area, and its contents include any comments the user types into the comments field. (A few questions have a more complex XML representation, which will be discussed in separate posts.) If you would like to see the XML produced by the form in full detail, you can: go to the overview page, select the form you want to examine, optionally fill it in with some sample data, and select the Show XML button at the bottom of the page.
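As a rough illustration of that structure, the instance might look something like the sketch below. The element names registration, materials, and presentations and the attribute overall are as described above; the root element name evaluation and the sample comments are invented for the example, and the real forms contain more topics.

```xml
<xf:instance id="eval">
  <!-- 'evaluation' is a hypothetical root element name;
       the comment text is invented sample data. -->
  <evaluation xmlns="">
    <registration overall="good">Quick and painless.</registration>
    <materials overall="okay">The on-site guide was handy.</materials>
    <!-- A topic the user has not yet rated or commented on. -->
    <presentations overall=""/>
  </evaluation>
</xf:instance>
```

With an instance of this shape, the expression instance('eval')/registration selects the element for the registration question, @overall selects its rating, and the element's text content holds the comments.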

A straightforward translation into XForms of the question about registration would use a select1 element with the three values Good, Okay, and Bad, and an associated comments field, like this:

<xf:group ref="instance('eval')/registration">
<xf:label>Registration process</xf:label>
<xf:select1 ref="@overall" appearance="full">
<xf:label/>
<xf:item>
<xf:label>Good</xf:label>
<xf:value>good</xf:value>
</xf:item>
<xf:item>
<xf:label>Okay</xf:label>
<xf:value>okay</xf:value>
</xf:item>
<xf:item>
<xf:label>Bad</xf:label>
<xf:value>bad</xf:value>
</xf:item>
</xf:select1>
<xf:textarea ref=".">
<xf:label>Comments</xf:label>
</xf:textarea>
</xf:group>

In a browser, it would look something like this:

You can also load the entire form into your own browser to examine it.

A first enhancement: smiling faces

The first obvious enhancement is to follow the paper form in using images of smiling and frowning faces for the various values. A few hours on the Web turned up a large number of public-domain clip-art sites, many with sets of widgets including smiling and frowning faces. It turns out that there are also smiling-face and frowning-face characters in Unicode (in the Miscellaneous Symbols block U+2600 to U+26FF), but no corresponding neutral face. And nowadays there is an entire block of Unicode emoticons (U+1F600 to U+1F64F). I was tempted by the Unicode characters, but ultimately decided that on my screen, at least, they were too hard to read clearly. So eventually I settled on a set of icons from a clip-art site.
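For comparison, the Unicode variant I decided against might have looked something like this. It is a sketch only: it borrows the neutral face from the Emoticons block, since the Miscellaneous Symbols block lacks one, and how legibly the characters render depends on the fonts available to the browser.

```xml
<xf:select1 ref="@overall" appearance="full">
  <xf:label/>
  <xf:item>
    <!-- U+263A WHITE SMILING FACE (Miscellaneous Symbols) -->
    <xf:label>&#x263A; Good</xf:label>
    <xf:value>good</xf:value>
  </xf:item>
  <xf:item>
    <!-- U+1F610 NEUTRAL FACE (Emoticons block) -->
    <xf:label>&#x1F610; Okay</xf:label>
    <xf:value>okay</xf:value>
  </xf:item>
  <xf:item>
    <!-- U+2639 WHITE FROWNING FACE (Miscellaneous Symbols) -->
    <xf:label>&#x2639; Bad</xf:label>
    <xf:value>bad</xf:value>
  </xf:item>
</xf:select1>
```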

Integrating the smiley images into the labels is simple: just embed an img element in the label, as shown here.

<xf:group ref="instance('eval')/registration">
<xf:label>Registration process</xf:label>
<xf:select1 ref="@overall" appearance="full">
<xf:label/>
<xf:item>
<xf:label>
<img src="lib/smiley_thumbs_up.png"
class="emoticon"
height="20"
alt="Good"/> Good
</xf:label>
<xf:value>good</xf:value>
</xf:item>
<xf:item>
<xf:label>
<img src="lib/smiley_pleased.png"
class="emoticon"
height="20"
alt="OK"/> Okay
</xf:label>
<xf:value>okay</xf:value>
</xf:item>
<xf:item>
<xf:label>
<img src="lib/smiley_thumbs_down.png"
class="emoticon"
height="20"
alt="Bad"/> Bad
</xf:label>
<xf:value>bad</xf:value>
</xf:item>

</xf:select1>
<xf:textarea ref=".">
<xf:label>Comments</xf:label>
</xf:textarea>
</xf:group>

This provides a bit more color in the form:

To make it easier to see what the images are saying, the full form includes a large version of the images at the top, with explanatory labels. (This version of the form actually uses two different sets of clip-art images, so I could look at each of them in context and decide which one I wanted to use.)

A second enhancement: clicking on the images

The overall Good / Okay / Bad ratings still take up a lot of screen real estate, though.

And what I really wanted was not just to use the images as part of the labels for the radio buttons, but to have an even simpler interface: ideally, I wanted just to show the three icons and let the user click on them to specify that they thought things were good, bad, or okay. (There is a certain amount of danger in this way of thinking: part of the point of XForms is to provide device-independent forms that might be rendered in different ways on different devices: getting really specific about details like this may interfere with device independence.)

This can be achieved in XForms, too. After some experimentation, I ended up with a method of handling this that looks like this in the browser: a large image showing the current rating for the topic, and three small images (but larger than the images in the earlier versions of the form) for changing the rating.

The large display is handled with a sequence of xf:group elements, each of which contains an img element and each of which binds to the overall rating for a given topic if that rating has the particular value associated with the image.

<xf:group ref="instance('eval')/registration">
<xf:label>Registration process</xf:label>
<div class="overall-rating">
<xf:group ref=".[@overall='good']">
<img src="lib/smiley_thumbs_up.png"
height="150px"
alt="good"/>
</xf:group>
<xf:group ref=".[@overall='okay']">
<img src="lib/smiley_pleased.png"
height="150px"
alt="okay"/>
</xf:group>
<xf:group ref=".[@overall='bad']">
<img src="lib/smiley_thumbs_down.png"
height="150px"
alt="bad"/>
</xf:group>
<xf:group ref=".[@overall='dunno']">
<img src="lib/smiley_no_speak.png"
height="150px"
alt="dunno"/>
</xf:group>
</div>
...
</xf:group>

So the big image for the registration topic will show a smiling face if the overall attribute on the registration element has the value good, and a frowning face if it has the value bad, and so on.

The value of the overall attribute could be set by the select1 elements shown above, but in the final version we used a set of XForms triggers (user-activatable controls — on a laptop screen these are typically buttons) labeled with the images for the values:

<div class="simple">
<xf:group ref="instance('eval')/registration">
<xf:label>Registration process</xf:label>
... <!--* read-only display image, as shown above *-->
<div class="overall-triggers">
<xf:trigger>
<xf:label><img src="lib/smiley_thumbs_up.png"
height="40" alt="good"/></xf:label>
<xf:setvalue ev:event="DOMActivate"
ref="@overall"
value="'good'"/>
</xf:trigger>
<xf:trigger>
<xf:label><img src="lib/smiley_pleased.png"
height="40"
alt="okay"/></xf:label>
<xf:setvalue ev:event="DOMActivate"
ref="@overall"
value="'okay'"/>
</xf:trigger>
<xf:trigger>
<xf:label><img src="lib/smiley_thumbs_down.png"
height="40" alt="bad"/></xf:label>
<xf:setvalue ev:event="DOMActivate"
ref="@overall"
value="'bad'"/>
</xf:trigger>
</div>
<xf:textarea ref=".">
<xf:label>Comments</xf:label>
</xf:textarea>
</xf:group>
</div>

As you can see, each trigger contains a label element which in turn contains the image, and a setvalue element which specifies the action to be run when the trigger is activated. In each case, the action is to set the value of the overall attribute to the appropriate value.

The group as a whole, the large image, and the set of three triggers are each wrapped in an XHTML div element with a different class value, to make it easier to style the display appropriately using CSS.

With XForms, it was easy to get a first working version of the form, much easier than it has ever been for me to get an equivalently complex form running in straight HTML forms, with its model of form content as a flat undifferentiated sequence of attribute-value pairs. And with XForms and CSS, it was (relatively) easy to change the look and to replace the radio buttons of the first version of the form with image-labeled buttons. Working with Javascript libraries has never been anything like this straightforward for me.

An XForms case study, part 1 (look and feel)

[16 January 2012]

In a previous post, I mentioned the evaluation forms we put together for the Balisage conference last year, using XForms. This is the first of a series of posts discussing some aspects of the design and development in more detail.

One of the first requirements for these forms was that if at all possible they should have the same look and feel as the pages on the main conference site. In XForms, this turns out to be remarkably simple: XForms is designed to be styled using whatever styling mechanisms are usual for the host document vocabulary. In the case of XForms in XHTML, running in a Web browser, that means the form can be styled using CSS. And because the form is embedded in a normal XHTML document, any necessary logos or graphic apparatus can be embedded in the normal way.

The forms pointed to from http://balisage.blackmesatech.com/ do three things to maintain the Balisage look and feel:

  • They point to an appropriate CSS stylesheet, in the usual way.

    <html xmlns="http://www.w3.org/1999/xhtml"
    xmlns:xf="http://www.w3.org/2002/xforms"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    xhtml:dummy="Help the poor user of Mozilla
    evade the Mozilla namespace curse"
    ...
    >
    <head>
    <title>Balisage Speaker/Presentation Feedback</title>
    
    ...
    
    <link href="lib/feedback.css" rel="stylesheet" />
    
    ...
    

    [The xhtml:dummy attribute is a work-around that helps XSLTForms compensate for the bug in Mozilla’s XSLT implementation: Mozilla does not support the XPath namespace axis, so the only way XSLTForms can discover what namespaces are in scope is to walk around the tree collecting namespace bindings from elements and attributes in the document. The value of the attribute doesn’t matter; I use a value that reminds me of why I have to have the attribute there in the first place.]

  • They use the same overall document structure as the main Balisage pages: three divs of class header, mainbody, and footer, respectively,
    with the second in turn divided into navbar and body.

    <body>
    
    <div class="header">
    ...
    </div>
    
    <div class="mainbody">
    <div class="navbar">
    ...
    </div>
    
    <div class="body">
    ...
    </div>
    </div>
    
    <div class="footer">
    <hr />
    <p>Last revised 26 July 2011.
    Copyright &#169; 2011 <a
    href="http://www.blackmesatech.com/">Black Mesa
    Technologies LLC</a>.
    <a rel="license"
    href="http://creativecommons.org/licenses/by-sa/3.0/">
    <img alt="Creative Commons License"
    style="border-width:0"
    src="http://i.creativecommons.org/l/by-sa/3.0/88x31.png"
    /></a>
    <br/>
    The <span xmlns:dct="http://purl.org/dc/terms/"
    href="http://purl.org/dc/dcmitype/InteractiveResource"
    property="dct:title"
    rel="dct:type">Balisage 2011
    Feedback Forms</span>
    by <a xmlns:cc="http://creativecommons.org/ns#"
    href="http://balisage.blackmesatech.com/2011/feedback/"
    property="cc:attributionName"
    rel="cc:attributionURL">Black Mesa Technologies
    LLC</a>
    are licensed under a
    <a rel="license"
    href="http://creativecommons.org/licenses/by-sa/3.0/"
    >Creative Commons Attribution-ShareAlike 3.0
    Unported License</a>.</p>
    </div>
    
    </body>

    [Note: strictly speaking, as those who go and look at the HTML coding of the main Balisage site will discover, this is not “the same overall document structure” as the main Balisage pages. For reasons I won’t go into here, the main Balisage site uses XHTML tables to lay out the pages. I usually try to avoid tables, so the forms use div elements with CSS style rules to govern the layout. Apart from avoiding invidious comments from those who disapprove of using tables for layout purposes, I find the documents somewhat easier to navigate and edit this way.]

  • They embed the appropriate material. The header embeds the Balisage logo:

    <div class="header">
    <a href="http://www.balisage.net/">
    <img src="bflib/Balisage-2011-logoType.png"
    width="35%"
    alt="Balisage 2011 logo"/>
    </a>
    </div>

    And the navbar embeds a suitable set of links.

    <div class="navbar">
    <p class="upbutton">
    <a href="http://www.balisage.net/">Balisage 2011</a>
    </p>
    <p class="upbutton">
    <a href="../../">Balisage at Black Mesa Technologies</a>
    </p>
    <p class="downbutton">
    <a href="http://www.balisage.net/2011/Program.html">Program</a>
    </p>
    <p class="downbutton">
    <a href="index.xml">Balisage 2011 Feedback Forms</a>
    </p>
    <p class="downbutton">
    <a href="conference.xhtml">Conference Feedback</a>
    </p>
    <p class="downbutton">
    <a href="symposium.xhtml">Symposium Feedback</a>
    </p>
    <p class="downbutton">
    <a href="speaker.xhtml">Speaker Feedback</a>
    </p>
    </div>

It turns out to be psychologically helpful to have the form appropriately styled, both for me in developing it and for those whom I ask to review draft versions of the form. It’s so helpful, in fact, that one of the first things I do, in developing a form for a particular site, is to create an XForms template for the site, with

  • the namespace declarations for XHTML, XForms, XML Events, XSD, and any other namespaces that may be needed (extra namespace declarations cause no trouble, and missing ones cause a lot);
  • a link to the standard site stylesheet, or (if the standard site stylesheet uses tables) an equivalent stylesheet that provides the correct look
  • the high-level document structure needed by the site stylesheet
  • dummy headings and body text, using Greeking (‘lorem ipsum …’)
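A template of that kind might look roughly like the sketch below. It is an illustration rather than the template I actually use: the path to the XSLTForms stylesheet and the instance id data are assumptions, and the class names and the link to lib/feedback.css simply follow the Balisage forms shown above.

```xml
<?xml-stylesheet href="xsltforms/xsltforms.xsl" type="text/xsl"?>
<!-- The stylesheet path above is an assumption; adjust it to
     wherever XSLTForms is installed on the site. -->
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xf="http://www.w3.org/2002/xforms"
      xmlns:ev="http://www.w3.org/2001/xml-events"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema"
      xmlns:xhtml="http://www.w3.org/1999/xhtml"
      xhtml:dummy="work-around for Mozilla namespace handling">
  <head>
    <title>Lorem ipsum</title>
    <!-- The standard site stylesheet, or a table-free equivalent. -->
    <link href="lib/feedback.css" rel="stylesheet" />
    <!-- Placeholder model, to be replaced for each form. -->
    <xf:model>
      <xf:instance id="data">
        <data xmlns=""/>
      </xf:instance>
    </xf:model>
  </head>
  <body>
    <!-- High-level structure expected by the site stylesheet. -->
    <div class="header">...</div>
    <div class="mainbody">
      <div class="navbar">...</div>
      <div class="body">
        <h1>Lorem ipsum</h1>
        <p>Lorem ipsum dolor sit amet ...</p>
      </div>
    </div>
    <div class="footer">...</div>
  </body>
</html>
```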

XForms is designed to allow forms to be integrated nicely into a site’s normal look and feel; the template makes it easier to do that consistently across all the forms used on a given site.

Another example of XForms

[9 January 2012]

In recent years I’ve spent a fair amount of time telling people about XForms as a method for making special-purpose XML editors. From time to time people ask me what sorts of things it’s possible to do in XForms, and by implication what sorts of things don’t fit very well in XForms. It’s a good question, but I don’t know a good way to answer in words. The only way I know to answer is to show some examples of things done in XForms.

In that connection, perhaps it’s worth while to point to an XForms application I put together last year for Balisage 2011. The organizers of Balisage place great weight on feedback from participants, and we’ve always used paper feedback forms distributed to participants on the last day of the conference. Paper forms have the drawback of only being in one place at any given time, and since the organizers don’t all work in the same location, most of the organizers only got a chance to look at the feedback forms on the afternoon after the conference ended, before everyone went home. (Yes, we could have had them photocopied, but we never got around to it.) So last year I suggested we do electronic forms, in addition to the paper forms (which we entered into the electronic system by hand, afterwards).

The forms we ended up with (all pointed to from http://balisage.blackmesatech.com/) illustrate one kind of application it’s easy to do in XForms. To answer, up front, a frequently asked question: no, they don’t do anything you couldn’t do in Javascript or HTML Forms plus a little bit of Javascript. (Javascript is a Turing-complete language; how could any technology do anything that could not, in principle, be done in Javascript? Asking if XForms can do things you couldn’t do in Javascript is like asking whether the XForms spec contains a refutation of Gödel’s Theorem.) Doing them in XForms made these forms easier to develop and debug, and XForms provide XML output, which means the forms are easier to summarize and analyse than they would otherwise be. I have written conventional HTML forms interfaces to generate XML, so I know it’s possible. I also know it’s tedious; XForms is much much more convenient.

These feedback forms posed a few interesting design challenges, and I went through several iterations, bugging my colleagues on the organizing committee for feedback until I suspect most of them were heartily sick of the whole thing. The various alternatives are worth some discussion; in subsequent posts I’ll discuss the issues and some of the alternative ways of handling them in XForms.

Pierazzo on digital diplomatic editions

[2 January 2012]

A new issue of Literary and Linguistic Computing arrived not long ago. I’ve been meaning to note that it contains an article I hope will become standard reading for anyone involved with the digitization of texts and manuscripts: Elena Pierazzo’s thoughtful essay under the title “A rationale of digital documentary editions”. (LLC 26.4 (Dec 2011): 463-477.)

Dr. Pierazzo is a member of the team responsible for the Jane Austen’s Fiction Manuscripts Digital Edition and she has used her experience with that edition to reconsider the question “are digital editions different from printed ones? … Do they represent an advancement of textual scholarship or just a translation of the same scholarship into a new medium?” My own inclination is almost always to stress the continuity of scholarly concerns here as more important than the discontinuities of the technologies pressed into the service of scholarship, but Dr. Pierazzo reaches the different conclusion that digital editions are “substantially different” from print editions. She begins her argument by defining her terms and discussing some central questions (“What is a diplomatic edition?” “What does a diplomatic edition contain?” “Once you start identifying features to encode, where do you stop?”). She takes a firm stand against the view that it’s possible for an edition to encode everything about a source document (so what does an edition contain? A selection of the available information). She concludes her survey of recent thinking about the topic with the remark that

It is only with the advent of digital editions that we have started to understand that what we need is scholarly guidance …, that we need to rethink the reasons why we make our transcriptions, and that this approach should apply to print and digital editions alike.

She then proceeds to make a start on the task she has identified, listing a number of features (or rather, classes of features) which may be recorded, or elided, in a digital edition, and proposing five criteria for deciding whether to include or exclude them:

  • the purpose of the edition
  • the needs of prospective readers
  • the nature of the document
  • the capabilities of the publishing medium
  • cost

In the section on the purpose of the edition, she uses the decisions made by the Austen project to illustrate the kinds of considerations that arise, and devotes a useful couple of paragraphs to “What was not encoded”. As the last two items in her list of criteria suggest, while she argues that we need scholarly guidance about what information about a manuscript can usefully be recorded, she also takes a pragmatic view of the limitations imposed on any project by finite resources. The essay is, in the nature of things, overtly concerned only with text-bearing objects like manuscripts. But many of the concerns addressed will bear also on other forms of cultural artefact.

A wonderful piece; no one engaged in digitization of cultural heritage materials should miss it.

Balisage 2011

[11 August 2011]

Last week, Balisage 2011 took place in Montréal.

As one of the organizers, I should not brag overmuch about the conference, but I can’t resist saying that on the whole it seemed to go fairly well this week. (But hey, you don’t have to take my word for it. Cornelia Davis of EMC, who gave a well-received talk about Programming Application Logic for RESTful Services Using XML Technologies, has written a blog post describing her experiences at Balisage 2011. Read it!)

Allen Renear (with collaborators) and Walter Perry gave thoughtful and thought-provoking papers on the nature of identity and the role of identifiers (rather dense, and probably not to everyone’s taste — I suspect some found them less thought-provoking than just provoking). There were case-study reports on ebook deployments, the markup of supplementary material in electronic journals (huge issue for the maintainers of the scholarly record in science), and the revival in XML of an old SGML project whose server died. Michael Kay and O’Neil Delpratt talked about their work measuring the performance benefits of byte-code generation in Saxon. Eric van der Vlist described a small Javascript library he has written for the support of multi-ended hyperlinks. Eric Freese reported on the state of EPUB3. Michael Kay gave an impromptu evening session on SaxonCE (XSLT 2.0 in the browser, now out for alpha testing). Jean-Yves Vion-Dury talked about a method of encrypting XML documents in such a way that a service provider can store them and perform certain operations on them (like running a restricted class of XSLT stylesheets or XQueries over them) without decrypting them, by the ingenious technique of encrypting appropriate parts (that’s the tricky bit, what exactly are the appropriate parts?) of the stylesheet or query. And there was much, much more.

Balisage 2012 will be in Montréal in August 2012. As Patrick Durusau put it:

If you see either < or > at work or anyone talks about them, you need to be at Balisage ….

Mark your calendars.

Bilingualism as a specification tool

[16 May 2011]

The other day I discovered that ISO, the International Organization for Standardization, makes two documents called ISO/IEC Directives, Part 1: Procedures for the technical work and ISO/IEC Directives, Part 2: Rules for the structure and drafting of International Standards available on the Web. Since ISO (and IEC, the International Electrotechnical Commission) have been doing technical specifications for a long time, their rules are informative for anyone interested in standards or in standards development.

In Part 2 of the ISO/IEC directives we find the following astute observation:

4.5 Equivalence of official language versions

The texts in the different official language versions shall be technically equivalent and structurally identical.

The use of bilingualism from the initial stage of drafting is of great assistance in the preparation of clear and unambiguous texts.

Or as they say in French:

4.5 Équivalence des versions dans les langues officielles

Les versions dans les différentes langues officielles doivent être équivalentes sur le plan technique et avoir une structure identique.

L’utilisation du bilinguisme dès la phase initiale de la rédaction facilite l’élaboration de textes clairs et sans ambiguïté.

This coincides with my own experience: the XML specification became much clearer when Murata Makoto translated a draft into Japanese and raised question after question about the precise meaning of this passage and that. Some of the clarifications were purely editorial, but sometimes his questions revealed that some aspect of the spec had not yet been fully thought through.

I think the practice of describing systems both in prose and in a formal specification language like Z, VDM, Alloy, or ACL2 counts as another form of bilingualism and produces some of the same benefits. (And unfortunately, when not all members of the responsible working group are bilingual in the languages used, it also poses similar challenges.) This is on my mind because recently I followed a reference to John Guttag and J. J. Horning’s paper “Formal specification as a design tool”, which makes a very readable case for the use of formal specifications in the early stage of design for any system. Some of the arguments for using formal specification don’t transfer to other natural languages (or multiple programming languages), but it’s striking that Guttag and Horning recommend formalization at the design stage, in a way that strikes me as similar to ISO’s urging bilingual development of the document, not just translation of the finished product.

XForms course 16-17 June 2011, San Carlos, California

[30 March 2011]

I’m happy to announce that in June, I’ll be offering a two-day hands-on XForms course in San Carlos, California, in facilities generously provided by MarkLogic Corporation (itself a user of XForms technology).

The organization of the material will be substantially the same as in the earlier iterations of the course. And the rationale for XForms, as I understand it, remains as described in an earlier post to this blog; the course is directed to users of XML who want to use XForms to work with XML representations of information.

In case it matters to potential attendees, I’ll point out that the course will occur just before the Digital Humanities 2011 conference down the road in Palo Alto.