Versioning Thoughts (in HTML)
David G. Durand (dgd@cs.bu.edu)
Wed, 5 Jun 1996 14:56:22 -0400
The following is a sort of little "position paper" on versioning in the
WWW. It's in HTML and it's longish, so you can also look at:
http://cs-www.bu.edu/students/grads/dgd/HTML_versions.html
{Ed: I have included the HTML source directly into the mail message
for the Hypermail archive. - Jim}
<H1>Versioning and HTTP</H1>
<P>This note includes a number of points that reflect a somewhat different
perspective about how and why versioning should be integrated into the WWW.
At a few places, I will argue that less-restrictive assumptions be made to
accommodate variant styles of versioning, and at a few other places, I will
argue that more precise recommendations be promulgated to enhance
meaningful interoperability.
<p>My personal agenda is that I'm interested in version control as a way to
relax concurrency to allow write-anytime collaboration. I'm also
interested in automatic merge tools that will let users manage such
collaboration, and finding fundamental models of versioning that capture
the widest range of possible editing behaviours, as a basis for
implementing generalized systems. I am personally convinced that this is
best done by tracking user operations (typically editing operations) and
constructing versions as sets of non-interfering edits. This makes merging
and distribution easy, at the expense of making the notion of version
trees only one of a variety of styles.
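<p>As a minimal sketch of versions as sets of non-interfering edits, the
following Python treats an edit as a replacement of a span of a shared base
text, and a version as a set of such edits. The names <tt>Edit</tt>,
<tt>interferes</tt>, <tt>merge</tt>, and <tt>render</tt>, and the
overlap-based definition of interference, are assumptions made for
illustration; they are not taken from VTML or any existing system.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Edit:
    """Hypothetical edit: replace base[start:end] with text."""
    start: int  # offset into the shared base text
    end: int    # exclusive end of the replaced span
    text: str   # replacement text


def interferes(a, b):
    """Assumed rule: edits interfere if their base spans overlap,
    or if they start at the same offset (ambiguous ordering)."""
    return (a.start < b.end and b.start < a.end) or a.start == b.start


def merge(v1, v2):
    """Merge two versions (sets of edits) by union, refusing conflicts."""
    merged = frozenset(v1) | frozenset(v2)
    for a, b in combinations(merged, 2):
        if interferes(a, b):
            raise ValueError(f"interfering edits: {a} and {b}")
    return merged


def render(base, version):
    """Apply edits from the end of the text backwards so that
    earlier offsets remain valid while later spans are replaced."""
    out = base
    for e in sorted(version, key=lambda e: e.start, reverse=True):
        out = out[:e.start] + e.text + out[e.end:]
    return out
```

<p>Under these assumptions, merging needs no version tree at all: two
versions derived from the same base combine by set union, and only
overlapping edits require a manual decision.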
<h2>Versions in URLs</h2>
<p>Accordingly, I believe that version identifiers should be opaque to
editing systems, and managed by servers. The paper on "VTML" that Fabio
Vitali and I wrote (referenced in the page for this group) identifies a few
key notions for version management on the web.
<ul>
<li>A server must be able to
serve up a "current version" of a document, as well as to serve up a
particular version on demand. The default version is server-determined and
is supplied when several versions of a document exist, and no version
parameter is specified on the request.
<li>An application should be able to determine what the version parameter
of a URL is, to enable user decisions about whether to follow a link into
the "current version" of a document, or into a particular version specified
in the link, or even a specific version specified by the user. </ul>
<p>These features allow the option of browsers that present a single
up-to-date view of web sites, while still being able to reflect changes in
documents that have been made after a link was created. It could also
enable intelligent bookmark management (which is, in principle, just a
special case of link anchor management). I have no objection to the
<tt>;version=</tt> syntax proposed so far.
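<p>A minimal sketch of how an application might separate the version
parameter from a URL, so that a user can choose between the link's pinned
version and the server's current version. It assumes the parameter appears
as <tt>;version=ID</tt> on the last path segment, and it treats the
identifier as an opaque string. The function names and the
<tt>example.org</tt> URLs are hypothetical.

```python
from urllib.parse import urlparse, urlunparse


def split_version(url):
    """Return (url_without_version, version_or_None).

    Assumes the ';version=ID' parameter is attached to the last
    path segment. The identifier is never interpreted.
    """
    parts = urlparse(url)
    version = None
    kept = []
    for param in parts.params.split(";"):
        name, sep, value = param.partition("=")
        if name == "version" and sep:
            version = value
        elif param:
            kept.append(param)  # preserve unrelated parameters
    base = urlunparse(parts._replace(params=";".join(kept)))
    return base, version


def choose_target(url, follow_current):
    """Pick the URL to request: the unversioned form asks the server
    for its current version, the original form asks for the pinned one."""
    base, _version = split_version(url)
    return base if follow_current else url
```

<p>For example, <tt>split_version("http://example.org/doc.html;version=1.3")</tt>
yields the bare URL and the string <tt>"1.3"</tt>, which a bookmark manager
could store separately.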
<h2>HTTP issues</h2>
<p>The use of HTTP headers to specify version information is acceptable, if
they are not too restricted in their semantics. As I don't have access to
a PostScript printer right now, I must react to the direction of the
discussion without being able to read
the details of Jim's proposal, so pardon any <i>faux pas</i> I may
inadvertently commit.
<p>Use of versioning operations should not depend on operations such as LOCK and
UNLOCK. I, at least, am taking great pains to avoid the logical or
practical necessity for such operations by making the free creation of
variant versions (and their later merging, if desired) as easy as possible.
I'd like us to find a specification for LOCK and UNLOCK such that a
server like the one I am implementing will be able to work with editors
that expect LOCK and UNLOCK.
<p>The semantics should not assume that there is a single predecessor
version, or that if there are multiple predecessors, one of them is the
"main" one. The semantics should not assume that every derived version
even has a meaningful predecessor version. In my model, a user might want
to designate a new "top-level version" for the result of a complicated
merge involving many manual decisions about which changes to keep and how
conflicts are to be resolved.
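<p>A minimal sketch of a version record that makes none of these
assumptions: each version carries an unordered set of zero or more
predecessors, with no "main" one. The class <tt>VersionGraph</tt> and its
method names are hypothetical and are not part of any proposal discussed
here.

```python
class VersionGraph:
    """Hypothetical record of which versions derive from which."""

    def __init__(self):
        # version id -> frozenset of predecessor ids (possibly empty)
        self.predecessors = {}

    def add(self, version_id, predecessors=()):
        """Record a version with any number of predecessors.

        An empty set marks a new top-level version, for example the
        result of a merge involving many manual decisions.
        """
        missing = [p for p in predecessors if p not in self.predecessors]
        if missing:
            raise KeyError(f"unknown predecessors: {missing}")
        self.predecessors[version_id] = frozenset(predecessors)

    def top_level(self):
        """Versions with no recorded predecessor."""
        return {v for v, preds in self.predecessors.items() if not preds}

    def ancestors(self, version_id):
        """All versions reachable through predecessor links."""
        seen = set()
        stack = list(self.predecessors[version_id])
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.add(v)
                stack.extend(self.predecessors[v])
        return seen
```

<p>Because predecessors form an unordered set, a version tree is simply
the special case in which every set has at most one member.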
<p>It should be a server decision as to what version identifier should be
assigned to a document revision when it is submitted. This follows from
the opaqueness of version parameters in URLs. It should be a server
decision (not mandated by the protocol) whether to accept a new revision.
It should also be a server decision whether or not the "current version" is
changed when a version is submitted. Setting the current version should
also be an available operation, subject to server-specific access and
configuration policy. I don't object to servers deciding to enforce a
particular notion of consistency by refusing updates, but I don't want the
protocol to require that from my server.
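<p>The server-side decisions listed above can be sketched as a small
in-memory store. This is an illustration under stated assumptions, not a
proposed protocol: the class <tt>VersionStore</tt>, its <tt>accept</tt> and
<tt>advance_current</tt> policy hooks, and the use of random UUIDs as
opaque identifiers are all hypothetical.

```python
import uuid


class VersionStore:
    """Hypothetical server-side store for one document's versions."""

    def __init__(self, accept=lambda body: True, advance_current=True):
        self.versions = {}                      # opaque id -> body
        self.current = None                     # id of the current version
        self.accept = accept                    # server policy: take this revision?
        self.advance_current = advance_current  # server policy: move "current"?

    def submit(self, body):
        """Store a revision; return its server-assigned id, or None if refused."""
        if not self.accept(body):
            return None
        version_id = uuid.uuid4().hex  # opaque: clients must not parse it
        self.versions[version_id] = body
        if self.advance_current or self.current is None:
            self.current = version_id
        return version_id

    def set_current(self, version_id):
        """Explicitly designate the current version (access checks omitted)."""
        if version_id not in self.versions:
            raise KeyError(version_id)
        self.current = version_id

    def get(self, version_id=None):
        """Serve a specific version, or the current one by default."""
        key = self.current if version_id is None else version_id
        return self.versions[key]
```

<p>A server that wants strict consistency can pass an <tt>accept</tt>
function that refuses updates; one that favors free creation of variants
can accept everything and leave the current version where it is.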
<h2>Edit tracking</h2>
<p>I'd like to see something like VTML in place to allow detailed version
information to be propagated to intelligent clients if they want it. Fabio Vitali
and I are already modifying VTML to remove some pointless
HTML-dependencies from it, and make it look more like a byte-stream
revision system. I think that there is enough practical need to manage
HTML documents, however, that addressing version control of HTML
specifically would still be worthwhile.
<p>I'd like to discuss notions such as VTML as part of the overall
approach to versioning on the web, thus creating a tripartite front for
proper support: URL format, HTTP protocol, and <tt>Content-type:</tt>.
These correspond, respectively, to the fundamental versioning notions of
naming, access control, and differencing.
----------------------------------------------+----------------------------
David Durand dgd@cs.bu.edu | david@dynamicDiagrams.com
Boston University Computer Science | Dynamic Diagrams
http://cs-www.bu.edu:80/students/grads/dgd/ | http://dynamicDiagrams.com/