Description

The rtf2xml script converts Microsoft's proprietary Rich Text Format (RTF) to XML. It preserves as much information from the RTF file as possible, giving an XML author the choice of which elements to use in further transformations.

Why bother with such a conversion?

Microsoft's proprietary format changes constantly at the whims of the MS corporation, and can be transformed only with a herculean effort. An RTF file consists of indecipherable code that can be read only with a Microsoft product. In contrast, XML's open format, together with the number of tools available to transform it, makes it an easy format to work with.

Using the rtf2xml script and other XML tools, you can transform any RTF file into other types of documents, such as DocBook, XSL-FO, LyX, XHTML, or a man page. You could use an RTF editor such as Ted (on Linux) or TextEdit (on Mac OS X) to write a technical document and then, with some help from XSLT and SAX, transform that document to DocBook. In this way, you would have a tagless editor.
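
The SAX step mentioned above might look something like the following Python sketch. It is only an illustration under stated assumptions: the source element names (doc, paragraph, italic) are hypothetical placeholders and not the actual vocabulary that rtf2xml emits, and the helper names (TAG_MAP, RenameHandler, to_docbook) are invented for this example. It uses only the standard xml.sax module to copy the event stream while renaming elements to rough DocBook equivalents.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator

# Hypothetical mapping from source element names to DocBook element names.
# The source names are placeholders, NOT the real rtf2xml vocabulary.
TAG_MAP = {"doc": "article", "paragraph": "para", "italic": "emphasis"}


class RenameHandler(xml.sax.ContentHandler):
    """Copy SAX events to a writer, renaming any element found in TAG_MAP."""

    def __init__(self, out):
        super().__init__()
        self._writer = XMLGenerator(out, encoding="utf-8")

    def startElement(self, name, attrs):
        # Unmapped elements pass through unchanged; attributes are kept.
        self._writer.startElement(TAG_MAP.get(name, name), attrs)

    def endElement(self, name):
        self._writer.endElement(TAG_MAP.get(name, name))

    def characters(self, content):
        # XMLGenerator escapes special characters for us.
        self._writer.characters(content)


def to_docbook(xml_text):
    """Return xml_text with mapped elements renamed (illustrative helper)."""
    out = io.StringIO()
    xml.sax.parseString(xml_text.encode("utf-8"), RenameHandler(out))
    return out.getvalue()


if __name__ == "__main__":
    sample = "<doc><paragraph>Hello <italic>world</italic></paragraph></doc>"
    print(to_docbook(sample))
```

A real conversion would need far more than renaming (sections, lists, footnotes, and so on), which is where XSLT is usually the better fit; a SAX filter like this one is mainly useful for simple, streaming clean-up passes.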

Another use of the script involves transforming old documents. You might have used MS Word to generate many documents before switching over to an open publishing system that doesn't rely on Microsoft and its licensing practices. What to do with the old files? Simply run them through rtf2xml and then transform them to the format you use.

The raw XML file that results from an rtf2xml transformation aims to be identical in content to the original file, the only difference being that the result is structured XML. Ideally, one could use only rtf2xml and XSLT to convert a file to and from RTF with no data loss.