X Marks the Spot: Let’s Take Today’s XML Content-Creation Tools for a Spin

By Bob Doyle - June 2006 Issue, EContent Magazine, Posted Jun 13, 2006 

Tim Berners-Lee's dictum, "If it's not on the Web, it doesn't exist," may now be supplemented with, "And if your business is not creating XML content, it may soon cease to exist."

XML is changing the way content is created - for print publishing, for the Web, and for other new distribution channels like ubiquitous PDAs and WAP-based 3G cell phones - all of which are connected by ever-faster Internet connections. In recent years, XML has also become a leading method for moving content: RSS feeds deliver stories from blog sites and audio and video from new podcasting sites, and Web Services communicate via XML-RPC, SOAP, and other behind-the-scenes protocols that have largely replaced CORBA, DCOM, and even EDI as data-exchange standards.

Now XML is returning to its roots as a markup language for creating and structuring content. From high-end publishing tools like Adobe FrameMaker, InDesign, and QuarkXPress to Web-content tools like Adobe GoLive and Macromedia Dreamweaver, and on to Microsoft Office 12 and OpenOffice (not to mention Google's stunning entry into document creation with its acquisition of Writely), XML export and import and even XML creation and storage are now becoming commonplace. Yet XML content creation and delivery is at least three times more difficult than HTML, which in turn is many times harder than writing a Word document. So the success of the new content tools depends on making structured authoring, or guided writing, appear to be much easier than HTML: "As easy as Word" is a popular marketing slogan.

In the high-end print publishing industry, XML content is advancing on several fronts. Microsoft Windows Vista will offer its XML Paper Specification (XPS) to the print industry as a rival to Adobe's Portable Document Format (PDF). Best-of-class desktop publisher QuarkXPress has lost significant market share to Adobe InDesign and hopes to recover by supporting XML Job Definition Format (JDF). IBM has abandoned decades of DocBook and SGML in favor of DITA (Darwin Information Typing Architecture). IBM donated DITA XML technology to OASIS, which has made it a standard. And Adobe announced that it is moving its technical documentation to DITA, starting with its Creative Suite docs. Finally, as markets become global, companies localizing their content will be doing it with XML standards like Translation Memory eXchange (TMX) and XML Localization Interchange File Format (XLIFF).

To get a handle on the tools that make this all happen, we reviewed 12 tools commonly known as XML Editors. We found some terminology confusion in the market because some of these "Editors" are clearly aimed at the authors, writers, and editors who create the core content, while others are for the application developers who build the structure and styles within which the former can do their guided writing. We will distinguish the two as XML Author Editors and XML Developer Editors (an IDE, or Integrated Development Environment). And as you'll see, we found that one product suite does both.

Our XML Editors were selected from nearly 80 listed on the CMS Review site, and we are confident they are the best XML content-creation tools on the market. We looked far and wide to find American, Austrian, Canadian, Dutch, Irish, Japanese, Romanian, and Russian software development teams. Collectively, these tools are in use by millions of content creators around the world working in XML.

 

The Three Layers of Content Management
Separating presentation (layout, style, format) from content is a CMS cliché. This can be done quickly with cascading style sheets (CSS) and HTML (or better, XHTML). Separating off the presentation into style sheets is a lot harder than straight HTML, but the result is much more manageable and reusable content.

XML further extends the amount of separation so that three basic things result: 1) the XML document containing the pure content itself, 2) a DTD or XML Schema Document (XSD) with the allowable structural elements and their attributes to validate the XML document, and 3) XSL style sheets (possibly XSL-FO and XSLT transformations) to convert the core content document to multiple output channels with different presentations.

The three layers correspond to different types of markup. In the content core, the markup is semantic, with meaningful tags like name and price, the RDF properties that will drive the Semantic Web. In the structural layer, markup is syntactic, with tags like div and span. In the outer layer, markup is stylistic or presentational, with tags like font, b, and i.

The layers also correspond to different professional skill sets—designers on the outside, architects building the middle layer, and content authors in the core. XSD, XML, and XSL are three components like the red, green, and blue signals of component television. Separating CSS from XHTML is, like S-Video, good but not HDTV.

And finally, in an effort to make the objective of XML content-creation tools more readily understandable, here's a railroad metaphor: we can say the Developer Editor tools build the structure rail on one side (the content model) and the style rail on the other (the presentation model) that keep the author/editor on track with guided writing.

Author or Developer? Both
Since almost every major content-creation tool (e.g., Word) can now export documents in XML format, why do you need a dedicated XML Author Editor and XML Developer Editor? Because creating well-formed XML documents is not enough. Your organization needs XML documents that are structured properly. That means documents validated against your content model—the schemas and DTDs that describe allowable content elements and attributes. An XML Author tool will create valid content immediately, instead of starting from Word files, whose XML output must be painfully converted to valid XML. Many organizations create Word templates for authors, then parse the Word XML to match their schemas. XML Author Editors eliminate this time-wasting step.
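The gap between well-formed and valid can be sketched in a few lines of Python. This is only an illustration: the product, name, and price elements and the tiny REQUIRED_CHILDREN table are hypothetical stand-ins for a real schema or DTD, and a production check would rely on a validating parser rather than this hand-written rule.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model: every <product> must contain <name> and <price>.
REQUIRED_CHILDREN = {"product": ["name", "price"]}

def is_well_formed(xml_text):
    """True if the text parses as XML at all (tags balanced and nested)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

def is_valid(xml_text):
    """True if the text is well-formed AND follows the toy content model."""
    if not is_well_formed(xml_text):
        return False
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        child_tags = [child.tag for child in elem]
        for required in REQUIRED_CHILDREN.get(elem.tag, []):
            if required not in child_tags:
                return False
    return True

broken = "<product><name>Widget</product>"                    # not even well-formed
well_formed_only = "<product><name>Widget</name></product>"   # parses, but no <price>
valid = "<product><name>Widget</name><price>9.95</price></product>"
```

The middle case is the one that matters here: it would pass any XML parser, yet it still breaks the content model.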

Second, your XML will need to be re-purposed to feed ever more types of output channels as you exploit multiple opportunities to publish your content. The XML Developer Editor will let your application developers design and build the several XSLT transformations that will be needed behind the scenes to publish your content. So now that we've framed the whys and wherefores of XML tools, it's time to take some out for a spin.

Test Method
We downloaded trial versions of all 12 tools and installed them on a new dedicated test platform at CMS Labs. We read the online documentation (many PDFs we downloaded and printed out). Overall, existing online training was good, though video training from Altova, Stylus Studio, and SyncRO Soft really stood out. When companies offered us personal online training (with screen-sharing sessions), we accepted.

We joined online company forums where they were offered and read some posts to get a sense of the user communities. We also joined independent mailing lists of user groups. When we encountered problems, we first Googled the error message or situation, then sent questions to vendor support. We found the vendor is usually a user's last resort for help.

Once all the tools were installed, we created a test set of XML documents, XML schemas, and XSLT style sheets. We took them from CM Pros' Design Patterns for reporting Best Practices initiative and we simplified one of these to make a DITA XML test document.

We recommend you follow a similar methodology, working with test documents drawn from your own content. If you have no structured content, you have three options:

  1. You can get a consultant to structure your content (or learn to do it yourself). This may be a long and difficult process, but in the end a rewarding one.
  2. All the XML Developer Editor tools can extract structure from a collection of similar-instance documents. They infer a content model (allowable elements—you may need to fine-tune the order and frequency). They then give you a schema document you can use to create more documents structured to be consistent with your existing content.
  3. You can work within proposed industry-standard content models like the new DITA structures for topics, concepts, tasks, and references.
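The second option, inferring a content model from existing documents, can be sketched roughly as follows. This is a minimal illustration and not how any of the reviewed tools work internally: the function name infer_content_model and the sample task documents are our own assumptions, and the sketch records only which child elements were seen, leaving order and frequency for hand-tuning.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def infer_content_model(documents):
    """Map each element name to the sorted child element names seen under it."""
    model = defaultdict(set)
    for xml_text in documents:
        root = ET.fromstring(xml_text)
        for elem in root.iter():
            # Elements with no children still get an (empty) entry.
            model[elem.tag].update(child.tag for child in elem)
    return {tag: sorted(children) for tag, children in model.items()}

# Two hypothetical instance documents of the same kind.
samples = [
    "<task><title>Install</title><steps><step>Unpack</step></steps></task>",
    "<task><title>Remove</title><prereq>Backup</prereq>"
    "<steps><step>Delete</step></steps></task>",
]
model = infer_content_model(samples)
for tag, children in model.items():
    print(tag, "->", children)
```

Note that prereq appears in only one sample, so a real tool would have to guess whether it is optional; that is the fine-tuning mentioned above.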

 

Feature Evaluation
We would like to make a point that, given the space limitations of this print article, we can cover only a small number of the key features that distinguish these tools. We plan to create an online report that will provide more details, including many screenshots of key features, a glossary of terms, links to documentation on all these XML Editors, and mention of dozens of other XML Editors that did not make the cut for EContent but might fit your niche.

For a sophisticated developer, any good text editor can be used to create XML documents, XML schemas, and XSLT style sheets. Among the most popular are HomeSite and UltraEdit (Windows), BBEdit (Mac), Emacs (Unix), and jEdit (cross-platform Java), our favorite tool. These typically have validation, tag-completion, elements in context, etc. And jEdit recognized the ditabase.dtd properly from its DOCTYPE declaration, where our reviewed XML Developer Editors surprisingly could not.

Here are the key features we focused on in our examination of these tools:

IDE. An XML Developer Editor that is the central part of an Integrated Development Environment is a must-have tool for any organization moving its content to XML. Since that now includes all content, at least one person in your organization needs an IDE. We judge the IDE tools by how many different aspects of XML they support. They should not only work with XML schemas and DTDs, they should provide full creation and debugging tools for these schemas. Perhaps even more important, they should let you design, debug, and deploy the many different style sheet transformations (XSLT) to repurpose your content for multichannel publishing. The best of them are part of a coordinated suite of tools that implement the many other options for XML development, which includes an alphabet soup of acronyms.

WYSIWYG. A simple XML Author tool can enforce the structural and styling constraints of XML schemas and XSL transformations without revealing the complex behind-the-scenes machinery. It should provide a comfortable and familiar interface for content creators used to working in standard what-you-see-is-what-you-get word processors. If the ratio of content creators to content editors and designers is high, you will need many more XML Author tools than XML Developer Editor tools.

Validation. Guided writing, or structured authoring, works best when the writer is constantly assisted in doing the right thing. The ideal is continuous real-time validation against the content model rules in the schema. Even more rigid is to not allow the possibility of error. So validation comes in a range of settings. It can be turned off for experts. It may be done on demand, by clicking a request for validation. It may provide only warnings. It may actually correct or prevent errors. Some tools allow tags to be inserted only as matched start-and-end pairs.
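The range of validation settings can be pictured with a small Python sketch. Everything here is hypothetical: the ALLOWED_CHILDREN table stands in for a schema, and the mode names 'off', 'warn', and 'strict' are our own labels, not settings from any reviewed product.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model: which child elements each element may contain.
ALLOWED_CHILDREN = {
    "article": {"title", "para"},
    "title": set(),
    "para": {"emphasis"},
    "emphasis": set(),
}

def find_problems(root):
    """List every parent/child combination the content model does not allow."""
    problems = []
    for parent in root.iter():
        allowed = ALLOWED_CHILDREN.get(parent.tag, set())
        for child in parent:
            if child.tag not in allowed:
                problems.append(f"<{child.tag}> is not allowed inside <{parent.tag}>")
    return problems

def validate(xml_text, mode="warn"):
    """'off' skips checking, 'warn' returns messages, 'strict' raises an error."""
    if mode == "off":
        return []
    problems = find_problems(ET.fromstring(xml_text))
    if problems and mode == "strict":
        raise ValueError("; ".join(problems))
    return problems

draft = "<article><title>XML</title><para>Hi <price>5</price></para></article>"
print(validate(draft, mode="warn"))
```

Calling validate on demand corresponds to the click-to-validate setting; a real-time tool would simply run the same check after every edit.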

Elements in context. When adding structural elements, the best tools display a context-dependent list of the "allowed" or "available" elements that can be added at the current insertion point in the document. This can be in a separate window pane or a floating palette, a drop-down menu when you type an open tag, or revealed by right-clicking at the insertion point. Some Editors allow you to turn this down, or completely off for power users.
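A context-dependent element list boils down to a lookup against the content model at the insertion point. The sketch below assumes a hypothetical CONTENT_MODEL table in which each allowed child has a maximum count (None meaning unlimited); real tools derive this from the DTD or schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model: allowed children and their maximum occurrences.
CONTENT_MODEL = {
    "task": {"title": 1, "prereq": 1, "steps": 1},
    "steps": {"step": None},  # None = any number of <step> elements
}

def allowed_here(element):
    """Return the element names that may still be inserted inside `element`."""
    rules = CONTENT_MODEL.get(element.tag, {})
    present = [child.tag for child in element]
    return [name for name, limit in rules.items()
            if limit is None or present.count(name) < limit]

doc = ET.fromstring(
    "<task><title>Install</title><steps><step>Unpack</step></steps></task>"
)
print(allowed_here(doc))                 # what the palette would offer inside <task>
print(allowed_here(doc.find("steps")))   # what it would offer inside <steps>
```

Because title and steps are already present, only prereq remains on offer inside task, which is the behavior an author sees in the better palettes.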

Tags-on view. Though they disrupt the pure WYSIWYG look, optional visual representations of the start and end tags for structural elements are very helpful.

Structure, or Tree, view. A hierarchical view of the document, which expands and contracts elements like an outline tool, letting you move around quickly in large documents. More powerful Editors let you move structural elements in this view and synchronize changes with other views.
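For readers who have not seen a Tree view, the idea is easy to reproduce: walk the document and indent each element by its depth. The sample topic document and the function name outline are our own illustration.

```python
import xml.etree.ElementTree as ET

def outline(element, depth=0):
    """Return one indented line per element, like a collapsed outline."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

doc = ET.fromstring(
    "<topic><title>T</title><body><p>One</p><p>Two</p></body></topic>"
)
print("\n".join(outline(doc)))
```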

Grid view. An arrangement of your content in something like spreadsheet cells. Cells can be moved, preserving their internal structure.
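A Grid view is essentially a flattening of repeating elements into rows and columns. A rough sketch, assuming a hypothetical catalog of item records (the names to_grid, catalog, and item are ours):

```python
import xml.etree.ElementTree as ET

def to_grid(parent):
    """Treat each child of `parent` as a row and its child elements as cells."""
    columns = []
    for record in parent:
        for field in record:
            if field.tag not in columns:
                columns.append(field.tag)  # keep first-seen column order
    rows = [[record.findtext(col) or "" for col in columns] for record in parent]
    return columns, rows

catalog = ET.fromstring(
    "<catalog>"
    "<item><name>Pen</name><price>1.20</price></item>"
    "<item><name>Ink</name><price>4.50</price><note>Black</note></item>"
    "</catalog>"
)
columns, rows = to_grid(catalog)
print(columns)
for row in rows:
    print(row)
```

Missing fields simply become empty cells, which is how the spreadsheet-style editors present optional elements.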

Drag/drop structure. The best Editors allow selection of the whole structural element, then drag-and-drop of the element—only to locations that are valid for the specific element, of course.

Source-code view. Some pure WYSIWYG Editors can show the source code for changes best made while looking at the XML source. Top tools highlight the syntax with your choice of colors so authors can easily distinguish the code from the content.

Tag auto-completion. When typing tags yourself, auto-completion is one way of enforcing correct structure.

Line indenting. Source code that is hard to scan quickly is useless. The best tools indent child elements to reveal the document structure. They may do this as you type or on-demand. But watch out for tools that add white space where you don't want it in mixed XML content. They should "roundtrip" your code cleanly.
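The white-space hazard is easy to demonstrate. The sketch below uses the indent function from Python's standard ElementTree module (available in Python 3.9 and later) purely as a stand-in for an editor's pretty-printer; the sample paragraph is hypothetical.

```python
import xml.etree.ElementTree as ET

def visible_text(element):
    """Concatenate all text a reader would see inside the element."""
    return "".join(element.itertext())

source = "<p><b>bold</b><i>italic</i></p>"

# Parsing and re-serializing without indenting round-trips cleanly.
roundtrip_clean = ET.tostring(ET.fromstring(source), encoding="unicode") == source

para = ET.fromstring(source)
before = visible_text(para)   # the two words run together, as the author wrote them
ET.indent(para)               # pretty-print: inserts newlines and spaces between tags
after = visible_text(para)    # the same words, now separated by inserted white space

print(repr(before))
print(repr(after))
```

The indented version is easier to scan, but the text content has changed, which is exactly what a tool should avoid doing silently inside inline markup.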

XSLT processor. Built-in XSLT processing lets you view the multiple transforms to publishing output channels.
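To make the idea of several transforms from one source concrete, here is a sketch in plain Python. It is deliberately not XSLT: Python's standard library has no XSLT processor, so two hand-written functions stand in for two style sheets. The story, headline, and para element names are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# One source document, written once.
SOURCE = "<story><headline>XML &amp; You</headline><para>Structure pays off.</para></story>"

def to_html(story):
    """Web channel: headline becomes <h1>, each para becomes <p>."""
    parts = [f"<h1>{escape(story.findtext('headline'))}</h1>"]
    parts += [f"<p>{escape(p.text or '')}</p>" for p in story.findall("para")]
    return "\n".join(parts)

def to_plain_text(story):
    """Text channel (e.g. e-mail): upper-case headline, blank line, paragraphs."""
    lines = [story.findtext("headline").upper(), ""]
    lines += [p.text or "" for p in story.findall("para")]
    return "\n".join(lines)

story = ET.fromstring(SOURCE)
print(to_html(story))
print(to_plain_text(story))
```

An XSLT processor does the same job declaratively, with one style sheet per output channel and no change to the source content.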

XPath/XQuery. Quickly find elements anywhere in a document that share a property and may be targets for special repurposing in your XSLT outputs. Some tools populate the XPath with your contextual position. As you click in the document your XPath appears automatically in the search box.
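Both ideas, finding elements that share a property and showing the path to the current position, can be tried with the limited XPath subset in Python's standard ElementTree module. The catalog document, the status attribute, and the helper path_to are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<catalog>"
    "<item status='new'><name>Pen</name></item>"
    "<item><name>Ink</name></item>"
    "<bundle><item status='new'><name>Set</name></item></bundle>"
    "</catalog>"
)

# Find every <item>, at any depth, whose status attribute is "new".
new_items = doc.findall(".//item[@status='new']")

def path_to(root, target):
    """Build a simple slash-separated path from the root to `target`."""
    parents = {child: parent for parent in root.iter() for child in parent}
    steps = [target.tag]
    while target in parents:
        target = parents[target]
        steps.append(target.tag)
    return "/" + "/".join(reversed(steps))

for item in new_items:
    print(path_to(doc, item), item.findtext("name"))
```

The path printed for each hit is the kind of string an IDE drops into its XPath box when you click in the document.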

Native XML databases. Databases such as Berkeley DB XML and eXist store your XML as is. Many content management systems do as well.

Specific schemas. Support for well-known XML "vocabularies" like DocBook, DITA, etc. Support also for different schema standards, like DTD, XSD (XML Schema Document), Relax NG, NRL, etc.

DITA. The Darwin Information Typing Architecture standardizes schema components that can be specialized to a variety of technical documentation needs.

DocBook. While DITA accomplishes most of what the SGML DocBook standard provides, some tools continue support for DocBook.

Package size. The amount of code downloaded gives you some idea of the amount of work put into these great tools.

Communities online. Judging a book by its cover is a bad idea, but listening to what users are saying about these tools is a solid criterion for judgement.

Google PageRank. Another objective measure of a company is its Google ranking on a logarithmic scale from one to ten. Note that every additional step up in rank means roughly ten times the importance of the site.
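The arithmetic behind a logarithmic scale is worth a quick check. The sketch assumes each step is exactly a factor of ten, following the rule of thumb above; the true base is not something Google publishes, so treat the numbers as rough.

```python
def relative_importance(rank_a, rank_b, base=10):
    """How many times more 'important' rank_a is than rank_b,
    assuming every step on the scale multiplies importance by `base`."""
    return base ** (rank_a - rank_b)

print(relative_importance(9, 6))  # three steps apart
print(relative_importance(8, 7))  # one step apart
```

Under that assumption, a site ranked 9 carries about a thousand times the weight of a site ranked 6, so small differences in the rank are less small than they look.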

Templates. A library of "stationery" resources, XML documents that authors can start with, knowing they are valid for certain schemas and support specific output style sheets.

Developer tools. All our Developer tools can extract schemas from well-formed XML documents. Some offer visual schema designers. Some provide schema management to assist in the creation of compound XML documents that contain elements from diverse schemas and namespaces, e.g., Dublin Core, SVG (scalable vector graphics), MathML, etc. Some visual XSLT style sheet designers are synchronized with their code view. Look for new XSLT 2.0 support.

Spell check. This can range from simple checking with a customizable word list to dynamic word and phrase completion to ensure that writers use terminology consistently throughout the organization.

Multilingual. Unicode support for integration with translation tools.

There are other important features you should evaluate when considering the selection of an XML tool; however, we didn't include these in our matrix because all 12 tools we evaluated have them.

INTRODUCING THE XML AUTHOR EDITORS

Adobe FrameMaker 7.2
Perhaps a million technical writers work with FrameMaker (FM). The latest versions allow structured writing, but only a small percentage of FM documents are structured. Unfortunately, FM stores the structured content model (allowed elements, their frequency, and order) in a proprietary EDD document combining structural (schema) information and style information, which are best kept separate. Export to an XML document presently loses any format information. FrameMaker has the best PDF rendering, based on the page description language, called PostScript. Adobe will be a leader in making future pure XML page description languages, using technology like SVG. A vast amount of unstructured content is being converted to FrameMaker with special conversion tools, which impose structure. FrameMaker can then export to XML, especially DITA XML content, which can be styled and repurposed through FrameMaker. InTech and Mekon have released DITA publishing adapters for FrameMaker.

PTC Arbortext Editor 5.2 with Styler
Arbortext has been supporting many formats, from SGML in the 1980s to XML in the '90s. It has supported DITA for years, and we opened our test DITA documents easily with the shipping version. All other XML Authors required the latest DITA OT files to operate. Arbortext is integrated into top content management systems like Documentum and Oracle. Major clients depend on its expensive backend Arbortext Publishing Engine as the content-delivery environment.

XMetaL Author 4.6 DITA Edition
XMetaL descends from SoftQuad HoTMetaL and the team that wrote the very first SGML editor. It integrates the DITA Open Toolkit to provide a modest publishing solution with its FOP processor. Many XMetaL clients now deliver structured content through the popular Adobe FrameMaker. XMetaL Developer is a limited-feature plug-in to Microsoft Visual Studio.NET. XMetaL can't customize content models (schemas) and design style sheets to publish that content, so it seems limited to the technical-documentation market without a partner IDE tool. XMetaL recommends Stylus Studio as its IDE. That may change. XMetaL was recently acquired by Japanese XML powerhouse Justsystems.

Syntext Serna 2.5
Serna has one of the best WYSIWYG tools because of its pure XSL-driven interface with on-the-fly XSLT processing. Native DITA support, with direct editing of DITA maps and specializations. Multiplatform, written in C++. PDF rendering with FOP or optional Antenna House tools.

INTRODUCING THE XML DEVELOPER EDITORS

Cladonia Exchanger XML 3.2
We found Exchanger a robust and powerful IDE. The XPath/XQuery was quite intuitive; anywhere you click in a document, the XPath window shows the path to the element in context. You can then click to find all elements with similar path structures, for use in reports or new XSLT transforms. User interface look and feel is the plain gray of standard tools like jEdit. It can split window views many times to allow simultaneous work on schemas, XSLT, XQuery results, etc. Excellent online help with lots of sample projects.

Stylus Studio 2006
Combines an IDE Editor, style sheet and schema designers, XQuery development, and XSLT mapping tools into a single package. A home version costs $119. Elegant visual design tools are synchronized with code views in all tools. Suggests XMetaL as its XML Author tool. Great training videos online. Most active developer forum.

SyncRO Soft 7.1
This XML Editor (IDE) has most of the functionality of the better-known tools, without their elegant visual designers, but with a remarkably low cost. It validates against a wide range of schemas, such as Relax NG, NRL, and Schematron. It supports XSLT 1.0 and 2.0 with debuggers and profiling. Very good CSS integration. The UI is handsome and very configurable to your work style. It has many dockable and resizeable panels to perform many tasks that can be hidden behind convenient side tabs. Many training videos. Its forum is third-most active.

INTRODUCING THE XML AUTHOR AND DEVELOPER EDITOR

Altova XMLSpy/Suite 2006
Altova makes the world's most popular XML Editor (IDE), XMLSpy Professional. The company boasts more than 2.3 million registered users. XMLSpy is part of a suite of tools including Stylevision, SchemaAgent, MapForce, and SemanticWorks, a visual RDF ontology editor. The free XMLSpy home version is in wide use. Altova offers a free XML Author tool (Authentic) that works with Stylevision style sheets (several are included), an excellent choice for an organization with hundreds of content contributors and just a few application developers. Altova offers free instructor-led online training sessions and great self-paced online video training.

BROWSER-BASED XML AUTHOR EDITORS

We briefly looked at four browser-based XML Author tools, but difficulties integrating them into the necessary Web server and CMS prevented us from studying them properly at this time. Worth a look are Ektron eWebEditPro + XML, integrated with Ektron's CMS400.NET, and top CMSs like Documentum, FatWire, Interwoven, Microsoft, Percussion, SilkRoad, and Vignette; Ephox EditLive! for XML, a Java applet that specializes in XML output from form-style Web pages to feed data into your XML publishing stream; XMAX (XMetaL for ActiveX), an IE plug-in; and Xopus, written entirely in JavaScript and adaptable to most any CMS on the backend. Xopus is integrated into Arbortext Publishing Engine as its Contributor product. All these are WYSIWYG-style interfaces that your content contributors can work with. But remember that none of them will work without skilled application developers integrating them into your publishing system with XML Schema Documents and XSLT style sheets. For this they will need an XML Developer Editor (IDE) tool.

OASIS/IBM DITA Open Toolkit
We successfully installed the DITA Open Toolkit (OT) and used it to test our DITA documents, both by using the toolkit's Java-based Ant build tools and by pointing to its ditabase.dtd from inside the reviewed tools. Opening a known good DITA XML document fails in most of these Editors, even if it refers to ditabase.dtd in the DOCTYPE declaration statement. (Note that jEdit handles this perfectly.) But when we selected a schema from inside an Editor and pointed to the toolkit's ditabase.dtd, all the error messages disappeared. Even where XMetaL integrates the OT, we found that pointing to the latest OT release worked better because new dependencies had developed. As more tools integrate the DITA OT (and they all are considering it), you may still find yourself better off using the latest schemas and supporting files of the Open Toolkit directly.
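If you want to see what an Editor has to work with, you can read the DOCTYPE declaration yourself. The sketch below uses the expat parser in Python's standard library; the sample document is hypothetical, and the public identifier shown is our assumption of what a ditabase document typically declares, so check it against your own files.

```python
import xml.parsers.expat

def read_doctype(xml_text):
    """Return (root name, public id, system id) from the DOCTYPE, or None."""
    found = []

    def on_doctype(name, system_id, public_id, has_internal_subset):
        found.append((name, public_id, system_id))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)  # True = this is the final chunk of input
    return found[0] if found else None

# Hypothetical DITA document; the identifiers are assumed for illustration.
sample = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE dita PUBLIC "-//OASIS//DTD DITA Composite//EN" "ditabase.dtd">\n'
    '<dita><topic id="t1"><title>Test</title></topic></dita>'
)
print(read_doctype(sample))
```

The system identifier here is a bare relative file name, so a tool can only find the DTD if it knows where the toolkit's copy lives, which fits with the errors disappearing once we pointed the Editors at the file by hand.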




XML AUTHOR/EDITOR FEATURE MATRIX

Tool | Altova XMLSpy Suite | Cladonia Exchanger | Stylus Studio | SyncRO Soft | Adobe FrameMaker | Arbortext Editor | XMetaL Author DITA Edition | Syntext Serna
Web site | www.altova.com | www.exchangerxml.com | http://stylusstudio.com | www.oxygenxml | www.adobe.com/framemaker | www.ptc.com | www.xmetal.com | www.syntext.com
IDE | Excellent | Good | Very good | Very good | No | No | No | No
WYSIWYG | Excellent | No | No | No | Very good | Excellent | Excellent | Very good
Type of validation (1) | RT, W, OD | RT, OD | RT, OD | RT, OD | RT, W, OD | RT, OD | RT, OD | RT, OD
Elements in context | Excellent | Very good | Fair* | Very good | Excellent | Very good | Very good | Very good
Tags-on view | Very good | No | No | No | Very good | Excellent | Excellent | Yes
Drag/drop structure | Very good | Good | Good* | Very good | Very good | Excellent | Very good | No
Structure/Tree view | Separate | Separate | Synchronized | Synchronized | Synchronized | Synchronized | Synchronized | Synchronized
Grid view | Yes | Yes | Yes | No | No | Coming | No | Mp
Source-code view | Excellent | Good | Good | Excellent | No | Fair | Very good | Fair
Tag auto-completion | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes
Line indenting | Very good | Good | Good | Very good | No | No | No | Yes
XSLT processor | Yes | Xalan, Saxon | Yes | Yes | No | API option | DITA OT | Yes
Native XML DB | No | No | Yes | Yes | No | No | No | Yes
XPath/XQuery | Very good | Excellent | Excellent | Excellent | No | API option | No | Fair
DITA support | Good | No | Good | Coming | Fair | Very good | Excellent | Very good
DocBook | No | Yes | No | Yes | Yes | Yes | Yes | Yes
Install base (est.) | 2,300,000 | 40,000 | 1,000,000- | 200,000 | 800,000 | 50,000 | 40,000 | Undisclosed
Platform | Windows | All (Java) | Windows | All (Java) | Windows | Windows, Solaris | Windows | All (C++)
Price | $745 | $130 | $450 | $180 | $799 | $695 | $895 | $199
Package size | 61MB | 32MB | 78MB | 33MB | 340MB | 206MB | 101MB | 20MB
Communities online | Forum | Forum | Forum | Forum, mailing list | Multiple forums, lists | Mailing list | Dita-users list | Blog
Google PageRank | 8 | 6 | 8 | 8 | 9 | 7 | 7 | 7

Notes: (1) RT=real-time, W=warnings, OD=on-demand. *In Tree view only.

BROWSER-BASED XML AUTHOR/EDITOR FEATURE MATRIX

Tool | Ektron eWebEditPro + XML | Ephox EditLive! for XML | Xopus | XMAX
Web site | www.ektron.com | www.ephox.com | www.xopus.com | www.xmetal.com
Type of validation (1) | RT | RT, W | RT | RT, W, OD
Elements in context | Yes | No | Yes | Yes
DITA support | No | Next version | Next version | Yes
Spell check (1) | Yes | Yes | No | Yes
Platform | IE, Netscape, Mozilla | All (Java) | IE (JavaScript) | IE
Install base (est.) | 16,000 | 5,000 | 1,000s | 60
Price | $599/ten users | $7,950/CPU | $100/user | $495/named user
Package size | 25MB | 33MB | 3MB | 60MB
Google PageRank | 7 | 6 | 6 | 7

Notes: RT=real-time, W=warnings, OD=on-demand

CONCLUSION
And the winners are . . . all of you who are thinking of moving your content to XML!

Every one of our XML Author Editors and Developer Editors can be downloaded for 30-day trials so you can see for yourself our reasons for selecting them. It's an amazing no-risk deal. The only cost is your time; hands-on experience will be much more informative than this brief article. Start with the one that appeals most, but please do work with the others. Many vendors provide excellent sample files and tutorials that will help you to understand the complete XML content-publishing process. Only after I had studied a few of them did I scratch the surface on how powerful they all are. I don't think you can make your wisest selection without going from XML document to PDF or Web page in most of these eight tools.

Yes, content remains king. But without the structure, format, layout, and presentation made possible by XML schemas and style sheets, it's more like an emperor with no clothes.


Other Companies Featured in this Article

EMC Documentum www.documentum.com
FatWire Corporation www.fatwire.com
IBM Corporation www.ibm.com
Integrated Technologies, Inc. www.intech.com
Interwoven, Inc. www.interwoven.com
Justsystems www.justsystems.com
Mekon, Ltd. www.mekon.com
Microsoft Corporation www.microsoft.com
OASIS www.oasis.org
SilkRoad Technology, Inc. www.silkroadtech.com
Vignette Corporation www.vignette.com

This Version:
This version is archived at: https://www.cmsreview.com/XML/Editors/index.1.en.html