Washington Technologies White Papers
Norma Haakonstadt, ArborText Inc.
Mark-up for HTML (or paper) is presentation- or
format-focused. Content mark-up (for example, SGML) is
our preferred choice for supporting multiple delivery (and presentation)
requirements.
Markup for Electronic Presentation
In markup for electronic presentation, the goal of creating
data in HTML is to support a single form of electronic
delivery. The start-up costs are low because free or low-cost
tools are available, training is minimal because the tag set is small, and
there is minimal change to author productivity.
The return on investment is quite low for three main reasons. First,
because HTML itself continues to change, your mark-up
(and almost by default, the data) becomes obsolete very quickly:
HTML 1 is different from HTML 2, and so on.
Second, because the mark-up becomes obsolete, you are actually creating
legacy data. Third, and most important, is the inability to support reuse
or recycling of information.
Although start-up costs may be low, the cost to produce an alternative
output type is quite high because it requires conversion (which can cost
anywhere from $1 to $50 a page) or reauthoring. Both are high in cost, and
from a quality, accuracy, and efficiency standpoint should be avoided.
Because there is usually no easy way to get from HTML
to a two-column fully formatted paper delivery, for example, what
generally results is having to support multiple sources of the same
information -- one for each of the output types. Keeping these information
sources in sync (because of last-minute tweaking or the time it takes to
get updates made to each source) is an expensive document maintenance
issue.
The value to the customer is moderate. HTML provides
only limited ways to traverse the data, no support for unique data
presentations (for example, based on the reader's skill or security level),
and limited control, if any, over how much data or
what combination of data appears on the screen at one time. If you have
liability issues and concerns (for example, the warning notice must appear
on the screen at the same time as the step to which it applies), this can
be extremely problematic and could make HTML useless
as a delivery mechanism.
Markup for Content
In contrast to HTML, our goal for SGML
is to create a single source of
reusable information objects that can be combined to create a variety of
publications and delivered in a variety of formats.
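
As a rough illustration of the single-source idea, the following Python sketch stores each information object once and describes each publication only as a list of object identifiers. The object names, publication names, and the helper functions assemble and shared_fraction are all hypothetical and chosen for this example; a real SGML system would manage marked-up objects, not plain strings.

```python
# Hypothetical library of information objects, each stored exactly once.
OBJECTS = {
    "safety": "Disconnect power before servicing.",
    "install": "Mount the unit on a level surface.",
    "operate": "Press the green button to start.",
    "service": "Replace the filter every six months.",
}

# Each publication is only a list of object identifiers (assumed names).
PUBLICATIONS = {
    "user_guide": ["safety", "install", "operate"],
    "service_manual": ["safety", "operate", "service"],
}


def assemble(name):
    """Build a publication by pulling its objects from the single source."""
    return "\n".join(OBJECTS[key] for key in PUBLICATIONS[name])


def shared_fraction(name):
    """Fraction of a publication's objects that also appear in another one."""
    own = PUBLICATIONS[name]
    others = {
        key
        for pub, keys in PUBLICATIONS.items()
        if pub != name
        for key in keys
    }
    return sum(key in others for key in own) / len(own)


print(assemble("user_guide"))
print(shared_fraction("user_guide"))
```

Because the text of each object lives in one place, a correction to the safety notice reaches every publication that lists it.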
Compared to an HTML start-up, the cost to start an
SGML-based system can be high. The costs are
associated with the need for an in-depth up-front document analysis, the
new tools that need to be purchased, retraining for new processes and new
tools, and the cost of converting into SGML any
existing data you wish to use.
While the start-up costs are higher when we compare SGML
to HTML, the return on investment is very high with
SGML. Improvements in author productivity have been
reported between 30% and 50% because of the shift from focusing on format
to focusing on content. Redundant authoring is also eliminated, which
reduces unnecessary time and costs. One company reported that its average
page took 8 hours to author, but only 5 minutes to search for and retrieve
for reuse in its new system. Reuse or recycling of information can be
quite significant: another company reported that, on average, 80% of any
given publication was information that appeared in other publications.
Because redundant authoring is minimized, author productivity is
improved, and concurrent processing is supported, reductions in total
production times of 20% to 75% have been reported. This can be quite
significant if time to market is critical for your company or readiness is
important to your customer. If the lifespan of your data is moderate to
long, then the elimination of conversion costs as you change/upgrade
hardware or software becomes a significant contributor to a high rate of
return on your initial investment.
With SGML data, the cost to produce alternative
output types is low because SGML is supported by a large number of delivery
tool providers and because production is easy to automate. For example,
a large European company reported that its CD-ROM
production time dropped from 2 to 4 weeks to 1 to 2 days.
Producing HTML is an on-the-fly transformation
process (HTML is a very small, flat DTD),
and you can automatically apply formatting and keep-together requirements
for lights-out printing. For example, one site was able to eliminate its
need to review printed output page by page when it implemented an
SGML-based publishing system.
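
The on-the-fly transformation to HTML can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: XML stands in for SGML so that the standard library parser can read it, and the element names (procedure, title, warning, step) and the mapping to HTML tags are hypothetical, not taken from any particular DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical content markup; XML stands in for SGML here.
SOURCE = """
<procedure>
  <title>Replace the filter</title>
  <warning>Disconnect power before opening the housing.</warning>
  <step>Remove the cover.</step>
  <step>Swap the filter cartridge.</step>
</procedure>
""".strip()

# Assumed mapping from content elements to flat HTML tags.
HTML_TAG = {"title": "h1", "warning": "strong", "step": "p"}


def to_html(source):
    """Render each content element as a flat HTML element, in document order."""
    root = ET.fromstring(source)
    lines = []
    for child in root:
        out = ET.Element(HTML_TAG.get(child.tag, "p"))
        out.text = child.text
        lines.append(ET.tostring(out, encoding="unicode"))
    return "\n".join(lines)


print(to_html(SOURCE))
```

The source keeps the richer content tags; only the generated output is flattened, so the same source remains available for other delivery formats.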
Because SGML mark-up is meta data (information about
your information) in a neutral format, your data is optimized for future
output needs.
The value to your customers is extremely high. Because you can attach an
unlimited variety of information about your information (meta data) to
the data itself, SGML data can support knowledge-based
systems, such as those that deliver information based on skill level,
security level, and past searches. In addition, this meta data is what is
used to keep information together under specified conditions, which
minimizes liability.
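
Delivery based on skill level and security level can be illustrated with a small Python sketch. It assumes the meta data is carried as attributes on each element; the attribute names skill and security, their values, and the function select_topics are hypothetical, and XML again stands in for SGML so the standard library can parse the sample.

```python
import xml.etree.ElementTree as ET

# Hypothetical source in which each topic carries meta data as attributes.
SOURCE = """
<manual>
  <topic skill="novice" security="public">Press the green button to start.</topic>
  <topic skill="expert" security="public">Adjust the idle timing with the service tool.</topic>
  <topic skill="expert" security="restricted">Override the safety interlock for bench tests.</topic>
</manual>
""".strip()


def select_topics(source, skill, clearance):
    """Return the text of topics matching the reader's skill and clearance."""
    root = ET.fromstring(source)
    # Assumption: a "restricted" clearance may also read "public" topics.
    allowed = {"public"} if clearance == "public" else {"public", "restricted"}
    return [
        topic.text
        for topic in root.iter("topic")
        if topic.get("skill") == skill and topic.get("security") in allowed
    ]


print(select_topics(SOURCE, "expert", "public"))
```

The same source serves every reader; only the selection rule changes from one delivery to the next.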
A single source of reusable information objects results in more
consistent, accurate data being available to your customers. It also makes
it easier for you to configure publications to meet your customers' unique
requirements.
Comparing the Two Approaches
The following table summarizes the differences between the HTML
and SGML approaches:
|                                    | HTML     | SGML |
|------------------------------------|----------|------|
| Start-up Costs                     | Low      | High |
| Return on Investment               | Low      | High |
| Cost to Produce Alternative Output | High     | Low  |
| Value to Customer                  | Moderate | High |