From HTML to XML

Wendell Piez

Organization [slide 1]
What is HTML? [slide 2]
What do HTML Tags Say They Do? [slide 3]
HTML Tags Really Do Nothing [slide 4]
What is XML? [slide 5]
XML Documents [slide 6]
XML Tags Really Do Nothing [slide 7]
Primary Difference Between XML and HTML [slide 8]
XML Has No Pre-defined Tags [slide 9]
Converting Documents from HTML to XML [slide 10]
Objectives of Conversion to XML [slide 11]
Relation between XML and HTML [slide 12]
Map of Exhibits [slide 13]
Many Levels of Conversion [slide 14]
Case 1: HTML to Well-formed HTML [slide 15]
Consider Some Bad Code [slide 16]
What We Do to Fix It [slide 17]
What Did We Achieve? [slide 18]
A Trivial Document Conversion (Usually) [slide 19]
What Hasn't Changed? [slide 20]
What Do You Gain? [slide 21]
Case 2: HTML to Structured HTML-XML [slide 22]
What Conversion Will Mean [slide 23]
What's the Difference? [slide 24]
A Moderate Conversion (Usually) [slide 25]
Processing: Structured Data is More Useful [slide 26]
Gains From Explicit Structure [slide 27]
Case 3: HTML to XHTML [slide 28]
XHTML Provides [slide 29]
What Conversion Will Mean [slide 30]
Sorts of Questions that Need to be Asked [slide 31]
Example of XHTML [slide 32]
Three Examples [slide 33]
HTML Tidy: An Off-the-shelf Tool [slide 34]
A Moderate Conversion (Usually) [slide 35]
Automation and Tools [slide 36]
Gains from Valid XHTML [slide 37]
Case 4: HTML to User-defined Structure with a DTD/Schema [slide 38]
“Valid” is a Step Beyond “Well-formed” [slide 39]
How a DTD/Schema Helps Conversion [slide 40]
What Conversion Will Mean [slide 41]
Sorts of Questions that Need to be Asked [slide 42]
Descriptive versus Prescriptive [slide 43]
Example: A Very High-level Structural XML [slide 44]
A More Descriptive Structural XML [slide 45]
Medium to Difficult Document Conversion [slide 46]
What Can You Now Do with the Data? [slide 47]
Case 5: HTML to User-defined Content with a DTD/Schema [slide 48]
A Fully-tagged Herbal [slide 49]
Difficult Conversion [slide 50]
The Downside of Subject Tagging [slide 51]
Data Can Now Be Used to Provide [slide 52]
The W3C has Provided Tools [slide 53]
Many Other Tools Also Available [slide 54]
How Do I Translate from HTML to XML? [slide 55]
Is Round Trip Possible? XML to HTML [slide 56]
What's Difficult About HTML to XML? [slide 57]
From HTML to XML... [slide 58]