XML: Separating Data and Its Structure from Presentation
- Why XML: the problem with HTML
  - No easy way to identify data in documents
  - Difficult to identify metadata
- General purpose of XML
  - XML gives us a way to create and maintain structured documents in plain text that can be rendered in a variety of ways.
  - A primary objective of XML is to separate content from presentation.
  - XML describes data through the use of markup tags.
  - Works well for exchanging data between applications.
  - XML documents can be presented in different ways through the use of the Extensible Stylesheet Language (XSL).
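To make the "markup tags describe data" point concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The order record, its element names, and the read_order helper are all hypothetical, invented for illustration; the point is only that a receiving application can pick values out by name rather than by position or appearance.

```python
import xml.etree.ElementTree as ET

# Hypothetical order record; the element and attribute names are illustrative only.
ORDER_XML = """<order id="1001">
  <customer>Acme Corp</customer>
  <item sku="A-17" quantity="3">Widget</item>
  <total currency="USD">29.85</total>
</order>"""


def read_order(xml_text):
    """Pull the data out of the markup by element and attribute name."""
    root = ET.fromstring(xml_text)
    item = root.find("item")
    total = root.find("total")
    return {
        "id": root.get("id"),
        "customer": root.findtext("customer"),
        "sku": item.get("sku"),
        "quantity": int(item.get("quantity")),
        "total": float(total.text),
        "currency": total.get("currency"),
    }


order = read_order(ORDER_XML)
print(order["customer"], order["quantity"], order["total"], order["currency"])
```

Nothing in the document says how the order should look on screen; that decision is left to whatever stylesheet or program consumes it.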
- Basics of XML
  - A simplified subset of SGML
  - Standard defined by the World Wide Web Consortium (W3C)
  - Plain-text markup language
  - Extensible: authors define their own tags
  - Complements, not replaces, HTML
  - Not just for the Web
- XML Documents
  - XML elements
  - Well-formed: elements are properly nested and every start tag has an end tag
  - Valid: the document matches the data schema defined by a DTD or Schema
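The well-formedness rules can be checked mechanically, because a conforming parser must reject a document that breaks them. The sketch below uses Python's standard xml.etree.ElementTree; the is_well_formed helper and the sample strings are made up for illustration. Note that this parser checks well-formedness only. Checking validity against a DTD or Schema needs a validating parser, which the Python standard library does not provide.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


GOOD = "<note><to>Ann</to><body>Hello</body></note>"
BAD_NESTING = "<note><to>Ann</body></to></note>"   # tags overlap instead of nesting
MISSING_END_TAG = "<note><to>Ann</note>"           # <to> is never closed

for label, text in [("good", GOOD),
                    ("bad nesting", BAD_NESTING),
                    ("missing end tag", MISSING_END_TAG)]:
    print(label, is_well_formed(text))
```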
- DTD (Document Type Definition)
  - Defines the legal building blocks (elements) of an XML document
  - Acts as the data schema
  - Can be included inline with the XML document or as a separate document
  - Standard DTDs are evolving
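As an illustration of an inline DTD, the sketch below embeds a small, hypothetical "note" DTD in the document itself and parses it with Python's standard xml.dom.minidom. The element names are invented for the example. One caveat: minidom is a non-validating parser, so it reads the DOCTYPE declaration but does not enforce the element rules. Enforcing them would require a validating parser from outside the standard library.

```python
from xml.dom import minidom

# Hypothetical document with its DTD included inline (the "internal subset").
NOTE_WITH_DTD = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to, from, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note>
  <to>Ann</to>
  <from>Bob</from>
  <body>Lunch at noon?</body>
</note>"""

doc = minidom.parseString(NOTE_WITH_DTD)

# The DOCTYPE names the root element the DTD describes.
doctype_name = doc.doctype.name
root_name = doc.documentElement.tagName
print(doctype_name, root_name)
```

To keep the DTD in a separate document instead, the DOCTYPE line would reference it by a system identifier, for example `<!DOCTYPE note SYSTEM "note.dtd">`, where the file name is again only an assumption.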
- Schema
  - Adds data typing, default values, and constraints on values beyond what a DTD provides
- XSL
  - Consists of a transformation language (XSLT) and a formatting language (XSL-FO)
  - XSLT is the more developed of the two
  - XSLT is used to transform an XML document into another form:
    - Different XML
    - HTML
    - PDF (via XSL-FO)
    - Etc.
  - Example: XML -> XSL -> Result
  - Declarative language, analogous to SQL
  - Based on matching templates and selecting data
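The "matching templates and selecting data" idea can be imitated in a few lines of Python. To be clear, this is not XSLT and does not use an XSLT processor (the Python standard library has none). It is a rough analogy, with a hypothetical catalog document and made-up function names, showing how rules keyed on element names can turn XML into HTML without an explicit loop over the whole document.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical source document.
CATALOG_XML = """<catalog>
  <book><title>XML Basics</title><price>19.95</price></book>
  <book><title>Learning XSLT</title><price>24.50</price></book>
</catalog>"""


def render_catalog(node):
    # Roughly like a template matching "catalog": wrap the children's output.
    rows = "".join(apply_templates(child) for child in node)
    return "<table>" + rows + "</table>"


def render_book(node):
    # Roughly like a template matching "book": select data from child elements.
    title = escape(node.findtext("title", default=""))
    price = escape(node.findtext("price", default=""))
    return "<tr><td>" + title + "</td><td>" + price + "</td></tr>"


# One "template" per element name.
TEMPLATES = {"catalog": render_catalog, "book": render_book}


def apply_templates(node):
    """Use the template matching this element, or fall through to its children."""
    template = TEMPLATES.get(node.tag)
    if template is None:
        return "".join(apply_templates(child) for child in node)
    return template(node)


html_out = apply_templates(ET.fromstring(CATALOG_XML))
print(html_out)
```

Swapping in a different TEMPLATES table would produce a different presentation of the same data, which is the separation of content from presentation that XSL is meant to provide.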
Diagram of XML Processing
Sample DTD from RosettaNet
Sample XML (to really see the XML, view the source)
Sample XSL
- XML Parser
  - Microsoft's parser is a COM component that ships with IE5
  - Netscape does not include a parser, but other parsers are available
  - List of parsers
- Namespaces
  - Method for qualifying element names
  - Allows for mixing of “vocabularies” (DTDs)
  - URI: Uniform Resource Identifier
    - URL: identifies a resource by its specific location
    - URN: a name that assumes persistence, resolved either through a location or through an agent that will find the location
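A short sketch of how namespaces keep two vocabularies apart, using Python's standard xml.etree.ElementTree. Both URNs and all element names are invented for the example; a namespace URI serves as an identifier and the parser never fetches it. Two elements are both called "name", yet each is qualified by a different namespace, so they do not collide.

```python
import xml.etree.ElementTree as ET

# Hypothetical document mixing an "orders" vocabulary (the default namespace)
# with an "addresses" vocabulary (bound to the addr: prefix).
MIXED_XML = """<order xmlns="urn:example:orders"
       xmlns:addr="urn:example:addresses">
  <name>Widget order</name>
  <addr:name>Ann Smith</addr:name>
</order>"""

# Prefixes used in the queries below; they need not match the document's prefixes.
NS = {"ord": "urn:example:orders", "addr": "urn:example:addresses"}

root = ET.fromstring(MIXED_XML)

# ElementTree stores each tag as "{namespace-uri}localname".
print(root.tag)

order_name = root.findtext("ord:name", namespaces=NS)
contact_name = root.findtext("addr:name", namespaces=NS)
print(order_name, "/", contact_name)
```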
- Document Object Model (DOM)
  - Provides a standard programming interface to an XML document
  - Creates a tree data structure of the document in memory
  - Nodes (elements) can be accessed, added, deleted, or modified
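The four node operations listed above can be sketched with Python's standard xml.dom.minidom, which implements a subset of the W3C DOM interface. The inventory document is hypothetical; the method names (getElementsByTagName, createElement, appendChild, removeChild) are the DOM's own.

```python
from xml.dom import minidom

# Parse a small, made-up document into an in-memory tree.
doc = minidom.parseString(
    "<inventory><part>bolt</part><part>nut</part></inventory>"
)
root = doc.documentElement

# Access: find elements by name and read the text node of the first one.
parts = root.getElementsByTagName("part")
first_text = parts[0].firstChild.data

# Add: create a new element with a text child and attach it to the root.
washer = doc.createElement("part")
washer.appendChild(doc.createTextNode("washer"))
root.appendChild(washer)

# Modify: change the text of an existing node.
parts[0].firstChild.data = "hex bolt"

# Delete: remove the second original <part> element ("nut").
root.removeChild(parts[1])

result = root.toxml()
print(first_text)
print(result)
```

Because the whole tree lives in memory, this style suits documents that need random access or editing; very large documents are usually better served by a streaming parser.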