UNIT I : XML BASICS
XML structure – Elements – Creating Well-formed XML – Basic XML – Document
Type Definition – Name Spaces – Schema Elements, Types, Attributes – X Files –
XPath
XML
∙ XML stands for eXtensible Markup Language.
∙ XML was designed to store and transport data.
∙ XML was designed to be both human- and machine-readable.
<?xml version="1.0" encoding="UTF-8"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
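As a minimal sketch of what "machine-readable" means in practice, the note above can be loaded with Python's standard `xml.etree.ElementTree` module. The helper name `read_note` is made up for this illustration; it is not part of any library.

```python
import xml.etree.ElementTree as ET

# The note document from above, kept as bytes so the encoding
# declaration in the first line is honoured by the parser.
NOTE_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>"""


def read_note(xml_bytes):
    """Return the children of the root element as a {tag: text} dict."""
    root = ET.fromstring(xml_bytes)
    return {child.tag: child.text for child in root}


note = read_note(NOTE_XML)
print(note["to"], note["from"], note["heading"])
```

The tags carry the meaning of the data, so a program can look up `to` or `body` by name instead of guessing from position.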
What is XML?
∙ XML stands for eXtensible Markup Language
∙ XML is a markup language much like HTML
∙ XML was designed to store and transport data
∙ XML was designed to be self-descriptive
∙ XML is a W3C Recommendation
Difference Between XML and HTML
XML and HTML were designed with different goals:
∙ XML was designed to carry data - with focus on what data is
∙ HTML was designed to display data - with focus on how data looks
∙ XML tags are not predefined like HTML tags are
XML Tree
XML Tree Structure
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book category="cooking">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="children">
<title lang="en">Harry Potter</title>
<author>J. K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="web">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
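The tree structure can be explored programmatically. The sketch below uses Python's standard `xml.etree.ElementTree` on a trimmed copy of the bookstore document (authors and years are left out to keep it short). The function name `summarize` is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the bookstore document: one root, three child branches.
BOOKSTORE_XML = """<bookstore>
  <book category="cooking">
    <title lang="en">Everyday Italian</title>
    <price>30.00</price>
  </book>
  <book category="children">
    <title lang="en">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book category="web">
    <title lang="en">Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>"""

root = ET.fromstring(BOOKSTORE_XML)


def summarize(bookstore):
    """Collect (category, title, language, price) for every book child."""
    rows = []
    for book in bookstore.findall("book"):   # direct children of the root
        title = book.find("title")            # child element of <book>
        rows.append((
            book.get("category"),             # attribute on <book>
            title.text,                        # text content of <title>
            title.get("lang"),                 # attribute on <title>
            float(book.find("price").text),
        ))
    return rows


rows = summarize(root)
for row in rows:
    print(row)
```

Here `bookstore` is the root, each `book` is a child of the root, and `title` and `price` are siblings under the same `book` parent.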
XML Syntax Rules
∙ The XML Declaration is Optional, but if Present it Must Come First in the Document
<?xml version="1.0" encoding="UTF-8"?>
∙ XML Documents Must Have a Root Element
∙ All XML Elements Must Have a Closing Tag
∙ XML Tags are Case Sensitive
∙ XML Elements Must be Properly Nested
∙ XML Attribute Values Must be Quoted
<person gender="female">
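A parser rejects any document that breaks these rules. The sketch below feeds a few small strings to Python's standard `xml.etree.ElementTree` and reports which ones are well-formed. The helper `is_well_formed` and the sample strings are made up for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text, False on a ParseError."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One correct document followed by one violation per rule.
SAMPLES = {
    "well-formed": '<?xml version="1.0" encoding="UTF-8"?>'
                   '<person gender="female"><name>Anna</name></person>',
    "missing closing tag": "<note><to>Tove</note>",
    "case mismatch": "<Note><to>Tove</to></note>",
    "improper nesting": "<b><i>text</b></i>",
    "unquoted attribute": "<person gender=female></person>",
    "two root elements": "<a></a><b></b>",
}

results = {label: is_well_formed(text) for label, text in SAMPLES.items()}
for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```

Only the first sample is accepted; each of the others violates exactly one of the rules listed above.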
XML Naming Rules
XML elements must follow these naming rules:
∙ Element names are case-sensitive
∙ Element names must start with a letter or underscore
∙ Element names cannot start with the letters xml (or XML, or Xml, etc), because that prefix is reserved
∙ Element names can contain letters, digits, hyphens, underscores, and periods
∙ Element names cannot contain spaces
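The naming rules can be expressed as a small check. This is a simplified sketch that covers only the rules listed here and only ASCII letters. The real XML specification also permits many non-ASCII letters, and a colon for namespace prefixes. The function name `is_valid_element_name` is an assumption for this example.

```python
import re

# First character: letter or underscore.
# Remaining characters: letters, digits, hyphens, underscores, periods.
# ASCII-only simplification of the rules listed above.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_valid_element_name(name):
    """Check a candidate element name against the simplified rules."""
    if NAME_PATTERN.fullmatch(name) is None:
        return False          # bad first character, space, or other symbol
    if name.lower().startswith("xml"):
        return False          # xml, XML, Xml, ... are reserved
    return True


for candidate in ["book", "_id", "first-name", "1st", "first name", "XmlData"]:
    print(candidate, is_valid_element_name(candidate))
```

Because names are case-sensitive, `Book` and `book` both pass this check but name two different elements.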
DTD
What is a DTD?
A DTD is a Document Type Definition.
A DTD defines the structure and the legal elements and attributes of an XML document.
Why Use a DTD?
With a DTD, independent groups of people can agree on a standard DTD for
interchanging data.
An application can use a DTD to verify that XML data is valid.
An Internal DTD Declaration
If the DTD is declared inside the XML file, it must be wrapped inside the <!DOCTYPE>
definition:
XML document with an internal DTD
<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend</body>
</note>
The DTD above is interpreted like this:
∙ !DOCTYPE note defines that the root element of this document is note
∙ !ELEMENT note defines that the note element must contain four elements: "to,from,heading,body"
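The two points above can be checked by hand in Python. The standard library has no validating parser, so this is only a rough sketch, not real DTD validation. It uses `xml.parsers.expat` to read the declarations from the internal DTD, then compares them with the document. It handles only a plain sequence such as `(to,from,heading,body)` with no `?`, `*`, `+` or `|`. The names `read_internal_dtd` and `matches_dtd` are made up for this illustration.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat

INTERNAL_DTD = """<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
"""
GOOD = INTERNAL_DTD + ("<note><to>Tove</to><from>Jani</from>"
                       "<heading>Reminder</heading>"
                       "<body>Don't forget me this weekend</body></note>")
# Same elements, but <from> and <to> are swapped.
BAD = INTERNAL_DTD + ("<note><from>Jani</from><to>Tove</to>"
                      "<heading>Reminder</heading>"
                      "<body>Don't forget me this weekend</body></note>")


def read_internal_dtd(text):
    """Return the DOCTYPE name and the raw content model per element."""
    info = {"root": None, "models": {}}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        info["root"] = name

    def on_element_decl(name, model):
        # model is a tuple: (type, quantifier, name, children)
        info["models"][name] = model

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.ElementDeclHandler = on_element_decl
    parser.Parse(text, True)
    return info


def matches_dtd(text):
    """Simplified check: root name and child order match the DTD."""
    info = read_internal_dtd(text)
    root = ET.fromstring(text)
    if info["root"] != root.tag or root.tag not in info["models"]:
        return False
    kind, _quant, _name, children = info["models"][root.tag]
    if kind != expat.model.XML_CTYPE_SEQ:
        return False          # this sketch only understands sequences
    declared = [child[2] for child in children]
    actual = [child.tag for child in root]
    return declared == actual


print(matches_dtd(GOOD), matches_dtd(BAD))
```

Both documents are well-formed, but only the first follows the declared order. For real validation, a validating parser from a third-party library would be needed.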