Friday, August 24, 2012

XML CDATA


All text in an XML document will be parsed by the parser.
But text inside a CDATA section will be ignored by the parser.

PCDATA - Parsed Character Data

XML parsers normally parse all the text in an XML document.

When an XML element is parsed, the text between the XML tags is also parsed:

<message>This text is also parsed</message>

The parser does this because XML elements can contain other elements, as in this example, where the <name> element contains two other elements (first and last):

<name><first>Bill</first><last>Gates</last></name>

and the parser will break it up into sub-elements like this:

<name>
  <first>Bill</first>
  <last>Gates</last>
</name>

Parsed Character Data (PCDATA) is a term used about text data that will be parsed by the XML parser.

CDATA - (Unparsed) Character Data

The term CDATA is used about text data that should not be parsed by the XML parser.
Characters like "<" and "&" are illegal in XML elements.
"<" will generate an error because the parser interprets it as the start of a new element.
"&" will generate an error because the parser interprets it as the start of an character entity.
Some text, like JavaScript code, contains a lot of "<" or "&" characters. To avoid errors script code can be defined as CDATA.

Everything inside a CDATA section is ignored by the parser.
A CDATA section starts with "<![CDATA[" and ends with "]]>":

<script>
<![CDATA[
function matchwo(a,b)
{
if (a < b && a < 0) then
  {
  return 1;
  }
else
  {
  return 0;
  }
}
]]>
</script>

In the example above, everything inside the CDATA section is ignored by the parser.

Notes on CDATA sections:

A CDATA section cannot contain the string "]]>". Nested CDATA sections are not allowed.
The "]]>" that marks the end of the CDATA section cannot contain spaces or line breaks.

XML Namespaces


XML Namespaces provide a method to avoid element name conflicts.

Name Conflicts

In XML, element names are defined by the developer. This often results in a conflict when trying to mix XML documents from different XML applications.
This XML carries HTML table information:

<table>
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</table>

This XML carries information about a table (a piece of furniture):

<table>
  <name>African Coffee Table</name>
  <width>80</width>
  <length>120</length>
</table>

If these XML fragments were added together, there would be a name conflict. Both contain a <table> element, but the elements have different content and meaning.

An XML parser will not know how to handle these differences.

Solving the Name Conflict Using a Prefix

Name conflicts in XML can easily be avoided using a name prefix.

This XML carries information about an HTML table, and a piece of furniture:

<h:table>
  <h:tr>
    <h:td>Apples</h:td>
    <h:td>Bananas</h:td>
  </h:tr>
</h:table>

<f:table>
  <f:name>African Coffee Table</f:name>
  <f:width>80</f:width>
  <f:length>120</f:length>
</f:table>

In the example above, there will be no conflict because the two <table> elements have different names.

XML Namespaces - The xmlns Attribute

When using prefixes in XML, a so-called namespace for the prefix must be defined.

The namespace is defined by the xmlns attribute in the start tag of an element.

The namespace declaration has the following syntax. xmlns:prefix="URI".

<root>

<h:table xmlns:h="http://www.w3.org/TR/html4/">
  <h:tr>
    <h:td>Apples</h:td>
    <h:td>Bananas</h:td>
  </h:tr>
</h:table>

<f:table xmlns:f="http://www.w3schools.com/furniture">
  <f:name>African Coffee Table</f:name>
  <f:width>80</f:width>
  <f:length>120</f:length>
</f:table>

</root>

In the example above, the xmlns attribute in the <table> tag give the h: and f: prefixes a qualified namespace.

When a namespace is defined for an element, all child elements with the same prefix are associated with the same namespace.

Namespaces can be declared in the elements where they are used or in the XML root element:

<root xmlns:h="http://www.w3.org/TR/html4/"
xmlns:f="http://www.w3schools.com/furniture"
>

<h:table>
  <h:tr>
    <h:td>Apples</h:td>
    <h:td>Bananas</h:td>
  </h:tr>
</h:table>

<f:table>
  <f:name>African Coffee Table</f:name>
  <f:width>80</f:width>
  <f:length>120</f:length>
</f:table>

</root>

Note: The namespace URI is not used by the parser to look up information.

The purpose is to give the namespace a unique name. However, often companies use the namespace as a pointer to a web page containing namespace information.


Uniform Resource Identifier (URI)

Uniform Resource Identifier (URI) is a string of characters which identifies an Internet Resource.
The most common URI is the Uniform Resource Locator (URL) which identifies an Internet domain address. Another, not so common type of URI is the Universal Resource Name (URN).

In our examples we will only use URLs.

Default Namespaces

Defining a default namespace for an element saves us from using prefixes in all the child elements. It has the following syntax:

xmlns="namespaceURI"

This XML carries HTML table information:

<table xmlns="http://www.w3.org/TR/html4/">
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</table>

This XML carries information about a piece of furniture:

<table xmlns="http://www.w3schools.com/furniture">
  <name>African Coffee Table</name>
  <width>80</width>
  <length>120</length>
</table>


Namespaces in Real Use

XSLT is an XML language that can be used to transform XML documents into other formats, like HTML.

In the XSLT document below, you can see that most of the tags are HTML tags.

The tags that are not HTML tags have the prefix xsl, identified by the namespace xmlns:xsl="http://www.w3.org/1999/XSL/Transform":

<?xml version="1.0" encoding="ISO-8859-1"?>

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">
<html>
<body>
  <h2>My CD Collection</h2>
  <table border="1">
    <tr>
      <th align="left">Title</th>
      <th align="left">Artist</th>
    </tr>
    <xsl:for-each select="catalog/cd">
    <tr>
      <td><xsl:value-of select="title"/></td>
      <td><xsl:value-of select="artist"/></td>
    </tr>
    </xsl:for-each>
  </table>
</body>
</html>
</xsl:template>

</xsl:stylesheet>

If you want to learn more about XSLT, please find our XSLT tutorial at our homepage.

XML Applications


This chapter demonstrates some small XML applications built on XML, HTML, XML DOM and JavaScript.

The XML Document Used

In this application we will use the "cd_catalog.xml" file.

Display the First CD in an HTML div Element

The following example gets the XML data from the first CD element and displays it in an HTML element with id="showCD". The displayCD() function is called when the page is loaded:

Example

x=xmlDoc.getElementsByTagName("CD");
i=0;

function displayCD()
{
artist=(x[i].getElementsByTagName("ARTIST")[0].childNodes[0].nodeValue);
title=(x[i].getElementsByTagName("TITLE")[0].childNodes[0].nodeValue);
year=(x[i].getElementsByTagName("YEAR")[0].childNodes[0].nodeValue);
txt="Artist: " + artist + "<br />Title: " + title + "<br />Year: "+ year;
document.getElementById("showCD").innerHTML=txt;
}

Try it yourself »


Navigate Between the CDs

To navigate between the CDs in the example above, add a next() and previous() function:

Example

function next()
{ // display the next CD, unless you are on the last CD
if (i<x.length-1)
  {
  i++;
  displayCD();
  }
}

function previous()
{ // displays the previous CD, unless you are on the first CD
if (i>0)
  {
  i--;
  displayCD();
  }
}

Try it yourself »


Show Album Information When Clicking On a CD

The last example shows how you can show album information when the user clicks on a CD:

For more information about using JavaScript and the XML DOM, visit our XML DOM tutorial.

XML to HTML


Add HTML to XML Data

In the following example, we loop through an XML file ("cd_catalog.xml"), and display the contents of each CD element as an HTML table row:

Example

<html>
<body>

<script type="text/javascript">
if (window.XMLHttpRequest)
  {// code for IE7+, Firefox, Chrome, Opera, Safari
  xmlhttp=new XMLHttpRequest();
  }
else
  {// code for IE6, IE5
  xmlhttp=new ActiveXObject("Microsoft.XMLHTTP");
  }
xmlhttp.open("GET","cd_catalog.xml",false);
xmlhttp.send();
xmlDoc=xmlhttp.responseXML;

document.write("<table border='1'>");
var x=xmlDoc.getElementsByTagName("CD");
for (i=0;i<x.length;i++)
  {
  document.write("<tr><td>");
  document.write(x[i].getElementsByTagName("ARTIST")[0].childNodes[0].nodeValue);
  document.write("</td><td>");
  document.write(x[i].getElementsByTagName("TITLE")[0].childNodes[0].nodeValue);
  document.write("</td></tr>");
  }
document.write("</table>");
</script>

</body>
</html>

Try it yourself »

For more information about using JavaScript and the XML DOM, visit our XML DOM tutorial.