DTDs

In this lesson of the XML tutorial, you will learn...
  1. The difference between well-formed and valid XML documents.
  2. The purpose of DTDs.
  3. To create internal and external DTDs.
  4. To validate an XML document according to a DTD.
  5. The limitations of DTDs.

Well-formed vs. Valid

A well-formed XML document is one that follows the syntax rules described in "XML Syntax Rules". A valid XML document is one that conforms to a specified structure. For an XML document to be validated, it must be checked against a schema, which is a document that defines the structure for a class of XML documents. XML documents that are not intended to conform to a schema can be well-formed, but they cannot be valid.
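
To make the distinction concrete, the sketch below contrasts two small fragments. The note, to, and body element names are hypothetical and are not part of this lesson's sample files.

```xml
<!-- Not well-formed: the to element is never closed, so a parser must reject the document -->
<note>
 <to>Joshua
 <body>Hello</body>
</note>

<!-- Well-formed: every start tag has a matching end tag and the elements nest properly -->
<note>
 <to>Joshua</to>
 <body>Hello</body>
</note>
```

The second fragment is well-formed, but nothing can be said about its validity until it is checked against a schema that describes the note element.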

The Purpose of DTDs

A Document Type Definition (DTD) is a type of schema. The purpose of DTDs is to provide a framework for validating XML documents. By defining a structure that XML documents must conform to, DTDs allow different organizations to create shareable data files.

Imagine, for example, a company that creates technical courseware and sells it to technical training companies. Those companies may want to display the course outlines on their websites, but not in the same way as every other company that buys the courseware. By providing the course outlines in a predefined XML format, the courseware vendor makes it possible for the training companies to write programs that read those XML files and transform them into HTML pages with their own formatting styles (perhaps using XSLT or CSS). If the XML files had no predefined structure, it would be very difficult to write such programs.

Creating DTDs

DTDs are simple text files that can be created with any basic text editor. Although they look a little cryptic at first, they are not terribly complicated once you get used to them.

A DTD outlines what elements can be in an XML document and the attributes and subelements that they can take. Let's start by taking a look at a complete DTD and then dissecting it.

Code Sample: DTDs/Demos/Beatles.dtd

<!ELEMENT beatles (beatle+)>
<!ELEMENT beatle (name)>
<!ATTLIST beatle
 link CDATA #IMPLIED
 real (yes|no) "yes">
<!ELEMENT name (firstname, lastname)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT lastname (#PCDATA)>

The Document Element

When creating a DTD, the first step is to define the document element.

<!ELEMENT beatles (beatle+)>

The element declaration above states that the beatles element must contain one or more beatle elements.

Child Elements

When defining child elements in DTDs, you can specify how many times those elements can appear by adding a modifier after the element name. If no modifier is added, the element must appear once and only once. The other options are shown in the table below.

Modifier   Description
?          Zero or one time.
+          One or more times.
*          Zero or more times.

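DTDs have no syntax for specifying a range of times that an element may appear (e.g., 2-4 appearances); such a range can only be expressed by listing the required and optional occurrences explicitly.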
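
Because there is no range syntax, a range has to be spelled out as required occurrences followed by optional ones. A minimal sketch, assuming hypothetical playlist and track elements, that allows two to four track elements:

```xml
<!-- Two required track elements, optionally followed by a third,
     which may in turn be followed by a fourth (2 to 4 in total) -->
<!ELEMENT playlist (track, track, (track, track?)?)>
<!ELEMENT track (#PCDATA)>
```

The nested group is a deliberate choice: with a flat list such as (track, track, track?, track?), a parser could not tell which of the two optional positions a third track element matches, and some parsers report such ambiguous content models as errors.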

Other Elements

The other elements are declared in the same way as the document element - with the <!ELEMENT> declaration. The Beatles DTD declares four additional elements.

Each beatle element must contain a child element name, which must appear once and only once.

<!ELEMENT beatle (name)>

Each name element must contain a firstname element followed by a lastname element. Each must appear once and only once, in that order.

<!ELEMENT name (firstname, lastname)>

Some elements contain only text. This is declared in a DTD as #PCDATA. PCDATA stands for parsed character data, meaning that the data will be parsed for XML tags and entities. The firstname and lastname elements contain only text.

<!ELEMENT firstname (#PCDATA)>
<!ELEMENT lastname (#PCDATA)>
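
As an illustration, here is a beatle element that satisfies the declarations above, followed by one that does not because the children of name are in the wrong order. The sample values are only placeholders.

```xml
<!-- Valid: name contains firstname then lastname, each exactly once -->
<beatle>
 <name>
  <firstname>Paul</firstname>
  <lastname>McCartney</lastname>
 </name>
</beatle>

<!-- Invalid: lastname appears before firstname -->
<beatle>
 <name>
  <lastname>Starr</lastname>
  <firstname>Ringo</firstname>
 </name>
</beatle>
```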

Choice of Elements

It is also possible to indicate that exactly one of several elements must appear as a child element. For example, the declaration below indicates that an img element must have either a child element name or a child element id, but not both.

<!ELEMENT img (name|id)>
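
A short sketch of what this declaration accepts and rejects, assuming name and id are each declared elsewhere as (#PCDATA):

```xml
<!-- Valid: exactly one of the two alternatives -->
<img><name>logo</name></img>
<img><id>img-01</id></img>

<!-- Invalid: both alternatives present -->
<img><name>logo</name><id>img-01</id></img>

<!-- Invalid: neither alternative present -->
<img></img>
```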

Empty Elements

Empty elements are declared as follows.

<!ELEMENT img EMPTY>
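
An element declared EMPTY may be written as an empty-element tag or as a start tag immediately followed by an end tag. The fragments below sketch what a validating parser accepts for this declaration.

```xml
<!-- Valid: both forms have no content -->
<img/>
<img></img>

<!-- Invalid: text counts as content, and so does white space -->
<img>logo</img>
<img> </img>
```

An EMPTY element can still carry attributes, since attributes are not part of an element's content.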

Mixed Content

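Sometimes elements can have elements and text intermingled. For example, the following declaration is for a body element that may contain text in addition to any number of link and img elements. In a mixed-content declaration, #PCDATA must be listed first and the group must end with the * modifier.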

<!ELEMENT body (#PCDATA | link | img)*>
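
For example, the body element below mixes text with link and img children in no particular order. This is a sketch that assumes link is declared as (#PCDATA) and img as EMPTY; neither declaration appears in this lesson's sample files.

```xml
<body>See <link>the course outline</link> for details. <img/> More text can follow, and <link>links</link> may repeat.</body>
```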

Location of Modifier

The location of modifiers in a declaration is important. If the modifier is outside of a set of parentheses, it applies to the group; whereas, if the modifier is immediately next to an element name, it applies only to that element. The following examples illustrate.

In the example below, the body element can have any number of interspersed child link and img elements.

<!ELEMENT body (link | img)*>

In the example below, the body element can have any number of child link elements or any number of child img elements, but it cannot have both link and img elements.

<!ELEMENT body (link* | img*)>

In the example below, the body element can have any number of child link and img elements, but they must come in pairs, with the link element preceding the img element.

<!ELEMENT body (link, img)*>

In the example below, the body element can have any number of child link elements followed by any number of child img elements.

<!ELEMENT body (link*, img*)>
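
To compare the four declarations side by side, the fragments below show several sequences of children and note which declarations accept each one. The sketch assumes link and img are both declared EMPTY.

```xml
<!-- link, img, link, img: accepted by (link | img)* and (link, img)* -->
<body><link/><img/><link/><img/></body>

<!-- link, link, img: accepted by (link | img)* and (link*, img*) -->
<body><link/><link/><img/></body>

<!-- link, link: accepted by (link | img)*, (link* | img*), and (link*, img*) -->
<body><link/><link/></body>

<!-- img, link: accepted only by (link | img)* -->
<body><img/><link/></body>

<!-- no children: accepted by all four declarations -->
<body></body>
```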

Using Parentheses for Complex Declarations

Element declarations can be more complex than the examples above. For example, you can specify that a person element contains either a single name element or a firstname element followed by a lastname element. To group elements, wrap them in parentheses as shown below.

<!ELEMENT person (name | (firstname,lastname))>
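
Two person elements that this declaration accepts, and one that it rejects, are sketched below. The sample assumes name, firstname, and lastname are each declared as (#PCDATA).

```xml
<!-- Valid: the first alternative, a single name element -->
<person><name>Nat Dunn</name></person>

<!-- Valid: the second alternative, firstname followed by lastname -->
<person><firstname>Nat</firstname><lastname>Dunn</lastname></person>

<!-- Invalid: mixes the two alternatives -->
<person><name>Nat Dunn</name><lastname>Dunn</lastname></person>
```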

Declaring Attributes

Attributes are declared using the <!ATTLIST> declaration. The syntax is shown below.

<!ATTLIST ElementName
 AttributeName AttributeType State DefaultValue?
 AttributeName AttributeType State DefaultValue?>
  • ElementName is the name of the element taking the attributes.
  • AttributeName is the name of the attribute.
  • AttributeType is the type of data that the attribute value may hold. Although there are many types, the most common are CDATA (character data, that is, any text string) and ID (a unique identifier). A list of allowed values, separated by | and enclosed in parentheses, can also be given for the attribute type.
  • State can be one of three values: #REQUIRED (the attribute must be present), #IMPLIED (the attribute is optional and has no default), and #FIXED (the attribute always has the value given as DefaultValue). State can be omitted when a DefaultValue is given on its own.
  • DefaultValue is the value of the attribute if it is not included in the element. It is written in quotes and is used only with #FIXED or when State is omitted.
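
The declaration below sketches each of these options in one place. The course element and its attribute names are hypothetical and are not part of this lesson's sample files.

```xml
<!-- code:   an ID, required on every course element
     level:  one of two listed values, defaulting to beginner
     format: always online; a document may omit it but may not change it
     url:    optional text with no default -->
<!ELEMENT course (#PCDATA)>
<!ATTLIST course
 code ID #REQUIRED
 level (beginner|advanced) "beginner"
 format CDATA #FIXED "online"
 url CDATA #IMPLIED>
```

Note that an ID value must be a valid XML name, so it cannot begin with a digit, and it must be unique within the document.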

The beatle element has two possible attributes: link, which is optional and may contain any text, and real, which must be either yes or no and defaults to yes if it is not included.

<!ATTLIST beatle
 link CDATA #IMPLIED
 real (yes|no) "yes">
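
The fragments below sketch how a validating parser treats these two attributes. The value maybe is made up solely to show a rejection.

```xml
<!-- Valid: link is omitted because it is #IMPLIED; real is omitted, so the default value yes applies -->
<beatle><name><firstname>John</firstname><lastname>Lennon</lastname></name></beatle>

<!-- Valid: real is set to the other allowed value -->
<beatle link="http://www.webucator.com" real="no"><name><firstname>Nat</firstname><lastname>Dunn</lastname></name></beatle>

<!-- Invalid: maybe is not one of the values listed in (yes|no) -->
<beatle real="maybe"><name><firstname>Nat</firstname><lastname>Dunn</lastname></name></beatle>
```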

Validating an XML Document with a DTD

The DOCTYPE declaration in an XML document names the document element and specifies the DTD to which the document should conform. In the code sample below, the SYSTEM keyword is followed by the location of the DTD; the relative path indicates that the file should be validated against Beatles.dtd in the same directory.

Code Sample: DTDs/Demos/Beatles.xml

<?xml version="1.0"?>
<!DOCTYPE beatles SYSTEM "Beatles.dtd">
<beatles>
 <beatle link="http://www.paulmccartney.com">
  <name>
   <firstname>Paul</firstname>
   <lastname>McCartney</lastname>
  </name>
 </beatle>
 <beatle link="http://www.johnlennon.com">
  <name>
   <firstname>John</firstname>
   <lastname>Lennon</lastname>
  </name>
 </beatle>
 <beatle link="http://www.georgeharrison.com">
  <name>
   <firstname>George</firstname>
   <lastname>Harrison</lastname>
  </name>
 </beatle>
 <beatle link="http://www.ringostarr.com">
  <name>
   <firstname>Ringo</firstname>
   <lastname>Starr</lastname>
  </name>
 </beatle>
 <beatle link="http://www.webucator.com" real="no">
  <name>
   <firstname>Nat</firstname>
   <lastname>Dunn</lastname>
  </name>
 </beatle>
</beatles>
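
The DTD can also be placed inside the XML document itself, as an internal DTD enclosed in square brackets within the DOCTYPE declaration. A minimal sketch, using the same declarations as Beatles.dtd and a single beatle element for brevity:

```xml
<?xml version="1.0"?>
<!-- The declarations between [ and ] form the internal DTD -->
<!DOCTYPE beatles [
 <!ELEMENT beatles (beatle+)>
 <!ELEMENT beatle (name)>
 <!ATTLIST beatle
  link CDATA #IMPLIED
  real (yes|no) "yes">
 <!ELEMENT name (firstname, lastname)>
 <!ELEMENT firstname (#PCDATA)>
 <!ELEMENT lastname (#PCDATA)>
]>
<beatles>
 <beatle link="http://www.paulmccartney.com">
  <name>
   <firstname>Paul</firstname>
   <lastname>McCartney</lastname>
  </name>
 </beatle>
</beatles>
```

An internal DTD keeps the document self-contained, while an external DTD such as Beatles.dtd can be shared by many documents.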

Exercise: Writing a DTD

Duration: 60 to 90 minutes.

In this exercise, you will write a DTD for the business letter shown below. You will then give your DTD to another student, who will mark up the business letter as a valid XML document according to your DTD. Likewise, you will mark up the business letter according to someone else's DTD. Make sure that the XML file contains a DOCTYPE declaration.

Both documents should be saved in the DTDs/Exercises folder. To test whether the XML file is valid, visit http://www.softwareadjuvant.com/services/validate/validate.html and upload your XML document and DTD.

Code Sample: DTDs/Exercises/BusinessLetter.txt

November 29, 2005

Joshua Lockwood
Lockwood & Lockwood
291 Broadway Ave.
New York, NY 10007
United States

Dear Mr. Lockwood:

Along with this letter, I have enclosed the following items:

 - two original, execution copies of the Webucator Master Services Agreement
 - two original, execution copies of the Webucator Premier Support for 
  Developers Services Description between 
  Lockwood & Lockwood and Webucator, Inc.
 
Please sign and return all four original, execution copies to me at your
earliest convenience.  Upon receipt of the executed copies, we will 
immediately return a fully executed, original copy of both agreements to you.

Please send all four original, execution copies to my attention as follows:

 Webucator, Inc.
 4933 Jamesville Rd.
 Jamesville, NY 13078  USA
 Attn: Bill Smith
 
If you have any questions, feel free to call me at 800-555-1000 x123 
or e-mail me at bsmith@webucator.com.

Best regards,

Bill Smith
VP, Operations

DTDs Conclusion

In this lesson of the XML tutorial, you learned to create DTDs to validate XML documents.

