Using XmlReaderSettings, XmlReader, and the Static Create Methods
It must be tough for companies that develop software for working
with XML. No sooner do they get a product out of the door than the World Wide Web
Consortium (W3C) changes the recommendations and standards so that their
product is out of date. Yet the manufacturers still have to maintain backward
compatibility with their previous releases, while attempting to encompass all
the new standards. We've seen this several times before in Microsoft's XML
product space, and the process shows little sign of stabilizing yet.
OK, so the base specification for XML itself, version 1.0,
is complete, stable and implemented in almost all products now. But recent
advances in technologies such as XML Query Language (XQuery - see http://www.w3.org/XML/Query) and the XML
Information Set (XML InfoSet - see http://www.w3.org/TR/xml-infoset/)
require changes to core classes in the System.Xml namespace with each release of
the Framework, to keep up with evolving standards.
When version 1.0 of the .NET Framework was introduced, it brought
with it a whole raft of new techniques for working with XML. This included a
new pull-model parser, the XmlReader,
new XML document objects such as XmlDocument, XmlDataDocument and XPathDocument, new
classes for working with schemas, and a brand new XSL-T processor. Now, at the
time of writing, version 2.0 has just appeared (this article is based on the
Beta 2 release). And after the preamble above, you won't be surprised to learn
that there are a great many changes in the release compared to version 1.x.
In this series of three articles, we'll look in detail at
how the new features of the XmlReader and XmlWriter classes in version 2.0 of the .NET Framework can be used to read and write XML
documents, and interact with the new XML document store objects. This includes:
- The new "settings" classes and static Create methods
for XmlReader and XmlWriter
- Creating and using an XmlReader to read and validate XML documents
and fragments
- Two of the useful new features of the XmlReader class
- Creating and using an XmlWriter to write XML documents and
fragments
- Some useful new features of the XmlWriter class
- How the XmlReader and XmlWriter can be used with the XmlDocument class
- Some of the useful new features of the XmlDocument class
Along the way, we'll look into the issues involved in using
the new classes, the reasoning behind the changes, and how the new features
simplify your code and provide better overall efficiency for your applications.
This first article concentrates on the XmlReader class, and how the new XmlReaderSettings class makes it easy to create XmlReader instances with specific properties such as
validation and access control for use in your applications.
The New "Settings" Classes for XmlReader and XmlWriter
To read or write XML in version 1.x,
you can create an instance of a class that inherits from XmlReader or XmlWriter, such as XmlTextReader or XmlTextWriter, and then set various properties before using that reader or
writer. The XmlReader and XmlWriter classes are abstract, and so you cannot create instances of them directly. And
each time you need a reader or writer, you have to go through the same process
of creating an instance and setting the properties.
In version 2.0, the fundamental technique for creating
readers and writers has changed. There are two new classes named XmlReaderSettings and XmlWriterSettings that you use as a "factory" to generate instances of readers and
writers on demand, without having to repeatedly set their properties. This has
several benefits in that it:
- Reduces the code you have to write
- Allows the framework to make optimizations in the reader
or writer based on the settings, for example omitting validation support
if this is not required
- Provides classes that can execute more efficiently in
circumstances where the extra features are not required
- Allows you to create instances of the abstract base
classes, rather than having to instantiate classes that inherit from XmlReader or XmlWriter
- Allows the XmlReader and XmlWriter to be extended in future
releases without breaking your code, and therefore removes the need for
multiple concrete implementations aimed at different scenarios
The version 2.0 XmlReader and XmlWriter classes expose a new Static/Shared method called Create, which allows you
to create instances by specifying an XmlReaderSettings or XmlWriterSettings class instance that defines the behavior you want. We'll look at
how this works with the XmlReader in this article, and the XmlWriter in the next article.
However, first, it's useful to see how the XmlReader and XmlWriter fit into the whole scheme of things in .NET version 2.0. Figure 1
shows the main data flows that involve the three types of XML document store
and manipulation classes in System.Xml 2.0 and its subsidiary namespaces. You
can see that the XmlReader and XmlWriter are a fundamental part of the flow when reading XML into, and
saving it from, other classes such as the document stores.

Figure 1 - How the XmlReader and
XmlWriter can be used with the XML Document Stores in v2.0
Not shown here are other areas where the XmlReader and XmlWriter are used, for example when reading XML using the SQLXML technology
in SQL Server via an ADO.NET Command instance, or
reading and writing XML with the new XslCompiledTransform class that performs XSL-T transformations. And, of course, you can use the
methods of the XmlReader and XmlWriter classes directly to read and expose nodes from an XML document, or to
create new XML documents.
The XmlReaderSettings Class
The XmlReaderSettings class is used to specify the behavior you want for XmlReader instances that you will create
and use in your code. Figure 2 shows a schematic overview of the XmlReaderSettings class. You can see that the set of properties available is broadly similar to
those you will be used to in the version 1.x XmlReader class. You can specify a range
of properties that control the way XML is handled, including ignoring
white-space and processing instructions, specifying the schema validation type
and conformance level, preventing DTDs from being processed, and closing the underlying
input stream automatically when the reader is closed.

Figure 2 - The XmlReaderSettings
Class
There are also properties that return the current line
number and character offset when reading a document, and the ability to switch
on and off strict checking of the characters in the input stream (for example
characters that are outside the legal range for XML documents). The XmlReaderSettings class also exposes a reference to an XmlResolver that is used to safely read
external schemas, DTDs and entities; plus a reference to an ICredentials collection that contains the network credentials to be presented to the server
when accessing a remote document.
The XmlReaderSettings class also exposes a reference to an XmlNameTable. This is a
table of atomized strings (element and attribute names, prefixes and namespace URIs) that the reader
uses to compare names efficiently. The namespace prefix declarations themselves are supplied through
an XmlNamespaceManager, for example within an XmlParserContext.
You can also read an XML stream that doesn't contain the <?xml
version="1.0"?> declaration, and read fragments of
XML that are not - on their own - valid documents. You specify the conformance
level, so that the reader will accept input that is not actually a complete XML
document, for example a fragment that contains un-declared namespace prefixes.
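As a rough illustration of the conformance level setting, the following console sketch reads a string that holds two top-level elements and no XML declaration. The module name and the sample fragment are invented for this illustration and are not part of the example page.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Module FragmentDemo
    Sub Main()
        ' Hypothetical input: two top-level elements and no XML declaration,
        ' so this is a fragment rather than a complete document
        Dim fragment As String = "<slide position=""1""/><slide position=""2""/>"

        ' Ask the reader to accept fragments instead of whole documents
        Dim rs As New XmlReaderSettings()
        rs.ConformanceLevel = ConformanceLevel.Fragment

        Using xr As XmlReader = XmlReader.Create(New StringReader(fragment), rs)
            While xr.Read()
                If xr.NodeType = XmlNodeType.Element Then
                    Console.WriteLine(xr.Name & " position=" & xr.GetAttribute("position"))
                End If
            End While
        End Using
    End Sub
End Module
```

With the default conformance level, the second top-level element would be expected to cause an XmlException, because a document may have only one root element.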
Some of the ways that you can use the XmlReaderSettings class are discussed next. We'll look at:
- Creating an XmlReader with the XmlReaderSettings class
- Validating XML with the XmlReaderSettings and XmlReader classes
- Handling XML validation errors
- Using a custom handler to trap XML validation
errors and warnings
- Reading fragments of XML with an XmlReader
- Validating fragments of XML with an XmlReader
- Using an XmlResolver to limit access to resources
- Wrapping or "pipelining" XmlReader instances
The example page shown in Figure 3
demonstrates most of the features listed above. You can run or download all of
the samples at our Website at http://www.daveandal.net/articles/readwritexml/.
This first example, named readersettings.aspx, allows you to turn
on and off validation (including using a custom validation handler and trapping
validation warnings), set the conformance level for a document or a fragment,
and use an XmlResolver to limit access to the XML disk file. It also demonstrates reading
typed values, as you'll see later in the article. There is a [view source] link at the bottom of the page that you can use to see the source
code, which is fully commented to help you understand how it all works.

Figure 3 - The Example Page that
Demonstrates Using the XmlReaderSettings Class
Creating an XmlReader with the XmlReaderSettings Class
To create an XmlReader instance, you
first create an instance of the XmlReaderSettings class, set the properties you want, and then call the Create method of the XmlReader class. For example, this code
creates an XmlReader that closes the underlying input stream when the reader is closed,
ignores comments in the XML document, and reads the XML disk file named myfile.xml:
Dim rs As New XmlReaderSettings()
rs.CloseInput = True
rs.IgnoreComments = True
Dim xr As XmlReader = XmlReader.Create("C:\temp\myfile.xml", rs)
Other overloads of the Create method allow you to generate an XmlReader over a Stream, or wrap an existing TextReader or XmlReader which is then used as the input to the new XmlReader. You can also pass an XmlParserContext instance as the third parameter of the Create method, which
allows you to declare the namespaces and prefixes used in the document, and
specify the language and the white-space handling options that the reader will
use when reading the XML. Finally, you can use the Create method without specifying an XmlReaderSettings instance if you just want to create a single instance of an XmlReader, and set the various properties of the reader directly afterwards.
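A minimal sketch of two of these overloads follows, assuming a console program and an in-memory XML string (both invented for illustration). It creates one reader over a TextReader, then wraps that reader in a second one whose settings ask for comments to be ignored.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Module CreateOverloadsDemo
    Sub Main()
        ' Hypothetical in-memory document containing a comment
        Dim xml As String = "<root><!-- note --><item>one</item></root>"

        ' Overload 1: create a reader over a TextReader with default settings
        Dim inner As XmlReader = XmlReader.Create(New StringReader(xml))

        ' Overload 2: wrap the existing reader, layering extra settings on top
        Dim rs As New XmlReaderSettings()
        rs.IgnoreComments = True

        Using outer As XmlReader = XmlReader.Create(inner, rs)
            While outer.Read()
                ' Print each node type with its name or value
                Console.WriteLine(outer.NodeType.ToString() & ": " & outer.Name & outer.Value)
            End While
        End Using
    End Sub
End Module
```

The outer reader pulls its nodes from the inner one, so the comment node should not appear in the output even though the inner reader was created without that setting.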
The example page shown in Figure 3 provides
a drop-down list where you can select from a range of XML disk files. It also
declares a variable to hold an XmlParserContext instance, which is
populated if you select the option to read an XML fragment instead of a
complete and well-formed XML document. The XmlReader is then created using the
static Create method against the XML file you select in the drop-down list:
Dim xpc As XmlParserContext = Nothing
...
' create and populate the XmlParserContext here if reading an XML fragment
...
Dim xr As XmlReader = Nothing
Dim sPath As String = Server.MapPath("data/" & lstDocument.SelectedItem.Text)
xr = XmlReader.Create(sPath, rs, xpc)
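The code that populates the XmlParserContext is elided in the listing above. One plausible way to do it, sketched here as a standalone console program rather than the example page's actual code, is to declare the prefix in an XmlNamespaceManager and pass that to the XmlParserContext constructor. The fragment string is an assumption based on the rv:reviewed element used elsewhere in this article.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Module ParserContextDemo
    Sub Main()
        ' Hypothetical fragment: uses the rv prefix without declaring it
        Dim fragment As String = "<rv:reviewed>2004-05-10T00:00:00</rv:reviewed>"

        ' Declare the prefix outside the XML, through a namespace manager
        Dim nsm As New XmlNamespaceManager(New NameTable())
        nsm.AddNamespace("rv", "http://myns/slidesdemo/reviewdate")

        ' No name table override, no xml:lang, default white-space handling
        Dim xpc As New XmlParserContext(Nothing, nsm, Nothing, XmlSpace.None)

        Dim rs As New XmlReaderSettings()
        rs.ConformanceLevel = ConformanceLevel.Fragment

        Using xr As XmlReader = XmlReader.Create(New StringReader(fragment), rs, xpc)
            While xr.Read()
                If xr.NodeType = XmlNodeType.Element Then
                    Console.WriteLine(xr.LocalName & " in " & xr.NamespaceURI)
                End If
            End While
        End Using
    End Sub
End Module
```

Without the parser context, the reader has no way to resolve the rv prefix and would be expected to raise an XmlException on the first element.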
If there is an error creating the XmlReader, for example a security exception or if the XML file or stream you
specify does not exist, the exception is raised when you call the Create method. Therefore you should always use a Try..Catch construct to trap any such errors.
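A sketch of that pattern, assuming a console program and a file name that is deliberately chosen not to exist (both are illustrative, not taken from the example page):

```vbnet
Imports System
Imports System.Xml

Module CreateErrorDemo
    Sub Main()
        Dim rs As New XmlReaderSettings()
        Dim xr As XmlReader = Nothing
        Try
            ' Hypothetical path, assumed not to exist on disk
            xr = XmlReader.Create("no-such-file.xml", rs)
            While xr.Read()
                ' ... process nodes here ...
            End While
        Catch ex As Exception
            ' Report whichever exception type Create or Read raised
            Console.WriteLine("Could not read XML: " & ex.GetType().Name & " - " & ex.Message)
        Finally
            ' The reader is still Nothing if Create itself failed
            If xr IsNot Nothing Then xr.Close()
        End Try
    End Sub
End Module
```

Checking for Nothing in the Finally block matters here, because when Create fails there is no reader to close.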
Validating XML with the XmlReaderSettings and XmlReader Classes
One of the stranger features in version 1.x of the System.Xml implementation is that you have to use a special class, XmlValidatingReader, to validate an XML
document. And you have to create this XmlValidatingReader from an existing XmlReader instance. This is because validation adds an overhead to the reader class that
wastes resources if validation is not required (although the readers do check
that the document is well-formed).
In version 2.0, you can validate a document directly when
using an XmlReader.
A range of properties on the XmlReaderSettings class allow you to specify one or more external
XML schemas using the XmlSchemaSet class (a collection of XmlSchema instances), and these are applied to the XML as it is read - depending on the
settings you specify for the ValidationType and ValidationFlags properties. The ValidationFlags property is a combination of flag values from the XmlSchemaValidationFlags enumeration, as
shown earlier in Figure 2. This enumeration contains five values:
- None: none of the validation flags are active - this is the default
- ProcessIdentityConstraints: all constraints specified by xs:ID, xs:IDREF, xs:key, xs:keyref, xs:unique
are processed
- ProcessInlineSchema: any inline schema within the document is processed
- ProcessSchemaLocation: any attributes that specify external schema locations, such as xsi:schemaLocation
and xsi:noNamespaceSchemaLocation, are processed
- ReportValidationWarnings: any warnings encountered during validation are detected, and the
corresponding validation events are raised
To enable validation in an XmlReaderSettings class, before you
create the XmlReader instances you need from it, you must perform two tasks. The first is to create
an XmlSchemaSet and assign the schemas that will be used for validating the XML to it (unless
the XML document contains an inline schema). In the example page we use an XML
document that references two schemas - one that defines the main elements in the
document and one that defines the reviewed element with the namespace prefix "rv". This is
the standard and valid XML document:
<?xml version="1.0" encoding="utf-8"?>
<root xmlns="http://myns/slidesdemo"
      xmlns:rv="http://myns/slidesdemo/reviewdate">
  <session name="All about XML">
    <slides>
      <slide position="1">
        <title>Agenda</title>
        <rv:reviewed>2004-05-10T00:00:00</rv:reviewed>
      </slide>
      <slide position="2">
        <title>Introduction</title>
        <rv:reviewed>2003-10-22T00:00:00</rv:reviewed>
      </slide>
      <slide position="3">
        <title>Code Examples</title>
        <rv:reviewed>2004-03-02T00:00:00</rv:reviewed>
      </slide>
    </slides>
  </session>
</root>
You can see the two namespace declarations in the root element, and
these are used in the targetNamespace attribute of the two schemas. So we need to add both of these schemas to the XmlSchemaSet, and
then assign the XmlSchemaSet to the Schemas property of the XmlReaderSettings instance:
Dim ss As New XmlSchemaSet()
ss.Add("http://myns/slidesdemo", Server.MapPath("data/schema/slides.xsd"))
ss.Add("http://myns/slidesdemo/reviewdate", Server.MapPath("data/schema/slidesrev.xsd"))
rs.Schemas = ss
Then we turn on validation by setting the ValidationType and specifying the ValidationFlags we want
to be active. In this case, we've specified that validation should be carried
out against an XML schema, though you could use ValidationType.Auto, in
which case the reader will detect which type of schema or DTD is being used:
rs.ValidationType = ValidationType.Schema
rs.ValidationFlags = rs.ValidationFlags Or XmlSchemaValidationFlags.ProcessSchemaLocation
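To try these steps outside a Web page, a self-contained console sketch along the following lines may help. The one-element schema and the sample document are invented for illustration (they are not the slides.xsd files from the example), and both are loaded from strings so that no disk files are needed.

```vbnet
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Schema

Module ValidationDemo
    Sub Main()
        ' Hypothetical schema: <slide> requires an unsignedByte position attribute
        Dim xsd As String = _
            "<xs:schema xmlns:xs=""http://www.w3.org/2001/XMLSchema"">" & _
            "<xs:element name=""slide"">" & _
            "<xs:complexType>" & _
            "<xs:attribute name=""position"" type=""xs:unsignedByte"" use=""required""/>" & _
            "</xs:complexType>" & _
            "</xs:element>" & _
            "</xs:schema>"

        ' Step 1: build the schema set (empty string = no target namespace)
        Dim ss As New XmlSchemaSet()
        ss.Add("", XmlReader.Create(New StringReader(xsd)))

        ' Step 2: attach the schemas and switch on schema validation
        Dim rs As New XmlReaderSettings()
        rs.Schemas = ss
        rs.ValidationType = ValidationType.Schema

        ' "two" is not a valid xs:unsignedByte, so validation should fail
        Dim doc As String = "<slide position=""two""/>"
        Try
            Using xr As XmlReader = XmlReader.Create(New StringReader(doc), rs)
                While xr.Read()
                    ' Reading the nodes is what drives validation
                End While
            End Using
            Console.WriteLine("Document is valid")
        Catch xsx As XmlSchemaException
            Console.WriteLine("Validation failed: " & xsx.Message)
        End Try
    End Sub
End Module
```

Note that nothing is validated until the nodes are actually read, which is why the empty While loop is needed.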
Handling XML Validation Errors and Warnings
Now any validation error will raise an XmlSchemaException when the XML is read. So you can handle this error to find out what
happened, either when loading another object with the XmlReader (for example passing it to the Load method of an XmlDocument instance), or when reading individual nodes directly. In the
example page, we've previously created a StringBuilder to hold
the results of processing the XML disk file, and it can be populated with the
validation error details like this:
Try
  While xr.Read()
    ' ... handle and display XML document content here ...
  End While
Catch xsx As XmlSchemaException
  ' document failed validation against schema so display details
  builder.Append("<p><b>ERROR validating XML document against schema:</b><br />")
  builder.Append("Message = " & xsx.Message & "<br />")
  builder.Append("LineNumber = " & xsx.LineNumber.ToString())
  builder.Append(" LinePosition = " & xsx.LinePosition.ToString() & "</p>")
...
Figure 4 shows the result of validating an XML document that
contains invalid content. This document contains the element <slide position="two">,
which is invalid because the data type defined in the schema for this element
is xs:unsignedByte.
Notice that processing of the XML document stops when the error is encountered (if
you do not tick the first checkbox in the page, it will read the XML without
validating it and you'll be able to see the values of all the nodes).

Figure 4 - Validating a Document
with an XmlReaderSettings and XmlReader Class
However, the XmlReader may also raise
other types of exception when reading the XML document, for example if the file
becomes unavailable or the input stream is disrupted. In this case, you should
also include a generic error handler section, and remember to close the XmlReader as well when you have finished using it:
...
Catch ex As Exception
  ' error reading document so display details
  builder.Append("<p><b>ERROR reading XML document:</b><br />")
  builder.Append("Message = " & ex.Message & "</p>")
Finally
  Try
    xr.Close()
  Catch
  End Try
End Try
Another approach is to use a Using construct, now available in VB.NET as well as C#, to ensure that
the reader is correctly disposed when you have finished with it. You don’t have
to remember to call Close in this case, though it's still good practice to do so. For
example:
Using xr As XmlReader = XmlReader.Create("test.xml", rs)
  ' ... use the XmlReader here ...
  ' ... still good practice to call Close when complete ...
End Using
Using a Custom Handler to Trap XML Validation Errors and Warnings
Trapping validation errors, as shown above,
is useful, but sometimes you want to handle validation errors yourself, without
having processing stop when the first one is encountered. As in version 1.x,
you can attach a custom handler to the ValidationEventHandler event of the reader (in
version 2.0, this is done via the XmlReaderSettings class); the handler is called
when any validation error is raised. In VB.NET, you can use the following to
specify the event handler named MyValidationHandler for this event:
AddHandler rs.ValidationEventHandler, AddressOf MyValidationHandler
In C#, you would use:
rs.ValidationEventHandler += MyValidationHandler;
A simple event handler is used in the example page, which
adds details of the validation error to the StringBuilder so that they can be
displayed in the page afterwards. And, because we are handling the validation
event ourselves, processing of the XML document continues when each error is
detected:
Sub MyValidationHandler(ByVal sender As Object, ByVal e As ValidationEventArgs)
  ' display error details
  builder.Append("<p><b>ValidationEventHandler detected an error:</b><br />")
  builder.Append("Message = " & e.Message & "<br />")
  builder.Append("Severity = " & e.Severity.ToString() & " ")
  ' get line number and character offset from exception
  builder.Append("LineNumber = " & e.Exception.LineNumber.ToString() & " ")
  builder.Append("LinePosition = " & e.Exception.LinePosition.ToString() & "</p>")
End Sub
By default, only validation errors are reported when you validate an XML document. However, validation can also
raise warnings that indicate a problem with the XML, but do not
necessarily mean it is invalid. A prime example is when you are reading a
fragment of XML that does not contain the matching namespace declaration. To
see these warnings, you must handle the validation event yourself, as
demonstrated in the previous section, and also turn on validation warnings by
setting the ReportValidationWarnings flag in the ValidationFlags property of the XmlReaderSettings instance before you create the XmlReader:
rs.ValidationFlags = (rs.ValidationFlags _
  Or XmlSchemaValidationFlags.ReportValidationWarnings)
Now the custom event handler can report the
validation warnings as well as validation errors. When a warning is
encountered, the value of the Severity property of the ValidationEventArgs instance passed to the event handler will be XmlSeverityType.Warning (which the example handler displays as "Warning").
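A handler can branch on the Severity value to treat warnings and errors differently. The following console sketch shows one way this might look; the schema, the sample document, the counters and the handler name OnValidation are all hypothetical and are not taken from the example page.

```vbnet
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Schema

Module ValidationHandlerDemo
    ' Hypothetical counters, kept at module level so the handler can update them
    Private errorCount As Integer = 0
    Private warningCount As Integer = 0

    Sub OnValidation(ByVal sender As Object, ByVal e As ValidationEventArgs)
        ' Severity is an XmlSeverityType value: Error or Warning
        If e.Severity = XmlSeverityType.Warning Then
            warningCount += 1
        Else
            errorCount += 1
        End If
        Console.WriteLine(e.Severity.ToString() & ": " & e.Message)
    End Sub

    Sub Main()
        ' Hypothetical schema: <slides> holds <slide> elements with a numeric position
        Dim xsd As String = _
            "<xs:schema xmlns:xs=""http://www.w3.org/2001/XMLSchema"">" & _
            "<xs:element name=""slides""><xs:complexType><xs:sequence>" & _
            "<xs:element name=""slide"" maxOccurs=""unbounded""><xs:complexType>" & _
            "<xs:attribute name=""position"" type=""xs:unsignedByte""/>" & _
            "</xs:complexType></xs:element>" & _
            "</xs:sequence></xs:complexType></xs:element>" & _
            "</xs:schema>"

        Dim ss As New XmlSchemaSet()
        ss.Add("", XmlReader.Create(New StringReader(xsd)))

        Dim rs As New XmlReaderSettings()
        rs.Schemas = ss
        rs.ValidationType = ValidationType.Schema
        ' Combine flags with Or so that existing flags are preserved
        rs.ValidationFlags = rs.ValidationFlags Or XmlSchemaValidationFlags.ReportValidationWarnings
        AddHandler rs.ValidationEventHandler, AddressOf OnValidation

        ' Two invalid position values; reading should continue past both
        Dim doc As String = "<slides><slide position=""two""/><slide position=""300""/></slides>"
        Using xr As XmlReader = XmlReader.Create(New StringReader(doc), rs)
            While xr.Read()
            End While
        End Using

        Console.WriteLine("Errors: " & errorCount.ToString() & ", warnings: " & warningCount.ToString())
    End Sub
End Module
```

Because the handler is attached, the reader does not throw on the first invalid attribute and instead reports each problem as it goes.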