Thursday, November 5, 2009

XML Validation

This is the next post in my series of XML processing hints for C#. This time we want to validate an XML file.

As a sample, assume you have an XML file books.xml as follows:

    <books xmlns="urn:books">
     <book>
      <author>It's just me</author>
      <publisheddate>2009-09-25T09:06:00</publisheddate>
     </book>
    </books>

And an XML schema (books.xsd) like this:

    <?xml version="1.0" encoding="utf-8"?>
    <xs:schema targetNamespace="urn:books" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
     <xs:element name="books">
      <xs:complexType>
       <xs:sequence>
        <xs:element name="book">
         <xs:complexType>
          <xs:sequence>
           <xs:element name="author" type="xs:string" />
           <xs:element name="publisheddate" type="xs:dateTime" />
          </xs:sequence>
         </xs:complexType>
        </xs:element>
       </xs:sequence>
      </xs:complexType>
     </xs:element>
    </xs:schema>

There is more than one way to validate the XML file.

The first way I want to show is to load the whole XML file into an XmlDocument instance:

    // load xml
    XmlDocument xmlDoc = new XmlDocument();
    xmlDoc.Load("books.xml");

Now we need the schema to validate against ...

    // load schema; rethrow any problem found in the schema itself
    XmlSchema schema;
    using (XmlReader schemaReader = XmlReader.Create("books.xsd"))
    {
       schema = XmlSchema.Read(schemaReader, (sender, args) =>
       {
          if (args.Exception != null)
             throw args.Exception;
       });
    }

Add the schema to the Schemas collection of the XmlDocument so that it is available for validation. Be careful that the target namespace of the schema matches the namespace of the document nodes.

    // add schema (its targetNamespace must equal xmlDoc.DocumentElement.NamespaceURI)
    xmlDoc.Schemas.Add(schema);
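If you want to make the namespace check explicit before validating, something like the following could do. This is only a minimal sketch: the class SchemaCheck and the method NamespaceMatches are names I made up for illustration, they are not part of the framework.

```csharp
using System;
using System.Xml;
using System.Xml.Schema;

static class SchemaCheck
{
    // Hypothetical helper: returns true if the schema's target namespace
    // equals the namespace of the document's root element.
    public static bool NamespaceMatches(XmlDocument doc, XmlSchema schema)
    {
        // TargetNamespace is null for a schema without a target namespace,
        // while NamespaceURI is an empty string for an unqualified element.
        string schemaNamespace = schema.TargetNamespace ?? string.Empty;
        return schemaNamespace == doc.DocumentElement.NamespaceURI;
    }
}
```

Calling SchemaCheck.NamespaceMatches(xmlDoc, schema) right after loading both files tells you whether the schema applies to the root element at all.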

Now the document can be validated. (I just provide a lambda expression as the ValidationEventHandler.)

    // validate document
    xmlDoc.Validate((sender, args) => { if (args.Exception != null) throw args.Exception; });

There is only one drawback: validation warnings are not passed to the ValidationEventHandler.

The other way to validate an XML file is to use an XmlReader (or one of its derived classes). This is the preferred way for large XML files, because the reader streams through the file instead of loading it into memory as a whole.

First let us initialize such a reader. To use the validation functionality of the reader, we must pass in XmlReaderSettings that configure it.

    // load schema
    XmlSchemaSet schemaSet = new XmlSchemaSet();
    schemaSet.Add("urn:books", "books.xsd");
    // configure the reader for schema validation, including warnings
    XmlReaderSettings settings = new XmlReaderSettings();
    settings.ValidationType = ValidationType.Schema;
    settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
    settings.Schemas = schemaSet;
    settings.ValidationEventHandler += (sender, args) =>
    {
       if (args.Severity == XmlSeverityType.Warning)
          Console.WriteLine("Warning: Matching schema not found. No validation occurred. " + args.Message);
       else if (args.Exception != null)
          throw args.Exception;
       else
          Console.WriteLine("\tValidation error: " + args.Message);
    };
    // create reader
    using (XmlReader reader = XmlReader.Create("books.xml", settings))
    {
       // validate: reading to the end runs the validation over the whole file
       while (reader.Read()) { }
    }

That's it. Now you receive validation errors as well as warnings.
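If you would rather collect all messages than throw on the first error, the reader approach can be wrapped as follows. Take it as a minimal sketch under my own assumptions: the class BooksValidation and its Validate method are names invented for this example, and it reads XML and schema from strings so that it can be tried without files on disk.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

static class BooksValidation
{
    // Hypothetical helper: validates the XML text against the schema text
    // and returns every reported message, prefixed with its severity.
    public static List<string> Validate(string xml, string xsd)
    {
        var messages = new List<string>();

        var schemaSet = new XmlSchemaSet();
        using (XmlReader schemaReader = XmlReader.Create(new StringReader(xsd)))
        {
            // Passing null uses the targetNamespace declared in the schema.
            schemaSet.Add(null, schemaReader);
        }

        var settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
        settings.Schemas = schemaSet;
        // Collect instead of throwing, so one run reports all problems.
        settings.ValidationEventHandler += (sender, args) =>
            messages.Add(args.Severity + ": " + args.Message);

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            // Reading to the end runs the validation over the whole document.
            while (reader.Read()) { }
        }

        return messages;
    }
}
```

For files you could call it as BooksValidation.Validate(File.ReadAllText("books.xml"), File.ReadAllText("books.xsd")). An empty list means that neither errors nor warnings were reported.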
