One of the most powerful XML Schema features is its hability to validate d

How does the XmlSerializer work? What ACL permissions does a
process using it require?

Question Posted / ketan s mayangar

2 Answers
12058 Views
I also Faced
E-Mail Answers

Answer Posted / ketan s mayangar

One of the most powerful XML Schema features is its hability
to validate documents based on element types, instead of
element names. That is no matter which element name is used
in an instance document, say Customer, customer and
CRMCustomer, as far as our XSD Schema makes them all inherit
from say CustomerDef, the document will be valid. This is
very important in interoperability scenarios, of course.
That said, one of the most versatile and performat ways to
handle XML in .NET (forget about XmlDocument) is using the
XmlSerializer class. Coupled with XSD.EXE or the technique I
exposed in a previous post, you can easily autogenerate the
classes from that schema. So far so good.
+ For the curious, here's such a schema (a trivial one here
of couse).

<?xml version="1.0" encoding="utf-8" ?>
<xs:schema id="Customers"
targetNamespace="http://www.lagash.com/schemas/customers"
elementFormDefault="qualified"
xmlns="http://www.lagash.com/schemas/customers"
xmlns:mstns="http://www.lagash.com/schemas/customers"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType name="CustomerDef">
<xs:sequence>
<xs:element name="Name" type="xs:string" />
<xs:element name="EMail" type="xs:string" />
</xs:sequence>
</xs:complexType>
<xs:element name="customer" type="CustomerDef"></xs:element>
<xs:element name="Customer" type="CustomerDef"></xs:element>
<xs:element name="CustomerCRM"
type="CustomerDef"></xs:element>
</xs:schema>

+ And here are the different instance documents.

<?xml version="1.0" encoding="utf-8" ?>
<Customer xmlns="http://www.lagash.com/schemas/customers">
<Name>Daniel Cazzulino</Name>
<EMail>customer@example.com</EMail>
</Customer>

<?xml version="1.0" encoding="utf-8" ?>
<customer xmlns="http://www.lagash.com/schemas/customers">
<Name>Daniel Cazzulino</Name>
<EMail>customer@example.com</EMail>
</customer>

<?xml version="1.0" encoding="utf-8" ?>
<CustomerCRM xmlns="http://www.lagash.com/schemas/customers">
<Name>Daniel Cazzulino</Name>
<EMail>customer@example.com</EMail>
</CustomerCRM>

However, only one of the three versions will work, the one
with the element "customer". The other versions, which are
equally valid according to the schema, and which are of the
desired type CustomerDef will fail with an exception saying
the element was not expected. As I explained while
discussing XmlSerializer speed, it creates a temporary
assembly for reading and writing the serialized version of a
type. We're interested in the reader now.
When the XSD shown above is used to generate the
XmlSerializable class, we get a class definition like the
following:

/// <remarks/>
[System.Xml.Serialization.XmlTypeAttribute(Namespace="http://www.lagash.com/schemas/customers")]
[System.Xml.Serialization.XmlRootAttribute("customer",
Namespace="http://www.lagash.com/schemas/customers",
IsNullable=false)]
public class CustomerDef
{
/// <remarks/>
public string Name;

/// <remarks/>
public string EMail;
}

From the definition above, the XmlSerializer will create the
temporary reader. The reader will contain a set of Read
methods according to those serialization attributes. Using
the technique explained in the aforemetioned post, I got the
temporary generated class. The reader contains a
Read4_customer method which is the one that tries to load
the XML. The problem is that this method uses a stored
string (taken from the serialization attributes) and
performs an element name/namespace string comparison.
Therefore, it will always fail with the other two valid root
elements.

I found a very interesting thing though, while digging
inside the generated reader. It has a method with the
signature CustomerDef Read1_CustomerDef(bool isNullable,
bool checkType) which is perfectly capable of loading the
object. However, getting this far was very difficult. First,
I had to add this temporary class to my project and make
that method public, as it's private by default, and second,
there's no "public" way of initializing this reader. You
have to call an internal Init method on the base
XmlSerializationReader class. Thanks GOD we still have
reflection to test these things!

MethodInfo m = typeof(XmlSerializationReader).GetMethod(
"Init", BindingFlags.Instance | BindingFlags.NonPublic);

using (FileStream c = new FileStream(@"C:\CustomerCRM.xml",
FileMode.Open))
{
XmlValidatingReader vr = new XmlValidatingReader(new
XmlTextReader(c));
vr.Schemas.Add(xsd);

// Create the temp. reader manually

Microsoft.Xml.Serialization.GeneratedAssembly.CustomerDefReader
cr =
new
Microsoft.Xml.Serialization.GeneratedAssembly.CustomerDefReader();

// Call Init through reflection
m.Invoke(cr, new object[] { vr, null, null, null } );

// Read with the method that checks the type
object cust = cr.Read1_CustomerDef( false, false );
Console.WriteLine(cust);
}

That method will sucessfully load any of the three versions
for the root element, either if they have the xsi:type
attribute set, in which case the Read1_CustomerDef could use
a true for the second parameter (checkType), or not. Another
method that is generated and could work is Read2_Object, if
it receives checkType=true and the instance document uses
xsi:type to specify that it's a CustomerDef instance (which
is not always possible if you're receiving the document from
a third party). Unfortunately, like I said above, the code
that calls Read1_CustomerDef, and which is the one called by
the serializer to load the XML, only checks for names:

public object Read4_customer() {
object o = null;
Reader.MoveToContent();
if (Reader.NodeType == System.Xml.XmlNodeType.Element) {
if (((object) Reader.LocalName ==
(object)id5_customer &&
(object) Reader.NamespaceURI ==
(object)id2_httpwwwlagashcomschemascustomers)) {
o = Read1_CustomerDef(false, true);
}
else {
throw CreateUnknownNodeException();
}
}
else {
UnknownNode(null);
}
return (object)o;
}

Note the very efficient use of string reference comparison,
by casting them to Object.

One way to solve this would be if the XmlRootAttribute could
be specified multiple times, so that the generated code
checks for multiple names.
The other, more XSD-compliant and certainly more flexible as
it wouldn't require regeneration of the serializable class
(CustomerDef in this case) to reflect new element names,
would be to check if the current Reader is actually an
XmlValidatingReader and read the customer if the type
matches. The previous code can be modified as follows to
make this work:

public object Read4_customer() {
object o = null;
Reader.MoveToContent();
if (Reader.NodeType == System.Xml.XmlNodeType.Element) {

// Check for validating reader with schema type determined
if (Reader is System.Xml.XmlValidatingReader &&
((System.Xml.XmlValidatingReader) Reader).SchemaType
!= null) {
System.Xml.Schema.XmlSchemaType type =
(System.Xml.Schema.XmlSchemaType)
((System.Xml.XmlValidatingReader) Reader).SchemaType;
// We would have to check the inheritance chain too.
if (((object) type.Name == (object)id1_CustomerDef &&
(object) type.QualifiedName.Namespace ==
(object)id2_httpwwwlagashcomschemascustomers)) {
o = Read1_CustomerDef(false, true);
}
}

else if (((object) Reader.LocalName ==
(object)id5_customer &&
(object) Reader.NamespaceURI ==
(object)id2_httpwwwlagashcomschemascustomers)) {
o = Read1_CustomerDef(false, true);
}
else {
throw CreateUnknownNodeException();
}
}
else {
UnknownNode(null);
}
return (object)o;
}

Of course this would require a validating reader with the
appropriate schema loaded, by why would you create an XSD
otherwise? Would you loose all those powerful validation
capabilities and instead use it only to save you some lines
of class definition code and Xml serialization attributes?
If you do, I urge you to think twice, you're really missing
something that can greatly improve your code (no more
validation of ranges, patterns, etc.).

Another interesting thing I found is that the XmlSerializer
can be inherited, and it has a protected method that allows
you to deserialize directly from an XmlSerializationReader.
That would have solve my previous reflection problems, as I
could simply inherit the serializer and make a public method
receiving my modified reader and pass it through to it. That
would even make for maybe more efficient custom
deserialization, for example one where the temporary
assemblies are not generated each time the application
starts but rather stored in a permanent location for reuse
across AppDomains (maybe a DB?). Remember the initial
generation and compilation performance hit is significant.
Unfortunately, that method implementation throws a
NotImplementedException :((((((. But that makes me wonder if
in Whidbey it's possible.... :)

In a future post, maybe here, maybe on MSDN online, I will
explain how to take advantage of IXmlSerializable interface
to implement custom serialization but also to gain automatic
XSD validation right from your assembly-embedded schema.

Is This Answer Correct ?

3 Yes

1 No

Post New Answer View All Answers

Please Help Members By Posting Answers For Below Questions

How do you change the session time-out value?

579

Disable Mouse right click on web page in asp.net?

580

What is windows active directory authentication?

567

How do I use a proxy server when invoking a web service?

525

Define application state variable and session state variable?

555

Can asp.net work on an nt server?

553

Which method is used to perform all validation at the page level?

532

Describe the diffeerence between inline and code behind?

507

What is active web page?

562

What is the significance of ASP.NET routing?

632

What is datagrid asp.net?

495

What is query string in asp.net?

559

How use Xml data in ASP.net?

557

How to sign out from forms authentication?

541

What are app services?

528