Typed XML table columns in SQL Server 2005 are columns that
have been associated with an XML schema and untyped XML table columns are those
that have no XML schema associated with them. An XML schema provides the
following:
·
Validation constraints
·
Data type information about the instance data
An untyped XML
table column is best suited to save XML data for XML schemas that are not known
in advance or in cases where an XML schema is rejected by SQL Server (see Limitations of XML schema collections later on in this
article).
You can use XML schema collections in SQL Server to define
XML schemas. XML schema collections must be created first before you can
associate them with XML table columns. You can use the following T-SQL syntax
to create an XML schema collection:
CREATE XML SCHEMA COLLECTION [<relational_schema>.]sql_identifier AS Expression
For example,
CREATE XML SCHEMA COLLECTION IPFormSchemaCollection AS
'<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<xsd:schema targetNamespace="<span class=Italic>namespace_of_form</span>">
...
</xsd:schema>'
creates an XML schema collection for an InfoPath form.
For more information on creating XML schema collections see CREATE XML SCHEMA
COLLECTION (Transact-SQL) in SQL Server 2005 Books Online.
Saving forms with different XML schemas to one table
column
You can define more than one XML schema in one XML schema
collection. Therefore, you can also save multiple forms with different XML
schemas in one XML table column by creating an XML schema collection that
consists of the XML schemas for multiple InfoPath forms and associating this
XML schema collection with the XML table column to save the forms in.
The following sample T-SQL statement defines the XML schemas
for two different InfoPath forms in one XML schema collection:
CREATE XML SCHEMA COLLECTION IPFormSchemaCollection AS
'<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<xsd:schema targetNamespace="<span class=Italic>namespace_first_form</span>">
...
</xsd:schema>
<xsd:schema targetNamespace="<span class=Italic>namespace_second_form</span>">
...
</xsd:schema>'
Limitations of XML schema collections
XML schema collections have a few known limitations which
you can read more about in Guidelines and
Limitations in Using XML Schema Collections on the Server. This means that
you cannot just copy and paste the XML schema of your InfoPath form to create an
XML schema collection for it, but that you will have to modify it in such a way
that SQL Server will accept it. Please note that these modifications are only
required for the XML schema you will be using to create the XML schema
collection in SQL Server and not for the XML
schema of your InfoPath form itself. The latter may remain intact. Any
modifications you apply to the schema should ensure that the XML data of filled
out InfoPath forms conforms to both the XML schema collection in SQL Server as
well as the original XML schema that was defined for your InfoPath form. In
simple terms: One XML document should "fit" two XML schemas.
Alternatively, you could create an XML schema that is accepted by SQL Server
and then base your InfoPath form on this schema, which would result in only one
schema being used for both SQL Server and InfoPath.
The form in Figure 1 has two controls that are likely to
cause XML validation errors in SQL Server:
1.
The Date Picker control
2.
The Rich Text Box control
In the next 3 sections we will take a closer look at how you
can modify the XML schema in Listing 1 so that it passes validation within SQL
Server 2005, while XML data that is derived from this schema still adheres to the
original XML schema of the InfoPath form.
The processContents attribute
The first error message that might appear when you try to
create an XML schema collection using the XML schema as shown in Listing 1 is:
The XML Schema syntax
'processContents="lax"' is not supported.
Since SQL Server does not support lax
validation for elements of the anyType type, strict or skip validation should be
used instead of lax. The Rich Text
Box control in InfoPath uses lax validation, because
the elements contained within a field representing a Rich
Text Box control cannot be defined in advance. Therefore, if you are
using a Rich Text Box control on your form, set its processContents attribute in the XML schema to skip. Modify the XML schema in Listing 1 so that
<xsd:anyAttribute processContents="lax" namespace="http://www.w3.org/XML/1998/namespace"/>
becomes
<xsd:anyAttribute processContents="strict" namespace="http://www.w3.org/XML/1998/namespace"/>
and
<xsd:any minOccurs="0" maxOccurs="unbounded" namespace="http://www.w3.org/1999/xhtml" processContents="lax"/>
becomes
<xsd:any minOccurs="0" maxOccurs="unbounded" namespace="http://www.w3.org/1999/xhtml" processContents="skip"/>
The XML document encoding
XML documents can be encoded using encodings such as UTF-8,
UTF-16, etc. SQL Server 2005 stores data in Unicode (UTF-16) by default. If XML
encoding information is specified using the encoding
attribute in the XML declaration, this encoding must be compatible with the
string's code page. For example, XML data that has a UTF-16 encoding should be
passed in as an nvarchar string, while XML data that
has a UTF-8 encoding should be passed in as a varchar
string. InfoPath forms use UTF-8 encoding, so you must use a varchar
string when creating the XML schema collection for your InfoPath form.
Prefixing a string with an N defines
it as a Unicode string. For example, N'Double-byte'
defines the text "Double-byte" as an nvarchar
string, while 'Single-byte' defines the text "Single-byte"
as a varchar string. If you use a Unicode string when
creating the XML schema collection for your InfoPath form, you will receive the
following error:
XML parsing: line ..., character ...,
unable to switch the encoding.
This is because UTF-16 encoded strings are inconsistent with
the encoding defined in the XML schema for InfoPath forms (UTF-8). You can get
around this by not using the N prefix when creating the
XML schema collection.
Date and dateTime elements
While the XML parser might not complain when you create an
XML schema collection that contains an XML element with a date
or dateTime type, it will complain when you try to save
date/time strings produced by InfoPath into such elements. InfoPath's date/time
strings have the format YYYY-MM-DD or YYYY-MM-DDThh:nn:ss and do not contain time zone
information, so are rejected by SQL Server.
You can get around this by changing the data type for the
XML element that represents a Date Picker control to string and use a regular expression to add a restriction on
the format of text that may be entered into this field. So you must modify the XML
schema for the Date Picker from
<xsd:element name="field1" nillable="true" type="xsd:date"/>
into
<xsd:element name="field1">
<xsd:simpleType>
<xsd:restriction base="xsd:string">
<xsd:maxLength value="10" />
<xsd:pattern value="\d\d\d\d-\d\d-\d\d" />
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
There are several more limitations to the XML schema
collection database object which have not been discussed here and are beyond
the scope of this article, but you are encouraged to read the Guidelines and
Limitations in Using XML Schema Collections on the Server to find
workarounds whenever possible or advised to use untyped XML table columns when
workarounds cannot be found.