CQRS Archetecture

FOR XML PATH and CQRS

In a recent twitter exchange with Colin Jack I claimed that SELECT {…} FROM […] FOR XML PATH() is pretty much all you need for providing the Reporting side of CQRS.  Ok, so maybe I overstated it a tad, but it is a valuable technique.  He suggested I blog about it, so I figured even if people don’t use it for CQRS, some of the neat things you can do with FOR XML PATH may be interesting to a few people.

CQRS Archetecture

Command Query Responsibility Separation(CQRS) is an interesting architecture that is suitable for highly scalable applications with complex business logic.  Instead of me doing a horrible job of trying to explain it, here are links to people who know what they are talking about.

Grey Young

UDI Dahan

The reporting side of CQRS requires a simple mechanism for getting read-only data from a server down to the presentation tier with minimal effort.  I am suggesting the use of XML as a wire format and I am going to make the assumption that we have chosen a relational database as the store for the reporting database. More specifically, we require Microsoft SQL Server because, from what I can find, MS SQL Server is the only database engine that supports this level of flexibility when generating XML from relational tables.  

Ideally we should be able to create any desired structure of XML based on data in our relational database.  We need the flexibility to generate data as attributes or elements, create nested structures, use namespaces or not, at a minimum.

I’m going to start showing some of the basics of the FOR XML PATH syntax and build on it until we are creating relatively complex xml documents in a single SQL query.

This is pretty much the simplest query that you can create:

SELECT Code AS 'Code',
       Name AS 'Name'
FROM tblCustomer
FOR XML PATH('Customer')

It produces XML that looks like this:

<Customer>
    <Code>CUSTOMER-ONE</Code>
    <Name>Hypro Networks Inc.</Name>
    </Customer>
<Customer>
    <Code>CUSTOMER-A</Code>
    <Name>Customer A</Name>
</Customer>

The only problem with this is that it is not a valid XML document as there is more than one root element.  To create a proper XML document, you need to do:

SELECT Code AS 'Code',  
        Name AS 'Name' 
FROM tblCustomer 
FOR XML PATH('Customer'), ROOT('Customers')

which then gives you,

<Customers>  
  <Customer> 
    <Code>CUSTOMER-ONE</Code> 
    <Name>Hypro Networks Inc.</Name> 
  </Customer>
  <Customer> 
    <Code>CUSTOMER-A</Code> 
    <Name>Customer A</Name> 
  </Customer> 
</Customers>

These examples showed how to create data as XML elements, but sometimes it is desirable to use attributes.  Attributes take quite a bit less space and suitable for most content that is small and does not contain significant whitespace.

SELECT Code AS '@Code', 
        Name AS '@Name'
FROM tblCustomer
FOR XML PATH('Customer'), ROOT('Customers')

By adding an @ symbol to the front of the column name, the data will be output as attributes:

<Customers> 
  <Customer Code="CUSTOMER-ONE" Name="Hypro Networks Inc." />
  <Customer Code="CUSTOMER-A" Name="Customer A" />
</Customers>

One of the things that I like about using XML is that it is easy to group related pieces of data into other elements.  The following query demonstrates how to do this:

SELECT Code AS '@Code', 
        Name AS '@Name',
        Tel1 AS 'Contact/@Tel',
        Fax1 AS 'Contact/@Fax1'
FROM tblCustomer
FOR XML PATH('Customer'), ROOT('Customers')

Adding the slash into the column name directs SQL Server to create a new element to contain other nodes.

<Customers> 
  <Customer Code="CUSTOMER-ONE" Name="Hypro Networks Inc.">
    <Contact Tel="1-993-345-4146" Fax1="1-993-345-4140" />
  </Customer>
  <Customer Code="CUSTOMER-A" Name="Customer A">
    <Contact Tel="872-494-2000" Fax1="872-494-3008" />
  </Customer>
</Customers>

Just one small warning, columns must be adjacent in the SQL query to exist under the same child element and all attributes of the parent element must be specified in the SQL query before the columns that will be in a child element.

So far we have only considered data from a single table.  Using a simple SQL join it is easy to pull in data from other tables, but we can also create parent/child documents.

SELECT  Code AS '@Code', 
        Invoice_Date AS '@Date', 
        (SELECT ii.qty AS 'Quantity', 
                LEFT(ii.description,20) AS 'Description' 
         FROM tblinvoitem ii WHERE ii.invoice_id = iv.id FOR XML PATH('Item'),type) 
FROM tblInvoice iv 
FOR XML PATH('Invoice'), ROOT('Invoices')
<Invoices> 
  <Invoice Code="21201     " Date="2005-05-26T00:00:00"> 
    <Item> 
      <Quantity>1.0000</Quantity> 
      <Description>To supply material a</Description> 
    </Item> 
  </Invoice> 
  <Invoice Code="21200     " Date="2005-05-18T00:00:00"> 
    <Item> 
      <Quantity>25.0000</Quantity> 
      <Description>S.S. Control Box Cov</Description> 
    </Item> 
    <Item> 
      <Quantity>0.0000</Quantity> 
      <Description>S.S. Main Control Bo</Description> 
    </Item> 
  </Invoice> 
</Invoices>

The extra “type” parameter on the end of the subquery indicates to SQL that we want to embed the contents of the sub query as XML rather than just an escaped string.

One limitation that you will probably run into once you create this type of document is that there is no obvious way to insert attributes and elements into the root node.  This can be overcome with a small trick:

SELECT '1.0' AS '@Version', 
        'List of Invoices' AS 'Summary', 
( SELECT  Code AS '@Code', 
        Invoice_Date AS '@Date', 
        (SELECT ii.qty AS 'Quantity', 
                LEFT(ii.description,20) AS 'Description' 
         FROM tblinvoitem ii WHERE ii.invoice_id = iv.id FOR XML PATH('Item'),type) 
FROM tblInvoice iv 
FOR XML PATH('Invoice'),type) 
FOR XML PATH('Invoices')

Using an outer query you can specify the attributes and elements that you want to appear in the root.  Note that there is no alias for the sub query column. 

<Invoices Version="1.0"> 
  <Summary>List of Invoices</Summary> 
  <Invoice Code="21201     " Date="2005-05-26T00:00:00"> 
    <Item> 
      <Quantity>1.0000</Quantity> 
      <Description>To supply material a</Description> 
    </Item> 
  </Invoice> 
  <Invoice Code="21200     " Date="2005-05-18T00:00:00"> 
    <Item> 
      <Quantity>25.0000</Quantity> 
      <Description>S.S. Control Box Cov</Description> 
    </Item> 
    <Item> 
      <Quantity>0.0000</Quantity> 
      <Description>S.S. Main Control Bo</Description> 
    </Item> 
  </Invoice> 
</Invoices>

And of course, no discussion on XML is complete without discussing namespaces.  Here is how you can add namespaces to the XML result:

WITH XMLNAMESPACES(DEFAULT 'http://example.org/Invoices') 
SELECT '1.0' AS '@Version', 
        'List of Invoices' AS 'Summary', 
( SELECT  Code AS '@Code', 
        Invoice_Date AS '@Date', 
        (SELECT ii.qty AS 'Quantity', 
                LEFT(ii.description,20) AS 'Description' 
         FROM tblinvoitem ii WHERE ii.invoice_id = iv.id FOR XML PATH('Item'),type) 
FROM tblInvoice iv 
FOR XML PATH('Invoice'),type) 
FOR XML PATH('Invoices')

I’m not going to show you the output of this, because it is ugly!  Actually, the annoying part is that it puts the namespace on the root and then on each row of the sub query. 

Hopefully these examples have shown some of the capability of FOR XML PATH but I find that sometimes technologies work well while you are working on arbitrary examples, but then you hit the real world and they seem to fall short.  So, I decided to try and create an Atom feed using FOR XML PATH.  Here is what I came up with.

WITH XMLNAMESPACES(DEFAULT 'http://www.w3.org/2005/Atom') 
SELECT 'Customers' AS 'title',
        'List of customers' AS 'subtitle',
        'http://tavis.net/Customers' AS 'id',
        REPLACE(CONVERT(varchar,(SELECT MAX(COALESCE(ModifiedDate,CreatedDate)) FROM tblCustomer), 121),' ','T') + 'Z' AS 'updated',
    (SELECT cu.Code AS 'title',
            REPLACE(CONVERT(varchar,COALESCE(ModifiedDate,CreatedDate) , 121),' ','T') + 'Z' AS 'updated',
            'Darrel Miller' AS 'author/name',
            'http://tavis.net/Customer/' + LTRIM(STR(cu.ID)) AS 'id',
            (SELECT * 
                FROM (SELECT 'alternate' AS '@rel',
                    'http://tavis.net/Customer/1.html' AS '@href'
              UNION
              SELECT 'edit' AS '@rel',
                    'http://tavis.net/Customer/1.html' AS '@href') l
             FOR XML PATH('link'),type ),
            cu.ModifiedDate AS 'updated',
            cu.Name AS 'summary'    
        FROM tblCustomer cu
        FOR XML PATH('entry'),type  ) 
FOR XML PATH('feed')

Ok, so it is not pretty, but with a few T-SQL functions it could be cleaned up quite a bit.  This query produces an atom feed that validates at validator.w3.org/feed and it does it in 8ms on my cheapo server with 20 entries .

The trickiest part of the above query was creating the two link elements.  When you try and create two child elements with the same name, by default SQL Server will attempt to merge the two elements.  By using the UNION and a sub query I was able to create the two separate elements.

So you really can create real world XML documents using FOR XML PATH. I believe this is a useful tool to have under your belt when you are trying to quickly get data out of a database down to a client tier.  It would also be really nice if other database vendors picked up on this feature and implemented in their engines.  One area that I didn’t cover but that is also very useful is that you can create T-SQL Functions that return XML and then call them recursively.  This allows you to build hierarchical XML documents.   My tests so far have also shown that building trees this way is very quick.

I’m hoping to do a follow up article that shows how you can use the XML from FOR XML PATH as input to XSLT to do all sorts of other interesting things like create audit triggers, data import scripts, create HTML pages, XAML  and JSON documents. 

Related Blog