When it comes to system integration, Microsoft provides such a plethora of options that it is far too easy to forget about some of them or to overlook their practical application. This is particularly true of the SQL Server BI stack, which is so huge in scope.
Our need was to populate a relational database using the results of an MDX query so we could do some data reconciliation to prove the new cube provided the results we wanted. Given my skills in C#, my immediate thought was to write a simple console application which performed an MDX query and populated a database table using SQL Bulk Insert. However, this option would not be easy for the rest of the team to support, as they do not have the relevant C# skills or experience of ADOMD.NET.
Vincent Rainardi (author of Building a Data Warehouse: with examples in SQL Server) pointed me in the direction of OpenQuery and linked servers. This is a remarkably simple and flexible solution to a great many system integration problems, so I thought I had better share it with a wider audience.
OpenQuery and OpenRowSet both execute pass-through queries against a specific server. The 'query string' to be executed is not evaluated by the local server but simply 'passed through' to the target server, which is expected to return a row set. The 'query string' can be anything, including MDX, so long as it is understood by the target server. Well, this works a treat with MDX and Analysis Services.
To use this technique, you will first need to create a linked SSAS server using either SQL Server Management Studio or in script using the stored procedure master.dbo.sp_addlinkedserver. Here is an example:
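A minimal sketch of such a call follows. The linked server name is spelled exactly as it appears in the OPENQUERY call later in this post; the data source `localhost` and the catalog name `Adventure Works DW` are assumptions, so substitute your own Analysis Services instance and database.

```sql
-- Sketch only: registers an Analysis Services database as a linked server.
-- Assumptions: SSAS runs on localhost and the database is called
-- 'Adventure Works DW'; adjust both for your environment.
EXEC master.dbo.sp_addlinkedserver
    @server     = N'AdvertureWorksServer', -- name later passed to OPENQUERY
    @srvproduct = N'',                     -- left empty for the OLAP provider
    @provider   = N'MSOLAP',               -- Analysis Services OLE DB provider
    @datasrc    = N'localhost',            -- assumed SSAS instance
    @catalog    = N'Adventure Works DW';   -- assumed SSAS database
```

Depending on how security is configured, you may also need to map logins for the linked server, for example with sp_addlinkedsrvlogin.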
Once linked, you can then query your cube using MDX and combine the results with the content of a SQL Server table. Alternatively you can simply insert the results of the MDX query into a database table. For example, the following screenshot shows the MDX executed in SQL Server Management Studio:
Whereas the following screenshot shows the results of the equivalent OpenQuery:
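As a rough sketch of what such a pass-through call looks like, the query below wraps a small MDX statement in OPENQUERY. The cube name [Adventure Works], the two measures and the [Customer].[State-Province] attribute are assumptions taken from the Adventure Works sample, not necessarily the exact query behind the screenshots.

```sql
-- Sketch only: the MDX string is passed untouched to Analysis Services.
-- Assumed names: linked server AdvertureWorksServer, cube [Adventure Works],
-- and the measures/attribute from the Adventure Works sample database.
SELECT *
FROM OPENQUERY(AdvertureWorksServer,
    'SELECT
        { [Measures].[Internet Sales Amount],
          [Measures].[Internet Gross Profit] } ON 0,
        [Customer].[State-Province].[State-Province].MEMBERS ON 1
     FROM [Adventure Works]');
```

Prefixing the same SELECT with INSERT INTO and a target table would load the MDX results straight into that table.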
Note how the column names contain the unique name of the attribute. Before showing how to deal with these, let's just look at one surprising element of the functionality offered by this technique.
If you use a hierarchy in your query, then you will get extra columns describing the values of each hierarchy level. For example, if we change the above query to use the [Customer].[Customer Geography] hierarchy then you get an extra column in the query results describing the [Customer].[Customer Geography].[Country] level.
This is even more dramatic when using a parent-child hierarchy; a query against the Accounts dimension will bring back all six levels.
Of course, you may find the attribute unique names a little long and cumbersome to use in your T-SQL. The easiest way to avoid them is to give each column an alias. However, you still need to type that lengthy name in the first place.
The easy way around this is to copy the query results into Excel and then obtain the column header text from there. However, by default, the copying of column headers is switched off, so you may want to switch on the "Include column headers when copying or saving results" option in SQL Server Management Studio, which can be found under Tools->Options->Query Results->SQL Server->Results to Grid. Here is a screenshot to make it easier:
As you will know, the square bracket in T-SQL is used as a delimiter in much the same way as it is in MDX. So simply typing:
[Customer].[Customer Geography].[Country].[MEMBER_CAPTION] AS Country
will cause a "multi-part identifier could not be bound" error. To avoid this, use double quotes around the attribute unique name as follows:
"[Customer].[Customer Geography].[Country].[MEMBER_CAPTION]" AS Country,
The problem with OpenQuery is that every column is returned as a string, including the numerical columns. This means that the data type must be changed if you want to perform numerical computations on the figures using T-SQL. So our two numerical columns need to be wrapped in CAST and CONVERT as follows:
CAST(CONVERT(float, "[Measures].[Internet Gross Profit]") AS money) AS GrossProfit
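Putting the alias and the casts together, a complete query might look like the sketch below. The cube name [Adventure Works] and the [Internet Sales Amount] measure are assumptions from the Adventure Works sample; the column names follow the pattern shown above. Double-quoted identifiers rely on QUOTED_IDENTIFIER being ON, which is the default in SQL Server Management Studio.

```sql
-- Sketch only: aliases the long OPENQUERY column names and converts the
-- string results to money. Assumed names: cube [Adventure Works] and the
-- [Internet Sales Amount] measure.
SET QUOTED_IDENTIFIER ON; -- required for the double-quoted column names

SELECT
    "[Customer].[Customer Geography].[Country].[MEMBER_CAPTION]" AS Country,
    CAST(CONVERT(float, "[Measures].[Internet Sales Amount]") AS money) AS SalesAmount,
    CAST(CONVERT(float, "[Measures].[Internet Gross Profit]") AS money) AS GrossProfit
FROM OPENQUERY(AdvertureWorksServer,
    'SELECT
        { [Measures].[Internet Sales Amount],
          [Measures].[Internet Gross Profit] } ON 0,
        [Customer].[Customer Geography].[Country].MEMBERS ON 1
     FROM [Adventure Works]');
```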
In the case where you are creating a management report that combines the profitability figures of each division with the relevant manager's comments as to why the figures have (or have not) been achieved, the use of OpenQuery becomes invaluable. This is because the only way to handle such a situation in the cube would be to create another dimension which links at the relevant level of granularity and enable cube write-back so the manager can update dimensional properties. A lot of hassle for a few pieces of textual information!
With OpenQuery, this becomes trivial as the text can be held in a supplementary database table and combined with the results of the MDX query.
Now the problem you may have spotted is that OpenQuery returns the MEMBER_CAPTION which will probably not be sufficiently unique to match an entry in the relational database. What you really need is the MEMBER_KEY. This can be added as a column using a calculated member as follows:
WITH MEMBER [Measures].[OrganisationID] AS [Organization].[Organizations].MEMBER_KEY
As the Adventure Works database does not contain a 'comments' table, let's create one and insert some rows using the new multi-row INSERT feature in SQL Server 2008.
CREATE TABLE dbo.ManagersComments(
    OrganisationID int NOT NULL,
    [Manager's Name] nvarchar(50) NULL,
    [Manager's Comments] nvarchar(255) NULL
);
INSERT INTO ManagersComments(OrganisationID, [Manager's Name], [Manager's Comments])
VALUES (6, 'John Doe', 'My team failed again. You know how it is...'),
       (5, 'John Smith', 'My team are all stars.');
So let's bring MDX and SQL data together into one query. This is best achieved with a common table expression as the layout can be far more readable:
WITH MdxQuery
(
    [Region],
    OrganisationID,
    [Net Sales],
    [Cost of Sales],
    [Gross Margin],
    [Gross Margin%]
)
AS
(
    SELECT
        "[Organization].[Organizations].[Organization Level 04].[MEMBER_CAPTION]" AS [Region],
        CONVERT(int, "[Measures].[OrganisationID]") AS OrganisationID,
        CAST(CONVERT(float, "[Measures].[Net Sales]") AS money) AS [Net Sales],
        --"[Measures].[Net Sales]" AS [Net Sales],
        CAST(CONVERT(float, "[Measures].[Cost of Sales]") AS money) AS [Cost of Sales],
        CAST(CONVERT(float, "[Measures].[Gross Margin]") AS money) AS [Gross Margin],
        CONVERT(decimal(4, 2), (CONVERT(decimal(19, 17), "[Measures].[Gross Margin%]") * 100)) AS [Gross Margin%]
    FROM OPENQUERY(AdvertureWorksServer,
        'WITH MEMBER [Measures].[OrganisationID] AS
            [Organization].[Organizations].MEMBER_KEY
        MEMBER [Measures].[Net Sales] AS
            ([Account].[Accounts].&[50], [Measures].[Amount])
        MEMBER [Measures].[Cost of Sales] AS
            ([Account].[Accounts].&[55], [Measures].[Amount])
        MEMBER [Measures].[Gross Margin] AS
            ([Account].[Accounts].&[49], [Measures].[Amount])
        MEMBER [Measures].[Gross Margin%] AS
            [Measures].[Gross Margin] / [Measures].[Net Sales]
        SELECT
        {
            [Measures].[OrganisationID], [Measures].[Net Sales], [Measures].[Cost of Sales],
            [Measures].[Gross Margin], [Measures].[Gross Margin%]
        } ON 0,
        {
            [Organization].[Organizations].&[14].Children
        }
        ON 1
        FROM [Finance]
        WHERE ([Date].[Fiscal].[Fiscal Quarter].&[2004]&[1])'
    )
),
CombineMdxWithTable (
    [Region],
    [Manager's Name],
    [Net Sales],
    [Cost of Sales],
    [Gross Margin],
    [Gross Margin%],
    [Manager's Comments]
)
AS
(
    SELECT
        A.[Region],
        B.[Manager's Name],
        A.[Net Sales],
        A.[Cost of Sales],
        A.[Gross Margin],
        A.[Gross Margin%],
        B.[Manager's Comments]
    FROM MdxQuery A
    LEFT JOIN dbo.ManagersComments B ON
        A.OrganisationID = B.OrganisationID
)
SELECT * FROM CombineMdxWithTable
Here are the results of our query:
What I found most surprising about the functionality offered by OpenQuery is that it will deal with extra dimensions/axes sensibly. Of course, any multi-dimensional query can always be transformed into a two-dimensional recordset: each cell simply becomes a row in the recordset, with one column for each dimension. So, for example, a four-dimensional MDX query returning two measures can easily be transformed into a two-dimensional recordset containing six columns. Each row represents a cell and has two numerical columns.
Of course, if you try adding more than two axes to an MDX query in SQL Server Management Studio, this is the resulting error you will see:
Results cannot be displayed for cellsets with more than two axes.
Given that OpenQuery does such a good job of flattening the resultset into two dimensions, SQL Server Management Studio could really make more of an effort!
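As an illustration of the flattening described above, here is a sketch of a three-axis MDX query sent through OPENQUERY. The cube, measures and attributes are assumptions from the Adventure Works sample. Going by the behaviour described in this post, the members on axes 1 and 2 should come back as caption columns alongside the two measure columns.

```sql
-- Sketch only: a three-axis MDX query, which Management Studio refuses to
-- display directly, passed through a linked server instead.
-- Assumed names: cube [Adventure Works] and the attributes/measures below.
SELECT *
FROM OPENQUERY(AdvertureWorksServer,
    'SELECT
        { [Measures].[Internet Sales Amount],
          [Measures].[Internet Gross Profit] } ON 0,
        [Customer].[Country].[Country].MEMBERS ON 1,
        [Customer].[State-Province].[State-Province].MEMBERS ON 2
     FROM [Adventure Works]');
```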
In reality, similar results are returned by a two-dimensional MDX query where the rows axis is a CrossJoin of all the relevant dimensions. It also has the advantage that the query would be able to take advantage of the server's AutoExists performance optimization and therefore fewer rows will be returned. However, it does not sound quite so sexy as a true multi-dimensional query. Ah, well. Technology cannot always be sexy!
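The two-dimensional alternative can be sketched as follows, again with assumed Adventure Works names. Both attributes here belong to the Customer dimension, which is the situation where auto-exists should restrict the rows to country and state combinations that actually exist.

```sql
-- Sketch only: the same members placed on a single rows axis via CROSSJOIN.
-- Assumed names: cube [Adventure Works] and the attributes/measures below.
SELECT *
FROM OPENQUERY(AdvertureWorksServer,
    'SELECT
        { [Measures].[Internet Sales Amount],
          [Measures].[Internet Gross Profit] } ON 0,
        CROSSJOIN(
            [Customer].[Country].[Country].MEMBERS,
            [Customer].[State-Province].[State-Province].MEMBERS
        ) ON 1
     FROM [Adventure Works]');
```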