The Multidimensional Expressions (MDX) syntax is similar to the syntax of Structured Query Language (SQL). In many ways, the functionality supplied by MDX is also similar to that of SQL; with effort, you can even duplicate some of the functionality provided by MDX in SQL.
However, there are some striking differences between SQL and MDX. Here we provide a guide to these conceptual differences between SQL and MDX, from the perspective of an SQL developer.
The principal difference between SQL and MDX is the ability of MDX to reference multiple dimensions. Although it is possible to use SQL exclusively to query cubes in SQL Server 2000, MDX provides commands that are designed specifically to retrieve data as multidimensional data structures with almost any number of dimensions.
SQL refers to only two dimensions when processing queries: columns and rows. Because SQL was designed to handle only two-dimensional tabular data, the terms "column" and "row" have meaning in SQL syntax.
MDX, in comparison, can process one, two, three, or more dimensions in queries. Each dimension in a query is referred to as an axis. The terms "column" and "row" in MDX are used only as aliases for the first two axis dimensions in an MDX query. The next three axis dimensions also have aliases (PAGES, SECTIONS, and CHAPTERS), but none of these aliases holds any special meaning to MDX. MDX supports such aliases for display purposes, because many OLAP tools are incapable of displaying a result set with more than two dimensions. MDX can even process "zero-axis" queries that return only one cell from a cube, determined by a tuple constructed from the default member of each dimension in the cube.
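As a rough illustration of a query with more than two axes, the following sketch places a third set of members on the PAGES alias. It reuses the Sales cube and the member names from the SELECT example shown later in this topic; the [Store].[USA].[WA] member is an assumption and may not exist in your cube.

```mdx
-- Sketch only: three axis dimensions, addressed by the aliases
-- COLUMNS, ROWS, and PAGES.
-- [Store].[USA].[WA] is an assumed member name.
SELECT
  { [Measures].[Unit Sales], [Measures].[Store Sales] } ON COLUMNS,
  { [Time].[1997], [Time].[1998] } ON ROWS,
  { [Store].[USA].[CA], [Store].[USA].[WA] } ON PAGES
FROM Sales
```

A client tool that can display only two axes may be unable to show this result, which is the limitation on display tools noted above.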
In SQL, the SELECT clause is used to define the column layout for a query, and the WHERE clause is used to define the row layout. However, in MDX the SELECT clause can be used to define several axis dimensions, while the WHERE clause is used to restrict multidimensional data to a specific dimension or member.
In SQL, the WHERE clause is used to filter the data returned by a query. In MDX, the WHERE clause is used to provide a slice of the data returned by a query. While the two concepts are similar, they are not equivalent.
In an SQL query, the WHERE clause contains an arbitrary list of conditions that determine which items should (or should not) be returned in the result set. Although a long list of conditions in the filter can narrow the scope of the data that is retrieved, there is no requirement that the elements in the clause produce a clear and concise subset of data.
In MDX, however, the concept of a slice means that each member in the WHERE clause identifies a distinct portion of data from a different dimension. Because of the organizational structure of multidimensional data, it is not possible to request a slice for multiple members of the same dimension. As a result, the WHERE clause in MDX can provide a clear and concise subset of data.
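The slice described above can be sketched as a single tuple in the WHERE clause, with one member taken from each of two different dimensions. The Sales cube, the measures, and the Time and Store members come from the example later in this topic; the Product dimension and its [Food] member are assumptions made for illustration.

```mdx
-- Sketch only: the slicer is one tuple whose members come from
-- two different dimensions (Store and Product).
-- [Product].[Food] is an assumed member name.
SELECT
  { [Measures].[Unit Sales] } ON COLUMNS,
  { [Time].[1997], [Time].[1998] } ON ROWS
FROM Sales
WHERE ( [Store].[USA].[CA], [Product].[Food] )
```

By contrast, a tuple that names two members of the Store dimension, such as ( [Store].[USA].[CA], [Store].[USA].[WA] ), is not a valid tuple and therefore cannot be used as a slice.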
The process of creating an SQL query is also different from that of creating an MDX query. The creator of an SQL query visualizes and defines the structure of a two-dimensional rowset and writes a query on one or more tables to populate that rowset. In contrast, the creator of an MDX query usually visualizes and defines the structure of a multidimensional dataset and writes a query on a single cube to populate it. This could result in a multidimensional dataset with any number of dimensions; a one-dimensional dataset is possible, for example.
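A one-dimensional dataset of the kind mentioned above might be requested as follows. This is a minimal sketch that assumes the Sales cube and the measure names used in the example later in this topic.

```mdx
-- Sketch only: a single axis dimension, so the result is a
-- one-dimensional dataset with one cell per measure.
SELECT
  { [Measures].[Unit Sales], [Measures].[Store Sales] } ON COLUMNS
FROM Sales
```

Because no other dimension is named, each cell is evaluated at the default member of every remaining dimension in the cube.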
The visualization of an SQL result set is intuitive: the set is a two-dimensional grid of columns and rows. The visualization of an MDX result set is not necessarily intuitive, however. Because a multidimensional result set can have more than three dimensions, it can be challenging to visualize its structure. The two languages also address data differently. In SQL, the name of a column and the unique identification of a row are used together to refer to a single cell of two-dimensional data, called a field. MDX, in contrast, uses a very specific and uniform syntax to refer to cells of data, whether the data forms a single cell or a group of cells.
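To make the uniform cell syntax concrete, the sketch below identifies one cell with a tuple, written as a comma-separated list of members in parentheses. It assumes the Sales cube and the member names from the example later in this topic, and it assumes that the server accepts a SELECT statement with no axis specification.

```mdx
-- Sketch only: no axis dimensions; the slicer tuple names one
-- member from each of three dimensions and so identifies one cell.
SELECT
FROM Sales
WHERE ( [Measures].[Unit Sales], [Time].[1997], [Store].[USA].[CA] )
```

The same notation scales to a group of cells: enclosing several members or tuples in braces and placing that set on an axis addresses every cell they identify.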
Although SQL and MDX share similar syntax, the MDX syntax is remarkably robust, and it can be complex. However, because MDX was designed to provide a simple, effective way of querying multidimensional data, it addresses the conceptual differences between two-dimensional and multidimensional querying in a consistent and easily understood fashion.
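To specify a dataset, a Multidimensional Expressions (MDX) query must contain the following information: the number of axes, the members to include on each axis, the name of the cube that sets the context of the query, and, optionally, the members of a slicer dimension on which the data is sliced.
This information can be complex. MDX syntax lets you provide it in a simple and straightforward manner, using the MDX SELECT statement.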
In MDX, the SELECT statement is used to specify a dataset that contains a subset of multidimensional data. A basic MDX query is structured in the following way:
SELECT [<axis_specification>
       [, <axis_specification>...]]
FROM [<cube_specification>]
[WHERE [<slicer_specification>]]
The basic MDX SELECT statement contains a SELECT clause and a FROM clause, with an optional WHERE clause. These syntax elements are shown in the following example:
SELECT
   { [Measures].[Unit Sales], [Measures].[Store Sales] } ON COLUMNS,
   { [Time].[1997], [Time].[1998] } ON ROWS
FROM Sales
WHERE ( [Store].[USA].[CA] )
The SELECT clause determines the axis dimensions of an MDX SELECT statement. Two axis dimensions are defined in this MDX query example. The FROM clause determines which multidimensional data source is to be used when extracting data to populate the result set of the MDX SELECT statement.
A WHERE clause optionally determines which dimension or member to use as a slicer dimension; this restricts the data that is extracted to that dimension or member. The WHERE clause in this example restricts the data extracted for the axis dimensions to a specific member of the Store dimension. An MDX SELECT statement also supports other optional syntax elements, such as the WITH keyword, and the use of MDX functions to construct calculated members for inclusion in an axis or slicer dimension.
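As a hedged illustration of the WITH keyword, the following sketch adds a calculated member to the example query. The member name [Measures].[Average Price] and its formula are hypothetical, and the single-quoted expression assumes the SQL Server 2000 form of WITH MEMBER.

```mdx
-- Sketch only: [Measures].[Average Price] is a hypothetical
-- calculated member that exists only for the duration of this query.
WITH MEMBER [Measures].[Average Price] AS
  '[Measures].[Store Sales] / [Measures].[Unit Sales]'
SELECT
  { [Measures].[Unit Sales], [Measures].[Store Sales],
    [Measures].[Average Price] } ON COLUMNS,
  { [Time].[1997], [Time].[1998] } ON ROWS
FROM Sales
WHERE ( [Store].[USA].[CA] )
```

The calculated member is placed on the COLUMNS axis next to the stored measures and is evaluated for each row under the same slicer.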
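The syntax format of the MDX SELECT statement is similar to that of SQL syntax; however, there are several primary differences. MDX distinguishes sets by enclosing members or tuples in braces (the { and } characters). As with an SQL query, the FROM clause names the source of the data, but in MDX that source is a single cube. The WHERE clause describes a slicer dimension rather than a filter on rows. Finally, an axis can be referred to either by its alias or by its ordinal position within the query. The earlier example uses the COLUMNS and ROWS aliases; the same query can also be written using the ordinal position of each axis:
SELECT
   { [Measures].[Unit Sales], [Measures].[Store Sales] } ON AXIS(0),
   { [Time].[1997], [Time].[1998] } ON AXIS(1)
FROM Sales
WHERE ( [Store].[USA].[CA] )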