Learning Mondrian Schemas (1)

From Mondrian-2.2.2-technical-guide.pdf

What is a schema? 

A schema defines a multi-dimensional database. It contains a logical model, consisting of cubes, hierarchies, and members, and a mapping of this model onto a physical model.

The logical model consists of the constructs used to write queries in the MDX language: cubes, dimensions, hierarchies, levels, and members.

Schema files 

Mondrian schemas are represented in an XML file. An example schema, containing almost all of the constructs we discuss here, is supplied as demo/FoodMart.xml in the mondrian distribution. The dataset to populate this schema is also in the distribution.

Currently, the only way to create a schema is to edit a schema XML file in a text editor. The XML syntax is not too complicated, so this is not as difficult as it sounds, particularly if you use the FoodMart schema as a guiding example.

NOTE: The order of XML elements is important. For example, the <UserDefinedFunction> element has to occur inside the <Schema> element after all collections of <Cube>, <VirtualCube>, <NamedSet> and <Role> elements. If you include it before the first <Cube> element, the rest of the schema will be ignored.
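
A rough skeleton of the ordering described in the note might look as follows. The schema name, the function name, and the class name com.example.MyUdf are hypothetical placeholders, and the cube body is left out for brevity.

```xml
<Schema name="ExampleSchema">
  <!-- Cubes come first -->
  <Cube name="Sales">
    <Table name="sales_fact_1997"/>
    <!-- dimensions and measures go here -->
  </Cube>
  <!-- any VirtualCube, NamedSet and Role elements follow the cubes -->
  <!-- UserDefinedFunction elements come last (hypothetical class name) -->
  <UserDefinedFunction name="MyUdf" className="com.example.MyUdf"/>
</Schema>
```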

 

Logical model 

The most important components of a schema are cubes, measures, and dimensions:

•  A cube is a collection of dimensions and measures in a particular subject area. 

•  A measure is a quantity that you are interested in measuring, for example, unit sales of a product, or cost price of inventory items.

•  A dimension is an attribute, or set of attributes, by which you can divide measures into sub-categories. For example, you might wish to break down product sales by their color, the gender of the customer, and the store in which the product was sold; color, gender, and store are all dimensions.
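
As a minimal sketch of how these three concepts appear together in a schema file, the fragment below defines one cube with one dimension and one measure. It assumes the FoodMart table and column names (customer, gender, customer_id, unit_sales); adjust them for your own database.

```xml
<Cube name="Sales">
  <!-- the fact table shared by the cube's dimensions and measures -->
  <Table name="sales_fact_1997"/>
  <!-- a dimension: break sales down by customer gender -->
  <Dimension name="Gender" foreignKey="customer_id">
    <Hierarchy hasAll="true" allMemberName="All Gender" primaryKey="customer_id">
      <Table name="customer"/>
      <Level name="Gender" column="gender" uniqueMembers="true"/>
    </Hierarchy>
  </Dimension>
  <!-- a measure: the quantity being summed -->
  <Measure name="Unit Sales" column="unit_sales" aggregator="sum" formatString="#,###"/>
</Cube>
```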

Cube 

A cube (see <Cube>) is a named collection of measures and dimensions. The one thing the measures and dimensions have in common is the fact table, here "sales_fact_1997". As we shall see, the fact table holds the columns from which measures are calculated, and contains references to the tables which hold the dimensions.

<Cube name="Sales">

<Table name="sales_fact_1997"/>

...

</Cube>  

The fact table is defined using the <Table> element. If the fact table is not in the default schema, you can provide an explicit schema using the "schema" attribute, for example

<Table schema="dmart" name="sales_fact_1997"/>

You can also use the <View> and <Join> constructs to build more complicated SQL statements.
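
As a hedged sketch of the <View> construct, the fragment below uses a SQL statement in place of a plain table as the cube's fact relation. The alias and the WHERE condition are made up for illustration, and the CDATA section is only there to keep the SQL safe from XML escaping issues.

```xml
<Cube name="Sales">
  <!-- hypothetical view: only fact rows belonging to one store -->
  <View alias="sales_fact_store1">
    <SQL dialect="generic">
      <![CDATA[SELECT * FROM sales_fact_1997 WHERE store_id = 1]]>
    </SQL>
  </View>
  <!-- dimensions and measures go here -->
</Cube>
```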

Measures 

The Sales cube defines several measures, including "Unit Sales" and "Store Sales".

<Measure name="Unit Sales" column="unit_sales"

aggregator="sum" datatype="Integer" formatString="#,###"/>

<Measure name="Store Sales" column="store_sales"

aggregator="sum" datatype="Numeric" formatString="#,###.00"/>  

Each measure (see <Measure>) has a name, a column in the fact table, and an aggregator. The aggregator is usually "sum", but "count", "min", "max", "avg", and "distinct-count" are also allowed; "distinct-count" has some limitations if your cube contains a parent-child hierarchy.
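
The other aggregators are used the same way. The measures below are a sketch: the first two follow the FoodMart naming, while "Largest Sale" and "Average Sale" are hypothetical names added only to illustrate "max" and "avg".

```xml
<!-- counts fact rows with a non-null product_id -->
<Measure name="Sales Count" column="product_id" aggregator="count" formatString="#,###"/>
<!-- counts the distinct customers behind the fact rows -->
<Measure name="Customer Count" column="customer_id" aggregator="distinct-count" formatString="#,###"/>
<!-- hypothetical measures illustrating max and avg -->
<Measure name="Largest Sale" column="store_sales" aggregator="max" formatString="#,###.00"/>
<Measure name="Average Sale" column="store_sales" aggregator="avg" formatString="#,###.00"/>
```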

The optional datatype attribute specifies how cell values are represented in Mondrian's cache, and how they are returned via XML for Analysis. The datatype attribute can have values "String", "Integer" and "Numeric". The default is "Numeric", except for "count" and "distinct-count" measures, which are "Integer".

An optional formatString attribute specifies how the value is to be printed. Here, we have chosen to output unit sales with no decimal places (since it is an integer), and store sales with two decimal places (since it is a currency value). The ',' and '.' symbols are locale-sensitive, so if you were running in an Italian locale, store sales might appear as "48.123,45". You can achieve more elaborate effects using advanced format strings.
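
A few variations on formatString are sketched below. They assume that Mondrian accepts Visual Basic style format strings, including named formats such as "Standard" and literal prefixes such as a currency sign; check the format string reference for your Mondrian version before relying on them.

```xml
<!-- named format: thousands separators with the default decimal places -->
<Measure name="Unit Sales" column="unit_sales" aggregator="sum" formatString="Standard"/>
<!-- literal currency sign, always at least one digit before the decimal point -->
<Measure name="Store Sales" column="store_sales" aggregator="sum" formatString="$#,##0.00"/>
```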

A measure can have a caption attribute to be returned by the Member.getCaption() method instead of the name. Defining a specific caption makes sense if special characters (e.g. Σ or Π) are to be displayed:

<Measure name="Sum X" column="sum_x" aggregator="sum" caption="&#931; X"/> 

Rather than coming from a column, a measure can use a cell reader, or a measure can use a SQL expression to calculate its value. The measure "Promotion Sales" is an example of this.

<Measure name="Promotion Sales" aggregator="sum"

formatString="#,###.00">  

<MeasureExpression>

<SQL dialect="generic">

(case when sales_fact_1997.promotion_id = 0

then 0 else sales_fact_1997.store_sales end)

</SQL>

</MeasureExpression>

</Measure>  

In this case, sales are only included in the summation if they correspond to a promotion sale. Arbitrary SQL expressions can be used, including subqueries. However, the underlying database must be able to support that SQL expression in the context of an aggregate. Variations in syntax between different databases are handled by specifying the dialect in the SQL tag.
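
To illustrate the dialect mechanism, the same measure could carry one expression per database, with "generic" as the fallback. This is a sketch: the "access" variant assumes Access's Iif function and is shown only as an example of a database-specific form.

```xml
<Measure name="Promotion Sales" aggregator="sum" formatString="#,###.00">
  <MeasureExpression>
    <!-- used only when the connected database is Access -->
    <SQL dialect="access">
      Iif(sales_fact_1997.promotion_id = 0, 0, sales_fact_1997.store_sales)
    </SQL>
    <!-- fallback for every other database -->
    <SQL dialect="generic">
      (case when sales_fact_1997.promotion_id = 0 then 0 else sales_fact_1997.store_sales end)
    </SQL>
  </MeasureExpression>
</Measure>
```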

In order to provide a specific formatting of the cell values, a measure can use a cell formatter.
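
A cell formatter is typically attached through an attribute on the measure. The sketch below assumes a formatter attribute that names a Java class implementing Mondrian's CellFormatter interface; com.example.MyCellFormatter is a hypothetical class that you would have to write and place on the classpath.

```xml
<!-- hypothetical formatter class; it must be available on Mondrian's classpath -->
<Measure name="Store Sales" column="store_sales" aggregator="sum"
    formatter="com.example.MyCellFormatter"/>
```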
