Analysis Services Optimizing Cube Schemas
In many situations Microsoft® SQL Server™ 2000 Analysis Services can optimize a cube's schema to significantly reduce cube processing time by eliminating joins between dimension tables and fact tables.
During dimension processing, the Analysis server creates an internal representation of the dimension data and hierarchy. When processing a cube, the dimension member keys identified in the member key column property are used to access the information in the internal representation of the processed dimension. Under certain conditions, the dimension member's foreign key in the fact table can be used for this lookup, thereby eliminating the need to join the dimension table to the fact table in the database query. This significantly reduces the complexity of the query, the amount of data accessed in the relational database, and network traffic between the Analysis server and the relational database.
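The lookup described above can be sketched in Python. This is an illustrative model only, not Analysis Services code: the dictionary that stands in for the internal representation, the sample rows, and the function names `process_dimension` and `process_cube` are all assumptions made for the example.

```python
# Illustrative model only; all names and data here are hypothetical.

# Dimension table rows: (SKU_ID, product name, category)
products = [
    (1, "Road Bike", "Bikes"),
    (2, "Helmet", "Accessories"),
]

# Fact table rows: (SKU_Key foreign key, sales amount)
facts = [
    (1, 500.0),
    (2, 40.0),
    (1, 450.0),
]


def process_dimension(rows):
    """Build an in-memory representation of the dimension, keyed by member key."""
    return {
        sku_id: {"name": name, "category": category}
        for sku_id, name, category in rows
    }


def process_cube(fact_rows, dimension):
    """Aggregate sales by category using only the fact table's foreign key.

    The dimension table is not read here. The processed dimension supplies
    the hierarchy information that a SQL join would otherwise provide.
    """
    totals = {}
    for sku_key, amount in fact_rows:
        category = dimension[sku_key]["category"]
        totals[category] = totals.get(category, 0.0) + amount
    return totals


dimension = process_dimension(products)
totals = process_cube(facts, dimension)
print(totals)
```

In this model the dimension table is read once, during dimension processing, and cube processing touches only the fact rows.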
To take advantage of cube schema optimization, when you design a cube in Cube Editor, click the Optimize Schema command on the Tools menu. Analysis Services then modifies the schema to eliminate joins between the fact table and dimension tables, where possible. Certain conditions must be met for Analysis Services to eliminate a join between a dimension and the fact table. These are:
1) The dimension must be a shared dimension, and must have been processed before you optimize the cube schema.
2) The member key column for the lowest level of the dimension must contain the keys that relate the fact table and the dimension table. This must be the only key necessary to relate the fact table to the dimension table.
3) The keys in the member key column for the lowest level of the dimension must be unique.
4) The lowest level of the dimension must be represented in the cube, that is, the level's Disabled property must be set to No. The level can be hidden.
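The four conditions can be read as a checklist. The following Python sketch expresses them as a function that lists the reasons a dimension would not qualify. The `DimensionInfo` class and its field names are hypothetical and do not correspond to any Analysis Services object model; they only mirror the properties named in the conditions.

```python
from dataclasses import dataclass


@dataclass
class DimensionInfo:
    """Hypothetical summary of the properties the four conditions refer to."""
    name: str
    is_shared: bool
    is_processed: bool
    lowest_key_relates_fact: bool   # lowest-level member key is the join key
    single_join_key: bool           # no other key is needed for the join
    lowest_keys_unique: bool
    lowest_level_disabled: bool
    lowest_level_visible: bool = True  # hidden levels still qualify


def optimization_blockers(dim):
    """Return the conditions the dimension fails; an empty list means eligible."""
    reasons = []
    if not dim.is_shared:
        reasons.append("dimension is not shared")
    if not dim.is_processed:
        reasons.append("dimension has not been processed")
    if not (dim.lowest_key_relates_fact and dim.single_join_key):
        reasons.append("lowest-level member key is not the only join key")
    if not dim.lowest_keys_unique:
        reasons.append("lowest-level member keys are not unique")
    if dim.lowest_level_disabled:
        reasons.append("lowest level is disabled in the cube")
    return reasons


hidden_day = DimensionInfo("Time", True, True, True, True, True,
                           lowest_level_disabled=False,
                           lowest_level_visible=False)
private_dim = DimensionInfo("Private", False, True, True, True, True,
                            lowest_level_disabled=True)

print(optimization_blockers(hidden_day))
print(optimization_blockers(private_dim))
```

Note that `lowest_level_visible` is never tested, which reflects condition 4: a hidden lowest level does not prevent the optimization.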
If these conditions are met, and the cube's schema is optimized using the Optimize Schema option, the Analysis server ignores the dimension table in the database when processing the cube. If these conditions are met for all dimensions in the cube, the Analysis server needs to read only the fact table to process the cube. Processing time reductions often can be substantial when this optimization technique is used.
Cube schema optimization applies to all partitions of the cube whether the partitions are processed independently or as a group.
Note: You should not optimize a cube's schema if you depend on inner joins between the fact table and dimension tables to exclude fact rows for the cube content. The entire fact table is read if all dimension table joins are removed by this optimization.
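The effect described in this note can be reproduced with a small in-memory SQLite database. The sketch below is not the query that the Analysis server issues; the table names, column names, and rows are assumptions chosen to show how an inner join silently filters fact rows that a fact-only read returns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (SKU_ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Facts (SKU_Key INTEGER, Amount REAL);
    INSERT INTO Products VALUES (1, 'Road Bike'), (2, 'Helmet');
    -- SKU_Key 99 has no matching row in Products.
    INSERT INTO Facts VALUES (1, 500.0), (2, 40.0), (99, 10.0);
""")

# With the join, the unmatched fact row is excluded.
joined_rows = conn.execute(
    "SELECT COUNT(*) FROM Facts "
    "INNER JOIN Products ON Facts.SKU_Key = Products.SKU_ID"
).fetchone()[0]

# Without the join, every fact row is read.
fact_only_rows = conn.execute("SELECT COUNT(*) FROM Facts").fetchone()[0]

print(joined_rows, fact_only_rows)
conn.close()
```

If the cube content relies on the smaller, joined row set, removing the join changes what is read from the fact table.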
Because schema optimization can eliminate joins, a cube with an optimized schema may not display all available tables for use when specifying drillthrough options. You can join a table to the schema for drillthrough when specifying drillthrough options by adding the table and defining a SQL WHERE clause to establish the join. For more information, see Specifying Drillthrough Options.
Member Key Column
Analysis Services uses the Member Key Column property of the lowest level of a dimension to control cube schema optimization. During cube schema optimization, each dimension is evaluated to determine if it meets the conditions for optimization. If the dimension meets the required conditions, the Member Key Column property of the lowest level of the dimension is changed to refer to the foreign key in the fact table instead of the key in the dimension table. For example, before optimization the dimension level's Member Key Column is "Products"."SKU_ID", which is joined to the key, "Facts"."SKU_Key", in the fact table. After optimization, the Member Key Column property value is "Facts"."SKU_Key". This signals the Analysis server to use the key from the fact table during processing instead of issuing queries that join the dimension to the fact table in the relational database.
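The property change in this example can be modeled as a simple column substitution. The Python sketch below is a hypothetical illustration: the `joins` mapping and both function names are assumptions, and only the two column names come from the example above. The second function models pointing the property back at the original dimension table column.

```python
def optimize_member_key(member_key_column, joins):
    """Return the fact-table column joined to the given dimension key column.

    `joins` maps a dimension key column to the fact-table foreign key it is
    joined to. A column with no known join is returned unchanged.
    """
    return joins.get(member_key_column, member_key_column)


def restore_member_key(member_key_column, joins):
    """Reverse the substitution, returning the original dimension column."""
    reverse = {fact_col: dim_col for dim_col, fact_col in joins.items()}
    return reverse.get(member_key_column, member_key_column)


joins = {'"Products"."SKU_ID"': '"Facts"."SKU_Key"'}

before = '"Products"."SKU_ID"'
after = optimize_member_key(before, joins)
restored = restore_member_key(after, joins)

print(before, "->", after, "->", restored)
```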
Example
A dimension for time contains the levels Year, Quarter, Month, and Day. There is a dimension member for each day in the dimension, and each day member has a unique key, which is specified as the member key column for the Day level. The Member Keys Unique property for the Day level is set to True.
The dimension's Day level member keys are used as foreign keys in the fact table to relate the dimension table to the fact table. No other keys are required to uniquely relate a fact table row to a row in the dimension table.
A cube is designed that uses this fact table and this time dimension. It is preferable that the cube contain summarized data at the Month level and above but not at the Day level. In the cube, the Disabled property for the dimension's Day level is set to No, so the level keys will be available for the cube processing optimization. The Visible property for the dimension's Day level is set to False, so the cube will not display data for the Day level.
When the Optimize Schema command on the Tools menu is selected, the cube schema is optimized. Then, when the cube is saved and processed, the SQL query issued by the Analysis server to read the fact table will not need to join or access the dimension table.
Modifying Cube Schema Optimization
You can remove the optimization for one or more dimensions in a cube by changing the Member Key Column property for the lowest level of each of the dimensions to refer to its original column in the dimension table. This will cause the Analysis server to issue a query that joins the dimension table to the fact table during processing.
Note: A cube's schema optimization can be affected by adding or deleting dimensions, or by modifying dimension properties in the cube. You should check the cube's schema optimization or redo the optimization whenever you make such changes.
Unknown Dimension Member Error
This error indicates that a dimension member's key is not found in the internal representation of a dimension when processing a cube that contains the dimension. The cause can be either that a dimension has not been processed after new members were added, or that the dimension table does not contain a key that matches a key found in the fact table.
This error occurs regardless of whether a cube's schema has been optimized if new members are added to a dimension and related facts are added to the fact table but the dimension has not been processed. It makes no difference whether the member keys are read from the joined dimension (schema not optimized) or from the member foreign keys in the fact table (schema optimized). The internal representation of the dimension will not contain the new keys until the dimension has been processed.
There is one situation where this error is triggered if the cube's schema has been optimized, but is not triggered if the schema has not been optimized. This condition occurs when a fact has been added to the fact table but no corresponding member exists in a dimension table. If the cube's schema has been optimized, the key for the new fact will be read from the fact table but not found in the internal representation of the dimension, even if the dimension has been processed. However, if the cube's schema has not been optimized, the query that joins the dimension table to the fact table causes any facts that do not have corresponding dimension members to be ignored and not read during processing, so the error is not triggered.
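The two situations can be compared side by side in a small Python model. Nothing here is Analysis Services code: `UnknownMemberError`, the helper functions, and the key values are hypothetical, and the inner join is approximated by filtering fact keys against the dimension table keys.

```python
class UnknownMemberError(Exception):
    """Hypothetical stand-in for the unknown dimension member error."""


def keys_from_join(fact_keys, dimension_table_keys):
    """Schema not optimized: the inner join drops facts with no dimension row."""
    table = set(dimension_table_keys)
    return [key for key in fact_keys if key in table]


def keys_from_fact_table(fact_keys):
    """Schema optimized: every foreign key in the fact table is read."""
    return list(fact_keys)


def load_cube(keys, processed_members):
    """Look up each key in the processed dimension; fail on the first miss."""
    for key in keys:
        if key not in processed_members:
            raise UnknownMemberError(f"member key {key} not found")
    return len(keys)


def outcome(keys, processed_members):
    try:
        return f"loaded {load_cube(keys, processed_members)} rows"
    except UnknownMemberError as err:
        return f"error: {err}"


# Case 1: member 3 was added to the dimension table, but the dimension has
# not been processed since, so the processed members are stale.
stale_members = {1, 2}
case1_facts = [1, 3]
case1_join = outcome(keys_from_join(case1_facts, [1, 2, 3]), stale_members)
case1_fact_only = outcome(keys_from_fact_table(case1_facts), stale_members)

# Case 2: fact key 99 has no row in the dimension table at all.
current_members = {1, 2}
case2_facts = [1, 2, 99]
case2_join = outcome(keys_from_join(case2_facts, [1, 2]), current_members)
case2_fact_only = outcome(keys_from_fact_table(case2_facts), current_members)

print(case1_join, "|", case1_fact_only)
print(case2_join, "|", case2_fact_only)
```

In this model, case 1 fails with or without optimization, while case 2 fails only when the keys are read directly from the fact table.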
You can avoid these errors by maintaining referential integrity between dimension tables and fact tables, and by always processing a dimension after making changes to the dimension table and before processing cubes that use the dimension.
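One way to check referential integrity before processing is to query the relational database for fact keys that have no matching dimension row. The sketch below uses an in-memory SQLite database as a stand-in for the source database; the table names, column names, and the `find_orphan_keys` helper are assumptions for illustration.

```python
import sqlite3


def find_orphan_keys(conn):
    """Return fact-table foreign keys that have no row in the dimension table."""
    rows = conn.execute("""
        SELECT DISTINCT f.SKU_Key
        FROM Facts AS f
        LEFT JOIN Products AS p ON f.SKU_Key = p.SKU_ID
        WHERE p.SKU_ID IS NULL
        ORDER BY f.SKU_Key
    """).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (SKU_ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Facts (SKU_Key INTEGER, Amount REAL);
    INSERT INTO Products VALUES (1, 'Road Bike'), (2, 'Helmet');
    INSERT INTO Facts VALUES (1, 500.0), (99, 10.0), (99, 12.5);
""")

orphans = find_orphan_keys(conn)
print(orphans)
conn.close()
```

An empty result suggests that every fact row has a corresponding dimension member in the table, although the dimension must still be processed before the cube.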
(To be continued...)