Introduction
Pentaho Metadata is a feature of the Pentaho BI Platform designed to make it easier for users to access information in business terms.
With Pentaho's open source metadata capabilities, administrators can define a layer of abstraction that presents database information to business users in familiar business terms. Administrators identify relationships between tables in the database, create business-language definitions for complex or cryptic database tables and columns, set security parameters to limit data access to appropriate users, specify default formatting for data fields, and provide additional translations of business terms for multilingual deployments. Business users can then use Pentaho's new ad hoc query capabilities to choose the specific elements they would like to include in a given report, such as order quantities and total spending by customer, grouped by region. The SQL required to retrieve the data is then generated automatically.
Scope and Usage of the Metadata Layer:
1. Metadata input from the database, as well as user-defined metadata, is defined using the Pentaho Metadata Editor (PME) and stored in the metadata repository.
2. Metadata can be exported from the repository and stored in the form of .xmi files, or in a database. The metadata is associated with a Pentaho solution on the Pentaho server, where it can be used as a resource for metadata-based reporting services.
3. Using the Pentaho report design tools, end users can create reports on the metadata. This allows reports to be built without knowledge of the physical details of the underlying database, and without any knowledge of SQL. Instead, the report contains a high-level specification of the query result, which is defined using a graphical user interface.
4. When running reports based on Pentaho metadata, the reporting engine interprets the report. Query specifications are stored in the report in a format called Metadata Query Language (MQL), which is resolved against the metadata. At this point, the corresponding SQL is generated and sent to the database. Beyond this point, report processing is quite similar to "normal" SQL-based reporting. The database responds to the query by sending a data result, which is rendered as report output.
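The resolution step in item 4 can be illustrated with a toy sketch. This is not Pentaho's SQL generator; it only shows the general idea of mapping business column IDs to physical columns and assembling a SQL string. Every table name, column name, and ID below is hypothetical.

```python
# Toy illustration only: this is not Pentaho's SQL generator.
# All table, column, and ID names below are hypothetical.

# Hypothetical metadata: business column ID -> (physical table, physical column)
BUSINESS_COLUMNS = {
    "BC_CUSTOMER_REGION": ("CUSTOMERS", "REGION"),
    "BC_ORDER_QUANTITY": ("ORDERS", "QUANTITY"),
}

# Hypothetical business relationship: (table_a, table_b) -> join condition
RELATIONSHIPS = {
    ("CUSTOMERS", "ORDERS"): "CUSTOMERS.CUSTOMER_ID = ORDERS.CUSTOMER_ID",
}


def generate_sql(selections):
    """Resolve (business column ID, aggregation or None) pairs into a SQL string."""
    select_parts, group_parts, tables = [], [], []
    for column_id, aggregation in selections:
        table, column = BUSINESS_COLUMNS[column_id]
        if table not in tables:
            tables.append(table)
        qualified = f"{table}.{column}"
        if aggregation:
            select_parts.append(f"{aggregation.upper()}({qualified})")
        else:
            select_parts.append(qualified)
            group_parts.append(qualified)

    # Collect join conditions for every related pair of tables in use
    joins = [
        condition
        for (left, right), condition in RELATIONSHIPS.items()
        if left in tables and right in tables
    ]

    sql = f"SELECT {', '.join(select_parts)} FROM {', '.join(tables)}"
    if joins:
        sql += f" WHERE {' AND '.join(joins)}"
    # Group by the plain columns only when aggregated columns are also selected
    if group_parts and len(group_parts) < len(select_parts):
        sql += f" GROUP BY {', '.join(group_parts)}"
    return sql


print(generate_sql([("BC_CUSTOMER_REGION", None), ("BC_ORDER_QUANTITY", "sum")]))
```

The user selects only business columns; the physical table names and the join condition come entirely from the metadata, which is why report authors need no knowledge of the database schema or of SQL.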
Business Model Overview
Pentaho Metadata Terminology
The metadata business model is actually one major component of a Pentaho metadata domain. The domain encapsulates both the physical descriptions of your database objects and the logical model (the business model), which is the abstract representation of the database.
Business objects and relationships within the Pentaho metadata domain:
The Physical Layer of a Pentaho domain encompasses connections, physical tables and physical columns.
In the Abstract Business Layer, you have business tables, business columns, and business relationships.
The Business View is the part of the business model that applications operate against and that end users see. The Business View is nothing more than "buckets" (called categories) in which you re-arrange and re-organize your business columns in a fashion that makes sense to the consumers of the data.
The business model encompasses the Abstract Business Layer and the Business View.
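To make the layering concrete, here is a minimal sketch of how these terms relate, written as plain Python data classes. The class and field names are illustrative assumptions, not the classes of the pentaho-metadata library, and connections and business relationships are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import List


# --- Physical Layer (connections omitted for brevity) ---
@dataclass
class PhysicalColumn:
    name: str


@dataclass
class PhysicalTable:
    name: str
    columns: List[PhysicalColumn] = field(default_factory=list)


# --- Abstract Business Layer (business relationships omitted) ---
@dataclass
class BusinessColumn:
    id: str
    display_name: str
    physical_column: PhysicalColumn  # link back to the Physical Layer


@dataclass
class BusinessTable:
    id: str
    physical_table: PhysicalTable
    columns: List[BusinessColumn] = field(default_factory=list)


# --- Business View: categories are "buckets" of business columns ---
@dataclass
class Category:
    name: str
    columns: List[BusinessColumn] = field(default_factory=list)


@dataclass
class BusinessModel:
    id: str
    tables: List[BusinessTable] = field(default_factory=list)  # business layer
    categories: List[Category] = field(default_factory=list)   # business view


@dataclass
class Domain:
    id: str
    physical_tables: List[PhysicalTable] = field(default_factory=list)
    business_models: List[BusinessModel] = field(default_factory=list)


# Hypothetical example: a cryptic physical column exposed under a friendly name
cust_nm = PhysicalColumn("CUST_NM")
pt_customers = PhysicalTable("CUSTOMERS", [cust_nm])
bc_name = BusinessColumn("BC_CUSTOMER_NAME", "Customer Name", cust_nm)
bt_customers = BusinessTable("BT_CUSTOMERS", pt_customers, [bc_name])
model = BusinessModel("BV_SALES", [bt_customers], [Category("Customer", [bc_name])])
domain = Domain("sales_domain", [pt_customers], [model])

print(domain.business_models[0].categories[0].columns[0].display_name)
```

Note that the category holds references to the same business column objects as the business table: the Business View reorganizes columns for presentation without duplicating their definitions.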
Common Warehouse Metamodel (CWM)
The Common Warehouse Metamodel (CWM) is a specification that describes metadata interchange among data warehousing, business intelligence, knowledge management, and portal technologies. The Pentaho Metadata Layer is based on the CWM specification of the Object Management Group (OMG) and is able to generate SQL from a query written in the Metadata Query Language (MQL). The MQL query, in turn, is created by an end user who builds the desired selection from a set of objects exposed in a metadata model.
CWM is based on three standards:
1. UML - Unified Modeling Language, an OMG modeling standard
2. MOF - Meta Object Facility, an OMG metamodeling and metadata repository standard
3. XMI - XML Metadata Interchange, an OMG metadata interchange standard
Pentaho Metadata Projects
Pentaho Metadata consists of three sub-projects:
1. pentaho-metadata, the core metadata engine
2. pentaho-mql-editor, an SWT tool for building an MQL query
3. pentaho-metadata-editor, a tool for building metadata models
The Report Designer and BI Platform also contain client-side metadata code that demonstrates how to use metadata in applications.
Pentaho Metadata MQL Schema
MQL is the syntax Pentaho Metadata uses to generate SQL queries from metadata. Normally a user would generate MQL via Pentaho's Design Studio or Pentaho's Metadata Editor.
MQL XML format:
<mql> - top-level element for an MQL query
  <domain_type> - text element that contains the domain type; currently only "relational" is supported
  <domain_id> - text element that contains the ID of the domain to query
  <model_id> - text element that contains the ID of the model to query
  <model_name> - optional text element that contains the model name
  <parameters> - optional element that contains a list of parameters for the query
    <parameter> - element that contains information about an individual parameter
      <name> - the name of the parameter
      <type> - the data type of the parameter; valid values include boolean and numeric
      <defaultValue> - the default value of the parameter, as a string
  <selections> - element that contains a list of business column selections
    <selection> - business column selection element
      <table> - text element that contains the ID of the business table to select
      <column> - text element that contains the ID of the business column to select
      <aggregation> - text element that contains the aggregate type
  <constraints> - element that contains a list of constraints for the MQL query
    <constraint> - constraint container element
      <operator> - text element that describes how to join the constraint to the query
      <condition> - text element that contains an MQL formula
  <orders> - element that contains a list of business columns to order by
    <order> - order container element
      <direction> - text element containing either "asc" or "desc"
      <table_id> - text element containing the ID of the business table to order by
      <column_id> - text element containing the ID of the business column to order by
      <aggregation> - text element that contains the aggregate type
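As a rough illustration of how these elements fit together, the following sketch assembles an MQL document with Python's standard xml.etree.ElementTree module. The element names follow the list above; the helper name build_mql, the domain, model, table, and column IDs, and the formula in the condition are hypothetical and should be checked against your own metadata model and Pentaho version. Optional elements (model_name, parameters, aggregation) are left out.

```python
import xml.etree.ElementTree as ET


def build_mql(domain_id, model_id, selections, constraints=(), orders=()):
    """Build an MQL query string from simple tuples (illustrative helper)."""
    mql = ET.Element("mql")
    ET.SubElement(mql, "domain_type").text = "relational"
    ET.SubElement(mql, "domain_id").text = domain_id
    ET.SubElement(mql, "model_id").text = model_id

    # Business columns to select, as (table ID, column ID) pairs
    selections_el = ET.SubElement(mql, "selections")
    for table_id, column_id in selections:
        selection = ET.SubElement(selections_el, "selection")
        ET.SubElement(selection, "table").text = table_id
        ET.SubElement(selection, "column").text = column_id

    # Constraints, as (operator, MQL formula) pairs
    constraints_el = ET.SubElement(mql, "constraints")
    for operator, condition in constraints:
        constraint = ET.SubElement(constraints_el, "constraint")
        ET.SubElement(constraint, "operator").text = operator
        ET.SubElement(constraint, "condition").text = condition

    # Ordering, as (direction, table ID, column ID) triples
    orders_el = ET.SubElement(mql, "orders")
    for direction, table_id, column_id in orders:
        order = ET.SubElement(orders_el, "order")
        ET.SubElement(order, "direction").text = direction
        ET.SubElement(order, "table_id").text = table_id
        ET.SubElement(order, "column_id").text = column_id

    return ET.tostring(mql, encoding="unicode")


# Hypothetical IDs; the condition formula syntax is an assumption
mql_text = build_mql(
    domain_id="sales_domain",
    model_id="BV_SALES",
    selections=[
        ("BT_CUSTOMERS", "BC_CUSTOMERS_REGION"),
        ("BT_ORDERS", "BC_ORDERS_QUANTITY"),
    ],
    constraints=[("AND", '[BT_CUSTOMERS.BC_CUSTOMERS_REGION] = "EMEA"')],
    orders=[("asc", "BT_CUSTOMERS", "BC_CUSTOMERS_REGION")],
)
print(mql_text)
```

Building the document through an XML library rather than string concatenation keeps the output well-formed even when a condition contains characters such as < or &.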
Pentaho Metadata Editor
Pentaho Metadata Editor documentation