Pentaho Resources

1. Pentaho Big Data BI Knowledge @ http://wiki.pentaho.com/display/BAD/How+To%27s

Hadoop

  • Loading Data into a Hadoop Cluster — How to load data into HDFS (Hadoop's Distributed File System), Hive and HBase.
    • Loading Data into HDFS — How to use a PDI job to move a file into HDFS.
    • Loading Data into Hive — How to use a PDI job to load a data file into a Hive table.
    • Loading Data into HBase — How to use a PDI transformation that sources data from a flat file and writes to an HBase table.
  • Transforming Data within a Hadoop Cluster — How to transform data within the Hadoop cluster using Pentaho MapReduce, Hive, and Pig.
    • Using Pentaho MapReduce to Parse Weblog Data — How to use Pentaho MapReduce to convert raw weblog data into parsed, delimited records.
    • Using Pentaho MapReduce to Generate an Aggregate Dataset — How to use Pentaho MapReduce to transform and summarize detailed data into an aggregate dataset.
    • Transforming Data within Hive — How to read data from a Hive table, transform it, and write it to a Hive table within the workflow of a PDI job.
    • Transforming Data with Pig — How to invoke a Pig script from a PDI job.
  • Extracting Data from the Hadoop Cluster — How to extract data from Hadoop using HDFS, Hive, and HBase.
    • Extracting Data from HDFS to Load an RDBMS — How to use a PDI transformation to extract data from HDFS and load it into an RDBMS table.
    • Extracting Data from Hive to Load an RDBMS — How to use a PDI transformation to extract data from Hive and load it into an RDBMS table.
    • Extracting Data from HBase to Load an RDBMS — How to use a PDI transformation to extract data from HBase and load it into an RDBMS table.
    • Extracting Data from Snappy Compressed Files — How to configure client-side PDI so that files compressed using the Snappy codec can be decompressed using the Hadoop file input or Text file input step.
  • Reporting on Data in Hadoop — How to report on data that is resident within the Hadoop cluster.
    • Reporting on HDFS File Data — How to create a report that sources data from an HDFS file.
    • Reporting on HBase Data — How to create a report that sources data from HBase.
    • Reporting on Hive Data — How to create a report that sources data from Hive.
  • Unit Test Pentaho MapReduce Transformation — How to unit test the mapper and reducer transformations that make up a Pentaho MapReduce job.
  • Advanced Pentaho MapReduce — Advanced how-tos for developing Pentaho MapReduce.
    • Using Compression with Pentaho MapReduce — How to use compression with Pentaho MapReduce.
    • Using a Custom Partitioner in Pentaho MapReduce — How to use a custom partitioner in Pentaho MapReduce.
    • Using a Custom Input or Output Format in Pentaho MapReduce — How to use a custom Input or Output Format in Pentaho MapReduce.
    • Processing HBase data in Pentaho MapReduce using TableInputFormat — How to use HBase TableInputFormat in Pentaho MapReduce.

MapR

  • Loading Data into a MapR Cluster — How to load data into CLDB (MapR’s distributed file system), Hive and HBase.
    • Loading Data into CLDB — How to use a PDI job to move a file into CLDB.
    • Loading Data into MapR Hive — How to use a PDI job to load a data file into a Hive table.
    • Loading Data into MapR HBase — How to use a PDI transformation that sources data from a flat file and writes to an HBase table.
  • Transforming Data within a MapR Cluster — How to leverage the massively parallel, fault tolerant MapR processing engine to transform resident cluster data.
    • Using Pentaho MapReduce to Parse Weblog Data in MapR — How to use Pentaho MapReduce to convert raw weblog data into parsed, delimited records.
    • Using Pentaho MapReduce to Generate an Aggregate Dataset in MapR — How to use Pentaho MapReduce to transform and summarize detailed data into an aggregate dataset.
    • Transforming Data within Hive in MapR — How to read data from a Hive table, transform it, and write it to a Hive table within the workflow of a PDI job.
    • Transforming Data with Pig in MapR — How to invoke a Pig script from a PDI job.
  • Extracting Data from the MapR Cluster — How to extract data from the MapR cluster and load it into an RDBMS table.
    • Extracting Data from CLDB to Load an RDBMS — How to use a PDI transformation to extract data from MapR CLDB and load it into an RDBMS table.
    • Extracting Data from Hive to Load an RDBMS in MapR — How to use a PDI transformation to extract data from Hive and load it into an RDBMS table.
    • Extracting Data from HBase to Load an RDBMS in MapR — How to use a PDI transformation to extract data from HBase and load it into an RDBMS table.
  • Reporting on Data in the MapR Cluster — How to report on data that is resident within the MapR cluster.
    • Reporting on CLDB File Data — How to create a report that sources data from a MapR CLDB file.
    • Reporting on HBase Data in MapR — How to create a report that sources data from HBase.
    • Reporting on Hive Data in MapR — How to create a report that sources data from Hive.

Cassandra

  • Write Data To Cassandra — How to read data from a data source (flat file) and write it to a column family in Cassandra using a graphical tool.
  • How To Read Data From Cassandra — How to read data from a column family in Cassandra using a graphical tool.
  • How To Create a Report with Cassandra — How to create a report that uses data from a column family in Cassandra using graphical tools.

MongoDB

  • Write Data To MongoDB — How to read data from a data source (flat file) and write it to a collection in MongoDB.
  • Read Data From MongoDB — How to read data from a collection in MongoDB.
  • Create a Report with MongoDB — How to create a report that uses data from a collection in MongoDB.
  • Create a Parameterized Report with MongoDB — How to create a parameterized report that uses data from a collection in MongoDB.

Instaview

  • Google Analytics Instaview Sample template — Instaview template for use with Google Analytics
  • MongoDB Instaview Sample template — Sample Instaview template for use with MongoDB

 

2. Pentaho ABCs @ http://docs.huihoo.com/pentaho/pentaho-business-analytics/4.8/

User Guides

  • Pentaho User Console Guide
    Reference material and task-based documentation on Pentaho Dashboard Designer, Pentaho Analyzer, Pentaho Interactive Reporting, and the content scheduling and authorization features in the Pentaho User Console.
  • Analysis Guide
    Guidance and theory on creating ROLAP schemas with Schema Workbench and Pentaho Data Integration; Pentaho Analyzer and JPivot user documentation; Mondrian engine and Pentaho Analyzer configuration instructions. Also includes an MDX element reference.
  • Report Designer User Guide
    Reference material and task-based documentation on creating, editing, and publishing reports with Pentaho Report Designer. Includes a complete chart property reference.
  • Metadata Editor User Guide
    Reference material and task-based documentation for creating, editing, and publishing metadata models with Pentaho Metadata Editor.
  • Pentaho Data Integration User Guide
    Reference material and task-based documentation that covers the majority of the functionality in Pentaho Data Integration.
  • Big Data Guide
    How to install and configure PDI for various Hadoop distributions, along with procedural documentation on how to use the Big Data steps and entries in Pentaho Data Integration.
  • Aggregation Designer User Guide
    Information on creating aggregate tables for Pentaho Analysis.

Tutorials and Walkthroughs

  • Getting Started with Pentaho Business Analytics
    Detailed walkthroughs that show how to create content with Pentaho User Console design tools.
  • Getting Started with Pentaho Data Integration
    Evaluation document showcasing the high-value features of PDI.
  • Getting Started with Pentaho Data Integration Instaview
    Detailed walkthroughs that show how to use PDI's Instaview to quickly generate, transform, and analyze data from a variety of sources.

Installation and Upgrade Guides

  • Pentaho Business Analytics Graphical Installation Guide
    Complete instructions for performing an installation using the Business Analytics graphical installation utility for Windows, Linux, or OS X. This method is recommended for evaluation deployments, but is not typical for production installations.
  • Pentaho Business Analytics Archive-Based Installation Guide
    Deployment instructions for the premade Business Analytics archive packages in .tar or .zip format. Packages are available for all individual parts of Business Analytics. This is a common production deployment scenario for both servers and workstations.
  • Pentaho BA Server Manual Deployment Guide
    Instructions for building a custom BA Server J2EE WAR for Tomcat or JBoss. This is a typical production deployment scenario for servers.
  • Pentaho Data Integration Installation Guide
    Complete instructions for performing a production installation of Pentaho Data Integration for servers and workstations. Covers both archive package deployment and graphical installation utility execution. Installation to Hadoop nodes is also covered in detail.
  • Business Analytics Upgrade Guide
    Upgrade instructions for migrating from the previous major Business Analytics release to the newest. This guide also includes all of the content from the PDI Upgrade Guide.
  • Pentaho Data Integration Upgrade Guide
    Instructions for upgrading from the previous version of Pentaho Data Integration to the newest. This pertains to both the client tool (Spoon) and the Data Integration Server (DI Server).

Administrator Guides

  • Business Analytics Administrator's Guide
    Explains system configuration and administration tasks for the Pentaho BA Server and DI Server. (Please see the Security Guide for detailed instructions on implementing different user authentication methods).
  • Business Analytics Troubleshooting Guide
    A collection of troubleshooting topics from all other Pentaho guides. You may find this useful if you have encountered some kind of error but don't know where in Business Analytics to look for the root cause.
  • Business Analytics Security Guide
    Instructions and guidance for implementing a different user authentication method, or for implementing SSL on the BA Server. Covers Active Directory, LDAP, single sign-on, and custom JDBC authentication.
  • Business Analytics Performance-Tuning Guide
    Guidance and instructions for improving performance in most areas of Business Analytics. Covers modification of Business Analytics, guidelines for content streamlining, application server clustering, and advice on performance monitoring and testing.
  • Pentaho Data Integration Administrator's Guide
    Explains system configuration and administration tasks for the DI Server.

Developer Guides

  • Customizing Pentaho Business Analytics
    Instructions for localization and basic customization of the Pentaho User Console, including Pentaho Analyzer, Interactive Reporting, and Dashboard Designer.
  • Creating Pentaho Dashboards
    Design theory for creating dashboards that use Pentaho content. Covers Pentaho Dashboard Designer, Community Dashboard Framework (CDF), and basic guidance for custom JSPs.
  • Integrating With the BA Server
    Code samples and URL parameter reference material that shows how to interact with or embed (in an existing Web application) Pentaho Analyzer, Dashboard Designer, and Interactive Reporting.
  • Creating Action Sequences
    Reference material, guidance, and code samples for creating action sequences to run on the Pentaho BI Platform. Includes user documentation for Pentaho Design Studio.
  • Extending and Embedding Pentaho Data Integration
    Instructions, Java classes and methods, as well as Eclipse-based sample plugin projects that show you how to programmatically extend PDI functionality or embed the PDI engine into your own applications.

