原文地址:http://developer.teradata.com/database/articles/say-yes-to-no-primary-index-no-pi-tables
Say Yes to No Primary Index (No PI) Tables
The purpose of the new Teradata 13.0 feature that allows you to create tables with no primary index is improved performance of FastLoad and Teradata Parallel Data Pump Array INSERT data loading operations.
Because there is no primary index for the rows of a NoPI table, its rows are not hashed to an AMP based on their primary index value. Instead, Teradata Database either hashes on the Query ID for a row, or it uses a different algorithm to assign the row to its home AMP once it reaches the AMP onto which it has been FastLoaded. The Teradata Database then generates a RowID for each row in a NoPI table by randomly selecting an arbitrary hash bucket that an AMP owns and using it to generate a RowID.
The result is faster and more efficient data loading.
NoPI tables are particularly useful as staging tables for bulk data loads. When a table has no primary index, its rows can be dispatched to any given AMP arbitrarily, so the system can load data into a staging table faster and more efficiently using FastLoad or Teradata Parallel Data Pump Array INSERT operations.
By storing bulk loaded rows on any arbitrary AMP, the performance impact for both CPU and I/O is reduced significantly. After having been received by Teradata Database, all of the rows can be appended to a NoPI table without being checked for duplicates or needing to be redistributed to their hash-owning AMPs.
Because there is no requirement for such tables to maintain their rows in any particular order, the system need not sort them. The performance advantage realized from NoPI tables is achieved optimally for applications that load data into a staging table, which must first undergo a conversion to some other form, and then be redistributed before they are stored in a secondary staging table.
Using a NoPI table as a staging table for such applications avoids the row redistribution and sorting required for primary-indexed staging tables. Another advantage of NoPI tables is that you can quickly load data into them and be finished with the acquisition phase of the utility operation, which frees client resources for other applications.
NoPI tables are also useful as so-called sandbox tables when an appropriate primary index has not yet been defined for the primary-indexed table they will eventually populate. This use of a NoPI table enables you to experiment with several different primary index possibilities before deciding on the most optimal choice for your particular application workloads.
The syntax for the CREATE TABLE statement has been changed to permit user data tables to be created without a primary index. Such tables are referred to as NoPI (No Primary Index) Tables. A NoPI table can be created in several different ways:
• When the newly supported NO PRIMARY INDEX clause is explicitly specified in the CREATE TABLE statement as shown in the example below, table Sales_Temp1 will be created as a NoPI table.
CREATE MULTISET TABLE Sales_Temp1, FALLBACK,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT
(
ItemNbr INTEGER NOT NULL,
SaleDate DATE FORMAT 'MM/DD/YYYY' NOT NULL,
ItemCount INTEGER)
NO PRIMARY INDEX;
• When the newly supported NO PRIMARY INDEX clause is explicitly specified in the CREATE TABLE AS statement as shown in the example below, table Sales_Temp2 will be created as a NoPI table. In this example, table Sales acting as the source table can be a PI table or a NoPI table.
CREATE MULTISET TABLE Sales_Temp2 AS (SELECT * FROM Sales)
WITH DATA NO PRIMARY INDEX;
• When none of PRIMARY INDEX, NO PRIMARY INDEX, PRIMARY KEY and UNIQUE Constraint is specified in the CREATE TABLE statement as shown in the example below, table Sales_Temp3 will be created based on the setting of the new DBSControl General Field, PrimaryIndexDefault. If PrimaryIndexDefault is set to N, Sales_Temp3 will be created as a NoPI table.
CREATE MULTISET TABLE Sales_Temp3, FALLBACK,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT
(
ItemNbr INTEGER NOT NULL,
SaleDate DATE FORMAT 'MM/DD/YYYY' NOT NULL,
ItemCount INTEGER);
• When neither PRIMARY INDEX nor NO PRIMARY INDEX is specified in the CREATE TABLE AS statement as shown in the example below, table Sales_Temp4 will be created based on the setting of PrimaryIndexDefault. If PrimaryIndexDefault is set to N, Sales_Temp4 will be created as a NoPI table. In this example, table Sales acting as the source table can be a PI table or a NoPI table.
CREATE MULTISET TABLE Sales_Temp4 AS (SELECT * FROM Sales) WITH DATA;
The new DBSControl General Field, PrimaryIndexDefault, was introduced to control whether a table is created with or without a primary index when none of PRIMARY INDEX, NO PRIMARY INDEX, PRIMARY KEY and UNIQUE Constraint is specified. The setting of PrimaryIndexDefault is as follows:
D - This is the default setting. This setting works the same as the P setting.
P - The first column in the table will be selected as the non-unique primary index. This setting works the same as that in the past when PRIMARY INDEX was not specified.
N – The table will be created without a primary index (NoPI table).
With the NoPI Table feature, the system default setting essentially remains the same as that in previous Teradata releases where the first column was selected as the non-unique primary index when the user did not specify a PRIMARY INDEX or a PRIMARY KEY or a UNIQUE Constraint. Users can change the default setting for PrimaryIndexDefault to P or N and not rely on the system default setting which can possibly be changed in a future release.
To change from a NoPI table to PI table, users can run one of the following:
• CREATE TABLE AS - The example shown below will create table Sales_Temp_PI with primary index ItemNbr and all of the data from NoPI table Sales_Temp1.
CREATE MULTISET TABLE Sales_Temp_PI AS (SELECT * FROM Sales_Temp1) WITH DATA PRIMARY INDEX (ItemNbr);
• CREATE TABLE and INSERT-SELECT - The example shown below will create table Sales_Temp_PI with primary index ItemNbr and then copy over all of the data from NoPI table Sales_Temp1.
CREATE MULTISET TABLE Sales_Temp_PI, FALLBACK,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT
(
ItemNbr INTEGER NOT NULL,
SaleDate DATE FORMAT 'MM/DD/YYYY' NOT NULL,
ItemCount INTEGER)
PRIMARY INDEX (ItemNbr);
INSERT INTO Sales_Temp_PI SELECT * FROM Sales_Temp1;
Users can also change from a PI table to a NoPI table using one of the above approaches.
The rules and limitations for NoPI tables are the same as those for primary-indexed tables with the following exceptions:
• You cannot create a NoPI queue table.
• You cannot create a NoPI error table.
• You cannot create a NoPI table as a SET table. The default table type for NoPI tables in both Teradata and ANSI session modes is always MULTISET.
• If neither PRIMARY INDEX (column_list) nor NO PRIMARY INDEX is specified explicitly in a CREATE TABLE request, then whether the table is created with or without a primary index generally depends on the setting of the DBS Control flag DisableNoPI
• NoPI tables cannot specify a permanent journal.
• NoPI tables cannot specify an identity column.
• Hash indexes cannot be defined on NoPI tables because they inherit the primary index of their underlying base table, and NoPI tables have no primary index.
• SQL UPDATE triggers cannot update a NoPI table.
• SQL UPDATE and UPDATE (Upsert Form) requests cannot update a NoPI target table. SQL UPDATE and UPDATE (Upsert Form) requests can update a primary-indexed target table from a NoPI source table.
• SQL MERGE requests cannot update or insert into a NoPI target table.
SQL MERGE requests can update or insert into a primary-indexed target table from a NoPI source table.
• You cannot load rows into a NoPI table using the MultiLoad utility.
You can define all of the following commonly used table features for NoPI tables:
• Fallback
• Secondary indexes
• Join indexes
• UNIQUE column constraints
• CHECK constraints
• PRIMARY KEY and FOREIGN KEY constraints
• Triggers
• BLOB and CLOB columns.
You can define any of the following table types as NoPI tables:
• Ordinary base data tables
• Global temporary tables
• Volatile tables
You cannot define any of the following table types as NoPI tables:
• Queue tables
• Error tables
• Global temporary trace tables6
After a NoPI staging table has been populated, you should run one of the following types of DML request to move the NoPI staging table source rows to a primary-indexed target table:
• INSERT … SELECT
• MERGE
• UPDATE … FROM
For example, you can populate data into a NoPI table via single-statement INSERT or multi-statement INSERT. The latter can be done with TPump Array INSERT which is more efficient and also an alternative to FastLoad for staging a table.
The example below shows a single-statement INSERT into NoPI table Sales_Temp1:
INSERT INTO Sales_Temp1 (100, '11/03/2008', 10);
INSERT-SELECT into a NoPI target table is also allowed. The source table can be a PI table or a NoPI table.
INSERT INTO Sales_Temp1 SELECT * FROM Sales;
You can use the following DML statements to manipulate or retrieve NoPI table rows prior to
moving them to their primary-indexed target table:
• DELETE
• INSERT
• SELECT
You cannot use UPDATE or UPDATE (Upsert Form) requests to change data in a NoPI table.
The following is a list of other operations and utilities and how they operate on a NoPI table:
Feature Impact
UPSERT - Not supported. This is a composite of UPDATE and INSERT operations such that if the UPDATE fails because the target row does not exist, the INSERT is automatically executed. The UPDATE part of an UPSERT is required to fully specify the primary index. Since there is no primary index in a NoPI table, UPSERT is not supported.
MERGE-INTO - Not supported when the target table is NoPI. This is currently composed of three variant forms – UPSERT, UPDATE-only and INSERT-only. UPSERT is not supported for the same reason discussed above. UPDATE is restricted in the initial release and therefore the UPDATE-only form of MERGE-INTO is not supported. INSERT-only is also not supported but it is very much like INSERT-SELECT which is supported.
MultiLoad - Not supported. FastLoad and TPump are utilities that are normally used to stage a table.
Restore/Copy Supported. The Restore/Copy processing on a NoPI table is very similar to that on a PI table. The main difference is that a NoPI table normally has one hash bucket per AMP. This is like a very non-unique NUPI table. So when the data from a NoPI table is restored or copied to a different configuration with more AMPs, there may be some AMPs that will not have any data at all. So, Restore/Copy can cause data skewing for a NoPI table.
FastExport - Supported. Data in a NoPI table can be FastExported the same way as for a PI table.
CheckTable - Supported. The LEVEL HASH check which is done with either an explicit LEVEL HASH command or implicitly in a LEVEL THREE command works differently on a PI table and a NoPI table. On a PI table, this check regenerates the rowhash for each data row based on the primary index and then compares with the rows on disk. On a NoPI table, this check looks at the rowhash for each data row and verifies that the hash bucket that is part of the rowhash correctly belongs to the AMP.
Table Rebuild - Supported. Table Rebuild processing on NoPI table is the same as that on a PI table. The table has to have fallback protection for Table Rebuild to run. Rows in a NoPI table have rowhash and therefore can be rebuilt the same way as those on a PI table.
Reconfig - Supported. The Reconfig processing on a NoPI table is very similar to that on a PI table. The main difference is that a NoPI table normally has one hash bucket per AMP. This is like a very skewed non-unique NUPI table. So when a system is reconfigured to have more AMPs, there may be some AMPs that will not have any data at all for a NoPI table. So, Reconfig can cause data skewing for NoPI table.
Refer to the following documention and Teradata Database Orange Book for information about, and usage suggestions for, tables with no primary index.:
Teradata Orange Book: No Primary Index (NoPI) Table
Teradata 13.0 Database Design publication, B035-1094-098A, located at:
http://www.info.teradata.com/DataWarehouse/eTeradata-BrowseBy-Results.cfm?pl=&PID=&title=%25&release=&kword=DDESIGN&kword2=TNTDBD&sbrn=7&nm=Database+Design