What is lookup transformation in informatica?
Lookup is a transformation to look up the values from a relational table/view or a flat file. The developer defines the lookup match criteria. There are two types of Lookups in Powercenter-Designer, namely; 1) Connected Lookup 2) Unconnected Lookup . Different caches can also be used with lookup like static, dynamic, persistent, and shared(The dynamic cache cannot be used while creating an un-connected lookup). Each of these has its own identification. For more details, the book "Informatica Help" can be useful.
Hope you are aware with the basics of Informatica. Now proceeding through lookup transformation.
Lookup transformation is Passive and it can be both Connected and UnConnected as well. It is used to look up data in a relational table, view, or synonym. Lookup definition can be imported either from source or from target tables.
For example, if we want to retrieve all the sales of a product with an ID 10 and assume that the sales data resides in another table called 'Sales'. Here instead of using the sales table as one more source, use Lookup transformation to lookup the data for the product, with ID 10 in sales table.
Difference between Connected and UnConnected Lookup Transformation:
1. Connected lookup receives input values directly from mapping pipeline whereas UnConnected lookup receives values from: LKP expression from another transformation.
2. Connected lookup returns multiple columns from the same row whereas UnConnected lookup has one return port and returns one column from each row.
3. Connected lookup supports user-defined default values whereas UnConnected lookup does not support user defined values.
Select dname from dept,emp where: emp.deptno=dept.deptno
Connected LKP trasformation is one which is connected to Pipe line.
Connected LKP trasformation will process each and every Row.
If you want to Use Dynamic LKP cache, use the connected LKP transformation.
If the LKP condition is not matched the LKP transformation will return the Default Value.
it cannot be called
it returns multiple values.
it can use static or dynamic cache
Unconnected LKP trasformation is one which is not connected to the Pipe line.
It should be called either from expression or Update Stragey.
It will not process each and evry row. It will return the values based expression Condition.
If no match found for the LKP condition, the LKP transformation will return Null Values.
it is a reusable trnsformation. The same LKP trnans can be called multiple times in same mapping
it will return only one value.
it can use only static cache
Below are a list of performance considerations for lookups in Informatica PowerCenter.
Performance for Lookups
I have seen people suggesting an index to improve the performance of any SQL. This suggestion is incorrect - many times. Specially when talking about indexing the condition port columns of Lookup SQL, it is far more "incorrect".
Before explaining why it is incorrect, I would try to detail the functionality of Lookup. To explain the stuff with an example, we take the usual HR schema EMP table. I have EMPNO, ENAME, SALARY as columns in EMP table.
Let us say, there is a lookup in ETL mapping that checks for a particular EMPNO and returns ENAME and SALARY from the Lookup. Now, the output ports for the Lookup are "ENAME" and "SALARY". The condition port is "EMPNO". Imagine that you are facing performance problems with this Lookup and one of the suggestion was to index the condition port.
As suggested (incorrectly) you create an index on EMPNO column in the underlying database table. Practically, the SQL the lookup executes is going to be this:
select EMPNO, ENAME, SALARY from EMP ORDER BY EMPNO, ENAME, SALARY;
The data resulted from this query is stored in the Lookup cache and then, each record from the source is looked up against this cache. So, the checking against the condition port column is done in the Informatica Lookup cache and "not in the database". So any index created in the database has no effect for this.
You may be wondering if we can replicate the same indexing here in Lookup Cache. You don't have to worry about it. PowerCenter create "index" cache and "data" cache for the Lookup. In this case, condition port data - "EMPNO" is indexed and hashed in "index" cache and the rest along with EMPNO is found in "data" cache.
I hope now you understand why indexing condition port columns doesn't increase performance.
Having said that, I want to take you to a different kind of lookup, where you would've disabled the caching. In this kind of Lookup, there is no cache. Everytime a row is sent into lookup, the SQL is executed against database. In this scenario, the database index "may" work. But, if the performance of the lookup is a problem, then "cache-less" lookup itself may be a problem.
I would go for cache-less lookup if my source data records is less than the number of records in my lookup table. In this case ONLY, indexing the condition ports will work. Everywhere else, it is just a mere chanse of luck, that makes the database pick up index.
Dynamic Lookups are used for implementing Slowly Changing dimensions. The ability to provide dynamic caching gives Informatica a definetive edge over other vendor products. In a Dynamic Lookup, everytime a new record is found (based on the lookup condition) the Lookup Cache is appended with that record. It can also update existing records in the cache with the incoming values.
More information on Informatica at ITtoolbox
The only issue with the Dynamic Lookup is that they are slow as the caches are updated frequently based on the transaction posted in the database. Please be very careful in using the dynamic lookup. For small tables are fine but for heavy table, try to avoid the dynamic lookup.
http://www.example.com link title