http://www.bluefishgroup.com/library/2002/performance-tips/
Note: This tip only increases performance on pre-4i docbases. In most Documentum applications, you will find yourself writing code to find a set of attributes for a certain object. You will know the ID of the object, so you will issue a query with r_object_id in the where clause. The problem with this is that the r_object_id attribute is a string field that is not indexed in the underlying database. This means that if you query on r_object_id, the database must do a full tablescan to find the rows you are looking for. However, Documentum also stores the r_object_id in an indexed column. It converts the string to an integer and stores it in the r_object_id_i column. Therefore, searching on r_object_id_i is much faster than searching on r_object_id. If you don’t know the integer representation of a specific r_object_id value, you can convert it using the DQL ID operator.
Slow:
SELECT * FROM dm_document WHERE r_object_id = '090f42418000472f'
Fast:
SELECT * FROM dm_document WHERE r_object_id_i = ID('090f42418000472f')
On a docbase with 3600 objects, the first query took .620 seconds and the second query took .060 seconds.
When you create a custom object type with its own unique attributes, you are probably going to issue queries against some of those attributes. If your application queries against the same one or two attributes often (like part number or employee number), you can speed up performance dramatically by adding a database index to the underlying database on these one or two columns.
To do this, modify the database directly, or use Documentum’s built-in MAKE_INDEX
method. The syntax is:
make_index with type_name = '{object type}', attribute = '{attribute name}'
For performance reasons, it is best to keep your object hierarchy as shallow as possible. The reason for this is that each level of the object hierarchy is stored in a separate table in the database. In order to manipulate an object, the Documentum server must join that object type’s tables with the tables of all of the other object types in the hierarchy above it. The more levels in your hierarchy, the more tables that must be joined together. And database joins are very expensive.
For example the following object hierarchy must do 4 joins (one for each level) when you query on plant_maintenance_report.
dm_document
|
|
Report
|
|
technical_report
|
|
plant_maintenance_report
It is advisable to try to collapse this hierarchy. The typical way to do this is to add all the custom attributes to the report object and add an extra attribute that identifies the type of report. You can still query for plant maintenance reports by using a where clause like this: where report_type = 'Plant Maintenance Report'
. The resulting type hierarchy will look like this.
dm_document
|
|
Report
This is one of the less intuitive performance tricks, but it’s true. It is slower to constrain your search to a folder then it is to search the entire docbase. It is especially slow to issue a query that descends through all the sub-folders of a folder. Unless you have a compelling reason to search inside a folder (for example, if the only thing you know about an object is where it lives), don’t do it.
Because repeating attributes are kept in their own table in the database, when you query for a repeating attribute (such as keywords or r_version_label), Documentum must do a database join, which is slow. If you can avoid including repeating attributes in both your select list and your where clause, you can speed up your query.
In some cases, it is faster to search on a custom object type than it is to search on a parent object type. This is true when the total number of objects in the custom object type table is a small percentage of the number of objects in the supertype table. For example, assume that you have a custom object type named report
that is inherited from dm_document
. If there are 100,000 dm_documents and only 5000 reports, it will be faster to say select * from report
than it will to say select * from dm_document
.
Imagine that you are writing an application that generates a custom report. In this report, you will query the docbase for all the documents that are owned by the current user, and you will display the object name and title. There are two ways to go about this.
get
API to get and display the object name and title. The second method here is much faster because Documentum must only issue one query to the docbase. In the first method Documentum must issue one query to get the r_object_id and then for each item in the result set, it must issue another query to retrieve the object name and id.
When you issue a query in Documentum, Documentum does not know if the query will insert, update or delete any rows in the database. Therefore, it will open a special connection to the database just in case the query you are issuing need to update the database.
In order to issue a query in Documentum, the Documentum server must open a connection to Oracle. Since it does this all the time, it keeps a pool of connections available and shares these connections whenever possible. However, if a query is going to insert, update, or delete a row in the database, it can not share a connection with another query.
If you could somehow tell Documentum that your query is just a normal select query, it will share a connection, which is faster and uses less resources. You can tell Documentum that your query is a read-only query by using the readquery
API or the read query flag on the execquery
API. Here are two examples:
readquery,c,select * from dm_document
execquery,c,T,select * from dm_document