Endeca Dgraph vs Agraph

Endeca offers two basic architectures for implementing a solution: a Dgraph deployment or an Agraph deployment.


In a Dgraph deployment, the information for all records in the index is contained in a single set of index files that can run on a single machine. Even if multiple machines run the index for load balancing, each one holds a copy of the same complete index.


By contrast, an Agraph (“A” for “aggregated”) takes a request from a client, forwards it to multiple Dgraphs running on other machines, aggregates their results, and sends the aggregated response back to the client. In an Agraph deployment, each component Dgraph contains a different, mutually exclusive portion of the total index data. Agraphs have some minor limitations compared to Dgraphs: for example, they do not support relevance ranking for dimension search, and they do not support the Static relevance ranking module. Apart from these specific features and some minor performance degradation, the functionality of an Agraph is the same as that of a Dgraph.
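
The fan-out and aggregation described above can be illustrated with a small scatter-gather sketch. This is not Endeca code and does not use any Endeca API: `PartitionStub`, `aggregate_search`, and the sample records are hypothetical names invented for illustration, and the matching logic is a toy substring-free word match. It only shows the general shape of the pattern, where each partition holds a disjoint slice of the records and an aggregator merges the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


class PartitionStub:
    """Hypothetical stand-in for one component Dgraph.

    Each instance holds a disjoint slice of the records (record id -> text).
    """

    def __init__(self, records):
        self.records = records

    def search(self, term):
        # Toy matching: a record matches if the term appears as a whole word.
        term = term.lower()
        return [
            record_id
            for record_id, text in self.records.items()
            if term in text.lower().split()
        ]


def aggregate_search(partitions, term, limit=10):
    """Send the query to every partition in parallel and merge the results."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partial_results = list(pool.map(lambda p: p.search(term), partitions))

    # Partitions are mutually exclusive, so merging is a plain concatenation.
    merged = sorted(rid for ids in partial_results for rid in ids)
    return {"total": len(merged), "records": merged[:limit]}


# Two partitions, each holding a different part of the data set.
partitions = [
    PartitionStub({"r1": "red wine glass", "r2": "white wine"}),
    PartitionStub({"r3": "wine rack", "r4": "beer mug"}),
]
response = aggregate_search(partitions, "wine")
print(response)
```

Because the partitions do not overlap, the aggregator never has to deduplicate records; it only has to wait for every partition to answer, which is one plausible source of the extra latency mentioned above.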

The decision between a Dgraph and an Agraph deployment can depend on several factors, but it is driven primarily by the size and nature of your data set. A Dgraph deployment is preferable whenever possible, since it is significantly less complex and retains all of Endeca’s features. An Agraph deployment is more complex, but it can handle larger amounts of data and can grow more easily (by adding component Dgraphs) as the data set increases. The rule of thumb is to use a Dgraph as long as all the data you expect to have for the foreseeable future will fit in one.


Unfortunately, there are no hard and fast rules about how many records or how much data can fit in a single Dgraph. The overall capacity of the Dgraph is limited by the amount of memory available in your deployment environment, and the amount of memory your index requires depends heavily on the nature of the data in it. For instance, if you are indexing metadata only, such as for a large product catalog, you are likely to fit many more records in the same amount of memory than if you are indexing the full text of large document bodies. Since the capacity of a Dgraph depends on the available memory, the underlying hardware architecture becomes an important consideration.
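
As a rough illustration of that reasoning, the sketch below turns the rule of thumb into back-of-the-envelope arithmetic. Every number and name in it is an assumption made up for the example: the per-record sizes, the `overhead_factor`, and the usable memory figure do not come from Endeca documentation, and real index size does not necessarily scale linearly with record count. Treat it as a first guess to be replaced by measurements on a real index.

```python
import math

GIB = 1024 ** 3


def estimate_index_bytes(record_count, avg_bytes_per_record, overhead_factor=1.5):
    """Very rough index size: raw data size times an assumed overhead factor."""
    return int(record_count * avg_bytes_per_record * overhead_factor)


def partitions_needed(index_bytes, usable_memory_bytes):
    """How many machines of the given memory size the index would need."""
    return max(1, math.ceil(index_bytes / usable_memory_bytes))


def recommend(record_count, avg_bytes_per_record, usable_memory_bytes):
    """Apply the rule of thumb: a single Dgraph if it fits, otherwise an Agraph."""
    size = estimate_index_bytes(record_count, avg_bytes_per_record)
    needed = partitions_needed(size, usable_memory_bytes)
    if needed == 1:
        return "Dgraph"
    return f"Agraph with {needed} component Dgraphs"


usable = 24 * GIB  # assumed memory available to the index on one machine

# Metadata-only catalog: assumed 2 KiB per record.
catalog = recommend(5_000_000, 2 * 1024, usable)

# Full-text documents: assumed 40 KiB per record.
documents = recommend(5_000_000, 40 * 1024, usable)

print(catalog)
print(documents)
```

With these made-up figures, the same five million records fit in a single Dgraph as catalog metadata but call for an Agraph as full-text documents, which is the contrast the paragraph above describes. Project the record count forward before deciding, since the rule of thumb covers the foreseeable future rather than only the current data set.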
