GAE paging

Hi all! There's been a lot of good discussion here about doing paging efficiently, as opposed to just using offset. We've been discussing efficient paging internally too. We hope to eventually provide something built in, but in the short term we have other priorities. Still, we've come up with some interesting ideas, and I'd like to share one of them here.

Most of the efficient paging techniques discussed here have been based on sorting and filtering on a unique property. However, now that we have __key__ filtering, http://code.google.com/appengine/docs/datastore/queriesandindexes.html#Queries_on_Keys , we have a built-in unique field that we can filter and sort on. We can use that to support efficient paging in any query, on any schema, purely in user land, without requiring extra properties or other schema changes. (It will sometimes require a new composite index, but that's less invasive than requiring a new property.)

The executive summary is, when you first run the query, if it doesn't already have a sort order on __key__, append one. Then, when you run the query, save the last entity as a bookmark. To pick up from there later, set up the original query, add filters on all of the sort order properties so that it starts immediately after the bookmark entity.

The full design is below. We may provide some kind of built-in paging support in the future, but it won't be this. Given that, I'd love to see someone pick this up and implement it as an open source project!


Bookmarks
==
A bookmark is the last result entity that was returned to the app for a given query. That bookmark entity is serialized and base64-encoded so that apps can pass it around in URL query parameters, cookies, etc. To keep bookmarks reasonably sized, raw properties and properties not involved in the query should be stripped from the bookmark entity before serializing and encoding it.


Queries
==
If you want to be able to resume a query at a bookmark, you need to know that when you first run the query. That's because you need to append an ascending __key__ sort order to the query if it's not already there.

To resume a query at a bookmark, first, decode the bookmark, validate it, and check that it has all of the properties needed by the query. Then, use the algorithm below to generate and run a series of derived queries that transparently resume the query at the bookmarked entity.

Yes, queries plural. If your query has a sort order, you'll need to run two queries in serial to resume from a bookmark, due to the single inequality filter property limit. More accurately, if your query has n sort orders, you'll need to run n + 1 queries.

Also, as mentioned above, making a query bookmarkable sometimes require extra composite indices that the corresponding non-bookmarkable query wouldn't require. This is due to the extra __key__ sort order.


Resuming from a bookmark
==
Here's Python-like pseudocode for resuming a query at a bookmark. It generates a series of queries derived from the original query that, when run in order, return the original query's results starting at the bookmark. (The indentation is meaningful, so you might want to turn on a fixed-width font in your viewer here.)

  given a query and a bookmark entity:
    assert that all of the properties in the query are also in the bookmark entity
  
    # massage the query. this may add extra sort orders, but won't remove the
    # existing ones, so it doesn't change the query's invariants.
    if the query has an inequality filter but no sort order:
      add an ASC sort order on the inequailty filter property
  
    if the query doesn't have a sort order on __key__:
      append an ASC sort order on __key__
  
    # prepare the original query as a template for the derived queries
    if the query has inequality filters:
      remove them and record them
  
    for prop in the query's sort orders:
      add a new filter to the query: "prop = [bookmark entity's value for prop]"
  
    remove the query's sort orders, except for __key__, and record them
  
    # generate the derived queries
    make an empty FIFO queue for the derived queues
    for prop in reversed(original query's sort orders):
      make a deep copy of the original query
      on that deep copy:
        remove the = filter on prop
  
        replace it with a new filter:
          if original sort order on prop was ASC:
            operator = ">"
          else:
            operator = "<"
          the new filter is "prop [operator] [bookmark entity's value for prop]"
  
        also, if there were other inequality filters on prop, add them back too
        also add back the sort order on prop, at the beginning of the sort orders
        enqueue this query

To run the query starting from a bookmark, unwind the queue, ie pop the derived queries off one at a time, run them, and return their results. Then each query runs out of results, pop the next, and so on.

Also, note that ancestor is unrelated. If it's specified on the original query, it should be included in the bookmarkable starting query and all of the derived queries, but it doesn't otherwise affect the algorithm otherwise.


Examples
==
(Again, turn on a fixed-width font in your viewer here.)

| *Original starting query*             | *Bookmarkable starting query*                          | *Derived queries for starting from bookmark entity B*          |

| SELECT * FROM Foo                     | SELECT * FROM Foo ORDER BY __key__ ASC                 | SELECT * FROM Foo WHERE __key__ > B ORDER BY __key__ ASC       |

| WHERE x = 0                           | WHERE x = 0 ORDER BY __key__ ASC                       | WHERE x = 0 AND __key__ > B ORDER BY __key__ ASC               |

| WHERE x > 0                           | WHERE x > 0 ORDER BY x ASC, __key__ ASC                | WHERE x = B.x AND __key__ > B ORDER BY __key__ ASC             |
|                                       |                                                        | WHERE x > B.x ORDER BY x ASC, __key__ ASC                      |

| WHERE x = 0 AND y > 0                 | WHERE x = 0 AND y > 0 ORDER BY y ASC, __key__ ASC      | WHERE x = 0 AND y = B.y AND __key__ > B ORDER BY __key__ ASC   |
|                                       |                                                        | WHERE x = 0 AND y > B.y ORDER BY y ASC, __key__ ASC            |

| WHERE x > 0 AND x < 9                 | WHERE x > 0 AND x < 9 ORDER BY x ASC, __key__ ASC      | WHERE x = B.x AND __key__ > B ORDER BY __key__ ASC             |
|                                       |                                                        | WHERE x > B.x AND x < 9 ORDER BY x ASC, __key__ ASC            |

| WHERE __key__ > A AND __key__ < Z     | WHERE __key__ > A AND __key__ < Z ORDER BY __key__ ASC | WHERE __key__ > B AND __key__ < Z ORDER BY __key__ ASC         |

| ORDER BY x ASC                        | ORDER BY x ASC, __key__ ASC                            | WHERE x = B.x AND __key__ > B ORDER BY __key__ ASC             |
|                                       |                                                        | WHERE x > B.x ORDER BY x ASC, __key__ ASC                      |

| ORDER BY x DESC                       | ORDER BY x DESC, __key__ ASC                           | WHERE x = B.x AND __key__ > B ORDER BY __key__ ASC             |
|                                       |                                                        | WHERE x < B.x ORDER BY x DESC, __key__ ASC                     |

| ORDER BY __key__ ASC                  | ORDER BY __key__ ASC                                   | WHERE __key__ > B ORDER BY __key__ ASC                         |

| ORDER BY __key__ DESC                 | ORDER BY __key__ DESC                                  | WHERE __key__ < B ORDER BY __key__ DESC                        |

| ORDER BY x ASC, y DESC                | ORDER BY x ASC, y DESC, __key__ ASC                    | WHERE x = B.x AND y = B.y AND __key__ > B ORDER BY __key__ ASC |
|                                       |                                                        | WHERE x = B.x AND y < B.y ORDER BY y DESC, __key__ ASC         |
|                                       |                                                        | WHERE x > B.x ORDER BY x ASC, y DESC, __key__ ASC              |

| ORDER BY x ASC, __key__ DESC          | ORDER BY x ASC, __key__ DESC                           | WHERE x = B.x AND __key__ < B ORDER BY __key__ DESC            |
|                                       |                                                        | WHERE x > B.x ORDER BY x ASC, __key__ DESC                     |

| WHERE x = 0 ORDER BY y DESC           | WHERE x = 0 ORDER BY y DESC, __key__ ASC               | WHERE x = 0 AND y = B.y AND __key__ > B ORDER BY __key__ ASC   |
|                                       |                                                        | WHERE x = 0 AND y < B.y ORDER BY y DESC, __key__ ASC           |

| WHERE x > 0 AND x < 9 ORDER BY x DESC | WHERE x > 0 AND x < 9 ORDER BY x DESC, __key__ ASC     | WHERE x = B.x AND __key__ > B ORDER BY __key__ ASC             |
|                                       |                                                        | WHERE x < B AND x > 0 ORDER BY x DESC, __key__ ASC             |

你可能感兴趣的:(python,IE,Google,UP,GAE)