First attempt
A Google search turned up the following article, which explains things rather well:
Most DBAPI implementations fully buffer rows as they are fetched - so usually, before the SQLAlchemy ORM even gets a hold of one result, the whole result set is in memory.
But then, the way Query works is that it fully loads the given result set by default before returning to you your objects.
The rationale here regards queries that are against more than just simple SELECT statements - joins to other tables which may return the same object identity multiple times in one result set (common with eager loading), the full set of rows needs to be in memory so that the correct results can be returned - otherwise collections and such might be only partially populated.
So Query offers an option to change this behavior, which is the yield_per() call http://www.sqlalchemy.org/docs/orm/query.html?highlight=yield_per#sqlalchemy.orm.query.Query.yield_per .
This call will cause the Query to yield rows in batches, where you give it the batch size.
As the docs state, this is only appropriate if you aren't doing any kind of eager loading of collections - so it's basically if you really know what you're doing.
And also, if the underlying DBAPI pre-buffers rows, there will still be that memory overhead so the approach only scales slightly better than not using it.
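To make "yield rows in batches" concrete, here is a minimal sketch at the DBAPI level using Python's built-in sqlite3 module. It is not SQLAlchemy's yield_per() itself, only the same idea one layer down: fetch a fixed number of rows per round trip with fetchmany(). The item table and the iter_in_batches helper are made up for illustration.

```python
import sqlite3


def iter_in_batches(cursor, batch_size):
    """Yield rows from an already executed cursor, fetching batch_size rows at a time."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        for row in rows:
            yield row


# Hypothetical demo table, not from the quoted article.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO item (name) VALUES (?)",
    [(f"item-{i}",) for i in range(10)],
)

cur = conn.execute("SELECT id, name FROM item ORDER BY id")
batched_ids = [row[0] for row in iter_in_batches(cur, batch_size=3)]
print(batched_ids)
```

As the quoted answer points out, whether this saves memory depends on the driver: if the DBAPI buffers the whole result set up front, batching on top of it changes little.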
So it is pretty much useless.
I hardly ever use yield_per() - instead, I use a better version of the LIMIT approach you suggest above using window functions.
LIMIT and OFFSET have a huge problem that very large OFFSET values cause the query to get slower and slower,
as an OFFSET of N causes it to page through N rows - it's like doing the same query fifty times instead of one, each time reading a larger and larger number of rows.
With a window-function approach, I pre-fetch a set of "window" values that refer to chunks of the table I want to select.
I then emit individual SELECT statements that each pull from one of those windows at a time.
The window function approach is on the wiki at http://www.sqlalchemy.org/trac/wiki/UsageRecipes/WindowedRangeQuery and I use it with great success.
Also note, not all databases support window functions - you need PG, Oracle, or SQL Server. IMHO using at least Postgresql is definitely worth it - if you're using a relational database, you might as well use the best.
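The windowed approach described above can be sketched roughly as follows. This is my own simplified version in plain SQL through Python's sqlite3 module, not the code from the SQLAlchemy wiki recipe. It assumes SQLite 3.25 or newer (the first release with window functions); the item table and the window_starts / iter_windows helpers are hypothetical names.

```python
import sqlite3


def window_starts(conn, window_size):
    """Return the id that begins each window of window_size rows."""
    rows = conn.execute(
        """
        SELECT id FROM (
            SELECT id, row_number() OVER (ORDER BY id) AS rownum FROM item
        )
        WHERE (rownum - 1) % ? = 0
        ORDER BY id
        """,
        (window_size,),
    ).fetchall()
    return [r[0] for r in rows]


def iter_windows(conn, window_size):
    """Yield one list of rows per window, using range predicates instead of OFFSET."""
    starts = window_starts(conn, window_size)
    for i, start in enumerate(starts):
        if i + 1 < len(starts):
            # Every window except the last is bounded by the next window's start.
            sql = "SELECT id, name FROM item WHERE id >= ? AND id < ? ORDER BY id"
            params = (start, starts[i + 1])
        else:
            # The last window is open-ended.
            sql = "SELECT id, name FROM item WHERE id >= ? ORDER BY id"
            params = (start,)
        yield conn.execute(sql, params).fetchall()


# Hypothetical demo data: 25 rows with gaps in the id sequence.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO item (id, name) VALUES (?, ?)",
    [(i, f"item-{i}") for i in range(1, 51, 2)],
)

chunks = list(iter_windows(conn, window_size=10))
print([len(chunk) for chunk in chunks])
```

Each per-window SELECT filters on the indexed id column, so the database can seek straight to the start of the window rather than walking past all the earlier rows the way a large OFFSET does.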
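Let's summarize the article:
1 yield_per() is not much use
2 Use window functions (which are not the same thing as views)
3 Many people paginate with OFFSET, which is actually a mistake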
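The OFFSET in a LIMIT clause still scans past the skipped rows one at a time.
I have been through this myself, and OFFSET burned me badly...
- 4.0 Is a window function a view?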
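No. What people normally call a view is MySQL's VIEW.
A view is a virtual table, a logical table that holds no data of its own.
For example, we can join two tables ahead of time and save that join as a view, which makes it very convenient to use later.
Views also have the advantages of security and independence. Security, because I can set fine-grained permissions on a view to control who may see it and who may not.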
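Here is a small sketch of the "join two tables and save it as a view" idea. I am using Python's built-in sqlite3 module so that it runs anywhere; the CREATE VIEW syntax shown is also valid in MySQL. The author / post tables and the post_with_author view are invented for the example. SQLite has no GRANT, so the permission side is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (
        id INTEGER PRIMARY KEY,
        author_id INTEGER REFERENCES author(id),
        title TEXT
    );
    INSERT INTO author (id, name) VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO post (id, author_id, title) VALUES (1, 1, 'first'), (2, 2, 'second');

    -- The view stores only the query definition, not a copy of the rows.
    CREATE VIEW post_with_author AS
        SELECT post.id AS post_id, post.title, author.name AS author_name
        FROM post JOIN author ON author.id = post.author_id;
    """
)

query = "SELECT post_id, title, author_name FROM post_with_author ORDER BY post_id"
rows = conn.execute(query).fetchall()
print(rows)

# Because the view holds no data, a new row in the base table shows up immediately.
conn.execute("INSERT INTO post (id, author_id, title) VALUES (3, 1, 'third')")
rows_after = conn.execute(query).fetchall()
print(rows_after)
```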
- 4.1 What is a window function?
Documentation:
https://dev.mysql.com/doc/refman/8.0/en/window-functions.html
- 4.2 A simple explanation with examples (in Chinese)
https://www.xz-src.com/sjk/28060.html
https://www.rails365.net/articles/postgresql-de-window-functions-ba
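For a quick runnable illustration of my own (not taken from the two articles above): a window function computes a value for each row from a set of related rows, without collapsing them the way GROUP BY does. The sketch below ranks students within each subject using rank() OVER (PARTITION BY ...). It assumes SQLite 3.25+ through Python's sqlite3 module; the score table and its data are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE score (student TEXT, subject TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO score VALUES (?, ?, ?)",
    [
        ("ann", "math", 90),
        ("ben", "math", 75),
        ("cat", "math", 82),
        ("ann", "art", 60),
        ("ben", "art", 88),
    ],
)

# Every input row is kept; subject_rank is computed per subject partition.
ranked = conn.execute(
    """
    SELECT subject, student, points,
           rank() OVER (PARTITION BY subject ORDER BY points DESC) AS subject_rank
    FROM score
    ORDER BY subject, subject_rank
    """
).fetchall()

for row in ranked:
    print(row)
```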
- 4.3 So why not start using them right away?
MySQL only supports window functions from version 8.0 onward. Damn...