The test results in this white paper demonstrate how the performance characteristics of SharePoint lists containing large numbers of items differ depending on the data access method used to present list contents. They also show how to optimize list performance by limiting the number of items that appear in a list and by choosing the most appropriate method of retrieving list contents.
The tests upon which the results in this white paper are based were conducted by using artificially created test data and simulated users. Real-world results may vary depending on hardware, number of concurrent users, farm configuration, and user operations being performed.
There is documented guidance for Microsoft Office SharePoint Server 2007 regarding the maximum size of lists and list containers. For typical customer scenarios in which the standard Office SharePoint Server 2007 browser-based user interface is used, the recommendation is that a single list should not have more than 2,000 items per list container. A container in this case means the root of the list, as well as any folders in the list; a folder is a container because other list items are stored within it. A folder can contain items from the list as well as other folders, and each subfolder can contain more of each, and so on. For example, you could have a list with 1,990 items in the root of the list plus 10 folders that each contain 2,000 items, and so on. The maximum number of items supported in a list with recursive folders is 5 million items.
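The container arithmetic in the example above can be checked with a small sketch. The `Folder` class and its methods are hypothetical names introduced only for illustration; they are not part of the SharePoint object model. The sketch also assumes that subfolders count toward a container's 2,000-entry recommendation, which is what the 1,990 items plus 10 folders example suggests.

```python
from dataclasses import dataclass, field
from typing import List

CONTAINER_LIMIT = 2000  # recommended maximum entries per container (from the text)


@dataclass
class Folder:
    """A list container: the list root or a folder (hypothetical model)."""
    items: int = 0
    subfolders: List["Folder"] = field(default_factory=list)

    def entries(self) -> int:
        # Assumption: subfolders count toward this container's limit.
        return self.items + len(self.subfolders)

    def total_items(self) -> int:
        # Items stored here plus items in every nested folder.
        return self.items + sum(f.total_items() for f in self.subfolders)

    def oversized(self) -> int:
        # Number of containers in the tree that exceed the recommendation.
        own = 1 if self.entries() > CONTAINER_LIMIT else 0
        return own + sum(f.oversized() for f in self.subfolders)


# The example from the text: 1,990 items in the root plus 10 full folders.
root = Folder(items=1990, subfolders=[Folder(items=2000) for _ in range(10)])
print(root.entries())      # 2000
print(root.total_items())  # 21990
print(root.oversized())    # 0
```

Under these assumptions the list holds 21,990 items in total while no single container goes past the 2,000-entry recommendation.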
In Office SharePoint Server 2007, virtually all end-user data is stored in a list. A document library, for example, is just a specialized list. The same is true for calendars, contacts, and other interfaces; they are all just customized versions of the basic SharePoint list, also referred to as an SPList. The individual items in the list are referred to as list items generally, or as an SPListItem in an SPListItemCollection in the Office SharePoint Server 2007 object model. The findings in this article are equally important across all of the ways in which you store and work with data in an Office SharePoint Server 2007 site.
There are some scenarios in which you want to take advantage of the features of Office SharePoint Server 2007, but need to exceed the limit of 2,000 items per container. If you write your own interface for managing and retrieving the data, it’s quite possible that you can go past this limit without an adverse impact on farm performance. You may be able to manage larger lists to some extent by using views within Office SharePoint Server 2007 that are filtered such that there are never more than 2,000 items returned. Filtered views provide better performance than just trying to view one large flat list, but are not as efficient as breaking down the list into different containers if you are using the predefined browser-based Office SharePoint Server 2007 interface.
If you develop your own interface, there are several different ways to retrieve list data, each with different performance characteristics. Some data access methods perform very well, but are only useful in a limited number of scenarios. Finally, there are also performance tradeoffs that need to be made with other data maintenance tasks in addition to data retrieval.
The tests in this white paper were conducted on a relatively underpowered Microsoft Virtual Server 2005 R2 image to show a comparison of farm performance characteristics when different data access types are used to manipulate list data. The goal of these tests was not to establish a new arbitrary limit, or to deliver a “requests per second” type number that is typically used in a load style test to show raw throughput capacity. The virtual server image was running Office SharePoint Server 2007 Enterprise Edition and had 1 gigabyte (GB) of allocated RAM. Virtual Server was running on a host machine with a 2 gigahertz (GHz) dual-core processor and 2 GB of RAM.
Baseline tests were done first with a list containing 1,500 items. The list schema looked like this:
Title: Single line of text
Expense Category: Choice (Meals, Travel, Hotel, Supplies)
Amount: Currency
Deductible: Yes/No
Created By: Person or Group
Modified By: Person or Group
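As a rough illustration of what artificially created test data for this schema might look like, the sketch below generates 1,500 in-memory records. It is not the tool used for the tests; `ExpenseItem`, `make_items`, the user names, and the amount range are all assumptions made for the example.

```python
import random
from dataclasses import dataclass

CATEGORIES = ["Meals", "Travel", "Hotel", "Supplies"]  # choices from the schema


@dataclass
class ExpenseItem:
    """One row matching the list schema (hypothetical in-memory stand-in)."""
    title: str
    expense_category: str
    amount: float      # a Currency column; float is good enough for a sketch
    deductible: bool
    created_by: str
    modified_by: str


def make_items(count, seed=0):
    rng = random.Random(seed)                   # fixed seed: repeatable data
    users = [f"user{n}" for n in range(1, 11)]  # assumed test accounts
    items = []
    for i in range(1, count + 1):
        author = rng.choice(users)
        items.append(ExpenseItem(
            title=f"Expense {i}",
            expense_category=rng.choice(CATEGORIES),
            amount=round(rng.uniform(1, 500), 2),  # assumed value range
            deductible=rng.random() < 0.5,
            created_by=author,
            modified_by=author,
        ))
    return items


baseline = make_items(1500)  # the baseline list size from the text
print(len(baseline))
```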
In the baseline tests, no columns were indexed; measurements were taken just to provide a relative value that could be used after the number of items in the list exceeded recommended boundaries. In the tests against a very large list, one set was done with no columns being indexed and a second round was done after configuring the Expense Category column to be indexed. The query that was executed in each one of the tests used a WHERE clause against the Expense Category field looking for the first 100 items that contained “Supplies.”
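If the query described above were expressed in CAML, the query language used by SharePoint, its filter might look like the output of the following sketch. The source does not show the actual query text, so several details are assumptions: the internal field name `Expense_x0020_Category`, the use of an `Eq` comparison for "contained Supplies", and the helper name `eq_query`.

```python
import xml.etree.ElementTree as ET


def eq_query(field_name, value_type, value):
    """Build a CAML <Where><Eq> filter for one field (illustrative helper)."""
    where = ET.Element("Where")
    eq = ET.SubElement(where, "Eq")
    ET.SubElement(eq, "FieldRef", Name=field_name)
    val = ET.SubElement(eq, "Value", Type=value_type)
    val.text = value
    return ET.tostring(where, encoding="unicode")


# Assumed internal name: SharePoint encodes the space as _x0020_.
caml = eq_query("Expense_x0020_Category", "Choice", "Supplies")
row_limit = 100  # "the first 100 items" from the text
print(caml)
print(row_limit)
```

In the object model, a string like this would typically be assigned to the `Query` property of an `SPQuery`, with the 100-item cap set through its `RowLimit` property.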
To provide another point of comparison, the data being selected was based on ID value in the tests against the very large list. The ID is a built-in numeric indexed field in all SharePoint lists that is well suited to queries. The query in this case was constructed with a WHERE clause that retrieved items where the ID ranged from 44,500 through 44,599.
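The ID range filter could be sketched the same way. This again assumes the query is written in CAML, with `Geq` and `Leq` comparisons combined under `And`; the helper name `id_range_query` is hypothetical, and the exact query used in the tests is not given in the source.

```python
import xml.etree.ElementTree as ET


def id_range_query(first_id, last_id):
    """Build a CAML filter for first_id <= ID <= last_id (illustrative)."""
    where = ET.Element("Where")
    both = ET.SubElement(where, "And")
    for op, bound in (("Geq", first_id), ("Leq", last_id)):
        comparison = ET.SubElement(both, op)
        ET.SubElement(comparison, "FieldRef", Name="ID")
        val = ET.SubElement(comparison, "Value", Type="Counter")
        val.text = str(bound)
    return ET.tostring(where, encoding="unicode")


caml = id_range_query(44500, 44599)
print(caml)
print(44599 - 44500 + 1)  # 100 IDs in the inclusive range
```

The inclusive range covers 100 ID values, which matches the 100-item size of the Expense Category query.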
Some tests were also run with the site under load. To create the load during the testing process, a LoadTest was created in the Microsoft Visual Studio .NET 2005 development system to stress test the site. Rather than specifying a fixed number of users, the test was configured as a goal-based test: a target value is defined for a particular measurement, and the test determines the number of requests required to achieve that target. In this case, the goal was to hold CPU utilization on the Office SharePoint Server 2007 computer consistently between 60 and 80 percent.
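The idea behind a goal-based test can be shown with a toy feedback loop: raise or lower the request rate until the measured value lands inside the target band. This is only a conceptual sketch, not how Visual Studio implements its load tests. The linear `cpu_percent` model, the starting rate, and the step size are invented for the example.

```python
def cpu_percent(requests_per_sec):
    """Toy server model (assumption): CPU rises linearly, capped at 100%."""
    return min(100.0, 5.0 + 0.9 * requests_per_sec)


def goal_based_load(low=60.0, high=80.0, start=10, step=5, max_rounds=100):
    """Adjust the request rate until CPU falls inside [low, high]."""
    load = start
    history = []
    for _ in range(max_rounds):
        cpu = cpu_percent(load)
        history.append((load, cpu))
        if cpu < low:
            load += step                 # below the goal: add load
        elif cpu > high:
            load = max(1, load - step)   # above the goal: back off
        else:
            break                        # inside the 60-80 percent band
    return load, cpu, history


load, cpu, history = goal_based_load()
print(load, cpu, len(history))
```

The point of the design is that the tester states the goal, here a CPU band, and the request rate becomes an output of the test instead of an input.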