About the RBI method for performance testing (my translation skills are limited; please point out any errors)

Original source: http://wenku.baidu.com/view/2b18212c7375a417866f8f35.html

Rapid Bottleneck Identification: A Better Way to Do Load Testing

INTRODUCTION

You're ready to launch a critical Web application. Ensuring good application performance is crucial, but time is short. How can you optimally test the application and still meet your deadlines?

Rapid bottleneck identification (RBI) is a new testing methodology that allows quality assurance (QA) professionals to very quickly uncover Web application performance limitations and determine the impact of those limitations on the end-user experience. Developed through years of testing engagements across all types of platforms, the RBI methodology dramatically reduces load testing cycles while allowing more, and more thorough, testing. Using this approach, organizations can improve application quality, enhance the customer experience, and lower the cost of deploying new systems.

PERFORMANCE TESTING DEFINED

Performance testing can be roughly defined as “testing conducted to evaluate the compliance of a system or component with specified performance requirements.” However, every application has at least one bottleneck, and few, if any, systems ever meet initial performance requirements. To reflect this reality, let’s redefine performance testing as “testing conducted to isolate and identify the system and application issues (bottlenecks) that will keep the application from scaling to meet its performance requirements.”

This philosophical shift in perspective, from testing as an evaluation to testing as an active investigation to isolate and resolve problems, is what drove the creation of the RBI methodology. RBI combines a comprehensive understanding of bottlenecks with a refined testing methodology that enables organizations to create highly scalable Web applications.

UNDERSTANDING BOTTLENECKS, THROUGHPUT, AND CONCURRENCY

Before delving into the specifics of the RBI methodology, we must first establish a common understanding of bottlenecks, and where they are found, as well as draw a distinction between throughput and concurrency testing.

Bottlenecks: Key Performance Inhibitors

Any system resource (such as hardware, software, or bandwidth) that places defining limits on data flow or processing speed creates a bottleneck. In Web applications, bottlenecks directly affect performance and scalability by limiting the amount of data throughput or restricting the number of application connections. These problems occur at all levels of the system architecture, including the network layer, the Web server, the application server, and the database server. Historically, based on our experience testing actual customer applications, bottlenecks have been distributed across these components as shown in Figure 1.

The Compounding Impact of Testing Complexity

The testing approach you choose directly impacts the difficulty of isolating and resolving bottlenecks. Unfortunately, too many testing procedures begin with complex usage scenarios where testers try to simulate exactly how the application will be utilized in production. This may involve running several different transactions to simulate different types of users who interact with the application in different ways. This creates a significant testing roadblock, because scenarios that are higher in complexity and involve multiple different transactions introduce more bottlenecks into the test, which makes it difficult to identify root causes.

For example, the graph in Figure 2 illustrates the test results of a standard e-commerce application that bottlenecked at approximately 2,000 concurrent users. In this sample test, the usage scenarios involved browsing, searching, and adding items to a shopping cart to complete a purchase. Although there were only three transactions being tested, each transaction interacted with all levels of the application architecture, and any one of them could have caused the bottleneck. To further complicate matters, the bottleneck could also have been caused by a system issue. Ultimately, the more variables involved in a test, the more difficult it is to determine the cause of the problem.

If the problem can be in any tier of the architecture, and the likelihood of it being in any one tier is not substantially greater than in any other, where else can you look for guidance?

Two Primary Issues: Throughput and Concurrency

Throughput is the amount of data flow a system can support, measured in hits per second, pages per second, and megabits of data per second. Concurrency is the number of independent users simultaneously connected to and using an application. In our experience, a majority of all system and application performance issues result from limitations in throughput. However, concurrency issues are also critical to application performance and can be even more difficult to isolate.

Testing for Throughput

Testing for throughput involves minimizing the number of user connections to a system and maximizing the amount of work being done by those users. This pushes the system and application to capacity so that all issues will be revealed.

For throughput testing at the system level, basic files can be added to the Web and application servers for testing purposes. The load test can then be set up to request these test files to assess maximum system throughput at each tier. Typically, testers use a large image file for bandwidth tests, a small text file or image for hit rate tests, and a very simple application page (a “Hello World” page, for instance) for page rate testing. If the system does not meet basic application performance requirements when it is only serving these simple test pages, testing should cease until the system itself has been improved, either through tuning the settings, increasing the hardware capacity, or increasing the allocated bandwidth.
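The measurement behind such a system-level test can be sketched as follows. This is a minimal illustration, not the tooling the paper used: `measure_throughput`, `http_fetch`, and the injected `clock` parameter are hypothetical names, and a single back-to-back loop like this only approximates one worker of a real load generator, which would run many such loops in parallel.

```python
import time
import urllib.request
from typing import Callable, Dict


def http_fetch(url: str) -> int:
    """Request one test file and return the number of bytes received."""
    with urllib.request.urlopen(url) as response:
        return len(response.read())


def measure_throughput(
    fetch: Callable[[], int],
    duration_s: float,
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, float]:
    """Call fetch() back to back, with no think time, for duration_s seconds.

    fetch is any callable that performs one request and returns the byte
    count; the clock is injectable so the loop can be checked without
    waiting in real time.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    start = clock()
    hits = 0
    total_bytes = 0
    while clock() - start < duration_s:
        total_bytes += fetch()
        hits += 1
    elapsed = clock() - start
    return {
        "hits_per_sec": hits / elapsed,
        "megabits_per_sec": total_bytes * 8 / 1_000_000 / elapsed,
    }
```

Against a real server the call might look like `measure_throughput(lambda: http_fetch(test_file_url), 30)`, where `test_file_url` points at whichever test file (large image, small text file, or “Hello World” page) matches the limit being probed.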
Throughput testing of the actual application then involves hitting key pages and user transactions in the application itself, with limited delay between requests, to find the page-per-second capacity limit of the various functional components. Obviously, the pages or transactions with the poorest page throughput need the most tuning.

Testing for Concurrency

On the system and application levels, concurrency is limited by sessions and socket connections. Code flaws and incorrect server configuration settings can also limit concurrency. Concurrency tests involve ramping up a number of users on the system and using realistic page-delay times, at a ramp-up speed slow enough to gather useful data throughout the testing at each level of load. As with throughput testing, it is important to test the key pages and user transactions in the application under test.

The Difference Between Throughput and Concurrency Tests

The load generated from a 100 virtual user load test with 1-second think times is not equivalent to a 1,000 virtual user load test with 10-second think times. As Figure 3 illustrates, the two tests are identical in terms of throughput; however, in terms of concurrency they are vastly different. In the first scenario, the throughput test, the application bottlenecked at 50 pages per second. In the second scenario, however, a concurrency test of the same transactions, the application bottlenecked at 25 pages per second. The only differences between these two tests were the number of users on the system and the length of time those users stayed on the pages. In the throughput test with fewer users and shorter page view delays, the application had more throughput capacity; the second test shows the application was limited in its concurrency. If the testers had checked only for throughput, the concurrency issue would not have been discovered until the application was in production. Figure 4 and Figure 5 show the results of each test and highlight the importance of testing for both throughput and concurrency.
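The claim that the two tests offer the same throughput follows from a simple approximation: each virtual user requests one page per think-time cycle. The sketch below assumes that model, with server response time treated as negligible by default; the function name is illustrative and not from the paper.

```python
def offered_pages_per_sec(users: int, think_time_s: float,
                          response_time_s: float = 0.0) -> float:
    """Approximate page rate generated by `users` virtual users.

    Each user is assumed to request one page, wait for the response,
    then pause for the think time before requesting the next page.
    """
    cycle_s = think_time_s + response_time_s
    if cycle_s <= 0:
        raise ValueError("think time plus response time must be positive")
    return users / cycle_s


throughput_test = offered_pages_per_sec(users=100, think_time_s=1)
concurrency_test = offered_pages_per_sec(users=1000, think_time_s=10)
print(throughput_test, concurrency_test)  # 100.0 100.0
```

Both scenarios offer the same page rate, yet the second holds ten times as many sessions and connections open, which is the dimension a throughput-only test never exercises.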

THE RBI TESTING APPROACH

Traditionally, performance testers focused on concurrent users as the key metric for application scalability. However, if a majority of application and system-level issues are found in throughput tests, a new approach is needed.
These three principles form the foundation of the RBI methodology.
• All Web applications have bottlenecks.
• These bottlenecks can only be uncovered one at a time.
• Focus should be placed where the bottlenecks are most likely to occur.

Although recognizing the importance of concurrency testing, the RBI methodology first focuses on throughput testing to root out the most common bottlenecks, followed by concurrency testing to assess performance under load conditions that reflect the actual number of users expected on the application. RBI testing also starts with the simple tests and then builds in complexity so that when an issue appears, all the other possible causes have been ruled out. Focusing on throughput testing, followed by concurrency testing, and using a structured approach to the test process ensures that bottlenecks are quickly isolated, which improves efficiency and reduces cost.

Benefits of RBI Testing

The RBI methodology enables rapid yet thorough testing that systematically uncovers all system and application issues, both simple and complex.

Reduce Testing Time

How much time can you save by focusing initially on throughput testing? Take an example of a system expected to handle 5,000 concurrent users, with users spending an average of 45 seconds on each page. If the application has a bottleneck that will limit its scalability to 25 pages per second, a concurrency test will find this bottleneck at approximately 1,125 users (25 pages per second at 45 seconds per page). In the interest of not biasing the data, a typical concurrency test ramp-up should proceed slowly. For example, you may consider ramping one user every five seconds. In this example, the bottleneck would have occurred 5,625 seconds, or 94 minutes, into the test (1,125 users at 5 seconds per user). However, to validate the bottleneck, the test would have to continue beyond that point to prove that the throughput was not climbing as users were added. A throughput test could have found this problem in less than 60 seconds.
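The arithmetic in this example can be reproduced in a few lines. The function names are illustrative; the inputs (25 pages per second, 45 seconds per page, one new user every 5 seconds) are the figures from the paragraph above.

```python
def users_at_bottleneck(limit_pages_per_sec: float, think_time_s: float) -> float:
    """Concurrent users needed before the page-rate limit is reached."""
    return limit_pages_per_sec * think_time_s


def seconds_to_reach(users: float, ramp_interval_s: float) -> float:
    """Elapsed ramp time when one user is added every ramp_interval_s."""
    return users * ramp_interval_s


users = users_at_bottleneck(25, 45)
ramp_s = seconds_to_reach(users, 5)
print(users, ramp_s, ramp_s / 60)  # 1125 5625 93.75
```

The 93.75 minutes computed here is the figure the text rounds to 94 minutes.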
Eliminate Initial Testing Complexity

Very often performance testing begins with overly complex scenarios exercising too many components at the same time, making it easy for bottlenecks to hide. The RBI methodology begins with system-level testing that can be carried out before the application is even deployed.

Improve QA Efficiency

The RBI methodology tests the simplest test cases first and then moves on to those with increased complexity. If the simplest test case works and the next level of complexity fails, the bottleneck lies in the newly added complexity. By uncovering bottlenecks using a tiered approach, you can quickly identify issues as well as isolate issues in components of which you have limited knowledge.
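The tiered reasoning can be expressed as a small helper. This is a sketch under assumptions: each test run is taken to be summarized as the step it added plus the pages per second measured with that step included, and the sample figures are invented for illustration.

```python
from typing import Optional, Sequence, Tuple


def first_failing_step(
    results: Sequence[Tuple[str, float]],
    required_pages_per_sec: float,
) -> Optional[str]:
    """Return the first added step whose test misses the requirement.

    results must be ordered from the simplest test to the most complex;
    each entry is (step added in this test, measured pages per second).
    Every earlier step has already passed, so the returned step is where
    the investigation should start. Returns None if every test passes.
    """
    for step, pages_per_sec in results:
        if pages_per_sec < required_pages_per_sec:
            return step
    return None


# Hypothetical measurements, simplest test first.
runs = [
    ("hello world page", 100.0),
    ("home page", 60.0),
    ("home page + search", 8.0),
]
print(first_failing_step(runs, required_pages_per_sec=25.0))  # home page + search
```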

Enhance Testing Effectiveness with Knowledge Aggregation

The modular and iterative nature of the methodology means that when a bottleneck appears, all the previously tested components have already been ruled out. For instance, if hitting the home page shows no bottlenecks but hitting the home page plus executing a search shows very poor performance, the cause of the bottleneck lies in the search functionality.

RBI Testing for Common System Bottlenecks

Any performance testing should begin with an assessment of the basic network infrastructure supporting the application. If this basic system cannot support the anticipated user load on a system, even infinitely scalable application code will bottleneck. Basic system-level tests should be run to validate bandwidth, hit rate, and connections. Additionally, simple test application pages should be exercised (simple “hello world” pages, for instance).

The Application

After validating that the system infrastructure meets the most basic needs of the end users, turn to the application itself. Once again, start with the simplest possible test case.

If testing has progressed this far without uncovering system-level issues (or those issues have been resolved), any remaining problems are caused by the application itself. For example, if a test application page achieved 100 pages per second and the home page bottlenecks at only 10 pages per second, the problem lies in the overhead required to display the home page.

At this point, the test application page test provides two valuable pieces of information. First, because we know that the system itself is not the bottleneck, the culprit can only be the code on the home page. Second, we can see how much tuning the home page could improve performance. The difference between the performance of the test application page (100 pages per second) and the home page (10 pages per second) determines the maximum performance improvement tuning could provide. Likewise, multipage transactions can be assessed by breaking down the performance of individual pages in the transaction and evaluating how each contributes to the performance of the overall transaction.
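The gap between the test page and the real page can be quantified directly. The helper below is an illustrative sketch (the name `tuning_headroom` is not from the paper) that reports the gap both as an absolute page rate and as a multiple of the current rate.

```python
from typing import Tuple


def tuning_headroom(baseline_pps: float, page_pps: float) -> Tuple[float, float]:
    """Upper bound on what tuning a page could recover.

    baseline_pps is the rate of the trivial test application page on the
    same system; page_pps is the rate of the real page. Returns
    (maximum gain in pages per second, maximum speed-up factor).
    """
    if baseline_pps <= 0 or page_pps <= 0:
        raise ValueError("page rates must be positive")
    return baseline_pps - page_pps, baseline_pps / page_pps


gain, factor = tuning_headroom(baseline_pps=100, page_pps=10)
print(gain, factor)  # 90 10.0
```

For the figures in the text, tuning the home page can recover at most 90 pages per second, a tenfold improvement, because no page on this system can outperform the trivial test page.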

Since any real-world application page likely requires more processing power than a “hello world” test page, it is reasonable to expect some drop-off in performance. However, the greater that drop-off, the greater the need for, and potential gain from, tuning. It is also important to note that if the drop-off between the test page and an actual application page is not substantial and the performance is still insufficient to meet the needs of the application user base, you need to add more hardware capacity.

Up to this point no mention has been made of page-response times. Although response times are a key metric of overall performance, response times will be the same for one user as they are for 1,000 or 100,000 users unless a bottleneck is encountered. So in this methodology, response times are only useful as an indicator that a bottleneck has been reached (if response times begin to spike) or as failure criteria (if response times exceed some predefined threshold), with poorly performing pages (those that experience errors or high response times) most in need of code optimization.
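Both uses of response time can be sketched as checks over per-load-level samples. The definitions here are assumptions made for illustration: a "spike" is taken to mean a response time above a chosen multiple of the lightest-load baseline, and the sample data is invented.

```python
from typing import Optional, Sequence, Tuple

Sample = Tuple[int, float]  # (concurrent users, average response time in seconds)


def first_spike(samples: Sequence[Sample], factor: float = 2.0) -> Optional[int]:
    """User level at which response time first exceeds factor x baseline.

    samples must be ordered by increasing load; the first sample is
    treated as the unloaded baseline. Returns None if no spike is seen.
    """
    if not samples:
        return None
    baseline_s = samples[0][1]
    for users, response_s in samples:
        if response_s > factor * baseline_s:
            return users
    return None


def first_failure(samples: Sequence[Sample], max_response_s: float) -> Optional[int]:
    """User level at which response time first exceeds a fixed threshold."""
    for users, response_s in samples:
        if response_s > max_response_s:
            return users
    return None


# Hypothetical measurements from a ramped test.
measured = [(100, 0.5), (200, 0.5), (400, 0.55), (800, 1.5), (1600, 4.0)]
print(first_spike(measured), first_failure(measured, max_response_s=3.0))  # 800 1600
```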

RBI Testing for Application Bottlenecks

As with the system-level testing, the RBI methodology begins application testing with the simplest possible test case and then builds in complexity. In a typical e-commerce application, you would test the home page first and then add pages and business functions until complete, real-world transactions are being tested, first individually and then in complex scenario usage patterns. As steps are added, any degradation in response times or page throughput will have been caused by the newly added step, making it easier to isolate what code needs to be investigated.
Once each of the business functions and transactions has been tested and optimized (as necessary), the transactions can be combined into complete scenario concurrency tests. These concurrency tests must focus on two key components. First, the concurrency test must accurately reflect what real users do on the site: browse, search, register, log in, and purchase. Second, the steps in those transactions must be performed at the same pace as real-world visitors, with appropriate “think times” between each step. This data can be gathered with a Web logging tool that exposes session length, pages viewed per session (to determine the user pacing), and percentages of pages hit (to determine the actual business functions used).
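How pacing and transaction mix might be derived from logged sessions is sketched below. The session record format is an assumption, since the paper does not specify what the logging tool exports, and dividing session length by pages viewed is only a rough estimate of time per page.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple

# Assumed record format: (session length in seconds, pages viewed in order).
Session = Tuple[float, Sequence[str]]


def average_pacing_s(sessions: Sequence[Session]) -> float:
    """Average time per page view, from session length and pages per session."""
    total_time_s = sum(length for length, _ in sessions)
    total_pages = sum(len(pages) for _, pages in sessions)
    if total_pages == 0:
        raise ValueError("no page views in the supplied sessions")
    return total_time_s / total_pages


def page_mix(sessions: Sequence[Session]) -> Dict[str, float]:
    """Fraction of all page hits that went to each page."""
    counts = Counter(page for _, pages in sessions for page in pages)
    total = sum(counts.values())
    return {page: hits / total for page, hits in counts.items()}


# Hypothetical log extract.
logged = [
    (120.0, ["home", "search", "product", "cart"]),
    (60.0, ["home", "search"]),
]
print(average_pacing_s(logged))  # 30.0
print(page_mix(logged))
```

The pacing value feeds the think time used in the scenario, and the mix determines what share of virtual users runs each transaction.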
Once the test has been designed from real-world data (or educated assumptions for an application not yet deployed), the test must be executed in a way that gathers valuable information at various user load levels. If the site is expected to handle 1,000 concurrent users, then it’s important not to start those users all at once. Instead, ramp your test slowly, adding one or more users at defined time intervals, until you reach 1,000. This will allow you to determine overall performance at each level of user load and also make it easier to identify performance problems when they begin to occur.
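A ramp of this kind reduces to a schedule of (elapsed time, active users) pairs. The sketch below is illustrative; the step size and interval are parameters a tester would choose, not values prescribed by the paper.

```python
from typing import List, Tuple


def ramp_schedule(target_users: int, users_per_step: int,
                  step_interval_s: float) -> List[Tuple[float, int]]:
    """Build (elapsed seconds, active users) pairs for a gradual ramp-up.

    Users are added in batches of users_per_step every step_interval_s
    seconds, starting at time zero, until target_users is reached.
    """
    if users_per_step <= 0 or step_interval_s <= 0:
        raise ValueError("step size and interval must be positive")
    schedule = []
    users = 0
    step = 0
    while users < target_users:
        users = min(users + users_per_step, target_users)
        schedule.append((step * step_interval_s, users))
        step += 1
    return schedule


print(ramp_schedule(target_users=10, users_per_step=4, step_interval_s=5))
# [(0, 4), (5, 8), (10, 10)]
```

Holding each level long enough to collect stable throughput and response-time samples is what makes the per-level comparison meaningful.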
CONCLUSION

The RBI methodology for load testing improves testing efficiency by focusing first on where bottlenecks most often occur: in the throughput. Once throughput has been thoroughly tested, you can test the system and application for concurrency to assess performance under realistic user loads. By following a structured approach from system testing to application testing and slowly, systematically introducing complexity into the test cases, you can quickly isolate bottlenecks and their associated root cause.

Although this paper focuses on methodology, it is important to point out that much of this process can and should be automated using an automated testing tool. Oracle Application Testing Suite is the centerpiece of Oracle Enterprise Manager for functional and load testing for Web and service-oriented architecture applications.
