Wednesday, March 23, 2011 8:27 PM
By James Whittaker
Instead of distinguishing between code testing, integration testing, and system testing, Google uses the terms small tests, medium tests, and large tests, emphasizing scope over form [Translator's note: code testing usually refers to unit- and API-level tests, typically written with frameworks such as xUnit or GTest; Google does not actually use the term code-level testing]. Small tests cover small amounts of code, and medium and large tests scale up from there. Any of the three engineering roles [Translator's note: software engineers, software engineers in test, and test engineers; see Part Two of this series] may execute any of these three types of tests, and they may be performed as automated or manual tests.
Small tests are mostly (but not always) automated and exercise the code within a single function or module. They are most likely written by a software engineer (SWE) or a software engineer in test (SET) and usually rely on mocks and faked environments to run, but test engineers (TEs) often pick out and run a set of small tests when they are trying to diagnose a particular failure. Small tests focus on typical functional issues such as data corruption, error conditions, and off-by-one errors. The question a small test attempts to answer is: does this code do what it is supposed to do?
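To make the idea concrete, here is a minimal sketch of what a small test might look like, using Python's standard unittest module. The parse_port function and its rules are hypothetical, invented only to illustrate the kind of boundary and error-condition checks described above; it is not Google code.

import unittest


def parse_port(value):
    """Parse a TCP port number from a string; reject values outside 1-65535."""
    port = int(value)
    if not 1 <= port <= 65535:
        raise ValueError("port out of range: %d" % port)
    return port


class ParsePortSmallTest(unittest.TestCase):
    """A small test: one function, no external dependencies, runs in milliseconds."""

    def test_typical_value(self):
        self.assertEqual(parse_port("8080"), 8080)

    def test_boundaries(self):
        # Off-by-one checks at both edges of the valid range.
        self.assertEqual(parse_port("1"), 1)
        self.assertEqual(parse_port("65535"), 65535)
        self.assertRaises(ValueError, parse_port, "0")
        self.assertRaises(ValueError, parse_port, "65536")

    def test_error_conditions(self):
        # Malformed input should fail loudly rather than corrupt state.
        self.assertRaises(ValueError, parse_port, "not-a-port")


if __name__ == "__main__":
    unittest.main()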
Medium tests can be automated or manual and involve two or more features, specifically covering the interaction between those features. Many SETs describe this as "testing a function and its nearest neighbors." SETs drive the development of these tests early in the product cycle, as individual features are completed, and SWEs are heavily involved in writing, debugging, and maintaining the actual tests. If a test fails or breaks, the responsible developer steps in and takes care of it on their own. Later in the development cycle, TEs may run these medium tests either manually (when a test is difficult or prohibitively expensive to automate) or with automation. The question a medium test attempts to answer is: does a set of neighboring features interoperate the way they are supposed to?
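As a rough illustration of "testing a function and its nearest neighbors", the sketch below exercises two small modules through the seam between them. The Cart and PriceList classes are made up for this example; the point is only that the test drives one module through its real neighbor rather than in isolation.

import unittest


class PriceList:
    """Neighbor module: maps item names to unit prices (in cents)."""

    def __init__(self, prices):
        self._prices = dict(prices)

    def unit_price(self, item):
        if item not in self._prices:
            raise KeyError("unknown item: %s" % item)
        return self._prices[item]


class Cart:
    """Module under test: relies on PriceList to compute order totals."""

    def __init__(self, price_list):
        self._price_list = price_list
        self._items = []

    def add(self, item, quantity=1):
        self._items.append((item, quantity))

    def total(self):
        return sum(self._price_list.unit_price(item) * qty
                   for item, qty in self._items)


class CartMediumTest(unittest.TestCase):
    """A medium test: exercises Cart together with its nearest neighbor, PriceList."""

    def setUp(self):
        self.cart = Cart(PriceList({"apple": 50, "bread": 200}))

    def test_total_spans_both_modules(self):
        self.cart.add("apple", 3)
        self.cart.add("bread")
        self.assertEqual(self.cart.total(), 350)

    def test_unknown_item_surfaces_neighbor_error(self):
        self.cart.add("durian")
        self.assertRaises(KeyError, self.cart.total)


if __name__ == "__main__":
    unittest.main()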
Large tests cover three or more (usually more) features and represent real end-user scenarios and their likely extensions. There is some concern with how all the features integrate as a whole, but at Google large tests tend to be more results-driven, i.e., does the software do what the user expects? All three roles are involved in large tests, using anything from automation to exploratory testing. The question a large test attempts to answer is: does the product operate the way an end user would expect?
The actual terms small, medium, and large are not important in themselves; call them whatever you want. What matters is that, with this shared vocabulary, Google testers can talk about what is being tested and how those tests are scoped. When some ambitious testers began talking about a fourth class they dubbed enormous, every other tester in the company could picture a system-wide test covering nearly every feature and running for a very long time; no further explanation was necessary.
Deciding what gets tested and how much is a very dynamic process that varies widely from product to product. Google prefers to release often, getting a product out to external users to gather feedback and then iterating. If Google has developed a product, or a new feature of an existing product, it wants to get it out as early as possible so that external users can use it and benefit from it. This requires involving users and external developers early in the process, along with a good set of criteria for judging whether the release conditions have been met.
Finally, as for the mix between automated and manual testing, Google clearly favors the former for all three sizes of tests. If something can be automated and does not require human cleverness or intuition, it should be automated. Only problems that specifically require human judgment, such as whether a user interface looks good or whether exposing some piece of data raises a privacy concern, should remain in the realm of manual testing.
That said, it is important to note that Google still performs a great deal of manual testing, both scripted and exploratory, though even this testing is done under the watchful eye of automation. Recording technology used in the industry converts manual tests into automated tests that can be re-executed build after build, which keeps regression effort to a minimum and lets manual testers focus on new issues. Google has also automated the submission of bug reports and the routing of manual testing tasks. For example, if an automated test breaks, the system determines the most recent code change, which is generally the most likely culprit, sends email to its author, and automatically files a bug to track the issue. In testing, the "last inch of the human mind" shows up in test design, and automating everything up to that point is the direction Google's next generation of test tools is working toward.
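The culprit-finding flow described above might be imagined roughly as follows. This is only a toy sketch: the Change and TestFailure structures and the deliberately naive "newest change touching a covered file" heuristic are invented for illustration, and Google's actual tooling is not public.

from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Change:
    """A code change: its sequence number, its author, and the files it touched."""
    revision: int
    author: str
    files: List[str]


@dataclass
class TestFailure:
    """A failing automated test and the source files it is known to cover."""
    test_name: str
    covered_files: List[str]


def find_likely_culprit(failure: TestFailure,
                        recent_changes: List[Change]) -> Optional[Change]:
    """Pick the newest change that touched a file the failing test covers."""
    for change in sorted(recent_changes, key=lambda c: c.revision, reverse=True):
        if set(change.files) & set(failure.covered_files):
            return change
    return None


def file_bug(failure: TestFailure, culprit: Change) -> dict:
    """Build a bug report; a real system would also email the change's author."""
    overlap = sorted(set(culprit.files) & set(failure.covered_files))
    return {
        "title": "%s failing, likely caused by change %d" % (failure.test_name,
                                                             culprit.revision),
        "assignee": culprit.author,
        "description": "Automated test %s started failing after change %d, "
                       "which touched: %s" % (failure.test_name, culprit.revision,
                                              ", ".join(overlap)),
    }


if __name__ == "__main__":
    failure = TestFailure("CartMediumTest.test_total_spans_both_modules",
                          ["shop/cart.py", "shop/prices.py"])
    changes = [Change(101, "alice", ["shop/cart.py"]),
               Change(102, "bob", ["docs/readme.md"])]
    culprit = find_likely_culprit(failure, changes)
    if culprit is not None:
        print(file_bug(failure, culprit))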
Those tools will be highlighted in future posts. The next post, however, will focus on the work of the software engineer in test (SET). I hope you keep reading.
公直
2012/4/2
Original English post:
How Google Tests Software – Part Five
http://googletesting.blogspot.com/2011/03/how-google-tests-software-part-five.html
Wednesday, March 23, 2011 8:27 PM
By James Whittaker
Instead of distinguishing between code, integration and system testing, Google uses the language of small, medium and large tests emphasizing scope over form. Small tests cover small amounts of code and so on. Each of the three engineering roles may execute any of these types of tests and they may be performed as automated or manual tests.
Small Tests are mostly (but not always) automated and exercise the code within a single function or module. They are most likely written by a SWE or an SET and may require mocks and faked environments to run but TEs often pick these tests up when they are trying to diagnose a particular failure. For small tests the focus is on typical functional issues such as data corruption, error conditions and off by one errors. The question a small test attempts to answer is does this code do what it is supposed to do?
Medium Tests can be automated or manual and involve two or more features and specifically cover the interaction between those features. I’ve heard any number of SETs describe this as “testing a function and its nearest neighbors.” SETs drive the development of these tests early in the product cycle as individual features are completed and SWEs are heavily involved in writing, debugging and maintaining the actual tests. If a test fails or breaks, the developer takes care of it autonomously. Later in the development cycle TEs may perform medium tests either manually (in the event the test is difficult or prohibitively expensive to automate) or with automation. The question a medium test attempts to answer is does a set of near neighbor functions interoperate with each other the way they are supposed to?
Large Tests cover three or more (usually more) features and represent real user scenarios to the extent possible. There is some concern with overall integration of the features but large tests tend to be more results driven, i.e., did the software do what the user expects? All three roles are involved in writing large tests and everything from automation to exploratory testing can be the vehicle to accomplish it. The question a large test attempts to answer is does the product operate the way a user would expect?
The actual language of small, medium and large isn’t important. Call them whatever you want. The important thing is that Google testers share a common language to talk about what is getting tested and how those tests are scoped. When some enterprising testers began talking about a fourth class they dubbed enormous, every other tester in the company could imagine a system-wide test covering nearly every feature and that ran for a very long time. No additional explanation was necessary.
The primary driver of what gets tested and how much is a very dynamic process and varies wildly from product to product. Google prefers to release often and leans toward getting a product out to users so we can get feedback and iterate. The general idea is that if we have developed some product or a new feature of an existing product we want to get it out to users as early as possible so they may benefit from it. This requires that we involve users and external developers early in the process so we have a good handle on whether what we are delivering is hitting the mark.
Finally, the mix between automated and manual testing definitely favors the former for all three sizes of tests. If it can be automated and the problem doesn’t require human cleverness and intuition, then it should be automated. Only those problems, in any of the above categories, which specifically require human judgment, such as the beauty of a user interface or whether exposing some piece of data constitutes a privacy concern, should remain in the realm of manual testing.
Having said that, it is important to note that Google performs a great deal of manual testing, both scripted and exploratory, but even this testing is done under the watchful eye of automation. Industry leading recording technology converts manual tests to automated tests to be re-executed build after build to ensure minimal regressions and to keep manual testers always focusing on new issues. We also automate the submission of bug reports and the routing of manual testing tasks. For example, if an automated test breaks, the system determines the last code change that is the most likely culprit, sends email to its authors and files a bug. The ongoing effort to automate to within the “last inch of the human mind” is currently the design spec for the next generation of test engineering tools Google is building.
Those tools will be highlighted in future posts. However, my next target is going to revolve around The Life of an SET. I hope you keep reading.