On Software Testing
At some point during his or her career, a programmer might come across the following argument, presented by some colleague, partner, or decision maker:
Since we can always test our software by hand, we do not need to implement Automated Software Testing.
Apparently, I reached that point in my career, so now I need to debate this argument. I decided to be a good internet citizen and publish my thoughts. So, in this post I am going to be deconstructing that argument, and demolishing it from every angle that it can be examined. I will be doing so using language that is easy to process by people from outside of our discipline.
In the particular company where that argument was brought forth, there exist mitigating factors which are specific to the product, the customers, and the type of relationship we have with them, all of which make the argument not as unreasonable as it may sound when taken out of context. Even in light of these factors, the argument still deserves to be blown out of the water, but I will not be bothering the reader with the specific situation of this company, so as to ensure that the discussion is applicable to software development in general.
In its more complete form, the argument may go like this:
Automated Software Testing represents a big investment for the company, where all the programmers in the house are spending copious amounts of time doing nothing but writing software tests, but these tests do not yield any visible benefit to the customers. Instead, the programmers should ensure that the software works by spending only a fraction of that time doing manual testing, and then we can take all the time that we save this way and invest it in developing new functionality and fixing existing issues.
To put it more concisely, someone might say something along these lines:
I do not see the business value in Automated Software Testing.
This statement is a bunch of myths rolled up into one admirably terse sentence. It is so disarmingly simple that for a moment you might be at a loss as to how to respond. Where to begin, really? We need to look at the myths one by one. Here goes:
Myth #1: Software testing represents a big investment.
No it doesn’t. Or maybe it does, but its ROI is so high that you absolutely don’t want to miss it.
If you do not have software testing in place, then it is an established fact in our industry that you will end up spending an inordinate amount of time researching unexpected application behavior, troubleshooting code to explain the observed behavior, discovering bugs, fixing them, and often repeating this process a few times on each incident because the fix for one bug often creates another bug, or causes pre-existing bugs to manifest, often with the embarrassment of an intervening round-trip to the customer, because the “fixed” software was released before the newly introduced bugs were discovered.
Really, it works the same way as education. To quote a famous bumper sticker:
You think education is expensive? Try ignorance!
Furthermore, your choice of Manual Software Testing vs. Automated Software Testing has a significant impact on the development effort required after the testing, to fix the issues that the testing discovers. It is a well established fact in the industry that the sooner a bug is discovered, the less it costs to fix it.
- The earliest time possible for fixing a mistake is when making it. That’s why we use strongly typed programming languages, together with Integrated Development Environments that continuously compile our code as we are typing it: this way, any syntax error or type violation is immediately flagged by the IDE with a red underline, so we can see it and fix it before proceeding to type the next line of code. The cost of fixing that bug is near zero. (And one of the main reasons why virtually all scripting languages are absolutely horrible is that in those languages, even a typo can go undetected and become a bug.)
- If you can’t catch a bug at the moment you are introducing it, the next best time to catch it is when running automated tests, which is what you are supposed to do before committing your changes to the source code repository. If that doesn’t happen, then the bug will be committed, and this already represents a considerable cost that you will have to pay later for fixing it.
- The next best time to catch the bug is by running automated tests as part of the Continuous Build System. This will at least tell you that the most recent commit contained a bug. If there is no Continuous Build with Automated Software Testing in place, then you suffer another steep increase in the price that you will have to pay for eventually fixing the bug.
- By the time a human being gets around to manually testing the software and discovering the bug, many more commits may have been made to the source code repository. This means that by the time the bug is discovered, we will not necessarily know which commits contributed to it, nor which programmers made the relevant commits, and even if we do, they will at that moment be working on something else, which they will have to temporarily drop, and make an often painful mental context switch back to the task that they were working on earlier. Naturally, the more days pass between committing a bug and starting to fix it, the worse it gets.
- At the extreme, consider trying to fix a bug months after it was introduced, when nobody knows anything about the changes that caused it, and the programmer who made those changes is not even with the company anymore. Someone has to become intimately familiar with that module in order to troubleshoot the problem, consider dozens of different commits that may have contributed to the bug, find it, and fix it. The cost of fixing that bug may amount to more than a programmer’s monthly salary.
This is why the entire software industry today swears by testing: it helps to catch bugs as early as possible and to keep the development workflow uninterrupted, so it ends up saving huge amounts of money.
Myth #2: Software testing represents an investment.
No, not even that. Software testing is regarded by our industry as an integral part of software development, so it is meaningless to examine it as an investment separate from the already-recognized-as-necessary investment of developing the software in the first place.
Beware of the invalid line of reasoning which says that in order to implement a certain piece of functionality all we need is 10 lines of production code which cost 100 bucks, whereas an additional 10 lines, that would only be testing the first 10 lines, and would cost an extra 100 bucks, are optional.
Instead, the valid reasoning is that in order to implement said functionality we will need 20 lines of code, which will cost 200 bucks. It just so happens that 10 of these lines will reside in a subfolder of the source code tree called “production”, while the other 10 lines will reside in a subfolder of the same tree called “testing”; however, the precise location of each group of lines is a trivial technicality, bearing no relation whatsoever to any notion of “usefulness” of one group of lines versus the other. The fact is that all 20 of those lines of code are essential in order to accomplish the desired result.
That’s because production code without corresponding testing code cannot be said with any certainty to be implementing any functionality at all. The only thing that can be said about testless code is that it has so far been successful at creating the impression to human observers that its behavior sufficiently resembles some desired functionality. Furthermore, it can only be said to be successful to the extent that it has been observed thus far, meaning that a new observation tomorrow might very well find that it is doing something different.
That’s a far cry from saying that “this software does in fact implement that functionality”.
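To make the "20 lines, not 10 plus 10" argument concrete, here is a minimal sketch in Python. The function `apply_discount` and its rounding rule are hypothetical, invented purely for illustration; the point is only that the second half is what allows anyone to claim that the first half implements the requirement.

```python
# "production" half: a hypothetical pricing rule, invented for illustration.
def apply_discount(price_cents: int, percent: int) -> int:
    """Return the price after a percentage discount, rounded down to whole cents."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price_cents * (100 - percent) // 100


# "testing" half: executable statements of what the code above is required to do.
def test_apply_discount() -> None:
    assert apply_discount(1000, 0) == 1000    # no discount leaves the price alone
    assert apply_discount(1000, 25) == 750    # ordinary case
    assert apply_discount(999, 50) == 499     # rounding goes down, never up
    assert apply_discount(1000, 100) == 0     # full discount
    try:
        apply_discount(1000, 101)
    except ValueError:
        pass                                  # out-of-range input is rejected
    else:
        raise AssertionError("out-of-range percent must be rejected")


test_apply_discount()
```

Without the second function, all that could be said is that `apply_discount` looked right the last time somebody tried it.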
Myth #3: Software testing is just sloppiness management.
This is usually not voiced, but implied. So, why can’t programmers write correct software the first time around? And why on earth can’t software just stay correct once written?
There are a number of reasons for this; the most important ones have to do with the level of maturity of the software engineering discipline and the complexity of the software that we are being asked to develop.
Maturity
Software development is not a hard science like physics and math. There exist some purely scientific concepts that you learn at university, but they are rarely applicable to the everyday reality of our work. When it comes to developing software, there is not as much help available to us as there is to other disciplines by means of universal laws, fundamental axioms, established common practices and rules, ubiquitous notations, books of formulas and procedures, ready-made commercially available standardized components to build with, etc. It is difficult to even find parallels to draw for basic concepts of science and technology such as experimentation, measurement, and reproducibility. That’s why software engineering is sometimes characterized as being more of an art than a science, and the fact that anyone can potentially become a programmer without necessarily having studied software engineering does not help to dispel this characterization.
Automated Software Testing is one of those developments in software engineering that make it more like a science than like an art. With testing we have finally managed to introduce the concepts of experimentation, measurement, and reproducibility in software engineering. Whether testability alone is enough to turn our discipline into a science is debatable, but without testing we can be certain that we are doing nothing but art.
Complexity
The software systems that we develop today are immensely complex. A simple application which presents a user with just 4 successive yes/no choices has 16 different execution paths that must be tested. Increase the number of choices to 7, and the number of paths skyrockets to 128. Take a slightly longer but entirely realistic use case sequence of a real world application consisting of 20 steps, and the total number of paths exceeds one million. That’s an awful lot of complexity, and so far we have only been considering yes/no choices. Now imagine each step consisting of not just a yes/no choice, but an entire screen full of clickable buttons and editable fields which are interacting with each other. This is not an extreme scenario, it is a rather commonplace situation, and its complexity is of truly astronomical proportions.
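The arithmetic behind those figures is simply that every additional yes/no choice doubles the number of paths. A short Python sketch (the function name `count_paths` is made up for this illustration) reproduces the numbers:

```python
from itertools import product


def count_paths(yes_no_steps: int) -> int:
    """Number of distinct execution paths through a sequence of yes/no choices."""
    return 2 ** yes_no_steps


for steps in (4, 7, 20):
    print(steps, count_paths(steps))
# prints:
# 4 16
# 7 128
# 20 1048576

# Enumerating the paths explicitly for the small case confirms the formula.
all_paths = list(product(("yes", "no"), repeat=4))
assert len(all_paths) == count_paths(4)
```

A screen with more than two possible actions per step would replace the base 2 with a larger number, which is why the growth gets so much worse in real applications.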
Interestingly enough, hardware engineers like to off-load complexity management to the software. Long gone are the times when machines consisted entirely of hardware, with levers and gears and belts and cams all carefully aligned to work in unison, so that turning a crank at one end would cause printed and folded newspapers to come out the other end. Nowadays, the components of the hardware tend to not interact with each other, because that would be too complex and too difficult to change; instead, every single sensor and every single actuator is connected to a central panel, from which software takes charge and orchestrates the whole thing.
However, software is not a magical place where complexity just vanishes; you cannot expect to provide software with complex inputs, expect complex outputs, and at the same time expect the insides of it to be nothing but purity and simplicity: a system cannot have less complexity than the complexity inherent in the function that it performs.
The value of moving the complexity from the hardware to the software is that the system is then easier to change, but when we say “easier” we do not mean “simpler”; all of the complexity is still there and must be dealt with. What we mean when we say “easier to change” is that in order to make a change we do not have to begin by sending new blueprints to the steel foundry. That’s what you gain by moving complexity from the hardware to the software: being able to change the system without messy, time-consuming, and costly interactions with the physical world.
So, even though we have eliminated those precisely crafted and carefully arranged levers and gears and belts and cams, their counterparts now exist in the software. You just do not see them, and you have no way of seeing them unless you are a programmer. And just as the slightest modification to a physical machine of such complexity would be a strenuous ordeal, so is the slightest modification to a software system of similar complexity.
Software can only handle complexity if done right. You cannot develop complex software without sophisticated automated software testing in place, and even if you develop it, you cannot make any assumptions whatsoever about its correctness. Furthermore, even if it appears to be working correctly, you cannot make the slightest change to it unless automated software testing is in place to determine that it is still working correctly after the change. That is because you simply cannot test thousands or millions of possible execution paths in any way other than in an automated way.
Myth #4: Testing has no visible benefit to the customers
Yes it does. It is called reliable, consistent, correctly working software. It is also called software which is continuously improving instead of remaining stagnant due to fear of it breaking if sneezed at. It is also called receiving newly introduced features without losing old features that used to work but are now broken. And it is even called receiving an update as soon as it has been introduced instead of having to wait until some poor souls have clicked through the entire application over the course of several days to make sure everything still works as it used to.
Myth #5: Manual testing can ensure that the software works.
No it cannot. That’s because the complexity of the software is usually far greater than what you could ever possibly hope to test by hand. An interactive application is not like a piece of fabric, which you can visually inspect and have a fair amount of certainty that it has no defects. You are going to need to interact with the software, in a mind-boggling number of different ways, to test for a mind-boggling number of possible failure modes.
When we do manual testing, in order to save time (and our sanity) we focus only on the subset of the functionality of the software which may have been affected by recent changes that have been made to the source code. However, the choice of which subsets to test is necessarily based on our estimations and assumptions about what parts of the program may have been affected by our modifications, and also on guesses about the ways in which these parts could behave if adversely affected. Alas, these estimations, assumptions, and guesses are notoriously unreliable: it is usually the parts of the software that nobody expected to break that in fact break, and even the suspected parts sometimes break in ways quite different from what anyone had expected and planned to test for.
And this is by definition so, because all the failure modes that we can easily foresee, based on the modifications that we make, we usually examine ourselves before even calling the modifications complete and committing our code.
Furthermore, it is widely understood in our industry that persons involved in the development of software are generally unsuitable for testing it. No developer ever uses the software with as much recklessness and capriciousness as a user will. It is as if the programmer’s hand has a mind of its own, and avoids sending the mouse pointer in bad areas of the screen, whereas that is precisely where the user’s hand is guaranteed to send it. It is as if the programmer’s finger will never press that mouse button down as heavily as the user’s finger will. Even dedicated testers start behaving like the programmers after a while on the job, because it is only human to employ acquired knowledge about the environment in navigating about the environment, and to re-use established known good paths. It is in our nature. You can ask people to do something which is against their nature, and they may earnestly agree, and they may even try their best, but the results are still guaranteed to suffer.
Then there is repetitive motion fatigue, of both the physical and the mental kind, which severely limits the scope that any kind of manual testing will ever have.
Finally, there is the issue of efficiency. When we do manual software testing, we are necessarily doing it in human time, which is excruciatingly slow compared to the speed at which a computer would carry out the same task. A human being testing permutations at the rate of one click per second would theoretically need on the order of two working months to test one million permutations; the computer may do it in a matter of minutes. And the computer will do this perfectly, while the most capable human being will do this quite sloppily in comparison. That’s how inefficient manual software testing is.
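As a rough sanity check of that estimate, here is a small Python calculation. The one-click-per-second rate comes from the text; the 8-hour working day and the 21 working days per month are assumptions made only for this sketch, and they ignore breaks, fatigue, and mistakes, all of which push the real figure higher.

```python
PERMUTATIONS = 1_000_000
CLICKS_PER_SECOND = 1          # rate stated in the text
HOURS_PER_WORKING_DAY = 8      # assumption for this sketch
WORKING_DAYS_PER_MONTH = 21    # assumption for this sketch


def manual_testing_time(permutations: int) -> tuple[float, float, float]:
    """Return (hours, working days, working months) of uninterrupted clicking."""
    hours = permutations / CLICKS_PER_SECOND / 3600
    days = hours / HOURS_PER_WORKING_DAY
    months = days / WORKING_DAYS_PER_MONTH
    return hours, days, months


hours, days, months = manual_testing_time(PERMUTATIONS)
print(f"{hours:.0f} hours, {days:.1f} working days, {months:.2f} working months")
```

Under these idealized assumptions the result is roughly 35 working days of non-stop clicking, which is the same order of magnitude as the estimate above.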
Myth #6: Manual testing takes less time than writing tests.
No it doesn’t. If you want to say that you are actually doing some manual testing worth speaking of, and not a joke of it, then you will have to spend copious amounts of time doing nothing but that, and you will have to keep repeating it all over again every single time the software is modified.
In contrast, with software testing you are spending some time up-front building some test suites, which you will then be able to re-execute every time you need them, with comparatively small additional effort. So, manual testing for a certain piece of software is an effort that you have to keep repeating, while writing automated test suites for that same piece of software is something that you do once and from that moment on it keeps paying dividends.
This is why it is a fallacy to say that we will just test the software manually and with the time that we will save we will implement more functionality: as soon as you add a tiny bit of new functionality, you have to repeat the testing all over again. Testing the software manually is a never ending story.
The situation is a lot like renting vs. buying: with renting, at the end of each month you are in exactly the same situation as you were at the beginning of the month: the home still belongs in its entirety not to you, but to the landlord, and you must now pay a new rent in full in order to stay for one more month. With buying, you pay a lot of money up front, and some maintenance costs and taxes will always be applicable, but the money that you pay goes into something tangible: it is turned into value in your hands in the form of a home that you now own.
Furthermore, the relative inefficiency of manual testing is usually severely underestimated. In order to do proper manual testing, you have to come up with a meticulous test plan, explaining what the tester is supposed to do, and what the result of each action should be, so that the tester can tell whether the software is behaving according to the requirements or not. However, no test plan will ever be as unambiguous as a piece of code that is actually performing the same test, and the more meticulous you try to be with the test plan, the less you gain, because there comes a point where the effort of writing the test plan starts being comparable to the effort of writing the corresponding automated test instead. So, you might as well write the test plan down in code to begin with.
Of course, one round of writing automated software testing suites will always represent more effort than a few rounds of manually performing the same tests, so the desirability of one approach vs. the other may depend on where you imagine the break-even point to be. If you reckon that the break-even point is fairly soon, then you already see the benefit of implementing automated software testing as soon as possible. If you imagine it will be after the IPO, then you might think it is better to defer it; but actually, even in this case you might not want to go this way. More about that later.
Well, let me tell you: in the software industry the established understanding is that the break-even point is extremely soon. Like write-the-tests-before-the-app soon. (A practice known as Test-Driven Development.)
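The break-even reasoning can be written down as a simple cost model. The sketch below is in Python, and every number in it is hypothetical, chosen only to show the shape of the calculation: automated testing costs an up-front amount plus a small cost per round, while manual testing costs the full manual effort every round.

```python
import math


def break_even_round(upfront_cost: float,
                     automated_cost_per_round: float,
                     manual_cost_per_round: float) -> int:
    """First testing round at which cumulative automated cost is no greater
    than cumulative manual cost.

    automated total after n rounds = upfront_cost + n * automated_cost_per_round
    manual total after n rounds    = n * manual_cost_per_round
    """
    saving_per_round = manual_cost_per_round - automated_cost_per_round
    if saving_per_round <= 0:
        raise ValueError("automation never pays off if a round is not cheaper")
    return math.ceil(upfront_cost / saving_per_round)


# Hypothetical figures, in hours: 40 to write the suite, 0.5 to run and review
# it each round, 8 to click through the same checks by hand each round.
print(break_even_round(40, 0.5, 8))  # prints 6
```

With these made-up figures the suite has paid for itself by the sixth time the software is modified and retested; the more often the software changes, the sooner that point arrives on the calendar.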
Myth #7: You can keep developing new functionality and fixing existing issues without software testing in place.
In theory you could, but in practice you can’t. That’s because every time you touch the slightest part of the software, everything about the software is now potentially broken. Without automated software testing in place, you just don’t know. This is especially true of software which has been written messily, which is in turn especially common in software which has been written without any Automated Software Testing in place from the beginning. Paradoxically enough, automated software testing forces software designs to have some structure, and this structure reduces failures, so the software ends up having lesser testing needs.
To help lessen change-induced software fragility, we even have a special procedure governing how we fix bugs: when a bug is discovered, we do not always just go ahead and fix it. Instead, what we often do is that we first write a test which checks for the bug according to the requirements, without making any assumptions as to what might be causing it. Of course, since the bug is in the software, the test will initially be observed to fail. Then, we fix the bug according to our theory as to what is causing it, and we should see that test succeeding. If it doesn’t, then we fixed the wrong bug, or more likely, we just broke something which used to be fine. Furthermore, all other tests had better keep succeeding too; otherwise, in fixing this bug we broke something else. As a bonus, the new test now becomes a permanent part of the suite of tests, so if this particular behavior is broken again in the future, this test will catch it.
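The procedure can be sketched in a few lines of Python. The requirement, the function names, and the bug are all hypothetical, invented for this illustration; in a real project there would be one function that changes over time, not two side by side.

```python
# Hypothetical requirement: amounts such as "1,250.50" must parse to 1250.5.

def parse_amount_before_fix(text: str) -> float:
    """The code as it was when the bug was reported: fails on thousands separators."""
    return float(text)


def parse_amount_after_fix(text: str) -> float:
    """The code after the fix: strips thousands separators before converting."""
    return float(text.replace(",", ""))


def meets_requirement(parse_amount) -> bool:
    """Regression test written from the requirement, not from a theory of the cause."""
    try:
        return (parse_amount("1,250.50") == 1250.5
                and parse_amount("12.00") == 12.0)
    except ValueError:
        return False


# Step 1: the new test must fail against the code that still has the bug.
assert meets_requirement(parse_amount_before_fix) is False
# Step 2: after the fix, the very same test must pass.
assert meets_requirement(parse_amount_after_fix) is True
```

Seeing the test fail first is what proves that it actually detects the bug; a test that has only ever passed proves much less.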
If you go around “fixing bugs” without testing mechanisms such as this in place, you are not really fixing bugs, you are just shuffling bugs around. The same applies to features: if you go around “adding features” without the necessary testing mechanisms in place, then by definition you are not adding features, you are adding bugs.
Myth #8: Software testing has no business value
Yes it does. The arguments that I have already listed should be making it clear that it does, but let me provide one more argument, which shows how Automated Software Testing directly equates to business value.
A potentially important factor for virtually any kind of business is investment. When an investor is interested in a software business, and if they have the slightest clue as to what it is that they are doing, they are likely to want to evaluate the source code before committing to the investment. Evaluation is done by sending a copy of the software project to an independent professional software evaluator. The evaluator examines the software and responds with investment advice.
The evaluator may begin by using the software as a regular user to ensure that it appears to do what it is purported to do, then they may examine the design to make sure it makes sense, then they may examine the source code to make sure things look normal, etc. After spending not too much time on these tasks, the evaluator is likely to proceed to the tests. Software testing is so prevalent in the software industry, that it is unanimously considered to be the single most important factor determining the quality of the software.
If there are no tests, this is very bad news for the investment advice.
If the tests do not pass, this is also very bad news.
If the tests succeed, then the next question is how thorough they are.
For that, the evaluator is likely to use a tool called “Code Coverage Analyzer”. This tool keeps track of the lines of code that are being executed as the program is running, or, more likely, as the program is being exercised by the tests. By running the tests while the code coverage analysis tool is active, the evaluator will thus obtain the code coverage metric of the software. This is just a single number, from 0 to 100, and it is the percentage of the total number of source code lines that have been exercised by the tests. The more thorough the tests are, the higher this number will be.
This is a very useful metric, because in a single number it captures an objective, highly important quality metric for the entirety of the software system. It also tends to highly correlate to the actual investment advice that the evaluator will end up giving. The exact numbers may vary depending on the product, the evaluator, the investor, the investment, and other circumstances, but a rough breakdown is as follows:
- below 50% means “run in the opposite direction, this is as good as Ebola.”
- 50–60% means “poor”,
- 60–70% means “decent”,
- 70–80% means “good”,
- 80–90% means “excellent”,
- 90–100% means “exceptional”.
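The metric and the rough breakdown above can be expressed as a small Python sketch. The function names are made up for this illustration, and since the ranges in the list share their endpoints, the sketch assumes that each lower bound is inclusive (so exactly 70% counts as “good”). A real code coverage analyzer would supply the two line counts.

```python
def coverage_percent(lines_executed: int, lines_total: int) -> float:
    """Percentage of source code lines exercised by the tests."""
    if lines_total <= 0:
        raise ValueError("lines_total must be positive")
    return 100.0 * lines_executed / lines_total


def rate_coverage(percent: float) -> str:
    """Map a coverage percentage to the rough verdicts listed above.

    Assumption of this sketch: lower bounds are inclusive.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if percent < 50:
        return "run in the opposite direction"
    if percent < 60:
        return "poor"
    if percent < 70:
        return "decent"
    if percent < 80:
        return "good"
    if percent < 90:
        return "excellent"
    return "exceptional"


# Hypothetical project: tests exercise 750 of 1000 lines.
print(rate_coverage(coverage_percent(750, 1000)))  # prints good
```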
Of course, the graph of programming effort required vs. code coverage achieved is highly non-linear. It is relatively easy to pass the 45% mark; it becomes more and more difficult as you go past the 65% mark; it becomes exceedingly difficult once you cross the 85% mark.
In my experience and understanding, conscientious software houses in the general commercial software business are striving for the 75% mark. In places where they only achieve about 65% code coverage they consider it acceptable but at the same time they either know that they could be doing better, or they have low self-respect. High-criticality software (software that human life, or a nation’s reputation, depends on) may have 100% coverage, but a tremendous effort is required to achieve this. In any case, what matters is not so much what the developers think, but what the evaluator thinks; and evaluators tend to use the established practices of the industry as the standard by which they judge. The established practices call for extensive software testing, so if you do not do that, then your evaluation is not going to look good.
So, is there business value in software testing? Investment prospects alone say yes, regardless of its technical merits. Furthermore, software evaluation is likely to be part of the necessary preparations for an IPO, so even if you imagined the break-even point of automated testing vs. manual testing to be after the IPO, there is still ample reason to have the tests all in perfect working order well before the IPO.
The above is applicable to businesses that are exclusively into software development. I do not know to what degree parallels can be drawn with companies for which software is somewhat secondary, but I suspect it is to no small extent.
This article was originally published in Dec 2019 in the author’s personal blog: https://blog.michael.gr/2019/12/on-software-testing.html
Translated from: https://medium.com/@mike.nakis/on-software-testing-15ac960ab1e5