This post introduces Andrea Stocco, a leading software engineering researcher, and his top-venue paper published this year, "Visual Web Test Repair".
Paper title: Visual Web Test Repair
First author: Andrea Stocco
Affiliation: University of British Columbia
Contact: [email protected]
About Andrea Stocco:
Homepage: https://www.ece.ubc.ca/~astocco/
He is currently a postdoc; his advisor is Ali Mesbah (the third author).
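According to the publication list on his homepage, he seems to have been relatively unknown over the previous two years, with few top-venue papers and quite a few second-author papers. This year he suddenly has three top-venue papers: two at FSE 2018 (first author) and one at ICSE 2018 (second author).
His top-venue papers from this year are well worth reading when you have time: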
1) Web Test Repair Using Computer Vision. Stocco, A.; Yandrapally, R.; and Mesbah, A. In Proceedings of the 26th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2018), Demonstration Track, 4 pages, 2018. ACM.
2) Visual Web Test Repair. Stocco, A.; Yandrapally, R.; and Mesbah, A. In Proceedings of the 26th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2018), 12 pages, 2018. ACM.
3) Fine-Grained Test Minimization. Vahabzadeh, A.; Stocco, A.; and Mesbah, A. In Proceedings of the 40th ACM/IEEE International Conference on Software Engineering (ICSE 2018), pages 210–221, 2018. ACM.
All three papers are about testing, and two of them are about web tests.
Motivation:
To detect the root causes of a test breakage, developers typically inspect the test’s interactions with the application through the GUI. Existing automated test repair techniques focus instead on the code and entirely ignore visual aspects of the application.
Our work:
We propose a test repair technique that is informed by a visual analysis of the application. Our approach captures relevant visual information from tests execution and analyzes them through a fast image processing pipeline to visually validate test cases as they re-executed for regression purposes. Then, it reports the occurrences of breakages and potential fixes to the testers. Our approach is also equipped with a local crawling mechanism to handle non-trivial breakage scenarios such as the ones that require to repair the test’s workflow. We implemented our approach in a tool called Vista.
Note that this is about web testing, which differs from traditional testing.
A test breakage is defined as the event that occurs when the test raises exceptions or errors that do not pertain to the presence of a bug or a malfunction of the application under test. This is different from cases in which tests expose failures, meaning they raise exceptions which signal the presence of one or more bugs in the production code. In the latter case, the developer is required to correct the application, whereas in the former case, the tester must find a fix for the broken test.
So this is not repair of source code; it is repair of web test cases.
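To make the breakage/failure distinction concrete, here is a minimal sketch of my own (not from the paper). The pages, the `LocatorError` class, and the `run_test` helper are all hypothetical, and the classification is hard-coded for illustration; in practice, telling the two cases apart is exactly the hard part.

```python
import xml.etree.ElementTree as ET

# Three hypothetical releases of the same page (assumed data, not from the paper).
PAGE_V1 = '<form><input id="name"/><button id="submit">Send</button></form>'
# The button id was renamed; the application still works for a human user.
PAGE_V2 = '<form><input id="name"/><button id="send-btn">Send</button></form>'
# The locator still matches, but the application shows the wrong label (a real bug).
PAGE_BUG = '<form><input id="name"/><button id="submit">Sned</button></form>'


class LocatorError(Exception):
    """Raised when the test cannot find the element it wants to interact with."""


def find_by_id(page, element_id):
    """Return the first element whose id attribute equals element_id."""
    root = ET.fromstring(page)
    for elem in root.iter():
        if elem.get("id") == element_id:
            return elem
    raise LocatorError(f"no element with id={element_id!r}")


def run_test(page):
    """Return 'pass', 'breakage', or 'failure' for one test run."""
    try:
        button = find_by_id(page, "submit")  # locator hard-coded in the test
    except LocatorError:
        return "breakage"  # the test is out of date, not the application
    if button.text != "Send":
        return "failure"  # the application itself misbehaves
    return "pass"


results = [run_test(p) for p in (PAGE_V1, PAGE_V2, PAGE_BUG)]
```

In the second case the tester has to fix the test (update the locator); in the third case the developer has to fix the application.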
The key insight behind our approach is that the manual actions and reasoning that testers perform while searching for repairs can be automated to a large extent by leveraging and combining differential testing, computer vision, and local crawling. In other words, the manual repair actions and reasoning can largely be replaced by a combination of differential testing, computer vision, and local crawling.
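As a rough illustration of the computer vision ingredient, the sketch below locates a web element in a new screenshot by its appearance rather than by its DOM locator. This is plain template matching with a sum of absolute differences, chosen by me as a stand-in; these notes do not describe Vista's actual image processing pipeline, and the tiny grayscale grids and all helper names are assumptions.

```python
def blank(rows, cols):
    """Create a rows x cols grayscale image filled with zeros."""
    return [[0] * cols for _ in range(rows)]


def paste(img, patch, r, c):
    """Return a copy of img with patch drawn with its top-left corner at (r, c)."""
    out = [row[:] for row in img]
    for i, patch_row in enumerate(patch):
        for j, value in enumerate(patch_row):
            out[r + i][c + j] = value
    return out


def crop(img, r, c, h, w):
    """Cut an h x w region out of img starting at (r, c)."""
    return [row[c:c + w] for row in img[r:r + h]]


def match_template(screenshot, template):
    """Return (row, col, sad) of the window with the lowest sum of absolute differences."""
    H, W = len(screenshot), len(screenshot[0])
    h, w = len(template), len(template[0])
    best = None
    for r in range(H - h + 1):
        for c in range(W - w + 1):
            sad = sum(
                abs(screenshot[r + i][c + j] - template[i][j])
                for i in range(h)
                for j in range(w)
            )
            if best is None or sad < best[2]:
                best = (r, c, sad)
    return best


# Hypothetical "button" pixels; the button moved between two releases.
BUTTON = [[9, 5], [5, 9]]
old_screenshot = paste(blank(4, 6), BUTTON, 1, 1)
new_screenshot = paste(blank(4, 6), BUTTON, 2, 3)

# Visual locator: the crop recorded while the test still passed on the old release.
template = crop(old_screenshot, 1, 1, 2, 2)
found = match_template(new_screenshot, template)
```

Here `found` gives the position of the button in the new screenshot with a difference score of zero, which a repair tool could then map back to a DOM element. Real screenshots would need a tolerance threshold and a much faster matcher.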
In the empirical evaluation, the authors pose two research questions:
RQ1 (effectiveness): How effective is the proposed visual approach at repairing test breakages compared to a state-of-the-art DOM-based approach?
RQ2 (performance): What is the overhead and runtime of executing the visual approach compared to the DOM-based one?
This is really neat: one question is about effectiveness and the other is about performance.
It also clears up a question I had about when to use "effectiveness" (as opposed to "efficiency").
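1) The paper is clearly written in LaTeX. The tool name VISTA is not set in plain capitals as in Word, but in LaTeX-style capitals (probably small caps). I really need to learn LaTeX properly.
2) The introduction is quite clear. By contrasting ordinary tests with web tests, the authors made me understand right away that this is repair of tests, not of source code. (Comparison seems to be a good writing device.)
3) The authors like to use italics for emphasis.
4) I noticed that this FSE paper and another one I read earlier both like to use "the key insight":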
The key insight behind our approach is that the manual actions and reasoning that testers perform while searching for repairs can be automated to a large extent by leveraging and combining differential testing, computer vision, and local crawling.
The key insight behind MemFix is that finding such a set of deallocation statements corresponds to solving an exact cover problem derived from a variant of typestate static analysis.
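Interesting: this is a reusable pattern.
5) After reading two FSE 2018 papers, I realize that these top-venue papers do not come out of nowhere; deep accumulated expertise shows through. This paper, for example, combines differential testing, computer vision, and local crawling, which is very impressive; I do not know any of the three.
I should learn more techniques going forward. My knowledge is still too narrow.
6) Another pattern: the paper lists five contributions, each only about two lines long, basically one sentence.
Very cool.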
Our paper makes the following contributions:
• The first repair-oriented taxonomy of test breakages in web applications. (The first to provide such a taxonomy.)
• An algorithm for visually monitoring and validating web test cases, based on a fast image processing pipeline. (An algorithm for visually checking web test cases.)
• A novel approach for repairing broken test cases using visual analysis and crawling. (An algorithm for repairing test cases.)
• An implementation of our algorithm in a tool named Vista, which is publicly available [58]. (The algorithms are implemented in an open-source tool.)
• An empirical evaluation of Vista in repairing the breakages found in our study. Vista was able to provide correct repairs for 81% of breakages, with a 41% increment over an existing DOM-based state-of-the-art approach. (Empirical evaluation.)
So open-sourcing a tool also counts as a contribution. Both FSE papers I read write it this way.
The empirical evaluation counts as a contribution too.
7) Related work is usually placed in the second-to-last section. I used to put it in the second section...
Worth keeping in mind.
8) The threats to validity section seems to mention only one kind of threat: generalizability.
Concerning the generalizability of our results, we ran our approach with a limited number of subjects and test suites. However, we believe the approach to be applicable in a general web testing scenario (unless strict timing constraints apply), even though the magnitude of the results might change. To limit biases in the manual selection of the versions, we considered all the available releases after those for which the test suites were developed for.
9) Their related work sections are essentially impeccable. They cite a very large number of references, and the writing is clear.
breakage UK [ˈbreɪkɪdʒ]
US [ˈbrekɪdʒ]
n. breaking; damage; the amount broken
non-trivial
not trivial; significant; meaningful; of some importance
span
VERB to include; to extend across
If something spans a range of things, all those things are included in it.
e.g., Bernstein’s compositions spanned all aspects of music, from symphonies to musicals.
occurrences UK [əˈkʌrənsɪz]
US [əˈkʌrənsɪz]
n. events; appearances; happenings (plural of occurrence)
pertain UK [pəˈteɪn]
US [pərˈteɪn]
vi. to be appropriate; to relate to; to belong to
pertain to
(formal) to be related to; to concern
to be connected with sth/sb
malfunction
UK [ˌmælˈfʌŋkʃn] US [mælˈfʌŋkʃən]
n. failure; fault; dysfunction
vi. to fail to work; to break down
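The Document Object Model (DOM) is the standard programming interface recommended by the W3C for processing extensible markup languages.
The DOM is an API for HTML and XML documents; through the DOM, a document can be modified.
That description is rather formal and probably still does not make things clear.
For example: given a piece of HTML, how do we access the first node on the second level, and how do we move the last node in front of the first node?
The DOM is the standard that defines how such operations should be performed, for example using getElementById to access a node and insertBefore to insert a node.
When the browser loads HTML, it builds the corresponding DOM tree.
In short, the DOM can be understood as a standard for accessing or manipulating the various HTML tags. [1]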
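The two questions above can be tried out with Python's standard `xml.dom.minidom`, which implements the same DOM calls (`getElementById`, `insertBefore`) outside the browser. This is only a small sketch; the three-item list is made-up sample markup, and in a browser the equivalent calls would be made from JavaScript.

```python
from xml.dom import minidom

# Made-up sample markup; no whitespace between tags, so every child is an element.
doc = minidom.parseString(
    '<ul><li id="a">A</li><li id="b">B</li><li id="c">C</li></ul>'
)
root = doc.documentElement  # first level: <ul>

# minidom only resolves getElementById for attributes declared as IDs,
# so mark each "id" attribute explicitly (browsers do this automatically for HTML).
for li in root.childNodes:
    li.setIdAttribute("id")

# Access the first node on the second level, in two equivalent ways.
first = root.firstChild
same = doc.getElementById("a")

# Move the last node in front of the first node.
# insertBefore detaches the node from its old position before re-inserting it.
root.insertBefore(root.lastChild, root.firstChild)

order = [li.getAttribute("id") for li in root.childNodes]
```

After the move, `order` lists the ids as c, a, b, and the tree still has exactly three children.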
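inspection UK [ɪnˈspekʃn]
US [ɪnˈspɛkʃən]
n. examination; check; official visit; review
taxonomy
UK [tækˈsɒnəmi]
US [tækˈsɑ:nəmi]
n. (biology) taxonomy; classification system
travelogue
UK [ˈtrævəlɒg]
US [ˈtrævəˌlɔɡ, -ˌlɑɡ]
n. travel account; travel documentary; travel broadcast; lecture about travel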
Web tests are prone to break frequently as the application under test evolves, causing much maintenance effort in practice.
I rarely see this sentence pattern: the trailing participle "causing".
Our empirical evaluation on 2,672 test cases spanning 86 releases of four web applications shows that Vista is able to repair, on average, 81% of the breakages, a 41% increment with respect to existing techniques.
This sentence is fairly complex. "On average" is fine. The final clause, "a 41% increment with respect to existing techniques", is very nice and worth learning from.
Thus, these techniques are, in many cases, either unable to correct breakages, or produce many false positives.
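This sentence is a bit odd. Does it mean the techniques can neither correct breakages nor produce many false positives?
After looking it up, I concluded that it means: these techniques either cannot correct breakages, or they produce many false positives.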
Moreover, breakages do not always occur at the precise point in which the test execution stops, which renders automated repair even more challenging.
I seldom see the non-restrictive relative clause ", which" used, so I note it here.
In this paper, we propose a novel test repair algorithm, implemented in a tool named Vista, that leverages the visual information extracted through the execution of test cases, along with image processing and crawling techniques, to support automated repair of web test breakages.
A complex sentence. The use of "leverage" is also worth noting.
For each of them, we documented the type of breakage, and the position of the associated repair, resulting in a taxonomy of test breakages in web applications from a repair-oriented viewpoint.
A complex sentence. The use of "associated" (similar to "corresponding") is nice. "result in": to cause, to lead to.
[1] DOM 之通俗易懂讲解 (A plain-language explanation of the DOM). https://www.cnblogs.com/iOS-mt/p/5600959.html
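"Beyond the curtain, the wind of the fifth watch blows my dream away without a trace."
Li Qingzhao, "Lang Tao Sha" (浪淘沙)
Also, while writing this post I happened to be listening to Li Ronghao's song "作曲家" ("Composer"), so I note it here.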