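No way around it; I need to pick up the pace. I don't want to drag the work out to the last minute, so all the papers have to be finished today and tomorrow. (At minimum, get a rough understanding of: the technique, where the idea came from, and some of the reasoning.)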
Angelix: Scalable Multiline Program Patch Synthesis via Symbolic Analysis
Authors: Sergey Mechtaev; Jooyong Yi; Abhik Roychoudhury
1) Automated program repair has garnered growing interest.
2) GenProg and SPR, two search-based repair tools, share a common limitation: many of their “plausible” repairs delete functionality.
3) Semantics-based repair is promising but has a limitation: scalability.
4) Our repair method, called Angelix, can scale up to programs of similar size as are handled by search-based repair methods such as GenProg and SPR.
5) Angelix deals with large-scale real-world software, generating repairs including multi-location repairs. Also, Angelix automatically repaired the well-known Heartbleed vulnerability.
While such semantics-based repair methods show promise in terms of quality of generated repairs, their scalability has been a concern so far.
Various automated repair tools, such as GenProg [14], PAR [21], relifix [39], SemFix [26], Nopol [8], DirectFix [24] and SPR [23], to name only a few, have been introduced recently.
These automated repair methods can be classified into the following two broad methodologies, i.e., search-based methodology (e.g., GenProg, PAR, and SPR) and semantics-based methodology (e.g., SemFix, Nopol, and DirectFix).
Meanwhile, the semantics-based repair methodology synthesizes a repair using semantic information (via symbolic execution and constraint solving).
Classifying repair methods into search-based repair and semantics-based repair is somewhat analogous to the classification of software testing into search-based testing and symbolic-execution-based testing [28].
Currently, research in automated program repair considers three attributes: scalability (should scale to large real-world programs), repairability (should repair a large number of defects, possibly by covering many defect classes), and the quality of repairs (should produce repairs that make fewer changes to the program, delete less functionality, and are more likely to be accepted by developers).
Semantics-based repair methods often work by extracting a repair constraint, typically via symbolic execution. This repair constraint acts as a specification to guide program synthesis, so that a patch satisfying the repair constraint can be synthesized.
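As a rough illustration of how a repair constraint can act as a specification for synthesis, the sketch below treats the constraint as a list of (program state, required value) pairs for one suspicious condition and enumerates a small space of candidate expressions until one agrees with every pair. The constraint data, the candidate grammar (`a OP b`), and the function names are assumptions made up for this note; the tools discussed here rely on constraint solving rather than brute-force enumeration.

```python
from itertools import product
import operator

# Hypothetical repair constraint: for each observed program state at the
# suspicious location, the value the condition must take for the test to pass.
constraint = [
    ({"x": 1, "y": 5}, True),
    ({"x": 7, "y": 5}, False),
    ({"x": 5, "y": 5}, True),
]

OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
       ">=": operator.ge, "==": operator.eq, "!=": operator.ne}


def candidates(variables):
    """Enumerate candidate conditions of the form `a OP b` (illustrative grammar)."""
    for (a, b), op in product(product(variables, repeat=2), OPS):
        if a != b:
            yield a, op, b


def synthesize(constraint):
    """Return the first candidate consistent with every (state, value) pair."""
    variables = sorted(constraint[0][0])
    for a, op, b in candidates(variables):
        if all(OPS[op](state[a], state[b]) == want for state, want in constraint):
            return f"{a} {op} {b}"
    return None  # no patch exists in this tiny search space


patch = synthesize(constraint)
print(patch)
```

If the constraint is contradictory (the same state requiring both `True` and `False`), `synthesize` returns `None`, which mirrors the idea that an unsatisfiable repair constraint yields no patch.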
The key enabler for scalable multi-line bug fix in this paper is our novel lightweight repair constraint that we call an angelic forest. This angelic forest is automatically extracted via symbolic execution. As compared to the repair constraints used in the previous work [24, 26], the angelic forest is simpler, and its size is independent of the size of the program under repair, thereby making our repair method scale. Our angelic forest, despite its simplicity, contains enough semantic information to enable multi-location bug fix. Among existing search-based repair tools, SPR does not support multi-line fixes. While GenProg [14] can change multiple locations of the program, a recent study on GenProg repairs [33] shows that seemingly complex repairs generated from GenProg are, in the overwhelming majority of cases, in fact functionally equivalent to a single-line modification.
The absence of an angelic forest for a chosen n suspicious locations implies that it is not possible to repair the bug by changing these n locations. Symbolic execution finds an angelic forest (or proves the absence of an angelic forest) efficiently by exploring only feasible execution paths.
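To make the angelic forest idea concrete, here is a minimal sketch under heavy assumptions: a toy program with two suspicious expressions (a condition and an assignment right-hand side), and exhaustive enumeration over a small value domain standing in for symbolic execution. The program, the tests, the domain, and the names `run` and `angelic_forest` are all hypothetical. Each test maps to the set of value assignments that make it pass; if some test has no such assignment, the chosen locations cannot fix the bug (within the bounded domain used here).

```python
from itertools import product


def run(x, holes):
    """Toy program whose two suspicious expressions are replaced by holes."""
    y = 0
    if holes["cond"]:        # suspicious if-condition
        y = holes["rhs"]     # suspicious right-hand side of an assignment
    return y + x


# Hypothetical test suite: (input, expected output).
tests = [(1, 3), (4, 4)]

# Bounded value domain per hole; real symbolic execution is not bounded this way.
DOMAIN = {"cond": [False, True], "rhs": range(-3, 4)}


def angelic_forest(tests):
    """Map each test to the hole assignments (angelic paths) that make it pass."""
    forest = {}
    names = sorted(DOMAIN)
    for x, expected in tests:
        paths = []
        for values in product(*(DOMAIN[n] for n in names)):
            holes = dict(zip(names, values))
            if run(x, holes) == expected:
                paths.append(holes)
        if not paths:
            return None  # this test cannot pass by changing only these locations
        forest[(x, expected)] = paths
    return forest


forest = angelic_forest(tests)
for test, paths in forest.items():
    print(test, len(paths), "angelic path(s)")
```

One simplification to keep in mind: when `cond` is `False` the `rhs` hole is never executed, yet the enumeration still lists a value for it. A path-sensitive extraction would record only the expressions actually evaluated on that path.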
In our custom symbolic execution, symbols are installed during symbolic execution by replacing the value of each instance of a suspicious expression with a fresh symbol (line 7).
Our repair tool allows control of the following parameters of our repair algorithm: the maximum number of suspicious locations that can be repaired at the same time, the kinds of suspicious expressions, and the kinds of (semantics-preserving) program transformation.
First, for the maximum number of suspicious locations, we used the value between 1 and 10 (inclusive).
Afterwards, our repair algorithm replaces the user-configured n most suspicious expressions (chosen based on the result of statistical fault localization) with symbolic variables, as shown in Figure 1c, where conditional expressions and the right-hand side of an assignment are replaced with symbolic variables.
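A small sketch of how "the n most suspicious expressions" could be picked from test coverage. The Ochiai formula is used here purely as a familiar example of statistical fault localization; whether the tool uses this particular formula is not stated in these notes, and the coverage data, expression ids, and function names are invented for illustration.

```python
import math


def ochiai(ef, ep, total_failed):
    """Ochiai suspiciousness: ef / sqrt(total_failed * (ef + ep))."""
    denom = math.sqrt(total_failed * (ef + ep))
    return ef / denom if denom else 0.0


def top_n_suspicious(coverage, outcomes, n):
    """coverage: test name -> set of expression ids executed by that test.
    outcomes: test name -> True if the test passed.
    Returns the n expression ids with the highest Ochiai score."""
    total_failed = sum(1 for ok in outcomes.values() if not ok)
    exprs = set().union(*coverage.values())
    scores = {}
    for e in exprs:
        # ef / ep: number of failing / passing tests that execute expression e.
        ef = sum(1 for t, cov in coverage.items() if e in cov and not outcomes[t])
        ep = sum(1 for t, cov in coverage.items() if e in cov and outcomes[t])
        scores[e] = ochiai(ef, ep, total_failed)
    # Sort by descending score; break ties by id so the result is deterministic.
    ranked = sorted(scores, key=lambda e: (-scores[e], e))
    return ranked[:n]


# Hypothetical coverage and outcomes for three tests over three expressions.
coverage = {
    "t1": {"e1", "e2"},
    "t2": {"e1", "e3"},
    "t3": {"e2", "e3"},
}
outcomes = {"t1": True, "t2": False, "t3": False}

suspicious = top_n_suspicious(coverage, outcomes, n=2)
print(suspicious)
```

Here `n` plays the role of the user-configured maximum number of suspicious locations; the selected ids are the expressions that would then be replaced with symbolic variables.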
Second, our repair algorithm performs controlled symbolic execution with a few selected suspicious expressions, instead of usual symbolic input.
Does this mean the user sets the value of n themselves?
It works at the expression level, not the statement level.
We note that each of these afore-listed techniques is the improvement or extension of earlier work by us and others. As already mentioned, our novel lightweight program-size-independent semantic signature is the improvement of the heavyweight semantic signature used in our prior work DirectFix [24]. We also mention that the controlled symbolic execution was first introduced in our prior work, SemFix [26], although there a symbol is installed only at one location, and as a result, multi-location repair was not possible. Lastly, our repair strategy to ignore repair-wise infeasible suspicious locations has a similarity with Nopol [8] and SPR [23]. While detailed comparison will be provided in Section 8, Nopol and SPR currently cannot fix multi-location bugs. Furthermore, multi-location fix seems fundamentally difficult in Nopol and SPR, due to their weaker semantic signatures that do not capture the dependence between multiple program locations. The unique combination of our novel semantic signature with the existing techniques enables scalable multi-location bug fixing.
All our experiments were performed on Intel Xeon E5-2660 2.20GHz CPU with Ubuntu 14.04 64-bit operating system. We used 12 hours as the timeout of each repair session.
Open question: does it actually repair one location at a time, or does each repair pass work on a whole angelic forest?