Systems Engineering and Electronics ›› 2020, Vol. 42 ›› Issue (10): 2399-2408. doi: 10.3969/j.issn.1001-506X.2020.10.31
Dan GONG1,2, Tiantian WANG1, Xiaohong SU1, Meihan DONG1
Received: 2020-01-29
Online: 2020-10-01
Published: 2020-09-19
CLC Number:
Dan GONG, Tiantian WANG, Xiaohong SU, Meihan DONG. Identification method of similar bugs based on historical software repository and abstract syntax tree[J]. Systems Engineering and Electronics, 2020, 42(10): 2399-2408.
Table 1
Information of the subject programs
File name | Lines of code | Statements | Functions |
fannkuch | 105 | 65 | 2 |
n-body | 141 | 68 | 4 |
nsieve-bits | 36 | 26 | 1 |
partialsums | 68 | 52 | 3 |
puzzle | 84 | 58 | 7 |
recursive | 55 | 30 | 6 |
spectral-norm | 53 | 37 | 5 |
Bubblesort | 171 | 92 | 5 |
FloatMM | 160 | 84 | 6 |
IntMM | 159 | 83 | 6 |
Oscar | 323 | 169 | 10 |
Perm | 169 | 90 | 7 |
Puzzle | 225 | 174 | 8 |
Queens | 188 | 103 | 6 |
Quicksort | 174 | 103 | 6 |
RealMM | 161 | 84 | 6 |
Towers | 218 | 129 | 12 |
Treesort | 187 | 118 | 8 |
Table 5
Manually implanted bug module library
Type | Similar blocks: source file (bug-implanted line) | Dissimilar blocks: source file (bug-implanted line) |
1 | Towers.c(137), Towers.c(150) | fannkuch.c(70), Queens.c(169) |
2 | Bubblesort.c(137), Quicksort.c(136), Treesort.c(138) | Treesort.c(153) |
3 | Bubblesort.c(169), Perm.c(166), Quicksort.c(171), Towers.c(216), Treesort.c(185) | Bubblesort.c(149) |
4 | FloatMM.c(140), IntMM.c(140), RealMM.c(142) | Bubblesort.c(132) |
5 | Bubblesort.c(136), Quicksort.c(135), Treesort.c(137) | Queens.c(145) |
6 | Bubblesort.c(135), Quicksort.c(134), Treesort.c(136) | n-body.c(66) |
7 | FloatMM.c(129), IntMM.c(129), RealMM.c(131) | FloatMM.c(152) |
8 | Bubblesort.c(116), Puzzle.c(116), Queens.c(116), Towers.c(116), Treesort.c(116) | FloatMM.c(120) |
Table 8
Identification results of manually implanted similar bugs
Type | Group | TP | FP | TN | FN | Accuracy | Recall |
1 | similar | 2 | 0 | 25 | 0 | 1.00 | 1.00 |
1 | all | 2 | 2 | 30 | 2 | 0.89 | 0.50 |
2 | similar | 3 | 0 | 24 | 0 | 1.00 | 1.00 |
2 | all | 3 | 1 | 31 | 1 | 0.94 | 0.75 |
3 | similar | 5 | 0 | 22 | 0 | 1.00 | 1.00 |
3 | all | 5 | 1 | 29 | 1 | 0.94 | 0.83 |
4 | similar | 3 | 0 | 24 | 0 | 1.00 | 1.00 |
4 | all | 3 | 1 | 31 | 1 | 0.94 | 0.75 |
5 | similar | 3 | 0 | 24 | 0 | 1.00 | 1.00 |
5 | all | 3 | 1 | 31 | 1 | 0.94 | 0.75 |
6 | similar | 3 | 0 | 24 | 0 | 1.00 | 1.00 |
6 | all | 4 | 0 | 32 | 0 | 1.00 | 1.00 |
7 | similar | 3 | 0 | 24 | 0 | 1.00 | 1.00 |
7 | all | 3 | 1 | 31 | 1 | 0.94 | 0.75 |
8 | similar | 5 | 0 | 22 | 0 | 1.00 | 1.00 |
8 | all | 6 | 0 | 30 | 0 | 1.00 | 1.00 |
Total | similar | 27 | 0 | 189 | 0 | 1.00 | 1.00 |
Total | all | 29 | 7 | 245 | 7 | 0.95 | 0.81 |
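The accuracy and recall columns of Table 8 are consistent with the standard confusion-matrix definitions, accuracy = (TP + TN) / (TP + FP + TN + FN) and recall = TP / (TP + FN). The following is a minimal sketch that recomputes the "all" rows under that assumption; the names `ALL_ROWS`, `accuracy`, `recall`, and `totals` are hypothetical and do not come from the paper.

```python
# Counts (TP, FP, TN, FN) copied from the "all" rows of Table 8, keyed by bug type.
ALL_ROWS = {
    1: (2, 2, 30, 2),
    2: (3, 1, 31, 1),
    3: (5, 1, 29, 1),
    4: (3, 1, 31, 1),
    5: (3, 1, 31, 1),
    6: (4, 0, 32, 0),
    7: (3, 1, 31, 1),
    8: (6, 0, 30, 0),
}


def accuracy(tp, fp, tn, fn):
    """Fraction of all candidates that were classified correctly (assumed definition)."""
    return (tp + tn) / (tp + fp + tn + fn)


def recall(tp, fn):
    """Fraction of truly similar bugs that were identified (assumed definition)."""
    return tp / (tp + fn)


def totals(rows):
    """Column-wise sums of (TP, FP, TN, FN) over all bug types."""
    return tuple(sum(column) for column in zip(*rows.values()))


if __name__ == "__main__":
    for bug_type, (tp, fp, tn, fn) in ALL_ROWS.items():
        print(bug_type, round(accuracy(tp, fp, tn, fn), 2), round(recall(tp, fn), 2))
    tp, fp, tn, fn = totals(ALL_ROWS)
    print("Total", round(accuracy(tp, fp, tn, fn), 2), round(recall(tp, fn), 2))
```

Summing the per-type counts reproduces the Total row (29, 7, 245, 7), which suggests the totals are pooled counts rather than averages of the per-type ratios.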
Table 9
Bug identification results of real bug module library
Project | P | N | TP | FP | Recall | FPP |
gmp | 14 | 7 | 13 | 3 | 0.93 | 0.43 |
gzip | 75 | 36 | 72 | 21 | 0.96 | 0.58 |
libtiff | 1 675 | 822 | 1 566 | 411 | 0.93 | 0.50 |
lighttpd | 187 | 91 | 170 | 47 | 0.91 | 0.52 |
php | 737 | 363 | 730 | 139 | 0.99 | 0.38 |
python | 284 | 139 | 261 | 100 | 0.92 | 0.72 |
valgrind | 187 | 87 | 156 | 67 | 0.83 | 0.77 |
wireshark | 403 | 198 | 386 | 112 | 0.96 | 0.57 |
total | 3 562 | 1 743 | 3 354 | 900 | 0.94 | 0.52 |
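The table does not define its columns, but the reported values are consistent with Recall = TP / P and FPP = FP / N, assuming P and N are the numbers of positive and negative samples per project. The sketch below recomputes both columns under that assumption; `TABLE9` and `rates` are hypothetical names introduced only for illustration.

```python
# Per-project counts (P, N, TP, FP) copied from Table 9.
TABLE9 = {
    "gmp": (14, 7, 13, 3),
    "gzip": (75, 36, 72, 21),
    "libtiff": (1675, 822, 1566, 411),
    "lighttpd": (187, 91, 170, 47),
    "php": (737, 363, 730, 139),
    "python": (284, 139, 261, 100),
    "valgrind": (187, 87, 156, 67),
    "wireshark": (403, 198, 386, 112),
}


def rates(p, n, tp, fp):
    """Return (recall, FPP) rounded to two decimals.

    Assumed definitions, inferred from the table values:
    recall = TP / P and FPP = FP / N.
    """
    return round(tp / p, 2), round(fp / n, 2)


if __name__ == "__main__":
    for project, counts in TABLE9.items():
        print(project, *rates(*counts))
    # Pool the counts over all projects to reproduce the "total" row.
    pooled = tuple(sum(column) for column in zip(*TABLE9.values()))
    print("total", pooled, *rates(*pooled))
```

Pooling the counts gives (3562, 1743, 3354, 900), matching the total row, so the overall recall and FPP are computed over all samples rather than averaged across projects.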