Recently, the paper "A Universal Data Augmentation Approach for Fault Localization", by the research group led by Associate Professor Lei Yan of the School of Big Data and Software Engineering, was accepted by the 44th International Conference on Software Engineering (ICSE 2022), a CCF-A conference. To address the imbalance between failing and passing test cases that affects existing software fault localization techniques, the paper proposes a general data augmentation method for fault localization based on the semantic representation and synthesis of failing test cases. The method further improves the accuracy of existing fault localization techniques and provides an efficient and reliable solution for software fault localization.
In recent years, software fault localization has been a major research focus in software engineering. It is of great significance for tackling increasingly prominent software faults and for reducing the losses they cause to social production and the national economy. A balanced data set is key to effective fault localization. However, because failing test cases are irregularly distributed in the input domain and make up only a small proportion of it, a balanced data set is extremely difficult to obtain. This has become a bottleneck restricting the development of fault localization. To break through this bottleneck, the paper proposes characterizing the distribution of failing test cases in the input domain by means of a data feature space. Based on that characterization, new failing test cases that follow the same input-domain distribution can be synthesized, which resolves the imbalance of failing test cases. The method improves on state-of-the-art (SOTA) fault localization methods by 45% on average. ICSE reviewers spoke highly of the ideas, experiments, and contribution of the proposed data augmentation method. One reviewer said, "Data augmentation in itself is pretty rare in software engineering. So novelty is pretty high," adding that the research could provide solutions for other areas of software engineering: "More broadly, data augmentation in software engineering could have a great impact also for other related fields, such as program repair."
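To make the idea of balancing a data set with synthesized failing test cases concrete, the following is a minimal sketch only, and it is not the method of the paper. It assumes each test case is already represented as a numeric feature vector, and it replaces the paper's synthesis technique with a simple interpolation between pairs of existing failing vectors (in the style of SMOTE oversampling). The function name `balance_failing_tests` and the toy data are hypothetical.

```python
import random


def balance_failing_tests(features, labels, seed=0):
    """Add synthetic failing rows until failing and passing counts match.

    features: list of equal-length numeric vectors, one per test case.
    labels:   list of 0 (passing) or 1 (failing), aligned with features.
    """
    rng = random.Random(seed)
    failing = [row for row, label in zip(features, labels) if label == 1]
    passing_count = sum(1 for label in labels if label == 0)
    if not failing:
        raise ValueError("need at least one failing test case to synthesize from")

    new_features = [list(row) for row in features]
    new_labels = list(labels)
    # Assumption: interpolation stands in for the paper's synthesis step.
    # Each synthetic vector is a random point on the segment between two
    # existing failing vectors.
    for _ in range(max(0, passing_count - len(failing))):
        a = rng.choice(failing)
        b = rng.choice(failing)
        t = rng.random()
        new_features.append([x + t * (y - x) for x, y in zip(a, b)])
        new_labels.append(1)
    return new_features, new_labels


# Hypothetical toy data: 5 passing tests, 2 failing tests, 3 features each.
features = [[1, 0, 0], [1, 1, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1],
            [1, 1, 1], [0, 1, 1]]
labels = [0, 0, 0, 0, 0, 1, 1]
balanced_features, balanced_labels = balance_failing_tests(features, labels)
print(balanced_labels.count(0), balanced_labels.count(1))  # prints "5 5"
```

After balancing, the augmented feature vectors and labels could be passed to an existing fault localization technique in place of the original, imbalanced data.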
ICSE is widely recognized as a top international academic conference in software engineering. It is also a class-A conference recommended by the China Computer Federation (CCF), a category described as "among the very few best journals and conferences in the world that Chinese researchers are encouraged to participate in". The conference enjoys a strong reputation in the academic community.
The School of Big Data and Software Engineering of Chongqing University is the first affiliation of the paper. Xie Huan, a postgraduate student in the School's 2020 cohort, is the first author, and Lei Yan is the corresponding author. The research was supported by the National Key R&D Program and the National Natural Science Foundation of China.