Atif M. Memon and Qing Xie, Empirical Evaluation of the Fault-detection Effectiveness of Smoke Regression Test Cases for GUI-based Software
Daily builds and smoke regression tests have become popular quality assurance mechanisms to detect defects early during software development and maintenance. In previous work, we addressed a major weakness of current smoke regression testing techniques, namely their inability to automatically (re)test graphical user interface (GUI) event interactions, by presenting a GUI smoke regression testing process called DART. We have deployed DART and have found several interesting characteristics of GUI smoke tests, which we demonstrate empirically in this paper. Our experimental subjects are four GUI-based applications. For each application, we generate 5000-8000 smoke tests (enough to be run in one night). Our results show that (1) short GUI smoke tests with certain test oracles are effective at detecting a large number of faults, (2) there are classes of faults that our smoke tests cannot detect, (3) short smoke tests execute a large percentage of code, and (4) the entire smoke testing process is feasible in terms of execution time and storage space.