1. It is my understanding that the authors consider DSD-Crasher to be superior because it avoids reporting certain language-level sound bugs that are not also user-level sound bugs, saving time in the debugging/fixing process (and compensating for its relatively long running time). However, applications are usually improved and new versions are created. Isn't it possible that some of these "eliminated" bugs would emerge in future versions of the application, and thus have to be fixed anyway? Wouldn't this defeat the purpose of spending the extra time to avoid these bugs in the first place? Please elaborate.

Solution: Ideally, these user-level unsound bugs do not change from version to version. An example the authors give in the paper is a look-ahead function that is only ever called with nonnegative inputs. Also, if a bug *does* become user-level sound in a later version, fixing it then is "good enough": the benefit of eliminating bugs that never become user-level sound will (hopefully) outweigh the cost of ruling out bugs that *do* (and therefore have to be fixed later). That said, it could certainly be reasonable to run DSD-Crasher first to find the current user-level sound bugs, then run Check 'n' Crash to find the remaining language-level sound bugs. If you have enough time and resources to investigate *all* reported bugs, then great; if not, starting with DSD-Crasher's reports is probably a good idea.

2. In the results of this paper, DSD-Crasher takes more time than Check 'n' Crash and runs approximately the same number of test cases (439 vs. 434), yet has fewer reports confirmed by test cases (7 vs. 4). In what cases would this be interpreted as an improvement? In what circumstances would this be undesirable?
Solution: If the eliminated bug reports were false warnings (user-level unsound), then eliminating them saves the developers time and effort -- they don't have to sift through those reports. If, however, the reports were eliminated because Daikon inferred incorrect invariants, the developers won't see that there are real bugs.

3. The authors of Check 'n' Crash and DSD-Crasher use the terms 'sound for correctness' and 'sound for incorrectness', and they argue that 'sound for incorrectness' is a goal worth pursuing. Define these two types of 'soundness', and give some reasons why having a tool be 'sound for incorrectness' would be worthwhile.

Solution: Sound for correctness -- proving the absence of bugs (of a particular kind). Sound for incorrectness -- showing that a bug definitely exists. Why pursue soundness for incorrectness? Being sound for correctness means being pessimistic and usually involves rejecting valid programs (for example, divide-by-zero is not ruled out by static checking because such an analysis would have to reject too many programs). Tools that are sound for correctness therefore emit many false positives; this is particularly problematic when the people checking for correctness are not the developers and have difficulty distinguishing real bugs from false warnings.
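The look-ahead example from question 1 can be sketched in Java. This is a hypothetical illustration (the class and method names are mine, not from the paper): a static checker sees that a negative argument would crash the array access inside lookAhead, so the warning is language-level sound, but every caller in the program passes a nonnegative offset, so the bug is user-level unsound and DSD-Crasher's Daikon-inferred precondition (k >= 0) would suppress it.

```java
// Hypothetical sketch of the paper's look-ahead example (not actual paper code).
public class Lexer {
    private final char[] buf;
    private int pos = 0;

    public Lexer(String input) {
        this.buf = input.toCharArray();
    }

    // A static checker flags buf[pos + k] as a potential
    // ArrayIndexOutOfBoundsException for negative k (language-level sound),
    // but every caller in this program passes k >= 0, so no user-visible
    // crash exists (the report is user-level unsound).
    public char lookAhead(int k) {
        if (pos + k >= buf.length) return '\0';  // past end of input
        return buf[pos + k];
    }

    public boolean nextIsDigit() {
        return Character.isDigit(lookAhead(0));  // offset is always nonnegative
    }

    public static void main(String[] args) {
        Lexer lx = new Lexer("42abc");
        System.out.println(lx.nextIsDigit()); // prints "true"
    }
}
```

Running Daikon over the program's own executions would observe k >= 0 at every call to lookAhead, and DSD-Crasher uses that inferred precondition to discard the crash report that Check 'n' Crash alone would emit.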
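The divide-by-zero example in question 3's solution can also be made concrete. A minimal hypothetical sketch (names are mine): an imprecise sound-for-correctness analysis may warn about the guarded division even though it can never fail (a false positive), while a sound-for-incorrectness tool instead produces a concrete crashing input for the unguarded one, proving a real bug.

```java
// Hypothetical illustration of the two soundness notions (not from the paper).
public class SoundnessDemo {
    // An imprecise but sound-for-correctness analysis may still warn about
    // n / d here, even though the guard makes division by zero impossible --
    // a false positive the developer must sift through.
    static int safeDiv(int n, int d) {
        return d == 0 ? 0 : n / d;  // guarded: never actually divides by zero
    }

    // A sound-for-incorrectness tool instead reports a concrete failing
    // input (e.g. d == 0), demonstrating that the bug is real.
    static int buggyDiv(int n, int d) {
        return n / d;  // throws ArithmeticException when d == 0
    }

    public static void main(String[] args) {
        System.out.println(safeDiv(10, 0));  // prints 0, no crash
        try {
            buggyDiv(10, 0);
        } catch (ArithmeticException e) {
            System.out.println("confirmed bug: " + e.getMessage());
        }
    }
}
```

This is the trade-off the authors point to: the first report costs developer time for nothing, while the second comes with a witness input that makes the failure reproducible.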