1. It is my understanding that the authors consider DSD-Crasher to be superior because it avoids reporting certain language-level sound bugs that are not also user-level sound bugs, saving time in the debugging/fixing process (and compensating for its relatively long running time). However, applications are usually improved and new versions are created. Isn't it possible that some of these "eliminated" bugs would emerge in future versions of the application, and thus have to be fixed anyway? Wouldn't this defeat the purpose of spending the extra time to avoid these bugs in the first place? Please elaborate.

Solution: Ideally, these user-level unsound bugs do not change from version to version. An example the authors give in the paper is a look-ahead function that is only ever called with nonnegative inputs. Also, if a bug *does* become user-level sound in a later version, fixing it then is "good enough": the benefit of eliminating bugs that never become user-level sound will (hopefully) outweigh the cost of ruling out bugs that *do* (and therefore have to be fixed later). That said, it could certainly be reasonable to run DSD-Crasher first to find the current user-level sound bugs, then run Check 'n' Crash to find the remaining language-level sound bugs. If you have enough time and resources to investigate *all* reported bugs, then great; if not, starting with DSD-Crasher's reports is probably a good idea.

2. In the results of this paper, DSD-Crasher takes more time than Check 'n' Crash and runs approximately the same number of test cases (439 vs. 434), yet has fewer reports confirmed by test cases (7 vs. 4). In what cases would this be interpreted as an improvement? In what circumstances would this be undesirable?
Solution: If the eliminated bug reports were false warnings (user-level unsound), then eliminating them saves the developers time and effort -- they don't have to sift through those reports. If, however, the reports were eliminated because Daikon inferred incorrect invariants, the developers won't see that there are real bugs.

3. The authors of Check 'n' Crash and DSD-Crasher use the terms 'sound for correctness' and 'sound for incorrectness', and they argue that 'sound for incorrectness' is a goal worth pursuing. Define these two types of 'soundness', and give some reasons why having a tool be 'sound for incorrectness' would be worthwhile.

Solution: Sound for correctness -- proving the absence of bugs (of a particular kind). Sound for incorrectness -- showing that a bug definitely exists. Why pursue soundness for incorrectness? Being sound for correctness means being pessimistic and usually involves rejecting valid programs (for example, divide-by-zero is not ruled out by static checking because such an analysis would have to reject too many programs). Tools that are sound for correctness therefore emit many false positives; this is particularly problematic when the people checking for correctness are not the developers and have difficulty distinguishing real bugs from false warnings.
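The look-ahead example from question 1 can be sketched in Java. This is a hypothetical illustration (the class and method names are mine, not from the paper): a static checker sees that a negative argument would crash the array access inside lookAhead, so the warning is language-level sound, but every caller in the program passes a nonnegative offset, so the bug is user-level unsound and DSD-Crasher's Daikon-inferred precondition (k >= 0) would suppress it.

```java
// Hypothetical sketch of the paper's look-ahead example (not actual paper code).
public class Lexer {
    private final char[] buf;
    private int pos = 0;

    public Lexer(String input) {
        this.buf = input.toCharArray();
    }

    // A static checker flags buf[pos + k] as a potential
    // ArrayIndexOutOfBoundsException for negative k (language-level sound),
    // but every caller in this program passes k >= 0, so no user-visible
    // crash exists (the report is user-level unsound).
    public char lookAhead(int k) {
        if (pos + k >= buf.length) return '\0';  // past end of input
        return buf[pos + k];
    }

    public boolean nextIsDigit() {
        return Character.isDigit(lookAhead(0));  // offset is always nonnegative
    }

    public static void main(String[] args) {
        Lexer lx = new Lexer("42abc");
        System.out.println(lx.nextIsDigit()); // prints "true"
    }
}
```

Running Daikon over the program's own executions would observe k >= 0 at every call to lookAhead, and DSD-Crasher uses that inferred precondition to discard the crash report that Check 'n' Crash alone would emit.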
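The divide-by-zero example in question 3's solution can also be made concrete. A minimal hypothetical sketch (names are mine): an imprecise sound-for-correctness analysis may warn about the guarded division even though it can never fail (a false positive), while a sound-for-incorrectness tool instead produces a concrete crashing input for the unguarded one, proving a real bug.

```java
// Hypothetical illustration of the two soundness notions (not from the paper).
public class SoundnessDemo {
    // An imprecise but sound-for-correctness analysis may still warn about
    // n / d here, even though the guard makes division by zero impossible --
    // a false positive the developer must sift through.
    static int safeDiv(int n, int d) {
        return d == 0 ? 0 : n / d;  // guarded: never actually divides by zero
    }

    // A sound-for-incorrectness tool instead reports a concrete failing
    // input (e.g. d == 0), demonstrating that the bug is real.
    static int buggyDiv(int n, int d) {
        return n / d;  // throws ArithmeticException when d == 0
    }

    public static void main(String[] args) {
        System.out.println(safeDiv(10, 0));  // prints 0, no crash
        try {
            buggyDiv(10, 0);
        } catch (ArithmeticException e) {
            System.out.println("confirmed bug: " + e.getMessage());
        }
    }
}
```

This is the trade-off the authors point to: the first report costs developer time for nothing, while the second comes with a witness input that makes the failure reproducible.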