Since we're talking about the limits of our charter, here's another issue related to memory accesses and threads, and particularly to section 6 of Jeremy's and Bill's proposal. There are a lot more details in my POPL 2003 paper. I've previously discussed some of this with Joshua and Bill. I don't believe that very many uses of Java finalizers or java.lang.ref can be correct without some changes along these lines.
Consider the following use of finalizers in Java. I think this is actually in some sense the prototypical use. (You can use java.lang.ref instead. At this level it doesn't matter.)
We have some class C which keeps the actual state associated with C objects in some static array A. Each C object contains a field "index", which contains the index in A of the corresponding data. (There are several possible reasons for doing this, e.g. to ensure that temporary files logically associated with C objects can be found and cleaned up at process exit, or possibly to allow sharing between C instances.)
The C.finalize() method removes the corresponding data from the A array.
Now consider a method C.foo() which may be the last call on a C object. Assume that x of class C is accessed by a single thread. Typically x.foo() will do something like:
x_index = x.index;
Operate on A[x_index];
The problem is that x may be unreachable after the read of x.index. (Or potentially much earlier, if that memory read was for example moved outside an enclosing loop.) Thus the operation on A[x_index] may occur after x's finalizer has run, and the data associated with x has been removed from A.
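To make the pattern concrete, here is a minimal Java sketch of the class described above. Only C, A, index, foo() and finalize() come from the description; the String payload, the fixed table size, and the allocate() helper are assumptions made purely for illustration.

```java
// Illustrative sketch only: payload type, table size and allocate() are assumed.
class C {
    // Static table holding the actual state of every C object.
    static final String[] A = new String[16];   // fixed capacity, for brevity
    private static int next = 0;

    final int index;   // index in A of this object's data

    C(String data) {
        index = allocate(data);
    }

    private static synchronized int allocate(String data) {
        int i = next++;
        A[i] = data;
        return i;
    }

    // Unsynchronized version of foo(). Once this.index has been read,
    // nothing below needs `this`, so the object may be treated as
    // unreachable and finalized before A[xIndex] is used.
    int foo() {
        int xIndex = this.index;
        return A[xIndex].length();   // "Operate on A[x_index]"
    }

    @SuppressWarnings({"deprecation", "removal"})
    @Override
    protected void finalize() {
        A[index] = null;   // remove this object's data from A
    }
}
```

If the finalizer runs between the read of index and the access to A, foo() sees a cleared slot; in this sketch that would show up as a NullPointerException.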
Thus this code is clearly wrong. Since we all know that finalizers introduce concurrency, this perhaps shouldn't be surprising. foo() and the finalizer run in separate threads, so both should be synchronized. Thus foo() should be effectively:
lock x;
x_index = x.index;
Operate on A[x_index];
unlock x;
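In Java source, the locked version would presumably be written with synchronized methods, roughly as in the sketch below. The class name SyncC, the String payload, and the allocate() helper are assumptions for illustration; "lock x" corresponds to acquiring the monitor of the object itself.

```java
// Illustrative sketch only: names other than A, index, foo and finalize are assumed.
class SyncC {
    static final String[] A = new String[16];   // fixed capacity, for brevity
    private static int next = 0;

    final int index;

    SyncC(String data) {
        index = allocate(data);
    }

    private static synchronized int allocate(String data) {
        int i = next++;
        A[i] = data;
        return i;
    }

    // lock x; x_index = x.index; operate on A[x_index]; unlock x;
    synchronized int foo() {
        int xIndex = this.index;
        return A[xIndex].length();
    }

    // The finalizer takes the same lock before removing the data.
    @SuppressWarnings({"deprecation", "removal"})
    @Override
    protected synchronized void finalize() {
        A[index] = null;
    }
}
```

This only helps if the object is still considered reachable while foo() holds its lock, which is exactly the point in question.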
Now assume that the last calls to x.foo() are inlined in a loop:
for (...;...;...) {
    lock x;
    x_index = x.index;
    Operate on A[x_index];
    unlock x;
}
Assume further that I have a VM that stores the lock associated with x in a separate table (many JDK classic variants do, as does gcj/gij) and that the compiler notices that x.index is constant through the loop. I may get code that effectively does:
x_lock_index = ...x...
x_index = x.index;
for (...;...;...) {
    lock' x_lock_index;
    Operate on A[x_index];
    unlock' x_lock_index;
}
Now x is unreachable in the loop, and I have the same problem. (JLS 12.6.1: "Optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable.")
I would like to see us conclude one of the following:
1) The above is already clearly outlawed by some part of the current JLS or proposed memory model that I just happened to overlook, or
2) We should add a constraint that any object is reachable (for finalization/GC purposes) when its lock is released. Since prior memory references are ordered before the lock release, that should ensure that visible operations on A[x_index] are complete before the finalizer runs.
(Based on an earlier conversation with Guy and Joshua, I think there is some agreement that the above optimization shouldn't happen. I'm unsure whether there are any implementations for which it might. I think for gcj/gij it can't currently, but the reasons are an accident that could conceivably get "fixed" in the future.)
If we do (2), I would also like to see a second constraint:
- If object x refers to object y via a final or volatile field and x is reachable, then y must be reachable.
This is necessary to make the "Finalizer Guardian idiom" (p. 23 in Joshua's "Effective Java") correct. It is also needed if you want to explicitly impose a finalizer ordering by having one finalizer clear a reference to another.
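For readers who don't have the book at hand, the idiom looks roughly like the sketch below. The class name Resource and the release()/isReleased() methods are assumptions for illustration; the essential part is the private final field holding an anonymous object whose finalizer cleans up the enclosing instance.

```java
// Illustrative sketch only: Resource, release() and isReleased() are assumed names.
class Resource {
    private boolean released = false;

    // The guardian's sole purpose is to finalize the enclosing Resource.
    // It is referenced only through this final field, so it should become
    // unreachable no earlier than the Resource itself does.
    private final Object finalizerGuardian = new Object() {
        @SuppressWarnings({"deprecation", "removal"})
        @Override
        protected void finalize() {
            release();   // clean up the outer object
        }
    };

    synchronized void release() {
        released = true;
    }

    synchronized boolean isReleased() {
        return released;
    }
}
```

Without the second constraint, the guardian could be judged unreachable, and its finalizer run, while the enclosing Resource is still in use.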
Hans
-------------------------------
JavaMemoryModel mailing list - http://www.cs.umd.edu/~pugh/java/memoryModel