I've been catching up on the discussion here, and wanted to inject a
couple of observations.
------------------------------------------------------------
RECOGNIZING THAT AN OBJECT WON'T CHANGE
As many folks noted after my last comment, the real dangers of excess
synchronization arise because Java has idioms which _force_ the use of
mutation during initialization of objects which are otherwise left
untouched.
The latest version of our JMM paper (the one which will appear in the
OOPSLA proceedings, and which I'll be presenting at the JMM workshop)
gives translations which permit lock elimination, including the
elimination of related memory barriers:
ftp://csg-ftp.lcs.mit.edu/pub/papers/csgmemo/memo-428.ps.gz
The approach we take is roughly this:
1) Perform no fence of any kind on MonitorEnter if we were the last
holder of an object's lock. The creating thread is initially the
"last holder". On some architectures this might even be worth
putting in the locking code.
2) Rely on the fact that a MonitorExit and its related fence
operation can be deferred as long as we like, up to the next
MonitorEnter which actually performs a fence operation (i.e., a
non-local MonitorEnter).
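To make this concrete, here is a small Java sketch of my own (the
class is invented and the annotations are informal) showing where the
two rules would let an implementation omit or defer fences:

    class Counter {
        private int value;
        synchronized void increment() { value++; }   // MonitorEnter ... MonitorExit
        synchronized int get()        { return value; }
    }

    class Demo {
        public static void main(String[] args) {
            Counter c = new Counter();  // creating thread starts as the "last holder"

            // Rule 1: this thread was the last holder of c's lock, so neither
            // MonitorEnter below need perform any fence.
            c.increment();
            c.increment();

            // Rule 2: the MonitorExit fences above can be deferred; they need
            // only take effect by the next MonitorEnter that really fences,
            // e.g. if c were handed to another thread which then locked it.
            System.out.println(c.get());
        }
    }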
These rules give us two related optimizations:
1) Elimination of synchronization for thread-local objects,
including related memory fences.
2) Merging of adjacent synchronization regions on the same object.
This is a bit limited, since synchronization regions are often
nested, but it does allow a series of getter/setter
methods to be executed with only a single lock & unlock even if the
object being manipulated is shared. Doing this around a loop can
lead to starvation, so caveat compiler.
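As an informal illustration of the second optimization (the class and
caller below are my own invention, not from the paper), consider:

    class Point {
        private int x, y;
        synchronized void setX(int v) { x = v; }
        synchronized void setY(int v) { y = v; }
        synchronized int  getX()      { return x; }
        synchronized int  getY()      { return y; }
    }

    class Caller {
        // As written this takes and releases p's lock four times.  Merging
        // the adjacent regions lets the compiler run the body as if it were
        // wrapped in a single synchronized (p) { ... } block, with one
        // MonitorEnter and one MonitorExit, even though p may be shared.
        static int update(Point p) {
            p.setX(1);
            p.setY(2);
            return p.getX() + p.getY();
        }
    }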
I wonder if we can perhaps extend this notion a bit. I have no idea
how to formalize this in our own framework offhand, but what if we
reset the "last holder" of a lock only if we performed a write while
the lock was being held? Then we'd have the following rules:
* As always, the first access of a shared object in any thread must
perform a memory fence, or we do not even see meaningful contents
for the object.
* If we only read an object after initialization, then subsequent
accesses need not perform memory fences on MonitorExit, and could be
collapsed to eliminate locking entirely.
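Concretely, the kind of object I have in mind looks something like
this (a hypothetical sketch; the class and methods are mine):

    class Config {
        private String[] values;

        // All mutation happens here, in the creating thread, before the
        // object is handed to anyone else.
        synchronized void init(String[] v) {
            values = (String[]) v.clone();
        }

        // After init() no lock region ever writes, so under the proposed
        // rule a thread which has already fenced on its first access need
        // not fence again on MonitorExit here, and the locking itself
        // could be collapsed away.
        synchronized String get(int i) {
            return values[i];
        }
    }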
I'm frankly unsure how a compiler could go about spotting the above
properties, however; if I did, I'm sure I'd also understand how to
encode them in our framework. It would probably involve re-compiling
after the init methods were invoked and observing that those methods
had become dead code.
Doug Lea wrote:
> For example, how can you verify that no field written in a
> "final" clause is ever written again? And how can you tell exactly
> which fields need barriers on reads (on platforms requiring this)?
> The current rules restricting writes to finals to be in constructors make
> these issues tractable, but they don't extend in any way I can see to
> these more general constructions.
And I think this enforcement problem remains whether we allow the
programmer to declare such behavior (as Doug Lea was discussing above)
or whether we write the memory model itself to be permissive in this
particular case, as I suggest here.
------------------------------------------------------------
MONOTONICITY WITHOUT LOCKING
Most techniques that I've seen for lock-free concurrency rely on a
monotonicity guarantee, i.e., the value you care about will only get
"more defined" over time.
Take Hans Boehm's examples:
> It seems to me it's unattainable for most large programs. There are too
> many ways to cheat and get better performance. And some of the more
> interesting cheats are even correct, e.g. updating a cache with an atomic
> pointer assignment and not locking on lookup, or "or"ing boolean results by
> initializing a global to "false" and then asynchronously writing "true" into
> it, potentially from multiple threads. Even if the program in question
> doesn't care enough about performance to do this, one of the library writers
> will have.
We do have to be careful here, though, as there seem to be two kinds
of monotonicity guarantees we rely upon.
1) In the case of cache update, it must be all right to re-compute
the cache value in multiple threads. As a result we can keep the
cache in a plain old array, provided the cached data is final or
synchronized (or itself obeys a monotonicity constraint). I call
this the "may" case, since we *may* get recent data, but will run
correctly if we do not. The "final or synchronized" caveat is there
because the cache update could otherwise make an incompletely
written object visible.
Note that to keep the cache in a hash table, we must guarantee that
table updates are atomic. This is doable without locks, but messy, as
hash table updates require both key and datum to be updated together.
(I sketch this case in code below, after item 2.)
2) In the case of the global "or", we might need to know the precise
current value of the "or". I call this the "must" case, since we
*must* see the most recent data for correctness. In this case, the
field must be volatile, or we must rely on locking.
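To make the two cases concrete, here are two hedged sketches of my
own (the class and method names are invented, not taken from the
discussion above). First the "may" case: a cache kept in a plain
array whose reference is swapped with one atomic pointer assignment
and read without locking. Keeping key and datum together in a single
immutable Entry is also one way to sidestep the hash-table atomicity
problem noted under item 1.

    class Interner {
        // Entries are immutable (all fields final): this is the caveat
        // from item 1, so that a racing reader never sees a half-written
        // entry.
        static final class Entry {
            final String key;
            final Object value;
            Entry(String k, Object v) { key = k; value = v; }
        }

        private Entry[] cache = new Entry[0];   // replaced by atomic pointer assignment

        Object lookup(String key) {
            Entry[] snap = cache;                // plain read, no lock
            for (int i = 0; i < snap.length; i++) {
                Entry e = snap[i];
                if (e != null && e.key.equals(key))
                    return e.value;              // hit; possibly stale, still correct
            }
            // Miss: recompute.  Several threads may race here; whichever
            // write to cache lands last wins, and a lost update merely
            // costs a later recomputation.  We *may* see recent data,
            // but need not.
            Object value = compute(key);
            Entry[] grown = new Entry[snap.length + 1];
            System.arraycopy(snap, 0, grown, 0, snap.length);
            grown[snap.length] = new Entry(key, value);
            cache = grown;                       // single atomic reference assignment
            return value;
        }

        private Object compute(String key) { return key.toUpperCase(); }
    }

And the "must" case: the global "or" flag, whose most recent value
readers need to observe, so the field is declared volatile.

    class AnyFailed {
        // Readers *must* observe the most recent write; a plain field
        // would let a reader legally keep returning false, so it is
        // volatile (the alternative is locking around writes and reads).
        private volatile boolean failed = false;

        void reportFailure() { failed = true; }  // may be called from many threads

        boolean anyFailure() { return failed; }
    }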
I'm not sure I've seen the above distinction codified clearly in this
discussion. Volatile fields require essentially the same memory fence
behavior as synchronization (albeit avoiding the lock manipulation,
and maybe reducing two fences to just one depending on the
architecture). It's thus unclear to me how much we really buy in
practice in the *must* case. This is especially true if the compiler
can spot and optimize synchronized getter/setter methods, in which
case they can be made to look like volatile reads/writes.
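For that last point, a hedged sketch (the names are mine): a compiler
that recognizes a pair like SyncFlag below could implement it with
roughly the fences of the volatile field in the previous sketch,
dropping the lock manipulation.

    class SyncFlag {
        private boolean value;
        synchronized void set(boolean v) { value = v; }
        synchronized boolean get()       { return value; }
    }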
-Jan-Willem Maessen