(Apologies to those who are tired of hearing about this... :-)
> If the discussion is about whether to support or not to support this
> kind of construction, my personal opinion is that the first three
> points that you mentioned against it are very strong (refer to the
> answer for Question 1 in the "recap: concurrent reads" message), and
> should be followed
Perhaps seeing one of the examples I've been alluding to will help
convince you otherwise. Attached is code for class CRHashtable (CR for
"Concurrently Readable"). In it, all update operations entail locking,
but they are all written so that the hash table is at all times
consistent, and thus in principle readable without locking. However,
the retrieval operations (mainly get()) sometimes require memory
barriers in order to obtain fresh reads of the linked lists off the
indexed bins. (The code tries to minimize even these barriers via a
form of double-check.)
Except... Right now, I don't know a way to write this efficiently in a
way that is unambiguously legal wrt proposed memory model.
Two variants are in the code, controlled by static final
USE_SYNCH_FOR_BARRIER. If true, it acquires and releases the main lock
used for updates as the read barrier (this is the legal but slow way).
Otherwise, it reads a dummy volatile that is written on each
update. This is the trick that Bill thought was reasonable wrt his
proposal. It is also a trick that I happen to know works fine right
now on Sun VMs.
(As mentioned in previous exchanges, a third version is legal under
JLS memory model but not the LC-based one: You could get a barrier
without any chance of contention via
synchronized(new Object(){}.
Or more grossly but cheaply, and with fewer guarantees:
synchronized(Thread.currentThread()){}.)
The differences between these two are pretty dramatic under heavy
contention. Having the read operations contend for the same lock
needed by the update operations starts killing performance as the lock
is increasingly tied up by these update operations.
For example, here are runs times for a program in which there is an
intially empty table and 100K items. Each thread does 10M times: for
some random item k: if k is absent, then with 0.1 probability add it,
else if it is present, then with 0.05 probablity remove it. (These
get/put and get/remove sequences are unsynchronized, so some of the
attempted puts and removes fail.) Here, about 85% of the operations
are reads, which appears to be a relatively low percentage compared to
typical programs (for example those in SpecJVM programs). Higher
percentages of reads would produce yet larger differences
For comparison, runs for java.util.Hashtable are shown, as well as for
java.util.HashMap, which is unsynchronized so is only shown for
1-thread.
This is on a 4CPU sparc E450 running the Production 1.2 VM. The
patterns of results under hotspot are similar. Times in millesconds.
#threads
1 2 4
HashMap 19248
CR - volatile 19710 23997 38312
CR - lock 21160 28212 200167
Hashtable 22833 198470 277712
The performance of "CR-lock" is pretty typical of these algorithms
when locking is used for barriers: It runs OK up to the point where
contention for the lock hits a threshhold that starts a positive
feedback loop leading to it almost always being held; thus, among
other things, defeating fast-path locking heuristics used in this and
most VMs.
Again, the issue here is that in this algorithm and many like it,
there is no logical reason for most reader operations to obtain a
lock. Forcing a programmer to use one anyway can lead to poorer
performance.
This is of course a completely artificial test, running under fairly
extreme conditions. Those of you able to do so might try renaming
this class to java.util.Hashtable and comparing performance on real
loads. The class is designed to be 100% interoperable with
jdk1.2 Hashtable.
> But, furthermore, it seems that the discussion
> here somehow diverged from memory model issues, turning to what I
> would call 'program model issues'. Isn't this issue related with the
> mapping we should use from the language primitives to the memory
> model? Or should we really incorporate this level of detail in the
> memory model specification?
Right. I don't see anything about LC itself that precludes lockless
memory acquire/release. We just don't have an obvious syntactic slot
to put it in. Unless the language syntax is changed solely to support
a new memory model (which seems extremely unlikely), the choice of
memory model and the choice of how to map language constructs down to
it are not independent.
-Doug
-------------------------------
JavaMemoryModel mailing list - http://www.cs.umd.edu/~pugh/java/memoryModel
This archive was generated by hypermail 2b29 : Thu Oct 13 2005 - 07:00:24 EDT