At 8:13 AM -0500 11/9/99, Doug Lea wrote:
>As an example, here's a popular design for an unbounded linked queue,
>published in Maged Michael and Michael Scott's "Simple, Fast, and
>Practical Non-Blocking and Blocking Concurrent Queue Algorithms",
>Proceedings, 15th ACM Symposium on Principles of Distributed
>Computing, ACM, 1996 -- see
> http://www.cs.rochester.edu/u/michael/PODC96.html.
>
>This is the fastest lock-based queue design that I know of. It is used
>in a lot of systems. A Java version that conforms to both current and
>proposd revised models looks like this. (This is from p130-131 of 2nd
>edition of CPJ, and is the basis of my util.concurrent.LinkedQueue class.)
With my proposed change to the meaning of volatile, a programmer can
declare the
next field to be volatile and eliminate the inner lock. Doug
hinted/suggested this:
>Note that volatiles could come to the rescue here if dereferencing a
>volatile reference did guarantee visibility of contents.
At 8:13 AM -0500 11/9/99, Doug Lea wrote:
>I suppose that is worth mentioning here that the non-blocking version
>of this queue presented in the Michael & Scott paper cannot be
>efficiently implemented in Java because it requires a CAS.
Actually....
I've been thinking about how a compiler could recognize that a synchronization
lock could be replaced with a CAS instruction. I think the following
code will do
work for the Queue example.
The compiler has to be able to determine that the only places where a lock will
be obtained on the object referenced by this.lock is in the 5
synchronized block
within this code. Further, it needs to verify that each of these
synchronized blocks
is performing a simple atomic compare, swap and branch. Given that, I
think it will
be possible to replace them all with compare and swaps.
Of course, we got a fair number of volatile fields in here as well, so figuring
out the exact number of memory barriers required by this code is a
little tricky.
It is somewhat simplified by the fact that the CAS probably acts as a memory
barrier on most machines.
Actually, I suspect that the code we code generate from this would be
more efficient
than Michael's non-blocking queue algorithm, partially because having
garbage collection
makes a number of things simpler (it eliminates the need for a
double-word CAS and a counter
field).
I need to think this over some more, but I'm going to be brave and
post the code
I've got at the moment. I would be too surprised if I have to email
out some bug fixes
tomorrow morning.
Bill
Queue implementation that can be optimized to use non-blocking CAS:
public class Queue {
static class Node {
Node(Object o) {
value = o;
}
Object value;
volatile Node next;
}
private final Object lock = new Object();
volatile Node head = new Node(null);
volatile Node tail = head;
void enqueue(Object o) {
Node nn = new Node(o);
Node t;
while (true) {
t = tail;
synchronized(lock) {
if (t.next == null) {
t.next = nn;
break;
}
}
Node nxt = t.next;
synchronized(lock) {
if (tail == t) tail = nxt;
}
}
synchronized(lock) {
if (tail == t) tail = nn;
}
}
Object dequeue() {
Node h;
while(true) {
h = head;
Node nxt = h.next;
if (nxt == null) return null;
Node t = tail;
if (h == t)
synchronized(lock) {
if (tail == t) tail = nxt;
}
else synchronized(lock) {
if (head == h) {
head = nxt;
break;
}
}
}
Object tmp = h.value;
h.value = null;
return tmp;
}
}
-------------------------------
JavaMemoryModel mailing list - http://www.cs.umd.edu/~pugh/java/memoryModel
This archive was generated by hypermail 2b29 : Thu Oct 13 2005 - 07:00:22 EDT