At 9:08 AM -0700 7/11/02, Cliff Click wrote:
>If I see a dead volatile load I can remove the actual load, but
>a volatile ref includes a 'MemBarAcquire' (in HotSpot parlance)
>to help prevent various compiler & hardware reorderings. Can I
>remove the MemBarAcquire as well?
Sorry, but no. A volatile read has clearly defined semantics, and the
acquire barrier can't be eliminated.
For example, Rob Strom's Optimistic Readers in ECOOP 2001 assumes
volatiles are two-way memory barriers. To work with Java volatiles,
you need to insert some dummy/dead reads and writes. See my email
below.
I can't imagine that you would find much code with a significant
number of dead volatile reads, unless someone is doing something that
depends upon the memory barrier semantics of dead volatile reads.
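
For what it's worth, the pattern would look something like this (a
minimal sketch with made-up names; the value of the volatile read is
discarded, so only its acquire barrier does any work):

class Example {
    volatile boolean flag;
    int data;

    int readData() {
        boolean ignore = flag;  // "dead" volatile read: the value is never
                                // used, but the acquire barrier keeps the
                                // read of data below from being reordered
                                // above it by the compiler or hardware
        return data;
    }
}
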
Bill
At 3:25 PM -0400 7/6/01, Bill Pugh wrote:
>X-Sender: pugh@savoir.cs.umd.edu (Unverified)
>Date: Fri, 6 Jul 2001 15:25:17 -0400
>To: javamemorymodel@cs.umd.edu
>From: Bill Pugh <pugh@cs.umd.edu>
>Subject: JavaMemoryModel: Roach motel volatile ordering and the
>optimistic readers transform
>Sender: owner-javamemorymodel@cs.umd.edu
>
>At 1:09 PM +1000 6/29/01, David Holmes wrote:
>>It is not clear to me that volatile and monitors should have the same
>>semantics when it comes to reorderings. The basic "roach motel" approach
>>with synchronized blocks works fine (so we've been persuaded :) ) because of
>>the added mutual exclusion. With volatiles there is no mutual exclusion and
>>I am concerned that these ordering rules may inhibit the use of volatile in
>>implementing various wait-free/non-blocking algorithms.
>>
>>As an example, the optimistic readers transform of Strom and Auerbach (ref
>>ECOOP 2001) requires insertion of a dummy volatile read after a volatile
>>write to prevent subsequent non-volatile writes from being performed prior
>>to the volatile write. If volatiles are to be useful as flags in
>>wait-free/non-blocking algorithms then it seems to me that ordering is
>>critical and so volatile accesses should act as code motion barriers in both
>>directions.
>
>
>The load/acquire and store/release semantics for synchronization
>memory accesses have a long history in the architecture community.
>For example, see:
>
>K. Gharachorloo, D. Lenoski, J. Laudon, P. Gibbons, A. Gupta, and
>J.L. Hennessy. Memory consistency and event ordering in scalable
>shared-memory multiprocessors. In Proceedings of the 17th Annual
>International Symposium on Computer Architecture, pages 15--26.
>IEEE, May 1990.
>
>Making volatile reads and writes full two-way memory barriers
>would likely at least double the cost of volatile reads on IA-64 and
>Alpha SMPs.
>
>Plus, you don't really need it. Here is an example of Rob's
>optimistic readers idiom (from his paper):
>
>volatile int vno = 0;
>int a,b;
>synchronized void inc(int x) {
> vno++;
> a += x;
> b += x;
> vno++;
> }
>
>/* unsynchronized */
>int prod() {
> int v1 = vno; /* pre-inspect */
> int t1 = a;
> int t2 = b;
> int v2 = vno; /* post-inspect */
> if (v1 == v2 && v1%2 == 0)
> return t1*t2; /* commit */
> else ... /* abort */
> }
>
>Rob's example only works if volatiles are 2-way memory barriers.
>Here is how you need to change it to make it work for volatiles that
>have load-acquire and store-release semantics:
>
>
>volatile int vno = 0;
>volatile boolean force;
>int a,b;
>synchronized void inc(int x) {
> int oldVersion = vno;
> vno = oldVersion+1;
> boolean ignore = force; /* Don't allow accesses to be hoisted */
> a += x;
> b += x;
> vno = oldVersion+2;
> }
>
>/* unsynchronized */
>int prod() {
> int v1 = vno; /* pre-inspect */
> int t1 = a;
> int t2 = b;
> force = false; /* force all accesses to be committed */
> int v2 = vno; /* post-inspect */
> if (v1 == v2 && v1%2 == 0)
> return t1*t2; /* commit */
> else ... /* abort */
> }
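>
>(Just to make the threading explicit: a minimal sketch of how the two
>methods above might be driven, assuming they are wrapped in a class
>named Pair; the wrapper and the driver are illustrative only, not from
>Rob's paper.)
>
>class Driver {
>    public static void main(String[] args) {
>        final Pair p = new Pair();   // holds vno, force, a, b, inc(), prod()
>        new Thread() { public void run() { p.inc(3); } }.start();  // writer, under the lock
>        new Thread() { public void run() { p.prod();  } }.start(); // optimistic reader, no lock
>    }
>}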
>
>
>If the critical section of a call to prod() completes before the
>critical section of a call to inc(), then
> the second read of vno in prod() occurs before
> the first write to vno in inc()
> therefore the write to force in prod() occurs before
> the read of force in inc()
> therefore all the memory accesses in the critical region of prod()
> are ordered by the happens_before relation to be before the memory
> accesses in the critical region of inc()
>
>If the critical section of a call to inc() completes before the
>critical section of a call to prod(), then
> the second write to vno in inc() occurs before the first read of
> vno in prod()
> therefore all the memory accesses in the critical region of inc()
> are ordered by the happens_before relation to be before the memory
> accesses in the critical region of prod()
>
>Note that no one ever cares about the value stored into the force
>variable. It is just used for establishing memory ordering.
>
>I am fairly confident that any situation that really depends on
>2-way memory barriers can be handled under acquire/release semantics
>by inserting some additional dummy memory accesses. And this way,
>you will only incur the costs of the additional memory/compiler
>barriers when you really need them.
>
>For example, if you need a memory barrier:
>
>private volatile boolean force;
>
>final void memoryBarrier() {
> force = false;
> boolean ignore = force;
> }
>
>Note that this memory barrier would only work between threads using
>the same force field. If force were thread local, the compiler would
>be free to optimize away the memory barrier.
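>
>(For concreteness, a sketch of one way to share the field; the class
>name Barrier and the SHARED field are illustrative, not a prescribed
>API:)
>
>class Barrier {
>    private volatile boolean force;
>
>    // One instance visible to every thread that wants the ordering;
>    // a thread-confined Barrier would let the compiler drop the
>    // dummy accesses entirely.
>    static final Barrier SHARED = new Barrier();
>
>    final void memoryBarrier() {
>        force = false;              // volatile write: the release half
>        boolean ignore = force;     // volatile read:  the acquire half
>    }
>}
>
>Each thread calls Barrier.SHARED.memoryBarrier() at the point where it
>needs the two-way ordering.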
>
>I know that dummy volatile memory accesses are not intuitive. But
>this isn't something that most people should be trying to do.
>
> Bill
>
>
-------------------------------
JavaMemoryModel mailing list - http://www.cs.umd.edu/~pugh/java/memoryModel