RE: JavaMemoryModel: Question about the semantics of volatile

From: Evan Ireland (eireland@sybase.com)
Date: Wed Mar 17 2004 - 15:31:30 EST


Bill,

Perhaps I am missing something, but the weak interpretation doesn't look
safe.

If under the weak interpretation the compiler can eliminate the volatile
reads, what constraint prevents applying that optimization to the extreme
that each thread could keep a cache of all volatiles it has written to,
(i.e.
mapping from addresses to values written) such that upon encountering a
volatile read the previously written value could be looked up in the cache?

> -----Original Message-----
> From: owner-javamemorymodel@cs.umd.edu
> [mailto:owner-javamemorymodel@cs.umd.edu]On Behalf Of Bill Pugh
> Sent: Thursday, 18 March 2004 5:04 a.m.
> To: javamemorymodel-cs.umd.edu
> Subject: JavaMemoryModel: Question about the semantics of volatile
>
>
> OK, a question has come up regarding the semantics of volatile.
> There is a point on which there are two different interpretations.
> As far as the model goes, it can handle either interpretation.
>
> So we'd like to open the question up to discussion and debate by
> the broader JMM community.
>
> For each volatile, there is a total order over all accesses to that
> volatile.
>
> Strong interpretation:
> There is a happens-before (or release/acquire) relationship from
> each write to each latter read of that volatile.
>
> Weak interpretation:
> There is a happens-before (or release/acquire) relationship from
> each write to each latter read of that volatile _that sees that
> write_.
>
> Here is an example that shows the difference:
>
> Initially, x = y = v = 0.
> v is a volatile variable.
>
>
> Thread 1:
> r1 = x
> v = 0
> r2 = v
> y = 1
>
> Thread 2:
> r3 = y
> v = 0
> r4 = v
> x = 1
>
> Is the behavior r1 == r2 == 1 possible?
>
> Under the weak interpretation, each thread might see its own volatile
> write,
> in which case the volatile accesses would have no impact. In fact, under
> the weak semantics, the compiler could eliminate the volatile reads and
> transform this to:
>
> Thread 1:
> r1 = x
> v = 0
> r2 = 0
> y = 1
>
> Thread 2:
> r3 = y
> v = 0
> r4 = 0
> x = 1
>
> Under the strong semantics, this would not be possible, since we would
> be
> guaranteed an hb edge from the first volatile write to both volatile
> reads.
>
> The definition of Lazy Release Consistency uses the weak definition.
> We are investigating what the actual behavior of a DSM like Treadmarks
> would be.
>
>
>
> Another place where this makes a difference: back in 2000, Rob Strom
> wanted guidance
> on whether his optimistic readers transformation could be implemented
> in Java.
> The strong semantics makes this possible (see email below). I'm not
> sure it
> is possible to do the optimistic reads transformation with the weak
> semantics.
>
> Thoughts?
>
> Bill
>
> ----
> From: pugh@cs.umd.edu
> Subject: JavaMemoryModel: Roach motel volatile
> ordering and the
> optimistic readers trans
> Date: July 6, 2001 3:25:17 PM EDT
> To: javamemorymodel@cs.umd.edu
>
> At 1:09 PM +1000 6/29/01, David Holmes wrote:
> > It is not clear to me that volatile and monitors should have the same
> > semantics when it comes to reorderings. The basic "roach motel"
> > approach
> > with synchronized blocks works fine (so we've been persuaded :) )
> > because of
> > the added mutual exclusion. With volatiles there is no mutual
> > exclusion and
> > I am concerned that these ordering rules may inhibit the use of
> > volatile in
> > implementing various wait-free/non-blocking algorithms.
> >
> > As an example, the optimistic readers transform of Strom and Auerbach
> > (ref
> > ECOOP 2001) requires insertion of a dummy volatile read after a
> > volatile
> > write to prevent subsequent non-volatile writes from being performed
> > prior
> > to the volatile write. If volatiles are to be useful as flags in
> > wait-free/non-blocking algorithms then it seems to me that ordering is
> > critical and so volatile accesses should act as code motion barriers
> > in both
>
>
> The load/acquire and store/release semantics for synchronization memory
> accesses has a long history in the architecture community. For example,
> see:
>
> K. Gharachorloo, D. Lenoski, J. Laudon, P. Gibbons, A. Gupta, and J.L.
> Hennessy. Memory consistency and event ordering in scalable
> shared-memory multiprocessors. In Proceedings of the 17th Annual
> International Symposium on Computer Architecture, pages 15--26. IEEE,
> May 1990.
>
> Making volatile read/write be full two directional memory barriers will
> likely at least double the cost of volatile reads on IA-64 and Alpha
> SMPs.
>
> Plus, you don't really need it. Here is an example of Rob's optimistic
> readers idiom (from his paper):
>
> volatile int vno = 0;
> int a,b;
> synchronized void inc(int x) {
> vno++;
> a += x;
> b += x;
> vno++;
> }
>
> /* unsynchronized */
> int prod() {
> int v1 = vno; /* pre-inspect */
> int t1 = a;
> int t2 = b;
> int v2 = vno; /* post-inspect */
> if (v1 == v2 && v1%2 == 0)
> return t1*t2; /* commit */
> else ... /* abort */
> }
>
> Rob's example only works if volatiles are 2-way memory barriers.
> Here is how you need to change it to make it work for volatiles that
> have load.acquire and st.release semantics:
>
>
> volatile int vno = 0;
> volatile boolean force;
> int a,b;
> synchronized void inc(int x) {
> int oldVersion = vno;
> vno = oldVersion+1;
> boolean ignore = force; /* Don't allow accesses to be hoisted */
> a += x;
> b += x;
> vno = oldVersion+2;
> }
>
> /* unsynchronized */
> int prod() {
> int v1 = vno; /* pre-inspect */
> int t1 = a;
> int t2 = b;
> force = false; /* force all accesses to be committed */
> int v2 = vno; /* post-inspect */
> if (v1 == v2 && v1%2 == 0)
> return t1*t2; /* commit */
> else ... /* abort */
> }
>
>
> If the critical section of call to prod() completes before the critical
> section of a call to inc(), then
> the second read of vno in prod() occurs before
> the first write to vno in inc()
> therefore the write to force in prod() occurs before
> the read of force in inc()
> therefore all the memory accesses in the critical region of prod()
> are ordered by the happens_before relation to be before the memory
> accesses in the critical region of inc()
>
> If the critical section of call to inc() completes before the critical
> section of a call to prod(), then
> the second write to vno in inc() occurs before the first read of
> vno in prod()
> therefore all the memory accesses in the critical region of inc()
> are ordered by the happens_before relation to be before the memory
> accesses in the critical region of prod()
>
> Note that no one ever cares about the value stored into the force
> variable. It is just used for establishing memory ordering.
>
> I am fairly confident that any situation that really depends on 2-way
> memory barriers can be handled under acquire/release semantics be
> inserting some additional dummy memory accesses. And this way, you will
> only incur the costs of the additional memory/compiler barriers when
> you really need them.
>
> For example, if you need a memory barrier:
>
> private volatile boolean force;
>
> final void memoryBarrier() {
> force = false;
> boolean ignore = force;
> }
>
> Note that this memory barrier would only work between threads using the
> same force field. If force were thread local, the compiler would be
> free to optimize away the memory barrier.
>
> I know that dummy volatile memory accesses are not intuitive. But this
> isn't something that most people should be trying to do.
>
> Bill
>
>
> -------------------------------
> JavaMemoryModel mailing list -
> http://www.cs.umd.edu/~pugh/java/memoryModel
>
> -------------------------------
> JavaMemoryModel mailing list -
> http://www.cs.umd.edu/~pugh/java/memoryModel
>

-------------------------------
JavaMemoryModel mailing list - http://www.cs.umd.edu/~pugh/java/memoryModel



This archive was generated by hypermail 2b29 : Thu Oct 13 2005 - 07:01:00 EDT