I would like to suggest an implementation for final
fields, on machines with a weak memory model, that matches
the J. Manson and W. Pugh document dated April 7, 2003.
The problem: while it's easy to put a (write) memory
barrier at the exit of constructors, it's hard to require
a matching (read) barrier at every access to every final
field. Thus, on such machines, a String constructed
on one processor can appear mutable on another
one.
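To make the problem concrete, here is a minimal Java sketch of the
kind of racy publication in question. The names (RacyPublication,
Holder, shared) are hypothetical and only for illustration. The
reader dereferences `shared` with no synchronization at all, so on a
weakly ordered machine the final-field guarantee is the only thing
that keeps it from seeing a partially constructed object.

```java
// Hypothetical illustration; none of these names come from the proposal.
public class RacyPublication {
    static final class Holder {
        final int value;              // final: frozen when the constructor exits
        Holder(int value) { this.value = value; }
    }

    // Deliberately neither volatile nor lock-protected: a data race.
    static Holder shared;

    static void writer() {
        shared = new Holder(42);      // publish the reference through the race
    }

    // Returns -1 if no Holder has been observed yet.
    static int reader() {
        Holder h = shared;            // plain read, no read barrier requested
        return (h == null) ? -1 : h.value;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(RacyPublication::writer);
        t.start();
        t.join();                     // join() orders the accesses in this demo;
                                      // a real racy reader would have no such edge
        System.out.println(reader());
    }
}
```

The demo uses join() only so that it terminates deterministically.
The case of interest is a reader on another processor that runs
concurrently with the writer.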
What I suggest is as follows:
1. Have a shared memory S, containing objects that have
survived a GC (a GC includes a global memory barrier on
all processors).
2. Each processor will have its own address space for
allocating new objects. Processor A will have address
space A, processor B will have B, and so on.
3. This address space will be outside the virtual
address space of all processors except the owner
processor.
4. When processor A allocates new objects, it will use
pointers into A. Thus, processor A can safely access
objects in S, and objects in A.
5. When processor A wants to publish a frozen final
field, it will do the following:
5.1. memory barrier.
5.2. write the number of the last page in use in A
(pages are 4KB on Pentium), and ensure that it will
never use the rest of this page for other created
objects (the GC will compact it later).
5.3. another memory barrier to reflect this
write.
5.4. publish the address (which points into A).
6. When processor B tries to access data in A it will
get an exception, since this page is not mapped. Then,
it will do the following:
6.1. Take the page number written at 5.2.
6.2. Map all pages of A up to that point into its
own address space.
6.3. Resume.
7. It is also possible that processor A will delay
steps 5.1 to 5.3, and do 5.4 immediately. In this
case, the resume in 6.3 can cause another interrupt,
but at some point, the data will be available.
Meanwhile, this processor can yield to another thread.
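Steps 5 to 7 can be modelled in a few lines of Java. This is only
a toy sketch under stated assumptions: real page mapping needs OS
support, so here "mapping" is just a counter, and all names
(PrivateHeapSketch, OwnerHeap, ReaderView, publishBarrier, tryAccess)
are hypothetical. VarHandle.fullFence() stands in for the hardware
memory barriers of steps 5.1 and 5.3.

```java
import java.lang.invoke.VarHandle;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model of steps 5-7; every name here is hypothetical.
public class PrivateHeapSketch {
    public static final int PAGE_SIZE = 4096;   // 4KB pages, as in step 5.2

    // Models processor A's private allocation space.
    public static final class OwnerHeap {
        private int allocPtr = 0;               // next free byte, touched by the owner only
        private final AtomicInteger lastPublishedPage = new AtomicInteger(-1);

        // Bump allocation; returns the address (offset) of the new object.
        public int allocate(int size) {
            int addr = allocPtr;
            allocPtr += size;
            return addr;
        }

        // Steps 5.1-5.3: barrier, record the last page in use, barrier.
        public void publishBarrier() {
            if (allocPtr == 0) return;          // nothing allocated, nothing to publish
            VarHandle.fullFence();              // 5.1
            int lastPage = (allocPtr - 1) / PAGE_SIZE;
            lastPublishedPage.set(lastPage);    // 5.2
            allocPtr = (lastPage + 1) * PAGE_SIZE; // never reuse the rest of this page
            VarHandle.fullFence();              // 5.3
        }
    }

    // Models processor B's view of A's space.
    public static final class ReaderView {
        private final OwnerHeap owner;
        private int mappedPages = 0;            // pages [0, mappedPages) are "mapped"
        public int faults = 0;

        public ReaderView(OwnerHeap owner) { this.owner = owner; }

        // Returns true if the access succeeds, false if B must retry later (step 7).
        public boolean tryAccess(int addr) {
            int page = addr / PAGE_SIZE;
            if (page < mappedPages) return true;        // already mapped, no fault
            faults++;                                   // step 6: unmapped page
            int limit = owner.lastPublishedPage.get();  // 6.1
            mappedPages = Math.max(mappedPages, limit + 1); // 6.2
            return page < mappedPages;                  // 6.3: resume, or fault again
        }
    }

    public static void main(String[] args) {
        OwnerHeap a = new OwnerHeap();
        ReaderView b = new ReaderView(a);
        int obj = a.allocate(100);              // address leaks early, as in step 7
        System.out.println(b.tryAccess(obj));   // not yet published
        a.publishBarrier();
        System.out.println(b.tryAccess(obj));   // mapped after the second fault
    }
}
```

In this model the reader pays one barrier-like step per fault, and
later accesses to already mapped pages go straight through, which is
the behaviour the steps above aim for.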
Benefits:
1. No added latency when accessing shared objects.
2. No added latency when accessing objects created by the
same processor.
3. Minimal number of memory barriers when constructing
objects. A barrier is needed only when a group of frozen
fields has to be published, and it can be delayed to any
point needed.
4. Only a single memory barrier in the other
processors in order to get all the accessible fields
from any given final field, as required.
Of course, the OS must support having different
virtual memory for different CPUs in the same process.
thanks,
Doron Rajwan, doron@rajwan.org,
http://www.rajwan.org/, doron.rajwan@schema.com
-------------------------------
JavaMemoryModel mailing list - http://www.cs.umd.edu/~pugh/java/memoryModel