> * The Pentium doesn't offer a memory barrier instruction. The only
> instruction that has the effect of a memory barrier is the processor ID
> instruction, which looks like it might be expensive.
Bus locking seems to be the most efficient way to get a memory barrier
on a Pentium, somewhat more efficient than the cpuid instruction.
Here are some notes I wrote up about two years ago:
/*
 * Ensure that all previous memory operations are completed before
 * continuing.
 */
static inline void Cilk_fence(void)
{
    /* We use an xchg instruction to serialize memory accesses, as can
     * be done according to the Intel Architecture Software Developer's
     * Manual, Volume 3: System Programming Guide
     * (http://www.intel.com/design/pro/manuals/243192.htm), page 7-6:
     * "For the P6 family processors, locked operations serialize all
     * outstanding load and store operations (that is, wait for them to
     * complete)."  An xchg with a memory operand is a locked operation
     * by default.  Note that the recommended memory barrier is the cpuid
     * instruction, which is really slow (~70 cycles).  In contrast,
     * xchg is only about 23 cycles (plus a few per write buffer
     * entry?).  Still slow, but the best I can find. -KHR */
    /* y is the memory operand; xchg reads and writes both operands,
     * hence the "+" constraints. */
    int x = 0, y = 0;
    asm volatile ("xchgl %0,%1"
                  : "+r" (x), "+m" (y)
                  :
                  : "memory");
}
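
A minimal sketch of where such a fence matters, assuming a Dekker-style
handshake between two threads: each thread stores its own flag and then
loads the other's, and the store must be globally visible before the load
is performed. The names fence_sketch, flag, try_enter and leave are
hypothetical and not part of Cilk; the non-x86 fallback to the GCC builtin
__sync_synchronize() is also an assumption added so the sketch builds
elsewhere.

```c
/* Hypothetical stand-in for Cilk_fence(): a locked xchg on x86,
 * a GCC full-barrier builtin on other targets. */
static inline void fence_sketch(void)
{
#if defined(__i386__) || defined(__x86_64__)
    int x = 0, y = 0;
    __asm__ __volatile__ ("xchgl %0,%1"
                          : "+r" (x), "+m" (y)
                          :
                          : "memory");
#else
    __sync_synchronize();
#endif
}

/* Hypothetical shared state: flag[i] != 0 means thread i wants in. */
static volatile int flag[2];

/* Returns 1 if thread `me` (0 or 1) may enter, 0 if the other
 * thread has raised its flag and `me` backed off. */
static int try_enter(int me)
{
    flag[me] = 1;          /* store: announce intent */
    fence_sketch();        /* complete the store before the load below */
    if (flag[1 - me]) {    /* load: has the other thread announced too? */
        flag[me] = 0;      /* back off; the caller may retry */
        return 0;
    }
    return 1;
}

/* Drops the flag; the fence is a conservative choice so that earlier
 * accesses in the protected region are not moved past the store. */
static void leave(int me)
{
    fence_sketch();
    flag[me] = 0;
}
```

Without the fence in try_enter, the store to flag[me] could still be
sitting in the write buffer when the load of flag[1 - me] executes, so
both threads could see 0 and enter together.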
-Keith