Snippet: Double-Word Compare-And-Swap in the D language

Update: My changes to DMD have been pulled in. DMD's inline assembler now supports cmpxchg16b.

 

I'm currently working on a bunch of lock-free data structures, including a lock-free FIFO queue that is primarily used as the event queue.

Implementing such lock-free structures, however, runs into a major obstacle: the ABA problem, see http://en.wikipedia.org/wiki/ABA_problem

One approach to overcome this problem is described in "An optimistic approach to lock-free FIFO queues": every node pointer is paired with a tag, and the tag is incremented each time the pointer is exchanged, with pointer and tag updated together in one atomic operation - see http://people.csail.mit.edu/edya/publications/OptimisticFIFOQueue-journal.pdf
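To make the tagging idea concrete, here is a minimal sketch. It is not the code from the paper or from my queue: to stay portable it packs a 32-bit slot index and a 32-bit tag into a single 64-bit word, so an ordinary single-word cas from core.atomic is enough. The double-word variant applies the same idea with a full-width pointer and a full-width tag side by side. The names pack, indexOf, tagOf, head and trySwing are made up for this illustration.

```d
import core.atomic : atomicLoad, cas;

// Hypothetical layout: slot index in the low 32 bits, tag in the high 32 bits.
ulong pack(uint index, uint tag) { return (cast(ulong) tag << 32) | index; }
uint indexOf(ulong word) { return cast(uint) word; }
uint tagOf(ulong word)   { return cast(uint)(word >> 32); }

shared ulong head; // starts out as index 0, tag 0

// Swing head to newIndex only if it still holds exactly `seen`.
// The tag is bumped on every successful exchange.
bool trySwing(ulong seen, uint newIndex)
{
    return cas(&head, seen, pack(newIndex, tagOf(seen) + 1));
}

void main()
{
    // A slow thread reads head and is then delayed.
    ulong stale = atomicLoad(head);

    // Meanwhile the index goes A -> B -> A.
    assert(trySwing(atomicLoad(head), 1));
    assert(trySwing(atomicLoad(head), 0));

    // The index looks unchanged, but the tag has moved on ...
    assert(indexOf(atomicLoad(head)) == indexOf(stale));
    assert(tagOf(atomicLoad(head)) != tagOf(stale));

    // ... so the delayed CAS fails instead of silently succeeding.
    assert(!trySwing(stale, 2));
}
```

Without the tag, the last exchange would have succeeded, because the plain index compares equal to the stale value; that is exactly the ABA case.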

This approach, however, requires the cmpxchg16b CPU instruction on x86-64, or cmpxchg8b on 32-bit x86. The availability of these instructions is indicated by the flags cx16 and cx8 in /proc/cpuinfo.
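The same check can be done from within a D program. The following is a small sketch, assuming a druntime that ships core.cpuid with the hasCmpxchg8b and hasCmpxchg16b properties:

```d
import core.cpuid : hasCmpxchg8b, hasCmpxchg16b;
import std.stdio : writeln;

void main()
{
    // Each property reports whether the CPU advertises the instruction.
    writeln("cmpxchg8b  (cx8):  ", hasCmpxchg8b);
    writeln("cmpxchg16b (cx16): ", hasCmpxchg16b);
}
```

The output depends on the machine it runs on, so a lock-free structure that relies on the double-word exchange could use such a check to fall back to a lock-based path.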

Since neither GDC nor DMD implements the cmpxchg16b instruction in its inline assembler yet, I had to add it on my own. Just check out my changes to GDC at https://bitbucket.org/nischu7/gdc/changeset/d8a2a73fb3d8.

As soon as you've rebuilt GDC you'll be able to compile this function: 

gdc w/ d2.0.50 - GC crash @amd64

To my great joy, GDC has been merged with the current D frontend (which is 2.0.50, to express it numerically).

But apparently, the garbage collector is badly broken when it comes to 64-bit builds. When compiled in 64-bit mode, Pronghorn crashes with a segfault after exactly 94 sequential requests unless we disable the GC by calling "GC.disable()".

And as I don't want to abandon 64-bit support for now, there's no way to bypass this bug except disabling the GC and doing explicit memory management as we did in the good old times (which I still prefer over garbage collection). But as Walter Bright points out here, garbage-collected programs are (usually) faster. Moreover, D is designed as a garbage-collected language - some features, such as array concatenation, rely on the GC. Garbage collection, however, isn't well-suited for realtime applications such as server daemons, since the GC runs its collections at arbitrary times and temporarily blocks all threads while doing so. Hence we need to tune the GC behaviour anyway and perform manual collections whenever the server is idle.
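Such tuning could look roughly like the sketch below. It only uses GC.disable, GC.enable, GC.collect and GC.minimize from core.memory; handlePendingRequests, serverIsIdle and serveLoop are placeholders invented for this example, not Pronghorn code.

```d
import core.memory : GC;

// Hypothetical stand-ins for the daemon's own logic.
void handlePendingRequests() { /* serve whatever is queued */ }
bool serverIsIdle() { return true; }

// Runs `rounds` iterations of a request loop and returns how many
// manual collections were triggered.
size_t serveLoop(size_t rounds)
{
    GC.disable();              // suppress automatic collections
    scope (exit) GC.enable();  // restore normal behaviour on the way out

    size_t collections = 0;
    foreach (i; 0 .. rounds)
    {
        handlePendingRequests();
        if (serverIsIdle())
        {
            GC.collect();      // explicit collection at a moment we chose
            GC.minimize();     // hand unused memory back to the OS
            ++collections;
        }
    }
    return collections;
}

void main()
{
    assert(serveLoop(3) == 3);
}
```

Note that GC.disable() is not an absolute guarantee: the runtime may still collect when it would otherwise run out of memory.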

In a few days we'll see whether I was able to fix that GC bug on my own, whether "ibuclaw" has fixed it, or whether the official 64-bit DMD has been published by then.

32-bit builds, however, are working fine.