Monday, September 19, 2011

Bug in LockFreePipe

Dmitry Vyukov, one of my fellow engineers at Google, pointed out a bug in the implementation of LockFreePipe that you can find here and there on the internet, for example here. That reminded me that Bruce Dawson (who coauthored the original with me) had discovered this bug a while ago and tried to explain it to me, but at the time we didn't have the code in front of us and I didn't completely grok what he was trying to tell me.

Anyway, this particular bug is simple: in Read(), after the call to memcpy(), there should be a barrier before the read pointer is updated. Otherwise the read pointer update can become visible before the memcpy is finished, giving the writer an opportunity to overwrite the memory before it's read.

        unsigned long cbTailBytes = min( bytesLeft, c_cbBufferSize - actualReadOffset );
#ifdef _XBOX_VER
        XMemCpy( pbDest, m_pbBuffer + actualReadOffset, cbTailBytes );
        memcpy( pbDest, m_pbBuffer + actualReadOffset, cbTailBytes );
        bytesLeft -= cbTailBytes;

        if( bytesLeft )
#ifdef _XBOX_VER
            XMemCpy( pbDest + cbTailBytes, m_pbBuffer, bytesLeft );
            memcpy( pbDest + cbTailBytes, m_pbBuffer, bytesLeft );

// OOPS: no barrier here. These operations could be reordered
// by the compiler or by the CPU. Should have lwsync or _ReadWriteBarrier here.
        readOffset += cbDest;
        m_readOffset = readOffset;

Dmitry also pointed out that a store-release is meaningless without a load-acquire, so there should also be a barrier in Write() after m_readOffset is read:

   bool __forceinline          Write( const void* pvSrc, unsigned long cbSrc )
        // Reading the read offset here has the same caveats as reading
        // the write offset had in the Read() function above. 
        DWORD readOffset = m_readOffset;
// OOPS: need lwsync or _ReadWriteBarrier here if we want it to work
        DWORD writeOffset = m_writeOffset;

Needless to say, I'm humiliated :-) and curious to know why this worked so well in the past. My first guess was that MSVC did not reorder because the read pointer was marked volatile. Of course, that raises the question of whether any of our barriers were necessary, if volatile is so magic. Well, yes, they were; MSVC volatile semantics at the time prevented compiler reordering, but didn't insert any barriers at the CPU level. The write could still have become visible any time before the read was issued. So why didn't it?

I'd say we got lucky. Just because the CPU can reorder stores before reads, doesn't mean that it does. There's a good reason for the PowerPC to delay stores--it's something that can usually be deferred until later, because in general nobody's really waiting for a store to complete. (If the data is getting reused, it's probably in one of the PPC's copious number of registers. And if it's not... see "load-hit-store.") There's seldom a good reason for a read to get delayed, because the read is very frequently a dependency of the next instruction in the queue. I'm speaking very generally and in a fashion that involves waving my hands quite a lot, but the point is that the window for failure on this is pretty small. Not, however, small enough to be nonexistent, which is what you need if you're doing lock-free work. Bummer.

The sad thing is that I really like this class. It's small and lightweight, doesn't do any allocation, and is basically perfect for things like passing data in and out of audio routines where blocking would cause glitches. I originally wrote it to dump sample data out of custom DSP routines on the Xbox 360, and it worked awesomely well for that purpose. But after talking to Dmitry, I realize that it really only worked  as well as it did because of quirks in the 360's CPU architecture. It's not a generic piece of code.

In any case, if you're using this class, please fix it by inserting memory barriers as described in the code snippet above. And stay tuned--there may be other bugs that I haven't found yet.

No comments:

Post a Comment

Submit comments here. Be nice.