The happens-before relationship is rarely understood properly, but it is a vital part of parallel, or multi-threaded, programming. It explains the “how could that happen?” problems that arise when multiple threads are running.

If you have a thread running on one CPU and another thread running on a different CPU, you cannot assume that the second thread will see the changes the first thread made, or see them in the same order, even if both threads use the same shared data.

This Java example may help to explain it:

public class Globals {

    public static int a = 0;
    public static int b = 0;
}

This class has two variables that can be viewed and modified from anywhere.

The two methods below are executed in separate threads:

public void thread_1()
{
    Globals.a = 1;   // written first in program order
    Globals.b = 2;   // written second in program order
}


public void thread_2()
{
    if (Globals.b == 2)
    {
        // If b is already 2, we expect a to have been set as well.
        assert (Globals.a == 1);
    }
}

If b equals 2, a must be 1, because b is set to 2 only after a has been set to 1. So there is no way that the assert in thread_2() could fail, correct?

On the hardware side, this is a safe assumption if both threads are running on the same CPU, and we’ll explain why later; the compiler's optimisations can still break it.

But if these two threads are running on two different CPUs, the assert could fail, i.e. it’s possible for thread_2() to see b set to 2 without seeing a set to 1.

Why does this happen? There are a couple of reasons this can occur:

  1. In Java (as with other compiled languages), the JIT compiler will try to optimise the code. It may decide to keep a in a register and not write it to main memory straight away, while still writing b to memory. The second thread will then see b in main memory, but not the change to a, which is still in a register on CPU 1.
  2. The CPU may run the assignments in thread_1() out of order. That is, it may put the operations that set a to 1 and b to 2 into different pipelines (among other reasons), so b might be set to 2 before a is set to 1. This isn’t a problem on an individual CPU, because that CPU’s logic keeps track of the reordering, but the second CPU running thread_2() has no such knowledge. You can’t guarantee that what has happened before on one CPU is visible to another.
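
To make the reordering concrete, here is a minimal sketch that writes out by hand one ordering the JIT or CPU could legally produce. The class ReorderSketch and its method names are hypothetical, invented for illustration. Real reordering happens below the source level, so this only shows why it is permitted: the thread doing the writes ends up in the same state either way.

```java
class ReorderSketch {
    static int a = 0;
    static int b = 0;

    // The stores in the order the source code gives them.
    static void writeInProgramOrder() {
        a = 1;
        b = 2;
    }

    // Hypothetical hand-written version of an ordering the JIT or CPU
    // may choose instead. Within a single thread nothing can tell the
    // difference, which is why the optimisation is allowed.
    static void writeReordered() {
        b = 2;
        a = 1;
    }

    static String snapshot() {
        return "a=" + a + ", b=" + b;
    }

    public static void main(String[] args) {
        writeInProgramOrder();
        String first = snapshot();

        a = 0;
        b = 0;
        writeReordered();
        String second = snapshot();

        // The writing thread ends in the same state either way.
        System.out.println(first.equals(second)); // prints true
    }
}
```

Only a second thread that reads b between the two stores in writeReordered() could notice that a is still 0, which is exactly the situation thread_2() is in.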

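In Java, the usual way to solve this is to declare the shared variable with the volatile keyword (here b, the variable thread_2() checks first). A write to a volatile variable happens-before every later read of that same variable, so a thread that sees b == 2 is also guaranteed to see a == 1. The JVM enforces this with memory barriers, not with a lock; it is synchronized blocks that compile to the monitorenter/monitorexit bytecodes. Both approaches make the CPUs synchronize their view of memory, and this has a cost, so they should be used sparingly: a normal L1 cache write is around 0.5 ns, while a mutex lock/unlock is around 25 ns.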
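
As a rough sketch of how the earlier example could be made safe, the version below marks only b as volatile and leaves a as an ordinary field. The class VolatileGlobals and the helper runOnce() are hypothetical names used for illustration, and the loop in main is a smoke test, not a proof: the guarantee comes from the Java memory model, not from the number of iterations.

```java
class VolatileGlobals {
    static int a = 0;
    // Only b is volatile: it acts as the flag that publishes a.
    static volatile int b = 0;

    static void writer() {
        a = 1;  // ordinary write
        b = 2;  // volatile write: the write to a above is published with it
    }

    // Returns false only if the reader saw b == 2 without also seeing a == 1.
    static boolean readerSawConsistentState() {
        if (b == 2) {        // volatile read
            return a == 1;   // guaranteed to be true once b == 2 was observed
        }
        return true;         // writer has not published yet; nothing to check
    }

    // Runs one writer/reader pair and reports whether the reader's view was consistent.
    static boolean runOnce() {
        a = 0;
        b = 0;
        boolean[] ok = {true};
        Thread t1 = new Thread(VolatileGlobals::writer);
        Thread t2 = new Thread(() -> ok[0] = readerSawConsistentState());
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ok[0];
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1_000; i++) {
            if (!runOnce()) {
                throw new AssertionError("saw b == 2 but a != 1");
            }
        }
        System.out.println("no inconsistent reads observed");
    }
}
```

The order of the two writes matters here: a must be written before the volatile write to b, and the reader must read b before a. Making a volatile instead of b would not give the same guarantee for this check.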