c# - Why can't I reproduce the volatile bug from Richter's CLR via C#?

Problem description

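I am learning multithreading from Richter's well-known book, and I wrote a test to study and understand volatile behavior (CLR via C#, 4th edition, Chapter 29: Primitive Thread Synchronization Constructs, p. 764). Richter says that the processor may reorder load and store operations against RAM, which can lead to unpredictable behavior. I have also read many articles online that confirm the bug Richter describes. However, my test never reproduces this situation and always fails, because the assertion that the bug was observed is never satisfied. My question is: what am I doing wrong, and why can't I reproduce the bug?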

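Here are my processor model and build settings: Core i7-7700HQ (4 cores, Hyper-Threading), Release x86 with the optimize flag enabled (Richter asserts that these settings can lead to the bug), .NET 4.6.1, VS 2017, NUnit, with the test UI from ReSharper.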

Here is the code:

using System.Threading.Tasks;
using NUnit.Framework;

[TestFixture]
public class ReversedOrderBugTest
{
    [Test]
    public void ReversedOrderBug_Run_ReturnsTrue()
    {
        ReversedOrderBug rob = new ReversedOrderBug { AttemptsNumber = 1000000 };
        bool hasBug = rob.Run();
        Assert.True(hasBug);
    }
}

public class ReversedOrderBug
{
    private int _A;
    private int _B;

    // According to Richter, the processor could store _B before _A.
    // However, according to MSDN, this statement is wrong because the CLR
    // prohibits reversing the order of sequential storing operations:
    private void Task1()
    {
        _A = 1;
        _B = 1;
    }

    // Here we try to reproduce the moment when _A = 0 and _B = 1,
    // and, also, the reversed order of reading operations  (the
    // case when _A is read before _B):
    private int Task2()
    {
        if (_B == 1) return _A;
        else return -1;
    }

    /// <summary>The number of attempts to reproduce the bug.</summary>
    public int AttemptsNumber { get; set; } = int.MaxValue;

    /// <summary>Reproduces the "_A = 0 and _B = 1" bug.</summary>
    /// <returns>True if the bug is caught, False otherwise.</returns>
    public bool Run()
    {
        // Loop the code from Richter's book until we get _A = 0 and _B = 1:
        for (int i = 0; i < AttemptsNumber; i++)
        {
            // Reset variables before each loop:
            _A = 0;
            _B = 0;

            // Run and wait for tasks:
            Task task1 = Task.Run(() => Task1());
            Task<int> task2 = Task.Run(() => Task2());
            Task.WaitAll(task1, task2);

            // Break the loop only when _A = 0 and _B = 1:
            if (task2.Result == 0) return true;
        }

        // The previous loop could not catch the moment when _A = 0 and _B = 1:
        return false;
    }
}
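
For comparison, here is a minimal sketch of the same two methods with the write and the read of _B made volatile through System.Threading.Volatile, the kind of construct the chapter goes on to discuss. The class name VolatileOrderSketch is hypothetical and not from the book. With these calls the outcome "_A = 0 and _B = 1" should not be observable, so the sketch shows what a correctly ordered version looks like rather than a way to reproduce the bug:

```csharp
using System.Threading;

// Hypothetical illustration class; the name is an assumption, not from the book.
public class VolatileOrderSketch
{
    private int _A;
    private int _B;

    // Writer: Volatile.Write has release semantics, so the earlier
    // store to _A cannot be moved after the store to _B.
    public void Task1()
    {
        _A = 1;
        Volatile.Write(ref _B, 1);
    }

    // Reader: Volatile.Read has acquire semantics, so the later
    // load of _A cannot be moved before the load of _B.
    public int Task2()
    {
        if (Volatile.Read(ref _B) == 1) return _A;
        return -1;
    }
}
```

Called sequentially on one thread, Task2 returns -1 before Task1 has run and 1 afterwards; the interesting property is that a concurrent reader that sees _B == 1 is also expected to see _A == 1.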

A revised version of the ReversedOrderBug class, which uses threads instead of Tasks and adds a random delay to imitate asynchrony:

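1) The first method, Run1, creates and runs threads manually with the new Thread() constructor. It is much slower than the ThreadPool or Task equivalents. That is expected, because the method creates a huge number of new threads, which hurts performance.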

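2) The second method, Run2, uses the ThreadPool and is much faster, because it queues work items to pool threads and creates new threads only when they are really needed.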

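In addition, I added a RandomDelay call to the Task1 and Task2 methods. The result is still the same: the test always fails. I should also mention that manually reversing the reads in the Task2 method (reading _A first, then _B) makes the test pass. However, I am sure that the reversed reading is not the cause of this success. Instead, it shows that neither the CLR nor the processor reverses the read operations!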

Here is the revised code:

using System.Security.Cryptography;
using System.Threading;

public class ReversedOrderBug
{
    private int _A;
    private int _B;
    private readonly RandomNumberGenerator _Random = RandomNumberGenerator.Create();

    private void RandomDelay(bool longerDelay = true)
    {
        var data = new byte[2];
        _Random.GetNonZeroBytes(data);

        var number = longerDelay
            ? (data[0] + data[1]) * 100
            : data[0] * 10;

        while (number > 0) number--;
    }

    // According to Richter, the processor could store _B before _A.
    // However, according to MSDN, this statement is wrong because the CLR
    // prohibits reversing the order of sequential storing operations:
    private void Task1Cst(CancellationTokenSource cst)
    {
        RandomDelay();
        _A = 1;
        _B = 1;
        cst.Cancel();
    }

    // Here we try to reproduce the moment when _A = 0 and _B = 1,
    // and, also, the reversed order of reading operations  (the
    // case when _A is read before _B):
    private int Task2Cst(CancellationTokenSource cst)
    {
        RandomDelay();
        var result = _B == 1
            ? _A
            : -1;

        cst.Cancel();
        return result;
    }

    /// <summary>The number of attempts to reproduce the bug.</summary>
    public int AttemptsNumber { get; set; } = int.MaxValue;

    /// <summary>Reproduces the "_A = 0 and _B = 1" bug. Creates threads manually.</summary>
    /// <returns>True if the bug is caught, False otherwise.</returns>
    public bool Run1()
    {
        // Loop the code from Richter's book until we get _A = 0 and _B = 1:
        for (int i = 0; i < AttemptsNumber; i++)
        {
            // Reset variables before each loop:
            _A = 0;
            _B = 0;

            var result = 1;
            var cst1 = new CancellationTokenSource();
            var cst2 = new CancellationTokenSource();

            // Run and wait for tasks:
            var t1 = new Thread(() => Task1Cst(cst1));
            var t2 = new Thread(() => result = Task2Cst(cst2));

            t1.Start();
            t2.Start();

            while (!(cst1.Token.IsCancellationRequested && cst2.Token.IsCancellationRequested)) ;

            // Break the loop only when _A = 0 and _B = 1:
            if (result == 0) return true;
        }

        // The previous loop could not catch the moment when _A = 0 and _B = 1:
        return false;
    }

    /// <summary>Uses ThreadPool to create threads.</summary>
    public bool Run2()
    {
        // Loop the code from Richter's book until we get _A = 0 and _B = 1:
        for (int i = 0; i < AttemptsNumber; i++)
        {
            // Reset variables before each loop:
            _A = 0;
            _B = 0;

            var result = 1;
            var cst1 = new CancellationTokenSource();
            var cst2 = new CancellationTokenSource();

            // Run and wait for tasks:
            ThreadPool.QueueUserWorkItem(o => Task1Cst(cst1));
            ThreadPool.QueueUserWorkItem(o => result = Task2Cst(cst2));
            while (!(cst1.Token.IsCancellationRequested && cst2.Token.IsCancellationRequested)) ;

            // Break the loop only when _A = 0 and _B = 1:
            if (result == 0) return true;
        }

        // The previous loop could not catch the moment when _A = 0 and _B = 1:
        return false;
    }
}
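
The manual read reversal mentioned before the revised code can be sketched as follows. This is a single-threaded simulation under stated assumptions: the class ReversedReadSketch and the runBetweenReads callback are hypothetical, and the callback stands in for Task1 being scheduled between the two reads. It illustrates that once _A is read first, plain interleaving is enough to produce "_A = 0 and _B = 1", with no reordering by the CLR or the processor involved:

```csharp
using System;

// Hypothetical illustration class; not part of the original test.
public static class ReversedReadSketch
{
    private static int _A;
    private static int _B;

    // Variant of Task2 with the reads swapped: _A first, then _B.
    // runBetweenReads simulates the writer running between the two reads.
    public static int Task2Reversed(Action runBetweenReads)
    {
        int a = _A;          // first read, still sees 0
        runBetweenReads();   // the writer stores _A = 1, then _B = 1
        int b = _B;          // second read, now sees 1
        return b == 1 ? a : -1;
    }

    // Resets the fields and replays the interleaving described above.
    public static int Demo()
    {
        _A = 0;
        _B = 0;
        return Task2Reversed(() => { _A = 1; _B = 1; });
    }
}
```

ReversedReadSketch.Demo() returns 0, the value the test treats as "bug caught", even though every store and load here executes strictly in program order.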

Tags: c#, volatile
