ReceiveAsync interrupting/breaking message passing

Problem description

This problem came to light while attempting to implement the suggested solution to this question.

Problem summary

Making a ReceiveAsync() call from a TransformBlock to a WriteOnceBlock causes the TransformBlock to effectively remove itself from the flow. It stops propagating messages of any kind, whether data or completion signals.

System design

The system is designed to parse large CSV files through a series of steps.

The problematic part of the flow can be (crudely) visualised as follows:

(Diagram: partial dataflow)

The parallelograms are BufferBlocks, the diamonds are BroadcastBlocks, the triangles are WriteOnceBlocks, and the arrows are TransformBlocks. Solid lines indicate links created with LinkTo(), while the dashed line represents the ReceiveAsync() call from the ParsedHeaderAndRecordJoiner to the ParsedHeaderContainer block. I'm aware the flow is somewhat less than ideal, but that is not the main point of the question.

Code

Application root

This is part of the class that creates the necessary blocks and links them together using PropagateCompletion:

using (var cancellationSource = new CancellationTokenSource())
{
    var cancellationToken = cancellationSource.Token;
    var temporaryEntityInstance = new Card(); // Just as an example

    var producerQueue = queueFactory.CreateQueue<string>(new DataflowBlockOptions{CancellationToken = cancellationToken});
    var recordDistributor = distributorFactory.CreateDistributor<string>(s => (string)s.Clone(), 
        new DataflowBlockOptions { CancellationToken = cancellationToken });
    var headerRowContainer = containerFactory.CreateContainer<string>(s => (string)s.Clone(), 
        new DataflowBlockOptions { CancellationToken = cancellationToken });
    var headerRowParser = new HeaderRowParserFactory().CreateHeaderRowParser(temporaryEntityInstance.GetType(), ';', 
        new ExecutionDataflowBlockOptions { CancellationToken = cancellationToken });
    var parsedHeaderContainer = containerFactory.CreateContainer<HeaderParsingResult>(HeaderParsingResult.Clone, 
        new DataflowBlockOptions { CancellationToken = cancellationToken});
    var parsedHeaderAndRecordJoiner = new ParsedHeaderAndRecordJoinerFactory().CreateParsedHeaderAndRecordJoiner(parsedHeaderContainer, 
        new ExecutionDataflowBlockOptions { CancellationToken = cancellationToken });
    var entityParser = new EntityParserFactory().CreateEntityParser(temporaryEntityInstance.GetType(), ';',
        dataflowBlockOptions: new ExecutionDataflowBlockOptions { CancellationToken = cancellationToken });
    var entityDistributor = distributorFactory.CreateDistributor<EntityParsingResult>(EntityParsingResult.Clone, 
        new DataflowBlockOptions{CancellationToken = cancellationToken});

    var linkOptions = new DataflowLinkOptions {PropagateCompletion = true};

    // Producer subprocess
    producerQueue.LinkTo(recordDistributor, linkOptions);

    // Header subprocess
    recordDistributor.LinkTo(headerRowContainer, linkOptions);
    headerRowContainer.LinkTo(headerRowParser, linkOptions);
    headerRowParser.LinkTo(parsedHeaderContainer, linkOptions);
    parsedHeaderContainer.LinkTo(errorQueue, new DataflowLinkOptions{MaxMessages = 1, PropagateCompletion = true}, dataflowResult => !dataflowResult.WasSuccessful);

    // Parsing subprocess
    recordDistributor.LinkTo(parsedHeaderAndRecordJoiner, linkOptions);
    parsedHeaderAndRecordJoiner.LinkTo(entityParser, linkOptions, joiningResult => joiningResult.WasSuccessful);
    entityParser.LinkTo(entityDistributor, linkOptions);
    entityDistributor.LinkTo(errorQueue, linkOptions, dataflowResult => !dataflowResult.WasSuccessful);
}
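
The queueFactory, distributorFactory, and containerFactory instances used above (and the errorQueue the failure links point at) are not shown in the question. Going by the diagram legend, they presumably wrap the standard dataflow blocks, roughly like this hypothetical sketch:

using System;
using System.Threading.Tasks.Dataflow;

// Hypothetical factory implementations, inferred from the block types named in the diagram legend;
// the question does not include the real ones.
public class QueueFactory
{
    public BufferBlock<T> CreateQueue<T>(DataflowBlockOptions options) =>
        new BufferBlock<T>(options);
}

public class DistributorFactory
{
    public BroadcastBlock<T> CreateDistributor<T>(Func<T, T> cloningFunction, DataflowBlockOptions options) =>
        new BroadcastBlock<T>(cloningFunction, options);
}

public class ContainerFactory
{
    public WriteOnceBlock<T> CreateContainer<T>(Func<T, T> cloningFunction, DataflowBlockOptions options) =>
        new WriteOnceBlock<T>(cloningFunction, options);
}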

HeaderRowParser

This block parses the header row of the CSV file and performs some validation.

public class HeaderRowParserFactory
{
    public TransformBlock<string, HeaderParsingResult> CreateHeaderRowParser(Type entityType,
        char delimiter,
        ExecutionDataflowBlockOptions dataflowBlockOptions = null)
    {
        return new TransformBlock<string, HeaderParsingResult>(headerRow =>
        {
            // Set up some containers
            var result = new HeaderParsingResult(identifier: "N/A", wasSuccessful: true);
            var fieldIndexesByPropertyName = new Dictionary<string, int>();

            // Get all serializable properties on the chosen entity type
            var serializableProperties = entityType.GetProperties()
                .Where(prop => prop.IsDefined(typeof(CsvFieldNameAttribute), false))
                .ToList();

            // Add their CSV fieldnames to the result
            var entityFieldNames = serializableProperties.Select(prop => prop.GetCustomAttribute<CsvFieldNameAttribute>().FieldName);
            result.SetEntityFieldNames(entityFieldNames);

            // Create the dictionary of properties by field name
            var serializablePropertiesByFieldName = serializableProperties.ToDictionary(prop => prop.GetCustomAttribute<CsvFieldNameAttribute>().FieldName, prop => prop, StringComparer.OrdinalIgnoreCase);

            var fields = headerRow.Split(delimiter);

            for (var i = 0; i < fields.Length; i++)
            {
                // If any field in the CSV is unknown as a serializable property, we return a failed result
                if (!serializablePropertiesByFieldName.TryGetValue(fields[i], out var foundProperty))
                {
                    result.Invalidate($"The header row contains a field that does not match any of the serializable properties - {fields[i]}.",
                        DataflowErrorSeverity.Critical);
                    return result;
                }

                // Perform a bunch more validation

                fieldIndexesByPropertyName.Add(foundProperty.Name, i);
            }

            result.SetFieldIndexesByName(fieldIndexesByPropertyName);
            return result;
        }, dataflowBlockOptions ?? new ExecutionDataflowBlockOptions());
    }
}
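
HeaderParsingResult and CsvFieldNameAttribute are not included in the question either. A minimal attribute consistent with how it is reflected over above might look like the following (an assumed sketch, not the question's actual type):

using System;

// Hypothetical attribute definition; the parser only needs a property-level attribute exposing FieldName.
[AttributeUsage(AttributeTargets.Property)]
public sealed class CsvFieldNameAttribute : Attribute
{
    public string FieldName { get; }

    public CsvFieldNameAttribute(string fieldName)
    {
        FieldName = fieldName;
    }
}

// Example entity, matching the temporaryEntityInstance = new Card() seen in the application root.
public class Card
{
    [CsvFieldName("card_number")]
    public string CardNumber { get; set; }
}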

ParsedHeaderAndRecordJoiner

For each subsequent record that comes through the pipeline, this block is intended to retrieve the parsed header data and attach it to the record.

public class ParsedHeaderAndRecordJoinerFactory
{
    public TransformBlock<string, HeaderAndRecordJoiningResult> CreateParsedHeaderAndRecordJoiner(WriteOnceBlock<HeaderParsingResult> parsedHeaderContainer, 
        ExecutionDataflowBlockOptions dataflowBlockOptions = null)
    {
        return new TransformBlock<string, HeaderAndRecordJoiningResult>(async csvRecord =>
            {
                var headerParsingResult = await parsedHeaderContainer.ReceiveAsync();

                // If the header couldn't be parsed, a critical error is already on its way to the failure logger so we don't need to continue
                if (!headerParsingResult.WasSuccessful) return new HeaderAndRecordJoiningResult(identifier: "N.A.", wasSuccessful: false, null, null);

                // The entity parser can't do anything with the header record, so we send a message with wasSuccessful false
                var isHeaderRecord = true;
                foreach (var entityFieldName in headerParsingResult.EntityFieldNames)
                {
                    isHeaderRecord &= csvRecord.Contains(entityFieldName);
                }
                if (isHeaderRecord) return new HeaderAndRecordJoiningResult(identifier: "N.A.", wasSuccessful: false, null, null);

                return new HeaderAndRecordJoiningResult(identifier: "N.A.", wasSuccessful: true, headerParsingResult, csvRecord);
            }, dataflowBlockOptions ?? new ExecutionDataflowBlockOptions());
    }
}
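
For context on the ReceiveAsync() call above: a WriteOnceBlock stores only the first value posted to it and hands a clone to every receiver, so each per-record ReceiveAsync() completes with a copy of the parsed header once it has been written. A minimal standalone sketch (not from the question) illustrating that behaviour:

using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class WriteOnceBlockDemo
{
    public static async Task Main()
    {
        // The cloning function means every receiver gets its own copy of the stored value.
        var container = new WriteOnceBlock<string>(s => (string)s.Clone());

        container.Post("parsed header");  // accepted and stored
        container.Post("ignored");        // declined: only the first value is ever kept

        // Repeated ReceiveAsync calls all complete with a clone of that single value;
        // the block is not drained the way a BufferBlock would be.
        Console.WriteLine(await container.ReceiveAsync());
        Console.WriteLine(await container.ReceiveAsync());
    }
}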

Problem details

With the current implementation, the ParsedHeaderAndRecordJoiner correctly receives data from the ReceiveAsync() call on the ParsedHeaderContainer and returns as expected, but no messages ever arrive at the EntityParser.

Furthermore, when the Complete signal is sent to the front of the flow (the ProducerQueue), it propagates to the RecordDistributor and then stops at the ParsedHeaderAndRecordJoiner (it does continue onward from the HeaderRowContainer, so the RecordDistributor is passing it along).

If I remove the ReceiveAsync() call and mock the data myself, the block behaves as expected.
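
(The mocked variant is not shown in the question; presumably it replaced the await on the container with a locally constructed result, roughly along these lines, using only the HeaderParsingResult members visible in the HeaderRowParser above.)

// Hypothetical stand-in for the ReceiveAsync call, based on the HeaderParsingResult API used earlier.
var headerParsingResult = new HeaderParsingResult(identifier: "N/A", wasSuccessful: true);
headerParsingResult.SetEntityFieldNames(new[] { "card_number" });
headerParsingResult.SetFieldIndexesByName(new Dictionary<string, int> { ["CardNumber"] = 0 });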

Tags: c#, .net, tpl-dataflow

Solution


I think this part is the key:

"but no messages ever arrive at the EntityParser."

Based on the sample, the only case in which the EntityParser does not receive a message output by the ParsedHeaderAndRecordJoiner is when WasSuccessful returns false. The predicate used in that link excludes the failed messages, but they have nowhere else to go, so they accumulate in the ParsedHeaderAndRecordJoiner's output buffer and also prevent Completion from propagating. You need to link a null target to discard the failed messages:

parsedHeaderAndRecordJoiner.LinkTo(DataflowBlock.NullTarget<HeaderAndRecordJoiningResult>());
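
Applied to the linking code in the application root, the fallback link goes right after the filtered link. Order matters here: a source offers each message to its links in the order they were added, so anything the predicate rejects falls through to the unfiltered null target instead of sitting in the output buffer. A sketch of the adjusted parsing subprocess:

// Parsing subprocess with the fallback link added
recordDistributor.LinkTo(parsedHeaderAndRecordJoiner, linkOptions);
parsedHeaderAndRecordJoiner.LinkTo(entityParser, linkOptions, joiningResult => joiningResult.WasSuccessful);
// Failed joining results are discarded here instead of accumulating and blocking completion.
parsedHeaderAndRecordJoiner.LinkTo(DataflowBlock.NullTarget<HeaderAndRecordJoiningResult>());
entityParser.LinkTo(entityDistributor, linkOptions);
entityDistributor.LinkTo(errorQueue, linkOptions, dataflowResult => !dataflowResult.WasSuccessful);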

Also, if your mocked data always returned WasSuccessful as true, that may point you back to the await ...ReceiveAsync() call.

Not necessarily a smoking gun, but it's a good place to start. Can you confirm the state of all the messages in the ParsedHeaderAndRecordJoiner's output buffer when the pipeline gets stuck?
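
One way to check that (a debugging sketch, using the block variable from the question's application root): TransformBlock exposes InputCount and OutputCount, and the Completion task's status shows whether the block has actually finished.

// Dump the joiner's buffer sizes and completion state while the pipeline is stuck.
Console.WriteLine($"Joiner input buffer:  {parsedHeaderAndRecordJoiner.InputCount}");
Console.WriteLine($"Joiner output buffer: {parsedHeaderAndRecordJoiner.OutputCount}");
Console.WriteLine($"Joiner completion:    {parsedHeaderAndRecordJoiner.Completion.Status}");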

