apache-flink - How to prioritize StreamRecord selection of One Stream over another based on availability?
Problem description
Given two streams, A and B, in Flink, I want to process stream A until it is empty, then read from B until records arrive at A again.
I am looking for a loose contract. The InputSelectable interface seems to provide a notion of read priority.
Based on this answer, I see a round-robin implementation of stream reads. However, the documentation does not make clear what happens if one of the streams becomes empty. Does the operator stop processing records altogether?
One naive way to implement this would be to use timers to poll for inactivity on a stream before switching to the lower-priority stream, but this might be too inefficient.
Questions:
- Is there a built-in stream operator to achieve the above use case?
- What is the behavior of InputSelectable if one input becomes empty?
- Is a timer-based InputSelectable implementation the way to go?
Solution
Answers:
- No.
- I don't believe the behavior is specified or guaranteed.
- It should be feasible, but you need to be careful.
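It is possible to get yourself into trouble with InputSelectable. If you completely starve one of the inputs, you will prevent checkpoint barrier alignment from completing, which blocks checkpointing. It is also possible to construct topologies that deadlock.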
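You may need to consider how the network buffer timeout interacts with the timers. I think you may want to set the network buffer timeout for stream A to be smaller than the timeout you use for the timers.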
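The switching logic behind the timer idea can be checked outside Flink first. The following is a minimal plain-Java sketch and not Flink API: `PrioritySelector`, `idleTimeoutMs`, and the tick values in `demo()` are hypothetical names and numbers chosen for illustration. It reads A whenever A has data and falls back to B only once A has been silent for the idle timeout.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Plain-Java model of "prefer A, fall back to B once A has been idle". Not Flink API. */
public class PrioritySelector {
    public enum Input { A, B, NONE }

    private final long idleTimeoutMs; // assumption: how long A must be silent before B is read
    private long lastSeenA;           // time at which A last had a record available

    public PrioritySelector(long idleTimeoutMs, long startMs) {
        this.idleTimeoutMs = idleTimeoutMs;
        this.lastSeenA = startMs;
    }

    /** Decide which input to read at time nowMs, given what is currently buffered. */
    public Input select(boolean aHasData, boolean bHasData, long nowMs) {
        if (aHasData) {
            lastSeenA = nowMs; // A is active, so the idle clock restarts
            return Input.A;
        }
        boolean aIdle = nowMs - lastSeenA >= idleTimeoutMs;
        return (aIdle && bHasData) ? Input.B : Input.NONE;
    }

    /** Drives the selector over a fixed timeline and returns the records in read order. */
    public static List<String> demo() {
        Queue<String> a = new ArrayDeque<>(List.of("a1", "a2"));
        Queue<String> b = new ArrayDeque<>(List.of("b1", "b2"));
        PrioritySelector selector = new PrioritySelector(100, 0);
        List<String> out = new ArrayList<>();
        long[] ticks = {0, 10, 50, 120, 130, 140, 240};
        for (long now : ticks) {
            if (now == 130) {
                a.add("a3"); // a late record arrives on A while B is being drained
            }
            switch (selector.select(!a.isEmpty(), !b.isEmpty(), now)) {
                case A: out.add(a.poll()); break;
                case B: out.add(b.poll()); break;
                default: break; // NONE: A is empty but not yet considered idle, so wait
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Running `main` prints `[a1, a2, b1, a3, b2]`: B is read only after A has been quiet for 100 ms, and A takes over again as soon as `a3` shows up. In a real job this decision would presumably live in an operator's `nextSelection()` with the idle check driven by a timer. The model also hints at why the buffer timeout for A should be shorter than the idle timeout: records still sitting in a network buffer look exactly like inactivity to the selector.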