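c++ - The reasoning behind the spinlock backoff strategy
Question
I am looking at the spinlock implementation in the HotSpot JVM from OpenJDK 12. Here is how it is implemented (comments preserved):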
// Polite TATAS spinlock with exponential backoff - bounded spin.
// Ideally we'd use processor cycles, time or vtime to control
// the loop, but we currently use iterations.
// All the constants within were derived empirically but work over
// over the spectrum of J2SE reference platforms.
// On Niagara-class systems the back-off is unnecessary but
// is relatively harmless. (At worst it'll slightly retard
// acquisition times). The back-off is critical for older SMP systems
// where constant fetching of the LockWord would otherwise impair
// scalability.
//
// Clamp spinning at approximately 1/2 of a context-switch round-trip.
// See synchronizer.cpp for details and rationale.
int Monitor::TrySpin(Thread * const Self) {
  if (TryLock()) return 1;
  if (!os::is_MP()) return 0;
  int Probes = 0;
  int Delay = 0;
  int SpinMax = 20;
  for (;;) {
    intptr_t v = _LockWord.FullWord;
    if ((v & _LBIT) == 0) {
      if (Atomic::cmpxchg (v|_LBIT, &_LockWord.FullWord, v) == v) {
        return 1;
      }
      continue;
    }
    SpinPause();
    // Periodically increase Delay -- variable Delay form
    // conceptually: delay *= 1 + 1/Exponent
    ++Probes;
    if (Probes > SpinMax) return 0;
    if ((Probes & 0x7) == 0) {
      Delay = ((Delay << 1)|1) & 0x7FF;
      // CONSIDER: Delay += 1 + (Delay/4); Delay &= 0x7FF ;
    }
    // Stall for "Delay" time units - iterations in the current implementation.
    // Avoid generating coherency traffic while stalled.
    // Possible ways to delay:
    // PAUSE, SLEEP, MEMBAR #sync, MEMBAR #halt,
    // wr %g0,%asi, gethrtime, rdstick, rdtick, rdtsc, etc. ...
    // Note that on Niagara-class systems we want to minimize STs in the
    // spin loop. N1 and brethren write-around the L1$ over the xbar into the L2$.
    // Furthermore, they don't have a W$ like traditional SPARC processors.
    // We currently use a Marsaglia Shift-Xor RNG loop.
    if (Self != NULL) {
      jint rv = Self->rng[0];
      for (int k = Delay; --k >= 0;) {
        rv = MarsagliaXORV(rv);
        if (SafepointMechanism::should_block(Self)) return 0;
      }
      Self->rng[0] = rv;
    } else {
      Stall(Delay);
    }
  }
}
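To see what the backoff in the loop above actually amounts to, the Delay update rule can be replayed in isolation. The following is only a sketch: `delay_schedule` is a hypothetical helper, not part of HotSpot, and it assumes every probe finds the lock held, so the CAS path is never taken.

```cpp
#include <vector>

// Hypothetical helper (not HotSpot code): replays the Probes/Delay
// bookkeeping from the quoted loop, assuming the lock is held on every probe.
// Element i is the Delay used for the stall after failed probe i + 1.
std::vector<int> delay_schedule(int spin_max) {
    std::vector<int> delays;
    int probes = 0;
    int delay = 0;
    for (;;) {
        ++probes;
        if (probes > spin_max) break;            // the quoted loop gives up here
        if ((probes & 0x7) == 0) {               // every 8th probe
            delay = ((delay << 1) | 1) & 0x7FF;  // 0, 1, 3, 7, ... capped at 2047
        }
        delays.push_back(delay);
    }
    return delays;
}
```

With SpinMax = 20 as in the quoted code, this rule yields a Delay of 0 for probes 1 to 7, 1 for probes 8 to 15, and 3 for probes 16 to 20, so the 0x7FF cap is never reached before the loop gives up.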
Atomic::cmpxchg is implemented on x86 as follows:
template<>
template<typename T>
inline T Atomic::PlatformCmpxchg<8>::operator()(T exchange_value,
                                                T volatile* dest,
                                                T compare_value,
                                                atomic_memory_order /* order */) const {
  STATIC_ASSERT(8 == sizeof(T));
  __asm__ __volatile__ ("lock cmpxchgq %1,(%3)"
                        : "=a" (exchange_value)
                        : "r" (exchange_value), "a" (compare_value), "r" (dest)
                        : "cc", "memory");
  return exchange_value;
}
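For readers less familiar with inline assembly, the same compare-and-swap contract (return the value that was in memory before the operation) can be sketched with standard C++ atomics. `cmpxchg_sketch` is a hypothetical name, and this is an approximation of the semantics, not the code HotSpot uses.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical portable sketch of the cmpxchg contract: store exchange_value
// into *dest only if *dest == compare_value, and return the prior value.
inline std::intptr_t cmpxchg_sketch(std::intptr_t exchange_value,
                                    std::atomic<std::intptr_t>* dest,
                                    std::intptr_t compare_value) {
    std::intptr_t expected = compare_value;
    // On failure, compare_exchange_strong writes the current value of *dest
    // into expected; on success, expected still equals the old value.
    dest->compare_exchange_strong(expected, exchange_value,
                                  std::memory_order_seq_cst);
    return expected;
}
```

The caller detects success the same way the quoted loop does, by comparing the returned value with the compare value it passed in.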
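What I don't understand is the reason behind the backoff on "older SMP" systems. The comments say:
"The back-off is critical for older SMP systems where constant fetching of the LockWord would otherwise impair scalability."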
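The reason I can imagine is that on older SMP systems, fetching and then CASing the LockWord always asserts a bus lock (instead of a cache lock). As the Intel manual, Vol. 3, section 8.1.4 states:
"For the Intel486 and Pentium processors, the LOCK# signal is always asserted on the bus during a LOCK operation, even if the area of memory being locked is cached in the processor. For the P6 and more recent processor families, if the area of memory being locked during a LOCK operation is cached in the processor that is performing the LOCK operation as write-back memory and is completely contained in a cache line, the processor may not assert the LOCK# signal on the bus."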
Is this the actual reason? If not, what is?
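To make the access patterns under discussion concrete, here is a sketch that counts how many plain loads and how many atomic read-modify-write operations a spinner issues under two strategies: a plain test-and-set loop, and the test-and-test-and-set (TATAS) shape used by the quoted code. All names here (`kLockBit`, `SpinStats`, `spin_tas`, `spin_tatas`) are hypothetical; the sketch only counts operations and makes no claim about what any particular processor does on the bus.

```cpp
#include <atomic>
#include <cstdint>

constexpr std::intptr_t kLockBit = 1;  // stands in for _LBIT (assumed value)

struct SpinStats {
    int loads = 0;        // plain atomic loads issued
    int rmws = 0;         // atomic read-modify-write attempts issued
    bool acquired = false;
};

// Plain test-and-set: every probe is an atomic read-modify-write.
SpinStats spin_tas(std::atomic<std::intptr_t>& word, int max_probes) {
    SpinStats s;
    for (int i = 0; i < max_probes; ++i) {
        ++s.rmws;
        std::intptr_t old = word.fetch_or(kLockBit, std::memory_order_acquire);
        if ((old & kLockBit) == 0) { s.acquired = true; return s; }
    }
    return s;
}

// TATAS: read the word first and attempt the CAS only when the lock looks free.
SpinStats spin_tatas(std::atomic<std::intptr_t>& word, int max_probes) {
    SpinStats s;
    for (int i = 0; i < max_probes; ++i) {
        ++s.loads;
        std::intptr_t v = word.load(std::memory_order_relaxed);
        if ((v & kLockBit) != 0) continue;  // held: no read-modify-write issued
        ++s.rmws;
        if (word.compare_exchange_strong(v, v | kLockBit,
                                         std::memory_order_acquire)) {
            s.acquired = true;
            return s;
        }
    }
    return s;
}
```

While the lock stays held, `spin_tas` issues one read-modify-write per probe, whereas `spin_tatas` issues only loads. The backoff in the quoted loop goes one step further and spaces out those loads as well.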
Solution