MongoDB keeps eating memory

Problem description

The version is 4.0.6 Community.

1 mongos, 1 config server, and 1 shard on an 8-core, 32 GB RAM Ubuntu 16.04 server.

The WiredTiger cache size is set to 5 GB (I also tried 4 GB, 6 GB, 10 GB, and 16 GB; the behavior was the same every time).

The total dataset is 40 GB and the indexes total about 10 GB; not all of it is active.

After 20 hours:

db.serverStatus().tcmalloc.tcmalloc.formattedString shows:

MALLOC:    22843995584 (21785.7 MiB) Bytes in use by application
MALLOC: +   1307635712 ( 1247.1 MiB) Bytes in page heap freelist
MALLOC: +    766107360 (  730.6 MiB) Bytes in central cache freelist
MALLOC: +        39168 (    0.0 MiB) Bytes in transfer cache freelist
MALLOC: +    210563680 (  200.8 MiB) Bytes in thread cache freelists
MALLOC: +    200433920 (  191.1 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: =  25328775424 (24155.4 MiB) Actual memory used (physical + swap)
MALLOC: +   1736736768 ( 1656.3 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: =  27065512192 (25811.7 MiB) Virtual address space used
MALLOC:
MALLOC:        2979451              Spans in use
MALLOC:           1147              Thread heaps in use
MALLOC:           4096              Tcmalloc page size
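
As a sanity check on the report above, here is a minimal Python sketch that parses the formattedString text and confirms how the totals add up. The names TCMALLOC_REPORT, LINE_RE, and parse_tcmalloc are made up for this illustration; only the numbers come from the output quoted above.

```python
import re

# The byte-count lines of the formattedString output quoted above.
TCMALLOC_REPORT = """\
MALLOC:    22843995584 (21785.7 MiB) Bytes in use by application
MALLOC: +   1307635712 ( 1247.1 MiB) Bytes in page heap freelist
MALLOC: +    766107360 (  730.6 MiB) Bytes in central cache freelist
MALLOC: +        39168 (    0.0 MiB) Bytes in transfer cache freelist
MALLOC: +    210563680 (  200.8 MiB) Bytes in thread cache freelists
MALLOC: +    200433920 (  191.1 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: =  25328775424 (24155.4 MiB) Actual memory used (physical + swap)
MALLOC: +   1736736768 ( 1656.3 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: =  27065512192 (25811.7 MiB) Virtual address space used
"""

# Matches "MALLOC: [+|=] <bytes> (<MiB> MiB) <label>"; separator lines do not match.
LINE_RE = re.compile(r"^MALLOC:\s*[+=]?\s*(\d+)\s*\(\s*[\d.]+ MiB\)\s*(.+)$")


def parse_tcmalloc(report):
    """Return {label: bytes} for every byte-count line in the report."""
    stats = {}
    for line in report.splitlines():
        match = LINE_RE.match(line)
        if match:
            stats[match.group(2).strip()] = int(match.group(1))
    return stats


stats = parse_tcmalloc(TCMALLOC_REPORT)

# Everything the allocator holds: live allocations, freelists, and metadata.
held = sum(
    value
    for label, value in stats.items()
    if label.startswith("Bytes in") and "released" not in label
)
# Memory held by tcmalloc but not currently handed out to mongod.
idle = held - stats["Bytes in use by application"]

print(f"held by allocator: {held} bytes")
print(f"idle (freelists + metadata): {idle / 1024**3:.2f} GiB")
```

The six "Bytes in ..." lines sum to the reported "Actual memory used" figure, and only about 2.3 GiB of that sits in freelists and metadata. The bulk, roughly 21.3 GiB, is counted as in use by the application, so the growth does not look like allocator fragmentation alone.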

mongostat shows:

insert query update delete getmore command dirty  used flushes vsize   res qrw arw net_in net_out conn       set repl                time
 6  5917    603      1       0    82|0  3.8% 78.9%       0 27.6G 23.6G 0|0 2|0  1.45m   4.31m 1094 air_shard  PRI Jun  6 02:51:18.205
 2  5864    605     *0       0   102|0  3.9% 78.9%       0 27.6G 23.6G 0|0 1|2  1.44m   4.29m 1094 air_shard  PRI Jun  6 02:51:19.205
 4  5555    684     *0       0   100|0  4.1% 79.0%       0 27.6G 23.6G 0|1 2|0  1.41m   4.08m 1094 air_shard  PRI Jun  6 02:51:20.204
 2  5430    624     *0       0    71|0  4.3% 79.0%       0 27.6G 23.6G 0|0 1|0  1.36m   4.05m 1094 air_shard  PRI Jun  6 02:51:21.205
 3  5606    807     *0       0    77|0  4.5% 79.0%       0 27.6G 23.6G 0|0 1|2  1.48m   4.03m 1095 air_shard  PRI Jun  6 02:51:22.203
 2  5809    685     *0       0    93|0  4.6% 79.1%       0 27.6G 23.6G 0|0 1|0  1.47m   4.21m 1095 air_shard  PRI Jun  6 02:51:23.203
 2  5701    715     *0       0   116|0  4.7% 79.1%       0 27.6G 23.6G 1|0 1|0  1.46m   4.15m 1095 air_shard  PRI Jun  6 02:51:24.205
 1  6082    649     *0       0    80|0  4.8% 79.1%       0 27.6G 23.6G 0|0 1|0  1.50m   4.42m 1095 air_shard  PRI Jun  6 02:51:25.208
 6  6016    727      1       0    90|0  5.0% 79.2%       0 27.6G 23.6G 0|0 2|0  1.52m   4.36m 1095 air_shard  PRI Jun  6 02:51:26.202
 1  5797    698     *0       0    97|0  5.1% 79.1%       0 27.6G 23.6G 0|0 3|0  1.47m   4.22m 1095 air_shard  PRI Jun  6 02:51:27.203

db.serverStatus().wiredTiger.cache (after 24 hours, with tcmalloc at 28 GB):

{
"application threads page read from disk to cache count" : 8213668,
"application threads page read from disk to cache time (usecs)" : 2081239762,
"application threads page write from cache to disk count" : 23636538,
"application threads page write from cache to disk time (usecs)" : 373385784,
"bytes belonging to page images in the cache" : 3678160542,
"bytes belonging to the cache overflow table in the cache" : 182,
"bytes currently in the cache" : 4283010946,
"bytes dirty in the cache cumulative" : 708202598553,
"bytes not belonging to page images in the cache" : 604850403,
"bytes read into cache" : 142476442004,
"bytes written from cache" : 493106511738,
"cache overflow cursor application thread wait time (usecs)" : 0,
"cache overflow cursor internal thread wait time (usecs)" : 0,
"cache overflow score" : 0,
"cache overflow table entries" : 0,
"cache overflow table insert calls" : 0,
"cache overflow table remove calls" : 0,
"checkpoint blocked page eviction" : 15206,
"eviction calls to get a page" : 24221700,
"eviction calls to get a page found queue empty" : 266843,
"eviction calls to get a page found queue empty after locking" : 281267,
"eviction currently operating in aggressive mode" : 0,
"eviction empty score" : 0,
"eviction passes of a file" : 20120988,
"eviction server candidate queue empty when topping up" : 109812,
"eviction server candidate queue not empty when topping up" : 286120,
"eviction server evicting pages" : 0,
"eviction server slept, because we did not make progress with eviction" : 2053931,
"eviction server unable to reach eviction goal" : 0,
"eviction state" : 32,
"eviction walk target pages histogram - 0-9" : 19436220,
"eviction walk target pages histogram - 10-31" : 283191,
"eviction walk target pages histogram - 128 and higher" : 0,
"eviction walk target pages histogram - 32-63" : 157161,
"eviction walk target pages histogram - 64-128" : 244416,
"eviction walks abandoned" : 2151028,
"eviction walks gave up because they restarted their walk twice" : 15229599,
"eviction walks gave up because they saw too many pages and found no candidates" : 975857,
"eviction walks gave up because they saw too many pages and found too few candidates" : 37269,
"eviction walks reached end of tree" : 33311884,
"eviction walks started from root of tree" : 18426527,
"eviction walks started from saved location in tree" : 1694461,
"eviction worker thread active" : 4,
"eviction worker thread created" : 0,
"eviction worker thread evicting pages" : 23588288,
"eviction worker thread removed" : 0,
"eviction worker thread stable number" : 0,
"failed eviction of pages that exceeded the in-memory maximum count" : 6712,
"failed eviction of pages that exceeded the in-memory maximum time (usecs)" : 35676,
"files with active eviction walks" : 0,
"files with new eviction walks started" : 18082285,
"force re-tuning of eviction workers once in a while" : 0,
"hazard pointer blocked page eviction" : 22282,
"hazard pointer check calls" : 23735427,
"hazard pointer check entries walked" : 1040247438,
"hazard pointer maximum array length" : 2,
"in-memory page passed criteria to be split" : 22349,
"in-memory page splits" : 6499,
"internal pages evicted" : 96859,
"internal pages split during eviction" : 99,
"leaf pages split during eviction" : 19160,
"maximum bytes configured" : 5368709120,
"maximum page size at eviction" : 8388389,
"modified pages evicted" : 17068834,
"modified pages evicted by application threads" : 0,
"operations timed out waiting for space in cache" : 0,
"overflow pages read into cache" : 0,
"page split during eviction deepened the tree" : 0,
"page written requiring cache overflow records" : 0,
"pages currently held in the cache" : 183243,
"pages evicted because they exceeded the in-memory maximum count" : 6499,
"pages evicted because they exceeded the in-memory maximum time (usecs)" : 2617799,
"pages evicted because they had chains of deleted items count" : 45671,
"pages evicted because they had chains of deleted items time (usecs)" : 82342,
"pages evicted by application threads" : 88378,
"pages queued for eviction" : 39570786,
"pages queued for urgent eviction" : 17086,
"pages queued for urgent eviction during walk" : 3747,
"pages read into cache" : 8214055,
"pages read into cache after truncate" : 831,
"pages read into cache after truncate in prepare state" : 0,
"pages read into cache requiring cache overflow entries" : 0,
"pages read into cache requiring cache overflow for checkpoint" : 0,
"pages read into cache skipping older cache overflow entries" : 0,
"pages read into cache with skipped cache overflow entries needed later" : 0,
"pages read into cache with skipped cache overflow entries needed later by checkpoint" : 0,
"pages requested from the cache" : 4862957119,
"pages seen by eviction walk" : 299339913,
"pages selected for eviction unable to be evicted" : 53658,
"pages walked for eviction" : 12906348724,
"pages written from cache" : 38874718,
"pages written requiring in-memory restoration" : 305125,
"percentage overhead" : 8,
"tracked bytes belonging to internal pages in the cache" : 102426754,
"tracked bytes belonging to leaf pages in the cache" : 4180584192,
"tracked dirty bytes in the cache" : 288426173,
"tracked dirty pages in the cache" : 16896,
"unmodified pages evicted" : 6596106

}
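
To put those cache counters in proportion, a small Python sketch can turn them into fill ratios. The dictionary below copies four values from the output above; cache_ratios is a made-up helper name, not part of any MongoDB API.

```python
# Values copied from the wiredTiger.cache output above.
cache = {
    "bytes currently in the cache": 4283010946,
    "maximum bytes configured": 5368709120,
    "tracked dirty bytes in the cache": 288426173,
    "tracked bytes belonging to internal pages in the cache": 102426754,
    "tracked bytes belonging to leaf pages in the cache": 4180584192,
}


def cache_ratios(cache):
    """Return (used, dirty) as fractions of the configured cache size."""
    limit = cache["maximum bytes configured"]
    used = cache["bytes currently in the cache"] / limit
    dirty = cache["tracked dirty bytes in the cache"] / limit
    return used, dirty


used, dirty = cache_ratios(cache)
print(f"cache used:  {used:.1%}")
print(f"cache dirty: {dirty:.1%}")
```

This gives roughly 79.8% used and 5.4% dirty, in line with the used and dirty columns mongostat printed a few hours earlier. In other words, the WiredTiger cache itself stays under the configured 5 GB, so the extra memory must be allocated outside it.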

tcmalloc keeps allocating memory. Eventually the server starts swapping, lags, and dies.

My application uses Python + pymongo (1000 connections).

Until swapping starts, the application runs fine.

As I see it, total memory usage should be 5 GB (the WiredTiger cache setting) plus at most 2~3 GB of baseline MongoDB usage (connection maintenance and so on).
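
That expectation can be written down as a rough check against serverStatus. The sketch below is only an illustration: memory_gap, fetch_status, the 3 GiB overhead budget, and the localhost URI are all assumptions, and I am assuming that tcmalloc.generic.current_allocated_bytes is the same figure as "Bytes in use by application". The sample document mixes the 20-hour allocator figure with the 24-hour cache figure because those are the only numbers I have.

```python
GIB = 1024 ** 3


def memory_gap(status, overhead_budget_gib=3):
    """Compare allocator usage with the 'cache size + overhead' expectation."""
    cache = status["wiredTiger"]["cache"]
    allocated = status["tcmalloc"]["generic"]["current_allocated_bytes"]
    expected = cache["maximum bytes configured"] + overhead_budget_gib * GIB
    return {
        "expected_gib": expected / GIB,
        "allocated_gib": allocated / GIB,
        # Memory mongod holds outside the WiredTiger cache.
        "outside_cache_gib": (allocated - cache["bytes currently in the cache"]) / GIB,
        # How far the allocator is above the expected ceiling.
        "over_budget_gib": (allocated - expected) / GIB,
    }


def fetch_status(uri="mongodb://localhost:27017"):
    """Fetch serverStatus from the shard's mongod (hypothetical URI, not mongos)."""
    from pymongo import MongoClient  # imported lazily; only needed for live use

    return MongoClient(uri).admin.command("serverStatus")


# Sample built from the figures in this post (two different snapshots).
sample_status = {
    "tcmalloc": {"generic": {"current_allocated_bytes": 22843995584}},
    "wiredTiger": {
        "cache": {
            "maximum bytes configured": 5368709120,
            "bytes currently in the cache": 4283010946,
        }
    },
}

gap = memory_gap(sample_status)
for name, value in gap.items():
    print(f"{name}: {value:.2f}")
```

With these numbers the expected ceiling is 8 GiB, while the allocator reports about 21.3 GiB in use, of which roughly 17.3 GiB lies outside the WiredTiger cache. Calling memory_gap(fetch_status()) periodically would show whether that outside-cache portion is what keeps growing.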

I don't understand why it keeps allocating memory.

