mongodb - MongoDB repair fails due to Invariant failure rs.get() src/mongo/db/catalog/database.cpp
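Problem description
MongoDB version: 3.4.24
MongoDB, hosted on a Linux server, shut down abruptly because of excessive memory usage.
I started a MongoDB repair with: sudo mongod -f /etc/mongodrepair.conf --repair
The whole database is 2.5TB. While repairing/reindexing the db, close to 2.4TB was repaired successfully, but the repair of the last 972MB of the DB failed with an Invariant failure error.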
Repair log
2020-07-04T17:17:07.441+0000 I INDEX [initandlisten] building index using bulk method; build may temporarily use up to 50 megabytes of RA$
2020-07-04T17:17:07.448+0000 I INDEX [initandlisten] build index on: test.summary properties: { v: 1, key: { totalVolume: -1 }, name: "totalV$
lume_-1", ns: "test.summary", background: true }
2020-07-04T17:17:07.448+0000 I INDEX [initandlisten] building index using bulk method; build may temporarily use up to 50 megabytes of RA$
2020-07-04T17:17:07.456+0000 I INDEX [initandlisten] build index on: test.summary properties: { v: 1, key: { ts: -1 }, name: "ts_-1", ns: "test.summary", background: true }
2020-07-04T17:17:07.456+0000 I INDEX [initandlisten] building index using bulk method; build may temporarily use up to 50 megabytes of RA$
2020-07-04T17:17:08.673+0000 I - [initandlisten] Invariant failure rs.get() src/mongo/db/catalog/database.cpp 195
2020-07-04T17:17:08.673+0000 I - [initandlisten]
***aborting after invariant() failure
2020-07-04T17:17:08.717+0000 F - [initandlisten] Got signal: 6 (Aborted).
Restart log
2020-07-04T17:39:14.476+0000 I CONTROL [main] ***** SERVER RESTARTED *****
2020-07-04T17:39:14.480+0000 I CONTROL [initandlisten] MongoDB starting : pid=20485 port=27017 dbpath=/home/db324 64-bit host=ip-*-*-*-*
2020-07-04T17:39:14.480+0000 I CONTROL [initandlisten] db version v3.4.24
2020-07-04T17:39:14.480+0000 I CONTROL [initandlisten] allocator: tcmalloc
2020-07-04T17:39:14.480+0000 I CONTROL [initandlisten] modules: none
2020-07-04T17:39:14.480+0000 I CONTROL [initandlisten] build environment:
2020-07-04T17:39:14.480+0000 I CONTROL [initandlisten] distarch: x86_64
2020-07-04T17:39:14.480+0000 I CONTROL [initandlisten] target_arch: x86_64
2020-07-04T17:39:14.480+0000 I CONTROL [initandlisten] options: { config: "/etc/mongod.conf", net: { bindIp: "-.-.-.-", port: 27017 }, replication: $
oplogSizeMB: 10240, replSetName: "rs1" }, storage: { dbPath: "/home/db324", directoryPerDB: true, engine: "wiredTiger", journal: { enabled$
true }, wiredTiger: { engineConfig: { cacheSizeGB: 108.0 } } }, systemLog: { destination: "file", logAppend: true, path: "/var/log/mongodb/mongod.lo$
" } }
2020-07-04T17:39:14.480+0000 W - [initandlisten] Detected unclean shutdown - /home/db324/mongod.lock is not empty.
2020-07-04T17:39:14.499+0000 W STORAGE [initandlisten] Recovering data from the last clean checkpoint.
2020-07-04T17:39:14.499+0000 I STORAGE [initandlisten]
2020-07-04T17:39:14.499+0000 I STORAGE [initandlisten] ** WARNING: The configured WiredTiger cache size is more than 80% of available RAM.
2020-07-04T17:39:14.499+0000 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=110592M,session_max=20000,eviction=(threads_min=4,t$
reads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000)$
checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),verbose=(recovery_progress),
2020-07-04T17:39:14.667+0000 I STORAGE [initandlisten] WiredTiger message [1593884354:667272][20485:0x7fdbcb287580], txn-recover: Main recovery loop$
starting at 73368/128
2020-07-04T17:39:14.667+0000 I STORAGE [initandlisten] WiredTiger message [1593884354:667951][20485:0x7fdbcb287580], txn-recover: Recovering log 733$
8 through 73369
2020-07-04T17:39:14.733+0000 I STORAGE [initandlisten] WiredTiger message [1593884354:733044][20485:0x7fdbcb287580], txn-recover: Recovering log 733$
9 through 73369
2020-07-04T17:39:15.164+0000 E STORAGE [initandlisten] WiredTiger error (-31802) [1593884355:164908][20485:0x7fdbcb287580], test/collectio$
-56-3854974571131417844.wt, WT_SESSION.open_cursor: /home/db324/test/collection-56-3854974571131417844.wt: handle-read: pread: failed $
to read 4096 bytes at offset 28672: WT_ERROR: non-specific WiredTiger error
2020-07-04T17:39:15.164+0000 I - [initandlisten] Invariant failure: ret resulted in status UnknownError: -31802: WT_ERROR: non-specific WiredT$
ger error at src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.cpp 113
2020-07-04T17:39:15.165+0000 I - [initandlisten]
***aborting after invariant() failure
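Is there a way to repair/recover the last part of the database? Or
is there a way to ignore the corrupted database? Or
is it possible to take the 2.4TB of data without the last faulty database and create a new 2.4TB MongoDB instance from it?
I would really appreciate your help.
Thanks in advance.
Solution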
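The repair log indicates that it failed while building an index on ns: "test.summary".
The other log gives you the file name and the offset of the error:
/home/db324/test/collection-56-3854974571131417844.wt: handle-read: pread: failed $ to read 4096 bytes at offset 28672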
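To see whether that read also fails outside MongoDB, one option is to read the same block directly with dd. The sketch below is only an illustration: read_block is a hypothetical helper name, and the file, offset and size are taken from the log line above. Since 28672 / 4096 = 7, it asks dd to skip 7 blocks of 4096 bytes and read one. If dd reports an I/O error here, the problem is probably below MongoDB (disk or filesystem), but that is an assumption the logs do not confirm.

```shell
# Hypothetical helper: try to read one block of SIZE bytes at byte OFFSET.
# Returns dd's exit status (non-zero if the read fails).
read_block() {
    file="$1"     # path to the .wt file named in the log
    offset="$2"   # byte offset reported by WiredTiger, e.g. 28672
    size="$3"     # number of bytes it tried to read, e.g. 4096

    # skip= counts in units of bs, so offset / size gives the block index.
    dd if="$file" of=/dev/null bs="$size" skip=$((offset / size)) count=1 2>/dev/null
}

# Example call (run with mongod stopped):
#   read_block /home/db324/test/collection-56-3854974571131417844.wt 28672 4096 \
#       && echo "block readable" || echo "read failed"
```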
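The data in the file after that point is probably not salvageable. You could try:
- Back up the existing files
- Remove the file /home/db324/test/collection-56-3854974571131417844.wt
- Re-run mongod --repair on this dbpath
If all goes well, it will create a new, empty file for that collection.
If you need to try to salvage that data, then once the above has succeeded you know the rest of the data files are consistent, so copy that file back from the backup and try the repair again.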
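The steps above could be sketched roughly as follows. This is a minimal sketch, not a tested recovery procedure: set_aside_collection_file is a hypothetical helper, /backup/db324 is an assumed backup location, and it only copies the one damaged file, whereas backing up the whole dbpath is the safer reading of the first step. It assumes mongod is stopped. If the copy itself fails (which may happen when the block is unreadable), the original file is left in place.

```shell
# Hypothetical helper: back up one collection file, then remove it from the dbpath.
# The original is only removed if the copy succeeded.
set_aside_collection_file() {
    file="$1"        # damaged file inside the dbpath
    backup_dir="$2"  # assumed backup location with enough free space

    mkdir -p "$backup_dir" || return 1
    cp -p "$file" "$backup_dir/" || return 1   # keep the original if the copy fails
    rm -- "$file"
}

# Example (paths from the logs; /backup/db324 is an assumption):
#   set_aside_collection_file \
#       /home/db324/test/collection-56-3854974571131417844.wt /backup/db324/test
#   sudo mongod -f /etc/mongodrepair.conf --repair
#
# To attempt salvage later, copy the file back and repair again:
#   cp -p /backup/db324/test/collection-56-3854974571131417844.wt /home/db324/test/
#   sudo mongod -f /etc/mongodrepair.conf --repair
```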