hadoop - HDFS cannot read data (Got error, status message opReadBlock)
Problem description
I performed a Hadoop upgrade and then a downgrade, and now the old data cannot be opened. Can you help me look into this problem?
[root@master current]# hdfs dfs -cat /test/test.csv
21/03/02 14:42:17 WARN hdfs.BlockReaderFactory: I/O error constructing remote block reader.
java.io.IOException: Got error, status message opReadBlock BP-1289313299-192.168.1.26-1533200460191:blk_1073762237_21423
received exception java.io.IOException: BlockId 1073762237 is not valid.,
for OP_READ_BLOCK, self=/10.10.202.26:47930, remote=/10.10.202.26:50010,
for file /test/test.csv, for pool BP-1289313299-192.168.1.26-1533200460191 block 1073762237_21423
at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:140)
at org.apache.hadoop.hdfs.RemoteBlockReader2.checkSuccess(RemoteBlockReader2.java:456)
at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:424)
at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:818)
at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:69
...
org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1289313299-192.168.1.26-1533200460191:blk_1073762237_21423 file=/test/test.csv
at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:983)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:642)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:882)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:934)
at java.io.DataInputStrea
-rw-r--r-- 1 root supergroup 359 2018-08-21 11:19 /test/test.csv
[root@master current]# hdfs fsck /test/
Connecting to namenode via http://master:50070/fsck?ugi=root&path=%2Ftest
FSCK started by root (auth:SIMPLE) from /10.10.202.26 for path /test at Tue Mar 02 15:05:23 CST 2021
...................................................Status: HEALTHY
Total size: 6720241197 B
Total dirs: 17
Total files: 51
Total symlinks: 0
Total blocks (validated): 85 (avg. block size 79061661 B)
Minimally replicated blocks: 85 (100.0 %)
Over-replicated blocks: 0
The file could not be downloaded either.
Solution
Make sure the time is correct and synchronized on all servers. Make sure the datanode block files have the correct ownership and permissions on the Linux filesystem.
Try:
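The two checks above can be sketched as a small script. The storage path here is an assumption; substitute the dfs.datanode.data.dir value from your own hdfs-site.xml:

```shell
#!/bin/sh
# Sketch of the checks above. DATA_DIR is a hypothetical location --
# replace it with the dfs.datanode.data.dir value from your hdfs-site.xml.
DATA_DIR=${DATA_DIR:-/data/hdfs/datanode}
BLOCK=blk_1073762237   # block id taken from the error message

# 1. Clock skew between nodes can cause spurious failures after an
#    upgrade/downgrade; compare the output of this on every host.
date -u

# 2. The block file must still exist on disk and be readable by the
#    user the datanode runs as.
if find "$DATA_DIR/current" -name "${BLOCK}*" -ls 2>/dev/null | grep -q "$BLOCK"; then
    echo "block file present"
else
    echo "block file missing under $DATA_DIR"
fi
```

If the block file is missing from every datanode, the data for that block is gone and fsck-level repair cannot bring it back.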
hadoop fsck /test/ -files -blocks
hadoop fsck /test/ -list-corruptfileblocks
In some cases, adding the following to hdfs-site.xml:
<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>
has helped resolve this problem. It may be relevant here: in your log the client connects to the datanode at 10.10.202.26, while the block pool was registered under 192.168.1.26, which suggests a multi-homed setup where resolving datanodes by hostname instead of IP can make a difference.
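Relatedly (this is the server-side counterpart of the same setting, not part of the original answer), HDFS also has a datanode-side property that makes datanodes use hostnames when connecting to each other; whether it helps depends on your network layout:

```xml
<!-- Counterpart to the client-side flag above; the default is false. -->
<property>
  <name>dfs.datanode.use.datanode.hostname</name>
  <value>true</value>
</property>
```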