hadoop - Why does the namenode sometimes allocate only 2 replicas instead of 3, even though the replication factor is set to 3?
Question
I've noticed that even though my block replication is set to 3, the namenode sometimes allocates 2 replicas and sometimes 3 during an upload from the client. Is there a way to always enforce 3? I found that the dfs.replication.min
property is deprecated in Hadoop 2.7.3.
Can I set this just on my hdfs-client, or do I need to set it on the client, the namenode, and the secondary namenode and then restart the NN and SNN?
In my hdfs-site.xml
I have set it to 3 on the namenode, the secondary namenode, and the hdfs-client machine (my local machine):
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
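If the goal is simply to end up with 3 replicas regardless of what the write pipeline negotiated, one option (a sketch, not verified on this cluster) is to raise the replication factor after the upload with `hdfs dfs -setrep`; its `-w` flag makes the command block until the target is actually reached. The path below is just the example file from this question:

```shell
# Build the setrep command for a given path and target replication.
# setrep and its -w (wait) flag are standard hdfs CLI options.
target=3
path=/tmp/25082020_test/85.txt
cmd="hdfs dfs -setrep -w $target $path"
# Print the command that would be run against the cluster
echo "$cmd"
```

Note that `setrep -w` can hang if the cluster cannot satisfy the target (e.g. too few live datanodes), so in scripts it is safer to run it with a timeout.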
Hadoop version info:
> hadoop version
20/08/26 10:57:36 DEBUG util.VersionInfo: version: 2.7.3
Hadoop 2.7.3
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r baa91f7c6bc9cb92be5982de4719c1c8af91ccff
Compiled by root on 2016-08-18T01:41Z
Compiled with protoc 2.5.0
I see the same behavior with dfs.replication=2
: sometimes only 1 replica is allocated for the write, sometimes 2.
By the way, I'm checking the blocks and their locations with the fsck command:
> hdfs fsck /tmp/file1.txt -files -locations -blocks
Update #2
> hdfs fsck /tmp/25082020_test/88.txt -files -locations -blocks
FSCK started by sharad.mishra (auth:SIMPLE) from /10.3.61.108 for path /tmp/25082020_test/88.txt at Thu Aug 27 09:30:29 EDT 2020
/tmp/25082020_test/88.txt 40 bytes, 1 block(s): OK
0. BP-378822342-x.x.x.x-1515189431494:blk_1141207020_67468539 len=40 repl=2 [DatanodeInfoWithStorage[x.x.x.x:50010,DS-9eca5bb6-5d91-400c-8d59-ea0ed44a330d,DISK], DatanodeInfoWithStorage[x.x.x.x:50010,DS-d4b510b6-a6aa-4139-b9d0-64576ef2de6f,DISK]]
Status: HEALTHY
Total size: 40 B
Total dirs: 0
Total files: 1
Total symlinks: 0
Total blocks (validated): 1 (avg. block size 40 B)
Minimally replicated blocks: 1 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 2.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 42
Number of racks: 2
FSCK ended at Thu Aug 27 09:30:29 EDT 2020 in 1 milliseconds
The filesystem under path '/tmp/25082020_test/88.txt' is HEALTHY
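In the fsck output above, the repl=2 on the block line is the actual replica count, while "Default replication factor: 3" only reflects the cluster-wide default, not this file's setting. A minimal sketch (assuming the fsck output format shown above) to pull out the actual count and compare it to a target:

```shell
# Extract the actual replica count (repl=N) from an fsck block line
# and flag it if it is below the desired target.
# Sample line taken from the fsck output above.
fsck_line='0. BP-378822342-x.x.x.x-1515189431494:blk_1141207020_67468539 len=40 repl=2 [DatanodeInfoWithStorage[...]]'
target=3

# Capture the number following "repl=" with sed
actual=$(echo "$fsck_line" | sed -n 's/.*repl=\([0-9]*\).*/\1/p')

if [ "$actual" -lt "$target" ]; then
  echo "under target: repl=$actual < $target"
else
  echo "ok: repl=$actual"
fi
```

Running this against the sample line prints `under target: repl=2 < 3`, matching what fsck reports for this file.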
Update #4
During the copy, the namenode allocated only one replica (a single DatanodeInfoWithStorage entry in the pipeline), and not in a rack-aware way; the second replica was then created asynchronously in a rack-aware fashion:
❯ hdfs dfs -Ddfs.replication=2 -copyFromLocal file1.txt /tmp/25082020_test/85.txt
20/08/26 14:08:16 DEBUG util.Shell: setsid is not available on this machine. So not using it.
20/08/26 14:08:16 DEBUG util.Shell: setsid exited with exit code 0
20/08/26 14:08:16 DEBUG conf.Configuration: parsing URL jar:file:/Users/sharad.mishra/Library/hadoop/hadoop-2.7.3/share/hadoop/common/hadoop-common-2.7.3.jar!/core-default.xml
20/08/26 14:08:16 DEBUG conf.Configuration: parsing input stream sun.net.www.protocol.jar.JarURLConnection$JarURLInputStream@4b952a2d
20/08/26 14:08:16 DEBUG conf.Configuration: parsing URL file:/Users/sharad.mishra/Library/hadoop/hadoop-2.7.3/etc/hadoop/core-site.xml
20/08/26 14:08:16 DEBUG conf.Configuration: parsing input stream java.io.BufferedInputStream@528931cf
20/08/26 14:08:16 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(always=false, about=, sampleName=Ops, type=DEFAULT, valueName=Time, value=[Rate of successful kerberos logins and latency (milliseconds)])
20/08/26 14:08:16 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(always=false, about=, sampleName=Ops, type=DEFAULT, valueName=Time, value=[Rate of failed kerberos logins and latency (milliseconds)])
20/08/26 14:08:16 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with annotation @org.apache.hadoop.metrics2.annotation.Metric(always=false, about=, sampleName=Ops, type=DEFAULT, valueName=Time, value=[GetGroups])
20/08/26 14:08:16 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group related metrics
20/08/26 14:08:16 DEBUG util.KerberosName: Kerberos krb5 configuration not found, setting default realm to empty
20/08/26 14:08:16 DEBUG security.Groups: Creating new Groups object
20/08/26 14:08:16 DEBUG util.NativeCodeLoader: Trying to load the custom-built native-hadoop library...
20/08/26 14:08:16 DEBUG util.NativeCodeLoader: Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path
20/08/26 14:08:16 DEBUG util.NativeCodeLoader: java.library.path=/Users/sharad.mishra/Library/hadoop/hadoop-2.7.3/lib/native
20/08/26 14:08:16 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/08/26 14:08:16 DEBUG util.PerformanceAdvisory: Falling back to shell based
20/08/26 14:08:16 DEBUG security.JniBasedUnixGroupsMappingWithFallback: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
20/08/26 14:08:16 DEBUG security.Groups: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
20/08/26 14:08:16 DEBUG security.UserGroupInformation: hadoop login
20/08/26 14:08:16 DEBUG security.UserGroupInformation: hadoop login commit
20/08/26 14:08:16 DEBUG security.UserGroupInformation: using local user:UnixPrincipal: sharad.mishra
20/08/26 14:08:16 DEBUG security.UserGroupInformation: Using user: "UnixPrincipal: sharad.mishra" with name sharad.mishra
20/08/26 14:08:16 DEBUG security.UserGroupInformation: User entry: "sharad.mishra"
20/08/26 14:08:16 DEBUG security.UserGroupInformation: UGI loginUser:sharad.mishra (auth:SIMPLE)
20/08/26 14:08:16 DEBUG hdfs.BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false
20/08/26 14:08:16 DEBUG hdfs.BlockReaderLocal: dfs.client.read.shortcircuit = true
20/08/26 14:08:16 DEBUG hdfs.BlockReaderLocal: dfs.client.domain.socket.data.traffic = false
20/08/26 14:08:16 DEBUG hdfs.BlockReaderLocal: dfs.domain.socket.path = /var/lib/hadoop-hdfs/dn_socket
20/08/26 14:08:16 WARN hdfs.DFSUtil: Namenode for eventlog-dev-nameservice remains unresolved for ID nn1. Check your hdfs-site.xml file to ensure namenodes are configured properly.
20/08/26 14:08:16 WARN hdfs.DFSUtil: Namenode for eventlog-dev-nameservice remains unresolved for ID nn2. Check your hdfs-site.xml file to ensure namenodes are configured properly.
20/08/26 14:08:16 DEBUG hdfs.HAUtil: No HA service delegation token found for logical URI hdfs://sample-nameservice
20/08/26 14:08:16 DEBUG hdfs.BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false
20/08/26 14:08:16 DEBUG hdfs.BlockReaderLocal: dfs.client.read.shortcircuit = true
20/08/26 14:08:16 DEBUG hdfs.BlockReaderLocal: dfs.client.domain.socket.data.traffic = false
20/08/26 14:08:16 DEBUG hdfs.BlockReaderLocal: dfs.domain.socket.path = /var/lib/hadoop-hdfs/dn_socket
20/08/26 14:08:16 DEBUG retry.RetryUtils: multipleLinearRandomRetry = null
20/08/26 14:08:16 DEBUG ipc.Server: rpcKind=RPC_PROTOCOL_BUFFER, rpcRequestWrapperClass=class org.apache.hadoop.ipc.ProtobufRpcEngine$RpcRequestWrapper, rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@70e9c95d
20/08/26 14:08:16 DEBUG ipc.Client: getting client out of cache: org.apache.hadoop.ipc.Client@4145bad8
20/08/26 14:08:17 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
20/08/26 14:08:17 DEBUG sasl.DataTransferSaslUtil: DataTransferProtocol not using SaslPropertiesResolver, no QOP found in configuration for dfs.data.transfer.protection
20/08/26 14:08:17 DEBUG ipc.Client: The ping interval is 60000 ms.
20/08/26 14:08:17 DEBUG ipc.Client: Connecting to sample-hw-namenode.casalemedia.com/x.x.x.220:8020
20/08/26 14:08:17 DEBUG ipc.Client: IPC Client (1030684756) connection to sample-hw-namenode.casalemedia.com/x.x.x.220:8020 from sharad.mishra: starting, having connections 1
20/08/26 14:08:17 DEBUG ipc.Client: IPC Client (1030684756) connection to sample-hw-namenode.casalemedia.com/x.x.x.220:8020 from sharad.mishra sending #0
20/08/26 14:08:17 DEBUG ipc.Client: IPC Client (1030684756) connection to sample-hw-namenode.casalemedia.com/x.x.x.220:8020 from sharad.mishra got value #0
20/08/26 14:08:17 DEBUG ipc.ProtobufRpcEngine: Call: getFileInfo took 149ms
20/08/26 14:08:17 DEBUG ipc.Client: IPC Client (1030684756) connection to sample-hw-namenode.casalemedia.com/x.x.x.220:8020 from sharad.mishra sending #1
20/08/26 14:08:17 DEBUG ipc.Client: IPC Client (1030684756) connection to sample-hw-namenode.casalemedia.com/x.x.x.220:8020 from sharad.mishra got value #1
20/08/26 14:08:17 DEBUG ipc.ProtobufRpcEngine: Call: getFileInfo took 39ms
20/08/26 14:08:17 DEBUG ipc.Client: IPC Client (1030684756) connection to sample-hw-namenode.casalemedia.com/x.x.x.220:8020 from sharad.mishra sending #2
20/08/26 14:08:17 DEBUG ipc.Client: IPC Client (1030684756) connection to sample-hw-namenode.casalemedia.com/x.x.x.220:8020 from sharad.mishra got value #2
20/08/26 14:08:17 DEBUG ipc.ProtobufRpcEngine: Call: getFileInfo took 38ms
20/08/26 14:08:17 DEBUG hdfs.DFSClient: /tmp/25082020_test/85.txt._COPYING_: masked=rw-r--r--
20/08/26 14:08:17 DEBUG ipc.Client: IPC Client (1030684756) connection to sample-hw-namenode.casalemedia.com/x.x.x.220:8020 from sharad.mishra sending #3
20/08/26 14:08:17 DEBUG ipc.Client: IPC Client (1030684756) connection to sample-hw-namenode.casalemedia.com/x.x.x.220:8020 from sharad.mishra got value #3
20/08/26 14:08:17 DEBUG ipc.ProtobufRpcEngine: Call: create took 39ms
20/08/26 14:08:17 DEBUG hdfs.DFSClient: computePacketChunkSize: src=/tmp/25082020_test/85.txt._COPYING_, chunkSize=516, chunksPerPacket=126, packetSize=65016
20/08/26 14:08:17 DEBUG hdfs.LeaseRenewer: Lease renewer daemon for [DFSClient_NONMAPREDUCE_1494735345_1] with renew id 1 started
20/08/26 14:08:17 DEBUG ipc.Client: IPC Client (1030684756) connection to sample-hw-namenode.casalemedia.com/x.x.x.220:8020 from sharad.mishra sending #4
20/08/26 14:08:17 DEBUG ipc.Client: IPC Client (1030684756) connection to sample-hw-namenode.casalemedia.com/x.x.x.220:8020 from sharad.mishra got value #4
20/08/26 14:08:17 DEBUG ipc.ProtobufRpcEngine: Call: getFileInfo took 36ms
20/08/26 14:08:17 DEBUG hdfs.DFSClient: DFSClient writeChunk allocating new packet seqno=0, src=/tmp/25082020_test/85.txt._COPYING_, packetSize=65016, chunksPerPacket=126, bytesCurBlock=0
20/08/26 14:08:17 DEBUG hdfs.DFSClient: Queued packet 0
20/08/26 14:08:17 DEBUG hdfs.DFSClient: Queued packet 1
20/08/26 14:08:17 DEBUG hdfs.DFSClient: Allocating new block
20/08/26 14:08:17 DEBUG hdfs.DFSClient: Waiting for ack for: 1
20/08/26 14:08:17 DEBUG ipc.Client: IPC Client (1030684756) connection to sample-hw-namenode.casalemedia.com/x.x.x.220:8020 from sharad.mishra sending #5
20/08/26 14:08:17 DEBUG ipc.Client: IPC Client (1030684756) connection to sample-hw-namenode.casalemedia.com/x.x.x.220:8020 from sharad.mishra got value #5
20/08/26 14:08:17 DEBUG ipc.ProtobufRpcEngine: Call: addBlock took 43ms
20/08/26 14:08:17 DEBUG hdfs.DFSClient: pipeline = DatanodeInfoWithStorage[x.x.x.231:50010,DS-9eca5bb6-5d91-400c-8d59-ea0ed44a330d,DISK]
20/08/26 14:08:17 DEBUG hdfs.DFSClient: Connecting to datanode x.x.x.231:50010
20/08/26 14:08:17 DEBUG hdfs.DFSClient: Send buf size 131072
20/08/26 14:08:17 DEBUG ipc.Client: IPC Client (1030684756) connection to sample-hw-namenode.casalemedia.com/x.x.x.220:8020 from sharad.mishra sending #6
20/08/26 14:08:17 DEBUG ipc.Client: IPC Client (1030684756) connection to sample-hw-namenode.casalemedia.com/x.x.x.220:8020 from sharad.mishra got value #6
20/08/26 14:08:17 DEBUG ipc.ProtobufRpcEngine: Call: getServerDefaults took 35ms
20/08/26 14:08:17 DEBUG sasl.SaslDataTransferClient: SASL client skipping handshake in unsecured configuration for addr = /x.x.x.231, datanodeId = DatanodeInfoWithStorage[x.x.x.231:50010,DS-9eca5bb6-5d91-400c-8d59-ea0ed44a330d,DISK]
20/08/26 14:08:17 DEBUG hdfs.DFSClient: DataStreamer block BP-378822342-x.x.x.220-1515189431494:blk_1141207054_67468573 sending packet packet seqno: 0 offsetInBlock: 0 lastPacketInBlock: false lastByteOffsetInBlock: 40
20/08/26 14:08:18 DEBUG hdfs.DFSClient: DFSClient seqno: 0 reply: SUCCESS downstreamAckTimeNanos: 0 flag: 0
20/08/26 14:08:18 DEBUG hdfs.DFSClient: DataStreamer block BP-378822342-x.x.x.220-1515189431494:blk_1141207054_67468573 sending packet packet seqno: 1 offsetInBlock: 40 lastPacketInBlock: true lastByteOffsetInBlock: 40
20/08/26 14:08:18 DEBUG hdfs.DFSClient: DFSClient seqno: 1 reply: SUCCESS downstreamAckTimeNanos: 0 flag: 0
20/08/26 14:08:18 DEBUG hdfs.DFSClient: Closing old block BP-378822342-x.x.x.220-1515189431494:blk_1141207054_67468573
20/08/26 14:08:18 DEBUG ipc.Client: IPC Client (1030684756) connection to sample-hw-namenode.casalemedia.com/x.x.x.220:8020 from sharad.mishra sending #7
20/08/26 14:08:18 DEBUG ipc.Client: IPC Client (1030684756) connection to sample-hw-namenode.casalemedia.com/x.x.x.220:8020 from sharad.mishra got value #7
20/08/26 14:08:18 DEBUG ipc.ProtobufRpcEngine: Call: complete took 37ms
20/08/26 14:08:18 DEBUG ipc.Client: IPC Client (1030684756) connection to sample-hw-namenode.casalemedia.com/x.x.x.220:8020 from sharad.mishra sending #8
20/08/26 14:08:19 DEBUG ipc.Client: IPC Client (1030684756) connection to sample-hw-namenode.casalemedia.com/x.x.x.220:8020 from sharad.mishra got value #8
20/08/26 14:08:19 DEBUG ipc.ProtobufRpcEngine: Call: rename took 875ms
20/08/26 14:08:19 DEBUG ipc.Client: stopping client from cache: org.apache.hadoop.ipc.Client@4145bad8
20/08/26 14:08:19 DEBUG ipc.Client: removing client from cache: org.apache.hadoop.ipc.Client@4145bad8
20/08/26 14:08:19 DEBUG ipc.Client: stopping actual client because no more references remain: org.apache.hadoop.ipc.Client@4145bad8
20/08/26 14:08:19 DEBUG ipc.Client: Stopping client
20/08/26 14:08:19 DEBUG ipc.Client: IPC Client (1030684756) connection to sample-hw-namenode.casalemedia.com/x.x.x.220:8020 from sharad.mishra: closed
20/08/26 14:08:19 DEBUG ipc.Client: IPC Client (1030684756) connection to sample-hw-namenode.casalemedia.com/x.x.x.220:8020 from sharad.mishra: stopped, remaining connections 0
Output of the fsck command:
❯ hdfs fsck /tmp/25082020_test/85.txt -files -locations -blocks
20/08/27 13:01:37 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/08/27 13:01:38 WARN hdfs.DFSUtil: Namenode for eventlog-dev-nameservice remains unresolved for ID nn1. Check your hdfs-site.xml file to ensure namenodes are configured properly.
20/08/27 13:01:38 WARN hdfs.DFSUtil: Namenode for eventlog-dev-nameservice remains unresolved for ID nn2. Check your hdfs-site.xml file to ensure namenodes are configured properly.
20/08/27 13:01:38 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
Connecting to namenode via http://sample-hw-namenode.casalemedia.com:50070/fsck?ugi=sharad.mishra&files=1&locations=1&blocks=1&path=%2Ftmp%2F25082020_test%2F85.txt
FSCK started by sharad.mishra (auth:SIMPLE) from /10.3.61.108 for path /tmp/25082020_test/85.txt at Thu Aug 27 13:01:38 EDT 2020
/tmp/25082020_test/85.txt 40 bytes, 1 block(s): OK
0. BP-378822342-x.x.x.220-1515189431494:blk_1141207054_67468573 len=40 repl=2 [DatanodeInfoWithStorage[x.x.x.231:50010,DS-05f16460-cb85-41e3-98e1-f6f7366b2738,DISK], DatanodeInfoWithStorage[10.7.24.197:50010,DS-fa4ebf78-9bfc-404f-b1d6-909098c0b394,DISK]]
Status: HEALTHY
Total size: 40 B
Total dirs: 0
Total files: 1
Total symlinks: 0
Total blocks (validated): 1 (avg. block size 40 B)
Minimally replicated blocks: 1 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 2.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 42
Number of racks: 2
FSCK ended at Thu Aug 27 13:01:38 EDT 2020 in 0 milliseconds
The filesystem under path '/tmp/25082020_test/85.txt' is HEALTHY
Solution