
unknowsthing 2021-12-02 23:14

Common HDFS Commands
1. Basic syntax
2. Common commands
#######################
1. Basic syntax
bin/hadoop fs <command>

bin/hdfs dfs <command>
(dfs is an implementation of fs)

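Difference between the two

hadoop fs: a generic file system command. It works with any supported file system, not only HDFS, for example the local file system, HDFS, HFTP, and S3, so its scope is broader.
(There used to be hadoop dfs as well, which targeted HDFS specifically; it is deprecated.)

hdfs dfs: targets the HDFS distributed file system specifically and is recommended over hadoop dfs. When hadoop dfs is invoked, it is converted internally into an hdfs dfs command.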
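To make the distinction concrete, the sketch below sends the same -ls command to two different file systems through hadoop fs by spelling out the URI scheme. The NameNode address hdfs://hadoop101:8020 is an assumption (the host name comes from the prompts in this article, the port is a guess); use the fs.defaultFS value from your own core-site.xml. The run helper and the DRY_RUN variable are hypothetical: by default the script only prints the commands.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the command in dry-run mode (the default),
# execute it only when DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Assumed NameNode address; replace with fs.defaultFS from core-site.xml.
NAMENODE="${NAMENODE:-hdfs://hadoop101:8020}"

# hadoop fs picks the file system from the URI scheme.
run hadoop fs -ls file:///tmp
run hadoop fs -ls "${NAMENODE}/"

# hdfs dfs is the form recommended above when only HDFS is involved.
run hdfs dfs -ls /
```

Running it with DRY_RUN=0 on a node where Hadoop is configured would execute the three listings instead of printing them.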
2. Common commands
》Start the Hadoop cluster
sbin/start-dfs.sh
sbin/start-yarn.sh
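After the start scripts return, it can help to check that the daemons are actually running. The sketch below assumes the JDK's jps tool is on the PATH and that this node is expected to run a NameNode and a DataNode; which daemons run where depends on your cluster layout, so the list is only an example. The missing_daemons helper is hypothetical.

```shell
#!/usr/bin/env bash
# Read jps-style output ("<pid> <name>") on stdin and print every expected
# daemon name, given as arguments, that does not appear in it.
missing_daemons() {
  input="$(cat)"
  for name in "$@"; do
    if ! printf '%s\n' "$input" |
        awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
      echo "$name"
    fi
  done
}

# Only query the JVM process list when jps is available.
if command -v jps >/dev/null 2>&1; then
  jps | missing_daemons NameNode DataNode
fi
```

An empty result means every listed daemon was found. The comparison is on the whole process name, so SecondaryNameNode is not mistaken for NameNode.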
(1) -help: prints the usage and parameters of a command
hadoop@hadoop101:/opt/module/hadoop-3.1.3/bin$ hadoop fs -help rm
-rm [-f] [-r|-R] [-skipTrash] [-safely] <src> ... :
Delete all files that match the specified file pattern. Equivalent to the Unix
command "rm <src>"

-f If the file does not exist, do not display a diagnostic message or
modify the exit status to reflect an error.
-[rR] Recursively deletes directories.
-skipTrash option bypasses trash, if enabled, and immediately deletes <src>.
-safely option requires safety confirmation, if enabled, requires
confirmation before deleting large directory with more than
<hadoop.shell.delete.limit.num.files> files. Delay is expected when
walking over large directory recursively to count the number of
files to be deleted before the confirmation.

(2) -ls: list directory contents
hadoop@hadoop101:/opt/module/hadoop-3.1.3/bin$ hadoop fs -ls /
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2020-02-29 00:11 /home
(3) -mkdir: create a directory on HDFS
hadoop@hadoop101:/opt/module/hadoop-3.1.3/bin$ hadoop fs -mkdir -p /home/hadoop/op_test
hadoop@hadoop101:/opt/module/hadoop-3.1.3/bin$ hadoop fs -ls /home/hadoop
Found 2 items
drwxr-xr-x - hadoop supergroup 0 2020-02-29 00:14 /home/hadoop/input
drwxr-xr-x - hadoop supergroup 0 2020-03-02 06:42 /home/hadoop/op_test
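In a script it is often useful to know whether the directory already existed. A minimal sketch, assuming the -test -d option of the file system shell (exit status 0 when the path is a directory); ensure_dir and RUN_ON_CLUSTER are hypothetical names, and nothing touches a cluster unless RUN_ON_CLUSTER=1 is set.

```shell
#!/usr/bin/env bash
# Create an HDFS directory only if it is missing, and report what happened.
# -test -d exits with 0 when the path exists and is a directory.
ensure_dir() {
  if hadoop fs -test -d "$1"; then
    echo "exists: $1"
  else
    hadoop fs -mkdir -p "$1" && echo "created: $1"
  fi
}

# Opt-in guard so the sketch does nothing unless a cluster is available.
if [ "${RUN_ON_CLUSTER:-0}" = "1" ]; then
  ensure_dir /home/hadoop/op_test
fi
```

Because -mkdir -p already succeeds when the directory exists, the check only serves to report which case occurred.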
(4) -moveFromLocal: move (cut and paste) a file from the local file system to HDFS
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ touch op_test.txt
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ echo "hello,hdfs" >> op_test.txt
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ hadoop fs -moveFromLocal ./op_test.txt /home/hadoop/op_test
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ ls
test.input

(5) -appendToFile: append a local file to the end of an existing HDFS file
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ touch op_append.txt
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ echo "append-hdfs" >> op_append.txt
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ hadoop fs -appendToFile op_append.txt /home/hadoop/op_test/op_test.txt
(6) -cat: show the contents of a file
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ hadoop fs -cat /home/hadoop/op_test/op_test.txt
hello,hdfs
append-hdfs
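The append step does not need a temporary local file: -appendToFile also accepts - as the source, in which case it reads from standard input. A minimal sketch under that assumption; append_line and RUN_ON_CLUSTER are hypothetical names, and the cluster is only touched when RUN_ON_CLUSTER=1 is set.

```shell
#!/usr/bin/env bash
# Append one line of text to an HDFS file.
# The "-" source tells -appendToFile to read the data from stdin.
append_line() {
  printf '%s\n' "$1" | hadoop fs -appendToFile - "$2"
}

# Opt-in guard so the sketch does nothing unless a cluster is available.
if [ "${RUN_ON_CLUSTER:-0}" = "1" ]; then
  append_line "append-hdfs" /home/hadoop/op_test/op_test.txt
  hadoop fs -cat /home/hadoop/op_test/op_test.txt
fi
```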
(7) -chgrp, -chmod, -chown: same usage as on a Linux file system; change the group, permissions, and owner of a file
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ hadoop fs -chmod 666 /home/hadoop/op_test/op_test.txt
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ hadoop fs -chown hadoop:hadoop /home/hadoop/op_test/op_test.txt
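The octal mode given to -chmod maps to the permission string that -ls prints, just as on Linux. As a small illustration, the hypothetical helper mode_to_rwx below (plain shell, not part of Hadoop) expands a three-digit mode into that string; -ls additionally prefixes it with - for a file or d for a directory.

```shell
#!/usr/bin/env bash
# Convert a three-digit octal mode such as 666 into its rwx string.
mode_to_rwx() {
  mode="$1"
  out=""
  # Split the mode into single digits and translate each one.
  for digit in $(printf '%s' "$mode" | sed 's/./& /g'); do
    case "$digit" in
      0) out="${out}---" ;;
      1) out="${out}--x" ;;
      2) out="${out}-w-" ;;
      3) out="${out}-wx" ;;
      4) out="${out}r--" ;;
      5) out="${out}r-x" ;;
      6) out="${out}rw-" ;;
      7) out="${out}rwx" ;;
      *) echo "invalid mode: $mode" >&2; return 1 ;;
    esac
  done
  printf '%s\n' "$out"
}

mode_to_rwx 666   # prints rw-rw-rw-
mode_to_rwx 755   # prints rwxr-xr-x
```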

(8) -copyFromLocal: copy a file from the local file system to an HDFS path
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ ls
op_append.txt test.input
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ hadoop fs -copyFromLocal test.input /home/hadoop/op_test

(9) -copyToLocal: copy a file from HDFS to the local file system
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ hadoop fs -copyToLocal /home/hadoop/op_test/op_test.txt ./
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ ls
op_append.txt op_test.txt test.input
(10) -cp: copy from one HDFS path to another HDFS path
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ hadoop fs -cp /home/hadoop/op_append.txt /home/hadoop/op_test

(11) -mv: move a file between HDFS directories
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ hadoop fs -mv /home/hadoop/op_test/op_append.txt /home/hadoop
(12) -get: equivalent to copyToLocal; download a file from HDFS to the local file system
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ hadoop fs -get /home/hadoop/op_append.txt ./
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ ls
op_append.txt op_test.txt test.input
(13) -getmerge: merge the files under an HDFS directory and download the result as one local file
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ hadoop fs -getmerge /home/hadoop/op_test/* ./common.txt

hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ ls
common.txt op_append.txt op_test.txt test.input

hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ cat common.txt
append-hdfs
hello,hadoop!
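What -getmerge produces can be pictured with a purely local analogue: concatenating every file of a directory into one file. The sketch below does not call Hadoop at all; it recreates the two files from the transcript in a temporary directory (their contents are inferred from the output above) and merges them with cat.

```shell
#!/usr/bin/env bash
# Local analogue of -getmerge: concatenate all files of a directory.
workdir="$(mktemp -d)"
mkdir "$workdir/op_test"
printf 'append-hdfs\n'   > "$workdir/op_test/op_append.txt"
printf 'hello,hadoop!\n' > "$workdir/op_test/test.input"

# The shell expands the glob in sorted order, so op_append.txt comes first.
cat "$workdir"/op_test/* > "$workdir/common.txt"
merged="$(cat "$workdir/common.txt")"
echo "$merged"

# Remove the temporary files again.
rm -rf "$workdir"
```

The two printed lines match the contents of common.txt shown above.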
(14) -put: equivalent to copyFromLocal; upload a file from the local file system to HDFS
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ hadoop fs -put ./common.txt /home/hadoop/op_test/
2020-03-02 07:15:57,049 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ hadoop fs -ls /home/hadoop/op_test
Found 3 items
-rw-r--r-- 3 hadoop supergroup 26 2020-03-02 07:15 /home/hadoop/op_test/common.txt
-rw-r--r-- 3 hadoop supergroup 12 2020-03-02 07:06 /home/hadoop/op_test/op_append.txt
-rw-r--r-- 3 hadoop supergroup 14 2020-03-02 06:56 /home/hadoop/op_test/test.input
(15) -tail: show the end of a file
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ hadoop fs -tail /home/hadoop/op_test/common.txt
append-hdfs
hello,hadoop!
(16) -rm: delete files or directories
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ hadoop fs -rm /home/hadoop/*.txt
Deleted /home/hadoop/op_append.txt
Deleted /home/hadoop/op_test.txt
(17) -rmdir: delete an empty directory
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ hadoop fs -mkdir /test
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ hadoop fs -rmdir /test
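(18) -du: show the size of a directory and its files
-h: show sizes in human-readable units (for example K, M, G) instead of raw bytes
-s: show the total size of all files under the directory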
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ hadoop fs -du /home/hadoop/op_test
26 78 /home/hadoop/op_test/common.txt
12 36 /home/hadoop/op_test/op_append.txt
14 42 /home/hadoop/op_test/test.input
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ hadoop fs -du -h /home/hadoop/op_test
26 78 /home/hadoop/op_test/common.txt
12 36 /home/hadoop/op_test/op_append.txt
14 42 /home/hadoop/op_test/test.input
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ hadoop fs -du -s /home/hadoop/op_test
52 156 /home/hadoop/op_test
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ hadoop fs -du -s -h /home/hadoop/op_test
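The two numeric columns in the -du output are the file size and the space consumed across all replicas. The sketch below parses the three lines copied from the transcript above with awk, sums both columns, and derives the replication factor their ratio implies; it does not need a cluster.

```shell
#!/usr/bin/env bash
# Sample lines copied from the -du output above: size, space consumed, path.
du_output='26 78 /home/hadoop/op_test/common.txt
12 36 /home/hadoop/op_test/op_append.txt
14 42 /home/hadoop/op_test/test.input'

# Sum both columns and divide them to get the implied replication factor.
summary="$(printf '%s\n' "$du_output" | awk '
  { size += $1; consumed += $2 }
  END { printf "%d %d %d\n", size, consumed, consumed / size }
')"
echo "$summary"   # prints 52 156 3
```

The first two numbers agree with the -du -s result of 52 and 156, and the factor 3 matches the replication column in the -ls listing of the same directory.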

(19) -setrep: set the replication factor of a file in HDFS
hadoop@hadoop101:/opt/module/hadoop-3.1.3/testinput$ hadoop fs -setrep 2 /home/hadoop/op_test/common.txt
Replication 2 set: /home/hadoop/op_test/common.txt

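Note: the number of replicas that can actually exist is limited by the number of DataNodes and cannot exceed it. If the requested replication factor is larger than the number of DataNodes, the NameNode only records the requested value, and the actual number of replicas is capped at the number of DataNodes.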
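The rule in the note amounts to taking the smaller of two numbers. A minimal sketch in plain shell; effective_replicas is a hypothetical helper for illustration and does not query a cluster.

```shell
#!/usr/bin/env bash
# Effective replica count as described in the note: the smaller of the
# requested replication factor and the number of DataNodes.
effective_replicas() {
  requested="$1"
  datanodes="$2"
  if [ "$requested" -gt "$datanodes" ]; then
    echo "$datanodes"
  else
    echo "$requested"
  fi
}

effective_replicas 10 3   # prints 3
effective_replicas 2 3    # prints 2
```

On a real cluster, the recorded factor of a file can be read back with hadoop fs -stat %r followed by the path, and the second column of the -ls listing shows it as well.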
