$ HDFS Issue

$ 1. Errors caused by DataNodes reporting blocks too slowly

When a client asks the NameNode to close a file, the NameNode checks that the file's last block has reached the minimal replication. If no DataNode has reported a finalized replica yet, the block stays in the COMMITTED state and the NameNode logs messages like the following while the client retries:

2020-11-02 16:39:41,515 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: BLOCK* checkFileProgress: blk_1653438437_582833232{blockUCState=COMMITTED, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-b2dc6b07-b55a-4176-a5db-206250c4aed6:NORMAL:172.16.36.158:1004|RBW], ReplicaUnderConstruction[[DISK]DS-06c29767-45cf-43de-a68b-86b103ea9c0a:NORMAL:172.16.36.179:1004|RBW], ReplicaUnderConstruction[[DISK]DS-9a76064f-6b52-4cb3-b469-f86efedc582b:NORMAL:172.16.22.209:1004|RBW]]} has not reached minimal replication 1

2020-11-02 16:39:41,918 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: BLOCK* checkFileProgress: blk_1653438437_582833232{blockUCState=COMMITTED, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-b2dc6b07-b55a-4176-a5db-206250c4aed6:NORMAL:172.16.36.158:1004|RBW], ReplicaUnderConstruction[[DISK]DS-06c29767-45cf-43de-a68b-86b103ea9c0a:NORMAL:172.16.36.179:1004|RBW], ReplicaUnderConstruction[[DISK]DS-9a76064f-6b52-4cb3-b469-f86efedc582b:NORMAL:172.16.22.209:1004|RBW]]} has not reached minimal replication 1
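
The block above is COMMITTED but all three replicas are still in the RBW (replica being written) state, so the file cannot be closed yet. A sketch of how one might investigate (the path /tmp/demo.txt is a placeholder for the affected file):

# Show block locations and replica state for the file that failed to close
hdfs fsck /tmp/demo.txt -files -blocks -locations

# The HDFS client retries completeFile() a limited number of times while
# the last block is still COMMITTED. Inspect the current retry budget;
# raising dfs.client.block.write.locateFollowingBlock.retries in
# hdfs-site.xml gives slow DataNodes more time to report the replica.
hdfs getconf -confKey dfs.client.block.write.locateFollowingBlock.retries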

$ 2. Not enough replicas

This is the failure mode that follows from the issue above: if the client exhausts its retries while the last block still has no reported replica, the write fails, typically with "Unable to close file because the last block does not have enough number of replicas". Too few live or healthy DataNodes can produce the same error.

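A few standard checks, sketched here (the path and replication factor are illustrative):

# Verify how many DataNodes are live and how much capacity remains
hdfs dfsadmin -report

# Scan the namespace for under-replicated or missing blocks
hdfs fsck / | grep -iE 'under-replicated|missing'

# Restore the replication factor of a file that was written short;
# -w waits until re-replication actually finishes
hadoop fs -setrep -w 3 /tmp/demo.txt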

$ 3. The small-file problem

A "small file" is a file far smaller than the HDFS block size. It is well known that large numbers of small files put heavy pressure on the NameNode, which keeps the metadata for every file and block in memory (roughly 150 bytes of heap per object is the commonly cited figure), so when using HDFS you should avoid producing too many small files.

Merge local small files while uploading them to HDFS:

hadoop fs -appendToFile 1.txt 2.txt hdfs://cdh5/tmp/merged.txt

Download small files from HDFS and merge them into one local file:

hadoop fs -getmerge hdfs://cdh5/tmp/*.txt local.txt
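
For small files that already live on HDFS, another common mitigation is to pack them into a Hadoop Archive, so the NameNode tracks a single archive instead of many individual files. A sketch, with illustrative paths and archive name:

# Pack everything under /tmp/smallfiles into one archive (runs a MapReduce job)
hadoop archive -archiveName small.har -p /tmp/smallfiles /tmp/archived

# The archived files remain readable through the har:// scheme
hadoop fs -ls har:///tmp/archived/small.har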

