unix — Is it possible to skip files when running the hadoop fsck command in Linux?

ovfsdjhp  asked on 2021-05-29  in  Hadoop

I want to skip checking files under a specified path when using the hadoop fsck command. Is that possible? I am running the following command:
hadoop fsck / > output.txt
I have also checked the HDFS guide, but found nothing that excludes a path from the command above.
Any help is appreciated.

9rbhqvlz1#

As of Hadoop 2.9.0, there is no way to specify excluded paths in the hadoop fsck command.
However, you can use the WebHDFS REST API to obtain much of the same filesystem health information that fsck reports. With this API, you can use the LISTSTATUS operation to get information about all files in a directory, or the GETFILESTATUS operation to get information about a single file.
For a directory:

curl -i  "http://<HOST>:<PORT>/webhdfs/v1/<DIRECTORY_PATH>?op=LISTSTATUS"

For a file:

curl -i  "http://<HOST>:<PORT>/webhdfs/v1/<FILE_PATH>?op=GETFILESTATUS"

Both return a response containing a FileStatuses JSON object.
Below is a sample response returned by the NameNode for a directory:

curl -i "http://<NN_HOST>:<HTTP_PORT>/webhdfs/v1/<DIRECTORY_PATH>?op=LISTSTATUS"
HTTP/1.1 200 OK
Cache-Control: no-cache
Content-Type: application/json
Transfer-Encoding: chunked
Server: Jetty(6.1.26.hwx)

{"FileStatuses":{"FileStatus":[
{"accessTime":1489059994224,"blockSize":134217728,"childrenNum":0,"fileId":209158298,"group":"hdfs","length":0,"modificationTime":1489059994227,"owner":"XXX","pathSuffix":"_SUCCESS","permission":"644","replication":3,"storagePolicy":0,"type":"FILE"},
{"accessTime":1489059969939,"blockSize":134217728,"childrenNum":0,"fileId":209158053,"group":"hdfs","length":0,"modificationTime":1489059986846,"owner":"XXX","pathSuffix":"part-m-00000","permission":"644","replication":3,"storagePolicy":0,"type":"FILE"},
{"accessTime":1489059982614,"blockSize":134217728,"childrenNum":0,"fileId":209158225,"group":"hdfs","length":0,"modificationTime":1489059993497,"owner":"XXX","pathSuffix":"part-m-00001","permission":"644","replication":3,"storagePolicy":0,"type":"FILE"},
{"accessTime":1489059977524,"blockSize":134217728,"childrenNum":0,"fileId":209158188,"group":"hdfs","length":0,"modificationTime":1489059983034,"owner":"XXX","pathSuffix":"part-m-00002","permission":"644","replication":3,"storagePolicy":0,"type":"FILE"}]}}
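Building on the response above, a small script can combine the two ideas: walk the LISTSTATUS output, skip any paths you want excluded, and apply a simple health check (here, flagging under-replicated files, one of the conditions fsck reports). This is only a sketch: the function name, the exclude-pattern handling via `fnmatch`, and the `min_replication` threshold are my own assumptions, not part of the WebHDFS API.

```python
import fnmatch

def check_statuses(liststatus_response, base_path, exclude_patterns, min_replication=3):
    """Filter a WebHDFS LISTSTATUS response, skipping excluded paths.

    Returns (path, state) tuples, flagging files whose replication
    factor is below `min_replication` -- a rough stand-in for one of
    the checks `hadoop fsck` performs, but with path exclusion.
    """
    results = []
    for status in liststatus_response["FileStatuses"]["FileStatus"]:
        path = base_path.rstrip("/") + "/" + status["pathSuffix"]
        # The part fsck itself cannot do: skip excluded paths.
        if any(fnmatch.fnmatch(path, pat) for pat in exclude_patterns):
            continue
        if status["type"] == "FILE" and status["replication"] < min_replication:
            results.append((path, "UNDER-REPLICATED"))
        else:
            results.append((path, "HEALTHY"))
    return results

# Sample input trimmed down from the NameNode response above
# (paths and replication values are illustrative).
sample = {"FileStatuses": {"FileStatus": [
    {"pathSuffix": "_SUCCESS", "type": "FILE", "replication": 3},
    {"pathSuffix": "part-m-00000", "type": "FILE", "replication": 3},
    {"pathSuffix": "part-m-00001", "type": "FILE", "replication": 1},
]}}

print(check_statuses(sample, "/data/out", ["*/_SUCCESS"]))
# → [('/data/out/part-m-00000', 'HEALTHY'),
#    ('/data/out/part-m-00001', 'UNDER-REPLICATED')]
```

In practice you would fetch the JSON with `curl` or Python's `urllib` against the LISTSTATUS URL shown earlier, and recurse into entries whose `type` is `"DIRECTORY"` to cover a whole subtree.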
