Alibaba Cloud OSS-HDFS Service (JindoFS Service) Metadata Export Guide

(Supported since 4.6.0)

Introduction

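The metadata export feature writes an inventory of the file metadata in the current OSS-HDFS bucket to the /.sysinfo/inventory directory as JSON files, so that the metadata can be used for statistics and analysis.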

./jindofs admin -dumpInventory oss://<hdfs_bucket>/

The command prints the output path:

=============Dump Inventory=============
Job Id: 0177388834774116055076952082867238
Data Location: /.sysinfo/inventory/1773888347741.0177388834774116055076952082867238
..........
FINISHED.

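This command blocks until the export completes. Depending on the volume of metadata, expect to wait between 10 seconds and 10 minutes. The export has succeeded once the final line of output reads FINISHED.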
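When the export is driven from a script, the Job Id and Data Location can be picked out of the captured output. The following is a minimal sketch that assumes the output has exactly the shape shown in the sample above; the function name `parse_dump_output` is hypothetical and not part of JindoFS.

```python
import re

def parse_dump_output(text):
    """Extract Job Id, Data Location and completion status from captured CLI output.

    Assumption: the output format matches the sample shown in this document.
    """
    job = re.search(r"^Job Id:[ \t]*(\S+)", text, re.MULTILINE)
    loc = re.search(r"^Data Location:[ \t]*(\S+)", text, re.MULTILINE)
    # The export is treated as successful only if a line reads exactly "FINISHED."
    finished = any(line.strip() == "FINISHED." for line in text.splitlines())
    return {
        "job_id": job.group(1) if job else None,
        "data_location": loc.group(1) if loc else None,
        "finished": finished,
    }

# Sample output copied from this document.
SAMPLE = """=============Dump Inventory=============
Job Id: 0177388834774116055076952082867238
Data Location: /.sysinfo/inventory/1773888347741.0177388834774116055076952082867238
..........
FINISHED.
"""

result = parse_dump_output(SAMPLE)
print(result["data_location"])
```

In practice the text would come from the captured standard output of the command, for example through `subprocess.run(..., capture_output=True, text=True)`.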

  • Download the result file
./jindofs fs -get oss://<hdfs_bucket>/.sysinfo/inventory/1773888347741.0177388834774116055076952082867238

Once the result is downloaded to the local machine, it can be opened with vi/vim.

Sample output

{"id":16385,"path":"/","type":"directory","size":0,"user":"admin","group":"supergroup","atime":0,"mtime":1666581702933,"permission":511,"state":1}
{"id":6246684106789500068,"path":"/dls-1000326249","type":"directory","size":0,"user":"hadoop","group":"supergroup","atime":0,"mtime":1660889124590,"permission":511,"state":0}
{"id":6246684106789500069,"path":"/dls-1000326249/benchmark","type":"directory","size":0,"user":"hadoop","group":"supergroup","atime":0,"mtime":1660889124590,"permission":511,"state":0}
{"id":6246684106789500070,"path":"/dls-1000326249/benchmark/n1","type":"directory","size":0,"user":"hadoop","group":"supergroup","atime":0,"mtime":1660889124590,"permission":511,"state":0}
{"id":6246684106789500071,"path":"/dls-1000326249/benchmark/n1/490747449","type":"directory","size":0,"user":"hadoop","group":"supergroup","atime":0,"mtime":1660895613953,"permission":511,"state":0}
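For statistics, the downloaded result can be processed line by line. The sketch below assumes the file holds one JSON object per line, as the sample output suggests; the helper name `summarize_inventory` and the `file` record in the demo data are made up for illustration.

```python
import json
from collections import Counter

def summarize_inventory(lines):
    """Count entries per type and sum their sizes from JSON-lines inventory text."""
    counts = Counter()
    total_size = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        counts[record.get("type", "unknown")] += 1
        total_size += record.get("size", 0)
    return dict(counts), total_size

# The first two records are shortened from the sample output above;
# the third (a file) is a hypothetical record added for the demo.
demo_lines = [
    '{"id":16385,"path":"/","type":"directory","size":0}',
    '{"id":6246684106789500068,"path":"/dls-1000326249","type":"directory","size":0}',
    '{"id":1,"path":"/dls-1000326249/example.dat","type":"file","size":1024}',
]

counts, total_size = summarize_inventory(demo_lines)
print(counts, total_size)
```

For a downloaded file, the same function can be fed an open file handle, for example `with open(local_path) as f: summarize_inventory(f)`, where `local_path` is wherever the result was saved.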

Output field reference

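Field                Description
id                   Unique identifier of the file or directory
path                 Absolute path of the file or directory
type                 Entry type. Valid values: directory or file
size                 File size in bytes. The size of a directory is 0
user                 Owner of the file or directory
group                Owning group of the file or directory
ctime                Creation time (Create Time), Unix timestamp in milliseconds
atime                Last access time (Access Time), Unix timestamp in milliseconds
mtime                Last modification time (Modify Time), Unix timestamp in milliseconds
storagePolicy        Storage policy. Valid values: UNSPECIFIED (default, equivalent to Standard), CLOUD_STD (Standard), CLOUD_IA (Infrequent Access), CLOUD_AR (Archive), CLOUD_COLD_AR (Cold Archive), CLOUD_DEEP_COLD_AR (Deep Cold Archive), CLOUD_AR_RESTORED (Archive, restored), CLOUD_COLD_AR_RESTORED (Cold Archive, restored), CLOUD_DEEP_COLD_AR_RESTORED (Deep Cold Archive, restored)
permission           Permission bits as a decimal number (for example, 511 corresponds to octal 777)
state                Internal field
storageConvertTime   Internal field
storageState         Internal field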
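The permission and timestamp fields are easier to read once decoded. The sketch below converts the decimal permission value to its octal form and a millisecond timestamp to a UTC datetime; the helper names are hypothetical, and the input values come from the sample output in this document.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def permission_to_octal(value):
    """Render the decimal permission value as an octal string, e.g. 511 -> '777'."""
    return format(value, "o")

def ms_to_utc(ms):
    """Convert a Unix timestamp in milliseconds to a timezone-aware UTC datetime."""
    # Integer timedelta arithmetic avoids floating-point rounding of the milliseconds.
    return EPOCH + timedelta(milliseconds=ms)

print(permission_to_octal(511))
print(ms_to_utc(1666581702933).isoformat())
```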

Advanced usage

1. Specifying the metadata output fields

(Supported since 6.9.1)

This option selects which file metadata fields are written to the output. By default, all fields are output.

Usage:

## -field field : specifies a metadata field to output
## path is required, and at least one other field must be specified in addition
## Available fields: id type size user group ctime atime mtime permission state storagePolicy storageConvertTime storageState
./jindofs admin -dumpInventory oss://<hdfs_bucket>/ -field path -field mtime

Sample output

{"path":"/","mtime":1666581702933}
{"path":"/dls-1000326249","mtime":1660889124590}
{"path":"/dls-1000326249/benchmark","mtime":1660889124590}
{"path":"/dls-1000326249/benchmark/n1","mtime":1660889124590}
{"path":"/dls-1000326249/benchmark/n1/490747449","mtime":1660895613953}
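When the command line is assembled by a script, the field rules stated in the usage comments (path is required, plus at least one other field from the listed set) can be checked before the command is run. This is a minimal sketch; `build_dump_command` is a hypothetical helper and `my-bucket` is a placeholder bucket name.

```python
# Field names other than "path", copied from the usage comments above.
OPTIONAL_FIELDS = (
    "id", "type", "size", "user", "group", "ctime", "atime", "mtime",
    "permission", "state", "storagePolicy", "storageConvertTime", "storageState",
)

def build_dump_command(bucket_uri, fields):
    """Return the argument list for -dumpInventory with one -field flag per field."""
    if "path" not in fields:
        raise ValueError("'path' is a required field")
    others = [f for f in fields if f != "path"]
    if not others:
        raise ValueError("at least one field besides 'path' is required")
    unknown = [f for f in others if f not in OPTIONAL_FIELDS]
    if unknown:
        raise ValueError("unknown fields: " + ", ".join(unknown))
    cmd = ["./jindofs", "admin", "-dumpInventory", bucket_uri]
    for field in fields:
        cmd += ["-field", field]
    return cmd

cmd = build_dump_command("oss://my-bucket/", ["path", "mtime"])
print(" ".join(cmd))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues.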

2. Specifying the metadata analysis path

(Supported since 6.10.0)

This option specifies the path whose file inventory is analyzed. By default, the root path is analyzed.

Usage:

## -path path : specifies the path whose metadata is analyzed
./jindofs admin -dumpInventory oss://<hdfs_bucket>/ -path oss://<hdfs_bucket>/dls-1000326249/benchmark

Sample output

{"id":6246684106789500069,"path":"/dls-1000326249/benchmark","type":"directory","size":0,"user":"hadoop","group":"supergroup","atime":0,"mtime":1660889124590,"permission":511,"state":0}
{"id":6246684106789500070,"path":"/dls-1000326249/benchmark/n1","type":"directory","size":0,"user":"hadoop","group":"supergroup","atime":0,"mtime":1660889124590,"permission":511,"state":0}
{"id":6246684106789500071,"path":"/dls-1000326249/benchmark/n1/490747449","type":"directory","size":0,"user":"hadoop","group":"supergroup","atime":0,"mtime":1660895613953,"permission":511,"state":0}
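A common follow-up on a subtree export is to total the sizes under each immediate child of the analyzed path. The sketch below assumes records have already been parsed into dictionaries and that `path` values are bucket-relative, as in the sample output (so the root is given as `/dls-1000326249/benchmark`, not as an `oss://` URI). The helper `size_by_child` and the file records in the demo data are hypothetical.

```python
from collections import defaultdict

def size_by_child(records, root):
    """Sum 'size' per immediate child directory or file under root."""
    root = root.rstrip("/")
    totals = defaultdict(int)
    for record in records:
        path = record["path"]
        if not path.startswith(root + "/"):
            continue  # skip the root entry itself and anything outside the subtree
        child = path[len(root) + 1:].split("/", 1)[0]
        if not child:
            continue  # happens only for the "/" entry when root is "/"
        totals[child] += record.get("size", 0)
    return dict(totals)

# Directory paths follow the sample output; the file entries are made up for the demo.
demo_records = [
    {"path": "/dls-1000326249/benchmark", "type": "directory", "size": 0},
    {"path": "/dls-1000326249/benchmark/n1", "type": "directory", "size": 0},
    {"path": "/dls-1000326249/benchmark/n1/a.dat", "type": "file", "size": 100},
    {"path": "/dls-1000326249/benchmark/n2/b.dat", "type": "file", "size": 50},
]

totals = size_by_child(demo_records, "/dls-1000326249/benchmark")
print(totals)
```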