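I'm trying to collect basic disk space information from a server with a bash script and store the output in JSON format. I want to record the available and used disk space.
Sample output of df -h: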
Filesystem Size Used Avail Use% Mounted on
udev 2.0G 4.0K 2.0G 1% /dev
tmpfs 394M 288K 394M 1% /run
/dev/mapper/nodequery--vg-root 45G 1.4G 41G 4% /
none 4.0K 0 4.0K 0% /sys/fs/cgroup
none 5.0M 0 5.0M 0% /run/lock
none 2.0G 0 2.0G 0% /run/shm
none 100M 0 100M 0% /run/user
/dev/sda2 237M 47M 178M 21% /boot
/dev/sda1 511M 3.4M 508M 1% /boot/efi
For example, this is what I would like the final output to look like:
{
"diskarray": [{
"mount": "/dev/disk1",
"spacetotal": "35GB",
"spaceavail": "1GB"
},
{
"mount": "/dev/disk2",
"spacetotal": "35GB",
"spaceavail": "4GB"
}]
}
So far I have tried using awk:
df -P -B 1 | grep '^/' | awk '{ print $1" "$2" "$3";" }'
which gives the following output:
/dev/mapper/nodequery--vg-root 47710605312 1439592448;
/dev/sda2 247772160 48645120;
/dev/sda1 535805952 3538944;
But I'm not sure how to take this data and store it in JSON format.
The following does what you want; the only requirement beyond bash is a Python interpreter:
python_script=$(cat <<'EOF'
import sys, json

data = {'diskarray': []}
for line in sys.stdin:
    # awk emits three fields per line: device, total size, available space
    mount, total, avail = line.split()
    data['diskarray'].append(dict(mount=mount, spacetotal=total, spaceavail=avail))
sys.stdout.write(json.dumps(data))
EOF
)
# Keep only lines starting with "/" and print the device, Size and Avail columns
df -Ph | awk '/^\// { print $1" "$2" "$4 }' | python3 -c "$python_script"
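If you would rather skip the awk step, the same parsing can be sketched in Python alone. This is only a minimal sketch: the function name df_to_diskarray and the SAMPLE text are made up for illustration (the sample rows are copied from the df -h output in the question), and it assumes the six-column POSIX layout that df -P produces, with no spaces in device names.

```python
import json


def df_to_diskarray(df_output):
    """Convert `df -P`-style text into the {"diskarray": [...]} structure."""
    disks = []
    for line in df_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 5)  # at most 6 columns; mount point is last
        if len(fields) < 6 or not fields[0].startswith("/"):
            continue  # keep only device-backed filesystems, like grep '^/'
        device, size, _used, avail = fields[:4]
        disks.append({"mount": device, "spacetotal": size, "spaceavail": avail})
    return {"diskarray": disks}


# Hypothetical sample input, taken from the df -h output in the question.
SAMPLE = """\
Filesystem Size Used Avail Use% Mounted on
udev 2.0G 4.0K 2.0G 1% /dev
/dev/mapper/nodequery--vg-root 45G 1.4G 41G 4% /
/dev/sda2 237M 47M 178M 21% /boot
"""

if __name__ == "__main__":
    print(json.dumps(df_to_diskarray(SAMPLE), indent=2))
```

To run it against a live system, you could feed it the text returned by subprocess.run(["df", "-Ph"], capture_output=True, text=True).stdout instead of SAMPLE.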
An alternative implementation using jq might look like this:
df -Ph |
  jq -R -s '
    [
      split("\n") |
      .[] |
      if test("^/") then
        # collapse runs of spaces, then pick the device, Size and Avail columns
        gsub(" +"; " ") | split(" ") | {mount: .[0], spacetotal: .[1], spaceavail: .[3]}
      else
        empty
      end
    ]'
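An alternative one-liner (the command prints a single line; the output is shown pretty-printed here):
$ df -hP | awk 'BEGIN{printf "{\"discarray\":["} $1=="Filesystem"{next} {if(a)printf ","; printf "{\"mount\":\"%s\",\"size\":\"%s\",\"used\":\"%s\",\"avail\":\"%s\",\"use%%\":\"%s\"}",$6,$2,$3,$4,$5; a++} END{print "]}"}'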
{
"discarray":[
{
"mount":"/",
"size":"3.9G",
"used":"2.2G",
"avail":"1.5G",
"use%":"56%"
},
{
"mount":"/dev",
"size":"24G",
"used":"0",
"avail":"24G",
"use%":"0%"
}
]
}
The JSON parser xidel can do what you want:
$ df -h | xidel -se '
  {
    "diskarray":array{
      for $disk in x:lines($raw)[starts-with(.,"/dev")]
      let $item:=tokenize($disk,"\s+")
      return {
        "mount":$item[1],
        "spacetotal":$item[2],
        "spaceavail":$item[4]
      }
    }
  }
'
{
"diskarray": [
{
"mount": "/dev/mapper/nodequery--vg-root",
"spacetotal": "45G",
"spaceavail": "41G"
},
{
"mount": "/dev/sda2",
"spacetotal": "237M",
"spaceavail": "178M"
},
{
"mount": "/dev/sda1",
"spacetotal": "511M",
"spaceavail": "508M"
}
]
}
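- x:lines($raw) is shorthand for tokenize($raw,"\r\n?|\n"), which turns the input into a sequence in which every line is a separate item. In this case, only the lines that start with "/dev" are selected.
- tokenize($disk,"\s+") turns a single line into a sequence by using the (excess) whitespace as the separator.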
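For readers who don't know XPath, the two tokenizing steps can be mimicked in Python with re.split. This is just an illustrative sketch, not what xidel does internally; the helper name lines and the raw sample text are invented here, with mixed line endings on purpose to exercise the pattern.

```python
import re


def lines(raw):
    # Same pattern as quoted above: split on CRLF, a lone CR, or LF.
    return re.split(r"\r\n?|\n", raw)


# Hypothetical input: two df rows plus a header, with mixed line endings.
raw = (
    "Filesystem Size Used Avail Use% Mounted on\r\n"
    "/dev/sda2   237M   47M  178M  21% /boot\n"
    "/dev/sda1   511M  3.4M  508M   1% /boot/efi"
)

# Keep only the lines that start with "/dev", like the XPath predicate.
dev_lines = [line for line in lines(raw) if line.startswith("/dev")]

# Split each line on runs of whitespace, like tokenize($disk, "\s+").
items = [re.split(r"\s+", line) for line in dev_lines]

for item in items:
    # XPath sequences are 1-based, so $item[1], $item[2], $item[4]
    # correspond to Python indexes 0, 1 and 3.
    print({"mount": item[0], "spacetotal": item[1], "spaceavail": item[3]})
```

Note the index shift: XPath counts from 1, Python from 0.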
You can do:
$ df -Ph | awk '/^\// {print $1"\t"$2"\t"$4}' | python3 -c 'import json, fileinput; print(json.dumps({"diskarray":[dict(zip(("mount", "spacetotal", "spaceavail"), l.split())) for l in fileinput.input()]}, indent=2))'
{
"diskarray": [
{
"mount": "/dev/disk1",
"spacetotal": "931Gi",
"spaceavail": "623Gi"
},
{
"mount": "/dev/disk2s2",
"spacetotal": "1.8Ti",
"spaceavail": "360Gi"
}
]
}
Instead of df, you can use one of the various tools for collecting system metrics.
For example, facter:
$ facter --json mountpoints
{
"mountpoints": {
"/": {
"available": "14.33 GiB",
"available_bytes": 15385493504,
"capacity": "39.12%",
"device": "/dev/vda1",
"filesystem": "ext4",
...
Another example is prometheus-node-exporter, which runs as an HTTP service. Its output is not JSON, but it is easy to parse:
$ curl -sS 0:9100/metrics | grep -E '^node_filesystem_.+_bytes'
node_filesystem_avail_bytes{device="/dev/vda1",fstype="ext4",mountpoint="/"} 1.551777792e+10
node_filesystem_free_bytes{device="/dev/vda1",fstype="ext4",mountpoint="/"} 1.6629563392e+10
node_filesystem_size_bytes{device="/dev/vda1",fstype="ext4",mountpoint="/"} 2.638553088e+10
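As a rough illustration of how such lines could be turned into the JSON shape from the question, here is a minimal Python sketch. The function name metrics_to_diskarray, the regular expressions, and the mapping of size/avail onto spacetotal/spaceavail are my own assumptions; it also assumes label values contain no escaped quotes, which a full parser of the Prometheus text format would have to handle.

```python
import json
import re

# Matches e.g. node_filesystem_avail_bytes{...labels...} 1.5e+10
METRIC_RE = re.compile(r'^node_filesystem_(\w+)_bytes\{(.*)\}\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')

# Assumed mapping from metric kind to the key names used in the question.
KEYS = {"size": "spacetotal", "avail": "spaceavail"}


def metrics_to_diskarray(text):
    disks = {}  # one entry per mount point, in first-seen order
    for line in text.splitlines():
        match = METRIC_RE.match(line.strip())
        if not match:
            continue
        kind, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels))
        if kind not in KEYS or "mountpoint" not in labels:
            continue  # ignore e.g. free_bytes and malformed lines
        entry = disks.setdefault(labels["mountpoint"], {"mount": labels.get("device")})
        entry[KEYS[kind]] = int(float(value))  # values arrive in exponent notation
    return {"diskarray": list(disks.values())}


# Sample lines copied from the curl output above.
SAMPLE = """\
node_filesystem_avail_bytes{device="/dev/vda1",fstype="ext4",mountpoint="/"} 1.551777792e+10
node_filesystem_free_bytes{device="/dev/vda1",fstype="ext4",mountpoint="/"} 1.6629563392e+10
node_filesystem_size_bytes{device="/dev/vda1",fstype="ext4",mountpoint="/"} 2.638553088e+10
"""

if __name__ == "__main__":
    print(json.dumps(metrics_to_diskarray(SAMPLE), indent=2))
```

Grouping by mountpoint rather than device keeps one entry per mounted filesystem even when the same device is mounted more than once.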