Parsing a list of files

Imagine we run a plain git diff --name-only. The output looks like this:

/path1/path2/path3/path4/file1
/path1/path2/path3/path4/file2
/path1/path2/file3
/file4
/path1/file5

The goal is a function that can parse the paths and count the distinct values in any given path column. For example, if I run:

// 1 -> is the column to watch/count.
// In example output above it is: path1, path1, path1, file4, path1
// For 2 -> path2, path2, path2, null, file5
// For 3 -> path3, path3, file3, null, null
git diff --name-only | someFn(1)

It should output the total number of distinct matches. For example:

1 -> should output 2 (path1, file4)
2 -> should output 3 (path2, null, file5)
3 -> should output 3 (path3, file3, null)

The function's output should be a plain number: 0, 1, 2, ...

Can anyone help me? Thanks.

Try awk with a specific field separator:

git diff --name-only | awk -F "/" '{ print $2 }'

which prints

path1
path1
path1
file4
path1

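awk simply splits each line on /. Because every path starts with /, the first field ($1) is empty, which is why path column 1 is $2.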
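To make the field numbering visible, here is a small sketch that feeds the sample list from the question through awk and prints the field count plus the first two fields in brackets. The helper names sample_paths and show_fields are made up for this illustration. Note that the sample paths start with /; if your real output has no leading slash, path column N is simply $N.

```shell
# Stand-in for the `git diff --name-only` output shown in the question.
sample_paths() {
  printf '%s\n' \
    /path1/path2/path3/path4/file1 \
    /path1/path2/path3/path4/file2 \
    /path1/path2/file3 \
    /file4 \
    /path1/file5
}

# Print the number of fields and the first two fields of every line.
# The brackets make the empty first field (before the leading "/") visible.
show_fields() {
  awk -F "/" '{ printf "NF=%d [%s] [%s]\n", NF, $1, $2 }'
}

sample_paths | show_fields
```

For /file4 this prints NF=2 [] [file4]: the text before the first slash is an empty field, and file4 is field 2.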

You can also combine awk with sort -u and wc -l to count the distinct matches.

git diff --name-only | awk -F "/" '{ print $3 }' | awk 'NF' | sort -u | wc -l
>2

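This prints the third field (path column 2), drops empty lines, sorts the results while removing duplicates, and finally counts the remaining lines. Because empty lines are dropped, paths that have no such column are not counted, so the result here is 2 (path2, file5) rather than the 3 listed in the question. Combining these commands should cover what you need.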
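The same pipeline can be wrapped in a function that takes the column number, roughly as sketched below. The name count_nonempty and the sample_paths helper are assumptions for the example, and the trailing tr only strips the padding that some wc implementations add.

```shell
# Stand-in for the `git diff --name-only` output shown in the question.
sample_paths() {
  printf '%s\n' \
    /path1/path2/path3/path4/file1 \
    /path1/path2/path3/path4/file2 \
    /path1/path2/file3 \
    /file4 \
    /path1/file5
}

# count_nonempty N: number of distinct non-empty values in path column N.
# The leading "/" means path column N is awk field N + 1.
count_nonempty() {
  awk -F "/" -v col="$1" '{ print $(col + 1) }' |
    awk 'NF' |      # drop the empty lines produced by shorter paths
    sort -u |       # sort and remove duplicates
    wc -l |         # count what is left
    tr -d ' '       # strip padding around the number
}

sample_paths | count_nonempty 2   # path2, file5
```

Paths that are too short for the requested column are ignored here, so this variant never counts "null" as a value.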

You can define a function like this:

function fun() { cut -d / -f $(($1+1)) | sort -u | wc -l ;}

Then:

for i in $(seq 6) ; do
  git diff --name-only | fun $i
done

The function keyword can be omitted; fun() { ...; } on its own is the portable POSIX form.
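As a sketch of how the cut-based function behaves on the sample list from the question, here it is in the portable form without the function keyword. The names count_column and sample_paths are made up for the example. cut prints an empty line for a path that has no such column, so a missing value counts as one distinct value, which matches the "null" entries in the question.

```shell
# Stand-in for the `git diff --name-only` output shown in the question.
sample_paths() {
  printf '%s\n' \
    /path1/path2/path3/path4/file1 \
    /path1/path2/path3/path4/file2 \
    /path1/path2/file3 \
    /file4 \
    /path1/file5
}

# count_column N: number of distinct values in path column N,
# where a missing column (empty line from cut) counts as one value.
count_column() {
  cut -d / -f "$(($1 + 1))" | sort -u | wc -l | tr -d ' '
}

for i in 1 2 3; do
  sample_paths | count_column "$i"
done
# prints 2, 3 and 3, one number per line
```

If you would rather ignore missing columns, filter out the empty lines before sort -u.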

Using gawk:

$ git diff --name-only | awk -F "/" 'NF > 2 { a[$3]=1 }
                                     END    { print length(a) }'
2
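The one-liner above hard-codes column 2 and relies on length() of an array, which not every awk supports. A possible variant, assuming you want the column as a parameter, keeps its own counter instead. The names count_in_awk and sample_paths are hypothetical.

```shell
# Stand-in for the `git diff --name-only` output shown in the question.
sample_paths() {
  printf '%s\n' \
    /path1/path2/path3/path4/file1 \
    /path1/path2/path3/path4/file2 \
    /path1/path2/file3 \
    /file4 \
    /path1/file5
}

# count_in_awk N: distinct values in path column N, in a single awk process.
# NF > col skips paths that are too short; seen[] marks values already counted.
count_in_awk() {
  awk -F "/" -v col="$1" '
    NF > col && !seen[$(col + 1)]++ { n++ }
    END { print n + 0 }   # n + 0 prints 0 when nothing matched
  '
}

sample_paths | count_in_awk 2   # path2, file5
```

Like the original, this skips paths without the requested column, so it does not count missing values.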

You can also use cut:

git diff --name-only | cut -d '/' -f2

which gives you

path1
path1
path1
file4
path1

To sort and count the unique values (as mentioned earlier):

git diff --name-only | cut -d '/' -f2 | sort -u | wc -l

I suspect the cut solution will run faster than the awk one on larger inputs.
