Accept a filename as an argument and count how often each character repeats

I need to count the repeated characters in a text file, and the filename has to be passed as an argument.

Example: test.txt contains

Zoom

The output should look like:

z 1
o 2
m 1

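I need a command that accepts a filename as an argument and then lists the character counts for that file. In my example, test.txt contains the word Zoom, so the output should show how many times each letter occurs.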

My attempt:

vi test.sh

#!/bin/bash
FILE="$1"                 # to pass the filename as an argument
sort file1.txt | uniq -c  # to count the number of letters

Just a guess?

cat test.txt |
tr '[:upper:]' '[:lower:]' |
fold -w 1 |
sort |
uniq -c |
awk '{print $2, $1}'

m 1
o 2
z 1
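The question also asks for the filename to be passed as an argument, which the pipeline above hard-codes. A minimal sketch of one way to wrap it follows. The function name `count_letters` is hypothetical, and dropping whitespace with `tr -d '[:space:]'` is an added assumption so that spaces and blank lines are not counted.

```shell
#!/bin/bash
# count_letters FILE  (hypothetical helper name, not from the original answer)
# Prints "character count" for every non-whitespace character in FILE,
# ignoring case.
count_letters() {
    local file="$1"
    if [ ! -f "$file" ]; then
        echo "file not found: $file" >&2
        return 1
    fi
    tr '[:upper:]' '[:lower:]' < "$file" |  # fold case so Z and z count together
        tr -d '[:space:]' |                 # assumption: whitespace is not counted
        fold -w 1 |                         # one character per line
        sort |
        uniq -c |                           # prefix each distinct line with its count
        awk '{print $2, $1}'                # reorder to "character count"
}
```

Saved in a script that ends with `count_letters "$1"`, this could be run as `./test.sh test.txt`.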

A suggested awk script that counts characters of every kind:

awk '
BEGIN { FS = "" }  # empty FS makes each character a field (works in gawk; POSIX leaves it unspecified)
{
    for (i = 1; i <= NF; i++) {  # iterate over all fields in the line
        ++charsArr[$i];          # count each occurrence of the field in the array
    }
}
END {
    for (char in charsArr) {     # iterate over the characters seen
        printf("%3d %s\n", charsArr[char], char);  # print the count, then the character
    }
}' | sort -n

Or as a one-liner:

awk '{for(i=1;i<=NF;i++)++arr[$i]}END{for(char in arr)printf("%3d %s\n",arr[char],char)}' FS="" input.1.txt | sort -n
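Because an empty `FS` is not guaranteed to split into characters in every awk, a variant that walks each line with `substr` may be more portable. The sketch below is one possible version, not the answerer's: the function name `count_chars_awk` is hypothetical, and it additionally lowercases the input, skips spaces and tabs, and prints in the "character count" order the question asked for.

```shell
#!/bin/bash
# count_chars_awk FILE  (hypothetical helper name)
# Counts characters case-insensitively without relying on an empty FS.
count_chars_awk() {
    awk '
    {
        line = tolower($0)                    # fold case, as the question expects
        n = length(line)
        for (i = 1; i <= n; i++) {
            c = substr(line, i, 1)            # the i-th character of the line
            if (c != " " && c != "\t")        # assumption: skip spaces and tabs
                count[c]++
        }
    }
    END {
        for (c in count)
            printf("%s %d\n", c, count[c])    # character first, then its count
    }' "$1" | sort
}
```

Called as `count_chars_awk test.txt`, this lists one character per line, sorted by character.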

#!/bin/bash
# get the argument for further processing
inputfile="$1"
# check if the file exists
if [ -f "$inputfile" ]
then
    # convert the file to a usable format:
    # convert all characters to lowercase,
    # put each character on its own line (\n in the replacement needs GNU sed),
    # and write the result to a temporary file
    tr '[:upper:]' '[:lower:]' < "$inputfile" | sed -e 's/\(.\)/\1\n/g' > tmp.txt
    # loop over every character from a to z
    for char in {a..z}
    do
        # count how many lines contain the character (one character per line)
        count=$(grep -c "$char" tmp.txt)
        # print only if count > 0
        if [ "$count" -gt 0 ]
        then
            echo "$char $count"
        fi
    done
    rm tmp.txt
else
    echo "file not found!"
    exit 1
fi
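The script above writes to a fixed tmp.txt in the current directory, which can clash with an existing file. A possible variant of the same a-to-z loop that avoids the temporary file is sketched below. The function name `count_az` is hypothetical, and the technique is swapped: instead of `grep -c` on one character per line, it deletes every other character with `tr -cd` and counts what remains with `wc -c`.

```shell
#!/bin/bash
# count_az FILE  (hypothetical helper name)
# Prints "letter count" for each letter a-z that occurs in FILE, ignoring case.
count_az() {
    local file="$1" char count lowered
    if [ ! -f "$file" ]; then
        echo "file not found!" >&2
        return 1
    fi
    # read the file once, folded to lowercase
    lowered=$(tr '[:upper:]' '[:lower:]' < "$file")
    for char in a b c d e f g h i j k l m n o p q r s t u v w x y z
    do
        # keep only $char, then count the bytes that are left;
        # $(( )) strips the padding some wc implementations add
        count=$(( $(printf '%s' "$lowered" | tr -cd "$char" | wc -c) ))
        if [ "$count" -gt 0 ]; then
            echo "$char $count"
        fi
    done
}
```

This runs `tr` once per letter, so it trades the temporary file for 26 short pipelines; for small inputs such as the test.txt in the question, that cost is negligible.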
