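So I wrote a script that takes text files as arguments, concatenates them into one temporary text file, then takes every word in that file, splits the words apart, lists them, and counts how many times each one occurs.
The only remaining problem is that I can't filter out whitespace to separate the words, and that is breaking my code.
I tried assigning " and ' together with * to a variable named PUNISHED, because I only want to count words, not symbols.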
PUNISHED=" *' ' "
if [ -z "$@" ]; then
echo "You need to give this a filename."
exit 1
elif [ "$@" > 1 ]; then
echo "You have more than one argument, commencing now."
fi
test -e temp.txt || echo >> temp.txt
for i in $@
do
cat $@ > temp.txt
done
tr ' ' '\n' < "temp.txt"| grep -v "${PUNISHED}" temp.txt | sort | uniq -c | sort -nr > result.txt | cut -c 15 result.txt
Edit: fixed the error messages.
Here is an implementation of the word-count script:
#!/usr/bin/env sh
if [ $# -gt 0 ]; then
[ $# -gt 1 ] && echo "You have more than one argument, commencing now." >&2 # Notice only for several files, kept out of stdout
cat "$@" | # Output all files in arguments to stdout
tr -d '[:punct:]' | # Remove punctuation
tr '[:upper:]' '[:lower:]' | # Lowercase all
xargs -n 1 | # Lay all words one by line into a words list
sort | # Sort the words list
uniq -c | # Count number of occurrences of words in list
sort -k 1nr # Sort the list on the count column numerical reverse order
else
echo "You need to give this a filename." >&2 # Output error message to stderr
exit 1 # Exit with error code
fi
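For comparison, here is a minimal sketch of the same pipeline wrapped in a function. It is a variant under stated assumptions, not the script above: the name `count_words` is hypothetical, and instead of deleting punctuation and splitting with `xargs` it uses `tr -cs` to turn every run of non-letter characters into a single newline. That should also cover the whitespace problem from the question, since spaces, tabs and newlines are all treated as separators, and no temporary file is needed.

```shell
#!/usr/bin/env sh
# count_words: hypothetical helper name, not part of the original script.
# Prints "count word" lines, most frequent first, for the files given as
# arguments (or for stdin when no file is given).
count_words() {
  cat "$@" |                     # Concatenate all input files (stdin if none)
    tr '[:upper:]' '[:lower:]' | # Lowercase everything
    tr -cs '[:alpha:]' '\n' |    # Replace each run of non-letters with one newline
    grep -v '^$' |               # Drop the empty line left by a leading separator
    sort |                       # Group identical words together
    uniq -c |                    # Count each group
    sort -k1,1nr -k2             # Count descending, then word ascending
}
```

It could be called as `count_words file1.txt file2.txt > result.txt`. One trade-off of splitting on non-letters is that contractions are broken apart ("don't" is counted as "don" and "t") and digits are discarded, so the character class would need adjusting if those matter.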