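I want to split long sentences in a text into shorter ones at an arbitrary cut point. My approach counts words by counting whitespace. Given an input file input.txt containing: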
ciao
ciao ciao
ciao ciao ciao ciao ciao ciao
ciao ciao ciao ciao
ciao ciao ciao
I expect:
ciao
ciao ciao
ciao ciao ciao
ciao ciao ciao
ciao ciao ciao
ciao
ciao ciao ciao
for a cut point of 3.
I tried to solve this with the following code:
#include<stdlib.h>
#include<stdio.h>
#include<ctype.h>
/* MAIN */
int main(int argc, char *argv[]){
FILE *inp = fopen(argv[1], "r");
char c;
int word_counter = 0;
while((c = fgetc(inp)) != EOF){
printf("%c", c);
if(isspace(c))
++word_counter;
/* Cutter */
if(word_counter == 3){
printf("\n");
word_counter = 0; /* counter to zero */
}
}
return 0;
}
which produces this output:
ciao
ciao ciao
ciao ciao ciao
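I can't understand the reason for this behavior. Shouldn't the code simply print an extra newline when the condition is met? Why are whole sentences skipped?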
You need to reset word_counter to zero after reading a newline.
Also, if word_counter != 3, you print each c twice:
printf("%c", c); // ** here
if(isspace(c))
++word_counter;
/* Cutter */
if(word_counter == 3){
printf("\n");
word_counter = 0;
}
else
printf("%c", c); // ** and here
Maybe try this (untested):
while((c = fgetc(inp)) != EOF){
if (isspace(c) && ++word_counter == 3 ) {
printf("\n");
word_counter = 0; /* counter to zero */
continue;
}
if (c == '\n') {
word_counter = 0;
}
printf("%c", c);
}
Or even shorter:
while((c = fgetc(inp)) != EOF){
if ( (isspace(c) && ++word_counter == 3) || (c == '\n') ) {
printf("\n");
word_counter = 0; /* counter to zero */
continue;
}
printf("%c", c);
}
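Also keep in mind that isspace(c) returns true when c == '\n', so the newline itself is counted as a word separator. A more robust version, which counts only spaces and tabs and also handles \r\n line endings, would be: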
while((c = fgetc(inp)) != EOF){
if ( (c == ' ' || c == '\t') && (++word_counter == 3) ) {
word_counter = 0;
printf("\n");
continue;
}
if ( c == '\r' || c == '\n' ) {
word_counter = 0;
}
printf("%c", c);
}
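If you want to check the logic without going through a file, here is a minimal sketch of the same idea operating on a string in memory. The function name split_words, its parameters, and the buffer-based interface are my own assumptions for illustration and are not part of the code above. It replaces every cut-th space or tab on a line with '\n' and resets the counter at each '\r' or '\n'.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the original post): copies `in` to `out`,
   turning every `cut`-th space or tab on a line into '\n'. The word counter
   is reset whenever a '\r' or '\n' is read. At most cap - 1 characters are
   written, followed by a terminating '\0'. Returns the number of characters
   written, not counting the terminator. */
size_t split_words(const char *in, char *out, size_t cap, int cut)
{
    assert(cut > 0 && cap > 0);

    size_t n = 0;
    int word_counter = 0;

    for (size_t i = 0, len = strlen(in); i < len && n + 1 < cap; ++i) {
        char c = in[i];

        if ((c == ' ' || c == '\t') && ++word_counter == cut) {
            c = '\n';          /* cut here instead of printing the space */
            word_counter = 0;
        } else if (c == '\r' || c == '\n') {
            word_counter = 0;  /* a new line starts a new count */
        }
        out[n++] = c;
    }
    out[n] = '\0';
    return n;
}
```

Called with the contents of input.txt and a cut of 3, this should give the output you listed as expected. Because each cut replaces a separator rather than inserting a character, the result is never longer than the input.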