C: Splitting long sentences read from a text file



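I want to split long sentences in a text into shorter ones at an arbitrary cut point. My approach counts words by counting whitespace characters. Given an input file input.txt with the following content: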

ciao
ciao ciao
ciao ciao ciao ciao ciao ciao
ciao ciao ciao ciao
ciao ciao ciao

I expect:

ciao
ciao ciao
ciao ciao ciao 
ciao ciao ciao
ciao ciao ciao 
ciao
ciao ciao ciao

for a cut point of 3.

I tried to solve the problem with the following code:

#include<stdlib.h>
#include<stdio.h>
#include<ctype.h>                                      
/* MAIN */
int main(int argc, char *argv[]){
        FILE *inp = fopen(argv[1], "r");
        char c;
        int word_counter = 0;
        while((c = fgetc(inp)) != EOF){
                printf("%c", c);
                if(isspace(c))
                        ++word_counter;
                /* Cutter */
                if(word_counter == 3){
                        printf("\n");
                        word_counter = 0;  /* counter to zero */
                } 
        }
        return 0;
}

which produces this output:

ciao
ciao  ciao
ciao  ciao  ciao

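I can't understand the reason for this behavior. Shouldn't the code simply print an extra newline when the condition is met? Why does it skip entire sentences?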

You need to reset word_counter to zero after reading a newline character.

Also, whenever word_counter != 3, you print every c twice:

printf("%c", c);  // ** here
if(isspace(c))
        ++word_counter;
/* Cutter */
if(word_counter == 3){
        printf("\n");
        word_counter = 0;
}
else
        printf("%c", c);  // ** and here

Maybe try this (untested):

while((c = fgetc(inp)) != EOF){
    if (isspace(c) && ++word_counter == 3) {
        printf("\n");
        word_counter = 0;  /* counter to zero */
        continue;
    }
    if (c == '\n') {
        word_counter = 0;
    }
    printf("%c", c);
}

Or even shorter:

while((c = fgetc(inp)) != EOF){
    if ( (isspace(c) && ++word_counter == 3) || (c == '\n') ) {
        printf("\n");
        word_counter = 0;  /* counter to zero */
        continue;
    }
    printf("%c", c);
}

Also keep in mind that isspace(c) returns true when c == '\n', so a more robust version that also handles \r\n would be:

while((c = fgetc(inp)) != EOF){
    if ( (c == ' ' || c == '\t') && (++word_counter == 3) ) {
        word_counter = 0;
        printf("\n");
        continue;
    }
    if ( c == '\r' || c == '\n' ) {
        word_counter = 0;
    }
    printf("%c", c);
}
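
To make that last loop easier to try out, here is a minimal sketch that wraps it in a function. The name split_words and its in, out and cut parameters are hypothetical and not part of the original post. The sketch also declares c as int rather than char, which is my addition: fgetc returns an int, and storing the result in a char can make the comparison with EOF unreliable.

```c
#include <stdio.h>

/*
 * Copy `in` to `out`, starting a new line after every `cut` words.
 * Hypothetical helper for illustration; assumes cut >= 1 and that
 * words are separated by single spaces or tabs.
 */
void split_words(FILE *in, FILE *out, int cut)
{
    int c;                 /* int, not char, so EOF stays distinguishable */
    int word_counter = 0;  /* separators seen since the last line break */

    while ((c = fgetc(in)) != EOF) {
        if ((c == ' ' || c == '\t') && ++word_counter == cut) {
            /* cut point reached: emit a newline instead of the separator */
            word_counter = 0;
            fputc('\n', out);
            continue;
        }
        if (c == '\r' || c == '\n') {
            /* a real line ending starts a fresh count */
            word_counter = 0;
        }
        fputc(c, out);
    }
}
```

From main you would call it as split_words(inp, stdout, 3) after checking that fopen did not return NULL. Note that it counts separators rather than words, so runs of consecutive spaces would each count as a word boundary.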
