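Python: Writing a CSV column with a loop

Using Python, I want to create a loop that writes text to a CSV file whenever a row contains text.

The original CSV looks like this: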

user_id,    text
0,  
1,  
2,  
3,  sample text
4,  sample text

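I want to add another column, "text_number", that holds the string "text_x", where x is the number of texts seen so far in the column. I want to iterate over the rows and increase the number by 1 for each new text. The final result should look like: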

user_id,    Text,   text_number
0,      
1,      
2,      
3,  sample text,    text_0
4,  sample text,    text_1

With my working code I can insert the "text_number" header, but I'm having trouble putting together the loop for text_x.

import csv
output = list()
with open("test.csv") as file:
    csv_reader = csv.reader(file)
    for i, row in enumerate(csv_reader):
        if i == 0:
            output = [row+["text_number"]]
            continue
        # here's where I'm stuck
            
with open("output2.csv", "w", newline="") as file:
    csv_writer = csv.writer(file, delimiter=",")
    for row in output:
        csv_writer.writerow(row)

Any ideas?

See the comments in the code for the explanation.

# assuming the file
# user_id,text
# 0,  
# 1,  
# 2,  
# 3,sample text
# 4,sample text
# 5, 
# 6,sample text
# import the library
import pandas as pd
df = pd.read_csv('test.csv').fillna('')
# creating column text_number initializing with ''
df['text_number'] = ''
# getting the index where text is valid
index = df.loc[df['text'].str.strip().astype(bool)].index
# finally creating the column text_number with increment as 0, 1, 2 ...
df.loc[index, 'text_number'] = [f'text_{i}' for i in range(len(index))]
print(df)
# save it to disk (index=False keeps the pandas row index out of the file)
df.to_csv('output2.csv', index=False)

#    user_id         text text_number
# 0        0                         
# 1        1                         
# 2        2                         
# 3        3  sample text      text_0
# 4        4  sample text      text_1
# 5        5                         
# 6        6  sample text      text_2
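
The same numbering can also be produced without building the label list by hand, by taking a running count over the "has text" mask. This is only a sketch: the helper name add_text_number and the in-memory SAMPLE string are made up for illustration and stand in for test.csv.

```python
import io

import pandas as pd

# Hypothetical stand-in for test.csv, so the example runs without a file on disk.
SAMPLE = (
    "user_id,text\n"
    "0,  \n"
    "1,  \n"
    "2,  \n"
    "3,sample text\n"
    "4,sample text\n"
    "5, \n"
    "6,sample text\n"
)


def add_text_number(df, text_col="text", out_col="text_number"):
    """Return a copy of df with text_0, text_1, ... on rows whose text is non-blank."""
    # True where the cell holds something other than whitespace
    has_text = df[text_col].astype(str).str.strip().astype(bool)
    # Running count of text rows, shifted so the first one is numbered 0
    counter = has_text.cumsum() - 1
    result = df.copy()
    result[out_col] = ""
    result.loc[has_text, out_col] = "text_" + counter[has_text].astype(str)
    return result


df = pd.read_csv(io.StringIO(SAMPLE)).fillna("")
result = add_text_number(df)
print(result)
```

The cumulative sum does the job of the explicit counter, so the numbering stays correct even when blank rows sit between rows that have text.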

You could try the following modification to the first part:

output = list()
with open("test.csv") as file:
    csv_reader = csv.reader(file)
    output.append(next(csv_reader) + ['text_number'])
    text_no = 0
    for row in csv_reader:
        if row[1].strip():
            row.append(f'text_{text_no}')
            text_no += 1
        output.append(row)
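
For reference, the loop above can be wrapped in a small function and tried on in-memory data. This is a sketch under a few assumptions: the function name number_text_rows is hypothetical, the sample string stands in for test.csv, and rows without text get an empty third cell so every row has the same number of columns (the snippet above leaves those rows with two cells).

```python
import csv
import io


def number_text_rows(rows, text_index=1, header_name="text_number"):
    """Append a text_N label to every row whose text cell is non-blank."""
    rows = iter(rows)
    output = [next(rows) + [header_name]]  # header row plus the new column name
    text_no = 0
    for row in rows:
        # Pad short rows so the label always lands in the new last column
        row = row + [""] * (text_index + 1 - len(row))
        if row[text_index].strip():
            label = f"text_{text_no}"
            text_no += 1
        else:
            label = ""
        output.append(row + [label])
    return output


# Hypothetical sample data in place of test.csv
sample = "user_id,text\n0,  \n1,  \n2,  \n3,sample text\n4,sample text\n"
numbered = number_text_rows(csv.reader(io.StringIO(sample)))

# Write the result with csv.writer, here into a string buffer instead of output2.csv
buffer = io.StringIO()
csv.writer(buffer).writerows(numbered)
print(buffer.getvalue())
```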

You can try:

import csv
output = list()
x=0
with open("test.csv") as file:
    csv_reader = csv.reader(file)
    for i, row in enumerate(csv_reader):
        row[1]=row[1].strip()
        if i == 0:
            row.append("text_number")
        else:
            if row[1]=="":
                row.append(" ")
            else:
                row.append(f"text_{x}")
                x+=1
        output.append(row)            
with open("output2.csv", "w", newline="") as file:
    csv_writer = csv.writer(file, delimiter=",")
    for row in output:
        csv_writer.writerow(row)

I didn't change anything in your code that didn't need changing. I just add a new element to each row on every iteration and append each row to output to build the new list of rows.
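
If the file is large, the intermediate output list can be skipped by writing each row as soon as it is read. The following is a minimal sketch of that variation, not part of the code above: the function name copy_with_text_number is hypothetical, and the demo writes its own small test.csv into a temporary directory.

```python
import csv
import os
import tempfile


def copy_with_text_number(src_path, dst_path, text_index=1):
    """Copy a CSV row by row, adding a text_number column; return how many rows got a label."""
    text_no = 0
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader, None)
        if header is None:  # empty input file, nothing to copy
            return 0
        writer.writerow(header + ["text_number"])
        for row in reader:
            if len(row) > text_index and row[text_index].strip():
                writer.writerow(row + [f"text_{text_no}"])
                text_no += 1
            else:
                writer.writerow(row + [""])
    return text_no


# Demo on a throwaway copy of the sample data
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "test.csv")
    dst_path = os.path.join(tmp, "output2.csv")
    with open(src_path, "w", newline="") as f:
        f.write("user_id,text\n0,  \n3,sample text\n4,sample text\n")
    labelled = copy_with_text_number(src_path, dst_path)
    with open(dst_path, newline="") as f:
        written = list(csv.reader(f))

print(labelled, written)
```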

If you are comfortable with pandas, you can also try this:


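import pandas as pd
# the column label "    text" keeps the leading spaces from the question's header row
df = pd.read_csv("test.csv")
r = []  # values for the new text_number column
x = 0   # running count of rows that contain text
for i in range(df.shape[0]):
    if df["    text"][i].strip() == "":
        r.append(" ")
    else:
        r.append(f"text_{x}")
        x += 1
df["text_number"] = r
print(df)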
"""
   user_id           text   text_number
0        0                     
1        1                     
2        2                     
3        3    sample text      text_0
4        4    sample text      text_1
"""
# to_csv is a DataFrame method; index=False keeps the row index out of the file
df.to_csv("output2.csv", index=False)

Here r is the list of values for the text_number column.
