Python dictionary iteration and replacement fails on the same entry every time



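I am looping over a dictionary. The keys are the "old values" that I want to replace with the "new values" stored in the dictionary.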

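I am able to replace most of them. However, I always find that the second entry of the dictionary ("02 more text") is still present in the output file that comes out of the rest of the processing.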

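What am I doing wrong? I read that Python does not like replacements being made on a list while it is being iterated over. So I have a new list that the for loop appends to. Inside the loop, I have a "temp row" that copies the row from the original "csv_rows".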

Why is "02 more text" always in the output file?

The original file is a CSV file. Converting the DataFrame with tolist() turns each row of the DataFrame into a list inside one larger list, "csv_rows".

import pandas as pd
import csv 
from csv import writer
dictionary = {
    "01-some text" : "replacement",
    "02-more text" : "replacement",
    "03-even more text" : "replacement",
    "01-text" : "replacement",
    "02-another lorem" : "replacement",
    "03-ipsum" : "replacement",
    "04-dolorem" : "replacement"
}
def append_list_as_row(file_name, list_of_elem):
    # Open file in append mode
    with open(file_name, 'a+', newline='', encoding='utf-8') as write_obj:
        # Create a writer object from csv module
        csv_writer = writer(write_obj)
        # Add contents of list as last row in the csv file
        csv_writer.writerow(list_of_elem)

def get_file_encoding(src_file_path):
    """
    Get the encoding type of a file
    :param src_file_path: file path
    :return: str - file encoding type
    """
    with open(src_file_path) as src_file:
        return src_file.encoding

data = 'ANQAR.csv'
my_encoding = str(get_file_encoding(data))
df = pd.read_csv(data, encoding=my_encoding)
csv_rows = df.values.tolist()
new_list = []
for key in dictionary:
    for row in csv_rows:
        temp_row = row
        if key in row:
            # find the index
            i = row.index(key)
            # replace value with new one
            temp_row[i] = dictionary[key]
        new_list.append(temp_row)

for row in new_list:
    append_list_as_row('newANQAR.csv', row)

You are not doing the sanitization correctly.
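
One detail in the question's loop is worth checking: `temp_row = row` does not copy the row, it only binds a second name to the same list, so writing through `temp_row` also changes the row inside `csv_rows`. This may not be the whole cause of the leftover entry, but it means the "temp row" is not the independent copy the question assumes. A minimal sketch of the difference, using made-up sample values:

```python
# Plain assignment binds a second name to the same list object.
row = ["01-some text", "other data"]
temp_row = row                      # alias, not a copy
temp_row[0] = "replacement"
alias_changed_original = row[0] == "replacement"

# list.copy() (or row[:]) creates an independent shallow copy.
row = ["01-some text", "other data"]
copied_row = row.copy()
copied_row[0] = "replacement"
copy_left_original_alone = row[0] == "01-some text"

print(alias_changed_original, copy_left_original_alone)
```

Both flags come out true: the aliased write reaches the original list, while the copied row can be changed without touching it.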

Try running the snippet below and see if it works for you:

replacements_map = {
    "01-some text": "replacement",
    "02-more text": "replacement",
    "03-even more text": "replacement",
    "01-text": "replacement",
    "02-another lorem": "replacement",
    "03-ipsum": "replacement",
    "04-dolorem": "replacement"
}
csv_rows = [["01-some text", " other data here"],
            ["wont need replacement", "02-another lorem"]]
sanitized_rows = []
for row in csv_rows:
    sanitized_rows.append(
        [(replacements_map[item] if (item in replacements_map) else item)
         for item
         in row]
    )
print(sanitized_rows)
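
The same idea can be wrapped in a small helper and combined with a single write step. This is only a sketch under a few assumptions: the function name `replace_cells` is hypothetical, `dict.get` is swapped in for the conditional expression above, the map is shortened to two entries, and an in-memory `io.StringIO` buffer stands in for the real output file so the example runs without touching disk.

```python
import csv
import io

def replace_cells(rows, replacements):
    """Return new rows where every cell that exactly matches a key is replaced."""
    # dict.get(cell, cell) falls back to the cell itself when there is no match.
    return [[replacements.get(cell, cell) for cell in row] for row in rows]

replacements_map = {
    "01-some text": "replacement",
    "02-another lorem": "replacement",
}
csv_rows = [["01-some text", " other data here"],
            ["wont need replacement", "02-another lorem"]]

sanitized_rows = replace_cells(csv_rows, replacements_map)

# Write all rows in one call instead of reopening the file for every row.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerows(sanitized_rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Because each row is visited exactly once and a new list is built for it, the input rows are left unchanged and no row is written more than once. Note that the match is exact, so a cell with extra whitespace or different punctuation than the key will not be replaced.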
