The same Python code block gives different output at different times



I want to create a dictionary of words. The dictionary looks like this:

words_meanings= {
"rekindle": "relight",
"pesky":"annoying", 
"verge": "border",
"maneuver": "activity",
"accountability":"responsibility",
}
keys_letter = []
for x in words_meanings:
    keys_letter.append(x)
print(keys_letter)

Output: rekindle, pesky, verge, maneuver, accountability

Here rekindle, pesky, verge, maneuver and accountability are the keys, and relight, annoying, border, activity and responsibility are the values.

Now I want to create a CSV file, and my code will take its input from that file.

The file looks like this:

rekindle | pesky   |  verge |  maneuver |  accountability
relight  | annoying|  border|  activity |  responsibility

So far I have used this code to load the file and read the data from it.

from google.colab import files
uploaded = files.upload()
import pandas as pd
data = pd.read_csv("words.csv")
data.head()
import csv
reader = csv.DictReader(open("words.csv", 'r'))
words_meanings = []
for line in reader:
    words_meanings.append(line)
print(words_meanings)

The output of print(words_meanings) is

[OrderedDict([('\ufeffrekindle', 'relight'), ('pesky', 'annoying')])]

which looks strange to me.

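keys_letter = []
for x in words_meanings:
    keys_letter.append(x)
print(keys_letter)

Here I create an empty list and want to append only the keys. But the output is [OrderedDict([('\ufeffrekindle', 'relight'), ('pesky', 'annoying')])]

I am confused. In the first code block the list contained only the keys, but now it contains both the keys and their values. How can I get around this?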

I suggest formatting the CSV with each key and its value on the same line, like this:

rekindle,relight
pesky,annoying
verge,border

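Then the code below will work.

file_name = "words.csv"  # path to the CSV file
words_meanings = {}
with open(file_name, 'r') as file:
    for line in file.readlines():
        # each line has the form "key,value"
        key, value = line.split(",")
        words_meanings[key] = value.rstrip("\n")

If you want a list of the keys: list_of_keys = list(words_meanings.keys())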

To add a key and a value to the file:

def add_values(key: str, value: str, file_name: str):
    # append a new "key,value" line to the end of the file
    with open(file_name, 'a') as file:
        file.write(f"\n{key},{value}")

key = input("Input the key you want to save: ")
value = input(f"Input the value you want to save to {key}:")
add_values(key, value, file_name)
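
As a rough check of the two snippets above, here is a minimal round-trip sketch. It assumes a temporary file instead of the real words.csv, and `load_words` is a hypothetical helper name that does not appear in the answer. Unlike the answer's version, this variant of `add_values` writes the newline after each pair rather than before it, so the file never starts with an empty line.

```python
import os
import tempfile


def add_values(key, value, file_name):
    # Variation on the answer's function: newline goes after the pair.
    with open(file_name, "a", encoding="utf-8") as file:
        file.write(f"{key},{value}\n")


def load_words(file_name):
    # Hypothetical helper: read "key,value" lines into a dictionary.
    words = {}
    with open(file_name, "r", encoding="utf-8") as file:
        for line in file:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            key, value = line.split(",", 1)
            words[key] = value
    return words


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "words.csv")
    add_values("rekindle", "relight", path)
    add_values("pesky", "annoying", path)
    loaded = load_words(path)

print(loaded)  # {'rekindle': 'relight', 'pesky': 'annoying'}
```

Splitting with `split(",", 1)` keeps any further commas inside the value, which may or may not be what you want for your data.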

You run the same code block, but you run it on different objects, and that gives different results.


First you use a plain dictionary (check type(words_meanings)):

words_meanings = {
"rekindle": "relight",
"pesky":"annoying", 
"verge": "border",
"maneuver": "activity",
"accountability":"responsibility",
}

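The for-loop gets the keys from this dictionary.

You can get the same result with
keys_letter = list(words_meanings.keys())

or

keys_letter = list(words_meanings)

After that you use a list that contains a single dictionary (check type(words_meanings)):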

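words_meanings = [OrderedDict([('\ufeffrekindle', 'relight'), ('pesky', 'annoying')])]

Here the for-loop yields the elements of the list, not the keys of the dictionary inside the list. So you simply move the whole dictionary from one list to another.

You can get the same result with
keys_letter = words_meanings.copy()

or, equivalently,

keys_letter = list(words_meanings)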

from collections import OrderedDict
words_meanings = {
"rekindle": "relight",
"pesky":"annoying", 
"verge": "border",
"maneuver": "activity",
"accountability":"responsibility",
}
print(type(words_meanings))
keys_letter = []
for x in words_meanings:
    keys_letter.append(x)
print(keys_letter)
#keys_letter = list(words_meanings.keys())
keys_letter = list(words_meanings)
print(keys_letter)

words_meanings = [OrderedDict([('\ufeffrekindle', 'relight'), ('pesky', 'annoying')])]
print(type(words_meanings))
keys_letter = []
for x in words_meanings:
    keys_letter.append(x)
print(keys_letter)
#keys_letter = words_meanings.copy()
keys_letter = list(words_meanings)
print(keys_letter)
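
To actually get only the keys out of the list-of-dictionaries case, one option is to iterate over the dictionaries inside the list. This is a minimal sketch; `rows`, `keys_from_first_row` and `keys_from_all_rows` are illustrative names, and the BOM character is left out of the sample data for readability.

```python
from collections import OrderedDict

# Same shape as what csv.DictReader produced in the question: a list of dicts.
rows = [OrderedDict([("rekindle", "relight"), ("pesky", "annoying")])]

# Iterating over the list yields whole dictionaries,
# so take the keys of each dictionary instead.
keys_from_first_row = list(rows[0])
keys_from_all_rows = [key for row in rows for key in row]

print(keys_from_first_row)  # ['rekindle', 'pesky']
print(keys_from_all_rows)   # ['rekindle', 'pesky']
```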

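The csv module's default field delimiter is the comma. Your CSV file uses the pipe (bar) symbol |, and the fields also appear to be fixed width. So you need to specify | as the delimiter when you create the CSV reader.

Also, your CSV file is encoded as big-endian UTF-16 Unicode text (UTF-16-BE). The file contains a byte order mark (BOM), but Python is not stripping it, which is why the string '\ufeffrekindle' contains the FEFF UTF-16-BE BOM. This can be handled by specifying encoding='utf16' when opening the file.

import csv
with open('words.csv', newline='', encoding='utf-16') as f:
    reader = csv.DictReader(f, delimiter='|', skipinitialspace=True)
    for row in reader:
        print(row)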

Running this on your CSV file produces the following:

{'rekindle ': 'relight ', 'pesky ': 'annoying', 'verge ': 'border', 'maneuver ': 'activity ', 'accountability': 'responsibility'}

Note the trailing whitespace after the keys and values. skipinitialspace=True removes the leading whitespace, but there is no option to remove trailing whitespace. This can be fixed by exporting the CSV file from Excel without specifying a field width. If that is not possible, you can fix it by preprocessing the file with a generator:

import csv

def preprocess_csv(f, delimiter=','):
    # assumes that fields cannot contain embedded new lines
    for line in f:
        yield delimiter.join(field.strip() for field in line.split(delimiter))

with open('words.csv', newline='', encoding='utf-16') as f:
    reader = csv.DictReader(preprocess_csv(f, '|'), delimiter='|', skipinitialspace=True)
    for row in reader:
        print(row)

Now the output has stripped keys and values:

{'rekindle': 'relight', 'pesky': 'annoying', 'verge': 'border', 'maneuver': 'activity', 'accountability': 'responsibility'}
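
If all you need is a single dictionary, another possibility is to strip the fields after parsing instead of before. The sketch below assumes the two-row layout shown in the question and uses an in-memory `io.StringIO` sample in place of the real words.csv; `header` and `values` are illustrative names.

```python
import csv
import io

# In-memory stand-in for the pipe-delimited, padded file from the question.
sample = io.StringIO(
    "rekindle | pesky   |  verge |  maneuver |  accountability\n"
    "relight  | annoying|  border|  activity |  responsibility\n"
)

reader = csv.reader(sample, delimiter="|")
header = [field.strip() for field in next(reader)]  # first row: the words
values = [field.strip() for field in next(reader)]  # second row: the meanings

words_meanings = dict(zip(header, values))
print(words_meanings)
```

This gives one plain dictionary rather than a list of rows, so a loop over it yields the keys directly.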

I found that nobody could quite answer this question for me, so in the end I am posting my own answer here. I hope it helps someone else.

import csv
file_name = "words.csv"
words_meanings = {}
with open(file_name, newline='', encoding='utf-8-sig') as file:
    for line in file.readlines():
        key, value = line.split(",")
        words_meanings[key] = value.rstrip("\n")
print(words_meanings)

This is the code that loads the CSV file into a dictionary. Enjoy!!
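
To see why the choice of encoding matters for the stray '\ufeff', here is a small sketch that works on in-memory bytes rather than the real file, so the sample bytes are an assumption and not the contents of words.csv. It shows that some codecs keep the BOM as a character and others strip it.

```python
import codecs

# UTF-8 bytes with a BOM in front, as some tools write them.
raw_utf8 = codecs.BOM_UTF8 + "rekindle".encode("utf-8")
kept_utf8 = raw_utf8.decode("utf-8")          # BOM stays as '\ufeff'
stripped_utf8 = raw_utf8.decode("utf-8-sig")  # BOM is removed

# Big-endian UTF-16 bytes with a BOM in front.
raw_utf16 = codecs.BOM_UTF16_BE + "rekindle".encode("utf-16-be")
kept_utf16 = raw_utf16.decode("utf-16-be")    # BOM stays as '\ufeff'
stripped_utf16 = raw_utf16.decode("utf-16")   # BOM is read and removed

print(repr(kept_utf8), repr(stripped_utf8))
print(repr(kept_utf16), repr(stripped_utf16))
```

The same encoding names can be passed to `open(..., encoding=...)`, so picking the one that matches how the file was saved is what makes the first key come out clean.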
