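I have four lists of special-character strings, e.g. '™'. I store them in a dictionary as the values of keys 1 to 4.
I am trying to loop over all the items in each of the dictionary's lists and update each value to its UTF-8 encoded equivalent. What I have so far is: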
li1 = ['€','‚','ƒ','„','…','†','ˆ','‰']
li2 = ['Š','‹','Œ','Ž']
li3 = ['‘','’','“','”','•','–','—','˜','™']
li4 = ['š','›','œ','ž','Ÿ']
uni_dic = {
'1':li1,
'2':li2,
'3':li3,
'4':li4
}
for key in uni_dic:
    for val in uni_dic[key]:
        uni_dic.update({key,val.encode('utf-8')})
This raises 'ValueError: dictionary update sequence element #0 has length 1; 2 is required'
Take out the last three lines and write:
for key, values in uni_dic.items():
    uni_dic[key] = [string.encode('utf-8') for string in values]
Or in one line:
uni_dic = {key: [string.encode('utf-8') for string in uni_dic[key]] for key in uni_dic}
In that case, you can build a new list for each key, like this:
for key in uni_dic:
    uni_dic[key] = [val.encode('utf-8') for val in uni_dic[key]]
If you want to do it in place:
for key in uni_dic:
    for idx, val in enumerate(uni_dic[key]):
        uni_dic[key][idx] = val.encode('utf-8')
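One practical difference between the two variants above can be sketched as follows: rebuilding the list leaves the original list object untouched, while the in-place loop changes it, because the dictionary holds a reference to that same list. The shortened lists and the names rebuilt and in_place are made up for illustration and are not part of the question.

```python
# Shortened stand-ins for the lists in the question (illustrative only).
li1 = ['€', '™']
li2 = ['€', '™']

rebuilt = {'1': li1}
in_place = {'1': li2}

# Variant 1: assign a brand-new list to the key; li1 itself is not modified.
for key in rebuilt:
    rebuilt[key] = [val.encode('utf-8') for val in rebuilt[key]]

# Variant 2: overwrite each slot of the existing list; the dictionary value
# and li2 are the same object, so li2 changes as well.
for key in in_place:
    for idx, val in enumerate(in_place[key]):
        in_place[key][idx] = val.encode('utf-8')

print(li1)  # still a list of str
print(li2)  # now a list of bytes
```

Which behavior is preferable depends on whether anything else still needs the original strings.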
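You wrote {key,val.encode('utf-8')} (which is actually a set) rather than {key: val.encode('utf-8')}. However, using update to set a single key is a bit silly anyway, and even with the colon fixed, your loop over the list would just overwrite the value again and again with each element in turn (ending up with only the last item).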
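A minimal sketch of the two points above, using a shortened one-key dictionary made up for illustration: braces with a comma build a set, which update rejects with a ValueError, and with a colon the call runs but each pass replaces the whole value.

```python
uni_dic = {'1': ['€', '™']}  # shortened example data, not the full lists
key, val = '1', '€'

# Comma instead of colon: this is a set with two elements, not a dict.
as_set = {key, val.encode('utf-8')}
print(type(as_set).__name__)

# dict.update() expects a mapping or an iterable of (key, value) pairs,
# so a set of single items is rejected and the dictionary is left unchanged.
error_seen = None
try:
    uni_dic.update(as_set)
except ValueError as exc:
    error_seen = exc
    print('ValueError:', exc)

# With a colon the call works, but every pass overwrites the value stored
# under the key, so only the encoding of the last item survives.
for k in uni_dic:
    for v in uni_dic[k]:
        uni_dic.update({k: v.encode('utf-8')})

print(uni_dic)
```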
Instead, build a new list with the encoded values and assign the whole list to uni_dic[key]:
for key in uni_dic:
    encoded_vals = []
    for val in uni_dic[key]:
        encoded_vals.append(val.encode('utf-8'))
    uni_dic[key] = encoded_vals
Using a list comprehension to build encoded_vals makes this more concise:
for key in uni_dic:
    uni_dic[key] = [val.encode('utf-8') for val in uni_dic[key]]
Using a dictionary comprehension with dict.items() makes it simpler still:
uni_dic = {k: [val.encode('utf-8') for val in v] for k, v in uni_dic.items()}
A one-line solution to your problem is:
uni_dic = {key:[val.encode('utf-8') for val in charList] for key, charList in uni_dic.items()}
To get the same result with minimal changes to the code in the question, you could do this:
li1 = ['€','‚','ƒ','„','…','†','ˆ','‰']
li2 = ['Š','‹','Œ','Ž']
li3 = ['‘','’','“','”','•','–','—','˜','™']
li4 = ['š','›','œ','ž','Ÿ']
uni_dic = {
'1':li1,
'2':li2,
'3':li3,
'4':li4
}
for key in uni_dic:
    charList = uni_dic[key]
    for i, val in enumerate(charList):
        charList[i] = val.encode('utf-8')
for charList in uni_dic.values():
    print(charList)
Output:
[b'\xe2\x82\xac', b'\xe2\x80\x9a', b'\xc6\x92', b'\xe2\x80\x9e', b'\xe2\x80\xa6', b'\xe2\x80\xa0', b'\xcb\x86', b'\xe2\x80\xb0']
[b'\xc5\xa0', b'\xe2\x80\xb9', b'\xc5\x92', b'\xc5\xbd']
[b'\xe2\x80\x98', b'\xe2\x80\x99', b'\xe2\x80\x9c', b'\xe2\x80\x9d', b'\xe2\x80\xa2', b'\xe2\x80\x93', b'\xe2\x80\x94', b'\xcb\x9c', b'\xe2\x84\xa2']
[b'\xc5\xa1', b'\xe2\x80\xba', b'\xc5\x93', b'\xc5\xbe', b'\xc5\xb8']
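As a sanity check on output like this, the bytes can be decoded again and compared with the original characters. This is only a sketch with a shortened dictionary; the names original, encoded and decoded are illustrative and do not appear in the question.

```python
original = {'1': ['€', 'Š'], '3': ['™']}  # shortened example data

# Encode every string, then decode every bytes object back to str.
encoded = {k: [s.encode('utf-8') for s in v] for k, v in original.items()}
decoded = {k: [b.decode('utf-8') for b in v] for k, v in encoded.items()}

print(encoded['1'])         # a list of bytes objects
print(decoded == original)  # the round trip gives back the same strings
```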