I have a dictionary like this:
dict = {
key1: <http://www.link1.org/abc/f><http://www.anotherlink.com/ght/y2>,
key2: <http://www.link1.org/abc/f><http://www.anotherOneLink.en/ttta/6jk>,
key3: <http://www.somenewlink.xxw/o192/ggh><http://www.link4.com/jklu/wepdo9>,
key4: <http://www.linkkk33.com/fgkjc><http://www.linknew2.com/poii/334hsj>,
...
}
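Goal:
I want to split the two links in each value of the dictionary, and then count how many times each first link occurs across the whole dictionary. Like this: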
new_dict = {
key1: [<http://www.link1.org/abc/f>, <http://www.anotherlink.com/ght/y2>],
key2: [<http://www.link1.org/abc/f>, <http://www.anotherOneLink.en/ttta/6jk>],
key3: [<http://www.somenewlink.xxw/o192/ggh>, <http://www.link4.com/jklu/wepdo9>],
key4: [<http://www.linkkk33.com/fgkjc>, <http://www.linknew2.com/poii/334hsj>],
...
}
first_value_count = {
<http://www.link1.org/abc/f> : 2,
<http://www.somenewlink.xxw/o192/ggh> : 1,
<http://www.linkkk33.com/fgkjc> : 1,
....
}
My code:
To split the values I tried this, but it doesn't work:
new_dict = {k: v[0].split(">") for k, v in dict.items()}
To count the occurrences of the values in my dictionary:
from collections import Counter
all_dictionary_values = []
for v[0] in new_dict.values():
all_dictionary_values.append(x)
count = Counter(all_dictionary_values)
I have a very large dictionary (1M+ keys). Is this the fastest way to count the occurrences of all the values in the dictionary?
I tried your code, but it doesn't work on my side, so I changed it as follows:
dict = {
'key1' : "<http://www.link1.org/abc/f><http://www.anotherlink.com/ght/y2>",
'key2': "<http://www.link1.org/abc/f><http://www.anotherOneLink.en/ttta/6jk>",
'key3' : "<http://www.somenewlink.xxw/o192/ggh><http://www.link4.com/jklu/wepdo9>",
'key4': "<http://www.linkkk33.com/fgkjc><http://www.linknew2.com/poii/334hsj>",
}
new_dict = {k: v.split("><") for k, v in dict.items()}
new_dict
{'key1': ['<http://www.link1.org/abc/f', 'http://www.anotherlink.com/ght/y2>'],
'key2': ['<http://www.link1.org/abc/f',
'http://www.anotherOneLink.en/ttta/6jk>'],
'key3': ['<http://www.somenewlink.xxw/o192/ggh',
'http://www.link4.com/jklu/wepdo9>'],
'key4': ['<http://www.linkkk33.com/fgkjc',
'http://www.linknew2.com/poii/334hsj>']}
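Note that splitting on "><" drops the closing ">" from the first link and the opening "<" from the second. If the brackets should stay on both links, as in the desired new_dict from the question, one possible approach is a regular expression that captures each <...> group. This is only a sketch: the names `links` and `LINK_RE` are hypothetical, and it assumes every value consists of links wrapped in angle brackets with no ">" inside a link.

```python
import re

# Sample data copied from the example above; `links` is a hypothetical name
# chosen to avoid shadowing the built-in `dict`.
links = {
    'key1': "<http://www.link1.org/abc/f><http://www.anotherlink.com/ght/y2>",
    'key2': "<http://www.link1.org/abc/f><http://www.anotherOneLink.en/ttta/6jk>",
    'key3': "<http://www.somenewlink.xxw/o192/ggh><http://www.link4.com/jklu/wepdo9>",
    'key4': "<http://www.linkkk33.com/fgkjc><http://www.linknew2.com/poii/334hsj>",
}

# Match "<", then any run of characters that are not ">", then ">".
LINK_RE = re.compile(r"<[^>]*>")

# findall returns every bracketed link in the value, brackets included.
new_dict = {k: LINK_RE.findall(v) for k, v in links.items()}
```

With this version, new_dict['key1'][0] already ends in ">", so nothing has to be re-appended before counting.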
Now we add the counter:
from collections import Counter
all_dictionary_values = []
for v in new_dict.values():
all_dictionary_values.append(v[0]+">")
count = Counter(all_dictionary_values)
count
Output:
Counter({'<http://www.link1.org/abc/f>': 2,
'<http://www.somenewlink.xxw/o192/ggh>': 1,
'<http://www.linkkk33.com/fgkjc>': 1})
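On the question of speed with 1M+ keys: if only the counts of the first link are needed, it may be possible to skip both the intermediate new_dict and the temporary list by feeding a generator straight into Counter. The following is a sketch rather than a benchmarked answer. The helper names `first_link` and `count_first_links` are made up for illustration, and it assumes every value starts with a link that ends at the first ">".

```python
from collections import Counter


def first_link(value):
    """Return the first '<...>' link of a value, closing bracket included."""
    # partition splits at the first ">" only, so the rest of the string
    # is never scanned or copied into a list.
    head, sep, _rest = value.partition(">")
    return head + sep


def count_first_links(mapping):
    """Count how often each first link occurs across all values."""
    # The generator yields one link at a time; no intermediate list is built.
    return Counter(first_link(v) for v in mapping.values())


# Sample data copied from the example above.
sample = {
    'key1': "<http://www.link1.org/abc/f><http://www.anotherlink.com/ght/y2>",
    'key2': "<http://www.link1.org/abc/f><http://www.anotherOneLink.en/ttta/6jk>",
    'key3': "<http://www.somenewlink.xxw/o192/ggh><http://www.link4.com/jklu/wepdo9>",
    'key4': "<http://www.linkkk33.com/fgkjc><http://www.linknew2.com/poii/334hsj>",
}

counts = count_first_links(sample)
```

Whether this is actually faster on your data is worth measuring, for example with timeit, since the result depends on the Python version and the length of the values.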