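I have a dictionary with many key/value pairs.
The keys are dates and the values are top-level domains (TLDs).
I want to write the dictionary to a text file so that identical values are counted and sorted alphabetically, but only within the same key.
For example: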
*key: value1:count value2:count*
date1: au:4 be:12 com:44
date2: az:4 com:14 net:5
Code:
import re

my_dict = {}
with open('access_logshort.txt','rU') as f:
    for line in f:
        list1 = re.search(r'(?P<Date>[0-9]{2}/[a-zA-Z]{3}/[0-9]{4})(.+)(GET|POST)\s(http://|https://)([a-zA-Z.]+)(\.)(?P<tld>[a-zA-Z]+)(/).+?"\s200',line)
        if list1 != None:
            print list1.groupdict()
            # group 1 is the date, group 7 is the TLD
            one_tuple = list1.group(1,7)
            my_dict[one_tuple[0]]=one_tuple[1]
Output of print my_dict:
{'09/Mar/2004': 'hu'}
{'09/Mar/2004': 'hu'}
{'09/Mar/2004': 'com'}
{'09/Mar/2004': 'ru'}
{'09/Mar/2004': 'ru'}
{'09/Mar/2004': 'com'}
This should work for your case.
from collections import defaultdict
from dateutil.parser import parse
import csv
import re

# date -> (domain -> count)
data = defaultdict(lambda: defaultdict(int))
with open('access_logshort.txt','rU') as f:
    for line in f:
        list1 = re.search(r'(?P<Date>[0-9]{2}/[a-zA-Z]{3}/[0-9]{4})(.+)(GET|POST)\s(http://|https://)([a-zA-Z.]+)(\.)(?P<tld>[a-zA-Z]+)(/).+?"\s200',line)
        if list1 is not None:
            date, domain = list1.group(1,7)
            data[date.lower()][domain.lower()] += 1

with open('my_data.csv', 'wb') as ofile:
    # add delimiter='\t' to the argument list of csv.writer if you want
    # tsv rather than csv
    writer = csv.writer(ofile)
    # dates in chronological order, domains in alphabetical order
    for key, value in sorted(data.iteritems(), key=lambda x: parse(x[0])):
        domains = sorted(value.iteritems())
        writer.writerow([key] + ['{}:{}'.format(*d) for d in domains])
Output:
10/Mar/2004,com:2,hu:2,ru:2
09/Mar/2004,com:2,hu:2,ru:2
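
If you are on Python 3 and would rather get the "date: tld:count" layout from the question, the same idea can be sketched roughly as follows. This is only a sketch under a few assumptions: the function names count_tlds and format_counts and the sample log lines are made up for illustration, collections.Counter stands in for the nested defaultdict, and datetime.strptime replaces dateutil (which assumes the month abbreviations are English and the locale agrees).

```python
import re
from collections import Counter, defaultdict
from datetime import datetime

# Same idea as the pattern above, with only the date and TLD captured.
LOG_PATTERN = re.compile(
    r'(?P<date>[0-9]{2}/[a-zA-Z]{3}/[0-9]{4}).+(?:GET|POST)\s'
    r'https?://[a-zA-Z.]+\.(?P<tld>[a-zA-Z]+)/.+?"\s200'
)


def count_tlds(lines):
    """Return {date: Counter({tld: count})} for the lines that match."""
    counts = defaultdict(Counter)
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match is not None:
            counts[match.group('date')][match.group('tld').lower()] += 1
    return counts


def format_counts(counts):
    """Render one 'date: tld:count ...' line per date, oldest date first."""
    # '%b' relies on English month abbreviations in the current locale.
    by_date = sorted(counts, key=lambda d: datetime.strptime(d, '%d/%b/%Y'))
    rows = []
    for date in by_date:
        pairs = sorted(counts[date].items())  # alphabetical by TLD
        rows.append('{}: {}'.format(
            date, ' '.join('{}:{}'.format(tld, n) for tld, n in pairs)))
    return rows


# Hypothetical log lines, invented for this example only.
sample = [
    '1.1.1.1 - - [09/Mar/2004:22:00:00 +0100] "GET http://www.example.hu/index.html HTTP/1.0" 200 100',
    '1.1.1.1 - - [09/Mar/2004:22:01:00 +0100] "GET http://www.example.hu/a.html HTTP/1.0" 200 100',
    '1.1.1.2 - - [09/Mar/2004:22:02:00 +0100] "GET http://example.com/ HTTP/1.0" 200 57',
    '1.1.1.3 - - [10/Mar/2004:08:00:00 +0100] "POST https://mail.example.ru/x HTTP/1.1" 200 12',
    '1.1.1.4 - - [10/Mar/2004:08:05:00 +0100] "GET http://www.example.com/b HTTP/1.0" 404 0',
]

for row in format_counts(count_tlds(sample)):
    print(row)
# 09/Mar/2004: com:1 hu:2
# 10/Mar/2004: ru:1
```

To write the rows to a text file instead of printing them, open the file with open('my_data.txt', 'w') and write each row followed by a newline. The 404 line in the sample is skipped because the pattern only accepts status 200.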