Parsing, aggregating and sorting a text file in Python

I have a file named "names.txt" with the following contents:

{"1":[1988, "Anil 4"], "2":[2000, "Chris 4"], "3":[1988, "Rahul 1"],
"4":[2001, "Kechit 3"], "5":[2000, "Phil 3"], "6":[2001, "Ravi 4"],
"7":[1988, "Ramu 3"], "8":[1988, "Raheem 5"], "9":[1988, "Kranti 2"],
"10":[2000, "Wayne 1"], "11":[2000, "Javier 2"], "12":[2000, "Juan 2"],
"13":[2001, "Gaston 2"], "14":[2001, "Diego 5"], "15":[2001, "Fernando 1"]}

Problem statement: the file "names.txt" contains student records in the format -

{"number": [year of birth, "name rank"]}

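Parse this file, segregate the records by year, and then sort the names by rank. Segregate first, then sort. The output should be in the format -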

{year : [Names of students in sorted order according to rank]}

So the expected output is -

{1988:["Rahul 1","Kranti 2","Ramu 3","Anil 4","Raheem 5"],
2000:["Wayne 1","Javier 2","Juan 2","Phil 3","Chris 4"],
2001:["Fernando 1","Gaston 2","Kechit 3","Ravi 4","Diego 5"]}

First, how do I store the file contents in a dictionary object? Then how do I group by year and sort by rank? How can this be achieved in Python?

Thanks.

This is quite simple :)

#!/usr/bin/python
# Program: Parsing, Aggregating & Sorting text file in Python
# Developed By: Pratik Patil
# Date: 22-08-2015
import json
import pprint

# Open the file and load its contents into a dictionary object.
# The data is valid JSON and spans several lines, so parse the whole file
# with json.load rather than calling eval() on the first line only.
with open("names.txt", "r") as names_file:
    file_contents = json.load(names_file)
# Extract all [year, "name rank"] lists from the file contents
file_contents_values = list(file_contents.values())
# Extract the unique years and segregate the names by year
years = sorted(set(v[0] for v in file_contents_values))
names_grouped_by_year = [[v[1] for v in file_contents_values if v[0] == y] for y in years]
# Create the final dictionary by combining the respective keys and values
output = dict(zip(years, names_grouped_by_year))
# Sort each list by rank (the integer after the name)
for name_rank in output.values():
    name_rank.sort(key=lambda x: int(x.split()[1]))
# Print the output in ascending order of keys
pprint.pprint(output)

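The segregation can be done in a simple loop using collections.defaultdict. A second loop then goes over the student lists and sorts each one by the integer value of the last part of the student entry. pprint() prints the desired output if the defaultdict is converted to a regular dict: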

#!/usr/bin/env python
from __future__ import absolute_import, division, print_function
import json
from collections import defaultdict
from pprint import pprint

def main():
    # 'test.json' is assumed to hold the same data as "names.txt".
    with open('test.json') as student_file:
        id2student = json.load(student_file)
    #
    # Segregate by year.
    #
    year2students = defaultdict(list)
    # .values() works on both Python 2 and 3 (itervalues() is Python 2 only).
    for year, student_and_rank in id2student.values():
        year2students[year].append(student_and_rank)
    #
    # Sort by rank.
    #
    for students in year2students.values():
        students.sort(key=lambda s: int(s.rsplit(' ', 1)[-1]))
    pprint(dict(year2students))

if __name__ == '__main__':
    main()
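
The two approaches above can be condensed into one small function for Python 3. This is only a minimal sketch, not either answerer's code: the function name group_and_sort and the inline sample string are assumptions made for illustration. It relies on list.sort being stable and on json.loads keeping the key order of the input, so students with the same rank (such as "Javier 2" and "Juan 2") stay in file order.

```python
import json
from collections import defaultdict


def group_and_sort(records):
    """Group "name rank" strings by year, then sort each group by rank.

    `records` maps an id to [year, "name rank"], as in names.txt.
    """
    by_year = defaultdict(list)
    for year, name_rank in records.values():
        by_year[year].append(name_rank)
    for names in by_year.values():
        # The rank is the integer after the last space; the sort is stable,
        # so entries with equal rank keep their original relative order.
        names.sort(key=lambda s: int(s.rsplit(" ", 1)[-1]))
    return dict(by_year)


# Hypothetical inline sample: a subset of the records from the question.
sample = (
    '{"2": [2000, "Chris 4"], "10": [2000, "Wayne 1"], '
    '"11": [2000, "Javier 2"], "12": [2000, "Juan 2"], '
    '"3": [1988, "Rahul 1"], "1": [1988, "Anil 4"]}'
)
result = group_and_sort(json.loads(sample))
print(result)
```

To run it against the real file, replace json.loads(sample) with json.load on the opened "names.txt". If ties should be broken alphabetically rather than by file order, the sort key could return a (rank, name) tuple instead.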
