Creating a csv file from unequal-length lists of sports matches



I have a list of sports matches:

table = [['Volleyball', ' Europe', 'European Championships', 'Today 17:00', 'Moldova - Cyprus', '2.00', '1.72'],
['Handball', ' Slovenia', '1. NLB Liga', 'Today 17:00', 'Krka - Slovenj Gradec', '2.05', '1.98'],
['American Football', ' USA', 'NCAA', 'Today 17:00', 'Marshall - Eastern Kentucky', '1.90', '1.90', 'Today 20:00', 'Army - Middle Tennessee St', '2.01', '1.99', 'Tomorrow 20:00', 'West Virginia - Florida State', '2.50', '1.50'],
['Soccer', ' World', 'Club Friendly', 'Today 17:00', 'UE Sants (Esp) - CE Europa (Esp)', '1.84', '1.88', 'Today 17:00', 'Spain - France', '1.20', '2.80'],
['Tennis', ' USA', 'ATP US Open', 'Today 17:30', 'Berrettini M. - Ruud C.', '1.81', '2.02']]

The columns are:

sport  country  competition  date  match  odd_1  odd_2

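In each nested list, the first 3 elements are always sport, country, competition. After these 3 elements come date, match, odd_1, odd_2, one or more times (meaning each nested list can hold several matches for the same sport and competition).

I want to create a csv from this data, but some nested lists contain more than one match: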

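import csv

with open('filename.csv', 'a', encoding='utf_8_sig', newline='') as csv_file:
    w = csv.writer(csv_file, lineterminator='\n')
    # column names as listed above
    header = ['sport', 'country', 'competition', 'date', 'match', 'odd_1', 'odd_2']
    w.writerow(header)
    for row in table:
        w.writerow(row)  # rows holding several matches come out longer than the header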

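AFAIK all csv processing libraries expect every row of a csv file to have a fixed number of elements. So even if you write the rows as they are, the result will not be a proper csv, because the rows have different numbers of elements.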
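To see what such a ragged file looks like to a reader, here is a minimal sketch using only the standard library. The in-memory buffer and the two sample rows are illustrative choices, not part of the question. `csv.DictReader` collects any fields beyond the header under its `restkey`, which defaults to `None`, so the second match is not mapped to the named columns.

```python
import csv
import io

header = ['sport', 'country', 'competition', 'date', 'match', 'odd_1', 'odd_2']
ragged = [
    ['Tennis', ' USA', 'ATP US Open', 'Today 17:30', 'Berrettini M. - Ruud C.', '1.81', '2.02'],
    ['Soccer', ' World', 'Club Friendly', 'Today 17:00', 'UE Sants (Esp) - CE Europa (Esp)', '1.84', '1.88',
     'Today 17:00', 'Spain - France', '1.20', '2.80'],
]

# Write the rows as they are into an in-memory file (illustrative stand-in for a real file).
buf = io.StringIO()
w = csv.writer(buf, lineterminator='\n')
w.writerow(header)
w.writerows(ragged)

# Read them back: fields past the 7 header columns are gathered under the key None.
buf.seek(0)
records = list(csv.DictReader(buf))
extra = records[1][None]
print(extra)
```

The writer does not complain, but the extra match only survives as an unnamed leftover list, which is why one row per match is the safer layout.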

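If you are fine with storing the [date, match, odd1, odd2 ...] list as a single element in the csv, you can do it as follows (of course, you will need to split that column's data again when you load the csv):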

# python3
import csv

with open('filename.csv', 'w', newline='') as fp:
    dr = csv.DictWriter(fp, ['sport', 'country', 'competition', 'matches'])
    dr.writeheader()
    for row in table:
        # the first three fields are fixed; everything after them goes into one column
        sport, country, competition, *rest = row
        dr.writerow({'sport': sport, 'country': country, 'competition': competition, 'matches': rest})

You can read the csv back like this:

with open('filename.csv', newline='') as fp:
    reader = csv.DictReader(fp)
    for row in reader:
        print(row['sport'], row['matches'])

However, row['matches'] is now a string (not a list), so if you want to access the individual values you have to convert it back into a list.

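You can use the ast module for the string -> list conversion. However, I would suggest looking into pickling or some other representation for this kind of data (depending on your use case, of course).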

import ast
import csv

with open('filename.csv', newline='') as fp:
    reader = csv.DictReader(fp)
    for row in reader:
        print(ast.literal_eval(row['matches']))  # prints a list, not a str
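As one possible "other representation", the matches column could hold JSON instead of the printed form of a Python list. This is a swapped-in suggestion rather than something the answer prescribes; the sketch below uses an in-memory buffer and two sample rows purely for illustration.

```python
import csv
import io
import json

table = [
    ['Tennis', ' USA', 'ATP US Open', 'Today 17:30', 'Berrettini M. - Ruud C.', '1.81', '2.02'],
    ['Soccer', ' World', 'Club Friendly', 'Today 17:00', 'UE Sants (Esp) - CE Europa (Esp)', '1.84', '1.88',
     'Today 17:00', 'Spain - France', '1.20', '2.80'],
]

# Write: serialize the variable-length tail as a JSON string in a single column.
buf = io.StringIO()
dw = csv.DictWriter(buf, ['sport', 'country', 'competition', 'matches'], lineterminator='\n')
dw.writeheader()
for sport, country, competition, *rest in table:
    dw.writerow({'sport': sport, 'country': country,
                 'competition': competition, 'matches': json.dumps(rest)})

# Read: json.loads turns the column back into a list of strings.
buf.seek(0)
loaded = [json.loads(row['matches']) for row in csv.DictReader(buf)]
print(loaded[1])
```

JSON has the advantage that tools other than Python can parse the column as well.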

I found a way to slice the lists using more_itertools:

import more_itertools

rows = []
for i in table:
    m = i[:3]  # sport, country, competition
    n = i[3:]  # one or more groups of date, match, odd_1, odd_2
    # split the tail into groups of 4 and emit one 7-column row per match
    for k in more_itertools.chunked(n, 4):
        rows.append(m + k)

It works well.
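If installing more_itertools is not an option, the same flattening can be sketched with plain slicing from the standard library. The names `flatten`, `HEADER` and `GROUP` below are hypothetical helpers introduced for this example, and the group size of 4 assumes the date, match, odd_1, odd_2 layout described in the question.

```python
import csv
import io

HEADER = ['sport', 'country', 'competition', 'date', 'match', 'odd_1', 'odd_2']
GROUP = 4  # date, match, odd_1, odd_2


def flatten(table):
    """Return one 7-column row per match."""
    rows = []
    for entry in table:
        prefix, rest = entry[:3], entry[3:]
        if len(rest) % GROUP:
            raise ValueError(f'incomplete match group in {entry!r}')
        for start in range(0, len(rest), GROUP):
            rows.append(prefix + rest[start:start + GROUP])
    return rows


sample = [
    ['Tennis', ' USA', 'ATP US Open', 'Today 17:30', 'Berrettini M. - Ruud C.', '1.81', '2.02'],
    ['Soccer', ' World', 'Club Friendly', 'Today 17:00', 'UE Sants (Esp) - CE Europa (Esp)', '1.84', '1.88',
     'Today 17:00', 'Spain - France', '1.20', '2.80'],
]

flat = flatten(sample)

# Write header plus flattened rows; swap io.StringIO for open(..., newline='') to write a real file.
buf = io.StringIO()
writer = csv.writer(buf, lineterminator='\n')
writer.writerow(HEADER)
writer.writerows(flat)
print(buf.getvalue())
```

The length check is there so that a truncated entry raises an error instead of silently producing a short row.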
