I'm not sure how to do this.
I have a CSV file with the following contents:
1 | A | B
1 | C | D
2 | E | F
3 | G |H
3 | I | J
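My file is pipe-delimited, not comma-delimited. What I want is: "if rows share the same first element in the CSV, merge their contents under it".
Desired output: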
<h1>1</h1>
<h2>A</h2>
B
<h2>C</h2>
D
<h1>2</h1>
<h2>E</h2>
F
<h1>3</h1>
<h2>G</h2>
H
<h2>I</h2>
J
My attempt (pseudocode):
for row in csv_reader:
    for each group of rows where the first element is the same (in this case "1"):
        print:
            <h1>row[0]</h1>
            <h2>row[1]</h2>
            row[2]
            <h2>C (this is from the second row)</h2>
            D (this is from the second row)
Assuming the file data.csv looks like this:
1 | A | B
1 | C | D
2 | E | F
3 | G |H
3 | I | J
You can use pandas and collections.defaultdict:
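import pandas as pd
from collections import defaultdict

# Read the pipe-delimited file; it has no header row
data = pd.read_csv('data.csv', header=None, sep='|')

# Strip the whitespace around the pipes from every text column
object_col_names = data.select_dtypes('O').columns
data[object_col_names] = data[object_col_names].apply(
    lambda x: x.str.strip()
)

# Collect the (title, body) values of each row under its first-column key
d = defaultdict(list)
for i in data.index:
    d[data[0].loc[i]].extend(data[[1, 2]].loc[i].tolist())

for i in sorted(d.keys()):
    print(f'<h1>{i}</h1>')
    # Values alternate: even positions are titles, odd positions are bodies
    for j, k in enumerate(d[i]):
        if j % 2 == 0:
            print(f'<h2>{k}</h2>')
        else:
            print(k)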
Output:
<h1>1</h1>
<h2>A</h2>
B
<h2>C</h2>
D
<h1>2</h1>
<h2>E</h2>
F
<h1>3</h1>
<h2>G</h2>
H
<h2>I</h2>
J
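If you would rather stay with the csv module from your own attempt, a standard-library variant is also possible. The following is only a sketch, not the approach used above: it swaps pandas for csv.reader plus itertools.groupby, and it assumes every row has exactly three fields and that rows sharing a key are adjacent in the file (groupby only merges consecutive rows). The function name rows_to_html and the in-memory sample are made up for illustration.

```python
import csv
import io
from itertools import groupby


def rows_to_html(lines):
    """Build the <h1>/<h2> output from pipe-delimited lines (hypothetical helper)."""
    reader = csv.reader(lines, delimiter='|')
    # Strip the spaces around each pipe and skip blank lines
    rows = [[cell.strip() for cell in row] for row in reader if row]
    out = []
    # groupby merges only consecutive rows with the same first field,
    # so the input is assumed to be ordered by that field already
    for key, group in groupby(rows, key=lambda r: r[0]):
        out.append(f'<h1>{key}</h1>')
        for _, title, body in group:
            out.append(f'<h2>{title}</h2>')
            out.append(body)
    return '\n'.join(out)


# In-memory stand-in for data.csv so the example runs on its own
sample = io.StringIO("1 | A | B\n1 | C | D\n2 | E | F\n3 | G |H\n3 | I | J\n")
html = rows_to_html(sample)
print(html)
```

To read the real file instead, pass an open file object, for example open('data.csv', newline=''), in place of the StringIO sample. If the keys are not already grouped together, sort the rows by their first field before calling groupby.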