Writing an XML file from a pandas DataFrame with a condition (Python)



I have a pandas DataFrame (df) like this:

ID  EpisodeID  Origin   Destination
1      1         A          B
1      2         B          A
2      1         C          D
2      2         D          E
2      3         E          C
3      1         A          D
3      2         D          A

I want to write a txt file using this df as the source, so I use code like this:

with open("output.txt", "w+") as f:
    for index, row in df.iterrows():
        f.write('  <person id ="%s">\n' % (row['ID']))
        f.write('     <activity  O="%s"   D="%s">\n' % (row['Origin'], row['Destination']))
        f.write("     </activity>\n")
        f.write("  </person>\n")

The output looks something like this:

<person id="1">
<activity O="A"  D="B">
</activity>
</person>
<person id="1">
<activity O="B"  D="A">
</activity>
</person>

However, that is not what I want. How can I iterate, or write the code, so that the output looks like this:

<person id="1">
<activity O="A"  D="B">
</activity>
<activity O="B"  D="A">
</activity>
</person>
<person id="2">
<activity O="C"  D="D">
</activity>
<activity O="D"  D="E">
</activity>
<activity O="E"  D="C">
</activity>
</person>

So I am trying to do something once per ID rather than once per index (if that makes sense).

Please help :(

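Write a nested loop: first group by the ID column and write one person tag per group, then, inside each group, loop over its rows and write the activity tags: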

with open("output.txt", "w+") as f:
    for _id, g in df.groupby('ID'):
        f.write(f'  <person id ="{_id}">\n')
        for t in g.itertuples():  # use itertuples since it's faster than iterrows
            f.write(f'     <activity  O="{t.Origin}" D="{t.Destination}">\n')
            f.write("     </activity>\n")
        f.write("  </person>\n")

Output:

with open("output.txt", "r") as f:
    print(''.join(f.readlines()))

  <person id ="1">
     <activity  O="A" D="B">
     </activity>
     <activity  O="B" D="A">
     </activity>
  </person>
  <person id ="2">
     <activity  O="C" D="D">
     </activity>
     <activity  O="D" D="E">
     </activity>
     <activity  O="E" D="C">
     </activity>
  </person>
  <person id ="3">
     <activity  O="A" D="D">
     </activity>
     <activity  O="D" D="A">
     </activity>
  </person>
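
If the file has to be consumed by an XML parser, hand-written strings can break as soon as a value contains a character such as & or ". As a possible alternative, here is a minimal sketch that builds the same person/activity nesting with the standard library's xml.etree.ElementTree, which escapes attribute values automatically. The function name build_plan_xml and the wrapper element persons are assumptions made for illustration (the output above has no root element, but a well-formed XML document needs exactly one). To keep the sketch runnable without pandas, it takes plain (ID, Origin, Destination) tuples.

```python
import xml.etree.ElementTree as ET


def build_plan_xml(rows):
    """Return XML text with one <person> per ID and one <activity> per row.

    `rows` is any iterable of (ID, Origin, Destination) tuples.
    The function name and the <persons> root are hypothetical choices.
    """
    root = ET.Element("persons")
    persons = {}  # maps ID -> its <person> element, in first-seen order
    for person_id, origin, destination in rows:
        person = persons.get(person_id)
        if person is None:
            # First row for this ID: open a new <person> element.
            person = ET.SubElement(root, "person", id=str(person_id))
            persons[person_id] = person
        # Every row becomes an <activity> under its person.
        ET.SubElement(person, "activity", O=str(origin), D=str(destination))
    if hasattr(ET, "indent"):  # pretty-printing helper, Python 3.9+
        ET.indent(root, space="  ")
    return ET.tostring(root, encoding="unicode")


# Sample rows mirroring the DataFrame in the question.
rows = [
    (1, "A", "B"), (1, "B", "A"),
    (2, "C", "D"), (2, "D", "E"), (2, "E", "C"),
    (3, "A", "D"), (3, "D", "A"),
]
xml_text = build_plan_xml(rows)
print(xml_text)
```

With the DataFrame from the question, the rows could be supplied as df[['ID', 'Origin', 'Destination']].itertuples(index=False, name=None), which yields plain tuples in column order. Empty activity elements are serialized in the short self-closing form, which is equivalent XML to an open tag followed immediately by a close tag.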
