Convert a weird list of nested dictionaries into a DataFrame



I have a weird list with dictionaries nested inside it, which looks like this:

lst = [{"uniqueid": "100","Content":[{"SaleNum":"1","Date":"12","Price":"230"}, {"SaleNum":"2","Date":"13","Price":"234"}, {"SaleNum":"3","Date":"14","Price":"382"}]}, 
{"uniqueid": "101","Content":[{"SaleNum":"1","Date":"25","Price":"382"}, {"SaleNum":"2","Date":"26","Price":"493"}, {"SaleNum":"3","Date":"28","Price":"384"}]},
{"uniqueid": "102","Content":[{"SaleNum":"1","Date":"25","Price":"334"}, {"SaleNum":"2","Date":"26","Price":"273"}, {"SaleNum":"3","Date":"28","Price":"394"}]}]

I want to convert it into a table like this:

uniqueid  SaleNum  Date  Price
100       1        12    230
100       2        13    234
100       3        14    382
101       1        25    382
101       2        26    493
101       3        28    384
102       1        25    334
102       2        26    273
102       3        28    394

One option using explode and json_normalize:

import pandas as pd

df = (pd.DataFrame.from_records(lst).explode('Content', ignore_index=True)
        .pipe(lambda d: d.join(pd.json_normalize(d.pop('Content'))))
     )
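
To make the chain above easier to follow, here is a step-by-step sketch of the same idea. It uses a shortened sample with the same shape as `lst`; the intermediate names (`raw`, `exploded`, `content`) are only illustrative, and passing a plain list to `json_normalize` via `.tolist()` is a small variation, not part of the original one-liner.

```python
import pandas as pd

# Shortened sample with the same shape as `lst` in the question.
lst = [
    {"uniqueid": "100", "Content": [{"SaleNum": "1", "Date": "12", "Price": "230"},
                                    {"SaleNum": "2", "Date": "13", "Price": "234"}]},
    {"uniqueid": "101", "Content": [{"SaleNum": "1", "Date": "25", "Price": "382"}]},
]

# Step 1: one row per outer dict; 'Content' holds a list of dicts.
raw = pd.DataFrame.from_records(lst)

# Step 2: one row per inner dict; 'uniqueid' is repeated as needed.
exploded = raw.explode("Content", ignore_index=True)

# Step 3: expand the inner dicts into their own columns.
# .tolist() hands json_normalize a plain list of dicts.
content = pd.json_normalize(exploded["Content"].tolist())

# Step 4: join on the shared 0..n-1 index and drop the now-redundant column.
df = exploded.drop(columns="Content").join(content)
print(df)
```

The join in step 4 relies on `ignore_index=True` in step 2, which gives both frames the same fresh 0..n-1 index.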

Or using the DataFrame constructor and a list comprehension:

# dict union with | requires Python 3.9+
df = (pd.DataFrame([d|d2 for d in lst for d2 in d['Content']])
        .drop(columns='Content')
     )
# or, if uniqueid is the only outer key
df = pd.DataFrame([{'uniqueid': d['uniqueid']}|d2
                   for d in lst for d2 in d['Content']])
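
If the `|` dict operator is not available (it was added in Python 3.9), the same rows can probably be built with dict unpacking instead. This is a minimal sketch on a shortened sample; the name `rows` is only illustrative.

```python
import pandas as pd

# Shortened sample with the same shape as `lst` in the question.
lst = [
    {"uniqueid": "100", "Content": [{"SaleNum": "1", "Date": "12", "Price": "230"},
                                    {"SaleNum": "2", "Date": "13", "Price": "234"}]},
    {"uniqueid": "101", "Content": [{"SaleNum": "1", "Date": "25", "Price": "382"}]},
]

# Build one flat dict per inner record, carrying the outer key along.
# {**a, **b} merges two dicts and works on Python 3.5+.
rows = [
    {"uniqueid": d["uniqueid"], **d2}
    for d in lst
    for d2 in d["Content"]
]

df = pd.DataFrame(rows)
print(df)
```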

Output:

  uniqueid SaleNum Date Price
0      100       1   12   230
1      100       2   13   234
2      100       3   14   382
3      101       1   25   382
4      101       2   26   493
5      101       3   28   384
6      102       1   25   334
7      102       2   26   273
8      102       3   28   394
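
Another option, using explode and a row-wise apply:

def function1(ss: pd.Series):
    # ss.Content is a single inner dict; turn it into a Series
    ss1 = pd.Series(ss.Content)
    # prepend the outer key to the expanded fields
    return pd.concat([ss[["uniqueid"]], ss1])

pd.DataFrame(lst).explode("Content").apply(function1, axis=1)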

Output:

  SaleNum Date Price uniqueid
0       1   12   230      100
0       2   13   234      100
0       3   14   382      100
1       1   25   382      101
1       2   26   493      101
1       3   28   384      101
2       1   25   334      102
2       2   26   273      102
2       3   28   394      102
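
As a further alternative not taken from the answers above, `pd.json_normalize` can flatten this shape directly through its `record_path` and `meta` parameters. The sketch below uses a shortened sample; the final column reordering is only there to match the layout requested in the question.

```python
import pandas as pd

# Shortened sample with the same shape as `lst` in the question.
lst = [
    {"uniqueid": "100", "Content": [{"SaleNum": "1", "Date": "12", "Price": "230"},
                                    {"SaleNum": "2", "Date": "13", "Price": "234"}]},
    {"uniqueid": "101", "Content": [{"SaleNum": "1", "Date": "25", "Price": "382"}]},
]

# record_path points at the nested list to expand into rows;
# meta lists the outer keys to repeat on every expanded row.
df = pd.json_normalize(lst, record_path="Content", meta=["uniqueid"])

# The meta columns are appended last, so move uniqueid to the front.
df = df[["uniqueid", "SaleNum", "Date", "Price"]]
print(df)
```

All values remain strings, as in the input; converting them to numbers would be a separate step.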