Converting a dictionary of Series to a DataFrame

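I have a dictionary in which every value is a Series. For the sake of argument, let's assume all the indexes are the same. I would like to end up with a DataFrame (or really any table, it doesn't have to be pandas) that has that same index, where each column is one of the Series and the dictionary keys are the column headers. I could loop over the dictionary and assign into a DataFrame, but I'm curious whether there is a more Pythonic way that doesn't need a loop.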

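And as a second question: how would the solution differ if the indexes are not exactly the same (some Series are missing one or two entries, which should then show up as null or zero)?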

Some sample data, for example:

import pandas as pd
import random

test = dict.fromkeys(['a','b','c','d'])
for key in test:
    randomlist = []
    for i in range(0,5):
        n = random.randint(1,30)
        randomlist.append(n)
    test[key] = pd.Series(randomlist)
test
Out[16]: 
{'a': 0    10
1    16
2     9
3    23
4    30
dtype: int64,
'b': 0     7
1     9
2     1
3    16
4    29
dtype: int64,
'c': 0    30
1    21
2    25
3     1
4    22
dtype: int64,
'd': 0    28
1    29
2     7
3    25
4    25
dtype: int64}

I would like to end up with something like this:

    a   b   c   d
1  16   9  21  29
2   9   1  25   7
3  23  16   1  25
4  30  29  22  25

I'm sure there is something I'm not capturing in this example, so here is a small excerpt of my actual dictionary:

'C:\Users\name\Google Drive\Simulations\030012-OffMed-VRF - ap\CTZ14S22AMeter.csv': Fans:Electricity [J](Hourly)                                     49.785923
InteriorEquipment:Electricity [J](Hourly)                        72.889315
InteriorLights:Electricity [J](Hourly)                           16.140645
Electricity:Facility [J](Hourly)                                205.964746
Cooling:Electricity [J](Hourly)                                  57.205236
Pumps:Electricity [J](Hourly)                                     0.000000
Heating:Electricity [J](Hourly)                                   6.073830
WaterSystems:Electricity [J](Hourly)                              3.869797
Receptacle:InteriorEquipment:Electricity [J](Hourly)             62.582398
Internal Transport:InteriorEquipment:Electricity [J](Hourly)     10.306918
ComplianceLtg:InteriorLights:Electricity [J](Hourly)             16.140645
dtype: float64,
'C:\Users\name\Google Drive\Simulations\030012-OffMed-VRF - ap\CTZ15S22AMeter.csv': Fans:Electricity [J](Hourly)                                     46.432982
InteriorEquipment:Electricity [J](Hourly)                        71.004371
InteriorLights:Electricity [J](Hourly)                           15.900494
Electricity:Facility [J](Hourly)                                216.518008
Cooling:Electricity [J](Hourly)                                  78.686596
Pumps:Electricity [J](Hourly)                                     0.000000
Heating:Electricity [J](Hourly)                                   0.687672
WaterSystems:Electricity [J](Hourly)                              3.805893
Receptacle:InteriorEquipment:Electricity [J](Hourly)             60.888104
Internal Transport:InteriorEquipment:Electricity [J](Hourly)     10.116267
ComplianceLtg:InteriorLights:Electricity [J](Hourly)             15.900494
dtype: float64,
'C:\Users\name\Google Drive\Simulations\030012-OffMed-VRF - ap\CTZ16S22AMeter.csv': Fans:Electricity [J](Hourly)                                     52.634381
InteriorEquipment:Electricity [J](Hourly)                        66.367556
InteriorLights:Electricity [J](Hourly)                           15.400713
Electricity:Facility [J](Hourly)                                183.632062
Cooling:Electricity [J](Hourly)                                  28.867642
Pumps:Electricity [J](Hourly)                                     0.000000
Heating:Electricity [J](Hourly)                                  16.713066
WaterSystems:Electricity [J](Hourly)                              3.648704
Receptacle:InteriorEquipment:Electricity [J](Hourly)             56.827636
Internal Transport:InteriorEquipment:Electricity [J](Hourly)      9.539920
ComplianceLtg:InteriorLights:Electricity [J](Hourly)             15.400713
dtype: float64}

If I understood your question correctly:

You can pass your dictionary as the data:

data = {'a': pd.Series([1, 2, 3]), 'b': pd.Series([4, 5, 6]), 'c': pd.Series([7, 8, 9])}  
pd.DataFrame(data=data)
   a  b  c
0  1  4  7
1  2  5  8
2  3  6  9
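
As a minimal sketch of an alternative that also avoids an explicit loop, pd.concat accepts a dictionary of Series and, with axis=1, uses the keys as column headers. The variable names below are only for illustration, and the data is the same toy dictionary as above:

```python
import pandas as pd

data = {'a': pd.Series([1, 2, 3]), 'b': pd.Series([4, 5, 6]), 'c': pd.Series([7, 8, 9])}

# Option 1: hand the dictionary straight to the DataFrame constructor.
df_from_constructor = pd.DataFrame(data)

# Option 2: concatenate the Series side by side; the dict keys become columns.
df_from_concat = pd.concat(data, axis=1)

print(df_from_constructor)
print(df_from_concat)
```

For a dictionary like this, both calls should give the same table, so the constructor is usually the shorter choice.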

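Nothing changes; you just pass Series that carry their own indexes. pandas aligns them on the union of the index labels and fills the gaps with NaN: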

data = {'a': pd.Series({1: 1, 2: 2, 3: 3, 5: 6}), 'b': pd.Series({1: 4, 2: 5, 4: 6}), 'c': pd.Series({2: 7, 3: 8, 4: 9, 5: 10})}
pd.DataFrame(data=data)
     a    b     c
1  1.0  4.0   NaN
2  2.0  5.0   7.0
3  3.0  NaN   8.0
4  NaN  6.0   9.0
5  6.0  NaN  10.0
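
If the missing entries should show up as zero rather than NaN, as the question asks, one possible follow-up is to fill the gaps after building the frame. This is a sketch under the assumption that zero is a meaningful default for your data; the names df and df_zero are only illustrative:

```python
import pandas as pd

data = {
    'a': pd.Series({1: 1, 2: 2, 3: 3, 5: 6}),
    'b': pd.Series({1: 4, 2: 5, 4: 6}),
    'c': pd.Series({2: 7, 3: 8, 4: 9, 5: 10}),
}

# The constructor aligns the Series on the union of their index labels;
# labels missing from a Series become NaN, which turns the columns into floats.
df = pd.DataFrame(data)

# Replace the NaN gaps with 0 and convert the columns back to integers.
df_zero = df.fillna(0).astype(int)

print(df_zero)
```

Keeping NaN is often the safer choice when a missing value and a true zero mean different things, for example a meter that was never reported versus one that reported no consumption.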
