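Replacing a pandas DataFrame row overrides the dtype of every column

When I replace a row of df, the existing dtype=int column becomes float. I would like it to stay int.

Create the df: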

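import pandas as pd

# pd.datetime has been removed from pandas; pd.Timestamp gives the same date
testdate = pd.Timestamp(2014, 1, 1)
adddata = {'intcol': 0, 'floatcol': 0.0}
df = pd.DataFrame(data=adddata, index=pd.date_range(testdate, periods=1))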

As expected, one column is integer and the other is float, as df.dtypes shows:

floatcol    float64
intcol        int64
dtype: object

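I then overwrite the existing row (there is only one in this example) with df.loc[testdate] = pd.Series(adddata). I deliberately use the same data to expose the problem: intcol becomes float. df.dtypes: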

floatcol    float64
intcol      float64
dtype: object

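Note that I can change cells individually (e.g. df.loc[testdate, 'floatcol'] = 0.0) and keep the column dtypes, but in practice I want to overwrite far more than 2 columns at once, so overwriting one column at a time is tedious.

Interestingly, even specifying the dtype as object does not help: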

>>> df.loc[testdate,:] = pd.Series(adddata, dtype='object')
>>> df.dtypes
floatcol    float64
intcol      float64
dtype: object

Someone may have a better solution, but I noticed that this works:

>>> df.loc[testdate,:] = pd.Series(list(adddata.values()), adddata.keys(), dtype='object')
>>> df.dtypes
floatcol    float64
intcol        int64
dtype: object

However, if the row values are already in a dict, this may be easier:

>>> df.loc[testdate,:] = list(map(adddata.get, df.columns))
>>> df.dtypes
floatcol    float64
intcol        int64
dtype: object
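
If the upcast still shows up in your pandas version, another option is to remember the column dtypes before the assignment and cast back afterwards. The sketch below is only an illustration: replace_row is a hypothetical helper name (not part of pandas), and it assumes every new value can be cast losslessly to the original dtype of its column.

```python
import pandas as pd


def replace_row(df, label, values):
    """Overwrite the row at `label` from a dict, then restore the column dtypes.

    `replace_row` is a hypothetical helper, not a pandas function.
    """
    original_dtypes = df.dtypes.to_dict()  # remember dtypes before assigning
    # Order the new values to match the DataFrame's column order.
    df.loc[label, :] = [values[col] for col in df.columns]
    # Cast every column back to the dtype it had before the assignment.
    return df.astype(original_dtypes)


testdate = pd.Timestamp(2014, 1, 1)
adddata = {'intcol': 0, 'floatcol': 0.0}
df = pd.DataFrame(data=adddata, index=pd.date_range(testdate, periods=1))

df = replace_row(df, testdate, {'intcol': 5, 'floatcol': 2.5})
print(df.dtypes)
```

Note that astype returns a new DataFrame, so the result has to be assigned back to df, and the cast will raise or truncate if a value does not fit the original dtype (for example NaN in an integer column).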
