pandas: setting values using (row, col) indices

pandas offers the ability to look up values by lists of row and column indices:

In [49]: index = ['a', 'b', 'c', 'd']
In [50]: columns = ['one', 'two', 'three', 'four']
In [51]: M = pandas.DataFrame(np.random.randn(4,4), index=index, columns=columns)
In [52]: M
Out[52]: 
        one       two     three      four
a -0.785841 -0.538572  0.376594  1.316647
b  0.530288 -0.975547  1.063946 -1.049940
c -0.794447 -0.886721  1.794326 -0.714834
d -0.158371  0.069357 -1.003039 -0.807431
In [53]: M.lookup(index, columns) # diagonal entries
Out[53]: array([-0.78584142, -0.97554698,  1.79432641, -0.8074308 ])

I'd like to use the same indexing method to set the elements of M. How can I do this?

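The existing answers were written years ago, so I thought I might contribute a bit. Since pandas was refactored, attempting to set a value at a location using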

M.iloc[index][col]

may produce a warning about trying to set a value on a copy of a slice:

SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame
See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

In pandas versions after 0.21, the correct "pythonic" way is now the pandas.DataFrame.at accessor.

It looks like this:

M.at[index,col] = new_value
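As a minimal sketch of the single-cell assignment above, assuming a zero-filled frame with the same labels as the question (the values 5.0 and 7.0 are purely illustrative). It also shows .iat, the position-based counterpart of .at:

```python
import numpy as np
import pandas as pd

index = ['a', 'b', 'c', 'd']
columns = ['one', 'two', 'three', 'four']
# Zero-filled frame so the effect of each assignment is easy to see.
M = pd.DataFrame(np.zeros((4, 4)), index=index, columns=columns)

# Label-based scalar assignment: one row label, one column label.
M.at['b', 'two'] = 5.0

# Position-based scalar assignment: integer row and column positions.
M.iat[2, 2] = 7.0

print(M)
```

Both accessors write directly into M in a single indexing step, so there is no intermediate slice for pandas to warn about.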

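Answer for older versions: in older versions, the more "pythonic" way to do this was the pandas.DataFrame.set_value method. Note that this method returns the resulting DataFrame.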

M.set_value(index,column,new_value)

I just thought I'd post this here after figuring out the source of the warning that the .iloc or .ix methods can generate.

The set_value method also works for MultiIndex DataFrames: pass the multiple levels of the index as a tuple (for example, replace column with (col, subcol)).
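set_value was deprecated in pandas 0.21 and removed in 1.0, so on current versions the same tuple idea has to go through an indexer instead. The sketch below swaps in .loc for set_value; the level names 'x' and 'y' and the zero-filled frame are assumptions made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical two-level columns: (group, name).
cols = pd.MultiIndex.from_product([['x', 'y'], ['one', 'two']])
M = pd.DataFrame(np.zeros((2, 4)), index=['a', 'b'], columns=cols)

# Address a single cell by passing the full column key as a tuple.
M.loc['a', ('x', 'one')] = 1.0

print(M)
```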

I'm not sure I follow you, but do you use DataFrame.ix to select/set individual elements?

In [79]: M
Out[79]: 
        one       two     three      four
a -0.277981  1.500188 -0.876751 -0.389292
b -0.705835  0.108890 -1.502786 -0.302773
c  0.880042 -0.056620 -0.550164 -0.409458
d  0.704202  0.619031  0.274018 -1.755726
In [75]: M.ix[0]
Out[75]: 
one     -0.277981
two      1.500188
three   -0.876751
four    -0.389292
Name: a
In [78]: M.ix[0,0]
Out[78]: -0.27798082190723405
In [81]: M.ix[0,0] = 1.0
In [82]: M
Out[82]: 
        one       two     three      four
a  1.000000  1.500188 -0.876751 -0.389292
b -0.705835  0.108890 -1.502786 -0.302773
c  0.880042 -0.056620 -0.550164 -0.409458
d  0.704202  0.619031  0.274018 -1.755726
In [84]: M.ix[(0,1),(0,1)] = 1
In [85]: M
Out[85]: 
        one       two     three      four
a  1.000000  1.000000 -0.876751 -0.389292
b  1.000000  1.000000 -1.502786 -0.302773
c  0.880042 -0.056620 -0.550164 -0.409458
d  0.704202  0.619031  0.274018 -1.755726

You can also slice by index:

In [98]: M.ix["a":"c","one"] = 2.0
In [99]: M
Out[99]: 
        one       two     three      four
a  2.000000  1.000000 -0.876751 -0.389292
b  2.000000  1.000000 -1.502786 -0.302773
c  2.000000 -0.056620 -0.550164 -0.409458
d  0.704202  0.619031  0.274018 -1.755726
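Note that .ix was deprecated in pandas 0.20 and removed in 1.0. A rough equivalent of the session above on current versions, assuming a zero-filled frame with the same labels, uses .iloc for positions and .loc for labels:

```python
import numpy as np
import pandas as pd

index = ['a', 'b', 'c', 'd']
columns = ['one', 'two', 'three', 'four']
M = pd.DataFrame(np.zeros((4, 4)), index=index, columns=columns)

# Positional scalar assignment (was: M.ix[0, 0] = 1.0).
M.iloc[0, 0] = 1.0

# Two lists select the whole 2x2 block, not just two (row, col) pairs
# (was: M.ix[(0, 1), (0, 1)] = 1).
M.iloc[[0, 1], [0, 1]] = 1.0

# Label slices include both endpoints (was: M.ix["a":"c", "one"] = 2.0).
M.loc['a':'c', 'one'] = 2.0

print(M)
```

As with .ix, passing two lists sets every combination of the listed rows and columns, which is not the pairwise behaviour of lookup that the question asks about.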

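I'm facing exactly the same problem, and I think pandas currently provides no built-in method for this. Note the difference between the OP's goal and the usual way of setting values: the OP wants only the specific entries indexed by (row, col) pairs to be set to specific values, not all entries in the selected rows and columns (the matrix-like behaviour of df.loc[rows, cols] = xxx). In fact, even the lookup function has been deprecated (see here).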

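In short, I think you can either:

(1) use a for loop; or

(2) convert to numpy first, index the numpy array, and then convert back to a pandas DataFrame (as shown in the link above).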
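The two options can be sketched as follows. This is only an illustration under stated assumptions: the zero-filled frame, the values list, and the names row_pos, col_pos, and M2 are made up here and do not come from the linked answer.

```python
import numpy as np
import pandas as pd

rows = ['a', 'b', 'c', 'd']
cols = ['one', 'two', 'three', 'four']
values = [10.0, 20.0, 30.0, 40.0]  # one value per (row, col) pair

M = pd.DataFrame(np.zeros((4, 4)), index=rows, columns=cols)

# (1) For loop: assign one (row, col) pair at a time with .at.
for r, c, v in zip(rows, cols, values):
    M.at[r, c] = v

# (2) NumPy round trip: translate labels to integer positions, use
# NumPy's pairwise fancy indexing, then rebuild the DataFrame.
base = pd.DataFrame(np.zeros((4, 4)), index=rows, columns=cols)
arr = base.to_numpy(copy=True)
row_pos = base.index.get_indexer(rows)    # -1 marks a label that is missing
col_pos = base.columns.get_indexer(cols)
arr[row_pos, col_pos] = values
M2 = pd.DataFrame(arr, index=base.index, columns=base.columns)

print(M2)
```

The loop is simpler and keeps per-column dtypes intact; the NumPy route is likely faster for many pairs but assumes the frame has a single homogeneous dtype. Because get_indexer returns -1 for unknown labels, it is worth checking for -1 before assigning, or the write silently lands in the last row or column.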

Still, I think pandas should add this functionality back!
