How to round a DataFrame column to the same number of decimal places as another column



I have a pandas DataFrame like the following:

import numpy as np
import pandas as pd

df = pd.DataFrame([
{'A': 'aaa',  'B': 0.01,    'C': 0.00001,  'D': 0.00999999999476131,   'E': 0.00023191546403037534},
{'A': 'bbb',  'B': 0.01,    'C': 0.0001,   'D': 0.010000000000218279,  'E': 0.002981781316158273},
{'A': 'ccc',  'B': 0.1,     'C': 0.001,    'D': 0.0999999999999659,    'E': 0.020048115477145148},
{'A': 'ddd',  'B': 0.01,    'C': 0.01,     'D': 0.019999999999999574,  'E': 0.397456279809221},
{'A': 'eee',  'B': 0.00001, 'C': 0.000001, 'D': 0.09500000009999432,   'E': 0.06821282401091405},
])
A          B            C                       D                         E
0  aaa       0.01      0.00001     0.00999999999476131    0.00023191546403037534
1  bbb       0.01       0.0001    0.010000000000218279      0.002981781316158273
2  ccc        0.1        0.001      0.0999999999999659      0.020048115477145148
3  ddd       0.01         0.01    0.019999999999999574         0.397456279809221
4  eee    0.00001     0.000001     0.09500000009999432       0.06821282401091405 

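I have tried to round columns D and E to the same number of decimal places as the values in columns B and C respectively, without success.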

This is what I tried:

df['b_decimals'] = df['B'].astype(str).str.split('.').str[1].str.len()
df['c_decimals'] = df['C'].astype(str).str.split('.').str[1].str.len()
df['D'] = [np.around(x, y) for x, y in zip(df['D'], df['b_decimals'])]
df['E'] = [np.around(x, y) for x, y in zip(df['E'], df['c_decimals'])]

But I get this error:

Traceback (most recent call last):
  File "C:\Program Files\Python37\lib\site-packages\numpy\core\fromnumeric.py", line 56, in _wrapfunc
    return getattr(obj, method)(*args, **kwds)
AttributeError: 'float' object has no attribute 'round'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:/_GITHUB/python/test.py", line 30, in <module>
    main()
  File "D:/_GITHUB/python/test.py", line 24, in main
    df['E'] = [np.around(x, y) for x, y in zip(df['E'], df['c_decimals'])]
  File "D:/_GITHUB/python/test.py", line 24, in <listcomp>
    df['E'] = [np.around(x, y) for x, y in zip(df['E'], df['c_decimals'])]
  File "C:\Program Files\Python37\lib\site-packages\numpy\core\fromnumeric.py", line 3007, in around
    return _wrapfunc(a, 'round', decimals=decimals, out=out)
  File "C:\Program Files\Python37\lib\site-packages\numpy\core\fromnumeric.py", line 66, in _wrapfunc
    return _wrapit(obj, method, *args, **kwds)
  File "C:\Program Files\Python37\lib\site-packages\numpy\core\fromnumeric.py", line 46, in _wrapit
    result = getattr(asarray(obj), method)(*args, **kwds)
TypeError: integer argument expected, got float

The problem is that the b_decimals and c_decimals columns contain NaN values once they are created:

A        B          C                      D                        E  b_decimals   c_decimals
0  aaa     0.01    0.00001    0.00999999999476131   0.00023191546403037534           2          NaN
1  bbb     0.01     0.0001   0.010000000000218279     0.002981781316158273           2            4
2  ccc      0.1      0.001     0.0999999999999659     0.020048115477145148           1            3
3  ddd     0.01       0.01   0.019999999999999574        0.397456279809221           2            2
4  eee  0.00001   0.000001    0.09500000009999432      0.06821282401091405         NaN          NaN

Why does this happen when the columns are created? Is there another way to get the desired result, shown below?

A          B           C           D           E
0     aaa       0.01     0.00001        0.01     0.00023
1     bbb       0.01      0.0001        0.01      0.0030
2     ccc        0.1       0.001         0.1       0.020
3     ddd       0.01        0.01        0.02        0.40
4     eee    0.00001    0.000001     0.09600    0.068212

I look forward to your replies. Thanks!

You can use a -log10 operation to get the number of decimal places up to the first significant digit of a number (from an answer by @Willem van Onsem).
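As a minimal sketch of that idea in plain Python (the helper name `decimals_from_step` is hypothetical and not taken from the referenced answer): for a value of the form 10^-k, `-floor(log10(x))` returns k. It reports the position of the first significant digit, so it matches the number of decimal places only for values such as 0.1, 0.01 or 0.001.

```python
import math


def decimals_from_step(x):
    """Return the position of the first significant digit after the decimal point.

    Hypothetical helper for illustration; assumes 0 < x < 1.
    """
    return int(-math.floor(math.log10(x)))


# Powers of ten: the result equals the number of decimal places.
for step in (0.1, 0.01, 0.001):
    print(step, decimals_from_step(step))

# 0.025 has three decimal places, but its first significant digit sits in
# the second place, so this approach yields 2 rather than 3.
print(0.025, decimals_from_step(0.025))
```

This limitation does not matter for the data in the question, where B and C only hold powers of ten.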

You can then put this into a lambda function and apply it row by row:

import numpy as np
df['D'] = df.apply(lambda row: round(row['D'], int(-np.floor(np.log10(row['B'])))),axis=1)
df['E'] = df.apply(lambda row: round(row['E'], int(-np.floor(np.log10(row['C'])))),axis=1)

Result:

>>> df
A       B         C      D         E
0  aaa  0.0100  0.000010  0.010  0.000230
1  bbb  0.0100  0.000100  0.010  0.003000
2  ccc  0.1000  0.001000  0.100  0.020000
3  ddd  0.0100  0.010000  0.020  0.400000
4  eee  0.0001  0.000001  0.095  0.068213
>>> df.values
array([['aaa', 0.01, 1e-05, 0.01, 0.00023],
['bbb', 0.01, 0.0001, 0.01, 0.003],
['ccc', 0.1, 0.001, 0.1, 0.02],
['ddd', 0.01, 0.01, 0.02, 0.4],
['eee', 0.0001, 1e-06, 0.095, 0.068213]], dtype=object)
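On the question of why b_decimals and c_decimals end up with NaN: a likely cause, sketched below in plain Python, is that Python prints floats smaller than 1e-4 in scientific notation, so the string contains no '.' to split on. The helper `decimals_from_string` is hypothetical; it only mimics what `.astype(str).str.split('.').str[1].str.len()` does for a single value, returning None where pandas would give NaN.

```python
def decimals_from_string(value):
    """Count the characters after '.' in str(value), or None if there is no '.'.

    Hypothetical helper that imitates the string-splitting approach from the
    question for one value at a time.
    """
    parts = str(value).split('.')
    if len(parts) < 2:
        # e.g. '1e-05' has no decimal point, so there is no second element
        return None
    return len(parts[1])


# Show the string form next to the computed count for each value.
for value in (0.1, 0.01, 0.0001, 0.00001, 0.000001):
    print(repr(str(value)), decimals_from_string(value))
```

Once a column contains NaN, pandas stores it as floats, which would explain the `TypeError: integer argument expected, got float` raised by `np.around`.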

I used part of Derek's solution above to build my own:

df['b_decimals'] = -np.floor(np.log10(df['B']))
df['c_decimals'] = -np.floor(np.log10(df['C']))
df['D'] = [np.around(x, y) for x, y in zip(df['D'], df['b_decimals'].astype(int))]
df['E'] = [np.around(x, y) for x, y in zip(df['E'], df['c_decimals'].astype(int))]

This gives the following:

A          B           C       D           E    b_decimals    c_decimals
0  aaa       0.01     0.00001    0.01     0.00023             2             5
1  bbb       0.01      0.0001    0.01       0.003             2             4
2  ccc        0.1       0.001     0.1        0.02             1             3
3  ddd       0.01        0.01    0.02         0.4             2             2
4  eee    0.00001    0.000001     0.1    0.068213             5             6
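If the reference columns might hold values that are not powers of ten, one possible alternative is to read the number of decimal places from the standard library's `decimal.Decimal`, which also parses scientific notation such as '1e-05'. This is only a sketch under that assumption: `count_decimals` is a hypothetical helper, and the lists below copy columns B and D from the question so that the example runs without pandas.

```python
from decimal import Decimal


def count_decimals(value):
    """Number of decimal places in the shortest string form of a float.

    Hypothetical helper: Decimal('1e-05') has exponent -5, Decimal('0.025')
    has exponent -3, and so on. Values without a fractional part give 0.
    """
    exponent = Decimal(str(value)).as_tuple().exponent
    return max(0, -exponent)


# Columns B and D from the question, as plain lists.
b_values = [0.01, 0.01, 0.1, 0.01, 0.00001]
d_values = [
    0.00999999999476131,
    0.010000000000218279,
    0.0999999999999659,
    0.019999999999999574,
    0.09500000009999432,
]

# Round each D value to the number of decimals of the matching B value.
rounded_d = [round(d, count_decimals(b)) for d, b in zip(d_values, b_values)]
print(rounded_d)
```

The same helper could be used in the list comprehensions above in place of the log10 columns, for example by zipping `df['D']` with `df['B']`.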
