I have a pandas DataFrame like this:
import numpy as np
import pandas as pd

df = pd.DataFrame([
{'A': 'aaa', 'B': 0.01, 'C': 0.00001, 'D': 0.00999999999476131, 'E': 0.00023191546403037534},
{'A': 'bbb', 'B': 0.01, 'C': 0.0001, 'D': 0.010000000000218279, 'E': 0.002981781316158273},
{'A': 'ccc', 'B': 0.1, 'C': 0.001, 'D': 0.0999999999999659, 'E': 0.020048115477145148},
{'A': 'ddd', 'B': 0.01, 'C': 0.01, 'D': 0.019999999999999574, 'E': 0.397456279809221},
{'A': 'eee', 'B': 0.00001, 'C': 0.000001, 'D': 0.09500000009999432, 'E': 0.06821282401091405},
])
     A        B         C                     D                       E
0  aaa     0.01   0.00001   0.00999999999476131  0.00023191546403037534
1  bbb     0.01    0.0001  0.010000000000218279    0.002981781316158273
2  ccc      0.1     0.001    0.0999999999999659    0.020048115477145148
3  ddd     0.01      0.01  0.019999999999999574       0.397456279809221
4  eee  0.00001  0.000001   0.09500000009999432     0.06821282401091405
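I tried to round column D to the same number of decimal places as the values in column B, and column E to the same number as column C, but without success.
Here is what I tried: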
df['b_decimals'] = df['B'].astype(str).str.split('.').str[1].str.len()
df['c_decimals'] = df['C'].astype(str).str.split('.').str[1].str.len()
df['D'] = [np.around(x, y) for x, y in zip(df['D'], df['b_decimals'])]
df['E'] = [np.around(x, y) for x, y in zip(df['E'], df['c_decimals'])]
But I get this error:
Traceback (most recent call last):
  File "C:\Program Files\Python37\lib\site-packages\numpy\core\fromnumeric.py", line 56, in _wrapfunc
    return getattr(obj, method)(*args, **kwds)
AttributeError: 'float' object has no attribute 'round'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:/_GITHUB/python/test.py", line 30, in <module>
    main()
  File "D:/_GITHUB/python/test.py", line 24, in main
    df['E'] = [np.around(x, y) for x, y in zip(df['E'], df['c_decimals'])]
  File "D:/_GITHUB/python/test.py", line 24, in <listcomp>
    df['E'] = [np.around(x, y) for x, y in zip(df['E'], df['c_decimals'])]
  File "C:\Program Files\Python37\lib\site-packages\numpy\core\fromnumeric.py", line 3007, in around
    return _wrapfunc(a, 'round', decimals=decimals, out=out)
  File "C:\Program Files\Python37\lib\site-packages\numpy\core\fromnumeric.py", line 66, in _wrapfunc
    return _wrapit(obj, method, *args, **kwds)
  File "C:\Program Files\Python37\lib\site-packages\numpy\core\fromnumeric.py", line 46, in _wrapit
    result = getattr(asarray(obj), method)(*args, **kwds)
TypeError: integer argument expected, got float
The problem is that the b_decimals and c_decimals columns end up containing NaN values when they are created:
     A        B         C                     D                       E  b_decimals  c_decimals
0  aaa     0.01   0.00001   0.00999999999476131  0.00023191546403037534           2         NaN
1  bbb     0.01    0.0001  0.010000000000218279    0.002981781316158273           2           4
2  ccc      0.1     0.001    0.0999999999999659    0.020048115477145148           1           3
3  ddd     0.01      0.01  0.019999999999999574       0.397456279809221           2           2
4  eee  0.00001  0.000001   0.09500000009999432     0.06821282401091405         NaN         NaN
Why does this happen when the columns are created? Is there another way to get the desired result shown below?
     A        B         C        D         E
0  aaa     0.01   0.00001     0.01   0.00023
1  bbb     0.01    0.0001     0.01    0.0030
2  ccc      0.1     0.001      0.1     0.020
3  ddd     0.01      0.01     0.02      0.40
4  eee  0.00001  0.000001  0.09600  0.068212
Looking forward to your replies. Thanks!
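You can use a -log10 operation to get the number of decimal places up to the first nonzero digit (see the answer by @Willem van Onsem). You can then put this into a lambda function and apply it row by row: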
import numpy as np
df['D'] = df.apply(lambda row: round(row['D'], int(-np.floor(np.log10(row['B'])))), axis=1)
df['E'] = df.apply(lambda row: round(row['E'], int(-np.floor(np.log10(row['C'])))), axis=1)
Result:
>>> df
A B C D E
0 aaa 0.0100 0.000010 0.010 0.000230
1 bbb 0.0100 0.000100 0.010 0.003000
2 ccc 0.1000 0.001000 0.100 0.020000
3 ddd 0.0100 0.010000 0.020 0.400000
4 eee 0.0001 0.000001 0.095 0.068213
>>> df.values
array([['aaa', 0.01, 1e-05, 0.01, 0.00023],
['bbb', 0.01, 0.0001, 0.01, 0.003],
['ccc', 0.1, 0.001, 0.1, 0.02],
['ddd', 0.01, 0.01, 0.02, 0.4],
['eee', 0.0001, 1e-06, 0.095, 0.068213]], dtype=object)
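As for why the original attempt produced NaN: as far as I can tell, it comes from how small floats are turned into text. Python renders floats below 1e-4 in scientific notation, so 0.00001 becomes '1e-05'. That string has no '.' to split on, so .str[1] has nothing to return and the result is NaN. The NaN also forces the column to hold floats rather than integers, which would explain the "integer argument expected, got float" error from np.around. A minimal sketch reproducing this with the B and C values from the question (the variable names are mine):

```python
import pandas as pd

# B and C values taken from the question.
values = pd.Series([0.01, 0.0001, 0.00001, 0.000001])

# Convert to text the same way the original attempt does.
as_text = values.astype(str)

# Split on '.' and count the characters after it.
# Strings in scientific notation contain no '.', so there is no
# second element and the count comes back as missing.
decimals = as_text.str.split('.').str[1].str.len()

print(as_text.tolist())
print(decimals.tolist())

# Plain Python shows the same switch to scientific notation.
print(str(0.0001), str(0.00001))
```

The first two values keep their decimal point and get counts of 2 and 4, while the last two come back as missing, matching the NaN rows in the question.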
I used part of Derek's solution above and came up with my own:
df['b_decimals'] = -np.floor(np.log10(df['B']))
df['c_decimals'] = -np.floor(np.log10(df['C']))
df['D'] = [np.around(x, y) for x, y in zip(df['D'], df['b_decimals'].astype(int))]
df['E'] = [np.around(x, y) for x, y in zip(df['E'], df['c_decimals'].astype(int))]
This gives the following:
     A        B         C     D         E  b_decimals  c_decimals
0  aaa     0.01   0.00001  0.01   0.00023           2           5
1  bbb     0.01    0.0001  0.01     0.003           2           4
2  ccc      0.1     0.001   0.1      0.02           1           3
3  ddd     0.01      0.01  0.02       0.4           2           2
4  eee  0.00001  0.000001   0.1  0.068213           5           6
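One caveat with the -floor(log10(x)) approach: it only matches the number of decimal places when the values in B and C are powers of ten. For something like 0.25 it gives 1 rather than 2. If that matters, one possible alternative is to count the places from the decimal representation using the standard-library decimal module. This is only a sketch: count_decimals is a helper name I made up, and the 0.25 row is an extra value that is not in the question.

```python
from decimal import Decimal

import pandas as pd


def count_decimals(value):
    """Hypothetical helper: decimal places in the short text form of a float."""
    # Decimal parses scientific notation such as '1e-05' directly;
    # normalize() drops trailing zeros, so 100.0 counts as 0 places.
    exponent = Decimal(str(value)).normalize().as_tuple().exponent
    return max(0, -exponent)


# Three B/D pairs from the question plus one made-up pair (0.25, 0.123456).
df = pd.DataFrame({
    'B': [0.01, 0.1, 0.00001, 0.25],
    'D': [0.00999999999476131, 0.0999999999999659, 0.09500000009999432, 0.123456],
})

# Round each D value to the number of decimal places found in B.
df['D'] = [round(d, count_decimals(b)) for d, b in zip(df['D'], df['B'])]
print(df)
```

Rounding still returns floats, so trailing zeros such as 0.0030 in the desired output are not kept. To show them, the values would have to be formatted as strings, for example with f"{x:.{n}f}".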