我正在使用Google Colab,我想使用panda分析Google电子表格中的一个文件。我成功导入了它们,我可以用pd.DataFrame
打印出来
data_tablet = gc.open_by_url(f'https://docs.google.com/spreadsheets/d/{sheet_id}/edit#gid={tablet_gid}')
tablet_var = data_tablet.worksheet('tablet')
tablet_data = tablet_var.get_all_records()
df_tablet = pd.DataFrame(tablet_data)
print(df_tablet)
name 1st quarter ... 4th quarter total
0 Albendazol 400 mg 18.0 ... 60.0 78
1 Alopurinol 100 mg 125.0 ... 821.0 946
2 Ambroksol 30 mg 437.0 ... 798.0 1,235.00
3 Aminofilin 200 mg 70.0 ... 522.0 592
4 Amitriptilin 25 mg 83.0 ... 178.0 261
.. ... ... ... ... ...
189 Levoflaksin 250 mg 611.0 ... 822.0 1,433.00
190 Linezolid 675.0 ... 315.0 990
191 Moxifloxacin 400 mg 964.0 ... 99.0 1,063.00
192 Pyrazinamide 500 mg 395.0 ... 189.0 584
193 Vitamin B 6 330.0 ... 825.0 1,155.00
[194 rows x 6 columns]
我想从total
的194个项目中选择前10个,但它不起作用。
- 从
total
中选择前10个并运行下面的命令,我得到cannot use method 'nlargest' with this dtype
# Ambil data 10 terbesar dari 194 item
df_tablet_top10 = df_tablet.nlargest(10, 'total')
print(df_tablet_top10)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-7-a7295330f7a9> in <module>()
1 # Ambil data 10 terbesar dari 194 item
----> 2 df_tablet_top10 = df_tablet.nlargest(10, 'total')
3 print(df_tablet_top10)
2 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/algorithms.py in compute(self, method)
1273 if not self.is_valid_dtype_n_method(dtype):
1274 raise TypeError(
-> 1275 f"Column {repr(column)} has dtype {dtype}, "
1276 f"cannot use method {repr(method)} with this dtype"
1277 )
TypeError: Column 'total' has dtype object, cannot use method 'nlargest' with this dtype
- 但当我从
1st quarter
中选择它时,它工作得很好
df_tablet_top10 = df_tablet.nlargest(10, '1st quarter')
print(df_tablet_top10)
nama 1st quarter ... 4th quarter total
154 Salbutamol 4 mg 981.0 ... 23.0 1,004.00
74 MDT FB dewasa (obat kusta) 978.0 ... 910.0 1,888.00
155 Paracetamol 500 mg Tablet 976.0 ... 503.0 1,479.00
33 Furosemid 40 mg 975.0 ... 524.0 1,499.00
23 Deksametason 0,5 mg 972.0 ... 793.0 1,765.00
21 Bisakodil (dulkolax) 5 mg 970.0 ... 798.0 1,768.00
191 Moxifloxacin 400 mg 964.0 ... 99.0 1,063.00
85 Metronidazol 250 mg 958.0 ... 879.0 1,837.00
96 Nistatin 500.000 IU 951.0 ... 425.0 1,376.00
37 Glimepirid 2 mg 947.0 ... 890.0 1,837.00
[10 rows x 6 columns]
知道是什么导致了这种情况吗?
此外,我已经在谷歌工作表上将1st quarter
的格式更改为total
作为number
,但它仍然不能在中工作
我找到了解决方案,但没有找到解释。
我所做的只是用将total
列转换为float
df_tablet['total'] = df_tablet['total'].astype(float)
df_tablet['total'] = df_tablet['total'].astype(float)
df_tablet_top10 = df_tablet.nlargest(10, '1st quarter')
print(df_tablet_top10)