Column has dtype object, cannot use method 'nlargest' with this dtype



I'm using Google Colab and want to analyze a file from Google Sheets with pandas. The import succeeded, and I can print the data with pd.DataFrame:

data_tablet = gc.open_by_url(f'https://docs.google.com/spreadsheets/d/{sheet_id}/edit#gid={tablet_gid}')
tablet_var = data_tablet.worksheet('tablet')
tablet_data = tablet_var.get_all_records()
df_tablet = pd.DataFrame(tablet_data)
print(df_tablet)
name  1st quarter  ...  4th quarter     total
0      Albendazol 400 mg         18.0  ...         60.0        78
1      Alopurinol 100 mg        125.0  ...        821.0       946
2        Ambroksol 30 mg        437.0  ...        798.0  1,235.00
3      Aminofilin 200 mg         70.0  ...        522.0       592
4    Amitriptilin 25 mg          83.0  ...        178.0       261
..                   ...          ...  ...          ...       ...
189   Levoflaksin 250 mg        611.0  ...        822.0  1,433.00
190            Linezolid        675.0  ...        315.0       990
191  Moxifloxacin 400 mg        964.0  ...         99.0  1,063.00
192  Pyrazinamide 500 mg        395.0  ...        189.0       584
193          Vitamin B 6        330.0  ...        825.0  1,155.00
[194 rows x 6 columns]

I want to select the top 10 of the 194 items by total, but it doesn't work.

  • When I select the top 10 by total with the command below, I get cannot use method 'nlargest' with this dtype
# Ambil data 10 terbesar dari 194 item
df_tablet_top10 = df_tablet.nlargest(10, 'total')
print(df_tablet_top10)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-a7295330f7a9> in <module>()
1 # Ambil data 10 terbesar dari 194 item
----> 2 df_tablet_top10 = df_tablet.nlargest(10, 'total')
3 print(df_tablet_top10)
2 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/algorithms.py in compute(self, method)
1273             if not self.is_valid_dtype_n_method(dtype):
1274                 raise TypeError(
-> 1275                     f"Column {repr(column)} has dtype {dtype}, "
1276                     f"cannot use method {repr(method)} with this dtype"
1277                 )
TypeError: Column 'total' has dtype object, cannot use method 'nlargest' with this dtype
  • But when I select by 1st quarter instead, it works fine
df_tablet_top10 = df_tablet.nlargest(10, '1st quarter')
print(df_tablet_top10)
nama  1st quarter  ...  4th quarter     total
154             Salbutamol 4 mg        981.0  ...         23.0  1,004.00
74   MDT FB dewasa (obat kusta)        978.0  ...        910.0  1,888.00
155   Paracetamol 500 mg Tablet        976.0  ...        503.0  1,479.00
33              Furosemid 40 mg        975.0  ...        524.0  1,499.00
23          Deksametason 0,5 mg        972.0  ...        793.0  1,765.00
21    Bisakodil (dulkolax) 5 mg        970.0  ...        798.0  1,768.00
191         Moxifloxacin 400 mg        964.0  ...         99.0  1,063.00
85          Metronidazol 250 mg        958.0  ...        879.0  1,837.00
96          Nistatin 500.000 IU        951.0  ...        425.0  1,376.00
37             Glimepirid 2 mg         947.0  ...        890.0  1,837.00
[10 rows x 6 columns]

Any idea what causes this?

I have also changed the format of the columns from 1st quarter through total to Number in Google Sheets, but it still doesn't work.
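One way to see what is going on is to look at what pandas actually stored in each column. The sketch below uses a small hypothetical DataFrame (not the real sheet) and assumes, based on the printed output above, that total mixes plain numbers such as 78 with comma-formatted strings such as '1,235.00'. A column with mixed Python types like that ends up as dtype object, which nlargest rejects.

```python
import pandas as pd

# Hypothetical stand-in for the sheet data (assumption: 'total' mixes
# plain numbers with comma-formatted strings, as the printout suggests).
df = pd.DataFrame({
    "name": ["Albendazol 400 mg", "Alopurinol 100 mg", "Ambroksol 30 mg"],
    "1st quarter": [18.0, 125.0, 437.0],
    "total": [78, 946, "1,235.00"],
})

# Show the dtype pandas inferred for every column.
print(df.dtypes)

# Count the Python types actually stored in the 'total' column.
type_counts = df["total"].map(type).value_counts()
print(type_counts)

# List the rows whose 'total' value is a string rather than a number.
non_numeric = df[df["total"].map(lambda v: isinstance(v, str))]
print(non_numeric)
```

If the real column shows the same mix, the strings containing a thousands separator are the likely reason the whole column is object.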

I found a solution, but not an explanation.

All I did was convert the total column to float:

df_tablet['total'] = df_tablet['total'].astype(float)
df_tablet_top10 = df_tablet.nlargest(10, 'total')
print(df_tablet_top10)
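A possible explanation, offered as an assumption rather than a confirmed diagnosis: values such as '1,235.00' arrive as strings because of the thousands separator, so the column is stored as dtype object, and nlargest only accepts numeric dtypes. If those strings are still present, a plain astype(float) would raise a ValueError, since float('1,235.00') is not valid in Python. A minimal sketch of a more tolerant conversion, using a small hypothetical DataFrame in place of the real sheet:

```python
import pandas as pd

# Hypothetical sample standing in for df_tablet (assumption: 'total'
# mixes numbers with comma-formatted strings).
df = pd.DataFrame({
    "name": ["Albendazol 400 mg", "Alopurinol 100 mg",
             "Ambroksol 30 mg", "Aminofilin 200 mg"],
    "total": [78, 946, "1,235.00", 592],
})

# Normalize everything to strings, drop the thousands separator,
# then parse; values that cannot be parsed become NaN.
cleaned = df["total"].astype(str).str.replace(",", "", regex=False)
df["total"] = pd.to_numeric(cleaned, errors="coerce")

# The column is numeric now, so nlargest works.
top2 = df.nlargest(2, "total")
print(top2)
```

With errors="coerce", any value that still fails to parse becomes NaN instead of raising, so it is worth checking df["total"].isna().sum() afterwards.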
