Error when pushing to a BigQuery table containing lists and dicts



The following code works in Google Colab but not on my local machine.

import pandas as pd
from google.oauth2 import service_account
df = pd.DataFrame()
df['A'] = [[1], [2], [3]]
credentials = service_account.Credentials.from_service_account_info({--credential infos--},)
df.to_gbq(destination_table='raw.test', project_id='project-test', credentials = credentials, if_exists='replace')

The error I get is pyarrow.lib.ArrowTypeError: Expected bytes, got a 'list' object

I have tried both google-cloud-bigquery and pandas-gbq, and both raise the same error.

Locally I am running Python 3.10 and pandas 1.5.1. Google Colab runs Python 3.7 and pandas 1.3.5.

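Google BigQuery does not accept Python lists or dictionaries; the BigQuery documentation lists the data types it does accept.

I think you are confusing these types with JSON-formatted strings (note that "JSON is purely a string with a specified data format: it contains only properties, no methods.").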

This:

df = pd.DataFrame()
df['A'] = ['[1]', '[2]', '[3]']
df['B'] = [[1], [2], [3]]
print(df)

prints:

     A    B
0  [1]  [1]
1  [2]  [2]
2  [3]  [3]

The two columns look the same, but they are not: A holds strings and B holds Python lists.
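One way to see the difference without relying on the printed table is to compare the values and their types directly. This is a minimal sketch using only the standard library; the variable names `as_string` and `as_list` are illustrative and not taken from the question.

```python
import json

as_string = '[1]'   # what column A holds: text that happens to look like a list
as_list = [1]       # what column B holds: an actual Python list

# The list's text form is identical to the string, which is why the
# printed DataFrame shows the same characters in both columns.
print(str(as_list) == as_string)         # True

# The underlying types are different.
print(type(as_string).__name__)          # str
print(type(as_list).__name__)            # list

# Parsing the string as JSON recovers the list.
print(json.loads(as_string) == as_list)  # True
```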

These two values also look similar, but again they are different:

df_test = pd.DataFrame()
df_test['Python_Dictionary'] = [{"name": {"0": "banana"}, "color": {"0": "yellow"}}]
df_test['JSON'] = ['{"name": {"0": "banana"}, "color": {"0": "yellow"}}']

print(type(df_test['Python_Dictionary'][0]))
print(type(df_test['JSON'][0]))

prints:

<class 'dict'>
<class 'str'>

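Check the data types of the values you are using in Google Colab, that is, the data you get from web scraping. If you like, you can also try this on your own computer: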

import pandas as pd
from google.oauth2 import service_account
df = pd.DataFrame()
df['A'] = ['[1]', '[2]', '[3]']
credentials = service_account.Credentials.from_service_account_info({--credential infos--},)
df.to_gbq(destination_table='raw.test', project_id='project-test', credentials = credentials, if_exists='replace')
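If the real data holds Python lists or dicts rather than strings, one option is to serialize those values to JSON text before uploading. The sketch below is an assumption about how that could be done, not code from the question: `to_json_if_nested` is a hypothetical helper name, the sample values are made up, and it uses only the standard `json` module.

```python
import json


def to_json_if_nested(value):
    """Return a JSON string for lists and dicts; leave other values unchanged."""
    if isinstance(value, (list, dict)):
        return json.dumps(value)
    return value


# Hypothetical column values mixing nested objects and plain scalars.
column = [[1], {"name": "banana"}, "plain text", 3]
converted = [to_json_if_nested(v) for v in column]

print(converted)
print([type(v).__name__ for v in converted])
```

With pandas, the same helper could be applied to a column with `df['A'] = df['A'].map(to_json_if_nested)` before calling `to_gbq`. The column would then presumably be uploaded as strings, which may or may not be the schema you want in the destination table.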
