Saving a DataFrame gives a string of lists instead of lists

Suppose I have a DataFrame column whose entries look like this:

df["column"] = [["this","is","a","tokenized","sentence","."],[...]]

When I save the DataFrame and read it back in another file, it looks like this:

df2["column"] = "[["this","is","a","tokenized","sentence","."],[...]]"

If I now want to use the words (= strings) in the lists, I get:

df2["column"][0] = [

instead of the expected:

df2["column"][0] = [["this","is","a","tokenized","sentence","."],[...]]

and, one level further:

df2["column"][0][0] = ["this","is","a","tokenized","sentence","."]
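
A minimal reproduction of the behaviour described above might look like the sketch below. It assumes the data is saved with `to_csv` and loaded with `read_csv`; the in-memory buffer and the sample sentences are illustrative stand-ins for the real file and data.

```python
import io

import pandas as pd

# One cell holding a list of tokenized sentences (illustrative data).
df = pd.DataFrame(
    {"column": [[["this", "is", "a", "tokenized", "sentence", "."],
                 ["another", "sentence"]]]}
)

# Round-trip through CSV, using an in-memory buffer instead of a file.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
df2 = pd.read_csv(buffer)

cell = df2.loc[0, "column"]
print(type(cell))  # the cell is now a plain string, not a list
print(cell[0])     # indexing returns the first character, "["
```

CSV is a text format, so each list is written as its string representation and comes back as a string unless it is parsed on load.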

I tried pd.eval, eval, literal_eval, and various astype operations, but none of them returned the desired result.

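If you want to save your DataFrame as CSV, you have to pass converters to read_csv when you load the data, so that the stored string is parsed back into a list: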

>>> df
                                                         column
0  [[this, is, a, tokenized, sentence, .], [another, sentence]]
>>> df.to_csv('data.csv', index=False)

# In the file that reads the data back:
import pandas as pd

def string_as_list(s):
    # Parse the string stored in the CSV cell back into a Python list.
    return pd.eval(s)

df = pd.read_csv('data.csv', converters={'column': string_as_list})
print(df.loc[0, 'column'])
print(type(df.loc[0, 'column']))
# Output
[['this', 'is', 'a', 'tokenized', 'sentence', '.'], ['another', 'sentence']]
<class 'list'>
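
As a variation on the same idea, `ast.literal_eval` from the standard library can also be passed directly as the converter. It only accepts Python literals, which is enough for lists of strings. The sketch below is an alternative rather than the method shown above, and it uses an in-memory buffer and shortened sample data purely for illustration.

```python
import ast
import io

import pandas as pd

# Write a small frame to CSV (in memory here; a real file works the same way).
original = pd.DataFrame({"column": [[["this", "is"], ["another", "sentence"]]]})
buffer = io.StringIO()
original.to_csv(buffer, index=False)
buffer.seek(0)

# Each cell of "column" is passed to ast.literal_eval as a string while reading.
restored = pd.read_csv(buffer, converters={"column": ast.literal_eval})

first_sentence = restored.loc[0, "column"][0]
print(first_sentence)     # ['this', 'is']
print(first_sentence[0])  # this
```

Because the conversion happens inside `read_csv`, the column already holds real lists once loading finishes, so `df2["column"][0][0]` indexes a sentence rather than a character.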
