使用熊猫读取文件并在两列上使用相关系数

>我有一个没有标题的文件，如下所示

0.000000 0.330001 0.280120 
1.000000 0.355590 0.298581 
2.000000 0.305945 0.280231

我想使用 pandas 数据帧读取此文件，并希望在第二列和第三列之间执行相关系数。

我正在尝试如下：

import pandas as pd
df = pd.read_csv('COLVAR_hbondnohead', header=None)
df['1'].corr(df['2'])

它弹出一个巨大的错误消息。我没有正确处理色谱柱吗？有什么建议或提示吗？

错误信息

Traceback (most recent call last):
File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item
TypeError: an integer is required
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/sbhakat/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3063, in get_loc
return self._engine.get_loc(key)
File "pandas/_libs/index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 164, in pandas._libs.index.IndexEngine.get_loc
KeyError: '1'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item
TypeError: an integer is required
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/sbhakat/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 2685, in __getitem__
return self._getitem_column(key)
File "/home/sbhakat/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 2692, in _getitem_column
return self._get_item_cache(key)
File "/home/sbhakat/anaconda3/lib/python3.6/site-packages/pandas/core/generic.py", line 2486, in _get_item_cache
values = self._data.get(item)
File "/home/sbhakat/anaconda3/lib/python3.6/site-packages/pandas/core/internals.py", line 4115, in get
loc = self.items.get_loc(item)
File "/home/sbhakat/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3065, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas/_libs/index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 164, in pandas._libs.index.IndexEngine.get_loc
KeyError: '1'

在读取文件时，您必须指定分隔符，即空格。然后使用位置访问列。下面的代码应该可以工作。

df = pd.read_csv('test.txt', sep=' ', header=None)
df[1].corr(df[2])

罗伊文件扩展名是什么？是吗.csv？如果是，您应该将其添加到文件名的末尾，例如pd.read_csv('COLVAR_hbondnohead.csv'，header=None(

你没有名为1和2的列，所以你必须先创建这些列。

import pandas as pd
df = pd.read_csv('COLVAR_hbondnohead', header=None)
df1 = df.reindex(columns=['1','2', '3'])

然后

df1['2'].corr(df1['3'])

相关内容

最新更新

热门标签：