Splitting a single-column CSV into multiple columns with Pandas



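I would like to know how to use Pandas to elegantly split a single-column file in the following format into a more conventional tabular layout.

(The file is the output I receive from an eye tracker.)

Current format: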

TimeStampGazePointXLeftGazePointYLeftGazePointXRightGazePointYRight
00000000.11111111111111.22222222222222.33333333333333.4444444444444
00000000.11111111111111.22222222222222.33333333333333.4444444444444
00000000.11111111111111.22222222222222.33333333333333.4444444444444

Desired format:

TimeStamp GazePointXLeft GazePointYLeft GazePointXRight GazePointYRight
000000000 11111111111111 22222222222222 333333333333333 444444444444444
000000000 11111111111111 22222222222222 333333333333333 444444444444444
000000000 11111111111111 22222222222222 333333333333333 444444444444444

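Where I'm stuck: I imagine the solution involves the pandas split method, but I'm having a hard time working out how to get there. I suppose I have to do it "by hand": add the corresponding columns while somehow splitting the period-delimited data rows…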

df = pd.DataFrame('data.csv')
headers = ["TimeStamp", ...,  "GazePointYRight"]
for header in headers:
    # Splitting rows by period and taking data based on header index in list
    df[header] = df[1:].split(".")[headers.index(header)]

Please point me in the right direction. Thanks in advance.

The pandas.read_... functions have several useful parameters you can use.

I believe you want something like this?

import pandas as pd
columns_names = [
    'TimeStamp',
    'GazePointXLeft',
    'GazePointYLeft',
    'GazePointXRight',
    'GazePointYRight',
]
df = pd.read_csv("lixo.csv", sep='.', skiprows=1, names=columns_names)
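
To try this without the original file, here is a minimal sketch that feeds the sample rows from the question through io.StringIO. The in-memory text and the names raw, column_names and df_text are assumptions for illustration only. One thing worth noticing: with sep='.', pandas parses each field as an integer, so the leading zeros in TimeStamp are dropped unless you also pass dtype=str.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for the eye-tracker file from the question.
raw = (
    "TimeStampGazePointXLeftGazePointYLeftGazePointXRightGazePointYRight\n"
    "00000000.11111111111111.22222222222222.33333333333333.4444444444444\n"
    "00000000.11111111111111.22222222222222.33333333333333.4444444444444\n"
    "00000000.11111111111111.22222222222222.33333333333333.4444444444444\n"
)

column_names = [
    "TimeStamp",
    "GazePointXLeft",
    "GazePointYLeft",
    "GazePointXRight",
    "GazePointYRight",
]

# skiprows=1 discards the fused header line; names supplies the real headers.
df = pd.read_csv(io.StringIO(raw), sep=".", skiprows=1, names=column_names)

# Same read, but keeping every field as text so leading zeros survive.
df_text = pd.read_csv(
    io.StringIO(raw), sep=".", skiprows=1, names=column_names, dtype=str
)

print(df.dtypes)
print(df_text.head())
```

Whether the numeric or the text version is appropriate depends on what the eye tracker's values actually mean; if the periods are really decimal points inside the numbers, neither split is right and the export settings would need checking first.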

It is best to fix this when reading the CSV:

headers = ["TimeStamp", ...,  "GazePointYRight"]
df = pd.read_csv('data.csv', sep='.', skiprows=1, names=headers)

You can also do it afterwards:

df = pd.read_csv('data.csv')
headers = ["TimeStamp", ...,  "GazePointYRight"]
df = df.TimeStampGazePointXLeftGazePointYLeftGazePointXRightGazePointYRight.str.split('.', expand=True)
df.rename(columns={n:name for n, name in enumerate(headers)}, inplace=True)
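
A self-contained variant of the "afterwards" approach, as a rough sketch: it selects the single column by position with iloc instead of spelling out the long fused header name, and assigns the column names directly. The in-memory sample and the names raw, column_names, split and numeric are assumptions made for this example.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for the eye-tracker file from the question.
raw = (
    "TimeStampGazePointXLeftGazePointYLeftGazePointXRightGazePointYRight\n"
    "00000000.11111111111111.22222222222222.33333333333333.4444444444444\n"
    "00000000.11111111111111.22222222222222.33333333333333.4444444444444\n"
    "00000000.11111111111111.22222222222222.33333333333333.4444444444444\n"
)

column_names = [
    "TimeStamp",
    "GazePointXLeft",
    "GazePointYLeft",
    "GazePointXRight",
    "GazePointYRight",
]

# No commas in the data, so the default separator yields one text column.
df = pd.read_csv(io.StringIO(raw), dtype=str)

# Split that single column on the period; expand=True returns a DataFrame
# with one column per piece.
split = df.iloc[:, 0].str.split(".", expand=True)
split.columns = column_names

# Optional: convert the text pieces to numbers (this drops leading zeros).
numeric = split.apply(pd.to_numeric)

print(split.head())
print(numeric.dtypes)
```

Splitting after the read keeps the raw strings around, which can be handy if some rows turn out to have a different number of fields and need inspecting before conversion.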
