I have a CSV file like this (comma-delimited):
ID, Name,Context, Location
123,"John","{\"Organization\":{\"Id\":12345,\"IsDefault\":false},\"VersionNumber\":-1,\"NewVersionId\":\"88229ef9-e97b-4b88-8eba-31740d48fd15\",\"ApiIntegrationType\":0,\"PortalIntegrationType\":0}","Road 1"
234,"Mike","{\"Organization\":{\"Id\":23456,\"IsDefault\":false},\"VersionNumber\":-1,\"NewVersionId\":\"88229ef9-e97b-4b88-8eba-31740d48fd15\",\"ApiIntegrationType\":0,\"PortalIntegrationType\":0}","Road 2"
I want to create a DataFrame like this:
ID | Name |Context |Location
123| John |{"Organization":{"Id":12345,"IsDefault":false},"VersionNumber":-1,"NewVersionId":"88229ef9-e97b-4b88-8eba-31740d48fd15","ApiIntegrationType":0,"PortalIntegrationType":0}|Road 1
234| Mike |{"Organization":{"Id":23456,"IsDefault":false},"VersionNumber":-1,"NewVersionId":"88229ef9-e97b-4b88-8eba-31740d48fd15","ApiIntegrationType":0,"PortalIntegrationType":0}|Road 2
Could you tell me how to do this with pandas read_csv?
One answer, if you are willing to accept the backslash (\) characters being stripped:
import pandas as pd
pd.read_csv(your_filepath, escapechar='\\')
    ID  Name                                            Context Location
0  123  John  {"Organization":{"Id":12345,"IsDefault":false}...   Road 1
1  234  Mike  {"Organization":{"Id":23456,"IsDefault":false}...   Road 2
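To try this without a file on disk, here is a minimal sketch that feeds read_csv from an in-memory string. The sample rows are a shortened, hypothetical version of the data in the question, and the sketch assumes the inner quotes are escaped as \" in the file. The json.loads call is only there to check that the stripped value is valid JSON.

```python
import io
import json

import pandas as pd

# Hypothetical sample modelled on the question; the inner quotes are assumed
# to be backslash-escaped in the file.
csv_text = r'''ID,Name,Context,Location
123,"John","{\"Organization\":{\"Id\":12345,\"IsDefault\":false},\"VersionNumber\":-1}","Road 1"
234,"Mike","{\"Organization\":{\"Id\":23456,\"IsDefault\":false},\"VersionNumber\":-1}","Road 2"
'''

# escapechar makes the parser treat \" inside a quoted field as a literal quote.
df = pd.read_csv(io.StringIO(csv_text), escapechar='\\')

# The backslashes are gone, so each Context value can be parsed as JSON.
first_context = json.loads(df.loc[0, 'Context'])
```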
If you really want the backslashes kept, use a custom converter:
def backslash_it(x):
    # Re-escape each double quote so the backslashes reappear in the value.
    return x.replace('"', '\\"')

pd.read_csv(your_filepath, escapechar='\\', converters={'Context': backslash_it})
    ID  Name                                            Context Location
0  123  John  {\"Organization\":{\"Id\":12345,\"IsDefault\":...   Road 1
1  234  Mike  {\"Organization\":{\"Id\":23456,\"IsDefault\":...   Road 2
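The escapechar argument to read_csv is what lets the CSV be read correctly in the first place (each \" inside a quoted field is taken as a literal quote); the custom converter then puts the backslashes back.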
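The two steps can be seen side by side in a small sketch. The one-row sample and the names raw_context, stripped and restored are hypothetical; it again assumes the inner quotes are escaped as \" in the input.

```python
import io

import pandas as pd

# Hypothetical one-row sample; the inner quotes are assumed to be escaped as \".
raw_context = r'{\"Id\":12345,\"IsDefault\":false}'
csv_text = 'ID,Name,Context,Location\n123,"John","' + raw_context + '","Road 1"\n'


def backslash_it(value):
    # Re-escape each double quote with a backslash.
    return value.replace('"', '\\"')


# Step 1 only: escapechar consumes the backslashes while parsing.
stripped = pd.read_csv(io.StringIO(csv_text), escapechar='\\')

# Steps 1 and 2: the converter receives the unescaped text and re-escapes it.
restored = pd.read_csv(io.StringIO(csv_text), escapechar='\\',
                       converters={'Context': backslash_it})
```

If the sketch behaves as intended, the Context value in restored matches the text as it appeared in the file, while the one in stripped is plain JSON.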
Note that I adjusted the header row to make column-name matching easier:
ID,Name,Context,Location
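If editing the file is not an option, one possible alternative (not part of the answer above) is to let pandas drop the blanks after each comma with skipinitialspace=True. A minimal sketch, using a hypothetical one-row sample that keeps the original header with its spaces:

```python
import io

import pandas as pd

# Hypothetical sample that keeps the original header, spaces included.
csv_text = r'''ID, Name,Context, Location
123,"John","{\"Id\":12345,\"IsDefault\":false}","Road 1"
'''

# skipinitialspace drops the blanks that follow each delimiter,
# so " Name" and " Location" come out as "Name" and "Location".
df = pd.read_csv(io.StringIO(csv_text), escapechar='\\', skipinitialspace=True)
column_names = list(df.columns)
```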