Trying to convert comma-delimited to pipe-delimited and getting a "Label not contained in axis" error



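I'm writing a script to parse a CSV file, and it worked well until I realized the final result needs to be pipe-delimited. I was told the easiest way is to add sep='|' when reading the file, but now it says my headers are not contained in the axis. I'm using Python 3.6. Any ideas?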

import pandas as pd
import time
import datetime
## Open the target file as a DataFrame
df = pd.read_csv(r'C:\Users\jpe17a\Desktop\BE Patients\Jumpstart template.csv', sep='|')
## looks at the First Name column and deletes the row if it is empty
df = df[pd.notnull(df['First Name'])]
## creates a new CSV file with the same data
newfile = datetime.datetime.now().strftime("%m.%d.%y")
df.to_csv('PopulationEnrollmentJumpstartMRNRegistrationTemplate ' + newfile + '.csv', sep='|')
print(df)
## deletes the Phone and Address 2 columns from the original file
df.drop(['Address 2'], axis=1, inplace=True)
df.drop(['Phone'], axis=1, inplace=True)
## saves the original file
df.to_csv(r"C:\Users\jpe17a\Desktop\BE Patients\Test.csv", sep='|')
## opens the new file
df2 = pd.read_csv(r'C:\Users\jpe17a\Desktop\BE Patients\PopulationEnrollmentJumpstartMRNRegistrationTemplate ' + newfile + '.csv', sep='|')
## These columns were added because they do not come with the provided spreadsheet but are needed for Jumpstart. The columns are auto-populated
df2['Patient Assigning Organization'] = 'MyHFN'
df2['Program Name']  = 'Case_Management'
df2['Sub Program Name']  = 'Community'
df2['Enrollment Start Date (YYYY-MM-DD'] = time.strftime("%x")
df2['Enrollment End Date (YYYY-MM-DD)'] = ''
df2['Status Description'] = 'Active'
## reordering the columns; unnecessary columns are left out so that they are dropped
df2 = df2[['MRN', 'Facility', 'Patient Assigning Organization', 'Program Name', 'Sub Program Name', 'Enrollment Start Date (YYYY-MM-DD', 'Enrollment End Date (YYYY-MM-DD)', 'Status Description', 'First Name', 'Middle ', 'Last Name', 'Birthdate', 'Gender', 'Street', 'Address 2', 'City', 'State', 'Zip', 'Phone']]

df2.to_csv('PopulationEnrollmentJumpstartMRNRegistrationTemplate ' + timeStr + '.csv', sep='|')

Edit:

You also need to change the first read_csv to sep=',' or remove the argument altogether, because sep=',' is the default.
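
To see why the wrong separator leads to the missing-label error, here is a minimal sketch. It assumes a tiny in-memory stand-in for the comma-delimited source file (the sample rows and the `csv_text` name are made up for illustration, not taken from the real spreadsheet):

```python
import io
import pandas as pd

# Hypothetical stand-in for the comma-delimited source file.
csv_text = "First Name,Phone,Address 2\nAnn,555-0100,Apt 1\n,555-0101,\n"

# Parsing comma-delimited text with sep='|' finds no pipes, so the
# whole header line becomes the name of a single column.
wrong = pd.read_csv(io.StringIO(csv_text), sep='|')
print(list(wrong.columns))

# With the default sep=',' the header is split into separate columns.
right = pd.read_csv(io.StringIO(csv_text))
print(list(right.columns))

# 'First Name' only exists as a label in the correctly parsed frame.
print('First Name' in wrong.columns, 'First Name' in right.columns)
```

In the wrongly parsed frame there is no 'First Name', 'Phone' or 'Address 2' label at all, which is why selecting or dropping those columns fails.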


It looks like you forgot sep='|' in:

df2 = pd.read_csv(r'C:\Users\jpe17a\Desktop\BE Patients\PopulationEnrollmentJumpstartMRNRegistrationTemplate ' + newfile + '.csv', sep='|')

You also have to specify it in to_csv:

df.to_csv('PopulationEnrollmentJumpstartMRNRegistrationTemplate ' + newfile + '.csv', sep='|')
df.to_csv(r"C:\Users\jpe17a\Desktop\BE Patients\Test.csv", sep='|')
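
Putting the pieces together, the round trip could look roughly like the sketch below. It uses in-memory buffers instead of the real files so it can be run anywhere; the sample data is invented, and `index=False` is an optional addition of mine (it keeps the row index out of the output file), not something the original script used:

```python
import io
import pandas as pd

# Hypothetical comma-delimited input standing in for the source file.
csv_text = "First Name,Phone\nAnn,555-0100\n,555-0101\n"

# Read with the default comma separator.
df = pd.read_csv(io.StringIO(csv_text))
# Keep only the rows that have a first name.
df = df[pd.notnull(df['First Name'])]

# Write pipe-delimited output; index=False omits the row index column.
buffer = io.StringIO()
df.to_csv(buffer, sep='|', index=False)
piped_text = buffer.getvalue()
print(piped_text)

# Anything written with sep='|' has to be read back with sep='|' too.
df2 = pd.read_csv(io.StringIO(piped_text), sep='|')
print(list(df2.columns))
```

The rule of thumb is that the separator passed to read_csv must match how the file being read was written, while the separator passed to to_csv decides how the output is delimited.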
