Any idea why this isn't working?
This is the XML being converted (the real file is much longer than this):
<XML>
  <ClinicalData StudyOID="XXXXXXXXX" MetaDataVersionOID="53" mdsol_AuditSubCategoryName="QueryAnswer">
    <SubjectData SubjectKey="XXXXXXXX-b7cd-4f97-8d25-594219de192f" mdsol_SubjectKeyType="SubjectUUID" mdsol_SubjectName="XX-002">
      <SiteRef LocationOID="15" XXXX_StudyEnvSiteNumber="15" />
      <StudyEventData StudyEventOID="DAY1" StudyEventRepeatKey="DAY1[1]" mdsol_InstanceId="47077">
        <FormData FormOID="SS_DISP" FormRepeatKey="1" mdsol_DataPageId="320656">
          <ItemGroupData ItemGroupOID="SS_DISP" mdsol_RecordId="797737">
            <ItemData ItemOID="SS_DISP.DISPDAT" TransactionType="Upsert">
              <AuditRecord>
                <UserRef UserOID="XXXX@XXXXX.com1" />
                <LocationRef LocationOID="15" mdsol_StudyEnvSiteNumber="15" />
                <DateTimeStamp>2022-01-28T05:27:54</DateTimeStamp>
                <ReasonForChange>
                </ReasonForChange>
                <SourceID>12345678</SourceID>
              </AuditRecord>
              <mdsol_Query QueryRepeatKey="123456" Value="Date of XXXX does not equal the XXXY Date. Please review and correct else clarify." Status="Answered" Response="Issues with XXXXX IWRS XXXXXX" />
            </ItemData>
          </ItemGroupData>
        </FormData>
      </StudyEventData>
    </SubjectData>
  </ClinicalData>
</XML>
I'm using this Python script to do the conversion, or at least I'm trying to. I'm very new to this.
from xml.etree import ElementTree
tree = ElementTree.parse('xml.xml')
root = tree.getroot()
data = []
for ClinicalData in root:
    StudyOID = getattr(child.find('StudyOID'), 'text', None)
    MetaDataVersionOID = getattr(child.find('MetaDataVersionOID'), 'text', None)
    mdsol_AuditSubCategoryName = getattr(child.find('mdsol_AuditSubCategoryName'), 'text', None)
    SubjectKey = getattr(child.find('SubjectKey'), 'text', None)
    #print('{}, {}, {}, {}'.format(StudyOID, MetaDataVersionOID, mdsol_AuditSubCategoryName, SubjectKey))
    data.append('{}, {}, {}, {}'.format(StudyOID, MetaDataVersionOID, mdsol_AuditSubCategoryName, SubjectKey))
    #print (data)
with open('output.csv', 'w') as f: f.write('\n'.join([row for row in data[1:]]))
The error message I get is:
  File "<stdin>", line 9
    with open('output.csv', 'w') as f: f.write('\n'.join([row for row in data[1:]]))
    ^^^^
SyntaxError: invalid syntax
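Two things seem to be going on here. The `File "<stdin>", line 9` in the traceback suggests the script was pasted into the interactive interpreter, where an indented block such as the `for` loop has to be closed with a blank line before the next top-level statement; counting from `for`, the `with` line is line 9. Running the script from a file, or adding a blank line after the loop, should get past the SyntaxError. After that, note that `StudyOID`, `MetaDataVersionOID`, `mdsol_AuditSubCategoryName` and `SubjectKey` are attributes of the tags in the sample, not child elements, so `find(...)` would return `None` for them. Below is a minimal sketch that reads them with `.get()`. It assumes the real file has the same shape as the shortened sample and uses no XML namespaces; `SAMPLE_XML`, `COLUMNS`, `extract_rows` and `rows_to_csv` are names made up for this illustration.

```python
import csv
import io
from xml.etree import ElementTree

# Shortened copy of the sample XML from the question (assumed representative).
SAMPLE_XML = """<XML>
<ClinicalData StudyOID="XXXXXXXXX" MetaDataVersionOID="53" mdsol_AuditSubCategoryName="QueryAnswer">
<SubjectData SubjectKey="XXXXXXXX-b7cd-4f97-8d25-594219de192f" mdsol_SubjectKeyType="SubjectUUID" mdsol_SubjectName="XX-002">
<SiteRef LocationOID="15" />
</SubjectData>
</ClinicalData>
</XML>"""

COLUMNS = ["StudyOID", "MetaDataVersionOID", "mdsol_AuditSubCategoryName", "SubjectKey"]


def extract_rows(xml_text):
    """Return one [StudyOID, MetaDataVersionOID, AuditSubCategoryName, SubjectKey] row per SubjectData."""
    root = ElementTree.fromstring(xml_text)
    rows = []
    for clinical in root.iter("ClinicalData"):
        # These values are attributes on the tag, so read them with .get()
        # rather than .find(...).text, which looks for child elements.
        study_values = [clinical.get(name) for name in COLUMNS[:3]]
        for subject in clinical.findall("SubjectData"):
            rows.append(study_values + [subject.get("SubjectKey")])
    return rows


def rows_to_csv(rows):
    """Serialize the rows to CSV text, starting with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(rows)
    return buffer.getvalue()


rows = extract_rows(SAMPLE_XML)
print(rows_to_csv(rows))
```

To work on the real file, `ElementTree.parse('xml.xml').getroot()` could replace the `fromstring` call, and the CSV text could be written with `open('output.csv', 'w', newline='')`.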
Convert the list built in the question's script ("data") into a pandas DataFrame, as shown below, and write it to CSV:
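import pandas as pd
# Column names must be strings, and each row in data must hold four separate values
cols = ['StudyOID', 'MetaDataVersionOID', 'mdsol_AuditSubCategoryName', 'SubjectKey']
df = pd.DataFrame(data, columns=cols)
# Writing dataframe to csv
df.to_csv('output.csv')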
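For the DataFrame approach to work, each entry in `data` needs to be a list (or tuple) of four values rather than one comma-joined string, otherwise pandas cannot match the rows to the four column names. The following is a small self-contained sketch of that shape; the single row is hypothetical, copied from the sample XML in the question, and `csv_text` is a name introduced only for this example.

```python
import pandas as pd

cols = ["StudyOID", "MetaDataVersionOID", "mdsol_AuditSubCategoryName", "SubjectKey"]

# Hypothetical rows: each one is a list of four values, not a joined string,
# so that it lines up with the four column names.
data = [
    ["XXXXXXXXX", "53", "QueryAnswer", "XXXXXXXX-b7cd-4f97-8d25-594219de192f"],
]

df = pd.DataFrame(data, columns=cols)

# Called without a path, to_csv returns the CSV as a string.
# index=False leaves out the row-number column that to_csv writes by default.
csv_text = df.to_csv(index=False)
print(csv_text)
```

Passing a file name, as in `df.to_csv('output.csv', index=False)`, writes the same content to disk instead of returning it.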