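I have the following code, which searches a text file and extracts the portions of text between two markers, start = "a owl:Class " and end = ' .\n', and appends each portion to a list as an element.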
# Markers described in the text above
start = "a owl:Class "
end = ' .\n'

contents = []
with open(r'C:/Users/Jupyter Notebooks/small.ttl', 'r', encoding="UTF-8") as f:
    recording = False
    content = ''
    for line in f:
        if start in line:
            recording = True
        if recording and end in line:
            # End marker reached: store the block collected so far
            recording = False
            contents.append(content)
        if recording:
            content += line
        else:
            content = ''

for i in contents:
    print(i)
The list contents contains two elements, each a text string. Every line of text starts on a new line. First element:
http://purl.bioontology.org/ontology/SNOMEDCT/1075251000119104 a owl:Class ;
skos:prefLabel """Acquired left metatarsus adductus"""@en ;
skos:notation """1075251000119104"""^^xsd:string ;
skos:altLabel """Acquired left metatarsus adductus (disorder)"""@en ;
http://purl.bioontology.org/ontology/SNOMEDCT/has_finding_site http://purl.bioontology.org/ontology/SNOMEDCT/726438004 ;
http://purl.bioontology.org/ontology/SNOMEDCT/has_associated_morphology http://purl.bioontology.org/ontology/SNOMEDCT/767172008 ;
http://purl.bioontology.org/ontology/SNOMEDCT/occurs_in http://purl.bioontology.org/ontology/SNOMEDCT/767023003 ;
rdfs:subClassOf http://purl.bioontology.org/ontology/SNOMEDCT/99701000119102 ;
rdfs:subClassOf http://purl.bioontology.org/ontology/SNOMEDCT/774124003 ;
Second element:
http://purl.bioontology.org/ontology/SNOMEDCT/10308009 a owl:Class ;
skos:prefLabel """Argon-42"""@en ;
skos:notation """10308009"""^^xsd:string ;
skos:altLabel """42-Ar"""@en , """Argon-42 (substance)"""@en ;
rdfs:subClassOf http://purl.bioontology.org/ontology/SNOMEDCT/35016001 ;
http://purl.bioontology.org/ontology/SNOMEDCT/SUBSET_MEMBER """900000000000508004~ACCEPTABILITYID~900000000000548007"""^^xsd:string ;
http://purl.bioontology.org/ontology/SNOMEDCT/TYPE_ID """900000000000013009"""^^xsd:string ;
http://purl.bioontology.org/ontology/SNOMEDCT/CASE_SIGNIFICANCE_ID """900000000000017005"""^^xsd:string ;
http://purl.bioontology.org/ontology/SNOMEDCT/SUBSET_MEMBER """900000000000509007~ACCEPTABILITYID~900000000000548007"""^^xsd:string ;
http://purl.bioontology.org/ontology/SNOMEDCT/INACTIVATION_INDICATOR """723277005"""^^xsd:string ;
http://purl.bioontology.org/ontology/SNOMEDCT/SUBSET_MEMBER """900000000000490003~VALUEID~723277005"""^^xsd:string ;
http://purl.bioontology.org/ontology/SNOMEDCT/SUBSET_MEMBER """900000000000509007~ACCEPTABILITYID~900000000000548007"""^^xsd:string ;
http://purl.bioontology.org/ontology/SNOMEDCT/TYPE_ID """900000000000013009"""^^xsd:string ;
http://purl.bioontology.org/ontology/SNOMEDCT/CASE_SIGNIFICANCE_ID """900000000000448009"""^^xsd:string ;
http://purl.bioontology.org/ontology/SNOMEDCT/SUBSET_MEMBER """900000000000508004~ACCEPTABILITYID~900000000000548007"""^^xsd:string ;
I want to go through each list element line by line and check whether the line contains a certain string. For example:
if "a owl:Class" in line:
    print(line)
My problem is that I can't iterate over the lines inside each list element.
Here is how I did it.
First, I turned your two elements into strings and put them in a list.
string1 = 'http://purl.bioontology.org/ontology/SNOMEDCT/1075251000119104 a owl:Class ; skos:prefLabel """Acquired left metatarsus adductus"""@en ; skos:notation """1075251000119104"""^^xsd:string ; skos:altLabel """Acquired left metatarsus adductus (disorder)"""@en ; http://purl.bioontology.org/ontology/SNOMEDCT/has_finding_site http://purl.bioontology.org/ontology/SNOMEDCT/726438004 ; http://purl.bioontology.org/ontology/SNOMEDCT/has_associated_morphology http://purl.bioontology.org/ontology/SNOMEDCT/767172008 ; http://purl.bioontology.org/ontology/SNOMEDCT/occurs_in http://purl.bioontology.org/ontology/SNOMEDCT/767023003 ; rdfs:subClassOf http://purl.bioontology.org/ontology/SNOMEDCT/99701000119102 ; rdfs:subClassOf http://purl.bioontology.org/ontology/SNOMEDCT/774124003 ;'
string2 = 'http://purl.bioontology.org/ontology/SNOMEDCT/10308009 a owl:Class ; skos:prefLabel """Argon-42"""@en ; skos:notation """10308009"""^^xsd:string ; skos:altLabel """42-Ar"""@en , """Argon-42 (substance)"""@en ; rdfs:subClassOf http://purl.bioontology.org/ontology/SNOMEDCT/35016001 ; http://purl.bioontology.org/ontology/SNOMEDCT/SUBSET_MEMBER """900000000000508004~ACCEPTABILITYID~900000000000548007"""^^xsd:string ; http://purl.bioontology.org/ontology/SNOMEDCT/TYPE_ID """900000000000013009"""^^xsd:string ; http://purl.bioontology.org/ontology/SNOMEDCT/CASE_SIGNIFICANCE_ID """900000000000017005"""^^xsd:string ; http://purl.bioontology.org/ontology/SNOMEDCT/SUBSET_MEMBER """900000000000509007~ACCEPTABILITYID~900000000000548007"""^^xsd:string ; http://purl.bioontology.org/ontology/SNOMEDCT/INACTIVATION_INDICATOR """723277005"""^^xsd:string ; http://purl.bioontology.org/ontology/SNOMEDCT/SUBSET_MEMBER """900000000000490003~VALUEID~723277005"""^^xsd:string ; http://purl.bioontology.org/ontology/SNOMEDCT/SUBSET_MEMBER """900000000000509007~ACCEPTABILITYID~900000000000548007"""^^xsd:string ; http://purl.bioontology.org/ontology/SNOMEDCT/TYPE_ID """900000000000013009"""^^xsd:string ; http://purl.bioontology.org/ontology/SNOMEDCT/CASE_SIGNIFICANCE_ID """900000000000448009"""^^xsd:string ; http://purl.bioontology.org/ontology/SNOMEDCT/SUBSET_MEMBER """900000000000508004~ACCEPTABILITYID~900000000000548007"""^^xsd:string ;'
contents = [string1,string2]
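You should be able to copy my code from this point on.
First, iterate over each item in the list.
Then split each item into a list, using ; as the delimiter.
Then iterate over each piece of that list and look for your string.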
for content in contents:
    # Each ';'-separated piece is treated as one "line"
    for line in content.split(';'):
        if 'a owl:Class' in line:
            print(line)
This is my output:
http://purl.bioontology.org/ontology/SNOMEDCT/1075251000119104 a owl:Class
http://purl.bioontology.org/ontology/SNOMEDCT/10308009 a owl:Class
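If the list elements still contain their original line breaks, as the question describes, splitting on the newlines with str.splitlines() may be closer to "line by line" than splitting on ;. The following is only a minimal sketch under that assumption: find_lines is a hypothetical helper name, and the sample contents is shortened, made-up data modelled on the two elements above.

```python
def find_lines(contents, needle):
    """Return every line, from any element of contents, that contains needle."""
    matches = []
    for content in contents:
        # splitlines() breaks a multi-line string at its newline characters
        for line in content.splitlines():
            if needle in line:
                matches.append(line.strip())
    return matches


# Shortened sample data modelled on the question's two elements (assumption)
contents = [
    'http://purl.bioontology.org/ontology/SNOMEDCT/1075251000119104 a owl:Class ;\n'
    '    skos:prefLabel """Acquired left metatarsus adductus"""@en ;\n',
    'http://purl.bioontology.org/ontology/SNOMEDCT/10308009 a owl:Class ;\n'
    '    skos:prefLabel """Argon-42"""@en ;\n',
]

for line in find_lines(contents, 'a owl:Class'):
    print(line)
```

Unlike the split(';') version, this keeps the trailing ; on each matched line, and it would not break a literal that happens to contain a semicolon.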