How do I open a file in Python, read the comment lines ("#"), find a word after the comment marker, and select the word that follows it?

>I have a function that loops through a file that looks like this:

"#" XDI/1.0 XDAC/1.4 Athena/0.9.25
"#" Column.4:                      pre_edge
Content

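That is, each "#" is followed by a comment. My function is meant to read every line and, if the line starts with a specific word, select whatever comes after the ":".

For example, given the two comment lines above, I want to read through them, and if a line starts with "#" and contains the word "Column.4", the word "pre_edge" should be stored.

An example of my current approach: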

with open(file, "r") as f:
    for line in f:
        if line.startswith('#'):
            word = line.split(" Column.4:")[1]
        else:
            print("\n")

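I think my trouble is this: once I have found a line that starts with "#", how do I parse or search it, and save its content if it contains the desired word?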

If the "#" comment contains the string Column.4: as described above, you can parse it this way.

with open(filepath) as f:
    for line in f:
        if line.startswith('#'):
            # Comment lines are processed here
            if 'Column.4' in line:
                first, remainder = line.split('Column.4: ')
                # remainder contains everything after '# Column.4: '
                # To get the first word:
                word = remainder.split()[0]
        else:
            # Non-comment lines can be processed here
            pass

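Note

Also, iterating with for line in f: rather than calling f.readlines() (as other answers do) is good practice, because the file is then read one line at a time instead of being loaded into memory all at once.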
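The same idea can be wrapped in a small function that stops reading as soon as the value is found. This is only a sketch: the name find_header_value and its key parameter are made up for illustration, and it assumes the comment lines start with a bare # (no quotes). It uses str.partition, which does not raise when the key is missing.

```python
import os
import tempfile


def find_header_value(path, key="Column.4:"):
    """Return the first word after `key` on a '#' comment line, or None."""
    with open(path) as f:
        for line in f:  # lazy iteration: one line at a time
            if not line.startswith("#"):
                continue
            # partition returns an empty separator when key is absent
            _, sep, remainder = line.partition(key)
            if sep:
                words = remainder.split()
                return words[0] if words else None
    return None


# Demo on a temporary copy of the sample file (bare '#' assumed)
sample = (
    "# XDI/1.0 XDAC/1.4 Athena/0.9.25\n"
    "# Column.4:                      pre_edge\n"
    "Content\n"
)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
try:
    print(find_header_value(tmp.name))  # pre_edge
finally:
    os.remove(tmp.name)
```

Because the function returns on the first match, the rest of the file is never read, which matters for long data files with a short header.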

You should first read the file into a list and then do this:

file = 'test.txt'  # <- call the file whatever you want
with open(file, "r") as f:
    txt = f.readlines()
    for line in txt:
        if line.startswith('"#"'):
            word = line.split(" Column.4: ")
            try:
                print(word[1])
            except IndexError:
                print(word)
        else:
            print("\n")

Output:

>>> ['"#" XDI/1.0 XDAC/1.4 Athena/0.9.25\n']
>>> pre_edge

The try/except block is there because the first line also starts with "#", and it cannot be split with your current logic.

Also, as a side note, the file in your question has lines that start with a quoted "#", so the startswith() argument was changed accordingly.
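If it is unclear whether the real file uses a bare # or a quoted "#", both can be accepted at once, since str.startswith takes a tuple of prefixes. The sketch below also avoids the try/except by checking whether the key was found before indexing. The names COMMENT_MARKERS, is_comment and value_after are hypothetical, chosen only for this example.

```python
# Accepted comment prefixes: bare '#' and quoted '"#"' (an assumption
# based on the two forms that appear in this thread)
COMMENT_MARKERS = ("#", '"#"')


def is_comment(line):
    """True if the line starts with one of the comment markers."""
    return line.lstrip().startswith(COMMENT_MARKERS)


def value_after(line, key="Column.4:"):
    """Return the first word after `key` on a comment line, else None."""
    if not is_comment(line):
        return None
    _, sep, remainder = line.partition(key)
    words = remainder.split()
    # sep is empty when key does not occur in the line
    return words[0] if sep and words else None


lines = [
    '"#" XDI/1.0 XDAC/1.4 Athena/0.9.25\n',
    '"#" Column.4:                      pre_edge\n',
    'Content\n',
]
print([value_after(line) for line in lines])  # [None, 'pre_edge', None]
```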

with open('stuff.txt', 'r') as f:
    data = f.readlines()
for line in data:
    words = line.split()
    if words and ('#' in words[0]) and ("Column.4:" in words):
        print(words[-1])
# pre_edge
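When more than one header field is needed, a regular expression can collect every "Name: value" comment line in one pass. This is a rough sketch, not part of any answer above: the pattern HEADER_FIELD and the function parse_header_fields are invented for illustration, and the pattern assumes the key contains no spaces and that only the first word of the value is wanted.

```python
import re

# Optional quotes around '#', then a key ending in ':', then the first
# word of the value. Hypothetical pattern, for illustration only.
HEADER_FIELD = re.compile(r'^"?#"?\s*(?P<key>[^:\s]+):\s*(?P<value>\S+)')


def parse_header_fields(lines):
    """Map each header key to the first word of its value."""
    fields = {}
    for line in lines:
        match = HEADER_FIELD.match(line)
        if match:
            fields[match.group("key")] = match.group("value")
    return fields


lines = [
    '"#" XDI/1.0 XDAC/1.4 Athena/0.9.25\n',
    '"#" Column.4:                      pre_edge\n',
    'Content\n',
]
print(parse_header_fields(lines))  # {'Column.4': 'pre_edge'}
```

Any iterable of lines works here, including an open file object, so the file does not have to be loaded with readlines() first.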
