Combine files as conveniently as possible



Suppose I have the following Markdown files:

    1.md # contains 'foo'
    2.md # contains 'bar'
    3.md # contains 'zoo'
    4.md # contains 'zxc'

With the cat command, merging them is easy:

    $ cat {1..4}.md > merged_5.md

Python, however, takes several steps to achieve the same result.

  1. Create read and write functions

    def read(filename):
        with open(filename) as file:
            content = file.read()
            return content
    def write(filename, content):
        with open(filename, 'w') as file:
            file.write(content)
    
  2. Find the matching files

    import glob
    filenames = glob.glob('*.md')
    In [17]: filenames
    Out[17]: ['1.md', '2.md', '3.md', '4.md']
    
  3. Read and combine

    def combine(filenames):
        merged_content = ""
        for filename in filenames:
            content = read(filename)
            merged_content += content
        write('merged.md', merged_content)
    
  4. Wrap everything in a main function and save the script as 'combine_files.py'

    def main():
        filenames = glob.glob('*.md')
        combine(filenames)
    if __name__ == '__main__':
        main()
    
  5. Run it from the command line

    python3 combine_files.py
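
One difference from the shell version is worth noting: `cat {1..4}.md` names the files in a fixed order, whereas `glob.glob` does not promise any particular order. Below is a minimal sketch, assuming lexicographic order is what you want. The helper name `combine_sorted` is hypothetical and not part of the script above.

```python
import glob
import os


def combine_sorted(pattern, out_name):
    """Concatenate files matching pattern into out_name, in sorted order."""
    out_path = os.path.abspath(out_name)
    # Sort so the result is deterministic, and skip the output file
    # itself in case it also matches the pattern.
    filenames = sorted(
        name for name in glob.glob(pattern)
        if os.path.abspath(name) != out_path
    )
    parts = []
    for name in filenames:
        with open(name) as f:
            parts.append(f.read())
    with open(out_name, "w") as out:
        out.write("".join(parts))
    return filenames
```

Calling `combine_sorted('*.md', 'merged.md')` would then merge the files in name order. Note that a plain sort places '10.md' before '2.md', so numbered files beyond 9 would need a numeric sort key.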

This is not as convenient as the 'cat' command.

How can I refactor the code to make it as convenient as possible?

How about this?

    import glob

    with open('merged.md', 'w') as out_f:
        for filename in glob.glob('*.md'):
            with open(filename) as f:
                out_f.write(f.read())
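
Two things about the loop above may matter in practice: each file is read fully into memory, and because 'merged.md' is created before `glob.glob('*.md')` runs, the output file itself matches the pattern and may be read back into itself. The following is only a sketch of one way around both, assuming you build the list of inputs before opening the output. It streams with `shutil.copyfileobj`, and the function name `concat_stream` is hypothetical.

```python
import shutil


def concat_stream(out_name, filenames):
    """Copy each file in filenames into out_name, chunk by chunk."""
    with open(out_name, "wb") as out_f:
        for filename in filenames:
            with open(filename, "rb") as f:
                # copyfileobj copies in fixed-size chunks, so large
                # files are never held in memory all at once.
                shutil.copyfileobj(f, out_f)
```

For the four example files this could be called as `concat_stream('merged.md', ['1.md', '2.md', '3.md', '4.md'])`, which keeps the output file out of the input list.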

Easy peasy:

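    def cat(out, *src):
        '''Concatenate files, separated by newlines'''
        chunks = []
        for name in src:
            # Read each source in binary mode and close it promptly
            with open(name, 'rb') as f:
                chunks.append(f.read())
        with open(out, 'wb') as f:
            f.write(b'\n'.join(chunks))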

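You can now call it with cat('merged.md', *glob.glob('*.md')). Note the star: cat takes the source files as separate arguments, so the list returned by glob.glob has to be unpacked. How's that for convenient? It is certainly much simpler than the GNU Coreutils source.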
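
To get closer to the feel of the shell command, the function could be wrapped in a small command-line entry point. This is a rough sketch rather than a finished tool: the `main` function, its argument names, and the script name 'cat.py' mentioned below are all assumptions made for illustration.

```python
import argparse


def cat(out, *src):
    """Concatenate the files in src into out, separated by newlines."""
    chunks = []
    for name in src:
        with open(name, "rb") as f:
            chunks.append(f.read())
    with open(out, "wb") as f:
        f.write(b"\n".join(chunks))


def main(argv=None):
    """Parse 'OUT SRC [SRC ...]' and concatenate the sources into OUT."""
    parser = argparse.ArgumentParser(description="Concatenate files")
    parser.add_argument("out", help="file to write")
    parser.add_argument("src", nargs="+", help="files to read, in order")
    # argv=None makes argparse fall back to sys.argv[1:]
    args = parser.parse_args(argv)
    cat(args.out, *args.src)
```

From Python, `main(['merged.md', '1.md', '2.md'])` merges the two files. Saved as a hypothetical 'cat.py' and ended with the usual `if __name__ == '__main__': main()` guard, it could be run as `python3 cat.py merged.md {1..4}.md`, letting the shell expand the file names.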
