How to open a file, process it, and save the processed file under the original file name with "Clean" added



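What I want to do:

  • Open a file and read it

  • Create a new file name from the file's original name with "Clean" added, so I can tell the file has been processed and now has a different name.

  • Process the file

  • Save the file under the new name I created

I have researched online what is wrong with my script. There are many posts about changing file names, but I did not find one that takes the original file name, adds to it, and then saves the processed file under the new name.

This is the script: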

import os
import sys
import nltk
from nltk import word_tokenize
#walk through the files in the directory
for (dirpath, dirnames, filenames) in os.walk(cwd):
    for filename in filenames:
        with open(filename,'r',encoding='utf-8', errors='ignore') as filein:
            fileinA = filein.read()
        #create the new file name

        name = "Clean " + filein.name
        print ("New Name to use for Renaming : " ,name)
        print ( )
        print("File read in")
        print ( )
        print ("fileinA", fileinA)
        #change file contents to lower case
        print ( )
        print ("Lower Case")
        print ( )
        fileinB=fileinA.lower()
        print("fileinB", fileinB)
        print( )
        #rename the file using the name created
        os.rename("fileinB","name")

This is the output:

New Name to use for Renaming :  Clean 2022 Q3 IOU ErnCls FirstEnerg.txt
File read in
fileinA 
Good afternoon, and thank you for joining NorthWestern Corporation's Financial Results Webcast for the Quarter Ending September 30, 2022. My name is Travis Meyer. I'm the Director of Corporate Finance and Investor Relations Officer 
Lower Case
fileinB 
good afternoon, and thank you for joining northwestern corporation's financial results webcast for the quarter ending september 30, 2022. my name is travis meyer. i'm the director of corporate finance and investor relations officer 
Traceback (most recent call last):
  File "F:/Python/Scripts RJS/Cleanup File for Processing/File Rename.py", line 63, in <module>
    os.rename("fileinB","name")
FileNotFoundError: [WinError 2] The system cannot find the file specified: 'fileinB' -> 'name'

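os.rename() renames a file that already exists on disk, and the quoted "fileinB" and "name" are literal strings, not your variables, so Python looks for a file that is actually called fileinB. The processed text only exists in memory, so it has to be written out instead. Change

os.rename("fileinB","name")

to:

with open(name, "w") as fileout:
    fileout.write(fileinB)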
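For context, here is a minimal sketch of how that write could sit inside the os.walk loop from the question. One thing to keep in mind: filein.name is whatever path was passed to open(), so prefixing it with "Clean " only gives a sensible name when that path is a bare file name. The sketch builds both paths from dirpath and filename instead. The function name write_clean_copies and the rule of skipping files that already start with "Clean " are my own assumptions, not part of the original script.

```python
import os


def write_clean_copies(root):
    """Write a lower-cased copy of every file under root as 'Clean <name>'.

    Hypothetical helper for illustration; returns the paths it wrote.
    """
    written = []
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            # Assumption: files already prefixed with "Clean " come from an
            # earlier run and should not be processed again.
            if filename.startswith("Clean "):
                continue
            # os.walk yields bare names, so join them with dirpath.
            src = os.path.join(dirpath, filename)
            dst = os.path.join(dirpath, "Clean " + filename)
            with open(src, "r", encoding="utf-8", errors="ignore") as filein:
                text = filein.read()
            # The processed text lives in memory, so write it out
            # rather than renaming anything.
            with open(dst, "w", encoding="utf-8") as fileout:
                fileout.write(text.lower())
            written.append(dst)
    return written
```

Calling write_clean_copies(os.getcwd()) would process the current directory tree and leave the original files untouched.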

I would use glob to search the files in the folder recursively:

import os
import glob

path_to_folder = "path/to/folder"  # placeholder: the folder to process

# recursively go through files in `path_to_folder`
# (`**` only descends into subfolders when recursive=True)
for path in glob.glob(f"{path_to_folder}/**/*", recursive=True):
    if not os.path.isfile(path):
        continue  # the pattern also matches directories
    with open(path) as f:
        content = f.read()
    # split filename and dirname
    dirname, filename = os.path.dirname(path), os.path.basename(path)
    # construct new path
    new_path = os.path.join(dirname, f"clean_{filename}")
    # make content lowercase and save it
    with open(new_path, 'w') as f:
        f.write(content.lower())
    # remove old file
    os.remove(path)
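The same idea can also be written with pathlib, which might read a little more directly. This is only a sketch under a few assumptions of mine: the function name clean_folder, the prefix and remove_original parameters, and the rule of skipping files that already carry the prefix are all hypothetical. Deleting the original is off by default here, so nothing is lost while you are still testing.

```python
from pathlib import Path


def clean_folder(folder, prefix="clean_", remove_original=False):
    """Save a lower-cased copy of each file under folder as '<prefix><name>'.

    Hypothetical helper for illustration; returns the new paths.
    """
    folder = Path(folder)
    # Take a snapshot of the listing first, so the files created below
    # are not picked up again while iterating.
    sources = [
        p for p in folder.rglob("*")
        if p.is_file() and not p.name.startswith(prefix)
    ]
    results = []
    for src in sources:
        # with_name keeps the directory and swaps only the file name.
        dst = src.with_name(prefix + src.name)
        text = src.read_text(encoding="utf-8", errors="ignore")
        dst.write_text(text.lower(), encoding="utf-8")
        if remove_original:
            src.unlink()
        results.append(dst)
    return results
```

For example, clean_folder("transcripts", prefix="Clean ") would keep the originals and write "Clean <name>" next to each one, where "transcripts" stands in for your own folder.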
