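I'm trying to find a way to add a function to my script that ignores or removes the first row of a CSV file. I know we can do this with pandas, but is it also possible without pandas?
Thank you very much for your help.
Here is my code: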
from os import mkdir
from os.path import join, splitext, isdir
from glob import iglob
from csv import DictReader
from collections import defaultdict
from urllib.request import urlopen
from shutil import copyfileobj

csv_folder = r"/Users/folder/PycharmProjects/pythonProject/CSVfiles/"
glob_pattern = "*.csv"

for file in iglob(join(csv_folder, glob_pattern)):
    with open(file) as csv_file:
        reader = DictReader(csv_file)

        save_folder, _ = splitext(file)
        if not isdir(save_folder):
            mkdir(save_folder)

        title_counter = defaultdict(int)

        for row in reader:
            url = row["link"]
            title = row["title"]

            title_counter[title] += 1
            _, ext = splitext(url)

            save_filename = join(save_folder, f"{title}_{title_counter[title]}{ext}".replace('/', '-'))
            print(f"'{save_filename}'")

            with urlopen(url) as req, open(save_filename, "wb") as save_file:
                copyfileobj(req, save_file)
Use the next() function to skip the first row of the CSV.
with open(file) as csv_file:
    reader = DictReader(csv_file)
    # skip first row
    next(reader)
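One thing worth checking is which "first row" you mean. DictReader already consumes the first line of the file as the field names, so calling next(reader) afterwards drops the first data row. If the line you want to get rid of sits above the real header, you would call next() on the file object before building the DictReader. Here is a minimal sketch of both variants; the sample data and the two function names are made up for illustration and are not part of the original script.

```python
import io
from csv import DictReader

# Hypothetical sample data, only for illustration.
SAMPLE_WITH_JUNK = "junk line\ntitle,link\na,u1\nb,u2\n"
SAMPLE_PLAIN = "title,link\na,u1\nb,u2\n"


def rows_skipping_first_line(text_file):
    """Drop the first physical line, then let DictReader read the header."""
    next(text_file, None)  # the default avoids StopIteration on an empty file
    return list(DictReader(text_file))


def rows_skipping_first_record(text_file):
    """Let DictReader read the header, then drop the first data row."""
    reader = DictReader(text_file)
    next(reader, None)
    return list(reader)


after_junk = rows_skipping_first_line(io.StringIO(SAMPLE_WITH_JUNK))
after_record = rows_skipping_first_record(io.StringIO(SAMPLE_PLAIN))
```

In the first case both data rows survive and the "title" and "link" columns are still recognised; in the second case only the row for "b" is left.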
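You can read the raw text from the file as usual, then split the text on newlines and delete the first line:
with open(filename, 'r') as file:  # Open the file; it is closed automatically
    content = file.read()  # Read the whole file into memory
lines = content.split("\n")  # Split the text on the newline character
del lines[0]  # Delete the first element of the resulting list, i.e. the first line
For larger CSV files, though, this can be slow and memory-hungry because the whole file is read at once, so it may not be the best solution.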
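If file size is a concern, one possible alternative is to let the file object hand out lines lazily and use itertools.islice to drop the first one, so the whole file never has to sit in memory. This is only a sketch: lines_without_first is a hypothetical helper name, and the temporary file just stands in for a real CSV.

```python
import os
import tempfile
from itertools import islice


def lines_without_first(path):
    """Yield every line of the file except the first, one at a time."""
    with open(path) as handle:
        # islice(handle, 1, None) skips one item, then passes the rest through lazily.
        yield from islice(handle, 1, None)


# Small demonstration with a throwaway file (hypothetical contents).
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("title,link\na,u1\nb,u2\n")

remaining = list(lines_without_first(tmp.name))
os.remove(tmp.name)
```

Note that this works on physical lines, so it assumes the line being dropped does not contain a quoted field that spans several lines.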
Alternatively, you could simply skip the first row inside the for loop. Instead of:
...
for row in reader:
    ...
could you use
...
for row_num, row in enumerate(reader):
    if row_num == 0:
        continue
    ...
instead? I think that should skip the first row.
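The same idea can be wrapped in a small generator so the loop body stays untouched. A rough sketch, assuming you want to drop a fixed number of leading rows; skip_first and the sample data are made up for illustration. It iterates the reader directly, so nothing is copied into a list first.

```python
import io
from csv import DictReader


def skip_first(rows, count=1):
    """Yield items from rows, ignoring the first `count` of them."""
    for index, row in enumerate(rows):
        if index < count:
            continue
        yield row


# Hypothetical sample data; DictReader uses the first line as the header.
sample = io.StringIO("title,link\na,u1\nb,u2\nc,u3\n")
titles = [row["title"] for row in skip_first(DictReader(sample))]
```

In the original script this would mean writing `for row in skip_first(reader):` in place of `for row in reader:`.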