The output has three lines, but the txt file has only one



I have a very basic question about my code below:

#Python code to scrape the shipment URLs
from bs4 import BeautifulSoup
import urllib.request
import urllib.error
import urllib
# read urls of websites from text file > change it to where you stock the file
list_open = open(r"C:\Users\**\data.csv")
#skips the header
read_list  = list_open.readlines()[1:]
import os
file_path = os.path.join(r'c:\**', 'ShipmentUpdates.txt')

for url in read_list:
    soup = BeautifulSoup(urllib.request.urlopen(url).read(), "html5lib")
    # parse shipment info
    shipment = soup.find_all("span")
    Preparation = shipment[0]
    Sent = shipment[1]
    InTransit = shipment[2]
    Delivered = shipment[3]
    url = url.strip()
    line= f"{url} ; Preparation {Preparation.getText()}; Sent {Sent.getText()}; InTransit {InTransit.getText()}; Delivered {Delivered.getText()}"
    print (line)
file=r'c:\**\ShipmentUpdates.txt'
with open(file, 'w') as filetowrite:
    filetowrite.write(line+'\n')

In my output, I have three lines:

http://carmoov.fr/CfQd ; Preparation on 06/01/2022 at 17:45; Sent on 06/01/2022 at 18:14; InTransit ; Delivered on 07/01/2022 at 10:31
http://carmoov.fr/CfQs ; Preparation on 06/01/2022 at 15:01; Sent on 06/01/2022 at 18:14; InTransit ; Delivered on 07/01/2022 at 11:27
http://carmoov.fr/CfQz ; Preparation on 06/01/2022 at 11:18; Sent on 06/01/2022 at 18:14; InTransit ; Delivered on 07/01/2022 at 11:56

But in my text file, there is only one line:

http://carmoov.fr/CfQz ; Preparation on 06/01/2022 at 11:18; Sent on 06/01/2022 at 18:14; InTransit ; Delivered on 07/01/2022 at 11:56

I need exactly the same three lines in the text file. What is wrong here? Thanks in advance!

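The last line of code in the loop keeps reassigning line, overwriting (replacing) whatever value it had before. The only thing that ends up written to the file is the last value of line.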
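
To see why only one line survives, here is a minimal sketch with hypothetical stand-in data (the strings and the temporary file are illustrative, not taken from the original script). Reassigning line inside the loop discards the previous value, and a single write after the loop stores only what is left:

```python
import os
import tempfile

# Hypothetical stand-ins for the scraped results.
results = ["first", "second", "third"]

for item in results:
    line = f"{item} ; status"  # each pass replaces the previous value of line

# Hypothetical output location inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w") as fh:  # "w" also truncates any existing content
    fh.write(line + "\n")    # runs once, after the loop has finished

with open(path) as fh:
    saved = fh.read().splitlines()

print(saved)  # ['third ; status']
```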

I suggest you keep a list of lines in the loop:

lines = []
for url in read_list:
    ...
    line= f"{url} ; Preparation ..."
    lines.append(line)
    print (line)

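Then, write that list using the file's writelines() method.

Despite its name, writelines() does not add line endings (which are what would make the text into "lines"), so you have to add them yourself with line+'\n':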

file=r'c:\**\ShipmentUpdates.txt'
with open(file, 'w') as filetowrite:
    filetowrite.writelines([line+'\n' for line in lines])
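
Putting the pieces together, a self-contained sketch might look like the following. The format_line helper and the sample records are hypothetical placeholders for the scraping step, and the output goes to a temporary directory rather than the real path:

```python
import os
import tempfile


def format_line(url, stages):
    """Hypothetical helper: build one output line from a URL and four status strings."""
    prep, sent, transit, delivered = stages
    return (f"{url} ; Preparation {prep}; Sent {sent}; "
            f"InTransit {transit}; Delivered {delivered}")


# Hypothetical sample data standing in for the scraped pages.
records = [
    ("http://example.com/a", ("on 01/01", "on 02/01", "", "on 03/01")),
    ("http://example.com/b", ("on 04/01", "on 05/01", "", "on 06/01")),
]

lines = []
for url, stages in records:
    lines.append(format_line(url, stages))  # collect every line, not just the last

out_path = os.path.join(tempfile.mkdtemp(), "ShipmentUpdates.txt")
with open(out_path, "w") as filetowrite:
    # writelines() writes the strings exactly as given, so add "\n" to each one
    filetowrite.writelines(line + "\n" for line in lines)

with open(out_path) as fh:
    written = fh.read().splitlines()
```

Because the file is opened once and every collected line is written in a single call, the file ends up with one row per URL, matching what was printed.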

Change this:

line= f"{url} ; Preparation {Preparation.getText()}; Sent {Sent.getText()}; InTransit {InTransit.getText()}; Delivered {Delivered.getText()}"

To this:

line += f"{url} ; Preparation {Preparation.getText()}; Sent {Sent.getText()}; InTransit {InTransit.getText()}; Delivered {Delivered.getText()}\n"

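You need to concatenate rather than replace. For += to work, line has to be initialized to an empty string before the loop.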
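
A minimal sketch of the concatenation approach, assuming hypothetical sample URLs in place of the scraped data. The accumulator name output is an illustrative choice that keeps it distinct from the per-iteration text:

```python
# Hypothetical sample URLs standing in for read_list (each still ends with "\n").
urls = ["http://example.com/a\n", "http://example.com/b\n"]

output = ""  # must exist before the loop, otherwise += raises NameError
for url in urls:
    url = url.strip()
    output += f"{url} ; Preparation ...\n"  # append to the text instead of replacing it

print(output, end="")
```

Writing output once after the loop, with the file opened in 'w' mode, then stores every line rather than only the last one.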

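You are writing to the file after the loop has finished, so in your case only the last stored line gets written. Try accumulating all of the lines instead: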

line += f"{url} ; Preparation {Preparation.getText()}; Sent {Sent.getText()}; InTransit {InTransit.getText()}; Delivered {Delivered.getText()}" + "\n"
