
Python: reading URLs from a file only fetches the last URL

Keywords: url, last url only, read file, Python, python-2.7
Updated: 2023-09-09
Original title: Python read urls from file only gets last url

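I'm trying to read a list of URLs from a file and then print the HTML inside a given class (the div elements with class location-content). It works, but only for the last URL in the list, and I can't figure out why. I've set a timeout and so on, but every URL except the last one still just returns an empty response.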

#!/usr/bin/env python
# -*- coding: utf-8 -*- 
from bs4 import BeautifulSoup
import requests
import time
with open('/Users/usrname/Desktop/links.txt') as f:
    for line in f:
        print(line)
        html_doc  = requests.get( line, verify=False, timeout=2 )
        soup = BeautifulSoup(html_doc.text, 'html.parser')
        #time.sleep(1.3) # seconds         
        print (soup.find_all("div", "location-content"))

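Iterating over a file yields each line with its line ending still attached. The last line in your file has no trailing newline, whereas every other line does, so for those other lines the string passed to requests.get() ends in a newline character and is not a valid URL. You need to strip the line ending with rstrip() before using the line: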

for line in f:
    line = line.rstrip()
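
To make the cause visible without touching the network, here is a minimal sketch that simulates the file with io.StringIO. The helper name read_urls and the example.com URLs are assumptions made up for illustration; they are not part of the original question or answer. It uses strip() rather than rstrip() so that leading whitespace is also removed, and it skips blank lines.

```python
import io


def read_urls(lines):
    """Yield each non-blank line with surrounding whitespace removed.

    Hypothetical helper: 'lines' can be an open file or any iterable of strings.
    """
    for line in lines:
        url = line.strip()  # drops the trailing newline as well as stray spaces
        if url:             # skip empty lines
            yield url


# Simulated file contents: every line except the last ends with a newline.
sample = io.StringIO(
    "http://example.com/a\n"
    "http://example.com/b\n"
    "\n"
    "http://example.com/c"
)

raw = list(sample)              # lines exactly as the question's loop sees them
sample.seek(0)
urls = list(read_urls(sample))  # cleaned lines, safe to pass to requests.get()

print(raw)
print(urls)
```

In the original script, each cleaned value would then be passed to requests.get() in place of the raw line.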
