Looping through a CSV file but getting duplicate results (Python 3 / Selenium / BS4)



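I have a CSV file containing two URLs (there could be 100). I am trying to open each link in an existing browser session (for login reasons), print the URL, and print the scraped address. The URLs print correctly, but the address printed is always the first one. I'm sure I'm missing something simple.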

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup
chrome_driver = "C:/chromedriver.exe"
Chrome_options = Options()
Chrome_options.add_experimental_option("debuggerAddress", "127.0.0.1:9015")
driver = webdriver.Chrome(chrome_driver, options=Chrome_options)
source = driver.page_source
soup = BeautifulSoup(source, "html.parser")
with open('UTlinks.csv') as file:
    for line in file:
        driver.get(line)
        address = soup.find('span', class_='street-address').get_text()
        print(line + address)

Output:

https://www.redfin.com/UT/Mapleton/175-E-600-N-84664/home/81569604
175 E 600 N 
https://www.redfin.com/UT/Mapleton/1918-W-800-N-84664/home/103092024
175 E 600 N

Expected output:

https://www.redfin.com/UT/Mapleton/175-E-600-N-84664/home/81569604
175 E 600 N 
https://www.redfin.com/UT/Mapleton/1918-W-800-N-84664/home/103092024
1918 W 800 N

import io
from selenium import webdriver
from selenium.webdriver.common.by import By

# Stand-in for the CSV file: one URL per line.
f = io.StringIO("""https://www.redfin.com/UT/Mapleton/175-E-600-N-84664/home/81569604
https://www.redfin.com/UT/Mapleton/1918-W-800-N-84664/home/103092024
""")
driver = webdriver.Firefox()
for url in f.readlines():
    url = url.rstrip()
    # Open each URL in a new tab and switch to it.
    driver.execute_script("window.open()")
    driver.switch_to.window(driver.window_handles[-1])
    driver.get(url)
    # Query the live page, so the address always comes from the current URL.
    element = driver.find_element(By.CSS_SELECTOR, "span.street-address").text
    print(f"{driver.current_url}\n{element}")
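
The repeated address in the question comes from reading driver.page_source and building the soup once, before the loop, so every iteration searches the HTML of the first page. The sketch below illustrates this without a browser. FakeDriver, PAGES, street_address and the example.com URLs are hypothetical stand-ins invented for the illustration, and the regular expression is only a simplified substitute for the BeautifulSoup lookup.

```python
import re

# Hypothetical pages keyed by URL; these are not real Redfin responses.
PAGES = {
    "https://example.com/a": '<span class="street-address">175 E 600 N</span>',
    "https://example.com/b": '<span class="street-address">1918 W 800 N</span>',
}


class FakeDriver:
    """Hypothetical stand-in for a Selenium driver: get() swaps the loaded page."""

    def __init__(self, start_url):
        self.page_source = PAGES[start_url]

    def get(self, url):
        self.page_source = PAGES[url]


def street_address(html):
    # Simplified stand-in for soup.find('span', class_='street-address').get_text()
    match = re.search(r'<span class="street-address">(.*?)</span>', html)
    return match.group(1) if match else None


urls = list(PAGES)

# Snapshot taken once, before the loop: every iteration sees the first page.
driver = FakeDriver(urls[0])
snapshot = driver.page_source
stale = []
for url in urls:
    driver.get(url)
    stale.append(street_address(snapshot))

# Source re-read after each get(): every iteration sees the current page.
driver = FakeDriver(urls[0])
fresh = []
for url in urls:
    driver.get(url)
    fresh.append(street_address(driver.page_source))

print(stale)
print(fresh)
```

The first list repeats the first address, the second list follows the URLs, which matches the difference between the actual and expected output in the question.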

If the URLs are on separate lines of the CSV file, try this approach:

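import csv

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

chrome_driver = "C:/chromedriver.exe"
Chrome_options = Options()
Chrome_options.add_experimental_option("debuggerAddress", "127.0.0.1:9015")
# Positional driver path as in the question (Selenium 3 style).
driver = webdriver.Chrome(chrome_driver, options=Chrome_options)
with open('UTlinks.csv', newline='') as file:
    readCSV = csv.reader(file)
    for row in readCSV:
        if not row:
            continue  # skip blank lines
        url = row[0].strip()  # the URL is the first (only) column
        print(url)
        driver.get(url)
        # Re-read and re-parse the page source after every navigation.
        html_content = driver.page_source
        soup = BeautifulSoup(html_content, "lxml")
        address = soup.find('span', class_='street-address')
        print(address.text)

driver.quit()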

This is the output:

https://www.redfin.com/UT/Mapleton/175-E-600-N-84664/home/81569604
175 E 600 N
https://www.redfin.com/UT/Mapleton/1918-W-800-N-84664/home/103092024
1918 W 800 N
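
For the CSV-reading step on its own, here is a minimal sketch assuming one URL in the first column of each row. read_urls is a hypothetical helper name, and io.StringIO stands in for UTlinks.csv so the snippet runs without the file.

```python
import csv
import io


def read_urls(handle):
    """Yield the first column of each non-empty CSV row, stripped of whitespace."""
    for row in csv.reader(handle):
        if row and row[0].strip():
            yield row[0].strip()


# In-memory stand-in for UTlinks.csv, including a blank line to skip.
sample = io.StringIO(
    "https://www.redfin.com/UT/Mapleton/175-E-600-N-84664/home/81569604\n"
    "\n"
    "https://www.redfin.com/UT/Mapleton/1918-W-800-N-84664/home/103092024\n"
)
urls = list(read_urls(sample))
print(urls)
```

Taking row[0] avoids converting the row list to a string and stripping the brackets, and the strip() call removes the trailing newline or spaces that would otherwise be passed to driver.get().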
