Getting UnboundLocalError when scraping



I get this error while scraping:

UnboundLocalError: local variable 'tag' referenced before assignment

and it appears to be raised by

---> 17     return tag.select_one(".b-plainlist__date").text, tag.select_one(".b-plainlist__title").text, tag.find_next(class_="b-plainlist__announce").text.strip()

The code I am using is the following:

import requests
from bs4 import BeautifulSoup
from concurrent.futures import ThreadPoolExecutor
import pandas as pd

daterange = pd.date_range('02-25-2015', '09-16-2020', freq='D')

def main(req, date):
    r = req.get(f"website/{date.strftime('%Y%m%d')}")
    soup = BeautifulSoup(r.content, 'html.parser')
    for tag in soup.select(".b-plainlist "):
        print(tag.select_one(".b-plainlist__date").text)
        print(tag.select_one(".b-plainlist__title").text)
        print(tag.find_next(class_="b-plainlist__announce").text.strip())

    return tag.select_one(".b-plainlist__date").text, tag.select_one(".b-plainlist__title").text, tag.find_next(class_="b-plainlist__announce").text.strip()

with ThreadPoolExecutor(max_workers=30) as executor:
    with requests.Session() as req:
        fs = [executor.submit(main, req, date) for date in daterange]
        allin = []
        for f in fs:
            allin.append(f.result())  # the problem should be from here
        df = pd.DataFrame.from_records(
            allin, columns=["Date", "Title", "Content"])

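I tried applying some changes, as in this post: "UnboundLocalError: local variable 'text' referenced before assignment", but I don't think I have fully understood how to fix it.

Update: this is the response from the website and the output of print (soup.select("b-plainlist"))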

<Response [503]>
b'\n\n\n
\n HTTP 503\n\n

\n html{font-family:"Helvetica Neue",SimSun,sans-serif;}\nbody{background-color:#fff;padding:15px;}\ndiv.title{font-size:32px;font-weight:bold;line-height:1.2em;}\ndiv.sub-title{font-size:25px;}\ndiv.descr{margin-top:40px;}\ndiv.footer{margin-top:80px;color:#777;}div.guru{font-size:12px;color:#ccc;}\n\n\n\n Error 503\nService Unavailable\n\n\nTry to access the site again in a few minutes.

\nIf the error repeats several times, contact the site administration.

\n\\n\n\n IP: 107.181.177.10
\nRequest: GET L3BvbGl0aWNhLzIwMTUwMzA4
\nGuru Meditation: MGV1SjNTaWhuUHNiblJYVU96QVpxMDB6N1hDNjU5NTU=
\n\n\n\n\n\n'

Try declaring tag = None before the for loop, as shown below:

def main(req, date):
    r = req.get(f"website/{date.strftime('%Y%m%d')}")
    soup = BeautifulSoup(r.content, 'html.parser')
    tag = None
    for tag in soup.select(".b-plainlist "):
        ...  # loop body unchanged
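
The effect of the initialization can be checked without any network access. The sketch below is only an illustration: plain lists stand in for the result of soup.select(...), and the function names last_tag_unsafe and last_tag_guarded are hypothetical, not part of the original code.

```python
def last_tag_unsafe(tags):
    # Same shape as the code in the question: the loop variable
    # is used after the loop has finished.
    for tag in tags:
        pass
    return tag  # raises UnboundLocalError when tags is empty


def last_tag_guarded(tags):
    tag = None  # bound even when the loop body never runs
    for tag in tags:
        pass
    if tag is None:
        return None  # nothing matched; the caller decides what to do
    return tag
```

In the guarded version the caller has to handle a None result, for example by skipping that date before building the DataFrame.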

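The error occurs because control never enters the loop, so the variable tag is never assigned. When the function then tries to return tag.select_one(".b-plainlist__date"), Python raises UnboundLocalError. Note that with tag = None the same line raises AttributeError instead, because None has no select_one method, so the function should also return early when tag is still None after the loop.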
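
In the question's update the server answered with HTTP 503, which is why the selector matched nothing. One possible way to handle that is to retry before parsing. This is a sketch under assumptions: get_with_retry, its parameters, and the backoff delays are hypothetical, and fetch is any zero-argument callable that returns an object with a status_code attribute, such as lambda: req.get(url) with a requests session.

```python
import time


def get_with_retry(fetch, retries=3, delay=1.0, sleep=time.sleep):
    """Call fetch() again while the response status is 503.

    Returns the last response received, which may still be a 503
    if every attempt failed.
    """
    response = fetch()
    for attempt in range(retries):
        if response.status_code != 503:
            break
        # Wait longer after each failed attempt: delay, 2*delay, 4*delay, ...
        sleep(delay * (2 ** attempt))
        response = fetch()
    return response
```

The caller still needs to check the status of the returned response before parsing it. Using fewer than 30 workers may also make 503 responses less likely, although that depends on the site.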
