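I am getting the error below. The code (George's approach, https://stackoverflow.com/users/7173479/george) worked a few times at first and then started failing. It is probably related to the HTTP configuration, but I got lost in the AWS documentation. I am working in a Jupyter notebook. Can anyone help?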
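# Imports needed by the snippet below
import requests
from bs4 import BeautifulSoup
from requests_ip_rotator import ApiGateway

# Create gateway object and initialise in AWS
engine = 'https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q={}&btnG='
gateway = ApiGateway(engine,
                     access_key_id="KEY", access_key_secret="SECRET_KEY")
gateway.start()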
# Assign gateway to session
session = requests.Session()
session.mount(engine, gateway)
# Send request (IP will be randomised)
header = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36'}
search_string = '{}+and+{}+and+{}+and+{}'.format('term1','term2','term3','term4')
url = engine.format(search_string)
print(url)
response = session.get(url,headers=header)
tree = BeautifulSoup(response.content,'lxml')
result = tree.find('div',id='gs_ab_md')
print(response.status_code)
print(result.text)
print(len(result.text))
number = [int(s.replace('.', '').replace(',', '')) for s in result.text.split()
          if s.replace('.', '').replace(',', '').isdigit()]
# Delete gateways
====================================
BadRequestException: An error occurred (BadRequestException) when calling the PutIntegration operation: Invalid HTTP endpoint specified for URI
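The site parameter of the ApiGateway constructor in the requests-ip-rotator package expects exactly the site: the protocol, the domain name or IP address, and optionally the port. It must not contain any other part of the URI. In the question, the engine URL also carries the /scholar path and a query string, which is why the PutIntegration call rejects it as an invalid HTTP endpoint.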
If you change the constructor to this:
gateway = ApiGateway("https://scholar.google.com")
gateway.start()
it will construct the gateway endpoint correctly.
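
If you would rather keep the full search URL template in one place, the site can be derived from it instead of being typed twice. The following is only a sketch using the standard library; site_root is a hypothetical helper name, not part of requests-ip-rotator.

```python
from urllib.parse import urlsplit


def site_root(url: str) -> str:
    """Return only scheme://host[:port] of url (hypothetical helper).

    The path, query string, and fragment are dropped, leaving the kind of
    bare site value described above.
    """
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {url!r}")
    return f"{parts.scheme}://{parts.netloc}"


# The search URL template from the question
engine = 'https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q={}&btnG='

site = site_root(engine)
print(site)  # https://scholar.google.com
```

The resulting site value could then be passed to ApiGateway in place of engine, and most likely to session.mount as well, since the package's README mounts the gateway on the bare site; the request URL built with engine.format(...) stays unchanged.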