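The whole script ran fine the first 2-3 times, but now it keeps getting 503 responses.

I have checked my network many times, and there is nothing wrong with it.

Here is my code: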

from bs4 import BeautifulSoup
import requests, sys, os, json


def get_amazon_search_page(search):
    # the search string is manipulated by replacing all spaces with "+",
    # the same way the website builds its own search URLs
    search = search.strip().replace(" ", "+")
    url = "https://www.amazon.in/s?k={}&ref=nb_sb_noss".format(search)
    for i in range(3):  # try to fetch the page from amazon up to 3 times
        try:
            print("Searching...")
            response = requests.get(url)
            print(response.status_code)
            if response.status_code == 200:
                # return only the page body, since that is what the parser expects
                return response.content
        except requests.RequestException:
            pass
    print("Is the search valid for the site: {}".format(url))
    sys.exit(1)


def get_items_from_page(page_content):
    print(page_content)  # debug output: dumps the raw HTML
    soup = BeautifulSoup(page_content, "html.parser")  # soup for extracting information
    items = soup.find_all("span", class_="a-size-medium a-color-base a-text-normal")
    prices = soup.find_all("span", class_="a-price-whole")
    item_list = []
    total_price_of_all = 0
    for item, price in zip(items, prices):
        # strip thousands separators before converting the price to an integer
        price_value = int(price.text.replace(",", ""))
        item_list.append({"Name": item.text, "Price": price_value})
        total_price_of_all += price_value
    if not item_list:
        # avoid dividing by zero when nothing matched
        print("No items found for this search")
        return
    average_price = total_price_of_all / len(item_list)
    with open("items.json", "w") as file:
        json.dump(item_list, file, indent=4)
    print("Your search results are available in the items.json file")
    print("Average prices for the search: {}".format(average_price))


def main():
    os.system("clear")
    print("Note: Sometimes amazon site misbehaves by sending 503 responses, this can be due to heavy traffic on that site, please cooperate\n\n")
    search = input("Enter product name: ").strip()
    page_content = get_amazon_search_page(search)
    get_items_from_page(page_content)


if __name__ == "__main__":
    while True:
        main()

Please help!

The server is blocking you from scraping it. If you check robots.txt, you can see that the link you are trying to request is disallowed: Disallow: */s?k=*&rh=n*p_*p_*p_

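However, a simple way to get around this block is to change the User-Agent. By default, requests sends something like "python-requests/2.22.0". Changing it to something more browser-like will work temporarily.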
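As a minimal sketch of the User-Agent change, the snippet below sets a browser-like header on a requests.Session so that every request made through it carries that value. The header string, the SEARCH_URL constant and the helper names make_browser_session and get_search_page are illustrative assumptions, not values known to satisfy Amazon's checks, and the site may still answer with 503.

```python
import requests

# Assumption: an example browser-like User-Agent string, chosen for illustration only.
BROWSER_USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/120.0.0.0 Safari/537.36"
)
SEARCH_URL = "https://www.amazon.in/s"


def make_browser_session():
    """Return a Session whose requests carry a browser-like User-Agent."""
    session = requests.Session()
    # Replaces the default "python-requests/x.y.z" value for this session.
    session.headers.update({"User-Agent": BROWSER_USER_AGENT})
    return session


def get_search_page(session, search):
    """Fetch the search page; return the body, or None if the status is not 200."""
    # requests URL-encodes the query, so spaces need no manual replacement.
    response = session.get(SEARCH_URL, params={"k": search}, timeout=10)
    if response.status_code == 200:
        return response.content
    return None
```

Calling get_search_page(make_browser_session(), "usb cable") would then stand in for the bare requests.get call in the question. requests.utils.default_user_agent() shows the value that is sent when no header is set.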
