How to stop a Python script from exiting when an error occurs while scraping data from a website



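I wrote a Python script that scrapes data from a website at 1-minute intervals. Sometimes an error occurs and the script exits. Is there a way to keep the script running even after an error?

My code: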

from requests import Session
import pandas as pd
from bs4 import BeautifulSoup
import re

def get_option_chain(symbol, expdate):
    if symbol == 'NIFTY':
        Base_url = ("https://www.nseindia.com/live_market/dynaContent/live_watch/option_chain/optionKeys.jsp?symbol="+symbol+"&date="+expdate)

    # ... (request and parsing code not shown in the post) ...

    new_table = pd.DataFrame(columns=col_hdrs_name)

    print(new_table)
    new_table.to_csv('Option_Chain_Table_{}.csv'.format(symbol))

get_option_chain('NIFTY','17OCT2019')

schedule.every(1).minutes.do(get_option_chain,'NIFTY','17OCT2019')
while 1:
    schedule.run_pending()
    time.sleep(1)

Error:

Exception ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')) in getting data  for symbol NIFTY
Traceback (most recent call last):
  File "C:\Python\Python37\NSE scrape fata.py", line 70, in <module>
    schedule.run_pending()
  File "C:\Python\Python37\lib\site-packages\schedule\__init__.py", line 563, in run_pending
    default_scheduler.run_pending()
  File "C:\Python\Python37\lib\site-packages\schedule\__init__.py", line 94, in run_pending
    self._run_job(job)
  File "C:\Python\Python37\lib\site-packages\schedule\__init__.py", line 147, in _run_job
    ret = job.run()
  File "C:\Python\Python37\lib\site-packages\schedule\__init__.py", line 466, in run
    ret = self.job_func()
  File "C:\Python\Python37\NSE scrape fata.py", line 41, in get_option_chain
    soup = BeautifulSoup(page.content, 'html.parser')
UnboundLocalError: local variable 'page' referenced before assignment

Yes, you can wrap the scheduler call in try...except:

from requests import Session
import pandas as pd
from bs4 import BeautifulSoup
import re
import time
import schedule

def get_option_chain(symbol, expdate):
    if symbol == 'NIFTY':
        Base_url = ("https://www.nseindia.com/live_market/dynaContent/live_watch/option_chain/optionKeys.jsp?symbol="+symbol+"&date="+expdate)

    # ... (request and parsing code not shown in the post) ...

    new_table = pd.DataFrame(columns=col_hdrs_name)

    print(new_table)
    new_table.to_csv('Option_Chain_Table_{}.csv'.format(symbol))

get_option_chain('NIFTY','17OCT2019')

schedule.every(1).minutes.do(get_option_chain,'NIFTY','17OCT2019')
while 1:
    try:
        schedule.run_pending()
    except Exception:
        # Catch Exception rather than using a bare except, so Ctrl+C still stops the script.
        pass
    # Sleep outside the try block so a failed run still pauses before the next attempt.
    time.sleep(1)
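
Silently swallowing the exception keeps the script alive but hides what went wrong. The traceback in the question shows the error travelling from the job function up through schedule.run_pending(), so another option is to catch it inside the job itself and log it. Below is a minimal sketch of that idea using only the standard library. The names keep_running and flaky_job are made up for illustration, and flaky_job is just a stand-in for the real scraping function.

```python
import functools
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("scraper")


def keep_running(job_func):
    """Wrap a job so that an exception is logged instead of propagating."""
    @functools.wraps(job_func)
    def wrapper(*args, **kwargs):
        try:
            return job_func(*args, **kwargs)
        except Exception:
            # logger.exception records the message together with the traceback.
            logger.exception("Job %s failed; it will be tried again on the next run",
                             job_func.__name__)
            return None
    return wrapper


@keep_running
def flaky_job(symbol):
    # Hypothetical stand-in for the scraping function; it always fails here
    # to show that the caller keeps going.
    raise ConnectionError("Remote end closed connection without response")


result = flaky_job("NIFTY")  # logs the error and returns None instead of raising
```

With something like this in place, the decorated function would be the one passed to schedule.every(1).minutes.do(...), and the while loop would no longer see the exception at all.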

Also, your code calls time.sleep(1) without importing the time module, so add import time at the top. I also recommend reading this article.
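
One more thing worth checking: the final line of the traceback is an UnboundLocalError for page, printed right after the "Exception ... in getting data" message. That suggests the function reports the failed request and then carries on to parse a response it never received. The request code is not shown in the question, so the following is only a sketch under that assumption. The helper names get_page and scrape_once, and the injected fetch callable, are hypothetical; in the real script fetch would be the session's get call.

```python
def get_page(fetch, url):
    """Return whatever fetch(url) returns, or None if the request fails."""
    try:
        return fetch(url)
    except Exception as exc:
        # With requests, catching requests.RequestException would be narrower.
        print("Exception {} in getting data from {}".format(exc, url))
        return None


def scrape_once(fetch, url):
    page = get_page(fetch, url)
    if page is None:
        # Nothing to parse this time; skip and wait for the next scheduled run.
        return None
    return len(page)  # placeholder for the real parsing step


def failing_fetch(url):
    # Simulates the dropped connection from the question.
    raise ConnectionError("Connection aborted.")


def working_fetch(url):
    # Simulates a successful response body.
    return "<html></html>"


print(scrape_once(failing_fetch, "https://example.invalid"))
print(scrape_once(working_fetch, "https://example.invalid"))
```

The point is the early return: when the request fails, the function stops before any line that uses page, so the secondary UnboundLocalError never occurs.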
