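I am following this tutorial for web scraping: https://www.linkedin.com/pulse/how-easy-scraping-data-from-linkedin-profiles-david-craven/. The Python script is raising an error. I have tried adding the directory to my PATH, and when I echo the path to the screen it shows up, but it now reads "/Users/owner/Users/owner" where there should be only one "/Users/owner".
I am using bash on macOS High Sierra and I am a data science student, so DevOps is a challenge for me, as is learning how to post code to StackOverflow, but I am trying to document my steps to make this easier to troubleshoot.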
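- I pip installed selenium
- I downloaded chromedriver into the directory of my web-scraping script, then double-clicked it to run it
- I thought I had added the directory to my PATH with "export PATH=$PATH:~opt/bin:~/Users/owner/sbox/test/pandas_sqlite_dbase/chromedriver", following the directions I found at http://osxdaily.com/2014/08/14/add-new-path-to-path-command-line/
- I updated pip
- The directory I want to run the script from is "/Users/owner/sbox/test/pandas_sqlite_dbase"
- There is also another SO post, "Can a website detect when you are using selenium with chromedriver?", which talks about how chromedriver with selenium is now automatically detected and disabled... so am I trying to use an outdated code base?
- I can post my entire PATH or provide other information.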
from selenium import webdriver
driver = webdriver.Chrome('~/Users/owner/sbox/test/pandas_sqlite_dbase/googlechrome')
driver.get('https://www.linkedin.com')
Now I get this traceback:
Traceback (most recent call last):
File "/Users/owner/anaconda3/lib/python3.7/site-packages/selenium/webdriver/common/service.py", line 76, in start
stdin=PIPE)
File "/Users/owner/anaconda3/lib/python3.7/subprocess.py", line 775, in __init__
restore_signals, start_new_session)
File "/Users/owner/anaconda3/lib/python3.7/subprocess.py", line 1522, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: '~/Users/owner/sbox/test/pandas_sqlite_dbase/googlechrome': '~/Users/owner/sbox/test/pandas_sqlite_dbase/googlechrome'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/owner/sbox/test/pandas_sqlite_dbase/scraping_tutorial.py", line 7, in <module>
driver = webdriver.Chrome('~/Users/owner/sbox/test/pandas_sqlite_dbase/googlechrome')
File "/Users/owner/anaconda3/lib/python3.7/site-packages/selenium/webdriver/chrome/webdriver.py", line 73, in __init__
self.service.start()
File "/Users/owner/anaconda3/lib/python3.7/site-packages/selenium/webdriver/common/service.py", line 83, in start
os.path.basename(self.path), self.start_error_message)
selenium.common.exceptions.WebDriverException: Message: 'googlechrome' executable needs to be in PATH. Please see https://sites.google.com/a/chromium.org/chromedriver/home
[Finished in 0.7s with exit code 1]
[shell_cmd: python -u "/Users/owner/sbox/test/pandas_sqlite_dbase/scraping_tutorial.py"]
[dir: /Users/owner/sbox/test/pandas_sqlite_dbase]
[path: /usr/bin:/bin:/usr/sbin:/sbin]
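I would check what ~ actually expands to, since it seems you have the wrong idea about it. It is usually your home directory, which for your user is /Users/owner, and that is why you end up with /Users/owner/Users/owner.
To check this, you can run: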
$>cd ~
$>pwd
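The same check can be done from Python, which matters here because the traceback shows the literal '~/Users/owner/...' string being handed to the OS: the shell expands ~, but Python does not do it for you. Below is a minimal sketch using os.path.expanduser and shutil.which. The helper names resolve_driver and find_driver are my own, and I am assuming the downloaded file is called chromedriver (your script points at googlechrome, which does not match what you said you downloaded).

```python
import os
import shutil


def resolve_driver(path):
    """Expand a leading "~" to the home directory and return an absolute path.

    Hypothetical helper for illustration; not part of selenium.
    """
    return os.path.abspath(os.path.expanduser(path))


def find_driver(name="chromedriver", extra_dirs=()):
    """Search extra_dirs first, then the current PATH, for an executable.

    Entries must be directories, not the executable file itself.
    Returns the full path, or None if nothing is found.
    """
    dirs = [os.path.expanduser(d) for d in extra_dirs]
    search = os.pathsep.join(dirs + [os.environ.get("PATH", "")])
    return shutil.which(name, path=search)


if __name__ == "__main__":
    # The path from the question: "~" already stands for the home directory,
    # so repeating "Users/owner" after it doubles that part of the path.
    doubled = "~/Users/owner/sbox/test/pandas_sqlite_dbase/chromedriver"
    # What was probably intended (an assumption on my part).
    intended = "~/sbox/test/pandas_sqlite_dbase/chromedriver"

    for candidate in (doubled, intended):
        full = resolve_driver(candidate)
        print(full, "exists" if os.path.isfile(full) else "missing")

    # PATH lookup: pass the folder that contains the driver.
    print(find_driver(extra_dirs=["~/sbox/test/pandas_sqlite_dbase"]))
```

If the second candidate is reported as existing, passing that expanded path to webdriver.Chrome should get you past the FileNotFoundError. Note also that your export line puts the chromedriver file itself on PATH; PATH entries should be directories, so the entry would be the pandas_sqlite_dbase folder instead.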