How to automate login credentials with Selenium Chrome WebDriver



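I am trying to extract data from [this][1] website.

The manual procedure is to enter a string such as "CCOCCO" in the search box, click "Predict Properties", and record the "Glass Transition Temperature (K)" from the table.

The following code automates this task as long as the number of HTML POST requests is below 5: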

    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC

    options = Options()
    options.add_argument('start-maximized')
    options.add_argument('disable-infobars')
    options.add_argument('--disable-extensions')
    driver = webdriver.Chrome(chrome_options=options)

    def get_glass_temperature(smiles):
        driver.get('https://www.polymergenome.org/explore/index.php?m=1')
        x_path_click = "//input[@class='large_input_no_round ui-autocomplete-input' and @id='keyword_original']"
        x_path_find = "//input[@class='dark_blue_button_no_round' and @value='Predict Properties']"
        x_path_get = "//table[@class='record']//tbody/tr[@class='record']//following::td[7]/center/font/font"
        WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, x_path_click))).send_keys(smiles)
        driver.find_element_by_xpath(x_path_find).click()
        return WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, x_path_get))).get_attribute("innerHTML")

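I am applying the function above to a pandas DataFrame that holds up to 400 string values similar to "CCOCCO". However, after it returns 5 glass transition temperatures, a WebDriverException is raised because the website shows the following message: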

"Visits of more than 5 times per day to the property prediction capability requires login. "

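Before running the code I logged in to the website and ticked the "Remember me" box, but the error is the same.

I tried modifying the code as follows: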

    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    import pandas as pd
    import os

    options = Options()
    options.add_argument('start-maximized')
    options.add_argument('disable-infobars')
    options.add_argument('--disable-extensions')
    driver = webdriver.Chrome(chrome_options=options, executable_path='/Users/ae/Downloads/chromedriver')

    def get_glass_temperature(smiles):
        driver.get('https://www.polymergenome.org/explore/index.php?m=1')
        user_name = 'my_user_name'
        password = 'my_password'
        x_path_id = "//input[@class='large_input_no_round' and @placeholder='User ID']"
        x_path_pass = "//input[@class='large_input_no_round' and @placeholder='Password']"
        x_path_sign = "//input[@class='orange_button_no_round' and @value='Sign In']"
        WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, x_path_id))).send_keys(user_name)
        WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, x_path_pass))).send_keys(password)
        driver.find_element_by_xpath(x_path_sign).click()
        x_path_click = "//input[@class='large_input_no_round ui-autocomplete-input' and @id='keyword_original']"
        x_path_find = "//input[@class='dark_blue_button_no_round' and @value='Predict Properties']"
        x_path_get = "//table[@class='record']//tbody/tr[@class='record']//following::td[7]/center/font/font"
        WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, x_path_click))).send_keys(smiles)
        driver.find_element_by_xpath(x_path_find).click()
        return WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, x_path_get))).get_attribute("innerHTML")

    test_smiles = ['CC(F)(F)CC(F)(F)', 'CCCCC(=O)OCCOC(=O)', 'CNS-C6H3-CSN-C6H3', 'CCOCCO', 'NH-CS-NH-C6H4', 'C4H8', 'C([*])C([*])(COOc1cc(Cl)ccc1)']
    test_polymer = pd.DataFrame({'SMILES': test_smiles})
    test_polymer['test_tg'] = test_polymer['SMILES'].apply(get_glass_temperature)
    print(test_polymer)

After this modification I get a timeout error:

    Traceback (most recent call last):
      File "/Users/alieftekhari/Desktop/extract_TG.py", line 42, in <module>
        test_polymer['test_tg']=test_polymer['SMILES'].apply(get_glass_temperature)
      File "/anaconda/lib/python2.7/site-packages/pandas/core/series.py", line 3194, in apply
        mapped = lib.map_infer(values, f, convert=convert_dtype)
      File "pandas/_libs/src/inference.pyx", line 1472, in pandas._libs.lib.map_infer
      File "/Users/user/Desktop/extract_TG.py", line 22, in get_glass_temperature
        WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, x_path_id))).send_keys(user_name)
      File "/anaconda/lib/python2.7/site-packages/selenium/webdriver/support/wait.py", line 80, in until
        raise TimeoutException(message, screen, stacktrace)
    selenium.common.exceptions.TimeoutException: Message:

[1]: https://www.polymergenome.org/explore/index.php?m=1

Look at the last lines of the stack trace:

    File "/anaconda/lib/python2.7/site-packages/selenium/webdriver/support/wait.py", line 80, in until
      raise TimeoutException(message, screen, stacktrace)
    selenium.common.exceptions.TimeoutException: Message:

The wait ran out because no matching element was found, which is why you get a TimeoutException. From what I can see here, your XPath is wrong:

    x_path_id = "//input[@class='large_input_no_round ui-autocomplete-input' and @placeholder='User ID']"
    x_path_pass = "//input[@class='large_input_no_round ui-autocomplete-input' and @placeholder='Password']"

These inputs do not have the class large_input_no_round ui-autocomplete-input, so change the XPath to use the correct class, as follows:

    x_path_id = "//input[@class='large_input_no_round' and @placeholder='User ID']"
    x_path_pass = "//input[@class='large_input_no_round' and @placeholder='Password']"
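The reason the class string matters is that `@class='...'` in XPath compares the whole attribute value, not one class among several. A minimal offline sketch of that behaviour, using Python's standard `xml.etree.ElementTree` and a hypothetical, heavily simplified fragment of markup reconstructed only from the XPaths quoted here (the real page will differ):

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup: only the attributes named in the XPaths above.
PAGE = """
<form>
  <input class="large_input_no_round ui-autocomplete-input" id="keyword_original" />
  <input class="large_input_no_round" placeholder="User ID" />
  <input class="large_input_no_round" placeholder="Password" />
</form>
"""

root = ET.fromstring(PAGE)


def find_inputs(class_value, placeholder):
    """Return <input> elements whose class attribute equals class_value exactly."""
    path = ".//input[@class='{}'][@placeholder='{}']".format(class_value, placeholder)
    return root.findall(path)


# The autocomplete class string belongs to the search box only, so nothing matches.
wrong = find_inputs("large_input_no_round ui-autocomplete-input", "User ID")
# The exact class string of the login field matches one element.
right = find_inputs("large_input_no_round", "User ID")

print(len(wrong), len(right))
```

If the class list on the real page ever changes, a CSS selector such as `input.large_input_no_round[placeholder='User ID']` is less brittle, because it tests for a single class rather than the full attribute string.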

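Problem

  • driver.get('https://www.polymergenome.org/explore/index.php?m=1'): this page has no login window, hence the timeout exception at the line WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, x_path_id))).send_keys(user_name)

    In other words, when you run the script it launches a new browser instance, so your previous login is gone and you have to log in again to get past the restriction "Visits of more than 5 times per day to the property prediction capability requires login." The login window only appears after 5 successful extraction iterations. The script fails because it tries to log in immediately, without waiting for the login dialog; since there is no login window yet, it raises a TimeoutException.

The solution is to put the data extraction in a try block and the login in the catch block, so that the login part runs only when an exception occurs during extraction. My Java implementation looks like this: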

    @Test(invocationCount = 7)
    public void getList(){
        wait = new WebDriverWait(driver, 20);
        By locator = By.xpath("//table[@class='record']//tbody/tr[@class='record']//following::td[7]/center/font/font");
        try {
            driver.findElement(By.xpath("//input[@class='large_input_no_round ui-autocomplete-input' and @id='keyword_original']")).clear();
            driver.findElement(By.xpath("//input[@class='large_input_no_round ui-autocomplete-input' and @id='keyword_original']")).sendKeys("CCOCCO");
            driver.findElement(By.xpath("//input[@class='dark_blue_button_no_round' and @value='Predict Properties']")).click();
            String text = wait.until(ExpectedConditions.visibilityOfElementLocated(locator)).getAttribute("innerHTML");
            System.out.println(text);
        } catch (Exception e) {
            System.out.println("In Exception Block");
            wait.until(ExpectedConditions.elementToBeClickable(By.xpath("//input[@class='large_input_no_round' and @placeholder='User ID']")));
            driver.findElement(By.xpath("//input[@class='large_input_no_round' and @placeholder='User ID']")).sendKeys("testing");
            driver.findElement(By.xpath("//input[@class='large_input_no_round' and @placeholder='Password']")).sendKeys("testing");
            driver.findElement(By.xpath("//input[@class='orange_button_no_round' and @value='Sign In']")).click();
        }
    }

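Alternatively

  • The best approach is to browse the site, navigate to the login dialog, and log in; once the login succeeds, go to the search page and continue extracting.
  • Or you can set a limit of 5 before logging in (that is, extract 5 times, then log in).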
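Both alternatives come down to deciding when the login step runs relative to the extraction calls. A rough sketch of that control flow, independent of Selenium: `make_limited_extractor`, `fake_login`, and `fake_extract` are hypothetical names invented for this illustration, and the fake functions stand in for the real browser code so the ordering can be checked offline.

```python
def make_limited_extractor(extract, login, free_visits=5):
    """Wrap extract() so that login() runs once, right before the free quota is exceeded."""
    state = {"calls": 0, "logged_in": False}

    def wrapped(smiles):
        # Log in only when the next visit would go past the free quota.
        if state["calls"] >= free_visits and not state["logged_in"]:
            login()
            state["logged_in"] = True
        state["calls"] += 1
        return extract(smiles)

    return wrapped


# Stand-ins for the real Selenium functions (assumptions for illustration only).
events = []


def fake_login():
    events.append("login")


def fake_extract(smiles):
    events.append(smiles)
    return "Tg(" + smiles + ")"


get_tg = make_limited_extractor(fake_extract, fake_login, free_visits=5)
results = [get_tg(s) for s in ["s1", "s2", "s3", "s4", "s5", "s6", "s7"]]
print(events)
```

Passing `free_visits=0` gives the first alternative (log in before any extraction). Note that the counter only knows about visits made by the current run, while the site's limit is per day, so earlier visits on the same day would not be counted.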

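One way to be logged in automatically on every request to the website is to start Chrome with a specific profile. If you want to use an existing Google Chrome profile, see this post.

So you need to add one more option: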

    options = Options()
    options.add_argument('start-maximized')
    options.add_argument('disable-infobars')
    options.add_argument('--disable-extensions')
    options.add_argument('user-data-dir=/path/to/chrome/profile')
    driver = webdriver.Chrome(chrome_options=options, executable_path='/Users/ae/Downloads/chromedriver')

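That way, if you have logged in with this profile and ticked "Remember me", you will be logged in automatically every time you request this website.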
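For newer Selenium releases, where the `chrome_options=` keyword has been replaced by `options=`, the same idea might look like the sketch below. The helper names `build_chrome_arguments` and `make_driver` and the directory `/tmp/selenium-profile` are assumptions made up for this example; the switches are written with leading dashes, and the driver is expected to be discoverable by Selenium without an explicit path.

```python
import os


def build_chrome_arguments(profile_dir):
    """Return Chrome command-line switches, including the profile directory."""
    profile_dir = os.path.abspath(os.path.expanduser(profile_dir))
    return [
        "--start-maximized",
        "--disable-extensions",
        "--user-data-dir=" + profile_dir,
    ]


def make_driver(profile_dir):
    """Start Chrome with the given profile (needs selenium and a matching chromedriver)."""
    # Imported here so the helper above can be used without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for argument in build_chrome_arguments(profile_dir):
        options.add_argument(argument)
    return webdriver.Chrome(options=options)


# Hypothetical profile location; substitute a directory you have logged in with.
args = build_chrome_arguments("/tmp/selenium-profile")
print(args[-1])
```

A profile directory dedicated to the script, rather than the one your everyday Chrome is using, tends to avoid conflicts over a profile that is already open.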

Here is the try/except approach implemented in Python:

    from selenium.common.exceptions import TimeoutException, WebDriverException

    def time_out_handling(smiles):
        # Log in through the dialog shown once the free limit is reached, then retry the extraction.
        try:
            user_name = 'my_user_name'
            password = 'my_password'
            x_path_id = "//input[@class='large_input_no_round' and @placeholder='User ID']"
            x_path_pass = "//input[@class='large_input_no_round' and @placeholder='Password']"
            x_path_sign = "//input[@class='orange_button_no_round' and @value='Sign In']"
            WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, x_path_id))).send_keys(user_name)
            WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, x_path_pass))).send_keys(password)
            driver.find_element_by_xpath(x_path_sign).click()
            driver.get('https://www.polymergenome.org/explore/index.php?m=1')
            x_path_click = "//input[@class='large_input_no_round ui-autocomplete-input' and @id='keyword_original']"
            x_path_find = "//input[@class='dark_blue_button_no_round' and @value='Predict Properties']"
            x_path_get = "//table[@class='record']//tbody/tr[@class='record']//following::td[7]/center/font/font"
            WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, x_path_click))).send_keys(smiles)
            driver.find_element_by_xpath(x_path_find).click()
            return WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, x_path_get))).get_attribute("innerHTML").encode('ascii', 'ignore').split(' ')[0]
        except TimeoutException:
            return "Nan"

    def get_glass_temperature(smiles):
        try:
            driver.get('https://www.polymergenome.org/explore/index.php?m=1')
            x_path_click = "//input[@class='large_input_no_round ui-autocomplete-input' and @id='keyword_original']"
            x_path_find = "//input[@class='dark_blue_button_no_round' and @value='Predict Properties']"
            x_path_get = "//table[@class='record']//tbody/tr[@class='record']//following::td[7]/center/font/font"
            WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, x_path_click))).send_keys(smiles)
            driver.find_element_by_xpath(x_path_find).click()
            return WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, x_path_get))).get_attribute("innerHTML").encode('ascii', 'ignore').split(' ')[0]
        except WebDriverException:
            # Extraction failed (for example, the login dialog appeared): log in, retry, and return that result.
            return time_out_handling(smiles)
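The chained `.encode('ascii', 'ignore').split(' ')[0]` call relies on Python 2 string behaviour. A small, version-independent way to pull the number out of the cell text could look like the sketch below; `parse_first_number` is a hypothetical helper, and the sample strings are made up, since the site's exact cell formatting is not shown here. It assumes the `innerHTML` of the innermost element is plain text with no nested tags.

```python
import re

# Matches an optionally signed integer or decimal number.
_NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?")


def parse_first_number(inner_html):
    """Return the first number found in the cell text as a float, or None."""
    if inner_html is None:
        return None
    match = _NUMBER.search(inner_html)
    return float(match.group()) if match else None


# Hypothetical cell contents; the real formatting may differ.
print(parse_first_number("262 \u00b1 15"))  # 262.0
print(parse_first_number("Nan"))            # None
```

Applied to the raw strings collected in the DataFrame column, this keeps the scraping functions free of parsing logic and turns the "Nan" placeholder into a missing value instead of a string.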
