LinkedIn bot with Selenium / Beautiful Soup not reading email from /overlay/contact-info



Whenever I run the code below, it writes "no email found" to the emails.txt file. I checked the class names in the browser inspector and they are correct. Does anyone know what the problem might be?

```python
visitingProfileID = profilesQueued.pop()
visitedProfiles.append(visitingProfileID)
fullLink = 'https://www.linkedin.com' + visitingProfileID
linkoverlay = fullLink + '/overlay/contact-info/'

with open('visitedUsers.txt', 'a') as visitedUsersFile:
    visitedUsersFile.write(str(visitingProfileID) + '\n')

browser.get(linkoverlay)
soup2 = BeautifulSoup(browser.page_source, 'html.parser')
with open('emails.txt', 'a') as visitedEmailFile:
    try:
        pava2 = soup2.find('section', {'class': 'pv-contact-info__contact-type ci-email'})
        sto = pava2.find('a', {'class': 'pv-contact-info__contact-link link-without-visited-state t-14'}).get('href')
        visitedEmailFile.write(str(sto) + '\n')
    except AttributeError:
        visitedEmailFile.write('no email found\n')
```
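One way to rule out the selectors themselves is to run them against a static snippet of the markup you expect. If they match there, the problem is more likely that the overlay is rendered by JavaScript (or sits behind a login wall), so the elements are not yet in `browser.page_source` when the soup is built. A minimal check, where the HTML sample is illustrative rather than LinkedIn's actual markup:

```python
from bs4 import BeautifulSoup

# Illustrative markup only -- not LinkedIn's real page source.
sample = '''
<section class="pv-contact-info__contact-type ci-email">
  <a class="pv-contact-info__contact-link link-without-visited-state t-14"
     href="mailto:jane@example.com">jane@example.com</a>
</section>
'''

soup = BeautifulSoup(sample, 'html.parser')
# A multi-class string passed via the 'class' attribute matches the
# attribute value exactly as written, including order.
section = soup.find('section', {'class': 'pv-contact-info__contact-type ci-email'})
link = section.find('a', {'class': 'pv-contact-info__contact-link link-without-visited-state t-14'})
print(link.get('href'))  # mailto:jane@example.com
```

If this prints the href but the live page does not, the fix is to wait for the overlay to render (or confirm the session is logged in) before reading `browser.page_source`, not to change the class names.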

I personally don't use Beautiful Soup, but you may want to try some more XPaths.

This works for me:

```python
import time

import numpy as np
from selenium.common.exceptions import NoSuchElementException

# driver is already on a LinkedIn profile
contact_infos = []
mails = []
try:
    contact = driver.find_element_by_xpath(
        '//*[contains(@href,"contact-info")]'
    ).click()  # click on contact-info
except NoSuchElementException:
    contact_infos.append(np.nan)
    mails.append(np.nan)
else:
    contact_info = driver.find_element_by_xpath(
        '//*[contains(@class,"pv-contact-info")]')
    # save everything from contact info
    contact_infos.append(contact_info.text.split('\n'))
    print(contact_info.text.split('\n'))
    try:
        mail = driver.find_element_by_xpath('//*[contains(@class,"mail")]')
    except NoSuchElementException:
        mails.append(np.nan)
        # this closes the window
        driver.find_element_by_xpath('//*[contains(@type,"cancel-icon")]').click()
        time.sleep(<rndm>)
    else:
        mail = [x.strip() for x in mail.text.split('\n')][1]
        mails.append(mail)
        print(mail)
        # this closes the window
        driver.find_element_by_xpath('//*[contains(@type,"cancel-icon")]').click()
        time.sleep(<rndm>)
```
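One fragile spot in the code above is `mail.text.split('\n')[1]`, which assumes the email is always on the second line of the element's text. A regex over the whole contact-info text is less sensitive to layout changes. A minimal sketch, where the helper name and sample text are illustrative:

```python
import re

# Simple email pattern; good enough for scraping, not full RFC 5322.
EMAIL_RE = re.compile(r'[\w.+-]+@[\w-]+\.[\w.-]+')

def extract_email(contact_text):
    """Return the first email-looking token in the text, or None."""
    match = EMAIL_RE.search(contact_text)
    return match.group(0) if match else None

print(extract_email('Email\njane.doe@example.com\nPhone\n555-0100'))
```

This way the scraper keeps working even if LinkedIn reorders or relabels the lines inside the contact-info overlay.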
