Problem iterating through a column of IP addresses using Selenium, Python, and openpyxl



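I'm fairly new to Python and have gotten stuck. I've written a script that reads an IP address from column A of an Excel file, opens a headless browser at that IP, scrapes the device's MAC address, writes it to column B, and saves the file. I've gotten that piece working by hard-coding the cells that are read and written (A2, B2). I'd like to run 100 IP addresses through it, but I'm having trouble looping through the cells. I don't know where the loop should go or how to increment the cell. Also, any tips on how to make what I've written more efficient or more Pythonic would be greatly appreciated.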

def mac_attack():
    from openpyxl import load_workbook
    # set file path (raw string, so "\U" is not parsed as an escape sequence)
    filepath = r"C:\Users\myFile.xlsx"
    wb = load_workbook(filepath)
    sheet = wb["Sheet1"]
    ip = sheet["A2"].value
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    chrome_options = Options()
    chrome_options.add_argument("--headless")
    driverPath = r"C:\Users\chromedriver.exe"
    browser = webdriver.Chrome(executable_path=driverPath, options=chrome_options)
    browser.get(ip)
    browser.implicitly_wait(10)
    mac = browser.find_elements_by_xpath('/html/body/div[2]/table[2]/tbody/tr[1]/td[2]')[0].text
    # writing
    sheet["B2"].value = mac
    wb.save(r"C:\Users\myFile.xlsx")
    print(mac)
    print(ip)

mac_attack()

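One tip about openpyxl and Excel: you need to make absolutely sure there is no stray data in the sheet, or openpyxl will read it when you don't want it to. So use Clear Contents on all the cells around the data you want to read.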
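As a rough illustration of that tip, the sketch below filters out empty or whitespace-only cells in code rather than relying on the sheet being perfectly clean. The helper names `rows_with_data` and `read_ip_rows` are hypothetical, and starting at row 2 assumes that row 1 holds a header, as the question's use of A2 suggests.

```python
def rows_with_data(column_values, first_row=2):
    """Yield (row_number, text) for every cell value that is not blank.

    column_values holds the raw cell values for rows
    first_row, first_row + 1, ... in order.
    """
    for offset, value in enumerate(column_values):
        if value is None:
            continue  # cell is empty
        text = str(value).strip()
        if text:  # ignore cells that only contain whitespace
            yield first_row + offset, text


def read_ip_rows(filepath, sheet_name="Sheet1"):
    """Load column A from row 2 down and drop the blank cells."""
    # Imported here so rows_with_data can be used without openpyxl installed.
    from openpyxl import load_workbook

    sheet = load_workbook(filepath)[sheet_name]
    values = [cell.value for cell in sheet["A"][1:]]  # [1:] skips row 1
    return list(rows_with_data(values))
```

Keeping the row number next to each address means the result can later be written back to the matching row in column B, even when some rows in between were skipped.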

Here are my suggested edits, with comments so you can see what I'm doing.

def mac_attack():
    from openpyxl import load_workbook
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    # Set file path (raw string, so "\U" is not parsed as an escape sequence)
    filepath = r"C:\Users\myFile.xlsx"
    wb = load_workbook(filepath)
    sheet = wb["Sheet1"]

    # Read in the entire column of data, not just one value.
    # sheet["A"] gives you an immutable tuple of cell objects;
    # you can confirm that with print(type(source_ip_col)).
    # Row 1 is skipped on the assumption that it holds a header,
    # since the question starts at A2.
    source_ip_col = sheet["A"][1:]

    # Start the browser once, before the loop, and reuse it for every address.
    chrome_options = Options()
    chrome_options.add_argument("--headless")
    driver_path = r"C:\Users\chromedriver.exe"
    # Selenium 3 style calls, as in the question. Selenium 4 removed
    # executable_path and find_elements_by_xpath.
    browser = webdriver.Chrome(executable_path=driver_path, options=chrome_options)
    browser.implicitly_wait(10)

    # Iterate through each cell in the column of IP addresses.
    # Note that you may prefer to move the lookup into its own function
    # and call it from here; it may be cleaner.
    for cell in source_ip_col:
        ip = cell.value
        if ip is None:
            continue  # skip empty cells
        browser.get(ip)
        mac = browser.find_elements_by_xpath('/html/body/div[2]/table[2]/tbody/tr[1]/td[2]')[0].text
        # Write the corresponding value to column B, same row, same sheet
        sheet.cell(row=cell.row, column=2).value = mac
        print(mac)
        print(ip)

    browser.quit()
    # Saving inside the loop would work but only slows things down;
    # you just need to save once, after the whole column has been processed.
    wb.save(filepath)

mac_attack()
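The comment about moving the lookup into its own function can be taken one step further. What follows is a minimal sketch, not a drop-in replacement: it separates the spreadsheet loop from the browser so the loop can be tried without Selenium, and it uses the Selenium 4 calls (`Service`, `find_element(By.XPATH, ...)`). The name `fill_macs` and the `lookup`, `driver_path`, and `xpath` parameters are hypothetical, and it assumes column A holds full URLs that `browser.get` accepts, with a header in row 1.

```python
def fill_macs(sheet, lookup, first_row=2):
    """Write lookup(address) into column B for each non-empty cell in column A.

    Returns the number of rows that were filled in.
    """
    filled = 0
    # iter_rows yields one (column A cell, column B cell) pair per row.
    for ip_cell, mac_cell in sheet.iter_rows(min_row=first_row, min_col=1, max_col=2):
        if ip_cell.value is None:
            continue  # skip empty cells
        mac_cell.value = lookup(str(ip_cell.value).strip())
        filled += 1
    return filled


def mac_attack(filepath, driver_path, xpath):
    # Third-party imports live here so fill_macs stays usable on its own.
    from openpyxl import load_workbook
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By

    options = Options()
    options.add_argument("--headless")
    # One browser for the whole run instead of one per address.
    browser = webdriver.Chrome(service=Service(driver_path), options=options)
    browser.implicitly_wait(10)

    def lookup(address):
        browser.get(address)
        return browser.find_element(By.XPATH, xpath).text

    wb = load_workbook(filepath)
    try:
        fill_macs(wb["Sheet1"], lookup)
        wb.save(filepath)  # save once, after every row has been processed
    finally:
        browser.quit()
```

With the values from the question, the call would look like `mac_attack(r"C:\Users\myFile.xlsx", r"C:\Users\chromedriver.exe", '/html/body/div[2]/table[2]/tbody/tr[1]/td[2]')`. Because `fill_macs` only needs an object with `iter_rows` and a callable, it can be checked against a small test workbook or a stub before pointing it at 100 real devices.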

