I am scraping a web page with Helium.
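from helium import start_firefox

# start_firefox() opens the page; the result is used here as a Selenium WebDriver
result = start_firefox(
    "https://www.medtronic.com/covidien/en-us/products/brain-monitoring/bis-monitoring-system.html",
)
# Selenium 3 style call; Selenium 4 uses find_element(By.CLASS_NAME, ...) instead
result.find_element_by_class_name("js-open-table-overlay").click()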
After the click, a table appears and I need to scrape its contents. How do I select the table once the click has happened?
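You would need to use find_elements_... to get all <table> elements, then a for-loop to process each table separately, a nested for-loop to get the <tr> (rows) and <th> (headers) in each table, and another nested for-loop to get the <td> (cells) in each row. You would also have to append the elements to the correct (nested) lists.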
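The nested loops described above could be sketched roughly as follows. This is only an illustration, not code from the original answer: `extract_tables` is a hypothetical helper name, and it assumes `root` is a Selenium WebDriver or WebElement that supports `find_elements(by, value)`. The string "tag name" is the value of Selenium's `By.TAG_NAME`, used directly here so the sketch carries no imports.

```python
# "tag name" is the string value of selenium's By.TAG_NAME, so this sketch
# needs no selenium import of its own.
TAG_NAME = "tag name"


def extract_tables(root):
    """Return one dict per <table> found under `root`.

    Hypothetical helper: each dict holds "headers" (texts of all <th>
    elements) and "rows" (one list of <td> texts per row that has cells).
    """
    tables = []
    for table in root.find_elements(TAG_NAME, "table"):
        headers = []
        rows = []
        # Nested loop over the rows of this table
        for tr in table.find_elements(TAG_NAME, "tr"):
            # Header cells, if this row has any
            for th in tr.find_elements(TAG_NAME, "th"):
                headers.append(th.text)
            # Data cells of this row
            cells = []
            for td in tr.find_elements(TAG_NAME, "td"):
                cells.append(td.text)
            if cells:
                rows.append(cells)
        tables.append({"headers": headers, "rows": rows})
    return tables
```

With the question's code, this would presumably be called as `extract_tables(result)` after the click. Note that Selenium's `.text` returns only visible text, so cells of a table that is still hidden would come back as empty strings.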
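However, this page uses plain <table> elements, and the tables are in the HTML from the start (they are not added by JavaScript), so it can be simpler to use pandas.read_html() to get all <table> elements.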
import pandas as pd

url = "https://www.medtronic.com/covidien/en-us/products/brain-monitoring/bis-monitoring-system.html"
# read_html() returns a list with one DataFrame per <table> found in the page
all_tables = pd.read_html(url)
for table in all_tables:
    print(table.to_string())
Result:
     ORDER CODE                                                      DESCRIPTION  UNIT OF MEASURE  QUANTITY
0      186-1014                        BIS™ Complete 4-Channel Monitoring System             Each         1
1      186-0210                                  BIS™ Complete 2-Channel Monitor             Each         1
2  186-0224-AMS  BIS™ LOC 4-Channel Monitor with Patient Interface Cable (PIC-4)             Each         1
3  186-0195-AMS          BIS™ LOC 2-Channel Monitor with Patient Interface Cable             Each         1
4      186-0212                                            BIS™ Bilateral Sensor             Each         1
              DISPLAY                                               BIS™ COMPLETE 2-CHANNEL MONITOR                                     BIS™ COMPLETE 4-CHANNEL MONITORING SYSTEM
0          Parameters                                                    BIS, SQI, EMG, SR, BC, EEG                                           BIS, SQI, EMG, SR, BC, TP, SEF, EEG
1  Trended parameters                                                         BIS, SQI, EMG, SR, BC                                                    BIS, SQI, EMG, SR, BC, SEF
2          BIS™ alarm  Upper and lower limit, results in visual and audible alert when out of range  Upper and lower limit, results in visual and audible alert when out of range