I'm running this Python script to fetch the image at a given XPath from a specific website, but it only prints "[]":
import urllib
import lxml.html
import re
html_string = urllib.urlopen("http://www.gemselect.com/actinolite-cats-eye/actinolite-cats-eye-379734.php")
dom = lxml.html.fromstring(html_string.read())
product_table = dom.xpath("/html/body/div/table[2]/tr/td/div[2]/table[1]/tr[3]/td[1]/img/@src")
for link in product_table:
    print link
Use a regular expression to search for all JPG URLs that start with /photos/:
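import re
# .text presumably assumes html_string is a requests response; the object returned by
# urllib.urlopen has no .text, so pass the string from html_string.read() in that case
re.findall('/photos.*jpg',html_string.text)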
['/photos/actinolite-cats-eye/actinolite-cats-eye-gem-379734a.jpg',
'/photos/actinolite-cats-eye/actinolite-cats-eye-gem-379734a.jpg',
'/photos/actinolite-cats-eye/actinolite-cats-eye-gem-379734b.jpg',
'/photos/actinolite-cats-eye/actinolite-cats-eye-gem-379734c.jpg']
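Note that the first path appears twice in that result, and that a greedy `.*` could merge two URLs if they ever sat on the same line. A minimal sketch of a slightly stricter variant follows, assuming the page source is already available as a string. The `sample_html` snippet and the `find_photo_urls` helper are hypothetical names used only for illustration, and the real page markup may differ.

```python
import re

# Hypothetical stand-in for the downloaded page source; the real markup may differ.
sample_html = (
    '<td><img src="/photos/actinolite-cats-eye/actinolite-cats-eye-gem-379734a.jpg">'
    '<img src="/photos/actinolite-cats-eye/actinolite-cats-eye-gem-379734b.jpg"></td>\n'
    '<td><img src="/images/logo.png"></td>\n'
)


def find_photo_urls(html):
    """Return the unique /photos/...jpg paths in order of first appearance."""
    # The character class stops at quotes, whitespace and '>', so two URLs
    # on the same line are matched separately instead of being merged.
    matches = re.findall(r'/photos/[^"\'\s>]*?\.jpg', html)
    seen = set()
    unique = []
    for url in matches:
        if url not in seen:
            seen.add(url)
            unique.append(url)
    return unique


print(find_photo_urls(sample_html))
```

With the question's code, the string to pass in would be the result of `html_string.read()`, stored in a variable before it is handed to `lxml`, since the response body can only be read once.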