Trying to extract every link that contains "dungeon", separate them, add each one to a list, and remove duplicates



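I want to extract every link that contains "dungeon", separate them, add each one to a list, and remove the duplicates. I'm sure that once they are separated, removing the duplicates will be easy. I have a feeling I'm missing something simple. I would share the site, but it requires an account.

This is what I've got. It grabs the links, but the result is one big string. How do I separate them into a list so I can pick out each one? I'll be clicking on them later. Or is there a better way to pick out the links that have "dungeon" in them? Using XPath link text isn't working.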

    for elem in elems:
        if "dungeon" in elem.get_attribute("href"):
            list = elem.get_attribute("href")
            print(list)
            print(list[0])

And this is the output:
javascript:dungeon(0,84579684);
j
javascript:dungeon(0,84579684);
j
javascript:dungeon(0,84579674);
j
javascript:dungeon(0,84579674);
j
javascript:dungeon(0,84579672);
j
javascript:dungeon(0,84579672);
j
javascript:dungeon(0,84579662);
j
javascript:dungeon(0,84579662);
j
It's one big string output, I think:
print(list)
javascript:dungeon(0,84579684);
javascript:dungeon(0,84579684);
javascript:dungeon(0,84579674);
javascript:dungeon(0,84579674);
javascript:dungeon(0,84579672);
javascript:dungeon(0,84579672);
javascript:dungeon(0,84579662);
javascript:dungeon(0,84579662);

I want to be able to call print(list[3]) and get javascript:dungeon(0,84579674); back, not "a".

I would do it like this:

Use the append method to add each link to a list.

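    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # assumption: any WebDriver instance works here
    url = "https://www.sofascore.com/de/tennis/2019-01-01"
    driver.get(url)
    href_bucket = []
    elems = driver.find_elements(By.XPATH, "//a")
    print(len(elems))
    counter = 1
    fail_counter = 0
    for ele in elems:
        href = ele.get_attribute("href")  # None when the anchor has no href
        if href and "de" in href:
            counter = counter + 1
            href_bucket.append(href)  # append keeps every match as its own list item
        else:
            # print("fail", fail_counter)
            fail_counter = fail_counter + 1
    print(href_bucket[3])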
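
Applied to the links from the question, the same append pattern might look like the sketch below. The hrefs are hard-coded from the question's output so that it runs without a browser. `collect_dungeon_links` and `sample_hrefs` are hypothetical names, and in the real script the strings would come from elem.get_attribute("href").

```python
def collect_dungeon_links(hrefs):
    """Return every href that contains 'dungeon', keeping page order."""
    links = []
    for href in hrefs:
        # get_attribute("href") can return None, so guard before searching
        if href and "dungeon" in href:
            links.append(href)  # append, do not reassign the variable
    return links


# Sample data copied from the question's output, plus two non-matching entries
sample_hrefs = [
    "javascript:dungeon(0,84579684);",
    "javascript:dungeon(0,84579684);",
    "javascript:dungeon(0,84579674);",
    "javascript:dungeon(0,84579674);",
    None,
    "https://example.com/profile",
]

dungeon_links = collect_dungeon_links(sample_hrefs)
print(dungeon_links[3])  # javascript:dungeon(0,84579674);
```

The original loop assigned each href to the same variable, so indexing it returned a single character of the last string. With Selenium, an XPath such as //a[contains(@href, 'dungeon')] should also let the browser do the filtering, since link-text locators match the visible text rather than the href.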

If you want to remove duplicates:

    seen = set()
    unique_hrefs = []
    for item in href_bucket:
        if item not in seen:  # first time this link shows up
            seen.add(item)
            unique_hrefs.append(item)
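
A shorter way to drop duplicates while keeping the first-seen order is dict.fromkeys, which relies on dictionaries preserving insertion order (guaranteed from Python 3.7). This is a minimal sketch; the `href_bucket` contents here are sample data copied from the question's output.

```python
# Sample data: each link appears twice, as in the question's output
href_bucket = [
    "javascript:dungeon(0,84579684);",
    "javascript:dungeon(0,84579684);",
    "javascript:dungeon(0,84579674);",
    "javascript:dungeon(0,84579674);",
    "javascript:dungeon(0,84579672);",
    "javascript:dungeon(0,84579672);",
]

# dict keys are unique and keep insertion order, so this deduplicates in order
unique_hrefs = list(dict.fromkeys(href_bucket))

print(unique_hrefs[1])  # javascript:dungeon(0,84579674);
```

Converting with set(href_bucket) also removes duplicates, but it does not keep the links in page order, which matters if they are going to be clicked in sequence.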
