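Context
I'm new to coding and have been learning through videos and trial and error, though that approach seems to have run out of steam.
I was able to collect a set of YouTube links using Helium, a simpler wrapper around Selenium. Now I want to loop through that list and download the transcript of each video.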
from helium import find_all, S

# Get the links
def Get_links():
    # For the class (categories with segments of information), find them all
    Lnk = find_all(S('.style-scope ytd-video-renderer'))
    fin = []
    # Within this class,
    for l in Lnk:
        # These variables exist
        # The xpath that contains the links
        ind_links = find_all(S('//*[@id="thumbnail"]'))
        # links in this xpath
        href_list = [e.web_element.get_attribute('href') for e in ind_links]
        # We want to separate the duplicates
        # for every link in the href_list variable
        for i in href_list:
            # if the link is not already in 'fin', append it,
            # so that each link ends up in the list only once
            if i not in fin:
                fin.append(i)
    print(fin)
The output is a list of links:
['https://www.youtube.com/watch?v=eHnXgh0j500', None,
'https://www.youtube.com/watch?v=wDHtXXApfbc',
'https://www.youtube.com/watch?v=CJhOGDU636k',
'https://www.youtube.com/watch?v=xIB6uNsgFb8',
'https://www.youtube.com/watch?v=u7Ckt6A6du8',
'https://www.youtube.com/watch?v=PnSC2BY4e7c',
'https://www.youtube.com/watch?v=UkIAsYWgciQ',
'https://www.youtube.com/watch?v=MqC_k2WxZro',
'https://www.youtube.com/watch?v=B0BpL20QHPU',
'https://www.youtube.com/watch?v=UujbkSBzuI0',
'https://www.youtube.com/watch?v=7Q8ZvFDyjhA',
'https://www.youtube.com/watch?v=Z8pVlfulkcw',
'https://www.youtube.com/watch?v=fy0clsby3v8',
'https://www.youtube.com/watch?v=oYJaLgJL0Ok',
'https://www.youtube.com/watch?v=rampRBuDIIQ',
'https://www.youtube.com/watch?v=BuhUXD0KH8k',
'https://www.youtube.com/watch?v=27mtHjDTgvQ',
'https://www.youtube.com/watch?v=kebonpz4bD0',
'https://www.youtube.com/watch?v=2KgH0UpiRiw',
'https://www.youtube.com/watch?v=TA-P5ilI_Vg',
'https://www.youtube.com/watch?v=TOTmOToM6zQ',
'https://www.youtube.com/watch?v=CRVYXC2OH7U',
'https://www.youtube.com/watch?v=g4TrGD2tDek',
'https://www.youtube.com/watch?v=tAO-Ff7_4CE',
'https://www.youtube.com/watch?v=fwe-PjrX23o',
'https://www.youtube.com/watch?v=Gu7-vlVFUnw',
'https://www.youtube.com/watch?v=oXOqExfdKNg',
'https://www.youtube.com/watch?v=zrh7P9fgga8',
'https://www.youtube.com/watch?v=HVdZ-ccwkj8',
'https://www.youtube.com/watch?v=vCdTLteTPtM']
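Question
Is there a way to go through these links, open each one in the browser with Helium (or Selenium), and download the transcript, without manually copying and pasting them as variables and then putting them in a list?
Example
Your list of URLs: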
fin = ['https://www.youtube.com/watch?v=eHnXgh0j500', None,
'https://www.youtube.com/watch?v=wDHtXXApfbc',
'https://www.youtube.com/watch?v=CJhOGDU636k'
]
Loop over the list and do something with each entry:
for url in fin:
    if url:  # skip the None values
        # do something in Selenium, e.g. driver.get(url)
        print(url)  # or just print
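To avoid copying the links by hand, one option is to have the link-collecting function return its list instead of printing it, and then pass that list straight into a loop. The sketch below shows only the looping part under that assumption. The helper names `clean_links` and `visit_all` are hypothetical and not part of Helium or Selenium; the `visit` argument is whatever callable opens a page for you.

```python
from typing import Callable, Iterable, List, Optional


def clean_links(hrefs: Iterable[Optional[str]]) -> List[str]:
    """Drop None/empty entries and duplicates, keeping the original order."""
    seen = set()
    cleaned = []
    for href in hrefs:
        if href and href not in seen:
            seen.add(href)
            cleaned.append(href)
    return cleaned


def visit_all(hrefs: Iterable[Optional[str]],
              visit: Callable[[str], object]) -> List[str]:
    """Call visit(url) once per usable link and return the URLs visited."""
    visited = []
    for url in clean_links(hrefs):
        visit(url)  # e.g. driver.get(url) with Selenium
        visited.append(url)
    return visited


# Sample data taken from the list above, including the stray None.
fin = [
    'https://www.youtube.com/watch?v=eHnXgh0j500', None,
    'https://www.youtube.com/watch?v=wDHtXXApfbc',
    'https://www.youtube.com/watch?v=CJhOGDU636k',
]

# Using print as a stand-in for the browser call.
visit_all(fin, print)
```

With Selenium you would pass `driver.get` as the `visit` argument, and with Helium `go_to` should work the same way. The transcript download itself is not covered here; it would go in a function of your own that opens the URL and then reads the transcript from the page.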