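I'm trying to scrape a video from https://www.watchcartoononline.com/bobs-burgers-season-9-episode-3-tweentrepreneurs.

I can't figure out how to extract the video URL from this site. Using the Chrome and Firefox developer tools, I worked out that the video is inside an iframe, but searching for iframes with BeautifulSoup and extracting their src URLs returns links that have nothing to do with the video. Where are the references to the mp4 or flv files? I can see them in the developer tools, although clicking on them is forbidden.

Any help understanding how to scrape video with BeautifulSoup and requests would be appreciated.

Here is some code, if needed. Many tutorials say to look for 'a' tags, but I don't get any 'a' tags back.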

import requests
from bs4 import BeautifulSoup

r = requests.get("https://www.watchcartoononline.com/bobs-burgers-season-9-episode-5-live-and-let-fly")
soup = BeautifulSoup(r.content, 'html.parser')
links = soup.find_all('iframe')
for link in links:
    print(link['src'])

import requests

url = "https://disk19.cizgifilmlerizle.com/cizgi/bobs.burgers.s09e03.mp4?st=_EEVz36ktZOv7ZxlTaXZfg&e=1541637622"

def download_file(url, filename):
    # NOTE the stream=True parameter
    r = requests.get(url, stream=True)
    with open(filename, 'wb') as f:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:  # filter out keep-alive new chunks
                f.write(chunk)
                #f.flush() commented by recommendation from J.F.Sebastian
    return filename

download_file(url, "bobs.burgers.s09e03.mp4")

This code will download the episode to your computer. The video URL is in the <source> tag nested inside the <video> tag.
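In the snippet above the filename is typed by hand. As a rough sketch, it could also be derived from the URL itself, as long as the query string (the `?st=...&e=...` part) is dropped first. The helper name `filename_from_url` and the `video.mp4` fallback are made up for illustration.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlsplit


def filename_from_url(url, default="video.mp4"):
    """Return the last path segment of a URL, ignoring the query string."""
    path = urlsplit(url).path  # drops "?st=...&e=..." and any fragment
    name = PurePosixPath(unquote(path)).name
    # Fall back to a placeholder name when the URL has no file-like segment.
    return name or default


url = "https://disk19.cizgifilmlerizle.com/cizgi/bobs.burgers.s09e03.mp4?st=_EEVz36ktZOv7ZxlTaXZfg&e=1541637622"
print(filename_from_url(url))
```

The result can be passed as the second argument to `download_file`.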

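Background info

(Scroll down for your answer)

A video is only easy to grab when the site you are scraping states its location explicitly in the HTML. For example, say you want an .mp4 file from a site of your choice by referencing its .mp4 URL, and we use this site: https://4anime.to/yakunara-mug-cup-mo-episode-01-1?id=45314. If we look for <video> in inspect element, there is a src that contains the .mp4.

Now, if we try to grab the .mp4 URL from this site like this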

import requests
from bs4 import BeautifulSoup

html_url = "https://4anime.to/yakunara-mug-cup-mo-episode-01-1?id=45314"
html_response = requests.get(html_url)
soup = BeautifulSoup(html_response.text, 'html.parser')

for mp4 in soup.find_all('video'):
    mp4 = mp4['src']
    print(mp4)

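We will get KeyError: 'src'. This is because the actual video URL is stored in a <source> element rather than on the <video> tag itself, which we can see if we print the values in soup.find_all('video').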

import requests
from bs4 import BeautifulSoup

html_url = "https://4anime.to/yakunara-mug-cup-mo-episode-01-1?id=45314"
html_response = requests.get(html_url)
soup = BeautifulSoup(html_response.text, 'html.parser')

# Print every <video> element together with its children
for mp4 in soup.find_all('video'):
    print(mp4)

Output:

<video class="video-js vjs-default-skin vjs-big-play-centered" controls="" data-setup="{}" height="264" id="example_video_1" poster="" preload="none" width="640">
<source src="https://mountainoservo0002.animecdn.com/Yakunara-Mug-Cup-mo/Yakunara-Mug-Cup-mo-Episode-01.1-1080p.mp4" type="video/mp4"/>
</video>

So to download the .mp4 now, we use the source element and take the src from it.

import requests
import shutil  # - - Copies data from one file-like object to another
from bs4 import BeautifulSoup  # - - We could honestly do this without soup

# - - Get the url of the site you want to scrape
html_url = "https://4anime.to/yakunara-mug-cup-mo-episode-01-1?id=45314"
html_response = requests.get(html_url)
soup = BeautifulSoup(html_response.text, 'html.parser')
# - - Get the .mp4 url and the filename (the last <source> found wins)
for vid in soup.find_all('source'):
    url = vid['src']
    filename = vid['src'].split('/')[-1]
# - - Get the video
response = requests.get(url, stream=True)
# - - Make sure the status is OK
if response.status_code == 200:
    # - - Have the raw stream undo any gzip/deflate content encoding as it is read
    response.raw.decode_content = True
    with open(filename, 'wb') as f:
        # - - Copy what's in response.raw and transfer it into the file
        shutil.copyfileobj(response.raw, f)

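(Obviously, you could simplify this by copying the src of the source element by hand and using it as the URL directly, without html_url. I just wanted to show that you have the option of referencing the .mp4, that is, the src of the source element, from the page.)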
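The comment in the script above says this could be done without soup. Here is a minimal sketch of that idea using only the standard library's `html.parser`. It is an alternative to BeautifulSoup, not what the script above does. The class name `VideoSrcParser` and the helper `find_video_sources` are made up for illustration, and the sample HTML is a shortened version of the output shown earlier.

```python
from html.parser import HTMLParser


class VideoSrcParser(HTMLParser):
    """Collect src values from <video> tags and from <source> tags inside them."""

    def __init__(self):
        super().__init__()
        self.in_video = False
        self.sources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "video":
            self.in_video = True
            # Some pages put src directly on <video>; others use child <source> tags.
            if attrs.get("src"):
                self.sources.append(attrs["src"])
        elif tag == "source" and self.in_video and attrs.get("src"):
            self.sources.append(attrs["src"])

    def handle_endtag(self, tag):
        if tag == "video":
            self.in_video = False


def find_video_sources(html):
    parser = VideoSrcParser()
    parser.feed(html)
    return parser.sources


sample = """
<video class="video-js" controls="" id="example_video_1">
<source src="https://mountainoservo0002.animecdn.com/Yakunara-Mug-Cup-mo/Yakunara-Mug-Cup-mo-Episode-01.1-1080p.mp4" type="video/mp4"/>
</video>
"""
print(find_video_sources(sample))
```

Checking both places avoids the KeyError: 'src' case, because a missing attribute is simply skipped.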

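Once again, not every site is this clear-cut. With this site in particular we are lucky that it is so manageable. Other sites you try to scrape video from may require you to go from Elements (in inspect element) to Network. There you would have to find the segments of the embedded stream and download them to assemble the full video. That is not always easy, but the video on the site you asked about is.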

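Your answer

Open inspect element, click the Chromecast Player (2. Player) located above the video to see its HTML attributes, and finally click the embed link, which looks like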

/inc/embed/embed.php?file=bobs.burgers.s09e05.flv&amp;hd=1&amp;pid=437035&amp;h=25424730eed390d0bb4634fa93a2e96c&amp;t=1618011716&amp;embed=cizgi

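Once that's done, click play with inspect element still open, click the video to see its attributes (or press Ctrl+F and search for <video>), and copy the src, which should look like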

https://cdn.cizgifilmlerizle.com/cizgi/bobs.burgers.s09e05.mp4?st=f9OWlOq1e-2M9eUVvhZa8A&e=1618019876
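The embed path in the steps above is shown the way it appears in the raw HTML, with `&amp;` entities and no host. If you would rather build the full embed URL in code than click through, a rough sketch could look like this. The `page_url` value is taken from the question and is assumed to be the page the iframe sits on.

```python
from html import unescape
from urllib.parse import parse_qs, urljoin, urlsplit

# Assumption: the page the embed was found on (URL taken from the question).
page_url = "https://www.watchcartoononline.com/bobs-burgers-season-9-episode-5-live-and-let-fly"
# The embed path as copied from the HTML source, still containing &amp; entities.
raw_src = "/inc/embed/embed.php?file=bobs.burgers.s09e05.flv&amp;hd=1&amp;pid=437035&amp;h=25424730eed390d0bb4634fa93a2e96c&amp;t=1618011716&amp;embed=cizgi"

# Turn &amp; back into & and resolve the path against the page's host.
embed_url = urljoin(page_url, unescape(raw_src))
params = parse_qs(urlsplit(embed_url).query)

print(embed_url)
print(params["file"])
```

Note that BeautifulSoup already decodes entities in attribute values, so `unescape` is only needed when the string is copied from the raw source.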

Now we can download it with Python.

import requests
# - - Copies data from one file-like object to another
import shutil

url = "https://cdn.cizgifilmlerizle.com/cizgi/bobs.burgers.s09e05.mp4?st=f9OWlOq1e-2M9eUVvhZa8A&e=1618019876"
response = requests.get(url, stream=True)
if response.status_code == 200:
    # - - Have the raw stream undo any gzip/deflate content encoding as it is read
    response.raw.decode_content = True
    with open('bobs-burgers.mp4', 'wb') as f:
        # - - Take the data from response.raw and transfer it to the file
        shutil.copyfileobj(response.raw, f)
    print('downloaded file')
else:
    print('Download failed')
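If the script prints 'Download failed', one possible cause is that the copied link is stale. This is a guess, not something the site documents: the `e` query parameter looks like a Unix timestamp after which the link stops working, and `st` looks like a token tied to it. Under that assumption, a small check before downloading might look like this. The helper name `seconds_until_expiry` is made up.

```python
import time
from urllib.parse import parse_qs, urlsplit


def seconds_until_expiry(url, now=None):
    """Return seconds left before the `e` timestamp, or None if `e` is absent.

    Assumption: `e` is a Unix timestamp after which the link stops working.
    """
    values = parse_qs(urlsplit(url).query).get("e")
    if not values or not values[0].isdigit():
        return None
    now = time.time() if now is None else now
    return int(values[0]) - now


url = "https://cdn.cizgifilmlerizle.com/cizgi/bobs.burgers.s09e05.mp4?st=f9OWlOq1e-2M9eUVvhZa8A&e=1618019876"
remaining = seconds_until_expiry(url)
if remaining is not None and remaining <= 0:
    print("The link has probably expired; copy a fresh src from the <video> element.")
```

If that is the case, repeating the inspect element steps gives a new src to paste into the script.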
