YouTube Data API: extracting the transcript text from a list of dictionaries



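I'm trying to get the transcripts of multiple videos in a playlist. When I run the code, I get the list below, which contains each video's id as a dictionary key and a list of dictionaries as the value. Does anyone know how to extract and join only the "text" values and store the result in a variable named "GetText"?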

Here is the code:

from googleapiclient.discovery import build
from youtube_transcript_api import YouTubeTranscriptApi
import os
api_key = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
nextPageToken = None
srt = []
vid_ids = []
vid_title = []
while True:
    #1.query API
    rq = build("youtube", "v3", developerKey=api_key).playlistItems().list(
        part="contentDetails, snippet",
        playlistId="PLNIs-AWhQzckr8Dgmgb3akx_gFMnpxTN5",
        maxResults=50,
        pageToken=nextPageToken
    ).execute()

    #2.Create a list with video Ids and Titles
    for item in rq["items"]:
        vid_ids.append(item["contentDetails"]["videoId"])
        vid_title.append(item["snippet"]["title"])

    nextPageToken = rq.get('nextPageToken')
    if not nextPageToken:
        break
#3.Get transcripts
for i in vid_ids:
    try:
        srt += [YouTubeTranscriptApi.get_transcripts([i])]
    except:
        print(f"{i} doesn't have a transcript")
print(srt)
#4.For each video id extract the Key:"text" from a list of dictionaries
?????????????????????

Here is part of the list of transcripts I get:

[
    ({
        "KHO5NIcZAc4":[
            {
                "text":"welcome to this wise ell tutorial in",
                "start":0.23,
                "duration":4.15
            },
            {
                "text":"this video we're going to teach you",
                "start":3.06,
                "duration":3.09
            },
            ...
        ]
    })
]

Frankly, I don't understand your problem.

This should be basic knowledge: use for-loops to work with the list and the dictionaries.

That's all.

data = [({'KHO5NIcZAc4':
[{'text': 'welcome to this wise ell tutorial in', 'start': 0.23, 'duration': 4.15}, {'text': "this video we're going to teach you", 'start': 3.06, 'duration': 3.09}, {'text': 'about working with the visual basic', 'start': 4.38, 'duration': 3.66}, {'text': 'editor application with a name to', 'start': 6.15, 'duration': 4.409}, {'text': 'writing some Excel VBA code in this', 'start': 8.04, 'duration': 3.66}, {'text': "video we're not going to write any code", 'start': 10.559, 'duration': 2.881}, {'text': 'itself but we are going to do is show', 'start': 11.7, 'duration': 3.45}, {'text': 'you how you can set up and work with the', 'start': 13.44, 'duration': 3.839}, {'text': "visual basic editor so I'll start by", 'start': 15.15, 'duration': 3.99}, {'text': 'showing you how you can access the VBA', 'start': 17.279, 'duration': 3.75}, {'text': 'deter from whichever version of Excel', 'start': 19.14, 'duration': 4.11}, {'text': "you happen to be working in we'll talk", 'start': 21.029, 'duration': 3.931}, {'text': 'about how you can switch between the the', 'start': 23.25, 'duration': 4.17}, {'text': 'VBA editor and Excel itself with some', 'start': 24.96, 'duration': 4.649}, {'text': "nice quick keyboard shortcuts we'll also", 'start': 27.42, 'duration': 3.54}, {'text': 'give you a quick whirlwind tour of the', 'start': 29.609, 'duration': 3.001}, {'text': 'VB screen and explain what the main', 'start': 30.96, 'duration': 4.259}, {'text': 'window is in the VB editor application', 'start': 32.61, 'duration': 5.4}]
})]
for item in data:
    #print(item)
    for video_id, transcript in item.items():
        print('ID:', video_id)
        all_parts = []
        for part in transcript:
            #print(part['text'])
            all_parts.append(part['text'])

        full_text = " ".join(all_parts)
        print(full_text)

Result:

ID: KHO5NIcZAc4
welcome to this wise ell tutorial in this video we're going to teach you about working with the visual basic editor application with a name to writing some Excel VBA code in this video we're not going to write any code itself but we are going to do is show you how you can set up and work with the visual basic editor so I'll start by showing you how you can access the VBA deter from whichever version of Excel you happen to be working in we'll talk about how you can switch between the the VBA editor and Excel itself with some nice quick keyboard shortcuts we'll also give you a quick whirlwind tour of the VB screen and explain what the main window is in the VB editor application
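
If you want to keep the joined text instead of printing it, the same loops can fill a dictionary keyed by video id, which may be closer to the "GetText" variable asked for in the question. This is only a sketch: `join_transcripts` is a hypothetical helper name, and it assumes each element of the list is either a plain dictionary (as in `data` above) or a tuple whose first element is that dictionary, which the parentheses in the printed output suggest.

```python
def join_transcripts(results):
    """Map each video id to its transcript parts joined into one string."""
    texts = {}
    for item in results:
        # Assumption: an item is either a dict or a tuple whose first
        # element is the dict of {video_id: [parts]}.
        transcripts = item[0] if isinstance(item, tuple) else item
        for video_id, parts in transcripts.items():
            texts[video_id] = " ".join(part["text"] for part in parts)
    return texts


# Shortened sample in the shape shown in the question.
sample = [
    ({"KHO5NIcZAc4": [
        {"text": "welcome to this wise ell tutorial in", "start": 0.23, "duration": 4.15},
        {"text": "this video we're going to teach you", "start": 3.06, "duration": 3.09},
    ]}, []),
]

GetText = join_transcripts(sample)
print(GetText["KHO5NIcZAc4"])
```

With the question's code you would call it as `GetText = join_transcripts(srt)` after the loop that fills `srt`.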

BTW:

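When you use a for-loop to work with a list or a dictionary, you can use print(...), print(type(...)) and print( some_dictionary.keys() ) to see what a variable contains and what you can use in the nested for-loops.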
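
For example, a quick inspection pass over a structure like the one in the question could look like this. The `data` value here is a shortened stand-in with a single transcript part, not the real API output.

```python
# Shortened stand-in for the structure shown in the question.
data = [({"KHO5NIcZAc4": [{"text": "welcome", "start": 0.23, "duration": 4.15}]})]

for item in data:
    print(type(item))    # what kind of object each list element is
    print(item.keys())   # which video ids are available at this level
    for video_id, transcript in item.items():
        print(type(transcript))      # what is stored behind each video id
        print(transcript[0].keys())  # which fields one transcript part has
```

Each print tells you what to loop over, or which key to use, at the next level of nesting.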
