python: get all youtube video urls of a channel
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must do so under the same license and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/15512239/
Asked by Johnny
I want to get all video URLs of a specific channel. I think JSON with Python or Java would be a good choice. I can get the newest video with the following code, but how can I get ALL video links (>500)?
import urllib, json
author = 'Youtube_Username'
inp = urllib.urlopen(r'http://gdata.youtube.com/feeds/api/videos?max-results=1&alt=json&orderby=published&author=' + author)
resp = json.load(inp)
inp.close()
first = resp['feed']['entry'][0]
print first['title'] # video title
print first['link'][0]['href'] #url
Accepted answer by max k.
Increase max-results from 1 to however many you want, but beware: they advise against grabbing too many in one call and will limit you to 50 (https://developers.google.com/youtube/2.0/developers_guide_protocol_api_query_parameters).

Instead you could consider grabbing the data down in batches of 25, say, by changing start-index until none come back.
EDIT: Here's the code for how I would do it:
import urllib, json
author = 'Youtube_Username'

foundAll = False
ind = 1
videos = []
while not foundAll:
    inp = urllib.urlopen(r'http://gdata.youtube.com/feeds/api/videos?start-index={0}&max-results=50&alt=json&orderby=published&author={1}'.format(ind, author))
    try:
        resp = json.load(inp)
        inp.close()
        returnedVideos = resp['feed']['entry']
        for video in returnedVideos:
            videos.append(video)
        ind += 50
        print len(videos)
        if len(returnedVideos) < 50:
            foundAll = True
    except:
        # catch the case where the number of videos in the channel is a multiple of 50
        print "error"
        foundAll = True

for video in videos:
    print video['title']  # video title
    print video['link'][0]['href']  # url
Answered by dSebastien
Based on the code found here and in some other places, I've written a small script that does this. My script uses v3 of Youtube's API and does not run into the 500-result limit that Google has set for searches.

The code is available on GitHub: https://github.com/dsebastien/youtubeChannelVideosFinder
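A well-known way such scripts sidestep the search endpoint's result cap is to page through the channel's "uploads" playlist instead of using search. The API's official route to that playlist is channels.list → contentDetails.relatedPlaylists.uploads, but for standard channel IDs the uploads playlist ID conventionally equals the channel ID with its 'UC' prefix swapped for 'UU'. A minimal sketch of that mapping (the network call that would consume the resulting ID is left as a comment, since it needs an API key):

```python
def uploads_playlist_id(channel_id):
    """Map a channel ID to its 'uploads' playlist ID.

    Standard channel IDs start with 'UC'; the matching uploads playlist
    conventionally shares the same suffix with a 'UU' prefix. Paging
    through that playlist with playlistItems.list returns every upload,
    unlike search.list, which caps out around 500 results.
    """
    if not channel_id.startswith('UC'):
        raise ValueError('expected a channel ID starting with "UC"')
    return 'UU' + channel_id[2:]

# The derived playlist ID would then be passed to
# https://www.googleapis.com/youtube/v3/playlistItems
# as playlistId=..., together with part=contentDetails and your API key.
```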
Answered by Stian
After the youtube API change, max k.'s answer does not work. As a replacement, the function below provides a list of the youtube videos in a given channel. Please note that you need an API key for it to work.
import urllib
import json

def get_all_video_in_channel(channel_id):
    api_key = 'YOUR API KEY'  # replace with your own API key

    base_video_url = 'https://www.youtube.com/watch?v='
    base_search_url = 'https://www.googleapis.com/youtube/v3/search?'

    first_url = base_search_url + 'key={}&channelId={}&part=snippet,id&order=date&maxResults=25'.format(api_key, channel_id)

    video_links = []
    url = first_url
    while True:
        inp = urllib.urlopen(url)
        resp = json.load(inp)

        for i in resp['items']:
            if i['id']['kind'] == "youtube#video":
                video_links.append(base_video_url + i['id']['videoId'])

        try:
            next_page_token = resp['nextPageToken']
            url = first_url + '&pageToken={}'.format(next_page_token)
        except KeyError:  # no nextPageToken means we reached the last page
            break

    return video_links
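The nextPageToken loop is the part of that function worth isolating: the sketch below factors it out so it can be exercised without network access. Here `fetch_page` is a hypothetical stand-in for the HTTP call; it takes a page token (or None for the first page) and must return a dict shaped like a v3 search response.

```python
def collect_video_ids(fetch_page):
    """Walk a paginated v3-style API response using nextPageToken.

    fetch_page(token) is a caller-supplied stand-in for the HTTP request;
    it returns a dict with an 'items' list and, while more pages remain,
    a 'nextPageToken' key.
    """
    video_ids = []
    token = None
    while True:
        resp = fetch_page(token)
        for item in resp['items']:
            # search results mix videos, channels, and playlists
            if item['id']['kind'] == 'youtube#video':
                video_ids.append(item['id']['videoId'])
        token = resp.get('nextPageToken')
        if token is None:  # last page reached
            break
    return video_ids
```

This mirrors the try/except in the function above, but using dict.get keeps the control flow explicit and makes the loop trivial to test against canned responses.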
Answered by Gajendra D Ambi
An independent way of doing things. No API, no rate limit.
import requests

username = "marquesbrownlee"
# interpolate the username into the URL (the original passed the literal
# string "username", which requests a nonexistent channel)
url = "https://www.youtube.com/user/{}/videos".format(username)
page = requests.get(url).content
data = str(page).split(' ')
item = 'href="/watch?'
vids = [line.replace('href="', 'youtube.com') for line in data if item in line]  # list of all videos, each listed twice
print(vids[0])  # the latest video
The code above scrapes only a limited number of video URLs, at most about 60. How can I grab all of the video URLs present in the channel? Please advise.

The snippet above also displays only the videos listed on the first page, each of them twice, not all the video URLs in the channel.

