JavaScript: improving a regex for parsing YouTube / Vimeo URLs

Question

Asked by Daniel

I've made a function (in JavaScript) that takes a URL from either YouTube or Vimeo. It figures out the provider and ID for that particular video (demo: http://jsfiddle.net/csjwf/).

function parseVideoURL(url) {

    var provider = url.match(/http:\/\/(:?www.)?(\w*)/)[2],
        id;

    if(provider == "youtube") {

        id = url.match(/http:\/\/(?:www.)?(\w*).com\/.*v=(\w*)/)[2];
    } else if (provider == "vimeo") {

        id = url.match(/http:\/\/(?:www.)?(\w*).com\/(\d*)/)[2];
    } else {
        throw new Error("parseVideoURL() takes a YouTube or Vimeo URL");    
    }
    return {
        provider : provider,
        id : id
    }
}

It works; however, as a regex novice, I'm looking for ways to improve it. The input I'm dealing with typically looks like this:

http://vimeo.com/(id)
http://youtube.com/watch?v=(id)&blahblahblah.....

1) Right now I'm doing three separate matches. Would it make sense to try to do everything in one single expression? If so, how?

2) Could the existing matches be more concise? Are they unnecessarily complex, or perhaps insufficient?

3) Are there any YouTube or Vimeo URLs that would fail to parse? I've tried quite a few and so far it seems to work pretty well.

To summarize: I'm simply looking for ways to improve the above function. Any advice is greatly appreciated.

Answer 1

Accepted answer by sawa

I am not sure about your question 3), but provided that your induction on the URL forms is correct, the regexes can be combined into one as follows:

/http:\/\/(?:www.)?(?:(vimeo).com\/(.*)|(youtube).com\/watch\?v=(.*?)&)/

You will get the matches at different positions (1st and 2nd capture groups for vimeo, 3rd and 4th for youtube), so you just need to handle that.
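
Handling the different positions could look roughly like the sketch below. It is not sawa's exact pattern: the dots are escaped, https is allowed, the Vimeo id is assumed to be numeric, and the YouTube id is allowed to end either at "&" or at the end of the string. The names VIDEO_URL_RE and parseVideoURLCombined are made up for illustration.

```javascript
// Sketch only: a variation on the combined pattern above (see assumptions in
// the lead-in). Groups 1-2 belong to the vimeo branch, 3-4 to the youtube one.
var VIDEO_URL_RE = /https?:\/\/(?:www\.)?(?:(vimeo)\.com\/(\d+)|(youtube)\.com\/watch\?v=([^&]+))/;

// Hypothetical helper name, not from the original post.
function parseVideoURLCombined(url) {
    var m = VIDEO_URL_RE.exec(url);
    if (!m) {
        return null; // neither provider matched
    }
    // Only one branch of the alternation takes part in a match, so the
    // groups of the other branch are undefined.
    return {
        provider: m[1] || m[3],
        id: m[2] || m[4]
    };
}

console.log(parseVideoURLCombined("http://vimeo.com/25451551"));
console.log(parseVideoURLCombined("http://www.youtube.com/watch?v=My2FRPA3Gf8&feature=related"));
```

Returning null instead of throwing is a design choice of this sketch; the function in the question throws for unknown providers.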

Or, if you are quite sure that vimeo's id only includes numbers, then you can do:

/http:\/\/(?:www.)?(vimeo|youtube).com\/(?:watch\?v=)?(.*?)(?:\z|&)/

and the provider and the id will appear under the 1st and 2nd match, respectively.
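
One thing to watch when using this pattern in JavaScript: there is no `\z` end-of-string anchor there, and without the `u` flag `\z` is read as a literal "z", so an id containing a "z" would be cut short. A minimal sketch, assuming `$` is an acceptable replacement and that the dots should be escaped (PROVIDER_ID_RE and parseProviderAndId are hypothetical names):

```javascript
// Sketch of the second pattern adapted for JavaScript: "$" replaces "\z"
// and the dots are escaped. Both changes are assumptions of this sketch.
var PROVIDER_ID_RE = /http:\/\/(?:www\.)?(vimeo|youtube)\.com\/(?:watch\?v=)?(.*?)(?:$|&)/;

// Hypothetical helper name, not from the original post.
function parseProviderAndId(url) {
    var m = PROVIDER_ID_RE.exec(url);
    // Group 1 is the provider, group 2 the id (lazy match up to "&" or end).
    return m ? { provider: m[1], id: m[2] } : null;
}

console.log(parseProviderAndId("http://vimeo.com/25451551"));
console.log(parseProviderAndId("http://youtube.com/watch?v=abzdef&feature=related"));
```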

Answer 2

Answer by Yangshun Tay

Here's my attempt at the regex, which covers most of the updated cases:

function parseVideo(url) {
    // - Supported YouTube URL formats:
    //   - http://www.youtube.com/watch?v=My2FRPA3Gf8
    //   - http://youtu.be/My2FRPA3Gf8
    //   - https://youtube.googleapis.com/v/My2FRPA3Gf8
    // - Supported Vimeo URL formats:
    //   - http://vimeo.com/25451551
    //   - http://player.vimeo.com/video/25451551
    // - Also supports relative URLs:
    //   - //player.vimeo.com/video/25451551

    // Capture groups: 3 = host, 6 = video id
    var match = url.match(/(https?:)?\/\/(player\.|www\.)?(vimeo\.com|youtu(be\.com|\.be|be\.googleapis\.com))\/(video\/|embed\/|watch\?v=|v\/)?([A-Za-z0-9._%-]*)(&\S+)?/);
    if (!match) {
        return null; // not a recognised YouTube or Vimeo URL
    }

    var type = null;
    if (match[3].indexOf('youtu') > -1) {
        type = 'youtube';
    } else if (match[3].indexOf('vimeo') > -1) {
        type = 'vimeo';
    }

    return {
        type: type,
        id: match[6]
    };
}

Answer 3

Answer by Jason Sebring

Regex is wonderfully terse but can quickly get complicated.

http://jsfiddle.net/8nagx2sk/

function parseYouTube(str) {
    // link : //youtube.com/watch?v=Bo_deCOd1HU
    // share : //youtu.be/Bo_deCOd1HU
    // embed : //youtube.com/embed/Bo_deCOd1HU

    var re = /\/\/(?:www\.)?youtu(?:\.be|be\.com)\/(?:watch\?v=|embed\/)?([a-z0-9_\-]+)/i; 
    var matches = re.exec(str);
    return matches && matches[1];
}

function parseVimeo(str) {
    // embed & link: http://vimeo.com/86164897

    var re = /\/\/(?:www\.)?vimeo\.com\/([0-9a-z\-_]+)/i;
    var matches = re.exec(str);
    return matches && matches[1];
}

Sometimes simple code is nicer to your fellow developers.

https://jsfiddle.net/1dzb5ag1/

// protocol and www neutral
function getVideoId(url, prefixes) {
  var cleaned = url.replace(/^(https?:)?\/\/(www\.)?/, '');
  for(var i = 0; i < prefixes.length; i++) {
    if (cleaned.indexOf(prefixes[i]) === 0)
      return cleaned.substr(prefixes[i].length)
  }
  return undefined;
}

function getYouTubeId(url) {
  return getVideoId(url, [
    'youtube.com/watch?v=',
    'youtu.be/',
    'youtube.com/embed/',
    'youtube.googleapis.com/v/'
  ]);
}

function getVimeoId(url) {
  return getVideoId(url, [
    'vimeo.com/',
    'player.vimeo.com/video/'
  ]);
}

Which do you prefer to update?

Answer 4

Answer by Ming-Tang

Here is my regex:

http://jsfiddle.net/csjwf/1/

Answer 5

Answer by Romain

About sawa's answer, a small update to the second regex:

/http:\/\/(?:www\.)?(vimeo|youtube)\.com\/(?:watch\?v=)?(.*?)(?:\z|$|&)/

(Escaping the dots prevents matching URLs of the form www_vimeo_com/…, and $ was added…)

Here is the same idea for matching the embed URLs:

/http:\/\/(?:www\.|player\.)?(vimeo|youtube)\.com\/(?:embed\/|video\/)?(.*?)(?:\z|$|\?)/

Answer 6

Answer by fluffyBatman

For Vimeo, don't rely on regex, as Vimeo tends to change/update its URL patterns every now and then. As of October 2nd, 2017, Vimeo supports six URL schemes in total:

https://vimeo.com/*
https://vimeo.com/*/*/video/*
https://vimeo.com/album/*/video/*
https://vimeo.com/channels/*/*
https://vimeo.com/groups/*/videos/*
https://vimeo.com/ondemand/*/*

Instead, use their API to validate Vimeo URLs. The oEmbed API (doc) takes a URL, checks its validity, and returns an object with a bunch of video information (check out the dev page). Although it is not intended for this, we can easily use it to check whether a given URL is from Vimeo or not.

So, with Ajax it would look like this:

var VIMEO_BASE_URL = "https://vimeo.com/api/oembed.json?url=";
var yourTestUrl = "https://vimeo.com/23374724";

$.ajax({
  // Encode the video URL so it survives as a query-string value
  url: VIMEO_BASE_URL + encodeURIComponent(yourTestUrl),
  type: 'GET',
  success: function(data) {
    if (data != null && data.video_id > 0) {
      // Valid Vimeo url
    } else {
      // not a valid Vimeo url
    }
  },
  error: function(data) {
    // not a valid Vimeo url
  }
});

Answer 7

Answer by vrijdenker

3) Your regex does not match https URLs. I haven't tested it, but I guess the "http://" part would become "http(s)?://". Note that this would change the matching positions of the provider and id.
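
If the shifted positions are a concern, a plain `s?` (without parentheses) accepts https without adding a capture group. Below is a minimal sketch based on the provider pattern from the question, assuming the www prefix is meant to be non-capturing and its dot escaped; PROVIDER_RE and getProvider are hypothetical names.

```javascript
// Sketch: "s?" makes the "s" optional without creating a capture group,
// so the provider stays in group 1 for both http and https URLs.
var PROVIDER_RE = /https?:\/\/(?:www\.)?(\w+)/;

// Hypothetical helper name, not from the original post.
function getProvider(url) {
    var m = PROVIDER_RE.exec(url);
    return m ? m[1] : null;
}

console.log(getProvider("http://www.youtube.com/watch?v=My2FRPA3Gf8"));
console.log(getProvider("https://vimeo.com/25451551"));
```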

Answer 8

Answer by mica

Just in case, here is a PHP version:

/*
* parseVideo
* @param (string) $url 
* mi-ca.ch 27.05.2016
* parse vimeo & youtube id
* format url for iframe embed 
* https://regex101.com/r/lA0fP4/1
*/

function parseVideo($url) {
  $re = "/(http:|https:|)\/\/(player.|www.)?(vimeo\.com|youtu(be\.com|\.be|be\.googleapis\.com))\/(video\/|embed\/|watch\?v=|v\/)?([A-Za-z0-9._%-]*)(\&\S+)?/";

  // Bail out early if the URL does not match at all
  if (!preg_match($re, $url, $matches)) {
    return false;
  }

  // strpos() returns false when the needle is absent, so compare strictly
  if (strpos($matches[3], 'youtu') !== false) {
    $type = 'youtube';
    $src = 'https://www.youtube.com/embed/' . $matches[6];
  } else if (strpos($matches[3], 'vimeo') !== false) {
    $type = 'vimeo';
    $src = 'https://player.vimeo.com/video/' . $matches[6];
  } else {
    return false;
  }

  return array(
    'type' => $type,        // 'youtube' or 'vimeo'
    'id'   => $matches[6],  // the video id
    'src'  => $src          // the src for iframe embed
  );
}

Answer 9

Answer by JM123

I based this on the previous answers, but I needed more out of the regex.

Maybe it worked in 2011, but in 2019 the syntax has changed a bit. So this is a refresh.

The regex will allow us to detect whether the URL is YouTube or Vimeo. I've added capture groups to easily retrieve the video ID.

If you run it with the case-insensitive setting, please remove the (?i).

(?:(?i)(?:https:|http:)?\/\/)?(?:(?i)(?:www\.youtube\.com\/(?:embed\/|watch\?v=)|youtu\.be\/|youtube\.googleapis\.com\/v\/)(?<YoutubeID>[a-z0-9-_]{11,12})|(?:vimeo\.com\/|player\.vimeo\.com\/video\/)(?<VimeoID>[0-9]+))

https://regex101.com/r/PVdjg0/2
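
The pattern above uses an inline `(?i)`, which a JavaScript regex literal does not accept. A rough JavaScript sketch of the same idea follows, assuming the `i` flag can stand in for `(?i)` and that the engine supports named capture groups (ES2018); VIDEO_RE and detectVideo are hypothetical names, and the hyphen was moved to the end of the character class.

```javascript
// Sketch: the pattern above as a JavaScript literal. The inline (?i) is
// dropped in favour of the "i" flag; named groups need an ES2018 engine.
var VIDEO_RE = /(?:(?:https:|http:)?\/\/)?(?:(?:www\.youtube\.com\/(?:embed\/|watch\?v=)|youtu\.be\/|youtube\.googleapis\.com\/v\/)(?<YoutubeID>[a-z0-9_-]{11,12})|(?:vimeo\.com\/|player\.vimeo\.com\/video\/)(?<VimeoID>[0-9]+))/i;

// Hypothetical helper name, not from the original post.
function detectVideo(url) {
    var m = VIDEO_RE.exec(url);
    if (!m) {
        return null; // neither a YouTube nor a Vimeo URL
    }
    // Exactly one of the two named groups is set on a successful match.
    if (m.groups.YoutubeID) {
        return { type: "youtube", id: m.groups.YoutubeID };
    }
    return { type: "vimeo", id: m.groups.VimeoID };
}

console.log(detectVideo("https://www.youtube.com/watch?v=My2FRPA3Gf8"));
console.log(detectVideo("//player.vimeo.com/video/25451551"));
```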

Answer 10

Answer by Max Haponenko

I had a task to enable adding Dropbox videos. So the same input should take an href, check it, and transform it into a playable link, which I can then insert in .

const getPlayableUrl = (url) => {
    // Check youtube and vimeo (capture group 3 = host, group 6 = video id)
    const firstCheck = url.match(/(http:|https:|)\/\/(player.|www.)?(vimeo\.com|youtu(be\.com|\.be|be\.googleapis\.com))\/(video\/|embed\/|watch\?v=|v\/)?([A-Za-z0-9._%-]*)(\&\S+)?/);

    if (firstCheck) {
        if (firstCheck[3].indexOf('youtu') > -1) {
            return '//www.youtube.com/embed/' + firstCheck[6];
        } else if (firstCheck[3].indexOf('vimeo') > -1) {
            return 'https://player.vimeo.com/video/' + firstCheck[6];
        }
    } else {
        // Check dropbox: cut the URL right after the video file extension
        let candidate = '';
        if (url.indexOf('.mp4') !== -1) {
            candidate = url.slice(0, url.indexOf('.mp4') + 4);
        } else if (url.indexOf('.m4v') !== -1) {
            candidate = url.slice(0, url.indexOf('.m4v') + 4);
        } else if (url.indexOf('.webm') !== -1) {
            candidate = url.slice(0, url.indexOf('.webm') + 5);
        }

        const secondCheck = candidate.match(/(http:|https:|)\/\/(player.|www.)?(dropbox\.com)\/(s\/|embed\/|watch\?v=|v\/)?([A-Za-z0-9._%-]*\/)?(.*)/);
        if (secondCheck) {
            // Assumption: groups 4-6 (the path after the host) are the ones
            // intended here. Unmatched optional groups are undefined, so
            // fall back to ''.
            return 'https://dropbox.com/' + (secondCheck[4] || '') + (secondCheck[5] || '') + (secondCheck[6] || '') + '?raw=1';
        } else {
            throw Error("Not supported video resource.");
        }
    }
}
