JavaScript regex for matching/extracting file extension

Note: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/6582171/

Date: 2020-08-23 22:22:20  Source: igfitidea

javascript regex

Asked by mare

The following regex

var patt1=/[0-9a-z]+$/i;

extracts the file extension of strings such as

filename-jpg
filename#gif
filename.png

How can I modify this regular expression so that it returns an extension only when the string really is a filename with a dot as the separator? (Obviously, filename#gif is not a regular filename.)

UPDATE: Based on tvanofsson's comments, I would like to clarify that when the JS function receives the string, the string will already contain a filename without spaces, dots or other special characters (it will actually be handled as a slug). The problem was not in parsing filenames but in incorrectly parsing slugs: the function was returning an extension of "jpg" when it was given "filename-jpg", when it should really return null or an empty string, and it is this behaviour that needed to be corrected.
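
To make the problem concrete, here is a minimal sketch of the behaviour described above. The helper name naiveExtension is hypothetical (it does not appear in the question); it simply wraps the original pattern.

```javascript
// Original pattern from the question: any trailing run of letters/digits.
const naivePattern = /[0-9a-z]+$/i;

// Hypothetical helper: returns the matched text, or null when nothing matches.
function naiveExtension(name) {
  const m = name.match(naivePattern);
  return m ? m[0] : null;
}

console.log(naiveExtension("filename.png")); // "png"
console.log(naiveExtension("filename-jpg")); // "jpg" (the unwanted result)
console.log(naiveExtension("filename#gif")); // "gif" (also unwanted)
```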

Answered by stema

Just add a . to the regex:

var patt1=/\.[0-9a-z]+$/i;

Because the dot is a special character in regex, you need to escape it to match it literally: \. (a backslash followed by a dot).

Your pattern will now match any string that ends with a dot followed by at least one character from [0-9a-z].

Example:

[
  "foobar.a",
  "foobar.txt",
  "foobar.foobar1234"
].forEach( t => 
  console.log(
    t.match(/\.[0-9a-z]+$/i)[0]
  ) 
)
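
Note that .match() returns null when the pattern does not match, so indexing [0] directly would throw for input such as "filename-jpg". A minimal sketch of a guarded wrapper follows; the name getExtension is an assumption for illustration and is not part of the answer.

```javascript
// Pattern from the answer: a literal dot followed by letters/digits at the end.
const dotPattern = /\.[0-9a-z]+$/i;

// Hypothetical wrapper: returns the extension including the dot, or null.
function getExtension(name) {
  const m = name.match(dotPattern);
  return m === null ? null : m[0];
}

console.log(getExtension("filename.png")); // ".png"
console.log(getExtension("filename-jpg")); // null
console.log(getExtension("filename#gif")); // null
```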



If you also want to limit the extension to a certain number of characters, then you need to replace the +:

var patt1=/\.[0-9a-z]{1,5}$/i;

This allows at least 1 and at most 5 characters after the dot.
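
As a quick check of the length limit, the sketch below compares a short and a long extension. The variable names are illustrative only.

```javascript
// The answer's pattern, with 1 to 5 characters allowed after the dot.
const limitedPattern = /\.[0-9a-z]{1,5}$/i;

const shortMatch = "foobar.txt".match(limitedPattern);
const longMatch = "foobar.foobar1234".match(limitedPattern);

console.log(shortMatch && shortMatch[0]); // ".txt"
console.log(longMatch);                   // null (10 characters after the dot)
```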

Answered by Már Örlygsson

Try

var patt1 = /\.([0-9a-z]+)(?:[\?#]|$)/i;

This RegExp is useful for extracting file extensions from URLs, even ones that have ?foo=1 query strings and #hash endings.

It will also provide you with the extension as $1.

var m1 = ("filename-jpg").match(patt1);
alert(m1);  // null

var m2 = ("filename#gif").match(patt1);
alert(m2);  // null

var m3 = ("filename.png").match(patt1);
alert(m3);  // [".png", "png"]

var m4 = ("filename.txt?foo=1").match(patt1);
alert(m4);  // [".txt?", "txt"]

var m5 = ("filename.html#hash").match(patt1);
alert(m5);  // [".html#", "html"]
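
If only the extension itself is needed, capture group 1 can be read from the match result. A minimal sketch, assuming a hypothetical helper named urlExtension:

```javascript
const urlPattern = /\.([0-9a-z]+)(?:[\?#]|$)/i;

// Hypothetical helper: returns the extension without the dot, or null.
function urlExtension(url) {
  const m = url.match(urlPattern);
  return m ? m[1] : null;
}

console.log(urlExtension("filename.txt?foo=1")); // "txt"
console.log(urlExtension("filename.html#hash")); // "html"
console.log(urlExtension("archive.tar.gz"));     // "gz"
console.log(urlExtension("filename-jpg"));       // null
```

Note that a bare host such as "http://example.com" would yield "com", so this sketch assumes the input is a path or filename rather than an arbitrary URL.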

P.S. +1 for @stema, who offers pretty good advice on some of the RegExp syntax basics involved.

Answered by AhbapAldirmaz

Example list:

var fileExtensionPattern = /\.([0-9a-z]+)(?=[?#])|(\.)(?:[\w]+)$/gmi
//regex flags -- Global, Multiline, Insensitive

var ma1 = 'css/global.css?v=1.2'.match(fileExtensionPattern)[0];
console.log(ma1);
// returns .css

var ma2 = 'index.html?a=param'.match(fileExtensionPattern)[0];
console.log(ma2);
// returns .html

var ma3 = 'default.aspx?'.match(fileExtensionPattern)[0];
console.log(ma3);
// returns .aspx

var ma4 = 'pages.jsp#firstTab'.match(fileExtensionPattern)[0];
console.log(ma4);
// returns .jsp

var ma5 = 'jquery.min.js'.match(fileExtensionPattern)[0];
console.log(ma5);
// returns .js

var ma6 = 'file.123'.match(fileExtensionPattern)[0];
console.log(ma6);
// returns .123
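
Because the pattern carries the g flag, .match() returns an array of every match rather than a single match with capture groups, and it still returns null when nothing matches. The sketch below illustrates both points; firstExtension is a hypothetical name.

```javascript
const globalPattern = /\.([0-9a-z]+)(?=[?#])|(\.)(?:[\w]+)$/gmi;

// With the g flag, every match is listed, including the ".2" in the query string.
const allMatches = "css/global.css?v=1.2".match(globalPattern);
console.log(allMatches); // [".css", ".2"]

// Hypothetical guard: take the first match, or null if there is none.
function firstExtension(text) {
  const all = text.match(globalPattern);
  return all ? all[0] : null;
}

console.log(firstExtension("jquery.min.js")); // ".js"
console.log(firstExtension("README"));        // null
```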

Test page.

Answered by Kamil Kiełczewski

ONELINER:

let ext = (filename.match(/\.([^.]*?)(?=\?|#|$)/) || [])[1] 

The above solution also handles links. Examples (filename -> ext):

// "abcd.Ef1"               -> "Ef1"
// "abcd.efg"               -> "efg"
// "abcd.efg?aaa&a?a=b#cb"  -> "efg"
// "abcd.efg#aaa__aa?bb"    -> "efg"
// "abcd"                   -> undefined
// "abcdefg?aaa&aa=bb"      -> undefined
// "abcdefg#aaa__bb"        -> undefined

It takes everything between the last dot and the first "?" or "#" character, or the end of the string. To ignore "?" and "#" characters (that is, to treat them as part of the extension), use /\.([^.]*)$/. To ignore only "#", use /\.([^.]*?)(?=\?|$)/.
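
The difference between the three patterns can be seen by running them against similar inputs. This is a small sketch; the helper name extWith and the pattern variable names are hypothetical.

```javascript
// Hypothetical helper: apply a pattern and return capture group 1 (or undefined).
const extWith = (pattern, name) => (name.match(pattern) || [])[1];

const stopAtBoth = /\.([^.]*?)(?=\?|#|$)/; // stops at "?" or "#"
const stopAtNone = /\.([^.]*)$/;           // "?" and "#" are kept
const stopAtQuery = /\.([^.]*?)(?=\?|$)/;  // stops at "?" only

console.log(extWith(stopAtBoth, "abcd.efg?aaa#bb"));  // "efg"
console.log(extWith(stopAtNone, "abcd.efg?aaa#bb"));  // "efg?aaa#bb"
console.log(extWith(stopAtQuery, "abcd.efg#aaa?bb")); // "efg#aaa"
console.log(extWith(stopAtBoth, "abcd"));             // undefined
```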
