javascript regex for extracting filename from Content-Disposition header
Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors: Stack Overflow
Original question: http://stackoverflow.com/questions/23054475/
Asked by adnan kamili
The Content-Disposition header contains a filename that is usually easy to extract, but sometimes the value is wrapped in double quotes, sometimes it has no quotes, and there are probably other variants too. Can someone write a regex that works in all of these cases?
Content-Disposition: attachment; filename=content.txt
Here are some of the possible target strings:
attachment; filename=content.txt
attachment; filename*=UTF-8''filename.txt
attachment; filename="EURO rates"; filename*=utf-8''%e2%82%ac%20rates
attachment; filename="omáèka.jpg"
Other combinations may also occur.
Answered by Robin
You could try something in this spirit:
filename[^;=\n]*=((['"]).*?\2|[^;\n]*)
filename        # match filename, followed by
[^;=\n]*        # anything but a ;, a = or a newline
=
(               # first capturing group
    (['"])      # either single or double quote, put it in capturing group 2
    .*?         # anything up until the first...
    \2          # matching quote (single if we found single, double if we found double)
|               # OR
    [^;\n]*     # anything but a ; or a newline
)
Your filename is in the first capturing group: http://regex101.com/r/hJ7tS6
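As a rough illustration of how this pattern might be applied from JavaScript, the sketch below reads the first capturing group with `String.prototype.match`. The helper name `extractFilename` is hypothetical, and the sketch assumes the quoted alternative is closed by the `\2` backreference described in the breakdown. Note that group 1 still includes the surrounding quotes when the value is quoted.

```javascript
// Hypothetical helper, not part of the original answer.
// Assumes the quoted branch ends with the \2 backreference.
const FILENAME_RE = /filename[^;=\n]*=((['"]).*?\2|[^;\n]*)/;

function extractFilename(header) {
  const match = header.match(FILENAME_RE);
  // Group 1 holds the raw value, quotes included if they were present.
  return match ? match[1] : null;
}

console.log(extractFilename('attachment; filename=content.txt'));
console.log(extractFilename('attachment; filename="EURO rates"'));
```

A caller that wants the bare name would still need to strip the quotes from the quoted case.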
Answered by h0wXD
Slightly modified to match my use case (strips all quotes and UTF tags):
filename\*?=['"]?(?:UTF-\d['"]*)?([^;\r\n"']*)['"]?;?
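A minimal sketch of how this variant could be used in JavaScript follows; the function name `extractBareFilename` is an assumption made for illustration. It shows that the single capturing group already excludes the quotes and the `UTF-8''` tag.

```javascript
// Hypothetical helper, not part of the original answer.
const STRIP_RE = /filename\*?=['"]?(?:UTF-\d['"]*)?([^;\r\n"']*)['"]?;?/;

function extractBareFilename(header) {
  const match = STRIP_RE.exec(header);
  // Group 1 is the filename without quotes or the UTF-8'' prefix.
  return match ? match[1] : null;
}

console.log(extractBareFilename("attachment; filename*=UTF-8''filename.txt"));
console.log(extractBareFilename('attachment; filename="EURO rates"'));
```

As written, the `UTF-\d` prefix is matched case-sensitively, so a lowercase `utf-8''` prefix is not stripped unless the `i` flag is added.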
Answered by def00111
/filename[^;=\n]*=(?:(\\?['"])(.*?)\1|(?:[^\s]+'.*?')?([^;\n]*))/i
https://regex101.com/r/hJ7tS6/51
Edit: You can also use this parser: https://github.com/Rob--W/open-in-browser/blob/master/extension/content-disposition.js
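The sketch below shows one way the result of this pattern might be read in JavaScript. It assumes the quoted branch is `(\\?['"])(.*?)\1`, with an optional backslash before the quote and a closing backreference; the helper name `filenameFromDisposition` is hypothetical. With that form, a quoted value lands in group 2 and an unquoted or charset-prefixed value lands in group 3.

```javascript
// Hypothetical helper, not part of the original answer.
// Assumes the quoted branch is (\\?['"])(.*?)\1.
const DISPOSITION_RE =
  /filename[^;=\n]*=(?:(\\?['"])(.*?)\1|(?:[^\s]+'.*?')?([^;\n]*))/i;

function filenameFromDisposition(header) {
  const match = DISPOSITION_RE.exec(header);
  if (!match) return null;
  // Group 2 participates for quoted values, group 3 otherwise.
  return match[2] !== undefined ? match[2] : match[3];
}

console.log(filenameFromDisposition('attachment; filename="EURO rates"'));
console.log(filenameFromDisposition("attachment; filename*=UTF-8''filename.txt"));
```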
Answered by Antoine Bolvy
Disclaimer: the following answer only works with PCRE (e.g. Python / PHP). If you have to use JavaScript, use Robin's answer.
This modified version of Robin's regex strips the quotes:
filename[^;\n=]*=(['\"])*(.*)(?(1)\1|)
filename        # match filename, followed by
[^;=\n]*        # anything but a ;, a = or a newline
=
(['"])*         # either single or double quote, put it in capturing group 1
(?:utf-8\'\')?  # removes the utf-8 part from the match
(.*)            # second capturing group, will contain the filename
(?(1)\1|)       # if clause: if first capturing group is not empty,
                # match it again (the quotes), else match nothing
https://regex101.com/r/hJ7tS6/28
The filename is in the second capturing group.
Answered by Harbor Young
Here is my regular expression. It works in JavaScript.
filename\*?=((['"])[\s\S]*?\2|[^;\n]*)
I used this in my project.
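Because this pattern accepts both `filename=` and `filename*=`, it could be run globally to collect every filename parameter in a header. The sketch below is one possible way to do that; `readFilenames` and `decodeExtended` are hypothetical names, the pattern is assumed to close its quoted branch with `\2`, and the decoding step assumes the extended value has the form charset''percent-encoded-text with a UTF-8 payload.

```javascript
// Hypothetical helpers, not part of the original answer.
// Assumes the quoted branch of the pattern ends with \2.
const PARAM_RE = /filename\*?=((['"])[\s\S]*?\2|[^;\n]*)/g;

function readFilenames(header) {
  // Raw value of every filename / filename* parameter, in order.
  return Array.from(header.matchAll(PARAM_RE), (m) => m[1]);
}

function decodeExtended(value) {
  // Assumption: value looks like charset''percent-encoded-text.
  const parts = value.split("''");
  return parts.length === 2 ? decodeURIComponent(parts[1]) : value;
}

const values = readFilenames(
  "attachment; filename=\"EURO rates\"; filename*=utf-8''%e2%82%ac%20rates"
);
console.log(values.map(decodeExtended));
```

The quoted value keeps its quotes here, so a caller would still strip them before using the name.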
Answered by kiripk
filename[^;\n]*=(UTF-\d['"]*)?((['"]).*?[.]$|[^;\n]*)?
I have upgraded Robin's solution to do two more things:
This is an ECMAScript solution.