JavaScript regular expression for extracting the filename from a Content-Disposition header

Question

Asked by adnan kamili

The Content-Disposition header contains a filename that is usually easy to extract, but sometimes the value is wrapped in double quotes, sometimes it has no quotes, and there are probably other variants too. Can someone write a regex that works in all of these cases?

Content-Disposition: attachment; filename=content.txt

Here are some of the possible target strings:


attachment; filename=content.txt
attachment; filename*=UTF-8''filename.txt
attachment; filename="EURO rates"; filename*=utf-8''%e2%82%ac%20rates
attachment; filename="omáèka.jpg"
Other combinations may also occur.

Answer 1

Answered by Robin

You could try something in this spirit:


filename[^;=\n]*=((['"]).*?\2|[^;\n]*)

filename      # match filename, followed by
[^;=\n]*      # anything but a ;, a = or a newline
=
(             # first capturing group
    (['"])    # either single or double quote, put it in capturing group 2
    .*?       # anything up until the first...
    \2        # matching quote (single if we found single, double if we found double)
|             # OR
    [^;\n]*   # anything but a ; or a newline
)

Your filename is in the first capturing group: http://regex101.com/r/hJ7tS6

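As a minimal sketch of how this pattern might be applied in JavaScript: the helper name extractFilename is hypothetical, and the quote-stripping step is an addition, since the first capturing group still includes the surrounding quotes when the value is quoted. The pattern is written here with a backreference (\2) closing the quote, as the breakdown above describes.

```javascript
// Robin's pattern: group 1 is the raw value, group 2 the opening quote (if any).
const FILENAME_RE = /filename[^;=\n]*=((['"]).*?\2|[^;\n]*)/;

// Hypothetical helper: returns the filename, or null when no parameter is found.
function extractFilename(header) {
  const match = FILENAME_RE.exec(header);
  if (!match) return null;
  let value = match[1];
  // Assumption: a quoted value should be returned without its quotes.
  if (match[2]) value = value.slice(1, -1);
  return value;
}

console.log(extractFilename('attachment; filename=content.txt'));
console.log(extractFilename('attachment; filename="EURO rates"'));
// Note: for filename*=UTF-8''filename.txt the charset prefix is NOT removed here.
console.log(extractFilename("attachment; filename*=UTF-8''filename.txt"));
```

With this sketch the extended form (filename*) is returned with its UTF-8'' prefix still attached, so a caller would have to strip and percent-decode it separately.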

Answer 2

Answered by h0wXD

Slightly modified to match my use case (strips all quotes and UTF tags):

filename\*?=['"]?(?:UTF-\d['"]*)?([^;\r\n"']*)['"]?;?

https://regex101.com/r/UhCzyI/3
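A hedged sketch of one way to use this pattern when a header carries both filename and filename*. The g and i flags, the helper name pickFilename, and the percent-decoding step are assumptions added for illustration. Preferring filename* follows the recommendation in RFC 6266.

```javascript
// h0wXD's pattern with the g and i flags added (an assumption), so that every
// filename parameter is visited and a lowercase "utf-8" tag is stripped too.
const PARAM_RE = /filename\*?=['"]?(?:UTF-\d['"]*)?([^;\r\n"']*)['"]?;?/gi;

// Hypothetical helper: prefers the extended filename* parameter when present.
function pickFilename(header) {
  let plain = null;
  let extended = null;
  for (const match of header.matchAll(PARAM_RE)) {
    if (match[0].toLowerCase().startsWith('filename*')) {
      // filename* values are percent-encoded, so decode them.
      // decodeURIComponent throws a URIError on malformed input.
      extended = decodeURIComponent(match[1]);
    } else {
      plain = match[1];
    }
  }
  return extended ?? plain;
}

console.log(pickFilename('attachment; filename="EURO rates"; filename*=utf-8\'\'%e2%82%ac%20rates'));
console.log(pickFilename('attachment; filename=content.txt'));
```

The sketch assumes the charset is always UTF-8; a value declared with another charset would need a different decoder.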

Answer 3

Answered by def00111

/filename[^;=\n]*=(?:(\\?['"])(.*?)\1|(?:[^\s]+'.*?')?([^;\n]*))/i

https://regex101.com/r/hJ7tS6/51

Edit: You can also use this parser: https://github.com/Rob--W/open-in-browser/blob/master/extension/content-disposition.js

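A short sketch of reading the result of this pattern. It assumes the quoted branch is closed by a backreference (\1) to the opening quote, which may itself be backslash-escaped; the helper name filenameFrom is hypothetical. A quoted value lands in group 2 and an unquoted one, with any charset'lang' prefix skipped, in group 3.

```javascript
// Assumed form of the pattern: (\\?['"]) opens the quote, \1 closes it.
const RE = /filename[^;=\n]*=(?:(\\?['"])(.*?)\1|(?:[^\s]+'.*?')?([^;\n]*))/i;

// Hypothetical helper: group 2 holds a quoted value, group 3 an unquoted one.
function filenameFrom(header) {
  const match = RE.exec(header);
  if (!match) return null;
  // Groups that did not participate in the match are undefined.
  return match[2] !== undefined ? match[2] : match[3];
}

console.log(filenameFrom('attachment; filename=content.txt'));
console.log(filenameFrom('attachment; filename="omáèka.jpg"'));
console.log(filenameFrom("attachment; filename*=UTF-8''filename.txt"));
```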

Answer 4

Answered by Antoine Bolvy

Disclaimer: the following answer only works in regex engines that support conditional groups, such as PCRE (PHP) or Python's re module. If you have to use JavaScript, use Robin's answer.

This modified version of Robin's regex strips the quotes:


filename[^;\n=]*=(['\"])*(?:utf-8\'\')?(.*)(?(1)\1|)

filename        # match filename, followed by
[^;=\n]*        # anything but a ;, a = or a newline
=
(['"])*         # either single or double quote, put it in capturing group 1
(?:utf-8\'\')?  # removes the utf-8 part from the match
(.*)            # second capturing group, will contain the filename
(?(1)\1|)       # if clause: if first capturing group is not empty,
                # match it again (the quotes), else match nothing

https://regex101.com/r/hJ7tS6/28

The filename is in the second capturing group.

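To illustrate the disclaimer: JavaScript's RegExp has no conditional groups, so a pattern containing (?(1)...) is rejected when it is compiled. The sketch below checks that and then shows one possible JavaScript substitute that spells out the quoted and unquoted cases as an alternation. The substitute and the helper name stripQuotes are assumptions, not part of this answer, and the substitute stops at the closing quote or the next semicolon rather than capturing greedily.

```javascript
// The conditional group (?(1)\1|) is PCRE/Python syntax; JavaScript's RegExp
// constructor rejects it with a SyntaxError.
let conditionalSupported = true;
try {
  new RegExp("filename[^;\\n=]*=(['\"])*(.*)(?(1)\\1|)");
} catch (err) {
  conditionalSupported = !(err instanceof SyntaxError);
}

// Assumed JavaScript substitute: group 2 is a quoted value, group 3 an unquoted one.
const ALTERNATION_RE = /filename[^;\n=]*=(?:(['"])(.*?)\1|([^;\n]*))/;

// Hypothetical helper: returns the value without surrounding quotes, or null.
function stripQuotes(header) {
  const m = ALTERNATION_RE.exec(header);
  return m ? (m[2] ?? m[3]) : null;
}

console.log(conditionalSupported);
console.log(stripQuotes('attachment; filename="EURO rates"'));
console.log(stripQuotes('attachment; filename=content.txt'));
```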

Answer 5

Answered by Harbor Young

Here is my regular expression. It works in JavaScript.

filename\*?=((['"])[\s\S]*?\2|[^;\n]*)

I used this in my project.


Answer 6

Answered by kiripk

filename[^;\n]*=(UTF-\d['"]*)?((['"]).*?[.]$|[^;\n]*)?

I have upgraded Robin's solution to do two more things:

Capture the filename even if it contains escaped double quotes.
Capture the UTF-8'' part as a separate group.

This is an ECMAScript solution.

https://regex101.com/r/7Csdp4/3/
