Original source: http://stackoverflow.com/questions/8485027/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
javascript url-safe filename-safe string
Asked by ndmweb
Looking for a regex/replace function that takes a user-inputted string, say "John Smith's Cool Page", and returns a filename/URL-safe string like "john_smith_s_cool_page.html", or something to that effect.
Answered by Shalom Craimer
Well, here's one that replaces anything that's not a letter or a number, and makes it all lower case, like your example.
var s = "John Smith's Cool Page";
var filename = s.replace(/[^a-z0-9]/gi, '_').toLowerCase();
Explanation:
The regular expression is /[^a-z0-9]/gi. Well, actually the gi at the end is just a set of options that are used when the expression is used.
- i means "ignore upper/lower case differences"
- g means "global", which really means that every match should be replaced, not just the first one.
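To see what each flag contributes, here is a small sketch that runs the same pattern with and without them. The variable names are only illustrative and are not part of the original answer.

```javascript
// Illustrative input, same string as in the answer above.
var s = "John Smith's Cool Page";

// No flags: only the first match is replaced, and uppercase letters
// count as "not a-z", so the leading "J" is the first match.
var noFlags = s.replace(/[^a-z0-9]/, '_');

// "i" only: case is ignored, so the first match is now the first space.
var onlyI = s.replace(/[^a-z0-9]/i, '_');

// "g" and "i" together: every non-alphanumeric character is replaced.
var both = s.replace(/[^a-z0-9]/gi, '_');

console.log(noFlags); // _ohn Smith's Cool Page
console.log(onlyI);   // John_Smith's Cool Page
console.log(both);    // John_Smith_s_Cool_Page
```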
So what we're looking at is really just [^a-z0-9]. Let's read it step-by-step:
- The [ and ] define a "character class", which is a list of single characters. If you'd write [one], then that would match either 'o' or 'n' or 'e'.
- However, there's a ^ at the start of the list of characters. That means it should match only characters not in the list.
- Finally, the list of characters is a-z0-9. Read this as "a through z and 0 through 9". It's a short way of writing abcdefghijklmnopqrstuvwxyz0123456789.
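The three points above can be checked with a short sketch. The sample strings and variable names are made up for illustration.

```javascript
// [one] matches a single character that is 'o', 'n', or 'e'.
var inClass = 'stone'.replace(/[one]/g, '*');

// [^one] matches any single character that is NOT 'o', 'n', or 'e'.
var notInClass = 'stone'.replace(/[^one]/g, '*');

// The range form and the spelled-out form describe the same class.
var ranged = /[^a-z0-9]/g;
var spelledOut = /[^abcdefghijklmnopqrstuvwxyz0123456789]/g;
var sample = 'a-b c_9!';
var sameResult = sample.replace(ranged, '_') === sample.replace(spelledOut, '_');

console.log(inClass);    // st***
console.log(notInClass); // **one
console.log(sameResult); // true
```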
So basically, what the regular expression says is: "Find every character that is not between 'a' and 'z' or between '0' and '9'" (ignoring case, because of the i option).
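Putting the pieces together, one possible way to wrap this into a reusable function is sketched below. The name toSafeFilename and the extension parameter are assumptions for this example, and collapsing runs of unsafe characters into a single underscore is a small variation on the answer's regex, not part of it.

```javascript
// Hypothetical helper; the name and the extension parameter are
// assumptions made for this sketch.
function toSafeFilename(input, extension) {
  var base = input
    .replace(/[^a-z0-9]+/gi, '_') // each run of unsafe characters becomes one "_"
    .replace(/^_+|_+$/g, '')      // drop leading and trailing underscores
    .toLowerCase();
  return base + extension;
}

console.log(toSafeFilename("John Smith's Cool Page", '.html'));
// john_smith_s_cool_page.html
```

Note that an input made up entirely of unsafe characters would produce just the extension, so a caller may want to check for an empty base name.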
Answered by speedplane
I know the original poster asked for a simple regular expression. However, there is more involved in sanitizing filenames, including filename length, reserved filenames, and, of course, reserved characters.
Take a look at the code in node-sanitize-filename for a more robust solution.
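To give a rough idea of what those extra concerns look like in code, here is a minimal sketch. It is not the node-sanitize-filename implementation; the function name, the constant names, and the 255-character limit are assumptions for illustration, and the reserved names and characters are the ones Windows disallows.

```javascript
// Hypothetical helper, NOT the node-sanitize-filename implementation.
var RESERVED_CHARS = /[\/\\?<>:*|"]/g;   // characters Windows does not allow in filenames
var CONTROL_CHARS = /[\x00-\x1f\x7f]/g;  // ASCII control characters
var WINDOWS_RESERVED = /^(con|prn|aux|nul|com[1-9]|lpt[1-9])(\..*)?$/i; // reserved device names
var MAX_LENGTH = 255;                    // assumed limit; counts UTF-16 units, not bytes

function sanitizeFilename(name) {
  var cleaned = name
    .replace(RESERVED_CHARS, '_')
    .replace(CONTROL_CHARS, '_');
  // Prefix reserved device names so they no longer match.
  if (WINDOWS_RESERVED.test(cleaned)) {
    cleaned = '_' + cleaned;
  }
  // Truncate overly long names.
  return cleaned.slice(0, MAX_LENGTH);
}

console.log(sanitizeFilename('a/b:c?.txt')); // a_b_c_.txt
console.log(sanitizeFilename('CON'));        // _CON
```

A real sanitizer has to handle more cases than this, which is why a maintained library is usually the safer choice.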
Answered by Adam D
For more flexible and robust handling of Unicode characters etc., you could use slugify in conjunction with a regex to remove unsafe URL characters:
// The characters to strip go inside a character class: [...]
const urlSafeFilename = slugify(filename, { remove: /["<>#%{}|\\^~\[\]`;?:@=&]/g });
This produces nice kebab-case filenames in your URL and allows for more characters outside the a-z0-9 range.
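If adding a dependency is not an option, a rough approximation is sketched below. This is not the slugify library and is stricter than it: it uses Unicode normalization to strip diacritics, then reduces everything to a-z0-9 and hyphens, so letters from non-Latin scripts are dropped rather than kept. The function name toKebabCase is an assumption for this example.

```javascript
// Hypothetical helper; NOT the slugify library, just a dependency-free sketch.
function toKebabCase(input) {
  return input
    .normalize('NFKD')               // split accented letters into base letter + combining mark
    .replace(/[\u0300-\u036f]/g, '') // drop the combining marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')     // each run of other characters becomes one "-"
    .replace(/^-+|-+$/g, '');        // trim leading and trailing hyphens
}

console.log(toKebabCase('Crème Brûlée Recipe'));    // creme-brulee-recipe
console.log(toKebabCase("John Smith's Cool Page")); // john-smith-s-cool-page
```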
Answered by Hemant Metalia
I think your requirement is to replace white spaces and apostrophes with _ and append .html at the end. Try to find a regex that does that.
refer
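A regex along the lines this answer describes might look like the sketch below. It is one possible reading of the requirement and touches only whitespace and apostrophes, so other punctuation is left unchanged.

```javascript
var s = "John Smith's Cool Page";

// [\s'] matches any whitespace character or an apostrophe.
var filename = s.replace(/[\s']/g, '_').toLowerCase() + '.html';

console.log(filename); // john_smith_s_cool_page.html
```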