JavaScript: how to properly escape and unescape a multiline string that contains newline literals
Original question: http://stackoverflow.com/questions/46252753/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
How do I properly escape and unescape a multiline string that contains newline literals?
Asked by m90
I'm working on a Visual Studio Code extension. The extension is supposed to act on the text that is currently selected in the editor window and send it to an external command (lein-cljfmt in my case, but I think that's unrelated to my question). When the external command is done processing the text, I want to replace the current editor selection with the result returned from the command-line tool.
Before sending the string I escape it like this:
contents
    .replace(/\\/g, '\\\\')
    .replace(/"/g, '\\"')
    .replace(/\n/g, '\\n');
The result is then unescaped like this:
contents
    .replace(/\\n/g, '\n')
    .replace(/\\"/g, '"')
    .replace(/\\\\/g, '\\');
This works in all but one case: when the selection that is being processed contains a string literal that contains a newline literal, the unescaping will instead turn this into a linebreak, thus breaking the code in the editor.
This is an example of a snippet that breaks my escaping:
(defn join
  [a b]
  (str a "\n" b))
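To make the failure concrete, here is a minimal sketch that wraps the question's two replace chains in functions and runs the string literal from the snippet through them. The names `escapeStaged` and `unescapeStaged` are made up for this illustration, and the chains are assumed to use doubled backslashes in the JavaScript source.

```javascript
// Hypothetical wrappers around the question's replace chains.
function escapeStaged(contents) {
  return contents
    .replace(/\\/g, '\\\\') // backslash -> two backslashes
    .replace(/"/g, '\\"')   // quote -> backslash + quote
    .replace(/\n/g, '\\n'); // line break -> backslash + n
}

function unescapeStaged(contents) {
  return contents
    .replace(/\\n/g, '\n')   // stage 1: backslash + n -> line break
    .replace(/\\"/g, '"')    // stage 2: backslash + quote -> quote
    .replace(/\\\\/g, '\\'); // stage 3: two backslashes -> one backslash
}

// The Clojure form from the snippet: the literal is quote, backslash, n, quote.
var source = '(str a "\\n" b)';
var escaped = escapeStaged(source);
var roundTripped = unescapeStaged(escaped);

// JSON.stringify is used here only to make backslashes and line breaks visible.
console.log(JSON.stringify(escaped));
console.log(JSON.stringify(roundTripped));
console.log(roundTripped === source); // false
```

Stage 1 of the unescape sees the second backslash of the doubled pair followed by `n` and turns those two characters into a line break, so the text that comes back contains a backslash followed by a real line break inside the string literal.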
I have tried quite a bit of regexp black magic, like
.replace(/(?!\B"[^"]*)\n(?![^"]*"\B)/g, '\n')
by now, but couldn't find a solution that does not have edge cases. Is there a way to do this that I am missing? I also wonder if there is a VSCode extension API that could handle that as it seems to be a common scenario to me.
Answered by skirtle
I think this might be what you need:
function slashEscape(contents) {
    return contents
        .replace(/\\/g, '\\\\')
        .replace(/"/g, '\\"')
        .replace(/\n/g, '\\n');
}

// Maps each two-character escape sequence to the character it stands for.
var replacements = {'\\\\': '\\', '\\n': '\n', '\\"': '"'};

function slashUnescape(contents) {
    // One pass: a backslash followed by a backslash, an n, or a quote.
    return contents.replace(/\\(\\|n|")/g, function(replace) {
        return replacements[replace];
    });
}

var tests = [
    '\\', '\\\\', '\n', '\\n', '\\\n', '\\\\n',
    '\\\\\n', '\\\\\\n', '\\"\\\\n', '\n\n',
    '\n\n\n', '\\n\n', '\n\\n', '\\n\\n',
    '\\\n\\n\nn\n\\n\\\n\\\\n', '"', '\\"', '\\\\"'
];

tests.forEach(function(str) {
    var out = slashUnescape(slashEscape(str));

    // assert that what goes in is what comes out
    console.log(str === out, '[' + str + ']', '[' + out + ']');
});
Trying to unescape the string in 3 stages is really tricky because \n has a different meaning depending on how many slashes there are just before it. In your example the original string of \n (slash n) gets encoded as \\n (slash slash n); then, when you decode it, the last two characters match the first of your RegExps, when what you want is for the first two characters to match the third RegExp. You've got to count the slashes to be sure. Doing it all in one go dodges that problem by decoding those leading slashes at the same time.
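The "count the slashes" idea can also be written out as an explicit left-to-right scan, which may be easier to follow than the single RegExp. This is only a sketch: `scanUnescape` is a hypothetical name, and keeping unknown escaped characters verbatim is an assumption rather than anything the answer specifies. `slashEscape` is repeated so the block runs on its own.

```javascript
// Same escaping as in the answer, repeated so this block is self-contained.
function slashEscape(contents) {
  return contents
    .replace(/\\/g, '\\\\')
    .replace(/"/g, '\\"')
    .replace(/\n/g, '\\n');
}

// Hypothetical helper: walk the text once and consume each backslash together
// with the character it escapes, so no character is interpreted twice.
function scanUnescape(escaped) {
  var out = '';
  for (var i = 0; i < escaped.length; i++) {
    var ch = escaped[i];
    if (ch === '\\' && i + 1 < escaped.length) {
      var next = escaped[i + 1];
      i++; // skip the escaped character; it has been handled here
      // Backslash + n becomes a line break. Backslash + backslash and
      // backslash + quote yield the second character. Anything else is
      // kept verbatim (an assumption of this sketch).
      out += next === 'n' ? '\n' : next;
    } else {
      out += ch;
    }
  }
  return out;
}

var samples = ['(str a "\\n" b)', 'line1\nline2', '\\\n', '\\\\n', '"'];
var results = samples.map(function (str) {
  return scanUnescape(slashEscape(str)) === str;
});
console.log(results);
```

Because the scan pairs every backslash with exactly one following character, a doubled backslash is always decoded as a unit before the `n` after it is looked at, which is the same property the one-pass RegExp relies on.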

