C#: regular expression to match a variable number of lines?
Original source: http://stackoverflow.com/questions/10292163/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Regular Expression Match variable multiple lines?
Asked by Arya
Let's say I have the following text and I want to extract the text between "Start of numbers" and "End of numbers". The number of lines in between is dynamic, and the only thing that changes in those lines is the number word, e.g. first, second, etc. Each file I'll be extracting data from has a different number of lines between "Start of numbers" and "End of numbers". How can I write a regex to match the content between "Start of numbers" and "End of numbers" without knowing how many lines there will be between them?
Regards!
This is the first line
This is the second line
Start of numbers
This is the first line
This is the second line
This is the third line
This is the ...... line
This is the ninth line
End of numbers
Accepted answer by Paul Oliver
You should use the Singleline mode, which tells your C# regular expression that . matches any character (rather than any character except \n).
var regex = new Regex("Start of numbers(.*)End of numbers",
RegexOptions.IgnoreCase | RegexOptions.Singleline);
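To show how the captured group might be read back, here is a minimal sketch built on the regex above. The class name NumberBlock and the method TryExtract are hypothetical, not part of the answer. The sketch also swaps the greedy .* for the lazy .*? so that the match stops at the first "End of numbers"; that is an assumption about what is wanted when a file contains several blocks.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
static class NumberBlock
{
    // Singleline makes '.' match '\n' too, so the group can span any number of lines.
    // The lazy quantifier (.*?) stops at the first "End of numbers".
    static readonly Regex Pattern = new Regex(
        "Start of numbers(.*?)End of numbers",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Returns true and sets body to the trimmed text between the markers;
    // returns false (with body set to an empty string) when the markers are not found.
    public static bool TryExtract(string input, out string body)
    {
        Match m = Pattern.Match(input);
        body = m.Success ? m.Groups[1].Value.Trim() : string.Empty;
        return m.Success;
    }
}
```

Groups[0] is the whole match including both markers; Groups[1] is only the text between them.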
Answered by David Z.
You should be able to match multi-line strings without issue. Just remember to add the right characters in (\n for new lines).
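// (.|\n)* matches any run of characters, including newlines
string pattern = "Start of numbers(.|\n)*End of numbers";
// Regex.Match returns a single Match (Regex.Matches returns a MatchCollection)
Match m = Regex.Match(input, pattern);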
This is easier if you picture your string with its hidden characters written out:
Start of numbers\n\nThis is the first line\nThis is the second line\n ...
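One detail worth noting with this approach: a repeated capturing group such as (.|\n)* keeps only its last iteration, so Groups[1] would hold a single character. The sketch below is one possible variation, not the answer's own code. It wraps a non-capturing alternation in an outer group so the whole span is captured. The class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper illustrating the alternation approach without RegexOptions.Singleline.
static class AlternationExample
{
    public static string Between(string input)
    {
        // (?:.|\n) matches any character: '.' covers everything except '\n',
        // and the explicit \n alternative covers the newline itself.
        // The outer group captures the whole span between the markers.
        string pattern = "Start of numbers((?:.|\n)*?)End of numbers";
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }
}
```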
Answered by japesu
Something like this:
^(start)([\s\n\d\w]*)(end)$
Here you take the second group. You can even name the group if you like. The point is that you read the whole file into one string and then run the regex against it.
Edit:
I have to edit this a bit. If your match can be somewhere in the middle, then drop the start (^) and end ($) anchors: (start)([\s\n\d\w]*)(end)
Note that this leaves you with only the lines you want; you can then process those lines.
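As a rough illustration of the named-group idea and of handling the lines afterwards, the sketch below keeps the answer's character class and uses the question's actual markers. The group name body, the class name, and the lazy quantifier are assumptions made for the example. Keep in mind that [\s\n\d\w] only admits whitespace and word characters, so a line containing punctuation (such as dots) would prevent a match.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
static class NamedGroupExample
{
    // (?<body>...) names the group so it can be read by name rather than by index.
    static readonly Regex Pattern = new Regex(
        @"Start of numbers(?<body>[\s\n\d\w]*?)End of numbers");

    // Returns the non-empty lines between the markers, or an empty array if there is no match.
    public static string[] Lines(string input)
    {
        Match m = Pattern.Match(input);
        if (!m.Success) return new string[0];
        // Split on both CR and LF so Windows and Unix line endings are handled.
        return m.Groups["body"].Value.Split(
            new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

The input would typically come from reading the whole file into one string, for example with File.ReadAllText, before calling Lines.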
Answered by Hyman
/(?<=Start of numbers).*(?=End of numbers)/s
You need to enable the dotall flag.
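The /.../s delimiter syntax is not how patterns are written in C#. In .NET the dotall behaviour corresponds to RegexOptions.Singleline, or to the inline (?s) option. A minimal sketch of the same lookaround idea in C# follows. The class and method names are hypothetical, and the lazy .*? is an assumption so that matching stops at the first end marker.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper showing the lookaround pattern in .NET syntax.
static class LookaroundExample
{
    public static string Between(string input)
    {
        // (?s) is the inline form of RegexOptions.Singleline.
        // The lookbehind and lookahead assert the markers without consuming them,
        // so the match value itself is only the text in between.
        Match m = Regex.Match(input, @"(?s)(?<=Start of numbers).*?(?=End of numbers)");
        return m.Success ? m.Value : string.Empty;
    }
}
```

Because the markers are not part of the match, no capture group is needed; m.Value is already the content.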

