Original question: http://stackoverflow.com/questions/5804453/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
C++ Regular Expressions with Boost Regex
Asked by alyx
I am trying to take a string in C++, find all the IP addresses it contains, and put them into a new vector of strings.
I've read a lot of documentation on regex, but I just can't seem to understand how to implement this simple function.
I believe I can use this Perl expression to find any IP address:
re("\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b");
But I am still stumped on how to do the rest.
Answered by Vitus
Perhaps you're looking for something like this. It uses regex_iterator to get all matches of the current pattern. See the reference.
#include <boost/regex.hpp>
#include <iostream>
#include <string>
int main()
{
    std::string text(" 192.168.0.1 abc 10.0.0.255 10.5.1 1.2.3.4a 5.4.3.2 ");
    // Backslashes are doubled so the regex engine receives \b and \.
    const char* pattern =
        "\\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
        "\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
        "\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
        "\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\b";
    boost::regex ip_regex(pattern);
    // A default-constructed iterator marks the end of the match sequence
    boost::sregex_iterator it(text.begin(), text.end(), ip_regex);
    boost::sregex_iterator end;
    for (; it != end; ++it) {
        std::cout << it->str() << "\n";
        // v.push_back(it->str()); or something similar
    }
}
Output:
192.168.0.1
10.0.0.255
5.4.3.2
Side note: in the C++ string literal you probably meant \\b instead of \b; I doubt you wanted to match the backspace character.
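One way to sidestep the double-backslash issue from the side note is a C++11 raw string literal, where \b and \. reach the regex engine unchanged. The following is only a sketch under a few assumptions: it needs a C++11 compiler, it swaps in std::regex from the standard library in place of Boost.Regex (the iterator interface is analogous), and the function name find_ips is made up for this illustration. It also collects the matches into a vector of strings, which is what the question asked for.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not part of the original answer): returns every
// match of the answer's IPv4 pattern found in `text`.
std::vector<std::string> find_ips(const std::string& text)
{
    // Raw string literals pass backslashes to the regex engine as-is,
    // so \b is a word boundary rather than the backspace character.
    static const std::regex ip_regex(
        R"(\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))"
        R"(\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))"
        R"(\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))"
        R"(\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b)");

    std::vector<std::string> result;
    std::sregex_iterator it(text.begin(), text.end(), ip_regex);
    const std::sregex_iterator end;  // default-constructed: end of sequence
    for (; it != end; ++it) {
        result.push_back(it->str());
    }
    return result;
}
```

Calling find_ips on the sample text from the answer should yield the same three addresses as the output listed above.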
Answered by kelanth
The offered solution is quite good, thanks for it, though I found a slight mistake in the pattern itself.
For example, something like 49.000.00.01 would be accepted as a valid IPv4 address, and from my understanding it shouldn't be (this just happened to me while processing a dump).
I suggest improving the pattern to:
"\\b(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]?|0)"
"\\.(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]?|0)"
"\\.(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]?|0)"
"\\.(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]?|0)\\b";
This should allow a zero octet only as a single 0 (as in 0.0.0.0), which I believe is correct, and it will eliminate octets such as .00. and .000.
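To see the difference between the two octet patterns, here is a small sketch that runs both against the problematic address. It is an illustration rather than part of the answer: it assumes C++11, uses std::regex as a stand-in for Boost.Regex, and the names kLooseOctet, kStrictOctet, make_ip_pattern and contains_match are invented for this example.

```cpp
#include <regex>
#include <string>

// Octet sub-patterns: the one from the first answer, and the stricter
// one suggested here, which rejects leading zeros.
const std::string kLooseOctet  = "(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)";
const std::string kStrictOctet = "(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]?|0)";

// Hypothetical helper: builds the four-octet pattern with word boundaries.
std::string make_ip_pattern(const std::string& octet)
{
    return "\\b" + octet + "\\." + octet + "\\." + octet + "\\." + octet + "\\b";
}

// Hypothetical helper: true if `text` contains a match for `pattern`.
bool contains_match(const std::string& text, const std::string& pattern)
{
    return std::regex_search(text, std::regex(pattern));
}
```

With these helpers, contains_match("49.000.00.01", make_ip_pattern(kLooseOctet)) should be true, while the same call with kStrictOctet should be false; 0.0.0.0 should still match the strict pattern.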
Answered by Douglas Daseeco
#include <cstddef>
#include <string>
#include <list>
#include <boost/regex.hpp>
typedef std::string::const_iterator ConstIt;
int main()
{
    // input text, expected result, & proper address pattern
    const std::string sInput
    (
        "192.168.0.1 10.0.0.255 abc 10.5.1.00"
        " 1.2.3.4a 168.72.0 0.0.0.0 5.4.3.2"
    );
    const std::string asExpected[] =
    {
        "192.168.0.1",
        "10.0.0.255",
        "0.0.0.0",
        "5.4.3.2"
    };
    // an address must be delimited by whitespace or the ends of the input;
    // capture group 2 holds the address itself
    boost::regex regexIPs
    (
        "(^|[ \t])("
        "(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])[.]"
        "(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])[.]"
        "(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])[.]"
        "(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])"
        ")($|[ \t])"
    );
    // parse: each search resumes where the previous match ended
    boost::smatch what;
    std::list<std::string> ns;
    ConstIt end = sInput.end();
    for (ConstIt begin = sInput.begin();
         boost::regex_search(begin, end, what, regexIPs);
         begin = what[0].second)
    {
        ns.push_back(std::string(what[2].first, what[2].second));
    }
    // check results and return number of errors (zero);
    // a count mismatch is an error, and asExpected is never read out of bounds
    const std::size_t nExpected = sizeof(asExpected) / sizeof(asExpected[0]);
    int iErrors = (ns.size() == nExpected) ? 0 : 1;
    std::size_t i = 0;
    for (const std::string & s : ns)
    {
        if (i < nExpected && s != asExpected[i])
            ++ iErrors;
        ++ i;
    }
    return iErrors;
}