Escaping a C++ string
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/2417588/
Asked by Danra
What's the easiest way to convert a C++ std::string to another std::string, which has all the unprintable characters escaped?
For example, for the string of two characters [0x61,0x01], the result string might be "a\x01" or "a%01".
Accepted answer by Josh Kelley
Take a look at Boost's String Algorithm Library. You can use its is_print classifier (together with its operator! overload) to pick out nonprintable characters, and its find_format() functions can replace those with whatever formatting you wish.
#include <iostream>
#include <string>
#include <boost/format.hpp>
#include <boost/algorithm/string.hpp>
// Formatter for find_format_all: replaces each matched character with \xNN.
struct character_escaper
{
    template<typename FindResultT>
    std::string operator()(const FindResultT& Match) const
    {
        std::string s;
        for (typename FindResultT::const_iterator i = Match.begin();
             i != Match.end();
             ++i) {
            // Go through unsigned char so bytes >= 0x80 are not formatted as negative values.
            s += str(boost::format("\\x%02x") % static_cast<int>(static_cast<unsigned char>(*i)));
        }
        return s;
    }
};
int main (int argc, char **argv)
{
    std::string s("a\x01");
    // Select every run of nonprintable characters and reformat it in place.
    boost::find_format_all(s, boost::token_finder(!boost::is_print()), character_escaper());
    std::cout << s << std::endl;
    return 0;
}
Answer by Josh Kelley
Assumes the execution character set is a superset of ASCII and CHAR_BIT is 8. For the OutIter, pass a back_inserter (e.g. to a vector<char> or another string), an ostream_iterator, or any other suitable output iterator.
#include <string>
// Writes s to out as a double-quoted literal, escaping quotes, backslashes,
// common control characters, and any other byte outside ' '..'~' as \xHH.
template<class OutIter>
OutIter write_escaped(std::string const& s, OutIter out) {
  *out++ = '"';
  for (std::string::const_iterator i = s.begin(), end = s.end(); i != end; ++i) {
    unsigned char c = *i;
    if (' ' <= c and c <= '~' and c != '\\' and c != '"') {
      *out++ = c;
    }
    else {
      *out++ = '\\';
      switch(c) {
      case '"':  *out++ = '"';  break;
      case '\\': *out++ = '\\'; break;
      case '\t': *out++ = 't';  break;
      case '\r': *out++ = 'r';  break;
      case '\n': *out++ = 'n';  break;
      default:
        char const* const hexdig = "0123456789ABCDEF";
        *out++ = 'x';
        *out++ = hexdig[c >> 4];
        *out++ = hexdig[c & 0xF];
      }
    }
  }
  *out++ = '"';
  return out;
}
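As a rough sketch of how the OutIter parameter might be used, the block below repeats write_escaped so that it compiles on its own and adds two wrappers. The names escaped_to_string and escaped_via_stream are hypothetical helpers introduced only for illustration; they are not part of the answer.

```cpp
#include <iterator>
#include <sstream>
#include <string>

// write_escaped from the answer, repeated so this sketch is self-contained.
template<class OutIter>
OutIter write_escaped(std::string const& s, OutIter out) {
  *out++ = '"';
  for (std::string::const_iterator i = s.begin(), end = s.end(); i != end; ++i) {
    unsigned char c = *i;
    if (' ' <= c && c <= '~' && c != '\\' && c != '"') {
      *out++ = c;
    } else {
      *out++ = '\\';
      switch (c) {
      case '"':  *out++ = '"';  break;
      case '\\': *out++ = '\\'; break;
      case '\t': *out++ = 't';  break;
      case '\r': *out++ = 'r';  break;
      case '\n': *out++ = 'n';  break;
      default: {
        char const* const hexdig = "0123456789ABCDEF";
        *out++ = 'x';
        *out++ = hexdig[c >> 4];
        *out++ = hexdig[c & 0xF];
      }
      }
    }
  }
  *out++ = '"';
  return out;
}

// Hypothetical helper: collect the escaped text in a new string
// by appending through a back_inserter.
std::string escaped_to_string(std::string const& s) {
  std::string result;
  write_escaped(s, std::back_inserter(result));
  return result;
}

// Hypothetical helper: send the escaped text to a stream
// through an ostream_iterator (here an in-memory stream).
std::string escaped_via_stream(std::string const& s) {
  std::ostringstream os;
  write_escaped(s, std::ostream_iterator<char>(os));
  return os.str();
}
```

Passing std::ostream_iterator<char>(std::cout) instead of the string stream would print the escaped text directly. Note that the result always includes the surrounding double quotes.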
Answer by Scindix
Assuming that "easiest way" means short and yet easily understandable, while not depending on any other resources (like libs), I would go this way:
#include <cctype>
#include <sstream>
#include <string>
// your_string is the input std::string; the code below belongs inside a function body
// s is our escaped output string
std::string s = "";
// loop through all characters
for(char c : your_string)
{
    // check if a given character is printable
    // the cast is necessary to avoid undefined behaviour
    if(isprint((unsigned char)c))
        s += c;
    else
    {
        std::stringstream stream;
        // if the character is not printable
        // we'll convert it to a hex string using a stringstream
        // note that since char may be signed we have to cast it to unsigned first
        stream << std::hex << (unsigned int)(unsigned char)(c);
        std::string code = stream.str();
        s += std::string("\\x")+(code.size()<2?"0":"")+code;
        // alternatively for URL encodings:
        //s += std::string("%")+(code.size()<2?"0":"")+code;
    }
}
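The loop above could be packaged as a reusable function along the following lines. This is only a sketch: the name escape_unprintable and its prefix parameter are hypothetical, and it uses std::snprintf with "%02x" in place of the stringstream so the zero padding is handled by the format string.

```cpp
#include <cctype>
#include <cstdio>
#include <string>

// Hypothetical helper (not from the answer): copy printable characters as-is
// and replace every other byte with <prefix> plus two lowercase hex digits.
std::string escape_unprintable(const std::string& input,
                               const std::string& prefix = "\\x") {
    std::string out;
    for (char c : input) {
        // Cast first: passing a negative char to isprint is undefined behaviour.
        unsigned char uc = static_cast<unsigned char>(c);
        if (std::isprint(uc)) {
            out += c;
        } else {
            char buf[3];  // two hex digits plus the terminating NUL
            std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned int>(uc));
            out += prefix;
            out += buf;
        }
    }
    return out;
}
```

Calling escape_unprintable(text, "%") gives the URL-style variant mentioned in the comment. As in the original loop, a literal backslash or percent sign in the input is copied unchanged, so the output cannot always be decoded unambiguously.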
Answer by Douglas Leeder
One person's unprintable character is another's multi-byte character. So you'll have to define the encoding before you can work out which bytes map to which characters, and which of those are unprintable.
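To make the point concrete, here is a small sketch of a byte-wise escaper that treats only ASCII 0x20 to 0x7E as printable. The function name escape_non_ascii_bytes is hypothetical and the ASCII-only rule is an assumption chosen for illustration; it is not a recommendation from the answer.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: escape every byte outside the ASCII range 0x20..0x7E
// as \xHH, with no knowledge of the string's encoding.
std::string escape_non_ascii_bytes(const std::string& input) {
    std::string out;
    for (char c : input) {
        unsigned char uc = static_cast<unsigned char>(c);
        if (uc >= 0x20 && uc <= 0x7E) {
            out += c;
        } else {
            char buf[5];  // backslash, 'x', two hex digits, terminating NUL
            std::snprintf(buf, sizeof buf, "\\x%02X", static_cast<unsigned int>(uc));
            out += buf;
        }
    }
    return out;
}
```

With this function, the letter é comes out as \xC3\xA9 when the input is UTF-8 (two bytes) but as \xE9 when the input is Latin-1 (one byte). The same perfectly printable character is escaped in both cases, and differently in each, which is why the encoding has to be decided first.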
Answer by hkaiser
Have you seen the article about how to Generate Escaped String Output Using Spirit.Karma?