Encode/Decode URLs in C++
Original source: http://stackoverflow.com/questions/154536/
Note: this page is taken from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL and attribute it to the original authors (not me): StackOverflow
Asked by user126593
Does anyone know of any good C++ code that encodes and decodes URLs?
Answer by xperroni
I faced the encoding half of this problem the other day. Unhappy with the available options, and after taking a look at this C sample code, I decided to roll my own C++ url-encode function:
#include <cctype>
#include <iomanip>
#include <sstream>
#include <string>
using namespace std;
string url_encode(const string &value) {
    ostringstream escaped;
    escaped.fill('0');
    escaped << hex;
    for (string::const_iterator i = value.begin(), n = value.end(); i != n; ++i) {
        string::value_type c = (*i);
        // Keep alphanumeric and other accepted characters intact.
        // The cast matters: passing a negative char to isalnum is undefined behavior.
        if (isalnum(static_cast<unsigned char>(c)) || c == '-' || c == '_' || c == '.' || c == '~') {
            escaped << c;
            continue;
        }
        // Any other characters are percent-encoded as two uppercase hex digits
        escaped << uppercase;
        escaped << '%' << setw(2) << int((unsigned char) c);
        escaped << nouppercase;
    }
    return escaped.str();
}
The implementation of the decode function is left as an exercise to the reader. :P
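For readers who would rather skip the exercise, a decoder matching the encoder above could look roughly like this. It is only a sketch, not xperroni's code: the names url_decode_strict and hex_value, and the choice to return false on a malformed escape, are assumptions made for illustration. '+' is deliberately left untouched, because the encoder above never produces it.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not from the original answer): value of one hex digit.
// The caller must have checked std::isxdigit first.
static int hex_value(unsigned char ch) {
    if (std::isdigit(ch)) return ch - '0';
    return std::tolower(ch) - 'a' + 10;
}

// Hypothetical name. Decodes percent-escapes from `in` into `out`.
// Returns false and leaves `out` unchanged if a '%' is not followed
// by exactly two hex digits.
bool url_decode_strict(const std::string &in, std::string &out) {
    std::string result;
    result.reserve(in.size());
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        if (in[i] != '%') {
            result += in[i];
            continue;
        }
        // Make sure two more characters exist before reading them
        if (i + 2 >= in.size()) return false;
        unsigned char hi = static_cast<unsigned char>(in[i + 1]);
        unsigned char lo = static_cast<unsigned char>(in[i + 2]);
        if (!std::isxdigit(hi) || !std::isxdigit(lo)) return false;
        result += static_cast<char>(hex_value(hi) << 4 | hex_value(lo));
        i += 2;  // skip the two digits just consumed
    }
    out.swap(result);
    return true;
}
```

Reporting malformed input instead of guessing is a design choice; a more lenient decoder could copy the stray '%' through unchanged.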
Answer by user126593
Answering my own question...
libcurl has curl_easy_escape for encoding.
For decoding, it has curl_easy_unescape.
Answer by user126593
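#include <cctype>
#include <cstdio>
#include <string>
using namespace std;
string urlDecode(const string &SRC) {
    string ret;
    for (string::size_type i = 0; i < SRC.length(); i++) {
        // Only treat '%' as an escape when two hex digits really follow it
        if (SRC[i] == '%' && i + 2 < SRC.length()
                && isxdigit(static_cast<unsigned char>(SRC[i + 1]))
                && isxdigit(static_cast<unsigned char>(SRC[i + 2]))) {
            unsigned int ii = 0;
            sscanf(SRC.substr(i + 1, 2).c_str(), "%x", &ii);
            ret += static_cast<char>(ii);
            i = i + 2;
        } else {
            ret += SRC[i];
        }
    }
    return ret;
}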
not the best, but working fine ;-)
Answer by Yuriy Petrovskiy
cpp-netlib has these functions:
namespace boost {
  namespace network {
    namespace uri {
      inline std::string decoded(const std::string &input);
      inline std::string encoded(const std::string &input);
    }
  }
}
They make it very easy to encode and decode URL strings.
Answer by tormuto
Simply appending the decimal int value of a char to '%' will not work when encoding; the value is supposed to be the hex equivalent, e.g. '/' is '%2F', not '%47'.
I think this is the best and most concise solution for both URL encoding and decoding (with few header dependencies).
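#include <cctype>
#include <cstdio>
#include <string>
using namespace std;
string urlEncode(const string &str){
    string new_str;
    char bufHex[4]; // '%', two hex digits, terminating NUL
    for (string::size_type i = 0; i < str.length(); i++){
        // Use the unsigned byte value so bytes >= 0x80 are not sign-extended
        unsigned char c = static_cast<unsigned char>(str[i]);
        // uncomment this if you want to encode spaces with +
        /*if (c==' ') new_str += '+';
        else */if (isalnum(c) || c == '-' || c == '_' || c == '.' || c == '~') new_str += static_cast<char>(c);
        else {
            // %02X zero-pads, so values below 16 need no special case
            snprintf(bufHex, sizeof bufHex, "%%%02X", static_cast<unsigned int>(c));
            new_str += bufHex;
        }
    }
    return new_str;
}
string urlDecode(const string &str){
    string ret;
    string::size_type len = str.length();
    for (string::size_type i = 0; i < len; i++){
        if (str[i] == '+'){
            ret += ' ';
        }else if (str[i] == '%' && i + 2 < len
                && isxdigit(static_cast<unsigned char>(str[i + 1]))
                && isxdigit(static_cast<unsigned char>(str[i + 2]))){
            unsigned int ii = 0;
            sscanf(str.substr(i + 1, 2).c_str(), "%x", &ii);
            ret += static_cast<char>(ii);
            i = i + 2;
        }else{
            // Ordinary character, or a '%' that does not start a valid escape
            ret += str[i];
        }
    }
    return ret;
}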
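The remark at the top of this answer, that the escape must carry the hexadecimal value of the byte rather than its decimal value, can be checked in isolation. The sketch below is not part of tormuto's code; the function name percent_escape_byte is made up for illustration.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: returns the three-character escape "%XX" for one byte.
std::string percent_escape_byte(unsigned char byte) {
    char buf[4];  // '%', two hex digits, terminating NUL
    // %02X prints the value in uppercase hexadecimal, zero-padded to two digits
    std::snprintf(buf, sizeof buf, "%%%02X", static_cast<unsigned int>(byte));
    return buf;
}
```

For '/', whose character code is 47 in decimal and 2F in hexadecimal, this yields "%2F". A byte below 16, such as '\n', yields "%0A" thanks to the zero padding.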
Answer by kreuzerkrieg
[Necromancer mode on]
I stumbled upon this question while looking for a fast, modern, platform-independent and elegant solution. I didn't like any of the above; cpp-netlib would be the winner, but it has a horrific memory vulnerability in its "decoded" function. So I came up with a Boost Spirit Qi/Karma solution.
#include <boost/spirit/include/karma.hpp>
#include <boost/spirit/include/qi.hpp>
#include <iterator>
#include <string>
namespace bsq = boost::spirit::qi;
namespace bk = boost::spirit::karma;
// Parses exactly two hex digits into one byte
bsq::int_parser<unsigned char, 16, 2, 2> hex_byte;
template <typename InputIterator>
struct unescaped_string
    : bsq::grammar<InputIterator, std::string(char const *)> {
    unescaped_string() : unescaped_string::base_type(unesc_str) {
        unesc_char.add("+", ' ');
        // '+' becomes a space, "%XX" becomes a byte, anything else is copied
        unesc_str = *(unesc_char | "%" >> hex_byte | bsq::char_);
    }
    bsq::rule<InputIterator, std::string(char const *)> unesc_str;
    bsq::symbols<char const, char const> unesc_char;
};
template <typename OutputIterator>
struct escaped_string : bk::grammar<OutputIterator, std::string(char const *)> {
    escaped_string() : escaped_string::base_type(esc_str) {
        // Unreserved characters pass through, the rest are emitted as '%' plus hex
        esc_str = *(bk::char_("a-zA-Z0-9_.~-") | "%" << bk::right_align(2,0)[bk::hex]);
    }
    bk::rule<OutputIterator, std::string(char const *)> esc_str;
};
The above is used as follows:
std::string unescape(const std::string &input) {
    std::string retVal;
    retVal.reserve(input.size());
    typedef std::string::const_iterator iterator_type;
    char const *start = "";
    iterator_type beg = input.begin();
    iterator_type end = input.end();
    unescaped_string<iterator_type> p;
    if (!bsq::parse(beg, end, p(start), retVal))
        retVal = input;
    return retVal;
}
std::string escape(const std::string &input) {
    typedef std::back_insert_iterator<std::string> sink_type;
    std::string retVal;
    retVal.reserve(input.size() * 3);
    sink_type sink(retVal);
    char const *start = "";
    escaped_string<sink_type> g;
    if (!bk::generate(sink, g(start), input))
        retVal = input;
    return retVal;
}
[Necromancer mode off]
EDIT01: fixed the zero padding stuff - special thanks to Hartmut Kaiser
EDIT02: Live on CoLiRu
Answer by kometen
Inspired by xperroni, I wrote a decoder. Thank you for the pointer.
#include <cctype>
#include <iostream>
#include <sstream>
#include <string>
using namespace std;
// Value of one hex digit; the caller must have checked isxdigit first
char from_hex(char ch) {
    unsigned char u = static_cast<unsigned char>(ch);
    return isdigit(u) ? ch - '0' : tolower(u) - 'a' + 10;
}
string url_decode(const string &text) {
    ostringstream escaped;
    for (auto i = text.begin(), n = text.end(); i != n; ++i) {
        string::value_type c = (*i);
        // Check that two hex digits really follow before reading i[1] and i[2];
        // reading past the end of the string is undefined behavior
        if (c == '%' && n - i > 2
                && isxdigit(static_cast<unsigned char>(i[1]))
                && isxdigit(static_cast<unsigned char>(i[2]))) {
            char h = static_cast<char>(from_hex(i[1]) << 4 | from_hex(i[2]));
            escaped << h;
            i += 2;
        } else if (c == '+') {
            escaped << ' ';
        } else {
            // Ordinary character, or a '%' that does not start a valid escape
            escaped << c;
        }
    }
    return escaped.str();
}
int main() {
    string msg = "J%C3%B8rn!";
    cout << msg << endl;
    string decodemsg = url_decode(msg);
    cout << decodemsg << endl;
    return 0;
}
edit: Removed the unneeded iomanip include.
Answer by alanc10n
CGICC includes methods for URL encoding and decoding: form_urlencode and form_urldecode.
Answer by Bagelzone Ha'bonè
A follow-up to Bill's recommendation to use libcurl: it is a great suggestion, but it needs an update.
Three years later, the curl_escape function is deprecated, so going forward it is better to use curl_easy_escape.
Answer by moonlightdock
I ended up on this question while searching for an API to decode a URL in a Win32 C++ app. Since the question doesn't specify a platform, assuming Windows isn't a bad thing.
InternetCanonicalizeUrl is the API for Windows programs.
// Requires <windows.h> and <wininet.h>, and linking against wininet.lib.
// strUrl holds the URL to decode and is defined elsewhere.
LPTSTR lpOutputBuffer = new TCHAR[1];
DWORD dwSize = 1;
BOOL fRes = ::InternetCanonicalizeUrl(strUrl, lpOutputBuffer, &dwSize, ICU_DECODE | ICU_NO_ENCODE);
DWORD dwError = ::GetLastError();
if (!fRes && dwError == ERROR_INSUFFICIENT_BUFFER)
{
    // dwSize was updated with the required size, so retry with a larger buffer
    delete [] lpOutputBuffer;
    lpOutputBuffer = new TCHAR[dwSize];
    fRes = ::InternetCanonicalizeUrl(strUrl, lpOutputBuffer, &dwSize, ICU_DECODE | ICU_NO_ENCODE);
    if (fRes)
    {
        //lpOutputBuffer has decoded url
    }
    else
    {
        //failed to decode
    }
}
else
{
    //some other error OR the input string url is just 1 char and was successfully decoded
}
// Free the buffer on every path (the array form of delete is required here)
delete [] lpOutputBuffer;
lpOutputBuffer = NULL;
InternetCrackUrl also seems to have flags that specify whether to decode the URL.