C++: Converting Unicode strings and vice versa
Original question: http://stackoverflow.com/questions/4786292/
Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it under the same license, but you must cite the original URL and attribute it to the original authors (not me): Stack Overflow
Asked by user963241
I'm fairly new to using Unicode strings and pointers, and I have no idea how conversion from Unicode to ASCII and vice versa works. The following is what I'm trying to do:
const wchar_t *p = L"This is a string";
If I wanted to convert it to char*, how would the conversion from wchar_t* to char* work, and vice versa?
Or, by value, from a wstring to a string class object and vice versa:
std::wstring wstr = L"This is a string";
If I'm correct, can you just copy the string to a new buffer without conversion?
Accepted answer by Eugene Mayevski 'Callback
The solutions are platform-dependent. On Windows, use the MultiByteToWideChar and WideCharToMultiByte API functions. On Unix/Linux platforms, the iconv library is quite popular.
Answer by Philipp
In the future (VS 2010 already supports it), this will be possible in standard C++ (finally!):
#include <codecvt>  // std::codecvt_utf8
#include <locale>   // std::wstring_convert
#include <string>
// Convert a wide string to UTF-8 bytes.
std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
const std::wstring wide_string = L"This is a string";
const std::string utf8_string = converter.to_bytes(wide_string);
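The snippet above only goes from wide to UTF-8. A minimal sketch of the round trip follows, assuming the same std::wstring_convert facility; the helper names to_utf8 and from_utf8 are made up for illustration. Note that std::wstring_convert and the <codecvt> header were deprecated in C++17, so a compiler may warn about them.

```cpp
#include <codecvt>  // std::codecvt_utf8 (deprecated since C++17)
#include <locale>   // std::wstring_convert
#include <string>

// Hypothetical helper: encode a wide string as UTF-8 bytes.
std::string to_utf8(const std::wstring& wide) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
    return converter.to_bytes(wide);
}

// Hypothetical helper: decode UTF-8 bytes back into a wide string.
// from_bytes throws std::range_error if the input is not valid UTF-8.
std::wstring from_utf8(const std::string& utf8) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
    return converter.from_bytes(utf8);
}
```

For example, from_utf8(to_utf8(L"This is a string")) should give back the original wide string.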
Answer by MSalters
The conversions from ASCII to Unicode and vice versa are quite trivial. By design, the first 128 Unicode values are the same as ASCII (in fact, the first 256 are equal to ISO-8859-1).
So the following code works on systems where char is ASCII and wchar_t is Unicode:
#include <cstring>  // strlen
#include <string>
const char* ASCII = "Hello, world";
std::wstring Unicode(ASCII, ASCII+strlen(ASCII));
You can't reverse it this simply: 汉 does exist in Unicode but not in ASCII, so how would you "convert" it?
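One common way to deal with that is to accept a lossy conversion and substitute a placeholder for anything outside ASCII. This is only a sketch of that idea, not something the answer prescribes; the name to_ascii_lossy and the default '?' replacement are assumptions.

```cpp
#include <string>

// Hypothetical helper: narrow a wide string to ASCII, writing `replacement`
// for every wchar_t value that ASCII cannot represent.
std::string to_ascii_lossy(const std::wstring& wide, char replacement = '?') {
    std::string ascii;
    ascii.reserve(wide.size());
    for (wchar_t wc : wide) {
        // ASCII covers values 0 through 127 only. The unsigned cast also
        // rejects negative values on platforms where wchar_t is signed.
        const bool is_ascii = static_cast<unsigned long>(wc) < 128UL;
        ascii.push_back(is_ascii ? static_cast<char>(wc) : replacement);
    }
    return ascii;
}
```

With this, to_ascii_lossy(L"Hello, world") is unchanged, while 汉 becomes a single '?'. The information is lost, so the result cannot be converted back.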
Answer by Thomas
Answer by cpx
C standard library functions: mbstowcs and wcstombs
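A rough sketch of how these two functions might be wrapped for std::string and std::wstring follows. The helper names widen_mb and narrow_wc are hypothetical, and both functions interpret the narrow string in the encoding of the current C locale, so results for non-ASCII text depend on what setlocale has selected.

```cpp
#include <cstdlib>  // std::mbstowcs, std::wcstombs, MB_CUR_MAX
#include <string>
#include <vector>

// Hypothetical helper: multibyte string (current C locale) -> wide string.
std::wstring widen_mb(const std::string& narrow) {
    // A multibyte string never decodes to more wide characters than it has bytes.
    std::vector<wchar_t> buffer(narrow.size() + 1);
    const std::size_t written =
        std::mbstowcs(buffer.data(), narrow.c_str(), buffer.size());
    if (written == static_cast<std::size_t>(-1)) {
        return std::wstring();  // invalid multibyte sequence for this locale
    }
    return std::wstring(buffer.data(), written);
}

// Hypothetical helper: wide string -> multibyte string (current C locale).
std::string narrow_wc(const std::wstring& wide) {
    // Each wide character needs at most MB_CUR_MAX bytes in the current locale.
    std::vector<char> buffer(wide.size() * MB_CUR_MAX + 1);
    const std::size_t written =
        std::wcstombs(buffer.data(), wide.c_str(), buffer.size());
    if (written == static_cast<std::size_t>(-1)) {
        return std::string();  // a character has no representation in this locale
    }
    return std::string(buffer.data(), written);
}
```

Both helpers return an empty string on failure, which is a simplification; real code would probably want to report the error instead.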
Answer by bratao
The widen() algorithm converts char to wchar_t:
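#include <iostream>
char a = 'a';
// Use the wide stream: wcin.widen() returns wchar_t, whereas cin.widen() returns char.
wchar_t wa = std::wcin.widen(a);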
Of course, you have to put it into a loop and resolve the * (dereference the pointer).
The opposite is accomplished by narrow().
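The stream's widen() and narrow() forward to the std::ctype facet of the stream's locale, and that facet also has range overloads that do the loop for you. Below is a minimal sketch using the facet directly; the helper names widen_string and narrow_string and the '?' fallback are assumptions for illustration.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: widen every char using the locale's ctype<wchar_t> facet.
std::wstring widen_string(const std::string& narrow,
                          const std::locale& loc = std::locale()) {
    std::wstring wide(narrow.size(), L'\0');
    if (!narrow.empty()) {
        const std::ctype<wchar_t>& facet = std::use_facet<std::ctype<wchar_t>>(loc);
        // The range overload performs the per-character loop internally.
        facet.widen(narrow.data(), narrow.data() + narrow.size(), &wide[0]);
    }
    return wide;
}

// Hypothetical helper: the opposite direction. Characters that cannot be
// narrowed in the given locale are replaced with `fallback`.
std::string narrow_string(const std::wstring& wide, char fallback = '?',
                          const std::locale& loc = std::locale()) {
    std::string narrow(wide.size(), '\0');
    if (!wide.empty()) {
        const std::ctype<wchar_t>& facet = std::use_facet<std::ctype<wchar_t>>(loc);
        facet.narrow(wide.data(), wide.data() + wide.size(), fallback, &narrow[0]);
    }
    return narrow;
}
```

Like the stream members, this is a character-by-character mapping, so it is not suited to multibyte encodings such as UTF-8.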