C++: converting Unicode strings and vice versa

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/4786292/

Time: 2020-08-28 16:29:07  Source: igfitidea

Converting Unicode strings and vice versa

c++ unicode

Asked by user963241

I'm kind of new to using Unicode strings and pointers, and I have no idea how conversion from Unicode to ASCII and vice versa works. Here is what I'm trying to do:

const wchar_t *p = L"This is a string";

If I wanted to convert it to char*, how would converting wchar_t* to char*, and vice versa, work?

Or by value, converting a wstring to a string class object and vice versa:

std::wstring wstr = L"This is a string";

If I'm correct, can you just copy the string to a new buffer without conversion?

Accepted answer by Eugene Mayevski 'Callback

The solutions are platform-dependent. On Windows, use the MultiByteToWideChar and WideCharToMultiByte API functions. On Unix/Linux platforms, the iconv library is quite popular.
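
For the Unix/Linux side, a minimal sketch of an iconv-based conversion might look like the following. The helper name convert_encoding, the 256-byte chunk size, and the encoding names passed in the usage note are assumptions made for illustration; only iconv_open, iconv, and iconv_close come from the library itself. On glibc these functions are part of libc, while other platforms may need to link a separate libiconv.

```cpp
#include <iconv.h>    // POSIX iconv API
#include <cerrno>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: converts a byte string from one named encoding to another.
std::string convert_encoding(const std::string& input,
                             const char* from, const char* to) {
    iconv_t cd = iconv_open(to, from);  // note the order: target first, then source
    if (cd == (iconv_t)(-1)) {
        throw std::runtime_error("iconv_open: unsupported conversion");
    }

    std::string output;
    std::vector<char> chunk(256);       // assumed chunk size, refilled until input is consumed
    char* in = const_cast<char*>(input.data());  // iconv does not modify the input bytes
    std::size_t in_left = input.size();

    while (in_left > 0) {
        char* out = chunk.data();
        std::size_t out_left = chunk.size();
        const std::size_t rc = iconv(cd, &in, &in_left, &out, &out_left);
        const int err = errno;          // save errno before any other call can change it
        output.append(chunk.data(), chunk.size() - out_left);
        // E2BIG only means the chunk is full, so loop again; anything else is a real error.
        if (rc == static_cast<std::size_t>(-1) && err != E2BIG) {
            iconv_close(cd);
            throw std::runtime_error("iconv: invalid or incomplete input");
        }
    }
    // Stateful encodings would also need a final flush call; the Unicode
    // encodings used in this sketch are stateless, so it is omitted.
    iconv_close(cd);
    return output;
}
```

Calling convert_encoding(text, "UTF-8", "UTF-16LE") would return the UTF-16 little-endian bytes of a UTF-8 string. Naming the byte order explicitly avoids the byte order mark that a plain "UTF-16" target may prepend.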

Answer by Philipp

In the future (VS 2010 already supports it), this will be possible in standard C++ (finally!):

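#include <codecvt>  // std::codecvt_utf8 (deprecated since C++17)
#include <locale>   // std::wstring_convert
#include <string>

// Converts between wide strings and UTF-8 encoded byte strings.
std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
const std::wstring wide_string = L"This is a string";
const std::string utf8_string = converter.to_bytes(wide_string);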
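
The reverse direction uses from_bytes on the same converter type. Below is a small sketch of both directions wrapped in functions; the names wide_to_utf8 and utf8_to_wide are made up for this example. Note that std::wstring_convert and <codecvt> were deprecated in C++17, so treat this as an illustration rather than a recommendation for new code.

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper: wide string -> UTF-8 bytes.
std::string wide_to_utf8(const std::wstring& wide) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
    return converter.to_bytes(wide);
}

// Hypothetical helper: UTF-8 bytes -> wide string.
// from_bytes throws std::range_error if the input is not valid UTF-8.
std::wstring utf8_to_wide(const std::string& utf8) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
    return converter.from_bytes(utf8);
}
```

For example, utf8_to_wide(wide_to_utf8(L"This is a string")) gives back the original wide string. Where wchar_t is 16 bits (as on Windows), std::codecvt_utf8<wchar_t> only covers the Basic Multilingual Plane; std::codecvt_utf8_utf16<wchar_t> is the variant that handles surrogate pairs.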

Answer by MSalters

Conversion from ASCII to Unicode and vice versa is quite trivial. By design, the first 128 Unicode values are the same as ASCII (in fact, the first 256 are equal to ISO-8859-1).

So the following code works on systems where char is ASCII and wchar_t is Unicode:

#include <cstring>  // strlen
#include <string>
const char* ASCII = "Hello, world";
std::wstring Unicode(ASCII, ASCII + strlen(ASCII));  // widens each char to wchar_t

You can't reverse it this simply: 汉 does exist in Unicode but not in ASCII, so how would you "convert" it?
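
One possible way to handle the reverse direction is to decide explicitly what happens to characters that have no ASCII equivalent. The sketch below substitutes a replacement character for them; the function name to_ascii_lossy and the '?' default are assumptions for illustration, not something the answer prescribes.

```cpp
#include <string>

// Hypothetical helper: copies ASCII-range wide characters as-is and
// replaces everything else with `replacement` (a lossy conversion).
std::string to_ascii_lossy(const std::wstring& wide, char replacement = '?') {
    std::string ascii;
    ascii.reserve(wide.size());
    for (wchar_t wc : wide) {
        // Only values 0..127 have an ASCII equivalent. The cast makes the
        // check work whether wchar_t is signed or unsigned.
        if (static_cast<unsigned long>(wc) < 128) {
            ascii.push_back(static_cast<char>(wc));
        } else {
            ascii.push_back(replacement);
        }
    }
    return ascii;
}
```

With this, to_ascii_lossy(L"Hello, world") returns the same text as a narrow string, while a string containing 汉 gets a '?' in its place. Throwing an exception instead of substituting would be an equally reasonable policy.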

Answer by Thomas

C++ by itself doesn't offer this functionality. You'll need a separate library, like libiconv.

Answer by cpx

C standard library functions: mbstowcs and wcstombs.
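
A rough sketch of how these two functions could be wrapped for std::string and std::wstring follows. The helper names mb_to_wide and wide_to_mb are hypothetical. Both functions interpret the narrow string according to the current C locale (set with setlocale), so in the default "C" locale only plain ASCII is guaranteed to convert.

```cpp
#include <cstddef>
#include <cstdlib>    // mbstowcs, wcstombs, MB_CUR_MAX
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: multibyte (current C locale) -> wide string.
std::wstring mb_to_wide(const std::string& narrow) {
    // A multibyte string never yields more wide characters than it has bytes.
    std::vector<wchar_t> buf(narrow.size() + 1);
    const std::size_t n = std::mbstowcs(buf.data(), narrow.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1)) {
        throw std::runtime_error("mbstowcs: invalid multibyte sequence");
    }
    return std::wstring(buf.data(), n);
}

// Hypothetical helper: wide string -> multibyte (current C locale).
std::string wide_to_mb(const std::wstring& wide) {
    // Each wide character needs at most MB_CUR_MAX bytes in the current locale.
    std::vector<char> buf(wide.size() * MB_CUR_MAX + 1);
    const std::size_t n = std::wcstombs(buf.data(), wide.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1)) {
        throw std::runtime_error("wcstombs: character not representable");
    }
    return std::string(buf.data(), n);
}
```

Because both functions work on null-terminated strings, input with embedded null characters is truncated at the first one.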

Answer by bratao

The widen() algorithm converts char to wchar_t:

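#include <iostream>
char a = 'a';
// widen() must be called on a wide stream; cin.widen() would just return char
wchar_t wa = std::wcin.widen(a);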

Of course, you have to put it into a loop and dereference the pointer (the *). The opposite is accomplished by narrow().
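
The loop could be sketched as below, using the std::ctype<wchar_t> facet that a wide stream's widen() and narrow() delegate to. The helper names widen_string and narrow_string, the '?' fallback, and the choice of the classic "C" locale as the default are assumptions for this example.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: widens a null-terminated char string one character at a time.
std::wstring widen_string(const char* text,
                          const std::locale& loc = std::locale::classic()) {
    const std::ctype<wchar_t>& facet = std::use_facet<std::ctype<wchar_t>>(loc);
    std::wstring wide;
    for (const char* p = text; *p != '\0'; ++p) {
        wide.push_back(facet.widen(*p));  // dereference the pointer, widen one char
    }
    return wide;
}

// Hypothetical helper: narrows a wide string; characters with no narrow
// equivalent in the locale become `fallback`.
std::string narrow_string(const wchar_t* text, char fallback = '?',
                          const std::locale& loc = std::locale::classic()) {
    const std::ctype<wchar_t>& facet = std::use_facet<std::ctype<wchar_t>>(loc);
    std::string narrow;
    for (const wchar_t* p = text; *p != L'\0'; ++p) {
        narrow.push_back(facet.narrow(*p, fallback));
    }
    return narrow;
}
```

This is a per-character mapping, not an encoding conversion: it cannot produce multibyte output such as UTF-8, so it is mainly suitable for ASCII text.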