C++: How do you properly use WideCharToMultiByte

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/215963/

Date: 2020-08-27 13:47:05  Source: igfitidea

How do you properly use WideCharToMultiByte

Tags: c++, unicode, character-encoding, codepages

Asked by Obediah Stane

I've read the documentation on WideCharToMultiByte, but I'm stuck on this parameter:

lpMultiByteStr
[out] Pointer to a buffer that receives the converted string.

I'm not quite sure how to properly initialize the variable and feed it into the function.

Answer by tfinniga

Here are a couple of functions (based on Brian Bondy's example) that use WideCharToMultiByte and MultiByteToWideChar to convert between std::wstring and std::string, using UTF-8 so that no data is lost.

#include <string>
#include <windows.h>

// Convert a wide Unicode string to a UTF-8 string
std::string utf8_encode(const std::wstring &wstr)
{
    if( wstr.empty() ) return std::string();
    int size_needed = WideCharToMultiByte(CP_UTF8, 0, &wstr[0], (int)wstr.size(), NULL, 0, NULL, NULL);
    std::string strTo( size_needed, 0 );
    WideCharToMultiByte                  (CP_UTF8, 0, &wstr[0], (int)wstr.size(), &strTo[0], size_needed, NULL, NULL);
    return strTo;
}

// Convert a UTF-8 string to a wide Unicode string
std::wstring utf8_decode(const std::string &str)
{
    if( str.empty() ) return std::wstring();
    int size_needed = MultiByteToWideChar(CP_UTF8, 0, &str[0], (int)str.size(), NULL, 0);
    std::wstring wstrTo( size_needed, 0 );
    MultiByteToWideChar                  (CP_UTF8, 0, &str[0], (int)str.size(), &wstrTo[0], size_needed);
    return wstrTo;
}

Answer by Michael Burr

Elaborating on the answer provided by Brian R. Bondy: here's an example that shows why you can't simply size the output buffer to the number of wide characters in the source string:

#include <windows.h>
#include <stdio.h>
#include <wchar.h>
#include <string.h>

/* string consisting of several Asian characters */
wchar_t wcsString[] = L"\u9580\u961c\u9640\u963f\u963b\u9644";

int main() 
{

    size_t wcsChars = wcslen( wcsString);

    /* WideCharToMultiByte returns an int; with a zero-sized buffer it
       reports the number of bytes required, including the NUL terminator */
    int sizeRequired = WideCharToMultiByte( 950, 0, wcsString, -1, 
                                            NULL, 0,  NULL, NULL);

    printf( "Wide chars in wcsString: %u\n", (unsigned)wcsChars);
    printf( "Bytes required for CP950 encoding (excluding NUL terminator): %d\n",
             sizeRequired-1);

    sizeRequired = WideCharToMultiByte( CP_UTF8, 0, wcsString, -1,
                                        NULL, 0,  NULL, NULL);
    printf( "Bytes required for UTF8 encoding (excluding NUL terminator): %d\n",
             sizeRequired-1);
}

And the output:

Wide chars in wcsString: 6
Bytes required for CP950 encoding (excluding NUL terminator): 12
Bytes required for UTF8 encoding (excluding NUL terminator): 18
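
The UTF-8 figure above can be checked by hand: every character in the test string lies between U+0800 and U+FFFF, and UTF-8 encodes that range in three bytes. The following is a minimal portable sketch, not part of the original answer; `utf8_bytes_needed` is a hypothetical helper that works on a UTF-32 string and calls no Windows API.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a Windows API): number of bytes needed to
// encode a UTF-32 string as UTF-8, excluding any NUL terminator.
std::size_t utf8_bytes_needed(const std::u32string &text)
{
    std::size_t bytes = 0;
    for (char32_t cp : text) {
        if (cp < 0x80)         bytes += 1;  // ASCII
        else if (cp < 0x800)   bytes += 2;  // U+0080 to U+07FF
        else if (cp < 0x10000) bytes += 3;  // rest of the BMP, including CJK
        else                   bytes += 4;  // supplementary planes
    }
    return bytes;
}
```

For the six characters in the example, `utf8_bytes_needed(U"\u9580\u961c\u9640\u963f\u963b\u9644")` gives 18, so a buffer sized by the wide-character count would be three times too small.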

Answer by Brian R. Bondy

You use the lpMultiByteStr [out] parameter by creating a new char array and passing it in to be filled. The array only needs to hold the length of the string + 1, so that there is room for a null terminator after the conversion.

Here are a couple of useful helper functions that show the usage of all parameters.

#include <string>
#include <windows.h>

std::string wstrtostr(const std::wstring &wstr)
{
    // Convert a Unicode string to an ASCII string.
    // Note: the buffer holds one byte per wide character, which is only
    // enough for single-byte code pages.
    std::string strTo;
    char *szTo = new char[wstr.length() + 1];
    szTo[wstr.size()] = '\0';
    WideCharToMultiByte(CP_ACP, 0, wstr.c_str(), -1, szTo, (int)wstr.length(), NULL, NULL);
    strTo = szTo;
    delete[] szTo;
    return strTo;
}

std::wstring strtowstr(const std::string &str)
{
    // Convert an ASCII string to a Unicode string
    std::wstring wstrTo;
    wchar_t *wszTo = new wchar_t[str.length() + 1];
    wszTo[str.size()] = L'\0';
    MultiByteToWideChar(CP_ACP, 0, str.c_str(), -1, wszTo, (int)str.length());
    wstrTo = wszTo;
    delete[] wszTo;
    return wstrTo;
}

--

Whenever documentation describes a parameter as a pointer to a type and marks it as an out parameter, create a variable of that type and pass in a pointer to it. The function uses that pointer to fill your variable.

A simpler example makes this clearer:

//pX is an out parameter, it fills your variable with 10.
void fillXWith10(int *pX)
{
  *pX = 10;
}

int main(int argc, char ** argv)
{
  int X;
  fillXWith10(&X);
  return 0;
}
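
The same out-parameter idea applies when the function fills a buffer rather than a single value, which is the situation with lpMultiByteStr. Below is a minimal portable sketch under stated assumptions: `copy_message` and `fetch_message` are hypothetical names, not Windows APIs. `copy_message` only mimics the WideCharToMultiByte convention that a buffer size of 0 means "tell me how many bytes I need".

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical function, for illustration only. It mimics the
// WideCharToMultiByte convention for its [out] buffer:
//  - outSize == 0: nothing is written; returns the size required,
//    including the NUL terminator
//  - otherwise: fills `out` and returns the bytes written, or 0 if
//    the buffer is missing or too small
int copy_message(const char *src, char *out, int outSize)
{
    const int required = static_cast<int>(std::strlen(src)) + 1;
    if (outSize == 0) return required;
    if (out == nullptr || outSize < required) return 0;
    std::memcpy(out, src, static_cast<std::size_t>(required));
    return required;
}

// Caller side: ask for the size, create the buffer, then pass a
// pointer to it so the function can fill it.
std::string fetch_message(const char *src)
{
    const int size = copy_message(src, nullptr, 0);
    std::string buffer(static_cast<std::size_t>(size), '\0');
    const int written = copy_message(src, &buffer[0], size);
    // Drop the terminator that was copied into the std::string.
    buffer.resize(written > 0 ? static_cast<std::size_t>(written - 1) : 0);
    return buffer;
}
```

Asking the function for the required size first avoids guessing the buffer length from the input length, which is the mistake the byte-count example in Michael Burr's answer warns about.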