Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/7469296/

Date: 2020-08-05 06:15:43  Source: igfitidea

Is there any built-in function that converts wstring or wchar_t* to UTF-8 in Linux?

c++ c linux utf-8 wstring

Asked by Amir Saniyan

I want to convert a wstring to UTF-8 encoding, but I want to use built-in functions of Linux.

Is there any built-in function that converts wstring or wchar_t* to UTF-8 in Linux with a simple invocation?

Example:

wstring str = L"file_name.txt";
wstring mode = L"a";
fopen([FUNCTION](str), [FUNCTION](mode)); // Simple invoke.
cout << [FUNCTION](str); // Simple invoke.

Accepted answer by Kerrek SB

The C++ language standard has no notion of explicit encodings. It only contains an opaque notion of a "system encoding", for which wchar_t is a "sufficiently large" type.

To convert from the opaque system encoding to an explicit external encoding, you must use an external library. The library of choice would be iconv() (from WCHAR_T to UTF-8), which is part of POSIX and available on many platforms, although on Windows the WideCharToMultiByte function is guaranteed to produce UTF-8.
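
As a rough illustration of the iconv() route, here is a minimal sketch. The helper name wide_to_utf8 is hypothetical and not part of the answer. It assumes a glibc-style iconv() whose input buffer parameter is char** (some platforms declare it const char**) and an implementation that accepts "WCHAR_T" as an encoding name, as glibc and GNU libiconv do.

```cpp
#include <iconv.h>

#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the answer): convert a std::wstring to UTF-8 using iconv().
std::string wide_to_utf8(const std::wstring& in)
{
    // "WCHAR_T" asks iconv to treat the input as the platform's wchar_t encoding.
    iconv_t cd = iconv_open("UTF-8", "WCHAR_T");
    if (cd == (iconv_t)-1)
        throw std::runtime_error("iconv_open failed");

    // Four output bytes per wchar_t is enough: a code point needs at most 4 bytes in UTF-8.
    std::string out(in.size() * 4, '\0');

    // iconv works on byte buffers, so the wide input is viewed as raw bytes.
    char* src = reinterpret_cast<char*>(const_cast<wchar_t*>(in.data()));
    std::size_t src_left = in.size() * sizeof(wchar_t);
    char* dst = &out[0];
    std::size_t dst_left = out.size();

    std::size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    iconv_close(cd);
    if (rc == (std::size_t)-1)
        throw std::runtime_error("iconv conversion failed");

    // Trim the unused tail of the output buffer.
    out.resize(out.size() - dst_left);
    return out;
}
```

With such a helper, the call in the question would look like fopen(wide_to_utf8(str).c_str(), wide_to_utf8(mode).c_str()). Opening and closing the conversion descriptor on every call keeps the sketch short; code that converts many strings would probably reuse one descriptor.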

C++11 adds new UTF-8 literals in the form of std::string s = u8"Hello World: \U0010FFFF";. Those are already in UTF-8, but they cannot interface with the opaque wstring other than through the way I described.

See this question for a bit more background.

Answer by thiton

Certainly there is no function built into Linux, because the name Linux refers to the kernel only, which doesn't have anything to do with this. I seriously doubt that the libc that comes with gcc has such a function, and

$ man -k utf

supports this theory. But there are plenty of good UTF-8 libraries around. I personally recommend the iconv library for such conversions.

Answer by David Heffernan

It's quite plausible that wcstombs will do what you need if what you actually want to do is convert from wide characters to the current locale.

If not, then you probably need to look at ICU, Boost, or similar.
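
For the wcstombs route mentioned above, a minimal sketch might look like the following. The helper name wide_to_locale is hypothetical. It relies on the POSIX behaviour that wcstombs with a null destination returns the number of bytes required. The result is UTF-8 only when the current locale uses UTF-8, which is an assumption about the environment rather than something the function guarantees.

```cpp
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the answer): convert a std::wstring to the
// multibyte encoding of the current C locale using wcstombs.
std::string wide_to_locale(const std::wstring& in)
{
    // POSIX: with a null destination, wcstombs only reports the required byte count.
    std::size_t needed = std::wcstombs(nullptr, in.c_str(), 0);
    if (needed == static_cast<std::size_t>(-1))
        throw std::runtime_error("character not representable in the current locale");

    // Convert into a buffer of exactly the required size.
    std::string out(needed, '\0');
    std::wcstombs(&out[0], in.c_str(), needed);
    return out;
}
```

A program would typically call std::setlocale(LC_ALL, "") from <clocale> once at startup so that the user's locale (for example en_US.UTF-8) is in effect. In the default "C" locale only plain ASCII text is likely to convert.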

Answer by Cubbi

If/when your compiler supports enough of C++11, you could use wstring_convert:

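#include <iostream>
#include <codecvt>
#include <locale>
#include <string>

int main()
{
    // Converter between wide strings and UTF-8 encoded byte strings.
    std::wstring_convert<std::codecvt_utf8<wchar_t>> utf8_conv;
    std::wstring str = L"file_name.txt";
    // to_bytes() returns the UTF-8 encoded std::string.
    std::cout << utf8_conv.to_bytes(str) << '\n';
}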

Tested with clang++ 2.9/libc++ on Linux and Visual Studio 2010 on Windows.
