How do I print a UTF-8 symbol from a decimal number given as input?



For example, I enter the decimal number 210, which is the symbol "Ò".

For example, this code

int a = 210;
wcout << wchar_t (a);

works fine, but I use "cout" before "wcout", and the two are not compatible.

int main() {
  string a = "\u";
  string b = "210";
  string c = a + b;
  cout << b + a << endl;
  cout << "Second cout message...";
}

Error:

main.cpp:4:15: error: \u used with no following hex digits
  string a = "\u";
              ^~
1 error generated.
compiler exit status 1

You can use std::wcrtomb to convert a wide character to a multibyte string. See https://en.cppreference.com/w/cpp/string/multibyte/wcrtomb

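#include <iostream>
#include <cwchar>
#include <clocale>

int main() {
    // A UTF-8 locale must be active so that wcrtomb produces UTF-8 bytes.
    std::setlocale(LC_ALL, "en_US.utf8");
    wchar_t wc = 127820;  // U+1F34C; needs a 32-bit wchar_t (typical on Linux, not on Windows)
    char mbstr[5]{};      // up to 4 UTF-8 bytes plus the terminating null
    std::mbstate_t state{};
    if (std::wcrtomb(mbstr, wc, &state) == static_cast<std::size_t>(-1))
        return 1;         // wc cannot be represented in the current locale
    std::cout << mbstr << std::endl;
}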

https://ideone.com/QVMppe

Also, if you are interested in what Unicode is and how it is encoded with a variable-width character encoding, see https://en.wikipedia.org/wiki/UTF-8

UTF-8 is easy to encode by hand, for example:

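#include <cstdint>
#include <iostream>
#include <stdexcept>
#include <string>

std::string toUTF8(uint32_t cp)
{
    // Surrogates (U+D800..U+DFFF) are not valid code points for UTF-8.
    if (cp >= 0xD800 && cp <= 0xDFFF)
        throw std::invalid_argument("invalid codepoint");

    char utf8[4];
    int len = 0;
    if (cp <= 0x007F)
    {
        // ASCII range: a single byte, unchanged.
        utf8[0] = static_cast<char>(cp);
        len = 1;
    }
    else
    {
        // Pick the lead-byte prefix and total length from the code point range.
        if (cp <= 0x07FF)
        {
            utf8[0] = static_cast<char>(0xC0);
            len = 2;
        }
        else if (cp <= 0xFFFF)
        {
            utf8[0] = static_cast<char>(0xE0);
            len = 3;
        }
        else if (cp <= 0x10FFFF)
        {
            utf8[0] = static_cast<char>(0xF0);
            len = 4;
        }
        else
            throw std::invalid_argument("invalid codepoint");

        // Fill continuation bytes from last to first, 6 bits at a time.
        for (int i = 1; i < len; ++i)
        {
            utf8[len-i] = static_cast<char>(0x80 | (cp & 0x3F));
            cp >>= 6;
        }
        // The remaining high bits go into the lead byte.
        utf8[0] |= static_cast<char>(cp);
    }
    return std::string(utf8, len);
}

int main()
{
    std::string utf8 = toUTF8(210);
    std::cout << utf8;
}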
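
Since the question starts from decimal text such as "210" rather than an int, here is a minimal sketch of that last step. The names encodeUtf8 and decimalToUtf8 are hypothetical helpers made up for this illustration, not part of any library, and the sketch assumes the input is a plain non-negative decimal number.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-8 bytes.
std::string encodeUtf8(std::uint32_t cp)
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        throw std::invalid_argument("invalid code point");

    std::string out;
    if (cp <= 0x7F) {
        out += static_cast<char>(cp);
    } else if (cp <= 0x7FF) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0xFFFF) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    assert(out.size() >= 1 && out.size() <= 4);
    return out;
}

// Hypothetical helper: parse decimal text such as "210" and encode it.
std::string decimalToUtf8(const std::string& text)
{
    std::size_t used = 0;
    // std::stoul throws std::invalid_argument / std::out_of_range on bad input.
    unsigned long value = std::stoul(text, &used, 10);
    if (used != text.size())
        throw std::invalid_argument("trailing characters after number");
    if (value > 0x10FFFF)  // check before narrowing to 32 bits
        throw std::invalid_argument("invalid code point");
    return encodeUtf8(static_cast<std::uint32_t>(value));
}
```

With these, std::cout << decimalToUtf8("210") writes the two bytes 0xC3 0x92 through the ordinary narrow stream, so there is no need to mix cout and wcout. Whether they show up as "Ò" depends on the terminal.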


Make sure your host environment (terminal or console) actually supports displaying UTF-8 text.
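
If the output looks wrong, it can help to check whether the bytes or the display are at fault. The sketch below uses a hypothetical helper named toHex (not a library function) to dump a string's bytes in hexadecimal, which does not depend on the console's encoding.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: render each byte of a string as two hex digits,
// separated by spaces, so the encoding can be inspected on any console.
std::string toHex(const std::string& bytes)
{
    std::string out;
    char buf[4];
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        unsigned value = static_cast<unsigned char>(bytes[i]);
        std::snprintf(buf, sizeof buf, "%02x", value);
        if (i != 0)
            out += ' ';
        out += buf;
    }
    return out;
}
```

For the UTF-8 encoding of U+00D2 ("Ò"), toHex("\xC3\x92") gives "c3 92". If the bytes are right but the glyph is not, the problem is in the console settings rather than in the program.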
