C++: convert wchar_t to char

Posted by t5fffqht on 2023-03-14 in Other


I'd like to know: is it safe to do this?

wchar_t wide = /* something */;
assert(wide >= 0 && wide < 256 &&);
char myChar = static_cast<char>(wide);

That is, assuming I'm pretty sure the wide char will fall within the ASCII range.

c++

Source: https://stackoverflow.com/questions/3019977/convert-wchar-t-to-char

9 answers


evrscar21#

Why not just use the library routine wcstombs?
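A minimal sketch of what that might look like, assuming the default "C" locale (or whichever locale the program has selected with setlocale). The helper name narrow_with_wcstombs and its bool/out-parameter shape are made up for illustration; only std::wcstombs and MB_CUR_MAX come from the standard library.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: converts a wide string with std::wcstombs using the
// current C locale. Returns false if some character cannot be represented.
bool narrow_with_wcstombs(const std::wstring& wide, std::string& out) {
    // Worst case: every wide character needs MB_CUR_MAX bytes, plus a terminator.
    std::vector<char> buffer(wide.size() * MB_CUR_MAX + 1);
    std::size_t written = std::wcstombs(buffer.data(), wide.c_str(), buffer.size());
    if (written == static_cast<std::size_t>(-1)) {
        return false;  // wcstombs signals an unconvertible character this way
    }
    out.assign(buffer.data(), written);
    return true;
}
```

Which characters convert successfully depends on the active locale, so a character that fails in the "C" locale may succeed after a call to setlocale.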

2023-03-14

pod7payv2#

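assert is for ensuring that something is true in debug mode; it has no effect at all in a release build. It is better to use an if statement and have a fallback plan for characters that are out of range, unless the only way to get an out-of-range character is through a program bug.
Also, depending on your character encoding, you may find differences between the Unicode characters 0x80 through 0xff and their char versions.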
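The if-statement approach could be sketched like this. The name narrow_or and the choice of a caller-supplied fallback character are assumptions for illustration, and the check is limited to 0-127 because of the 0x80-0xff caveat above.

```cpp
// Hypothetical helper: returns `wide` as a char when it is in the 7-bit
// ASCII range, and `fallback` otherwise. Unlike assert, this check is
// still present in release builds.
char narrow_or(wchar_t wide, char fallback) {
    if (wide >= 0 && wide < 128) {
        return static_cast<char>(wide);
    }
    return fallback;
}
```

For example, narrow_or(wide, '?') yields '?' for anything outside ASCII instead of silently truncating it.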

2023-03-14

tvmytwxo3#

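You are looking for wctomb(): it's in the ANSI standard, so you can count on it. It works even when the wchar_t uses a code above 255. You almost certainly do not want to use it.
wchar_t *is* an integral type, so your compiler won't complain if you actually do: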

char x = (char)wc;

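But *because* it's an integral type, there's absolutely no reason to do this. If you accidentally read Herbert Schildt's C: The Complete Reference, or any C book based on it, then you have been completely misinformed. Characters should be of type int or better. That means you should write this: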

int x = getchar();

and not this:

char x = getchar(); /* <- WRONG! */

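As far as integral types go, char is worthless. You shouldn't write functions that take parameters of type char, and you shouldn't create temporaries of type char, and the same advice goes for wchar_t.
char* may be a convenient typedef for a character string, but it is a novice mistake to think of it as an "array of characters" or a "pointer to an array of characters", despite what the cdecl tool says. Treating it as an actual array of characters, with nonsense like this: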

for(int i = 0; s[i]; ++i) {
  wchar_t wc = s[i];
  char c = doit(wc);
  out[i] = c;
}

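is absurdly wrong. It will not do what you want; it *will* break in subtle and serious ways, behave differently on different platforms, and you *will* thoroughly confuse your users. If you see this, you are trying to reimplement wcstombs(), which is already part of ANSI C, and it's still wrong.
What you're *really* looking for is iconv(), which converts a character string from one encoding (even if it is packed into a wchar_t array) into a character string of another encoding.
Now go read this to learn what's wrong with iconv.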
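For reference, the standard single-character routine in this family is std::wctomb. A rough sketch of calling it, assuming the default "C" locale; the helper name encode_wide_char and the empty-string error convention are invented here for illustration:

```cpp
#include <climits>
#include <cstdlib>
#include <string>

// Hypothetical helper: encodes one wide character in the current C locale's
// multibyte encoding. Returns an empty string if it has no representation.
std::string encode_wide_char(wchar_t wc) {
    char buffer[MB_LEN_MAX];
    std::wctomb(nullptr, 0);               // reset any shift state
    int length = std::wctomb(buffer, wc);  // bytes written, or -1 on failure
    if (length < 0) {
        return std::string();
    }
    return std::string(buffer, static_cast<std::size_t>(length));
}
```

The result can be more than one byte long in a multibyte locale, which is why it is returned as a string rather than a single char.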

2023-03-14

1tu0hz3e4#

An easy way is:

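wstring your_wchar_in_ws(<your wchar>);
// narrows each wide code unit to a char; only meaningful for ASCII content
string your_wchar_in_str(your_wchar_in_ws.begin(), your_wchar_in_ws.end());
// c_str() returns a const pointer that is valid only while your_wchar_in_str lives
const char* your_wchar_in_char = your_wchar_in_str.c_str();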

I have been using this method for many years :)

2023-03-14

jm2pwxwz5#

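A short function I wrote a while back to pack a wchar_t array into a char array. Characters that aren't on the ANSI code page (0-127) are replaced by '?' characters, and it handles surrogate pairs correctly.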

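#include <cstddef>

// Returns the number of chars written to dest, not counting the terminator.
size_t to_narrow(const wchar_t * src, char * dest, size_t dest_len){
  if (dest_len == 0)
    return 0;

  size_t in = 0;   // read index into src
  size_t out = 0;  // write index into dest

  while (src[in] != L'\0' && out < (dest_len - 1)){
    wchar_t code = src[in];
    if (code >= 0 && code < 128)
      dest[out] = char(code);
    else{
      dest[out] = '?';
      // lead surrogate (0xD800-0xDBFF): skip the next code unit, which is the trail
      if (code >= 0xD800 && code <= 0xDBFF && src[in + 1] != L'\0')
        in++;
    }
    in++;
    out++;
  }

  dest[out] = '\0';

  return out;
}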

2023-03-14

kx7yvsdv6#

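Technically, 'char' could have the same range as either 'signed char' or 'unsigned char'. For unsigned characters, your range is correct; in theory, for signed characters, your condition is wrong. In practice, very few compilers will object, and the result will be the same.
Nitpick: the last && in the assert is a syntax error.
Whether the assert is appropriate depends on whether you can afford a crash once the code reaches the customer, and on what you could or should do if the assertion condition is violated but the assertion is not compiled into the code. For debug work it seems fine, but you might also want an active test after it for run-time checking.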
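To illustrate the signedness point, here is a small sketch of a range test that adapts to the platform's char. The name fits_in_char is hypothetical; CHAR_MAX is the standard macro from <climits> and is 127 where char is signed and 255 where it is unsigned (assuming 8-bit chars).

```cpp
#include <climits>

// Hypothetical helper: true when `wide` is a non-negative value that a plain
// char can hold on this platform, whether char is signed or unsigned.
bool fits_in_char(wchar_t wide) {
    return wide >= 0 && wide <= CHAR_MAX;
}
```

A check like this can sit behind the assert as the run-time test, so release builds still reject values such as 200 on signed-char platforms.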

2023-03-14

i2byvkas7#

Here's another way of doing it; remember to call free() on the result.

#include <cstdlib>
#include <cwchar>

char* wchar_to_char(const wchar_t* pwchar)
{
    // get the number of characters in the string, plus one for the terminator.
    const size_t charCount = std::wcslen(pwchar) + 1;

    // allocate one char per wide character; the caller must free() the result.
    char* filePathC = (char*)std::malloc(sizeof(char) * charCount);
    if (filePathC == NULL)
        return NULL;

    for (size_t i = 0; i < charCount; i++)
    {
        // narrow to char; only meaningful for values in the ASCII range.
        // the last iteration copies the terminating '\0'.
        filePathC[i] = (char)pwchar[i];
    }

    return filePathC;
}

2023-03-14

uidvcgyl8#

You can also convert wchar_t --> wstring --> string --> char

wchar_t wide = /* something */;
wstring wstrValue(1, wide);  // a one-character wide string

string strValue;
strValue.assign(wstrValue.begin(), wstrValue.end());  // convert wstring to string

char char_value = strValue[0];

2023-03-14

qcuzuvrc9#

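In general, no. On platforms where char is unsigned, int(wchar_t(255)) == int(char(255)) holds, but that only means the two have the same int value; they may still represent different characters. (Where char is signed, char(255) is typically -1, so even the int values differ.)
You would see such a discrepancy even on the majority of Windows PCs. For instance, on Windows code page 1250, char(0xFF) is the same character as wchar_t(0x02D9) (dot above), not wchar_t(0x00FF) (small y with diaeresis).
Note that it does not even hold for the ASCII range, as C++ doesn't even require ASCII.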
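One way to let the locale decide, rather than assuming the numeric values line up, is std::wctob from <cwchar>, which returns the single-byte form of a wide character in the current C locale or EOF if there is none. The wrapper below is only a sketch; the name narrow_in_locale and the fallback parameter are assumptions.

```cpp
#include <cstdio>
#include <cwchar>

// Hypothetical helper: asks the current C locale for the single-byte form of
// `wide`, returning `fallback` when the locale has none.
char narrow_in_locale(wchar_t wide, char fallback) {
    int byte = std::wctob(static_cast<std::wint_t>(wide));
    if (byte == EOF) {
        return fallback;
    }
    return static_cast<char>(byte);
}
```

In the default "C" locale this accepts the basic characters and rejects wchar_t(0x02D9); whether it maps that character to 0xFF depends on a code-page-1250 locale being installed and selected with setlocale.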

2023-03-14
