Generating extended ASCII in R on Linux


So far I have been working on Windows, where my code for getting the extended ASCII characters is:

extendedascii=rawToChar(as.raw(seq(128,255,by=1)),multiple=TRUE)

This gives me a vector containing the characters I need.

  [1] "€" "" "‚" "ƒ" "„" "…" "†" "‡" "ˆ" "‰" "Š" "‹" "Œ" "" "Ž" "" "" "‘" "’" "“" "”" "•" "–" "—" "˜" "™" "š" "›" "œ" "" "ž" "Ÿ" " " "¡" "¢" "£" "¤" "¥" "¦"
 [40] "§" "¨" "©" "ª" "«" "¬" "­" "®" "¯" "°" "±" "²" "³" "´" "µ" "¶" "·" "¸" "¹" "º" "»" "¼" "½" "¾" "¿" "À" "Á" "Â" "Ã" "Ä" "Å" "Æ" "Ç" "È" "É" "Ê" "Ë" "Ì" "Í"
 [79] "Î" "Ï" "Ð" "Ñ" "Ò" "Ó" "Ô" "Õ" "Ö" "×" "Ø" "Ù" "Ú" "Û" "Ü" "Ý" "Þ" "ß" "à" "á" "â" "ã" "ä" "å" "æ" "ç" "è" "é" "ê" "ë" "ì" "í" "î" "ï" "ð" "ñ" "ò" "ó" "ô"
[118] "õ" "ö" "÷" "ø" "ù" "ú" "û" "ü" "ý" "þ" "ÿ"

Now, on Linux, I get this instead:

  [1] "\x80" "\x81" "\x82" "\x83" "\x84" "\x85" "\x86" "\x87" "\x88" "\x89" "\x8a" "\x8b" "\x8c"
 [14] "\x8d" "\x8e" "\x8f" "\x90" "\x91" "\x92" "\x93" "\x94" "\x95" "\x96" "\x97" "\x98" "\x99"
 [27] "\x9a" "\x9b" "\x9c" "\x9d" "\x9e" "\x9f" "\xa0" "\xa1" "\xa2" "\xa3" "\xa4" "\xa5" "\xa6"
 [40] "\xa7" "\xa8" "\xa9" "\xaa" "\xab" "\xac" "\xad" "\xae" "\xaf" "\xb0" "\xb1" "\xb2" "\xb3"
 [53] "\xb4" "\xb5" "\xb6" "\xb7" "\xb8" "\xb9" "\xba" "\xbb" "\xbc" "\xbd" "\xbe" "\xbf" "\xc0"
 [66] "\xc1" "\xc2" "\xc3" "\xc4" "\xc5" "\xc6" "\xc7" "\xc8" "\xc9" "\xca" "\xcb" "\xcc" "\xcd"
 [79] "\xce" "\xcf" "\xd0" "\xd1" "\xd2" "\xd3" "\xd4" "\xd5" "\xd6" "\xd7" "\xd8" "\xd9" "\xda"
 [92] "\xdb" "\xdc" "\xdd" "\xde" "\xdf" "\xe0" "\xe1" "\xe2" "\xe3" "\xe4" "\xe5" "\xe6" "\xe7"
[105] "\xe8" "\xe9" "\xea" "\xeb" "\xec" "\xed" "\xee" "\xef" "\xf0" "\xf1" "\xf2" "\xf3" "\xf4"
[118] "\xf5" "\xf6" "\xf7" "\xf8" "\xf9" "\xfa" "\xfb" "\xfc" "\xfd" "\xfe" "\xff"

I tried Encoding(extendedascii) and got "unknown" for every element of the vector. I also tried iconv(extendedascii, from="UTF-8", to="ASCII"), which just gave me NA.

I believe my basic problem is that I don't know what encoding my text is in, and my machine may not know or recognize it either. Any help?

There is no such thing as extended ASCII. The encoding you were using on Windows is called Windows-1252, or CP-1252. iconv knows this encoding well.
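As a minimal sketch of what that looks like for the vector in the question: the earlier attempt with from="UTF-8" returned NA because a lone byte in the range 0x80 to 0xFF is not valid UTF-8, so the source encoding has to be named explicitly. This assumes your iconv build accepts the name "CP1252" (check iconvlist() if unsure); the variable name utf8chars is only illustrative.

```r
# Bytes 0x80..0xFF as 128 single-byte strings, as in the question
extendedascii <- rawToChar(as.raw(128:255), multiple = TRUE)

# Tell iconv the bytes are Windows-1252 and ask for UTF-8 back.
# Five bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D) are undefined in
# Windows-1252 and may come back as NA, depending on the iconv build.
utf8chars <- iconv(extendedascii, from = "CP1252", to = "UTF-8")

# Byte b sits at index b - 127, so 0x80 is element 1 and 0xE9 is element 106
print(utf8chars[c(1, 106)])
```

The converted elements are marked as UTF-8, so Encoding(utf8chars) should no longer report "unknown" for them.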

If you have many files in this encoding, you may need to keep using iconv on Linux; otherwise, it makes sense to switch to UTF-8 once and for all.
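For the file case, one possible approach is to read the lines as raw bytes and convert them afterwards. The sketch below is only an illustration: the temporary file and its contents are made up for the example, and it assumes the iconv build accepts the name "CP1252".

```r
# Create a small sample file holding "café" encoded as Windows-1252
# (0xE9 is é in that encoding). Hypothetical data, for illustration only.
path <- tempfile(fileext = ".txt")
writeBin(as.raw(c(0x63, 0x61, 0x66, 0xe9, 0x0a)), path)

# Read the lines without any re-encoding, then convert them to UTF-8
rawlines <- readLines(path, warn = FALSE)
lines <- iconv(rawlines, from = "CP1252", to = "UTF-8")

print(lines)
unlink(path)
```

Writing the converted lines back out with a UTF-8 connection would be the "switch once and for all" variant of the same idea.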
