nodejs encoding of arguments in child_process.spawn and child_process.execFile

In NodeJS, child_process.execFile and .spawn take this parameter:

args <string[]> List of string arguments.

How does NodeJS encode the strings you pass in this array?

Context: I'm writing a nodejs app that adds metadata (often including non-ASCII characters) to an mp3.

I know that ffmpeg expects UTF-8 encoded arguments. If my nodejs app invokes child_process.execFile("ffmpeg",["-metadata","title="+myString], {encoding:"utf8"}) then how will nodejs encode myString in the arguments?
I know that id3v2 expects latin1 encoded arguments. If my nodejs app invokes child_process.execFile("id3v2",["--titl",myString], {encoding:"latin1"}) then how will nodejs encode myString in the arguments?

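I see that execFile and spawn both take an "encoding" option. But the nodejs docs say "The encoding option can be used to specify the character encoding used to decode the stdout and stderr output." The docs say nothing about the encoding of args.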

Answer: NodeJS always encodes the args as UTF-8.

I wrote a simple C++ app which shows the raw truth of the bytes that are passed into its argv:

#include <stdio.h>

int main(int argc, char *argv[])
{
    printf("argc=%d\n", argc);
    for (int i = 0; i < argc; i++)
    {
        /* Print each argument as index:"contents". */
        printf("%d:\"", i);
        for (char *c = argv[i]; *c != 0; c++)
        {
            if (*c >= 32 && *c < 127)
                printf("%c", *c);
            else
            {
                /* Show non-printable and non-ASCII bytes as \xHH. */
                unsigned char d = *(unsigned char *)c;
                unsigned int e = d;
                printf("\\x%02X", e);
            }
        }
        printf("\"\n");
    }
    return 0;
}

Within my NodeJS app, I built some strings whose contents I knew for certain:

const a = Buffer.from([65]).toString("utf8");
const pound = Buffer.from([0xc2, 0xa3]).toString("utf8");
const skull = Buffer.from([0xe2, 0x98, 0xa0]).toString("utf8");
const pound2 = Buffer.from([0xa3]).toString("latin1");

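The argument to toString says that the raw bytes in the buffer should be interpreted as UTF-8 (or latin1 in the last case). The result is four strings whose contents I know unambiguously to be correct.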

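(I understand that JavaScript VMs typically store their strings as UTF-16? The fact that pound and pound2 behave the same in my experiments shows that the provenance of a string doesn't matter.)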
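That the two pound strings really are the same string can be checked directly. This is a minimal sketch that repeats the two definitions above; samePound is a name introduced here for illustration.

```javascript
const pound = Buffer.from([0xc2, 0xa3]).toString("utf8");
const pound2 = Buffer.from([0xa3]).toString("latin1");

// Both decode to the single UTF-16 code unit U+00A3, so the strings are identical.
const samePound = pound === pound2;
console.log(samePound);                        // true
console.log(pound.length);                     // 1
console.log(pound.charCodeAt(0).toString(16)); // a3
```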

Finally I invoked execFile with these strings:

child_process.execFileAsync("argcheck",[a,pound,pound2,skull],{encoding:"utf8"});
child_process.execFileAsync("argcheck",[a,pound,pound2,skull],{encoding:"latin1"});

In both cases, the raw bytes that nodejs passed into argv were the UTF-8 encodings of the strings a, pound, pound2, skull.
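A similar check can be sketched without compiling anything, assuming a POSIX printf executable is on PATH (so not on stock Windows). The helper argBytes is hypothetical and not part of the experiment above; it echoes one argument back through the child's stdout and reports the bytes as hex.

```javascript
const { execFileSync } = require("child_process");

// Hypothetical helper: returns, as hex, the bytes a child process receives
// for a single argument. Assumes a POSIX `printf` on PATH; `printf %s ARG`
// writes the argument's bytes back to stdout unchanged.
function argBytes(arg, encoding) {
  const out = execFileSync("printf", ["%s", arg], { encoding });
  // `encoding` only controls how stdout is decoded, so re-encoding with the
  // same encoding recovers the bytes the child wrote.
  return Buffer.from(out, encoding).toString("hex");
}

const pound = Buffer.from([0xc2, 0xa3]).toString("utf8");
const skull = Buffer.from([0xe2, 0x98, 0xa0]).toString("utf8");

for (const encoding of ["utf8", "latin1"]) {
  console.log(encoding, argBytes(pound, encoding), argBytes(skull, encoding));
}
// utf8 c2a3 e298a0
// latin1 c2a3 e298a0
```

The child sees the same UTF-8 bytes whichever encoding option is given, which matches the result from the argv checker.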

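So how can we pass latin1 arguments from nodejs?

The explanation above shows that nodejs cannot pass any latin1 character in the range 128..255 to child_process.spawn/execFile. But there is an escape hatch involving child_process.exec: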

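Example: this string "A £ ☠"
is stored internally in Javascript's UTF-16 as "\u0041 \u00A3 \u2620"
is encoded in UTF-8 as "\x41 \xC2\xA3 \xE2\x98\xA0"
is encoded in latin1 as "\x41 \xA3 ?" (the skull and crossbones cannot be expressed in latin1)
Unicode chars 0-127 are the same as latin1, and encode the same in UTF-8 as in latin1
Unicode chars 128-255 are the same as latin1, but encode differently
Unicode chars 256+ do not exist in latin1.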
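The byte sequences listed above can be reproduced with Buffer. This is a small sketch; toLatin1 is a hypothetical helper that substitutes "?" for anything outside latin1, because Buffer's own "latin1" encoder would otherwise keep only the low byte of such characters.

```javascript
const s = "A £ ☠";

// UTF-8 bytes, as nodejs would place them in argv.
const utf8Hex = Buffer.from(s, "utf8").toString("hex");

// Hypothetical helper: latin1 bytes, with "?" for code points above U+00FF.
function toLatin1(str) {
  const replaced = [...str]
    .map(c => (c.codePointAt(0) <= 0xff ? c : "?"))
    .join("");
  return Buffer.from(replaced, "latin1");
}
const latin1Hex = toLatin1(s).toString("hex");

console.log(utf8Hex);   // 4120c2a320e298a0
console.log(latin1Hex); // 4120a3203f
```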

// this would encode them as UTF-8, which is wrong:
execFile("id3v2", ["--comment", "A £ ☠", "x.mp3"]);
// instead we'll use shell printf to bypass nodejs's wrongful encoding:
exec('id3v2 --comment "`printf "A \\xA3 ?"`" x.mp3');

Here's a handy way to turn a string like "A £ ☠" into one like "A \xA3 ?", ready to pass into child_process.exec:

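const comment2 = [...comment]
  .map(c =>
    // ASCII passes through, 128-255 becomes a \xHH escape, anything else becomes "?"
    c <= "\u007F" ? c
    : c <= "\u00FF" ? `\\x${("000" + c.charCodeAt(0).toString(16)).substr(-2)}`
    : "?"
  )
  .join("");
// the backticks for the shell must be escaped inside a template literal
const cmd = `id3v2 --comment "\`printf "${comment2}"\`" "${fn}"`;
child_process.exec(cmd, (e, stdout, stderr) => { ... });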
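Because the comment is spliced into a shell command, characters such as ", $, ` and % in it would be interpreted by the shell or by printf. One possible variation, sketched below, hex-escapes every latin1 character other than letters, digits and spaces. The name latin1PrintfEscape is hypothetical and this goes beyond the original snippet; like that snippet, it assumes the shell's printf understands \xHH escapes.

```javascript
// Hypothetical helper, not part of the original answer.
function latin1PrintfEscape(str) {
  return [...str]
    .map(c => {
      const code = c.codePointAt(0);
      if (code > 0xff) return "?";           // not representable in latin1
      if (/[A-Za-z0-9 ]/.test(c)) return c;  // passed through literally
      // Everything else, including shell and printf metacharacters, as \xHH.
      return "\\x" + code.toString(16).padStart(2, "0");
    })
    .join("");
}

const escaped = latin1PrintfEscape('A £ ☠ "$`%');
console.log(escaped); // A \xa3 ? \x22\x24\x60\x25
```

The result can be used in place of comment2 when building cmd. The file name fn is still interpolated as-is and would need its own quoting.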
