Arguments in package.json scripts (encoding problem)



https://nodejs.org/docs/latest/api/process.html#processargv
https://www.golinuxcloud.com/pass-arguments-to-npm-script/

I pass an argument when calling a script from package.json, like this:

--pathToFile=./ESMM/Parametrização_Dezembro_PS1_2022.xlsx

In the code, I retrieve that argument from process.argv:

const value = process.argv.find( element => element.startsWith( `--pathToFile=` ) );
const pathToFile=value.replace( `--pathToFile=` , '' );
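The lookup above throws if the flag is missing, because `value` is then `undefined`. A slightly more defensive variant is sketched below; `getArgValue` is a hypothetical helper name introduced here for illustration, not part of Node.js.

```javascript
// Hypothetical helper: returns the value of a `--name=value` argument,
// or undefined when the flag is not present in argv.
function getArgValue(argv, name) {
  const prefix = `--${name}=`;
  const match = argv.find((element) => element.startsWith(prefix));
  // slice() strips only the leading prefix, even if the value contains it again
  return match === undefined ? undefined : match.slice(prefix.length);
}

const pathToFile = getArgValue(process.argv, 'pathToFile');
console.log(pathToFile);
```

This does not change how the argument is decoded; it only avoids a crash when the argument is absent.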

The resulting string appears to be in the wrong format/encoding:

/ESMM/Parametriza├º├úo_Dezembro_PS_1_2022.xlsx

I tried converting to latin1 (using this encoding has solved other problems in the past):

const buffer = require("buffer"); // transcode() lives on the buffer module
const latin1Buffer = buffer.transcode(Buffer.from(pathToFile), "utf8", "latin1");
const latin1String = latin1Buffer.toString("latin1");

But I still can't get a correctly encoded string:

/ESMM/Parametriza?º?úo_Dezembro_PS_1_2022.xlsx

My package.json is UTF-8.

My current code page is (chcp): Active code page: 850

Operating system: Windows

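This seems to be related to:

  • https://code.visualstudio.com/docs/editor/tasks#_changing-the-encoding-for-a-task-output
  • VS Code: how to change the encoding of the terminal triggered by a "build task"
  • https://pt.stackoverflow.com/questions/148543/como-consertar-erro-de-acentua%C3%A7%C3%A3o-do-cmd
  • Get argv raw bytes in Node.js

I will try these configurations.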

const min = 0xD800, max = 0xDFFF; // UTF-16 surrogate range
console.log(min); // 55296
console.log(max); // 57343
let textFiltered = "", specialChars = 0;
for (const charAux of pathToFile) {
  const hexChar = Buffer.from(charAux, 'utf8').toString('hex');
  console.log(hexChar);
  const intChar = parseInt(hexChar, 16);
  if (hexChar.length > 2) {
    // More than one UTF-8 byte: count and report it as a special character
    //if(intChar>min && intChar<max){
    specialChars++;
    console.log(`specialChars(${specialChars}): ${hexChar}`);
  } else {
    // Single-byte (ASCII) character: keep it
    textFiltered += String.fromCharCode(intChar);
  }
}

console.log(textFiltered) // normal characters

/ESMM/Parametrizao_Dezembro_PS_1_2022.xlsx

console.log(`specialChars(${specialChars}): ${hexChar}`) // special characters

specialChars(1): e2949c  
specialChars(2): c2ba  
specialChars(3): e2949c  
specialChars(4): c3ba

The hex value e2949c seems to mark a special character, since it repeats, and ideally 0xc2ba should be converted to "ç" and 0xc3ba to "ã". I am still figuring out how.
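To see where these values come from, the sketch below prints the UTF-8 bytes of the intended characters next to those of the garbled ones. It uses only `Buffer`; the helper name `hex` is made up for illustration.

```javascript
// Hypothetical helper: UTF-8 bytes of a string, as lowercase hex.
const hex = (s) => Buffer.from(s, 'utf8').toString('hex');

console.log(hex('ç')); // c3a7   (intended character)
console.log(hex('ã')); // c3a3   (intended character)
console.log(hex('├')); // e2949c (garbled marker)
console.log(hex('º')); // c2ba   (garbled)
console.log(hex('ú')); // c3ba   (garbled)
```

Both intended characters start with the byte c3. In code page 850, byte C3 is displayed as ├, A7 as º, and A3 as ú, which is consistent with the garbled string above: it looks like the UTF-8 bytes of "çã" shown as code page 850 characters.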

Every Unicode code point can be written in a string as \u{xxxxxx}, where xxxxxx stands for 1 to 6 hexadecimal digits.
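As a small illustration of that escape syntax in JavaScript (the variable names are only for this example):

```javascript
// Code point escapes for the two characters involved here.
const cedilla = '\u{e7}'; // ç, U+00E7
const aTilde = '\u{e3}';  // ã, U+00E3

const word = `Parametriza${cedilla}${aTilde}o`;
console.log(word); // Parametrização

// The reverse direction: from a character to its code point in hex.
console.log('├'.codePointAt(0).toString(16)); // 251c
```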

This is what @JosefZ showed, although for Python. In my case I am going to use a direct conversion, because the argument will always contain the keyword "Parametrização".

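The problem in this case is that my package.json and my script are in proper UTF-8, as @triplee said (thanks for the help), but process.argv returns a <string[]>, which is basically UTF-16. So my solution is to handle ├, which in hex is "e2949c", and recover the correct characters: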

const UTF8_Character = "e2949c"; // ├
// For these cases, use this lookup table that holds the correct characters
const personalized_encoding = {
  "c2ba": "ç",
  "c3ba": "ã"
};
let textFiltered = "", specialChars = 0;
for (const charAux of pathToFile) {
  const hexChar = Buffer.from(charAux, 'utf8').toString('hex');
  if (hexChar.length > 2) {
    // Multi-byte character: drop the ├ marker and translate the rest
    if (hexChar === UTF8_Character) continue;
    specialChars++;
    // Keep the character unchanged if it is not in the table
    textFiltered += personalized_encoding[hexChar] ?? charAux;
  } else {
    // Single-byte (ASCII) character: keep as-is
    textFiltered += charAux;
  }
}
console.log(textFiltered);
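A more general variant of the same idea, offered as a sketch and not as a definitive fix. It assumes the garbling comes from UTF-8 bytes being displayed as code page 850 characters; that matches the hex values above but is not confirmed in this thread. Under that assumption, each garbled character can be mapped back to its original byte and the bytes decoded as UTF-8. `CP850_TO_BYTE` and `repairMojibake` are hypothetical names, and the table covers only the three characters seen here.

```javascript
// Partial reverse table (assumption: CP850): displayed character -> original byte.
const CP850_TO_BYTE = new Map([
  ['├', 0xc3],
  ['º', 0xa7],
  ['ú', 0xa3],
]);

function repairMojibake(text) {
  const bytes = [];
  for (const ch of text) {
    if (CP850_TO_BYTE.has(ch)) {
      // Garbled character: restore the single byte it stood for
      bytes.push(CP850_TO_BYTE.get(ch));
    } else {
      // Anything outside the table is kept by re-encoding it as UTF-8
      bytes.push(...Buffer.from(ch, 'utf8'));
    }
  }
  return Buffer.from(bytes).toString('utf8');
}

console.log(repairMojibake('/ESMM/Parametriza├º├úo_Dezembro_PS_1_2022.xlsx'));
// /ESMM/Parametrização_Dezembro_PS_1_2022.xlsx
```

Unlike a per-word lookup, this does not depend on the keyword "Parametrização", but it would also rewrite a genuine º or ú in the path, and other accented characters would need more table entries or a full code page 850 decoder.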
