How can I ignore the first and third data columns when sorting and summing text with a batch file?



I wrote the script below:

start /w "" "C:\Program Files (x86)\CoolUtils\Total PDF Converter\PDFConverter.exe" "%userprofile%\Desktop\teste.oxps" "%userprofile%\Desktop\teste.txt"
del /f /s /q "%userprofile%\Desktop\teste.pdf"
findstr /v /r /c:"http" /c:"Banrisul" /c:"Sac" /c:":" /c:"-" /c:"SAC" /c:"OUVIDORIA" /c:"B A N R I S U L" /c:"+" "teste.txt" > "output.txt"
for /f "usebackq tokens=* delims=" %%a in ("output.txt") do (echo(%%a)>>output2.txt
move /y output2.txt output.txt

The output.txt below is shortened because the real file is too large:

01  VERO BANRICOMPRAS A PRAZO                         914709              77,56
VERO BANRICOMPRAS A PRAZO                         914710             322,58
VERO BANRICOMPRAS A VISTA                         256534             187,64
VERO BANRICOMPRAS A VISTA                         256539              17,62
VERO CARTAO CREDITO                               391534             146,22
VERO CARTAO CREDITO                               391535             159,26
VERO CARTAO CREDITO                               391536             543,16
VERO CARTAO CREDITO                               391537             479,37
VERO CARTAO CREDITO                               391538              96,00
REDECARD S.A CARTAO CREDITO                       415856             217,25
REDECARD S.A CARTAO CREDITO                       416552             152,28
VERO CARTAO DEBITO                                338711             144,60
VERO CARTAO DEBITO                                338712             378,03
VERO CARTAO DEBITO                                338713             134,34
CIELO S/A CARTAO DEBITO                           312125              57,71
CIELO S/A CARTAO DEBITO                           312126              23,46
CIELO S/A CARTAO DEBITO                           312127             232,55
RESGATE AUTOMATICO                                000000          17.700,00
02  VERO BANRICOMPRAS A PRAZO                         208538             396,99
VERO BANRICOMPRAS A PRAZO                         208539             217,65
VERO BANRICOMPRAS A VISTA                         562243             147,65
VERO CARTAO CREDITO                               692285             105,58
VERO CARTAO CREDITO                               692286              28,53
VERO CARTAO CREDITO                               692287             561,48
VERO CARTAO CREDITO                               692288             150,14
VERO CARTAO CREDITO                               692289             101,78
CIELO S/A CARTAO CREDITO                          673728             125,39
CIELO S/A CARTAO CREDITO                          673729              25,36
REDECARD S.A CARTAO CREDITO                       716879             278,83
REDECARD S.A CARTAO CREDITO                       720069              47,78
VERO CARTAO DEBITO                                637568             124,70
VERO CARTAO DEBITO                                637569             272,15
VERO CARTAO DEBITO                                637570             244,13
CIELO S/A CARTAO DEBITO                           609977             100,13
03  VERO BANRICOMPRAS A PRAZO                         501238             158,81

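I am stuck on how to sum the values for each identical text and then add up those sums into a grand total. The middle data column and the leading 01, 02, etc. do not matter. Only the text and the last column should be used. The result needs to look like this: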

VERO BANRICOMPRAS A PRAZO                 20.596,26
VERO BANRICOMPRAS A VISTA                 14.658,12
VERO CARTAO CREDITO                        2.549,34
(etc.)
---------------------------------------------------
TOTAL OF ALL:                             37.803,72

I just uploaded my complete output.txt at this link: https://ufile.io/uyxr1


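About my output.txt file: the data columns are separated only by spaces, but the number of characters in the descriptions and the numbers varies. The only things that matter to me are the description and the last column.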


@Mofi, here is what I expect:

------------ CARDS OF MONTH -----------
CIELO S/A CARTAO CREDITO        2.147,13
CIELO S/A CARTAO DEBITO        10.867,72
REDECARD S.A CARTAO CREDITO    11.835,11
REDECARD S.A CARTAO DEBITO         87,20
VERO BANRICOMPRAS A PRAZO      17.083,70
VERO BANRICOMPRAS A VISTA       7.829,06
VERO CARTAO CREDITO            58.052,05
VERO CARTAO DEBITO             17.215,64
----------------------------------------
TOTAL OF ALL:                 125.117,61
----------------------------------------
RESGATE AUTOMATICO            152.900,00
PAGAMENTO DE FORNECEDOR        25.900,00
DOC-E                          63.487,36

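Batch files are meant for executing a series of commands and applications, not for editing text files or doing calculations on data. Many other programming and scripting languages are far better suited to this task than the Windows command processor cmd.exe, and I would never have thought of using cmd.exe with a batch file for it. For exactly that reason I was curious whether it could be done at all using only commands supported by the Windows command line. It can: after some hours of coding, here is a batch file that works for the example in the question. I have not run it on the entire file.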

@echo off
setlocal EnableExtensions DisableDelayedExpansion
set "DataCount=0"
set "MaxNameLength=19"
set "TempFile1=%TEMP%\Output.tmp"
set "TempFile2=%TEMP%\Sorted.tmp"
set "OutputFile=Output.txt"
"%ProgramFiles(x86)%\CoolUtils\Total PDF Converter\PDFConverter.exe" "%UserProfile%\Desktop\teste.oxps" "%UserProfile%\Desktop\teste.txt"
del /F "%UserProfile%\Desktop\teste.pdf"
%SystemRoot%\System32\findstr.exe /V /R /C:"http" /C:"Banrisul" /C:"Sac" /C:":" /C:"-" /C:"SAC" /C:"OUVIDORIA" /C:"B A N R I S U L" /C:"+" "%UserProfile%\Desktop\teste.txt" >"%TempFile1%"
if not exist "%TempFile1%" goto EndBatch
rem Process each line in the output file line by line with leading
rem spaces and tabs already removed by FOR, but no other characters.
for /F usebackq^ tokens^=*^ eol^= %%I in ("%TempFile1%") do (
set "DataRow=%%~I"
call :ProcessLine
)
goto FormatOutput

:ProcessLine
rem Remove all double quotes within the data row.
set "DataRow=%DataRow:"=%"
rem Remove all exclamation marks within the data row.
set "DataRow=%DataRow:!=%"
rem Replace all horizontal tabs within the data row by spaces.
set "DataRow=%DataRow:  = %"
rem Replace all $ by the string #DollarSign# within the data row.
set "DataRow=%DataRow:$=#DollarSign#%"
rem Replace two spaces in series by a dollar sign.
set "DataRow=%DataRow:  =$%"
rem Replace all occurrences of dollar sign plus space by dollar sign.
set "DataRow=%DataRow:$ =$%"
rem Split up the data row using dollar sign as delimiter which can result
rem in four or just three tokenized substrings depending on existence of
rem 01, 02, ... in first data column. Of interest is the name string in
rem first or second data column and the value string in third or fourth
rem data column.
for /F tokens^=1-4^ delims^=$^ eol^= %%J in ("%DataRow%") do (
if "%%M" == "" (
set "DataName=%%~J"
set "DataValue=%%L"
) else (
set "DataName=%%~K"
set "DataValue=%%M"
)
)
rem It is safe now to replace the string #DollarSign# back to dollar sign.
set "DataName=%DataName:#DollarSign#=$%"
rem If the first 2 characters of data name are two digits and third
rem character is a space then remove those 3 characters from data name.
if not "%DataName:~2,1%" == " " goto CheckName
if "%DataName:~0,1%" == ";" goto CheckName
for /F "delims=01234567890" %%J in ("%DataName:~0,2%") do goto CheckName
set "DataName=%DataName:~3%"
rem Data names not containing one of the following four strings
rem should be at end of the list and not included in the total sum.
:CheckName
if not "%DataName:DEBITO=%"  == "%DataName%" goto ReformatValue
if not "%DataName:CREDITO=%" == "%DataName%" goto ReformatValue
if not "%DataName:A VISTA=%" == "%DataName%" goto ReformatValue
if not "%DataName:A PRAZO=%" == "%DataName%" goto ReformatValue
set "DataName=z_%DataName%"
rem Floating point arithmetic is not supported by Windows command processor,
rem just signed 32-bit integer arithmetic with values in range -2147483648
rem to 2147483647. Therefore remove all dots and commas from data value.
:ReformatValue
set "DataValue=%DataValue:.=%"
set "DataValue=%DataValue:,=%"
for /F "tokens=* delims=0" %%J in ("%DataValue%") do set "DataValue=%%J"
if not defined DataValue set "DataValue=0"
rem Add the data value as integer to total sum. There is no special
rem overflow handling implemented yet in case of total sum exceeds
rem the maximum 32-bit positive signed integer value 2147483647.
if not "%DataName:~0,2%" == "z_" set /A "#TotalSum+=DataValue"
rem Find out if a data value of current data name is already in list
rem of environment variables and in this case just add the data value
rem to the existing sum for that data name and exit the subroutine.
for /F "tokens=1* delims==" %%J in ('set $ 2^>nul') do if /I "%%K" == "%DataName%" set /A "#%%~nJ+=DataValue" & goto :EOF
rem This is a new data name with its first data value. So set
rem the appropriate environment variables for name and value.
set /A DataCount+=1
set "$%DataCount%=%DataName%"
set "#%DataCount%=%DataValue%"
rem For a later aligned output find out the length of the data name and
rem remember its length if being greater than longest data name up to now.
rem z_ added at beginning of some data names must be always ignored for
rem length of data name.
set "NameLength=1"
:GetNameLength
set "DataName=%DataName:~1%"
if not "%DataName%" == "" set /A "NameLength+=1" & goto GetNameLength
if "%DataName:~0,2%" == "z_" set /A NameLength-=2
if %NameLength% GTR %MaxNameLength% set "MaxNameLength=%NameLength%"
rem Exit the subroutine.
goto :EOF

:FormatOutput
rem Add 3 to maximum name length to have always at least 3 spaces
rem between longest data name and the sum of the data values.
set /A MaxNameLength+=3
setlocal EnableDelayedExpansion
rem Build a string consisting of spaces according to maximum name length.
set "SpacesName="
for /L %%I in (1,1,%MaxNameLength%) do set "SpacesName=!SpacesName! "
rem Format value of total sum.
call :FormatNumber %#TotalSum%
rem For a later aligned output find out the length of the total
rem sum which is expected to be the greatest value in output.
set "ValueLength=1"
:GetValueLength
set "DataValue=%DataValue:~1%"
if not "%DataValue%" == "" set /A "ValueLength+=1" & goto GetValueLength
rem Build a string consisting of spaces according to value length.
set "SpacesValue="
for /L %%I in (1,1,%ValueLength%) do set "SpacesValue=!SpacesValue! "
rem Output the data names with appropriate number of aligning spaces
rem and the data sum for each data name formatted and with aligning
rem spaces into the output file.
set /A ExtraNameLength=MaxNameLength+2
del "%TempFile1%" 2>nul
(for /F "tokens=1* delims==" %%I in ('set $ 2^>nul') do (
set "DataName=%%J%SpacesName%"
if not "!DataName:~0,2!" == "z_" (
set "DataName=!DataName:~0,%MaxNameLength%!"
) else set "DataName=!DataName:~0,%ExtraNameLength%!"
call :FormatNumber !#%%~nI!
set "DataValue=%SpacesValue%!DataValue!"
set "DataValue=!DataValue:~-%ValueLength%!"
echo !DataName!!DataValue!
))>>"%TempFile1%"
rem Sort the lines in output file according to name.
%SystemRoot%\System32\sort.exe "%TempFile1%" /O "%TempFile2%"
del "%TempFile1%" 2>nul
rem 16 is the length of string CARDS OF MONTH with a space on both sides.
set /A LineLength=MaxNameLength+ValueLength
set /A HeadLength=(LineLength-16) / 2
rem Build the heading for the output file with centered CARDS OF MONTH
rem and create the output file with this heading as first line.
set "HyphensHead="
for /L %%I in (1,1,%HeadLength%) do set "HyphensHead=!HyphensHead!-"
set "Heading=%HyphensHead% CARDS OF MONTH %HyphensHead%"
set /A HeadLength=HeadLength*2 + 16
if not %HeadLength% == %LineLength% set "Heading=%Heading%-"
echo %Heading%>"%OutputFile%"
set "TotalSumWrite=1"
for /F usebackq^ tokens^=*^ eol^= %%I in ("%TempFile2%") do (
set "DataRow=%%I"
if defined TotalSumWrite (
if not "!DataRow:~0,2!" == "z_" (
echo !DataRow!>>"%OutputFile%"
) else (
rem Append the total summary to the output file.
set "TotalSumWrite="
>>"%OutputFile%" echo %SpacesName: =-%%SpacesValue: =-%
set "DataName=TOTAL OF ALL:%SpacesName%"
set "DataName=!DataName:~0,%MaxNameLength%!"
call :FormatNumber %#TotalSum%
set "DataValue=%SpacesValue%!DataValue!"
set "DataValue=!DataValue:~-%ValueLength%!"
>>"%OutputFile%" echo !DataName!!DataValue!
>>"%OutputFile%" echo %SpacesName: =-%%SpacesValue: =-%
>>"%OutputFile%" echo !DataRow:~2!
)
) else echo !DataRow:~2!>>"%OutputFile%"
)
if defined TotalSumWrite (
>>"%OutputFile%" echo %SpacesName: =-%%SpacesValue: =-%
set "DataName=TOTAL OF ALL:%SpacesName%"
set "DataName=!DataName:~0,%MaxNameLength%!"
call :FormatNumber %#TotalSum%
set "DataValue=%SpacesValue%!DataValue!"
set "DataValue=!DataValue:~-%ValueLength%!"
>>"%OutputFile%" echo !DataName!!DataValue!
)
del "%TempFile2%" 2>nul
endlocal
goto EndBatch

rem The subroutine below reformats 0 to 99 to 0,00 to 0,99 and inserts
rem dots on larger values after a series of 3 digits left to the comma.
:FormatNumber
set "DataValue=%1"
if "%DataValue:~1,1%" == "" set "DataValue=0%DataValue%"
if "%DataValue:~2,1%" == "" set "DataValue=0%DataValue%"
set "DataValue=%DataValue:~0,-2%,%DataValue:~-2%"
if not "%DataValue:~6,1%" == "" set "DataValue=%DataValue:~0,-6%.%DataValue:~-6%"
if not "%DataValue:~10,1%" == "" set "DataValue=%DataValue:~0,-10%.%DataValue:~-10%"
goto :EOF
:EndBatch
endlocal

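Note 1: One line in the code must contain a horizontal tab character instead of the two spaces that the browser displays according to the HTML specification. Search for the comment containing the string horizontal tabs and, in the line below it, replace the spaces between : and = with a tab character.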

Note 2: Read the comments, which are the lines starting with rem. The batch code removes all " and all ! from every line before processing it.

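Note 3: The Windows command processor does not support floating-point arithmetic, only 32-bit signed integer arithmetic. Because the code removes the decimal comma and calculates in hundredths, any sum (including intermediate sums during the calculation) that exceeds 21.474.836,47 will be wrong. This limitation could be worked around with additional code if the data really requires it.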
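To make that limit concrete, here is a small sketch in Python (not batch, purely for illustration). It assumes that an overflowing sum wraps around in two's-complement fashion, which is how a signed 32-bit accumulator typically behaves; the helper names `wrap_int32` and `cents_to_text` are made up for this example.

```python
INT32_MAX = 2**31 - 1  # 2147483647, the largest signed 32-bit value


def wrap_int32(value: int) -> int:
    """Reduce an integer to the signed 32-bit range (assumed wrap-around)."""
    return (value + 2**31) % 2**32 - 2**31


def cents_to_text(cents: int) -> str:
    """Render integer hundredths with '.' for thousands and ',' for decimals."""
    sign = "-" if cents < 0 else ""
    whole, frac = divmod(abs(cents), 100)
    return f"{sign}{whole:,}".replace(",", ".") + f",{frac:02d}"


print(cents_to_text(INT32_MAX))                  # 21.474.836,47
print(cents_to_text(wrap_int32(INT32_MAX + 1)))  # -21.474.836,48
```

Adding a single hundredth to the largest representable sum flips it to a large negative number, which is why results beyond the limit look strange rather than merely imprecise.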

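Note 4: The batch code interprets two or more consecutive spaces as the separator between data columns. So if a string in the second data column happens to contain two consecutive spaces, the batch file produces a wrong result at least for that data row.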
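The column rule from Note 4 can be illustrated with a short Python sketch (again Python rather than batch). It is not the batch file's implementation, only the same idea expressed with a regular expression; `split_row` is a hypothetical helper.

```python
import re


def split_row(row: str):
    """Return (name, value) from a row whose columns are separated by 2+ spaces."""
    columns = re.split(r" {2,}", row.strip())
    # The value is the last column and the name sits two columns before it,
    # so a leading 01, 02, ... column is skipped automatically.
    return columns[-3], columns[-1]


with_prefix = "01  VERO BANRICOMPRAS A PRAZO                         914709              77,56"
without_prefix = "CIELO S/A CARTAO DEBITO                           312125              57,71"
# A name that accidentally contains two spaces is split into two columns.
broken_name = "VERO  CARTAO DEBITO                                338711             144,60"

print(split_row(with_prefix))     # ('VERO BANRICOMPRAS A PRAZO', '77,56')
print(split_row(without_prefix))  # ('CIELO S/A CARTAO DEBITO', '57,71')
print(split_row(broken_name))     # ('CARTAO DEBITO', '144,60')
```

The last call shows the failure mode described above: the row is still parsed, but its value is credited to the wrong name.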

Note 5: The batch file needs some time to finish, because the Windows command processor is not really designed for this kind of task.
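For comparison, here is a rough sketch of the same task in Python, one of the languages better suited to it. It is not a port of the batch file. It assumes every value has exactly two decimals and is not negative, and that names containing DEBITO, CREDITO, A VISTA or A PRAZO count toward the total while all others are listed after it. The function names and the `sample` rows are chosen for illustration.

```python
import re
from collections import defaultdict

CARD_MARKERS = ("DEBITO", "CREDITO", "A VISTA", "A PRAZO")


def to_cents(text: str) -> int:
    """'17.700,00' -> 1770000 (assumes exactly two decimals)."""
    return int(text.replace(".", "").replace(",", ""))


def to_text(cents: int) -> str:
    """1770000 -> '17.700,00' (assumes a non-negative amount)."""
    whole, frac = divmod(cents, 100)
    return f"{whole:,}".replace(",", ".") + f",{frac:02d}"


def summarize(lines):
    """Sum the last column per name; return (card sums, other sums)."""
    sums = defaultdict(int)
    for line in lines:
        columns = re.split(r" {2,}", line.strip())
        if len(columns) < 3:
            continue  # skip blank or malformed rows
        sums[columns[-3]] += to_cents(columns[-1])
    cards = {n: v for n, v in sums.items() if any(m in n for m in CARD_MARKERS)}
    others = {n: v for n, v in sums.items() if n not in cards}
    return cards, others


def report(lines):
    """Build the aligned report as a list of text lines."""
    cards, others = summarize(lines)
    total = sum(cards.values())
    rows = sorted(cards.items()) + [("TOTAL OF ALL:", total)] + list(others.items())
    name_width = max(len(name) for name, _ in rows) + 3
    value_width = max(len(to_text(value)) for _, value in rows)
    rule = "-" * (name_width + value_width)

    def fmt(name, value):
        return f"{name:<{name_width}}{to_text(value):>{value_width}}"

    out = [" CARDS OF MONTH ".center(len(rule), "-")]
    out += [fmt(name, value) for name, value in sorted(cards.items())]
    out += [rule, fmt("TOTAL OF ALL:", total), rule]
    out += [fmt(name, value) for name, value in others.items()]
    return out


sample = [
    "01  VERO BANRICOMPRAS A PRAZO                         914709              77,56",
    "VERO BANRICOMPRAS A PRAZO                         914710             322,58",
    "VERO CARTAO CREDITO                               391534             146,22",
    "RESGATE AUTOMATICO                                000000          17.700,00",
]
print("\n".join(report(sample)))
```

Python integers do not overflow, so the limit from Note 3 does not apply to this sketch.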

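To understand the commands used and how they work, open a command prompt window, run the following commands there, and carefully read all the help pages displayed for each command.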

  • call /?
  • del /?
  • echo /?
  • endlocal /?
  • findstr /?
  • for /?
  • goto /?
  • if /?
  • rem /?
  • set /?
  • setlocal /?
  • sort /?

See also:

  • Where does GOTO :EOF return to?
  • Single line with multiple commands using Windows batch file
  • Microsoft article about using command redirection operators

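Edit: Please use this code, which is an enhanced version of the code posted here. It outputs all sums correctly aligned even if the sum in a row below the TOTAL OF ALL: line has more characters than the total.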

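cmd is not good at math. There is the set /a command, but it can only handle 32-bit integers (-2,147,483,648 to 2,147,483,647). The characters . and , are not part of a number, so we have to remove them for the calculation and insert them again at the end.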
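The remove-then-reinsert idea can be shown in a few lines of Python (used here instead of batch only so the arithmetic is easy to follow). The three values are taken from the sample rows in the question; `to_cents` is a hypothetical helper.

```python
def to_cents(text: str) -> int:
    """'17.700,00' -> 1770000: drop the separators, read the rest as an integer."""
    return int(text.replace(".", "").replace(",", ""))


values = ["77,56", "322,58", "17.700,00"]
total = sum(to_cents(v) for v in values)
print(total)  # 1810014

# Reinsert the decimal comma with plain string slicing.
digits = str(total)
with_comma = digits[:-2] + "," + digits[-2:]
print(with_comma)  # 18100,14
```

The sum is carried as a whole number of hundredths the entire time; the comma only reappears when the result is turned back into text.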

Substrings can be extracted with the set command.

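The set commands after echo ------------- format the sum: they insert the dot and the comma and build a right-aligned string. You may need to adjust them a bit to handle shorter or longer numbers; I'll leave that to you.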
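As a hedged illustration of that adjustment, the Python sketch below groups the digits in a loop so that short and long sums are both handled, and then right-aligns with the same pad-and-cut trick. `group_thousands` and `right_align` are names invented for this example, and the default width of 20 is an assumption.

```python
def group_thousands(cents: int) -> str:
    """Format non-negative integer hundredths, e.g. 1810014 -> '18.100,14'."""
    digits = str(cents).rjust(3, "0")  # guarantee one digit before the comma
    whole, frac = digits[:-2], digits[-2:]
    groups = []
    while len(whole) > 3:
        groups.insert(0, whole[-3:])   # peel off three digits from the right
        whole = whole[:-3]
    groups.insert(0, whole)
    return ".".join(groups) + "," + frac


def right_align(text: str, width: int = 20) -> str:
    """Prepend spaces, then keep only the last `width` characters."""
    return (" " * width + text)[-width:]


for cents in (5, 40014, 1810014, 12511761):
    print("TOTAL OF ALL:" + right_align(group_thousands(cents)))
```

Note that `right_align` cuts off leading characters if the text is longer than `width`, so the width has to be chosen large enough for the biggest expected sum.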

@echo off
setlocal enabledelayedexpansion
(
for /f "delims=" %%a in (output.txt) do (
set "line=%%a"
set "second=!line:~28,30!"
set "fourth=!line:~90!"
set "summand=!fourth:.=!"
set "summand=!summand:,=!"
set /a sum+=summand
echo !second!!fourth!
)
echo -------------------------------------------
set "sum=!sum:~0,-2!,!sum:~-2!"
set "sum=!sum:~0,-6!.!sum:~-6!"
set "sum=                    !sum!"
set "sum=!sum:~-20!"
echo TOTAL OF ALL:          !sum!
)>out.txt
type out.txt

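Note: Because of the INT32 limit combined with removing the comma (which effectively multiplies the value by 100), your result may be strange if the sum is greater than 21.474.836,47.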
