Multiple FINDSTR commands to get the desired result



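The idea is to get the URLs for which a 404 error was found, together with the IDs above them to show which IDs each URL belongs to, and additionally to find the FileName text and add it to the output file.

I have been trying to loop over FINDSTR so that I can look up lines based on the line numbers found previously. Can anyone help?

Sample file: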

FileName:  LastABC-1563220.xml
-------------------------------
123456786
12348
1234DEF
-------------------------------
http://Product.com/1234DEF
HTTP/1.1 404 Not Found - 0.062000
http://Product.com/1234DEF_1
HTTP/1.1 200 OK - 0.031000
123456785
12349
1234EFG
-------------------------------
http://Product.com/1234EFG
HTTP/1.1 200 OK - 0.031000
123456784
12340
1234FGH
-------------------------------
http://Product.com/1234FGH
HTTP/1.1 200 OK - 0.031000
http://Product.com/1234FGH_1
HTTP/1.1 404 Not Found - 0.079000
http://Product.com/1234FGH_2
HTTP/1.1 404 Not Found - 0.067000
http://Product.com/1234FGH_4
HTTP/1.1 404 Not Found - 0.047000

Desired output:

FileName:  LastABC-1563220.xml
123456786 12348 1234DEF
http://Product.com/1234DEF
123456784 12340 1234FGH
http://Product.com/1234FGH_1
http://Product.com/1234FGH_2
http://Product.com/1234FGH_4

The script I have so far:

del "%FailingURLS%" 2>nul
set numbers=
for /F "delims=:" %%a in ('findstr /I /N /C:"404 Not Found" %Formatedfile%') do (
    set /A before=%%a-1
    set "numbers=!numbers!!before!: "
)
(for /F "tokens=1* delims=:" %%a in ('findstr /N "^" %Formatedfile% ^| findstr /B "%numbers%"') do echo %%b) > %FailingURLS%

This is how I would do it:

@echo off
setlocal EnableDelayedExpansion
del PreviousLines.txt 2>nul
set "ids="
(for /F "delims=" %%a in (test.txt) do (
    set "line=%%a"
    if "!line:~0,9!" equ "FileName:" (
        echo(!line!>> PreviousLines.txt
    ) else if "!line:~0,5!" equ "http:" (
        if defined ids echo(!ids!>> PreviousLines.txt
        set "ids="
        echo(!line!>> PreviousLines.txt
    ) else if "!line:~0,4!" equ "HTTP" (
        rem It is an "OK" or "Not Found" line...
        rem If it is "Not Found", show previous lines
        if "!line:Not Found=!" neq "!line!" type PreviousLines.txt
        rem Anyway, reset previous lines
        del PreviousLines.txt 2>nul
        set "ids="
    ) else if "!line:~0,5!" neq "-----" (
        set "ids=!ids!!line! "
    )
)) > FailingURLS.txt

Output:

FileName:  LastABC-1563220.xml
123456786 12348 1234DEF 
http://Product.com/1234DEF
http://Product.com/1234FGH_1
http://Product.com/1234FGH_2
http://Product.com/1234FGH_4

I don't understand why you show the IDs 123456784 12340 1234FGH before http://Product.com/1234FGH_1, because those IDs belong to http://Product.com/1234FGH...
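The test `if "!line:Not Found=!" neq "!line!"` in the script above is a substring check: it deletes every occurrence of "Not Found" from the line and compares the result with the original, so the two differ only when the substring is present. A minimal stand-alone sketch of that idiom follows; the sample line is hard-coded here purely for illustration.

```batch
@echo off
setlocal EnableDelayedExpansion
rem Sample response line copied from the example file (hard-coded for illustration).
set "line=HTTP/1.1 404 Not Found - 0.062000"
rem Removing "Not Found" changes the string only when the substring is present.
if "!line:Not Found=!" neq "!line!" (
    echo Line reports "Not Found"
) else (
    echo Line does not report "Not Found"
)
```

Note that this kind of search-and-replace expansion ignores case, so it would also match "not found".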

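Your question is too broad, so the following example only shows a way to retrieve the "404" URLs from the file, which I take to be your main problem.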

@Echo Off
SetLocal EnableExtensions DisableDelayedExpansion
Set "Src=formattedfile.txt"
Set "Str=404 Not Found"
(Set LF=^
% 0x0A %
)
For /F %%A In ('Copy /Z "%~f0" Nul')Do Set "CR=%%A"
SetLocal EnableDelayedExpansion
FindStr /RC:".*!CR!*!LF!.*%Str%" "%Src%"
EndLocal
Pause

Just modify the value on line 3 to match the name of your formatted text file.

Output for the file content you provided:

http://Product.com/1234DEF
http://Product.com/1234FGH_1
http://Product.com/1234FGH_2
http://Product.com/1234FGH_4
Press any key to continue . . .

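Here is a script (let us call it extract-failed-urls.bat) that demonstrates one possible way to accomplish the task, with quite a few explanatory rem comments to help you understand what happens: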

@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem // Define constants here:
set "_FILE=%~1"      & rem // (`%~1` represents the first command line argument)
set "_URLP=://"      & rem // (partial string that every listed URL contains)
set "_RESP=HTTP/1.1" & rem // (partial string that every response begins with)
set "_ERRN=404"      & rem // (specific error number in response to recognise)
rem // Determine the total number of lines contained in the given file:
(for /F %%C in ('^< "%_FILE%" find /C /V ""') do set "CNT=%%C") || goto :EOF
rem // Read from the given file:
< "%_FILE%" (
    rem // Clear IDs and URL buffers, and preset flag:
    set "IDS=" & set "URL=" & set "FLAG=#"
    setlocal EnableDelayedExpansion
    rem // Read and write first line of file separately:
    set /A "CNT-=1" & set "LINE=" & set /P LINE="" & < nul set /P ="!LINE!"
    rem // Loop through the remaining lines:
    for /L %%I in (1,1,!CNT!) do (
        rem // Read a line and process only non-empty one:
        set /P LINE="" && (
            rem // Try to split off response prefix:
            set "REST=!LINE:*%_RESP% =!"
            rem // Determine kind of current line:
            if "!LINE:-=!" == "" (
                rem // Line contains only hyphens `-`, so clear URL buffer:
                set "URL="
            ) else if not "!LINE!" == "!LINE:*%_URLP%=!" (
                rem // Line contains an URL, so store to URL buffer, set flag:
                set "URL=!LINE!" & set "FLAG=#"
            ) else if "!LINE!" == "%_RESP% !REST!" (
                rem // Line contains a response, so gather number:
                for /F %%R in ("!REST!") do (
                    rem /* Specific error encountered, hence write IDs, if any,
                    rem    clear IDs buffer, then write stored URL, if any: */
                    if "%%R" == "%_ERRN%" (
                        if defined IDS echo/& echo(!IDS!
                        set "IDS=" & if defined URL echo(!URL!
                    )
                )
                rem // Clear URL buffer and set flag:
                set "URL=" & set "FLAG=#"
            ) else (
                rem /* No other condition fulfilled, hence line contains an ID,
                rem    so put ID into IDs buffer, clear URL buffer and flag: */
                if defined FLAG (set "IDS=!LINE!") else set "IDS=!IDS! !LINE!"
                set "URL=" & set "FLAG="
            )
        )
    )
    endlocal
)
endlocal
exit /B

To run it against an input file named sample.txt, use a command line like this:

extract-failed-urls.bat "sample.txt"

To write the output to another file named failed-urls.txt, use the following command:

extract-failed-urls.bat "sample.txt" > "failed-urls.txt"

With the data from the sample input file in the question, the output looks like this:

FileName:  LastABC-1563220.xml
123456786 12348 1234DEF
http://Product.com/1234DEF
123456784 12340 1234FGH
http://Product.com/1234FGH_1
http://Product.com/1234FGH_2
http://Product.com/1234FGH_4

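This approach distinguishes the following kinds of input lines; recognising each kind triggers the corresponding actions:

  1. The first line (the one beginning with FileName:):
    • simply output the line unedited (without a trailing line break);
  2. A line containing only hyphens (-------------------------------):
    • clear the buffer that holds the (last) URL;
  3. A line containing a URL, that is, a line containing ://:
    • store the URL in the buffer (overwriting its content);
    • set the flag that causes the ID buffer to be cleared (later);
  4. A line holding a response, that is, a line beginning with HTTP/1.1 + SPACE:
    • if the error number is 404:
      • output the content of the ID buffer (if any);
      • clear the ID buffer;
      • output the content of the URL buffer (if any);
    • clear the buffer that holds the (last) URL;
    • set the flag that causes the ID buffer to be cleared (later);
  5. A line containing an ID, that is, every other line:
    • if the flag for clearing the ID buffer is set, clear the buffer;
    • append the ID to the ID buffer (space-separated);
    • clear the buffer that holds the (last) URL;
    • reset the flag for clearing the ID buffer;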
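The classification itself can be tried in isolation. The following is only a sketch under stated assumptions: the four sample lines are hard-coded from the example file, and the response test uses a plain prefix comparison (`!LINE:~0,9!`) rather than the prefix-removal comparison used in the script, which should be equivalent for this data.

```batch
@echo off
setlocal EnableExtensions EnableDelayedExpansion
rem Classify a few sample lines (hard-coded for illustration only).
for %%L in (
    "-------------------------------"
    "http://Product.com/1234DEF"
    "HTTP/1.1 404 Not Found - 0.062000"
    "1234DEF"
) do (
    set "LINE=%%~L"
    if "!LINE:-=!" == "" (
        rem Nothing is left after removing all hyphens.
        echo separator: !LINE!
    ) else if not "!LINE!" == "!LINE:*://=!" (
        rem Removing everything up to "://" changed the line, so it contains a URL.
        echo URL:       !LINE!
    ) else if "!LINE:~0,9!" == "HTTP/1.1 " (
        rem The line starts with the response prefix followed by a space.
        echo response:  !LINE!
    ) else (
        rem Everything else is treated as an ID.
        echo ID:        !LINE!
    )
)
endlocal
```

The order of the tests matters: the hyphen test comes first, and the ID case is the fallback, exactly as in the list above.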

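Here is a simpler approach. It relies on the fact that every ID block in the input file consists of exactly three lines followed by a hyphen-only line, after which pairs of URL and response lines appear (if this is not the case, an error message is displayed):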

@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem // Define constants here:
set "_FILE=%~1"      & rem // (`%~1` represents the first command line argument)
set "_URLP=://"      & rem // (partial string that every listed URL contains)
set "_RESP=HTTP/1.1" & rem // (partial string that every response begins with)
set "_ERRN=404"      & rem // (specific error number in response to recognise)
rem // Determine the total number of lines contained in the given file:
(for /F %%C in ('^< "%_FILE%" find /C /V ""') do set "CNT=%%C") || goto :EOF
rem // Read from the given file:
< "%_FILE%" (
    rem // Clear IDs buffer and such for previous lines:
    set "IDS=#" & set "PREV1=" & set "PREV2="
    setlocal EnableDelayedExpansion
    rem // Read and write first line of file separately:
    set /A "CNT-=1" & set "LINE=" & set /P LINE="" & < nul set /P ="!LINE!"
    rem // Read and check second line of file separately:
    set /A "CNT-=1" & set "LINE=" & set /P LINE="" & if not "!LINE:-=!" == "" goto :ERROR
    rem // Loop through the remaining lines:
    set /A "CNT/=2" & for /L %%I in (1,1,!CNT!) do (
        rem // Read a line and process only non-empty one:
        set /P LINE1="" && (
            rem // Read another line and process only non-empty one:
            set /P LINE2="" && (
                rem // Determine kind of first line:
                if not "!LINE1!" == "!LINE1:*%_URLP%=!" (
                    rem /* First line contains an URL, so next line must be a response;
                    rem    hence try to split off response prefix: */
                    set "REST=!LINE2:*%_RESP% =!"
                    rem // Check second line whether it is really a response:
                    if "!LINE2!" == "%_RESP% !LINE2:*%_RESP% =!" (
                        rem // Line indeed contains a response, so gather number:
                        for /F %%R in ("!REST!") do (
                            rem /* Specific error encountered, hence write IDs, if any,
                            rem    clear IDs buffer, then write URL from first line: */
                            if "%%R" == "%_ERRN%" (
                                if defined IDS echo/& echo(!IDS!
                                set "IDS=" & echo(!LINE1!
                            )
                        )
                    ) else goto :ERROR
                    rem // Clear buffers for previous lines:
                    set "PREV1=" & set "PREV2="
                ) else (
                    rem /* First line does not contain an URL, so it contains an ID,
                    rem    hence check if buffers for previous lines already contain
                    rem    data, which must be IDs, so store them all in IDs buffer,
                    rem    and check if the second line contains only hyphens `-`: */
                    if defined PREV1 if "!LINE2:-=!" == "" (
                        set "IDS=!PREV1! !PREV2! !LINE1!"
                    ) else goto :ERROR
                    rem // Store both lines into buffer for previous lines:
                    set "PREV1=!LINE1!" & set "PREV2=!LINE2!"
                )
            ) || exit /B 0
        ) || exit /B 0
    )
    endlocal
)
endlocal
exit /B
:ERROR
if defined IDS > con echo/
if "!" == "" endlocal
>&2 echo ERROR: expected file format violated!
exit /B 2

The calling convention and the output for the given input data are the same as above.
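Both scripts read the input the same way: they count the lines with `find /C /V ""`, redirect the file into a parenthesised block, and call `set /P` once per line. A stripped-down sketch of just that reading technique is shown below; the file name sample.txt is an assumption, and the loop merely echoes each line with its number.

```batch
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem Hypothetical input file name; adjust as needed.
set "_FILE=sample.txt"
rem Count all lines of the file, including empty ones:
set "CNT="
for /F %%C in ('^< "%_FILE%" find /C /V ""') do set "CNT=%%C"
if not defined CNT exit /B 1
rem Redirect the file into the block so that every SET /P reads the next line:
< "%_FILE%" (
    setlocal EnableDelayedExpansion
    for /L %%I in (1,1,%CNT%) do (
        rem Clear LINE first, because SET /P leaves it unchanged for an empty line.
        set "LINE=" & set /P LINE=""
        echo(%%I: !LINE!
    )
    endlocal
)
endlocal
exit /B
```

Unlike `for /F`, this technique does not skip empty lines, but `set /P` only reads about 1 KB per line, so it suits files with short lines such as the one in the question.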
