How do I remove various strings from a set of other, longer strings?



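Background: the directory c:\documents is full of .doc and .xls files from various people. Somewhere in each file name there are initials that identify who edited the file. Each file name can contain one or more sets of initials. For now I am only interested in the .doc files. A sample of this directory looks like this: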

depot.inventory.20180921.[CMP]-[OxA](DOT)-(TTR).edited.doc
rack_location_(IIY)collected.2018.11.24.edit[UTS]_{POM}.doc

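The list goes on for hundreds of files. I want to produce copies of these files without the editors' initials and put them into a directory named c:\uniform.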

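The constants here are: each set of initials is 3 letters long, may be upper- or lower-case, and is enclosed in some kind of brackets. At any given time I have the list of editors' initials in a file, one set per line, for example: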

CMP
OXA
TTR
DOT
UTS
IIY
POM

That file holds about 100-150 names on any given day.

So far I have figured out how to remove one set of initials from all the .doc files, like this:

for /R "C:\documents" %%f in (*.doc) do (
    call :Sub %%~nf
)
rem Leave the script here so execution does not fall through into :Sub.
goto :EOF
:Sub
set str=%*
set str=%str:[DOT]=%
echo %str%

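In this snippet I use [DOT] as an example. I would like to turn the string [DOT] into a variable and read it from the editors' initials file. However, this has to be done many times for every document file.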

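So my batch program will iterate over all *.doc files in the source directory. For each file it will loop through the 100-150 names, remove those strings to form a new file name, and copy the old file from the source directory into the destination directory under the new file name, which is the source file name with the editors' initials removed.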

How do I write the second loop?

I am confused by the syntax.

Here is a commented batch file for this unusual file-copying task.

@echo off
setlocal EnableExtensions DisableDelayedExpansion
set "SourceFolder=C:\documents"
set "DestinationFolder=C:\uniform"
rem Is there no *.doc file to process in source directory?
if not exist "%SourceFolder%\*.doc" goto :EOF
rem Do nothing if the text file with editors' initials
rem does not exist in the batch file directory.
if not exist "%~dp0EditorsInitials.txt" goto :EOF
rem Create the destination directory if it does not already exist
rem and verify that the destination directory really exists.
md "%DestinationFolder%" 2>nul
if not exist "%DestinationFolder%" goto :EOF
rem Read the editors' initials from text file and create a space separated
rem list of them assigned to the environment variable EditorsInitials.
setlocal EnableDelayedExpansion
set "EditorsInitials="
for /F "usebackq" %%I in ("%~dp0EditorsInitials.txt") do set "EditorsInitials=!EditorsInitials! %%~I"
endlocal & set "EditorsInitials=%EditorsInitials:~1%"
rem For each non-hidden *.doc file in source directory get file name with
rem file extension and with path if there is one specified left to *.doc
rem and assign it to the environment variable FullFileName. The file name
rem only is assigned to the environment variable FileName. Then delayed
rem environment variable expansion is enabled again for running two nested
rem loops which run case-insensitive string substitutions on the file name
rem string value to remove the editors' initials from the file name. Next
rem one more loop is used to remove also .edited and .edit from file name.
rem The current *.doc file is finally copied with cleaned file name to
rem the configured destination directory. A date in file name remains.
for %%I in ("%SourceFolder%\*.doc") do (
    set "FullFileName=%%I"
    set "FileName=%%~nI"
    setlocal EnableDelayedExpansion
    for %%J in (%EditorsInitials%) do for %%K in ("-" "." "_" "") do (
        set "FileName=!FileName:%%~K[%%J]=!"
        set "FileName=!FileName:%%~K(%%J)=!"
        set "FileName=!FileName:%%~K{%%J}=!"
    )
    for %%J in (".edited" ".edit") do set "FileName=!FileName:%%~J=!"
    copy "!FullFileName!" "%DestinationFolder%\!FileName!%%~xI" >nul
    endlocal
)
endlocal

With the editors' initials list as posted stored in the file EditorsInitials.txt in the directory of the batch file, this batch file runs the command copy with the following arguments for the two example *.doc files:

"C:\documents\depot.inventory.20180921.[CMP]-[OxA](DOT)-(TTR).edited.doc" "C:\uniform\depot.inventory.20180921.doc"
"C:\documents\rack_location_(IIY)collected.2018.11.24.edit[UTS]_{POM}.doc" "C:\uniform\rack_locationcollected.2018.11.24.doc"
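
To watch the two nested loops in isolation, the following minimal sketch applies the same substitutions to one hard-coded name. The sample name and the shortened initials list come from the question and are placed inline purely for illustration; the batch file above reads the list from EditorsInitials.txt instead.

```batch
@echo off
setlocal EnableExtensions EnableDelayedExpansion
rem Sample base name and initials from the question, hard-coded for illustration.
set "FileName=depot.inventory.20180921.[CMP]-[OxA](DOT)-(TTR).edited"
rem Outer loop: each set of initials. Inner loop: an optional leading separator.
for %%J in (CMP OXA TTR DOT) do for %%K in ("-" "." "_" "") do (
    rem String substitution is case-insensitive, so OXA also matches [OxA].
    set "FileName=!FileName:%%~K[%%J]=!"
    set "FileName=!FileName:%%~K(%%J)=!"
    set "FileName=!FileName:%%~K{%%J}=!"
)
rem Remove the longer suffix first so that .edited does not leave "ed" behind.
for %%J in (".edited" ".edit") do set "FileName=!FileName:%%~J=!"
echo !FileName!
endlocal
```

Run from a command prompt, this should print depot.inventory.20180921, which is the base name used for the first copied file.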

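To understand the commands used and how they work, open a command prompt window, run the following commands there, and carefully read all the help pages displayed for each command.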

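  • call /? ... explains %~dp0 ... the drive and path of argument 0, which is the full path of the directory containing this batch file and always ends with a backslash.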
  • copy /?
  • echo /?
  • endlocal /?
  • for /?
  • goto /?
  • if /?
  • md /?
  • rem /?
  • set /?
  • setlocal /?

See also the Microsoft article about Using command redirection operators for an explanation of >nul and 2>nul.
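
As a small illustration of %~dp0, 2>nul and >nul, the sketch below can be saved as its own batch file and run. The folder name DemoFolder under %TEMP% is a hypothetical scratch location chosen only for this demonstration.

```batch
@echo off
rem %~dp0 is the drive and path of this batch file, ending with a backslash.
echo Batch file directory: "%~dp0"
rem 2>nul discards error output, such as the message printed by md
rem when the directory already exists (as on the second call here).
md "%TEMP%\DemoFolder" 2>nul
md "%TEMP%\DemoFolder" 2>nul
rem >nul discards standard output, such as "1 file(s) copied."
copy "%~f0" "%TEMP%\DemoFolder\" >nul
rem Remove the hypothetical scratch folder again.
rd /S /Q "%TEMP%\DemoFolder"
```

Apart from the echo line, the script should produce no output, because every message is redirected to the nul device.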

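This solution uses a regular expression in PowerShell. If you are on a supported Windows system, PowerShell will be available. It does assume that no one uses a vertical bar (|) as part of their initials, and that no one uses it as a bracket around initials. Change $DestinationDir to the location of your choice.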

Once you are confident that the files will be renamed correctly, remove -WhatIf from the Move-Item command.

=== Rename-Initials.ps1

# Source and destination directories; change both to your own paths.
$SourceDir = 'C:srctreninitials'
$DestinationDir = 'C:srctreninitialsuniform'
# Join the editors' initials into one alternation, e.g. CMP|OXA|TTR
$Editors = (Get-Content -Path $(Join-Path -Path $SourceDir -ChildPath 'Editors.txt')) -join '|'
# Brackets are regex metacharacters, so each one is escaped with a backslash.
$OpeningBrackets = @('\[', '\(', '\{') -join '|'
$ClosingBrackets = @('\]', '\)', '\}') -join '|'
$Regex = '(' + $OpeningBrackets + ')(' + $Editors + ')(' + $ClosingBrackets + ')'
$FileTypes = @('*.doc', '*.xls')
foreach ($FileType in $FileTypes) {
    Get-ChildItem -Path $SourceDir -File -Recurse -Filter $FileType |
        ForEach-Object {
            if ($_.Name -match $Regex) {
                $NewName = $_.Name -replace $Regex,''
                Move-Item -LiteralPath $_.FullName `
                    -Destination $(Join-Path -Path $DestinationDir -ChildPath $NewName) -WhatIf
            }
        }
}

If it must be invoked from a cmd.exe shell:

powershell -NoLogo -NoProfile -File "Rename-Initials.ps1"

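The task you are trying to accomplish is not that trivial, especially if you do not want to leave sequences of separators (such as periods, hyphens, underscores, etc.) behind after removing the bracketed parts of the strings.

Here is a script that removes the known editors' initials in brackets (predefined in a list file initials.txt in the current directory), one after another. If two adjacent separators (such as period, hyphen, and underscore, as well as comma, semicolon, and percent sign) would be left behind, the first one is removed; if there is no separator at all, the first defined one (the period) is inserted. Optionally, a potential tail consisting of a known suffix (such as edited or edit, as defined in the script) and its preceding separator is removed too. So here is the code, including some explanatory rem remarks: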

@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem // Define constants here:
set "_ROOT=C:\documents" & rem // (root directory; `.` is current, `%~dp0.` is script's parent)
set "_DEST=C:\uniform"   & rem // (destination directory)
set "_OVER="             & rem // (set this to `|` to overwrite existing files, or else to ``)
set "_LIST=initials.txt" & rem // (text file containing list of editors' initials, one per line)
set _MASKS="*.doc" "*.xls" & rem // (list of file patterns to process)
(set _LF=^
%= blank line =%
) & rem // (line-break)
set _PAREN=( )^%_LF%%_LF%[ ]^%_LF%%_LF%{ } & rem // (list of pairs of parentheses)
set _SEPAR=. - _ "," ";" %% & rem // (list of separators; do not use `=`, `~`, `!`, `^`)
set _TAILS="edited" "edit"  & rem // (optional list of suffixes to remove; may be empty)
rem // Change into root (source) directory:
pushd "%_ROOT%" && (
    rem // Iterate through all matching files:
    for %%F in (%_MASKS%) do (
        rem // Store full name of current file:
        set "FILE=%%~F" & set "NAME=%%~nxF"
        rem // Toggle delayed expansion to avoid trouble with `!`:
        setlocal EnableDelayedExpansion
        rem // Loop over the list of initials:
        for /F "usebackq delims= eol=|" %%E in ("%_LIST%") do (
            rem // Loop over trailing separators:
            for %%J in (. !_SEPAR! "") do (
                rem // Loop over leading separators:
                for %%I in (!_SEPAR! "") do (
                    rem // Loop over pairs of parentheses:
                    for /F "tokens=1,2" %%K in ("!_PAREN!") do (
                        rem // Conditionally remove parenthesised text from file name:
                        if not "%%~J"=="" (
                            set "NAME=!NAME:%%~I%%K%%E%%L%%~J=%%~J!"
                        ) else if not "%%~I"=="" (
                            set "NAME=!NAME:%%~I%%K%%E%%L%%~J=%%~I!"
                        ) else if defined _SEPAR (
                            set "NAME=!NAME:%%~I%%K%%E%%L%%~J=%_SEPAR:~,1%!"
                        ) else (
                            set "NAME=!NAME:%%~I%%K%%E%%L%%~J=.!"
                        )
                    )
                )
            )
        )
        rem // Process optional list of suffixes:
        if defined _TAILS (
            rem // Use `for /F` loop to split file name into base name and extension:
            for /F "delims= eol=|" %%N in (""!NAME!"") do (
                endlocal
                rem // Store file name components:
                set "NAME=%%~nxN" & set "EXT=%%~xN" & set "TEST=%%~nN|"
                setlocal EnableDelayedExpansion
                rem // Loop over suffixes:
                for %%M in (!_TAILS!) do (
                    rem // Loop over separators:
                    for %%J in (!_SEPAR!) do (
                        rem // Remove found suffix from base name:
                        if not "!TEST!"=="!TEST:%%~J%%~M|=!" (
                            set "NAME=!TEST:%%~J%%~M|=!!EXT!"
                        )
                    )
                )
            )
        )
        rem // Actually copy file to destination with the newly built name:
        if not exist "!_DEST!\!NAME!!_OVER!" (
            ECHO copy /Y "!FILE!" "!_DEST!\!NAME!"
        )
        endlocal
    )
    popd
)
endlocal
exit /B

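Configure the exact behaviour in the Define constants here: section at the top.

After having tested the output, remove the upper-case ECHO command to actually copy the files; to suppress the many lines of 1 file(s) copied. returned by the copy command, replace that ECHO with > nul.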
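
The separator rule can also be tried on its own. The following is a minimal sketch, not the script above: it handles a single set of initials in square brackets only, and the names INIT and SEPAR as well as the three sample file names are assumptions made up for this illustration.

```batch
@echo off
setlocal EnableExtensions EnableDelayedExpansion
rem Hypothetical initials and separator list, for illustration only.
set "INIT=DOT"
set "SEPAR=. - _"
for %%N in ("report.[DOT]-final" "report_[DOT]final" "report[DOT]final") do (
    set "NAME=%%~N"
    rem %%J is the trailing separator, %%I the leading one; "" means none.
    for %%J in (%SEPAR% "") do for %%I in (%SEPAR% "") do (
        if not "%%~J"=="" (
            rem A trailing separator exists: keep it, drop the leading one.
            set "NAME=!NAME:%%~I[%INIT%]%%~J=%%~J!"
        ) else if not "%%~I"=="" (
            rem Only a leading separator exists: keep it.
            set "NAME=!NAME:%%~I[%INIT%]=%%~I!"
        ) else (
            rem No separator on either side: insert a period.
            set "NAME=!NAME:[%INIT%]=.!"
        )
    )
    echo %%~N becomes !NAME!
)
endlocal
```

For the three sample names this should report report-final, report_final and report.final, one case for each branch of the rule.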
