我还不完全理解Import-Csv
中ForEach-Object
循环期间$_
的值。所以我不知道如何合并两个csv文件之间的记录(我可以找到匹配的记录,只是不知道如何将记录合并)。
我曾想过将两个csv文件中的两个$_
都指定为字符串,然后将两个字符串合并为一行,但我确信如何将$_
转换为字符串,而不会变成@{column1 = value1_1; column2 = value2_1}
与"value1_1","value2_1"
这样奇怪的东西。我甚至不确定这是否是最好的解决方案,或者powershell是否支持合并这两个记录(看起来像哈希表?)。
以下代码在第62行出错($MergedFileTable += $userInputRecord + $record
):
Function Main
{
$userInputCsvTable = Import-Csv 'C:ScriptsTestsTestCurrentUserInput.csv'
$userInputCsvFile = 'C:ScriptsTestsTestCurrentUserInput.csv'
$DatabaseCsvTable = Import-Csv 'C:ScriptsTestsTestCurrentDatabaseReport.csv'
$DatabaseCsvFile = 'C:ScriptsTestsTestCurrentDatabaseReport.csv'
$DatabaseCustomCsvFile = 'C:ScriptsTestsTestCurrentDatabaseReport_Custom.csv'
$MergedFileTable = @{}
$MergedFile = 'C:ScriptsTestsTestCurrentMerged.csv'
[String]$UserInputColumnIDsTitle = 'IDs'
[String]$UserInputColumnLastNameTitle = 'Last Name'
[String]$UserInputColumnFirstNameTitle = 'First'
[String]$DatabaseColumnUserLoginTitle = 'User Login'
[String]$DatabaseColumnFirstNameTitle = 'First Name'
[String]$DatabaseColumnLastNameTitle = 'Last Name'
[String]$DatabaseColumnUserStatusTitle = 'User Status'
[String]$DatabaseColumnDomain1Title = 'Domain1'
[String]$DatabaseColumnDomain2Title = 'Domain2'
[String]$DatabaseColumnDomain3Title = 'Domain3'
[String]$DatabaseColumnDomain4Title = 'Domain4'
[String]$DatabaseColumnDomain5Title = 'Domain5'
[String]$DatabaseColumnDomain6Title = 'Domain6'
Write-Host "Gathering one list of first and last names from file: $userInputCsvFile..."
[String[]]$userInputColumnsLastAndFirstNames = @()
$userInputCsvTable | ForEach-Object {
$userInputColumnsLastAndFirstNames += $_.$UserInputColumnLastNameTitle + ' ' + $_.$UserInputColumnFirstNameTitle
}
Write-Host "Complete."
Write-Host "Getting Number of Names..."
[Int]$numOfNames = $userInputColumnsLastAndFirstNames.Length
Write-Host "Complete."
Write-Host "Creating new Database table only with first and last names in list (side-by-side)..."
[Int]$DisplayCounter = 0
$DatabaseCsvTable | ForEach-Object {
$DisplayCounter++
If ($DisplayCounter % 10000 -eq 0)
{
Write-Host 'On record ' $DisplayCounter
}
If ($userInputColumnsLastAndFirstNames -contains ($_.$DatabaseColumnLastNameTitle + ' ' + $_.$DatabaseColumnFirstNameTitle))
{
$_
}
} | Export-Csv $DatabaseCustomCsvFile -NoTypeInformation -Force
Write-Host "Complete."
Write-Host "Creating merged file with duplicate records found..."
$DatabaseCustomCsvTable = Import-Csv $DatabaseCustomCsvFile
$userInputCsvTable | ForEach-Object {
$userInputRecord = $_
$firstName = $_.$UserInputColumnFirstNameTitle
$lastName = $_.$UserInputColumnLastNameTitle
$matchedTable = $DatabaseCustomCsvTable | Where-Object {($firstName -eq $_.$DatabaseColumnFirstNameTitle) -and ($lastName -eq $_.$DatabaseColumnLastNameTitle)}
If ($matchedTable)
{
ForEach ($record in $matchedTable)
{
$MergedFileTable += $userInputRecord + $record
}
}
}
Write-Host "Complete."
Write-Host "Exporting "
$MergedFileTable | Export-Csv $MergedFile -Force -NoTypeInformation
}
Main
错误:
Method invocation failed because [System.Management.Automation.PSObject] doesn't contain a method named 'op_Addition'.
At C:ScriptsTestsTestCurrenttest1.ps1:62 char:5
+ $MergedFileTable += $userInputRecord + $record
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidOperation: (op_Addition:String) [], RuntimeException
+ FullyQualifiedErrorId : MethodNotFound
有什么办法吗?
我想真的没有简单的方法。。。
# create 2 merged file contents
Write-Host "Searching DB (with each search found)
and creating 2 merged record file contents in memory (one with duplicate records) (one with single records)..."
$UserInputSearchNum = 1
$searchPrintOutputNum = 5
[String]$MergedFileColumnsLine = '"' + $UserInputColumnIDsTitle + '","' + $UserInputColumnFirstNameTitle + '","' + $UserInputColumnLastNameTitle + '","' + $UserInputColumnQLActiveTitle + '","' + $DbColumnUserLoginTitle + '","' + $DbColumnFirstNameTitle + '","' + $DbColumnLastNameTitle + '","' + $DbColumnUserStatusTitle + '","' + $DbColumnAdD1Title + '","' + $DbColumnAdD2Title + '","' + $DbColumnAdD3Title + '","' + $DbColumnAdD4Title + '","' + $DbColumnAdD5Title + '","' + $DbColumnAdD6Title + '"'
$MergedFileContentsSingleRecords = @($MergedFileColumnsLine)
$MergedFileContentsDuplicateRecords = @($MergedFileColumnsLine)
Import-Csv $LocalUserInputCsvReportWithPath | ForEach-Object {
If ($UserInputSearchNum % $searchPrintOutputNum -eq 0) # Lets user know which search it's on so user can estimate eta on completion
{
Write-Host "$UserInputSearchNum/$UserInputNumOfSearches searches complete"
}
[String]$userInputRecord = '"' + $_.$UserInputColumnIDsTitle + '","' + $_.$UserInputColumnFirstNameTitle + '","' + $_.$UserInputColumnLastNameTitle + '","' + $_.$UserInputColumnQLActiveTitle + '"'
$firstAndLastName = $_.$UserInputColumnFirstNameTitle + ' ' + $_.$UserInputColumnLastNameTitle
$matchedDbRecordTable = Import-Csv $LocalDbCustomReportFileWithPath | Where-Object {$firstAndLastName -eq ($_.$DbColumnFirstNameTitle + ' ' + $_.$DbColumnLastNameTitle)}
If ($matchedDbRecordTable)
{
If ($matchedDbRecordTable.length -gt 1)
{
$matchedDbRecordTable | ForEach-Object {
$MergedFileContentsDuplicateRecords += $userInputRecord + ',"' + $_.$DbColumnUserLoginTitle + '","' + $_.$DbColumnFirstNameTitle + '","' + $_.$DbColumnLastNameTitle + '","' + $_.$DbColumnUserStatusTitle + '","' + $_.$DbColumnAdD1Title + '","' + $_.$DbColumnAdD2Title + '","' + $_.$DbColumnAdD3Title + '","' + $_.$DbColumnAdD4Title + '","' + $_.$DbColumnAdD5Title + '","' + $_.$DbColumnAdD6Title + '"'
} | Out-Null
} Else {
$matchedDbRecordTable | ForEach-Object {
$MergedFileContentsSingleRecords += $userInputRecord + ',"' + $_.$DbColumnUserLoginTitle + '","' + $_.$DbColumnFirstNameTitle + '","' + $_.$DbColumnLastNameTitle + '","' + $_.$DbColumnUserStatusTitle + '","' + $_.$DbColumnAdD1Title + '","' + $_.$DbColumnAdD2Title + '","' + $_.$DbColumnAdD3Title + '","' + $_.$DbColumnAdD4Title + '","' + $_.$DbColumnAdD5Title + '","' + $_.$DbColumnAdD6Title + '"'
} | Out-Null
}
} Else {
$MergedFileContentsSingleRecords += $userInputRecord + ',"Not Found","Not Found","Not Found","Not Found","Not Found","Not Found","Not Found","Not Found","Not Found","Not Found"'
}
$UserInputSearchNum++
} | Out-Null
Write-Host "Complete."
Write-Host ''
以下是完整的测试脚本:https://gist.github.com/vardahoth/24649da587afc3d838234ab2398c8f80
以下是用户输入文件(用于测试):https://gist.github.com/vardahoth/864c90325d4e4dc4cf6921bf3b352ff5
以下是数据库输入文件(用于测试):https://gist.github.com/vardahoth/81beb4340b9a4d765ebe7b6361d12e30
这只需要运行测试几秒钟。但是,合并具有33000*34000条记录的两个文件大约需要60个小时(同时考虑在两个文件中发现的重复项)。
如果有人更快地知道一个库支持merge方法,那将是非常有帮助的。谢谢