导入Csv将记录分配给字符串(就像它在以记事本形式打开的Csv文件中一样)



我还不完全理解Import-CsvForEach-Object循环期间$_的值。所以我不知道如何合并两个csv文件之间的记录(我可以找到匹配的记录,只是不知道如何将记录合并)。

我曾想过将两个csv文件中的两个$_都指定为字符串,然后将两个字符串合并为一行,但我确信如何将$_转换为字符串,而不会变成@{column1 = value1_1; column2 = value2_1}"value1_1","value2_1"这样奇怪的东西。我甚至不确定这是否是最好的解决方案,或者powershell是否支持合并这两个记录(看起来像哈希表?)。

以下代码在第62行出错($MergedFileTable += $userInputRecord + $record):

Function Main
{
$userInputCsvTable = Import-Csv 'C:ScriptsTestsTestCurrentUserInput.csv'
$userInputCsvFile = 'C:ScriptsTestsTestCurrentUserInput.csv'
$DatabaseCsvTable = Import-Csv 'C:ScriptsTestsTestCurrentDatabaseReport.csv'
$DatabaseCsvFile = 'C:ScriptsTestsTestCurrentDatabaseReport.csv'
$DatabaseCustomCsvFile = 'C:ScriptsTestsTestCurrentDatabaseReport_Custom.csv'
$MergedFileTable = @{}
$MergedFile = 'C:ScriptsTestsTestCurrentMerged.csv'
[String]$UserInputColumnIDsTitle = 'IDs'
[String]$UserInputColumnLastNameTitle = 'Last Name'
[String]$UserInputColumnFirstNameTitle = 'First'
[String]$DatabaseColumnUserLoginTitle = 'User Login'
[String]$DatabaseColumnFirstNameTitle = 'First Name'
[String]$DatabaseColumnLastNameTitle = 'Last Name'
[String]$DatabaseColumnUserStatusTitle = 'User Status'
[String]$DatabaseColumnDomain1Title = 'Domain1'
[String]$DatabaseColumnDomain2Title = 'Domain2'
[String]$DatabaseColumnDomain3Title = 'Domain3'
[String]$DatabaseColumnDomain4Title = 'Domain4'
[String]$DatabaseColumnDomain5Title = 'Domain5'
[String]$DatabaseColumnDomain6Title = 'Domain6'
Write-Host "Gathering one list of first and last names from file: $userInputCsvFile..."
[String[]]$userInputColumnsLastAndFirstNames = @()
$userInputCsvTable | ForEach-Object {
    $userInputColumnsLastAndFirstNames += $_.$UserInputColumnLastNameTitle + ' ' + $_.$UserInputColumnFirstNameTitle
}
Write-Host "Complete."
Write-Host "Getting Number of Names..."
[Int]$numOfNames = $userInputColumnsLastAndFirstNames.Length
Write-Host "Complete."
Write-Host "Creating new Database table only with first and last names in list (side-by-side)..."
[Int]$DisplayCounter = 0
$DatabaseCsvTable | ForEach-Object {
    $DisplayCounter++
    If ($DisplayCounter % 10000 -eq 0)
    {
        Write-Host 'On record ' $DisplayCounter
    }
    If ($userInputColumnsLastAndFirstNames -contains ($_.$DatabaseColumnLastNameTitle + ' ' + $_.$DatabaseColumnFirstNameTitle))
    {
        $_
    }
} | Export-Csv $DatabaseCustomCsvFile -NoTypeInformation -Force
Write-Host "Complete."
Write-Host "Creating merged file with duplicate records found..."
$DatabaseCustomCsvTable = Import-Csv $DatabaseCustomCsvFile
$userInputCsvTable | ForEach-Object {
    $userInputRecord = $_
    $firstName = $_.$UserInputColumnFirstNameTitle
    $lastName = $_.$UserInputColumnLastNameTitle
    $matchedTable = $DatabaseCustomCsvTable | Where-Object {($firstName -eq $_.$DatabaseColumnFirstNameTitle) -and ($lastName -eq $_.$DatabaseColumnLastNameTitle)}
    If ($matchedTable)
    {
        ForEach ($record in $matchedTable)
        {
            $MergedFileTable += $userInputRecord + $record
        }
    }
}
Write-Host "Complete."
Write-Host "Exporting "
$MergedFileTable | Export-Csv $MergedFile -Force -NoTypeInformation
}
Main

错误:

Method invocation failed because [System.Management.Automation.PSObject] doesn't contain a method named 'op_Addition'.
At C:ScriptsTestsTestCurrenttest1.ps1:62 char:5
+                 $MergedFileTable += $userInputRecord + $record
+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (op_Addition:String) [], RuntimeException
    + FullyQualifiedErrorId : MethodNotFound

有什么办法吗?

我想真的没有简单的方法。。。

# create 2 merged file contents
Write-Host "Searching DB (with each search found)
and creating 2 merged record file contents in memory (one with duplicate records) (one with single records)..."
$UserInputSearchNum = 1
$searchPrintOutputNum = 5
[String]$MergedFileColumnsLine = '"' + $UserInputColumnIDsTitle + '","' + $UserInputColumnFirstNameTitle + '","' + $UserInputColumnLastNameTitle + '","' + $UserInputColumnQLActiveTitle + '","' + $DbColumnUserLoginTitle + '","' + $DbColumnFirstNameTitle + '","' + $DbColumnLastNameTitle + '","' + $DbColumnUserStatusTitle + '","' + $DbColumnAdD1Title + '","' + $DbColumnAdD2Title + '","' + $DbColumnAdD3Title + '","' + $DbColumnAdD4Title + '","' + $DbColumnAdD5Title + '","' + $DbColumnAdD6Title + '"'
$MergedFileContentsSingleRecords = @($MergedFileColumnsLine)
$MergedFileContentsDuplicateRecords = @($MergedFileColumnsLine)
Import-Csv $LocalUserInputCsvReportWithPath | ForEach-Object {
    If ($UserInputSearchNum % $searchPrintOutputNum -eq 0) # Lets user know which search it's on so user can estimate eta on completion
    {
        Write-Host "$UserInputSearchNum/$UserInputNumOfSearches searches complete"
    }
    [String]$userInputRecord = '"' + $_.$UserInputColumnIDsTitle + '","' + $_.$UserInputColumnFirstNameTitle + '","' + $_.$UserInputColumnLastNameTitle + '","' + $_.$UserInputColumnQLActiveTitle + '"'
    $firstAndLastName = $_.$UserInputColumnFirstNameTitle + ' ' + $_.$UserInputColumnLastNameTitle
    $matchedDbRecordTable = Import-Csv $LocalDbCustomReportFileWithPath | Where-Object {$firstAndLastName -eq ($_.$DbColumnFirstNameTitle + ' ' + $_.$DbColumnLastNameTitle)}
    If ($matchedDbRecordTable)
    {
        If ($matchedDbRecordTable.length -gt 1)
        {
            $matchedDbRecordTable | ForEach-Object {
                $MergedFileContentsDuplicateRecords += $userInputRecord + ',"' + $_.$DbColumnUserLoginTitle + '","' + $_.$DbColumnFirstNameTitle + '","' + $_.$DbColumnLastNameTitle + '","' + $_.$DbColumnUserStatusTitle + '","' + $_.$DbColumnAdD1Title + '","' + $_.$DbColumnAdD2Title + '","' + $_.$DbColumnAdD3Title + '","' + $_.$DbColumnAdD4Title + '","' + $_.$DbColumnAdD5Title + '","' + $_.$DbColumnAdD6Title + '"'
            } | Out-Null
        } Else {
            $matchedDbRecordTable | ForEach-Object {
                $MergedFileContentsSingleRecords += $userInputRecord + ',"' + $_.$DbColumnUserLoginTitle + '","' + $_.$DbColumnFirstNameTitle + '","' + $_.$DbColumnLastNameTitle + '","' + $_.$DbColumnUserStatusTitle + '","' + $_.$DbColumnAdD1Title + '","' + $_.$DbColumnAdD2Title + '","' + $_.$DbColumnAdD3Title + '","' + $_.$DbColumnAdD4Title + '","' + $_.$DbColumnAdD5Title + '","' + $_.$DbColumnAdD6Title + '"'
            } | Out-Null
        }
    } Else {
        $MergedFileContentsSingleRecords += $userInputRecord + ',"Not Found","Not Found","Not Found","Not Found","Not Found","Not Found","Not Found","Not Found","Not Found","Not Found"'
    }
    $UserInputSearchNum++
} | Out-Null
Write-Host "Complete."
Write-Host ''

以下是完整的测试脚本:https://gist.github.com/vardahoth/24649da587afc3d838234ab2398c8f80

以下是用户输入文件(用于测试):https://gist.github.com/vardahoth/864c90325d4e4dc4cf6921bf3b352ff5

以下是数据库输入文件(用于测试):https://gist.github.com/vardahoth/81beb4340b9a4d765ebe7b6361d12e30

这只需要运行测试几秒钟。但是,合并具有33000*34000条记录的两个文件大约需要60个小时(同时考虑在两个文件中发现的重复项)。

如果有人更快地知道一个库支持merge方法,那将是非常有帮助的。谢谢

最新更新