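I have a solution, but I don't think it's the best approach because it takes a very long time, so I'm looking for a faster/better/smarter way.
I have multiple pscustomobject collections imported from .csv files. Each has at least one property in common. One is relatively small (about 200-300 items/rows), but the other is quite large (about 60,000-100,000 items). The contents of one may or may not match the contents of the other.
I need to find where the two collections match on a specific property, and then combine the properties of each into a single object that has all or most of the properties.
An example snippet of the code (not exact, but it should work; see the image for sample data): Data Sheet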
Write-Verbose "Pulling basic Fruit data together"
$Purchase = Import-Csv "C:\Purchase.csv"
$Selling = Import-Csv "C:\Selling.csv"
Write-Verbose "Combining Fruit names and removing duplicates"
$Fruits = $Purchase.Fruit
$Fruits += $Selling.Fruit
$Fruits = $Fruits | Sort-Object -Unique
$compareData = @()
Foreach ($Fruit in $Fruits) {
    $IndResults = @()
    $IndResults = [pscustomobject]@{
        #Adding Purchase and Selling data
        Farmer = $Purchase.Where({$PSItem.Fruit -eq $Fruit}).Farmer
        Region = $Purchase.Where({$PSItem.Fruit -eq $Fruit}).Region
        Water = $Purchase.Where({$PSItem.Fruit -eq $Fruit}).Water
        Market = $Selling.Where({$PSItem.Fruit -eq $Fruit}).Market
        Cost = $Selling.Where({$PSItem.Fruit -eq $Fruit}).Cost
        Tax = $Selling.Where({$PSItem.Fruit -eq $Fruit}).Tax
    }
    Write-Verbose "Loading Individual results into response"
    $CompareData += $IndResults
}
Write-Output $CompareData
I think the problem is this line:
Farmer = $Purchase.Where({$PSItem.Fruit -eq $Fruit}).Farmer
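If I understand this correctly, every time execution reaches this line it scans the entire $Purchase collection. I'm looking for a way to speed up the whole process, rather than having it search the whole collection on every match attempt.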
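To make that cost concrete, here is a minimal sketch that counts how often the filter script block runs. It uses made-up in-memory data (300 hypothetical rows named Fruit1 to Fruit300) rather than the real CSV files, and the `$script:evaluations` counter is an illustrative name that is not part of the original code.

```powershell
# Hypothetical stand-in for the imported CSV: 300 rows built in memory.
$Purchase = 1..300 | ForEach-Object {
    [pscustomobject]@{ Fruit = "Fruit$_"; Farmer = "Farmer$_" }
}

# Counts how many times the filter script block is evaluated.
$script:evaluations = 0

foreach ($Fruit in $Purchase.Fruit) {
    # .Where() with no mode argument tests every element, even after a match.
    $null = $Purchase.Where({ $script:evaluations++; $PSItem.Fruit -eq $Fruit })
}

"Filter evaluated $script:evaluations times for $($Purchase.Count) fruits"
```

With 300 rows this reports 90,000 evaluations for a single property. The snippet above repeats a scan like this for each of its six properties, and three of those scans run against the much larger collection.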
Using this Join-Object:
$Purchase | Join $Selling -On Fruit | Format-Table
Result (using Simon Catlin's data):
Fruit      Farmer  Region     Water Market  Cost Tax
-----      ------  ------     ----- ------  ---- ---
Apple      Adam    Alabama    1     MarketA 10   0.1
Cherry     Charlie Cincinnati 2     MarketC 20   0.2
Damson     Daniel  Derby      3     MarketD 30   0.3
Elderberry Emma    Eastbourne 4     MarketE 40   0.4
Fig        Freda   Florida    5     MarketF 50   0.5
Use Join-Object:
http://ramblingcookiemonster.github.io/Join-Object/
Join-Object -Left $purchase -Right $selling -LeftJoinProperty fruit -RightJoinProperty fruit -Type OnlyIfInBoth | ft
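I ran into this problem when trying to merge employee data from an HR system with employee data from an AD forest. With many thousands of rows, the process took a very long time.
I ended up dropping custom objects and going back to old-fashioned hash tables.
Each hash table entry then itself contains a sub-hash table with the data. In your case, the outer hash would be keyed on $fruit, with the sub-hash containing the various attributes, such as farm, region, etc.
Hash tables are lightning-fast by comparison. It's a shame PowerShell is so slow in this respect.
Shout if you need more info.
26/01 Sample code... assuming I have understood the requirement correctly:
Purchase.CSV: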
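The nested layout described above can be sketched roughly as follows. This is only an illustration of the data structure, not the exact code used for the HR/AD merge; the variable name `$fruitHash` is an assumption, and the rows are typed in by hand rather than read from a file.

```powershell
# Outer hashtable keyed by fruit name; each value is a sub-hashtable of attributes.
$fruitHash = @{}

# Hand-typed rows standing in for the purchase data.
$fruitHash['Apple']  = @{ Farmer = 'Adam';    Region = 'Alabama' }
$fruitHash['Cherry'] = @{ Farmer = 'Charlie'; Region = 'Cincinnati' }

# Merging selling data is a keyed lookup, not a scan of the whole collection.
$fruitHash['Apple']['Market'] = 'MarketA'

# ContainsKey reports whether a fruit exists without enumerating anything.
$fruitHash.ContainsKey('Damson')    # False
$fruitHash['Apple']['Market']       # MarketA
```

A hashtable created with `@{}` compares string keys case-insensitively, which matches how `-eq` compares strings by default.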
Fruit,Farmer,Region,Water
Apple,Adam,Alabama,1
Cherry,Charlie,Cincinnati,2
Damson,Daniel,Derby,3
Elderberry,Emma,Eastbourne,4
Fig,Freda,Florida,5
Selling.CSV:
Fruit,Market,Cost,Tax
Apple,MarketA,10,0.1
Cherry,MarketC,20,0.2
Damson,MarketD,30,0.3
Elderberry,MarketE,40,0.4
Fig,MarketF,50,0.5
Code:
[String] $Local:strPurchaseFile = 'c:\temp\purchase.csv';
[String] $Local:strSellingFile = 'c:\temp\selling.csv';
[HashTable] $Local:objFruitHash = @{};
[System.Array] $Local:objSelectStringHit = $null;
[String] $Local:strFruit = '';
if ( (Test-Path -LiteralPath $strPurchaseFile -PathType Leaf) -and (Test-Path -LiteralPath $strSellingFile -PathType Leaf) ) {

    #
    # Populate data from purchase file.
    #
    foreach ( $objSelectStringHit in (Select-String -LiteralPath $strPurchaseFile -Pattern '^([^,]+),([^,]+),([^,]+),([^,]+)$' | Select-Object -Skip 1) ) {
        $objFruitHash[ $objSelectStringHit.Matches[0].Groups[1].Value ] = @{
            'Farmer' = $objSelectStringHit.Matches[0].Groups[2].Value;
            'Region' = $objSelectStringHit.Matches[0].Groups[3].Value;
            'Water'  = $objSelectStringHit.Matches[0].Groups[4].Value;
        };
    } #foreach-purchase-row

    #
    # Populate data from selling file.
    #
    foreach ( $objSelectStringHit in (Select-String -LiteralPath $strSellingFile -Pattern '^([^,]+),([^,]+),([^,]+),([^,]+)$' | Select-Object -Skip 1) ) {
        $objFruitHash[ $objSelectStringHit.Matches[0].Groups[1].Value ] += @{
            'Market' = $objSelectStringHit.Matches[0].Groups[2].Value;
            'Cost'   = [Convert]::ToDecimal( $objSelectStringHit.Matches[0].Groups[3].Value );
            'Tax'    = [Convert]::ToDecimal( $objSelectStringHit.Matches[0].Groups[4].Value );
        };
    } #foreach-selling-row

    #
    # Output data. At this point, you could now build a PSCustomObject.
    #
    foreach ( $strFruit in ($objFruitHash.Keys | Sort-Object) ) {
        Write-Host -Object ( '{0,-15}{1,-15}{2,-15}{3,-10}{4,-10}{5,10:C}{6,10:P}' -f
            $strFruit,
            $objFruitHash[$strFruit]['Farmer'],
            $objFruitHash[$strFruit]['Region'],
            $objFruitHash[$strFruit]['Water'],
            $objFruitHash[$strFruit]['Market'],
            $objFruitHash[$strFruit]['Cost'],
            $objFruitHash[$strFruit]['Tax']
        );
    } #foreach

} else {
    Write-Error -Message 'File error.';
} #else-if
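The comment in the output section mentions building a PSCustomObject. A minimal sketch of that step might look like the following. `$objFruitHash` is rebuilt here from two sample rows so the block runs on its own, and `$results` and `$row` are illustrative names, not part of the code above.

```powershell
# Stand-in for the populated hash: fruit name -> sub-hashtable of attributes.
$objFruitHash = @{
    'Apple'  = @{ Farmer = 'Adam';    Region = 'Alabama';    Water = '1'; Market = 'MarketA'; Cost = 10d; Tax = 0.1d }
    'Cherry' = @{ Farmer = 'Charlie'; Region = 'Cincinnati'; Water = '2'; Market = 'MarketC'; Cost = 20d; Tax = 0.2d }
}

# Emit one object per fruit; capturing the loop output avoids growing an array with +=.
$results = foreach ($strFruit in ($objFruitHash.Keys | Sort-Object)) {
    $row = $objFruitHash[$strFruit]
    [pscustomobject]@{
        Fruit  = $strFruit
        Farmer = $row['Farmer']
        Region = $row['Region']
        Water  = $row['Water']
        Market = $row['Market']
        Cost   = $row['Cost']
        Tax    = $row['Tax']
    }
}

$results | Format-Table
```

If a fruit appears in only one of the files, the missing attributes simply come back as `$null` on the resulting object.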
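I needed to do something similar myself. I wanted to take two System.Array objects and compare them to pull out the matches, without having to manipulate the input data each time. This is the approach I used; I realise it is inefficient, but for the roughly 200 records I had to deal with it was instant.
I've tried to translate what I was doing (users and their old and new home directories) into farmers, fruit, markets and so on, so I hope it makes sense!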
$Purchase = Import-Csv "C:\Purchase.csv"
$Selling = Import-Csv "C:\Selling.csv"
$compareData = @()
$Total = 0
foreach ($iPurch in $Purchase) {
    foreach ($iSell in $Selling) {
        # -eq compares whole values; -match would treat the right side as a regex and also hit partial names
        if ($iPurch.Fruit -eq $iSell.Fruit) {
            Write-Host "Match found between $($iPurch.Fruit) and $($iSell.Fruit)"
            $hash = @{
                Fruit = $iPurch.Fruit
                Farmer = $iPurch.Farmer
                Region = $iPurch.Region
                Water = $iPurch.Water
                Market = $iSell.Market
                Cost = $iSell.Cost
                Tax = $iSell.Tax
            }
            $Build = New-Object PSObject -Property $hash
            $Total = $Total + 1
            $compareData += $Build
        }
    }
}
Write-Host "Processed $Total records"
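For larger inputs, the nested loops could be replaced by indexing one side in a hashtable first, along the lines the earlier answer suggests. The following is a rough sketch, not a drop-in replacement. It uses inline CSV text via ConvertFrom-Csv instead of the real files, assumes each Fruit appears at most once in the selling data, and needs PowerShell 5 or later for `::new()`. The name `$sellingByFruit` and the use of a generic list are illustrative choices.

```powershell
# Inline sample data so the block is self-contained; swap in Import-Csv for real files.
$Purchase = @'
Fruit,Farmer,Region,Water
Apple,Adam,Alabama,1
Cherry,Charlie,Cincinnati,2
'@ | ConvertFrom-Csv

$Selling = @'
Fruit,Market,Cost,Tax
Apple,MarketA,10,0.1
Fig,MarketF,50,0.5
'@ | ConvertFrom-Csv

# Index one side once (assumes each Fruit appears at most once in $Selling).
$sellingByFruit = @{}
foreach ($iSell in $Selling) { $sellingByFruit[$iSell.Fruit] = $iSell }

# A generic list appends in place; += on an array copies the array every time.
$compareData = [System.Collections.Generic.List[object]]::new()

foreach ($iPurch in $Purchase) {
    $iSell = $sellingByFruit[$iPurch.Fruit]   # one keyed lookup per purchase row
    if ($null -ne $iSell) {
        $compareData.Add([pscustomobject]@{
            Fruit  = $iPurch.Fruit
            Farmer = $iPurch.Farmer
            Region = $iPurch.Region
            Water  = $iPurch.Water
            Market = $iSell.Market
            Cost   = $iSell.Cost
            Tax    = $iSell.Tax
        })
    }
}

"Processed $($compareData.Count) records"
```

Each input is walked once, so the work grows with the sum of the two row counts rather than their product. If Fruit can repeat in the selling data, the index would need to hold a list of rows per key instead of a single row.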