Splitting a PowerShell script into two separate parts



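I have this script that connects to SharePoint Online, indexes all files and folders, downloads them systematically, and generates a .csv with the file name, folder, size, path, and so on.

For various reasons, I ended up in a situation where I have all the data, but the metadata (the aforementioned .csv file) is corrupted.

Unfortunately, re-running the whole script just for that is not really an option, because it takes about 90 hours.

I have been trying to break the code apart to remove the "download files" functionality and keep the part that generates the .csv, but so far no luck.

I have found the function that seems to be responsible for it (WriteLog), but I am struggling to separate it from the other functions.

P.S. The code is not mine; I inherited it from a developer who is no longer reachable (unfortunately).

Please find the code below: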

param(
[Parameter(Mandatory = $true)]
[string]$srcUrl,
[Parameter(Mandatory = $true)]
[string]$username,
[Parameter(Mandatory = $false,HelpMessage = "From Date: (dd/mm/yyyy)")]
[string]$fromDate,
[Parameter(Mandatory = $false,HelpMessage = "To Date: (dd/mm/yyyy)")]
[string]$toDate,
[Parameter(Mandatory = $true)]
[string]$folderPath,
[Parameter(Mandatory = $true)]
[string]$csvPath
) #end param
cls
#Load SharePoint CSOM Assemblies
Add-Type -Path "C:\Program Files\SharePoint Online Management Shell\Microsoft.Online.SharePoint.PowerShell\Microsoft.SharePoint.Client.dll"
Add-Type -Path "C:\Program Files\SharePoint Online Management Shell\Microsoft.Online.SharePoint.PowerShell\Microsoft.SharePoint.Client.Runtime.dll"
$global:OutFilePath = -join ($csvPath,"Documents.csv")
$global:OutFilePathError = -join ($csvPath,"ErrorLog_GetDocuments.csv")
$header = "Title,Type,Parent,Name,Path,FileSize(bytes),Created,Created by,Modified,Modified by,Matterspace title,Matterspace url"
$srcLibrary = "Documents"
$securePassword = Read-Host -Prompt "Enter your password: " -AsSecureString
$credentials = New-Object Microsoft.SharePoint.Client.SharePointOnlineCredentials ($username,$securePassword)
$sUrl = [System.Uri]$srcUrl
$domainUrl = -join ("https://",$sUrl.Host)
function WriteLog
{
param(
[Parameter(Mandatory = $true)] $title,$type,$folderName,$name,$path,$fileSize,$created,$createdby,$modified,$modifiedby,$matterspacetitle,$materspaceUrl
)
$nowTime = Get-Date -Format "dd-MMM-yy,HH:mm:ss"
$folderName = $folderName.Replace(",","|") ### sometime folder / file name has comma so replace it with something
$name = $name.Replace(",","|")
#$path = $path.Replace(",","|")
$title=[System.String]::Concat("""""""$title""""""")
$type=[System.String]::Concat("""""""$type""""""")
$folderName=[System.String]::Concat("""""""$folderName""""""")
$name=[System.String]::Concat("""""""$name""""""")
$path=[System.String]::Concat("""""""$path""""""")
$fileSize=[System.String]::Concat("""""""$fileSize""""""")
$created=[System.String]::Concat("""""""$created""""""")
$createdby=[System.String]::Concat("""""""$createdby""""""")
$modified=[System.String]::Concat("""""""$modified""""""")
$modifiedby=[System.String]::Concat("""""""$modifiedby""""""")
$matterspacetitle=[System.String]::Concat("""""""$matterspacetitle""""""")
$materspaceUrl=[System.String]::Concat("""""""$materspaceUrl""""""")     
$lineContent = "$("$title"),$($type),$($folderName),$($name),$($path),$($fileSize),$($created),$($createdby),$($modified),$($modifiedby),$($matterspacetitle),$($materspaceUrl)"
Add-Content -Path $global:OutFilePath -Value "$lineContent" 
}
#Function to get all files of a folder
Function Get-FilesFromFolder([Microsoft.SharePoint.Client.Folder]$Folder,$SubWeb,$MTitle)
{
Write-host -f Yellow "Processing Folder:"$Folder.ServerRelativeUrl
$folderItem = $Folder.ListItemAllFields
#$srcContext.Load($f)
$Ctx.Load($folderItem)
$Ctx.ExecuteQuery()
#Get All Files of the Folder
$Ctx.load($Folder.files)
$Ctx.ExecuteQuery()
$authorEmail = $folderItem["Author"].Title
$editorEmail = $folderItem["Editor"].Title
$filepath = $folderItem["FileDirRef"]
if([string]::IsNullOrEmpty($filepath))
{
$filepath=$Folder.ServerRelativeUrl
}
$created = $folderItem["Created"]
$modified = $folderItem["Modified"]
$title = $folderItem["Title"]
if ([string]::IsNullOrEmpty($title))
{
$title = "Not Specified"
}
#$fileSize = $fItem["File_x0020_Size"]
$fileName = $Folder.Name    
#list all files in Folder
write-host $Folder.Name
$splitString=$Folder.ServerRelativeUrl -split('/')
$dirUrl="";
write-host $splitString.Length
$parentUrl=""
For($i=3; $i -le $splitString.Length;$i++)
{
if($splitString[$i] -notcontains('.'))
{
Write-Host $i
Write-Host $splitString[$i]
$dirUrl=-join($dirUrl,"\",$splitString[$i])
$parentUrl=-join($parentUrl,"\",$splitString[$i+1])
}
}
$dirPath = -join ($folderPath,$dirUrl)
WriteLog $title "Folder" $parentUrl.TrimEnd('\') $fileName $filepath 0 $created $authorEmail $modified $editorEmail $MTitle $SubWeb
write-host $dirPath
if (-not (Test-Path -Path $dirPath))
{
New-Item -ItemType directory -Path $dirPath
}
ForEach ($File in $Folder.files)
{
try{
$remarkDetail = ""
$replacedUser = ""
$fItem = $File.ListItemAllFields
#$srcContext.Load($f)
$Ctx.Load($fItem)
$Ctx.ExecuteQuery()
$authorEmail = $fItem["Author"].Email
$editorEmail = $fItem["Editor"].Email
$filepath = $fItem["FileDirRef"]
$fileSizeBytes = $fItem["File_x0020_Size"];
$fileSize = ($fileSizeBytes) / 1MB
$fileName = $fItem["FileLeafRef"]
$title = $fItem["Title"]
$filecreated = $fitem["Created"]
$fileModified = $fitem["Modified"]
$FileUrl = $fItem["FileRef"]
$Fname=$File.Name
if ([string]::IsNullOrEmpty($title))
{
$title = "Not Specified"
}
#$title,$type, $folderName,$name,$path,$fileSize,$created,$createdby,$modifed,$modifiedby,$matterspacetitle,$materspaceUrl
$dateToCompare = Get-Date (Get-Date -Date $fileModified -Format 'dd/MM/yyyy')
#Get the File Name or do something
if (($dateToCompare -ge $startDate -and $dateToCompare -le $endDate) -or ($startDate -eq $null -and $endDate -eq $null))
{
$downloadUrl = -join ($dirPath,$File.Name)
$fromfile = -join ($domainUrl,$FileUrl)
Write-Host "Downloading the file from " $fromfile -ForegroundColor Cyan
try{
$webclient = New-Object System.Net.WebClient
$webclient.Credentials = New-Object Microsoft.SharePoint.Client.SharePointOnlineCredentials ($username,$securePassword)
$webclient.Headers.Add("X-FORMS_BASED_AUTH_ACCEPTED","f")
$webclient.DownloadFile($fromfile,$downloadUrl)
$webclient.Dispose()
}
catch{
$ErrorMessage=$_.Exception.Message
$ErrorMessage = $ErrorMessage -replace "`t|`n|`r",""
$ErrorMessage = $ErrorMessage -replace " ;|; ",";"
$lineContent = "$($Fname),$($fromfile ),$($ErrorMessage)"
Add-Content -Path $global:OutFilePathError -Value "$lineContent"    
Write-Host "Skipping the file and recalling the function" -ForegroundColor Blue
}
WriteLog $title "File" $Folder.Name $fileName $FileUrl $fileSize $created $authorEmail $modified $editorEmail $MTitle $SubWeb
Write-host -f Magenta $File.Name
}
else
{
Write-Host "Skipping the matterspace :" $title " as the matterspace was not in the date range" -ForegroundColor Blue
}
}
catch{
$ErrorMessage=$_.Exception.Message
$ErrorMessage = $ErrorMessage -replace "`t|`n|`r",""
$ErrorMessage = $ErrorMessage -replace " ;|; ",";"
$lineContent = "$($Fname),$($fromfile ),$($ErrorMessage)"
Add-Content -Path $global:OutFilePathError -Value "$lineContent"    
}
}
#Recursively Call the function to get files of all folders
$Ctx.load($Folder.Folders)
$Ctx.ExecuteQuery()
#Exclude "Forms" system folder and iterate through each folder
ForEach($SubFolder in $Folder.Folders | Where {$_.Name -ne "Forms"})
{
Get-FilesFromFolder -Folder $SubFolder -SubWeb $SubWeb -Mtitle $MTitle
}
}
Function Get-SPODocLibraryFiles()
{
param
(
[Parameter(Mandatory=$true)] [string] $SiteURL,
[Parameter(Mandatory=$true)] [string] $LibraryName
)

#Setup the context
$Ctx = New-Object Microsoft.SharePoint.Client.ClientContext($SiteURL)
$Ctx.Credentials = $credentials
$srcWeb = $Ctx.Web
$childWebs = $srcWeb.Webs
$Ctx.Load($childWebs)
$Ctx.ExecuteQuery()
foreach ($childweb in $childWebs)
{
try
{
#Get the Library and Its Root Folder
$Library=$childweb.Lists.GetByTitle($LibraryName)
$Ctx.Load($Library)
$Ctx.Load($Library.RootFolder)
$Ctx.ExecuteQuery()
#Call the function to get Files of the Root Folder
if($childweb.Url.ToLower() -notlike "*ehcontactus*" -and $childweb.Url.ToLower() -notlike "*ehfaqapp*" -and $childweb.Url.ToLower() -notlike "*ehquicksearch*" -and $childweb.Url.ToLower() -notlike "*ehsiteapps*" -and $childweb.Url.ToLower() -notlike "*ehsitelist*" -and $childweb.Url.ToLower() -notlike "*ehwelcomeapp*" -and $childweb.Url.ToLower() -notlike "*ehimageviewer*")
{
Get-FilesFromFolder -Folder $Library.RootFolder -SubWeb $childweb.Url -MTitle $childweb.Title
}
}
catch{
write-host "Skipping the matterpsace as the library does not exists" -ForegroundColor Blue
}
}


}
#Config Parameters
#$SiteURL= "https://impigerspuat.sharepoint.com/sites/ELeave/Eleave1/adminuat@impigerspuat.onmicrosoft.com"
$LibraryName="Documents"
#$securePassword = Read-Host -Prompt "Enter your password: " -AsSecureString 
#Call the function to Get All Files from a document library
if (-not ([string]::IsNullOrEmpty($fromDate)))
{
$startDate = Get-Date (Get-Date -Date $fromDate -Format 'dd/MM/yyyy')
}
else
{
$startDate = $null;
}
if (-not ([string]::IsNullOrEmpty($toDate)))
{
$endDate = Get-Date (Get-Date -Date $toDate -Format 'dd/MM/yyyy')
}
else
{
$endDate = $null
}
Get-SPODocLibraryFiles -SiteURL $srcUrl -LibraryName $LibraryName

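Have you tried running just that function on its own and passing it the parameters it asks for?

Copy the code into a WriteLog.ps1 file, then call the script file with arguments.

For example:

.\WriteLog.ps1 $srcUrl $username $fromDate $toDate $folderPath $csvPath

Obviously, supply real values in place of the variables.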
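As a rough illustration of calling a script file with arguments, the sketch below writes a tiny throwaway script with a param block and invokes it using named parameters. The file name Demo-WriteLog.ps1, the contoso URL, and the C:\Temp\ path are made up for the example, and it assumes the execution policy allows running local scripts.

```powershell
# Create a throwaway script (hypothetical name) whose param block mirrors
# two of the parameters from the original script.
$scriptPath = Join-Path ([System.IO.Path]::GetTempPath()) 'Demo-WriteLog.ps1'
Set-Content -Path $scriptPath -Value @'
param(
    [Parameter(Mandatory = $true)][string]$srcUrl,
    [Parameter(Mandatory = $true)][string]$csvPath
)
"srcUrl=$srcUrl; csvPath=$csvPath"
'@

# Named parameters are less error-prone than relying on positional order.
$result = & $scriptPath -srcUrl 'https://contoso.sharepoint.com/sites/demo' -csvPath 'C:\Temp\'
$result

# Clean up the throwaway script.
Remove-Item -Path $scriptPath
```

Passing values by name also makes it obvious when a mandatory parameter has been left out, since PowerShell prompts for it.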

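FWIW, extracting the relevant pieces of code from someone else's script is a great skill to practice. Everything you want to do has been done before, but you may have to pick apart someone else's work before you can adapt it to your exact environment.

Unfortunately, it looks like you will have to do this the old-fashioned way. The problem is that the author writes to the log (the .csv) while downloading each file, instead of downloading to a staging area first...

I would suggest setting a breakpoint early in the code and then stepping through to see exactly how it flows. That should give you a general idea, and enough information to start writing the refactored code.

Reverse engineering is always tough, to say the least; be prepared for this to be a methodical exercise.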
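For what it is worth, a breakpoint can also be set from the console with Set-PSBreakpoint rather than through an IDE. The sketch below uses a made-up stand-in function, Write-DemoLog, so it runs on its own; with the real script you would target WriteLog or a script line instead. The empty -Action block is only there so the example counts hits without pausing.

```powershell
# Hypothetical stand-in for the real WriteLog, so this sketch runs on its own.
function Write-DemoLog {
    param([string]$Name)
    "logged $Name"
}

# An empty -Action block makes the breakpoint record hits without pausing.
# Omit -Action to stop in the debugger instead, then step with
# s (step into), v (step over), o (step out) or c (continue).
$bp = Set-PSBreakpoint -Command Write-DemoLog -Action { }

$output = 'a.docx', 'b.docx' | ForEach-Object { Write-DemoLog -Name $_ }

# HitCount shows how many times the function was entered.
$hits = $bp.HitCount
Remove-PSBreakpoint -Breakpoint $bp
```

To stop at a specific line of the inherited script instead, Set-PSBreakpoint also accepts -Script together with -Line.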

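Bad news: this will be an iterative process rather than a single "solution". There is nothing "wrong" with the code, but a few design choices make this a challenge. It is not indented consistently, and it weaves all of the variable assignments together in slightly different ways. It looks better than most of my code; I am only pointing out what makes it a challenge.

Good news: at least the WriteLog function is self-contained. All it really does is append content to the .csv file named in this variable, which is assigned here: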

$global:OutFilePath = -join ($csvPath,"Documents.csv")

(line 20 in my copy)
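Because the function only appends one row to a file, it can be exercised without any SharePoint connection. The sketch below is not the original WriteLog: it is a hypothetical replacement, Write-MetadataRow, that builds an object per row and lets Export-Csv handle quoting, so commas in names do not need to be swapped for "|". Only a subset of the original columns is shown to keep it short, and the sample values are invented.

```powershell
# Hypothetical alternative to WriteLog: one object per row, quoting left to Export-Csv.
function Write-MetadataRow {
    param(
        [Parameter(Mandatory = $true)][string]$CsvFile,
        [Parameter(Mandatory = $true)][string]$Title,
        [Parameter(Mandatory = $true)][string]$Type,
        [string]$Parent,
        [string]$Name,
        [string]$Path,
        [long]$FileSizeBytes = 0
    )
    [pscustomobject]@{
        Title             = $Title
        Type              = $Type
        Parent            = $Parent
        Name              = $Name
        Path              = $Path
        'FileSize(bytes)' = $FileSizeBytes
    } | Export-Csv -Path $CsvFile -Append -NoTypeInformation
}

# Write one made-up row to a temporary file and read it back.
$csvFile = Join-Path ([System.IO.Path]::GetTempPath()) ("Documents_{0}.csv" -f [guid]::NewGuid())
Write-MetadataRow -CsvFile $csvFile -Title 'Not Specified' -Type 'File' -Parent 'Contracts' `
    -Name 'Lease, final.docx' -Path '/sites/demo/Documents/Contracts/Lease, final.docx' -FileSizeBytes 2048

$rows = @(Import-Csv -Path $csvFile)
$rows | Format-List

Remove-Item -Path $csvFile
```

Export-Csv writes the header on the first call, so the separate $header string would not be needed with this approach.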

*

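Suggestions (this is one approach, meant only as a guide toward a final solution):

  1. Take the existing code and put it into an IDE so you can see it more easily. Windows PowerShell ISE is sufficient, but I highly recommend VSCode.

  2. Comment out the last line: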

Get-SPODocLibraryFiles -SiteURL $srcUrl -LibraryName $LibraryName

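so that you keep whatever other context from the script you actually want to retain.

  3. Create a separate function along these lines:
function Get-FilesFromLocalFolder ($localdir, $SubWeb, $MTitle)

to take the place of the existing Get-FilesFromFolder function. That way you can walk whatever directory you want, pick up the files, and assign the variables to pass as arguments. Then, when you call WriteLog, it will look very similar. The last two parameters ($SubWeb, $MTitle) are only passed along because WriteLog expects them. You can set them to labels of your own, or remove them and make them optional in WriteLog.

  4. You can start by hard-coding a value for each of the function's required parameters, then run it to see whether the output works.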
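A minimal sketch of what such a function might look like follows, assuming the downloaded files sit in a local folder tree. Everything here is illustrative: it emits objects instead of calling WriteLog, the temporary folder and file are invented for the demo, and the "Created by" and "Modified by" columns are left out because that information is not stored on the local disk. Note also that local timestamps reflect the download, not the SharePoint values.

```powershell
# Illustrative only: walks a local folder and emits one object per folder or file.
function Get-FilesFromLocalFolder {
    param(
        [Parameter(Mandatory = $true)][string]$LocalDir,
        [string]$SubWeb = '',
        [string]$MTitle = ''
    )
    Get-ChildItem -LiteralPath $LocalDir -Recurse | ForEach-Object {
        $isFolder = $_.PSIsContainer
        $type = if ($isFolder) { 'Folder' } else { 'File' }
        $size = if ($isFolder) { 0 } else { $_.Length }
        [pscustomobject]@{
            Title               = 'Not Specified'
            Type                = $type
            # Name of the containing folder, like the Parent column in the original CSV.
            Parent              = Split-Path -Path (Split-Path -Path $_.FullName -Parent) -Leaf
            Name                = $_.Name
            Path                = $_.FullName
            'FileSize(bytes)'   = $size
            Created             = $_.CreationTime   # local timestamp, not SharePoint's
            Modified            = $_.LastWriteTime  # local timestamp, not SharePoint's
            'Matterspace title' = $MTitle
            'Matterspace url'   = $SubWeb
        }
    }
}

# Demo against a throwaway folder tree with hard-coded parameter values.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("spo_demo_{0}" -f [guid]::NewGuid())
$sub = Join-Path $root 'Contracts'
New-Item -ItemType Directory -Path $sub -Force | Out-Null
[System.IO.File]::WriteAllText((Join-Path $sub 'Lease.docx'), 'hello')

$rows = @(Get-FilesFromLocalFolder -LocalDir $root -SubWeb 'https://contoso.sharepoint.com/sites/demo' -MTitle 'Demo matter')
$rows | Format-Table Type, Parent, Name, 'FileSize(bytes)'

Remove-Item -LiteralPath $root -Recurse -Force
```

Piping $rows to Export-Csv with -NoTypeInformation would then produce the .csv in a single pass, without touching SharePoint.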

This will take you a few iterations (agreeing with @Steven), and it is definitely a worthwhile exercise (agreeing with @TheIdesOfMark). :)
