HiveBrain v1.2.0
Get Started
← Back to all entries
patternshellMinor

Gather information about computers from multiple CSV files

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
csvcomputersfilesaboutmultiplefromgatherinformation

Problem

I created a script to import several CSV files from various sources and one CSV file with a list of systems in it. the script searches each CSV file to see if the system exist in the file and if it does then write the property information to a variable. The challenge that I have ran into is there are 26,000 systems to search for and so far the script has been running for over 24 hours and just now over half way through. Any ideas on how to speed this up?

```
$Final = @()

#Import CSV files
$Systems = Import-Csv C:\Projects\Master.csv
$vCenter = Import-Csv C:\Projects\vcenter.csv
$Storage = Import-Csv C:\Projects\Storage.csv
$SCCM = Import-Csv C:\Projects\SCCM.csv
$Database = Import-Csv C:\Projects\Database.csv
$OldAD = Import-Csv C:\Projects\AD_Old.csv
$ADprod = Import-Csv C:\Projects\AD.csv

Write-Host "Import Complete!"
$N = 0

foreach ($System in $Systems)
{
Write-Host "Line " $N

$Sys = New-Object System.Object
$Sys | Add-Member -type NoteProperty -name "System Name" -value $System.Name

#############################
#Database information Compare
#############################
If ($Database.Name -contains $System.Name)
{
#Get the system information from the CSV file being compared
$Domain = $Database | Where-Object { $_.Name -eq $System.name } | Select-object "DomainName" -ExpandProperty "DomainName"
$SQLin = $Database | Where-Object { $_.Name -eq $System.name } | Select-object "SQLInstance" -ExpandProperty "SQLInstance"
$Instance = $Database | Where-Object { $_.Name -eq $System.name } | Select-object "Instance" -ExpandProperty "Instance"
$OS = $Database | Where-Object { $_.Name -eq $System.name } | Select-object "OS" -ExpandProperty "OS"
$SQLver = $Database | Where-Object { $_.Name -eq $System.name } | Select-object "SQLVersion" -ExpandProperty "SQLVersion"
$SQLsp = $Database | Where-Object { $_.Name -eq $System.name } | Select-object "SQLServicePack" -ExpandProperty "SQLSe

Solution

Currently your code recreates $Final array in each iteration, scans each of the CSV files entirely in each iteration for each field using Where with a scriptblock(!), copies the $Sys object on each field (NoteProperty) addition.

All of the above are extremely expensive operations. Instead do the following:

  • Build hashtables from the source CSV files with the name field as a key, thus providing almost instant lookup instead of scanning the entire CSV array.



  • Build a hashtable of properties to add to the final object, then convert it to a custom object in just one operation at the end of the foreach body.



  • Use foreach statement as an expression so that its output is accumulated in the final array automatically without recreating the array in each iteration.




# import CSV as hashtables

$Systems = Import-Csv C:\Projects\Master.csv

function Index-Csv([string]$file) {
    $csv = @{}
    foreach ($rec in (Import-Csv $file)) {
        $csv[$rec.name] = $rec
    }
    $csv
}

$vCenter = Index-Csv C:\Projects\vcenter.csv
$Storage = Index-Csv C:\Projects\Storage.csv
$SCCM = Index-Csv C:\Projects\SCCM.csv
$Database = Index-Csv C:\Projects\Database.csv
$OldAD = Index-Csv C:\Projects\AD_Old.csv
$ADprod = Index-Csv C:\Projects\AD.csv



# process

$Final = foreach ($System in $Systems) {
    $Sys = @{"System Name" = $System.Name}

    if ($rec = $Database[$System.Name]) {
        $Sys += @{
            "Domain Name"                   = $rec.DomainName
            "SQL Instance"                  = $rec.SQLInstance
            "Database Instance"             = $rec.Instance
            "Database Name"                 = $rec.DBName
            "Operating System"              = $rec.OS
            "SQL Version"                   = $rec.SQLVersion
            "SQL Service Pack"              = $rec.SQLServicePack
            "SQL Edition"                   = $rec.SQLEdition
            "Operating System Service Pack" = $rec.ServicePack
            "System Architecture"           = $rec.SystemArchitecture
            "IP Address"                    = $rec.IP
            "Environment"                   = $rec.Environment
            "In Database File"              = "Yes"
        }
    } else {
        $Sys += @{"In Database File" = "No"}
    }

    #########################################
    # other CSV are processed just the same #
    #########################################

    [PSCustomObject]$Sys # PowerShell 3.0
    #New-Object PSObject -property $Sys # PowerShell 2.0
}


The above code shows only $Database, the other CSV tables are processed exactly the same.

Code Snippets

$Systems = Import-Csv C:\Projects\Master.csv

function Index-Csv([string]$file) {
    $csv = @{}
    foreach ($rec in (Import-Csv $file)) {
        $csv[$rec.name] = $rec
    }
    $csv
}

$vCenter = Index-Csv C:\Projects\vcenter.csv
$Storage = Index-Csv C:\Projects\Storage.csv
$SCCM = Index-Csv C:\Projects\SCCM.csv
$Database = Index-Csv C:\Projects\Database.csv
$OldAD = Index-Csv C:\Projects\AD_Old.csv
$ADprod = Index-Csv C:\Projects\AD.csv
$Final = foreach ($System in $Systems) {
    $Sys = @{"System Name" = $System.Name}

    if ($rec = $Database[$System.Name]) {
        $Sys += @{
            "Domain Name"                   = $rec.DomainName
            "SQL Instance"                  = $rec.SQLInstance
            "Database Instance"             = $rec.Instance
            "Database Name"                 = $rec.DBName
            "Operating System"              = $rec.OS
            "SQL Version"                   = $rec.SQLVersion
            "SQL Service Pack"              = $rec.SQLServicePack
            "SQL Edition"                   = $rec.SQLEdition
            "Operating System Service Pack" = $rec.ServicePack
            "System Architecture"           = $rec.SystemArchitecture
            "IP Address"                    = $rec.IP
            "Environment"                   = $rec.Environment
            "In Database File"              = "Yes"
        }
    } else {
        $Sys += @{"In Database File" = "No"}
    }

    #########################################
    # other CSV are processed just the same #
    #########################################

    [PSCustomObject]$Sys # PowerShell 3.0
    #New-Object PSObject -property $Sys # PowerShell 2.0
}

Context

StackExchange Code Review Q#144659, answer score: 2

Revisions (0)

No revisions yet.