
PowerShell: search millions of files as fast as possible


Problem

I once asked a similar question about C#. Now I have the same problem in PowerShell.

What is the fastest way to find files modified within the last 15 minutes in a file system with more than 1 million files?

Is there any faster way than using the pipeline?

Get-ChildItem -Path $path -Recurse | Select Name, PSIsContainer, Directory, LastWriteTime, Length | where {($_.LastWriteTime -gt (Get-Date).AddMinutes(-15))}


I have already dropped some properties to reduce the object size, but it still takes ages.

Solution

First, you don't need to call Get-Date for every file. Just call it once at the beginning:

$t = (Get-Date).AddMinutes(-15)

Get-ChildItem -Path $path -Recurse | 
    Select Name, PSIsContainer, Directory, LastWriteTime, Length | 
    where {($_.LastWriteTime -gt $t)}


That saves about 10% (as measured by Measure-Command).
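A comparison like this can be reproduced with Measure-Command. The following is a minimal sketch, not a benchmark: it builds a small throwaway directory (the `scan-demo-…` folder name and the 200-file count are assumptions for illustration) and times both variants. On a tree this small the difference may be lost in noise, so point `$path` at the real directory to get meaningful numbers.

```powershell
# Hypothetical throwaway directory so the sketch runs anywhere; use your real $path instead.
$path = Join-Path ([System.IO.Path]::GetTempPath()) ("scan-demo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $path | Out-Null
1..200 | ForEach-Object { Set-Content -Path (Join-Path $path "file$_.txt") -Value $_ }

# Variant 1: Get-Date is evaluated once per file inside the filter.
$perFile = Measure-Command {
    Get-ChildItem -Path $path -Recurse |
        Where-Object { $_.LastWriteTime -gt (Get-Date).AddMinutes(-15) }
}

# Variant 2: the cutoff is computed once, before the pipeline starts.
$t = (Get-Date).AddMinutes(-15)
$once = Measure-Command {
    Get-ChildItem -Path $path -Recurse |
        Where-Object { $_.LastWriteTime -gt $t }
}

"Get-Date per file: {0:N0} ms" -f $perFile.TotalMilliseconds
"Cutoff cached:     {0:N0} ms" -f $once.TotalMilliseconds

# Clean up the demo directory.
Remove-Item -Path $path -Recurse -Force
```

Measure-Command returns a TimeSpan and discards the pipeline output, so only the elapsed time is compared here.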

Second, you don't need to call Select-Object for every file either. Filter first, so that Select-Object only processes the files that pass the date check:

$t = (Get-Date).AddMinutes(-15)

Get-ChildItem -Path $path -Recurse | 
    where {($_.LastWriteTime -gt $t)} |
    Select Name, PSIsContainer, Directory, LastWriteTime, Length


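Third, try increasing the buffer size with the OutBuffer common parameter, which makes Get-ChildItem accumulate objects and send them down the pipeline in batches instead of one at a time: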

$t = (Get-Date).AddMinutes(-15)

Get-ChildItem -Path $path -Recurse -OutBuffer 1000 | 
    where {($_.LastWriteTime -gt $t)} |
    Select Name, PSIsContainer, Directory, LastWriteTime, Length


I've used 1000, but you can experiment with the value.
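One way to experiment is to time the same pipeline with a few candidate buffer sizes. This is a rough sketch under assumptions: the candidate values (100, 1000, 10000) and the throwaway `outbuffer-demo-…` directory are placeholders chosen for illustration, and timings on such a small tree say little about a million-file share.

```powershell
# Hypothetical throwaway directory so the sketch runs anywhere; use your real $path instead.
$path = Join-Path ([System.IO.Path]::GetTempPath()) ("outbuffer-demo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $path | Out-Null
1..200 | ForEach-Object { Set-Content -Path (Join-Path $path "file$_.txt") -Value $_ }

$t = (Get-Date).AddMinutes(-15)

# Time the filter-then-select pipeline once per candidate buffer size.
$results = foreach ($size in 100, 1000, 10000) {
    $elapsed = Measure-Command {
        Get-ChildItem -Path $path -Recurse -OutBuffer $size |
            Where-Object { $_.LastWriteTime -gt $t } |
            Select-Object Name, PSIsContainer, Directory, LastWriteTime, Length
    }
    [pscustomobject]@{ OutBuffer = $size; Milliseconds = $elapsed.TotalMilliseconds }
}

$results | Format-Table -AutoSize

# Clean up the demo directory.
Remove-Item -Path $path -Recurse -Force
```

File system caching can favor whichever run comes later, so it is worth repeating the loop a few times before settling on a value.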

Together, those three changes cut the running time to less than half of the original on my system.

Context

StackExchange Code Review Q#78294, answer score: 16
