Whenever I look at improving the performance of a script, one option is often to parallelise the execution of some of its workflows. This generally adds complexity and will not necessarily make the whole process much faster, but for long-running processes made of independent sub-tasks it can be a very effective strategy.
A recent comment gave me the idea for this article. The subject can get very complex if we go deep, but I will do my best to keep it simple while still giving you some good examples.
Serial Processing
I’m sure everybody is familiar with the ForEach-Object cmdlet, which lets you iterate over a list of objects passed as input, as in this example:
```powershell
1..10 | ForEach-Object { Write-Output $_ }

1
2
3
4
5
6
7
8
9
10
```
The benefit of serial processing is that the code is very easy to read: the elements are processed sequentially, one after the other, so the behaviour is predictable.
Parallel Processing
PowerShell 7 introduced ForEach-Object -Parallel. In the following script, we will measure its performance and see how it works.
```powershell
#Requires -Version 7

# THIS SCRIPT COMPARES SERIAL AND PARALLEL EXECUTION OF A FOREACH LOOP
# Paolo Frigo, https://www.scriptinglibrary.com

Write-Output "Serial Execution of 10 tasks of 1 seconds"
Measure-Command -Expression { 1..10 | ForEach-Object { Start-Sleep -Seconds 1 } }

Write-Output "Parallel Execution of 10 tasks of 1 seconds "
Measure-Command -Expression { 1..10 | ForEach-Object -Parallel { Start-Sleep -Seconds 1 } }

# Setting ThrottleLimit to 10 instead of 5 (default value)
# Interesting blog article from Paul Higinbotham: https://devblogs.microsoft.com/powershell/powershell-foreach-object-parallel-feature/
Write-Output "Parallel Execution (throttlelimit = 10 ) of 10 tasks of 1 seconds"
Measure-Command -Expression { 1..10 | ForEach-Object -Parallel { Start-Sleep -Seconds 1 } -ThrottleLimit 10 }
```
This is the output:
```
Serial Execution of 10 tasks of 1 seconds

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 10
Milliseconds      : 17
Ticks             : 100175032
TotalDays         : 0.000115943324074074
TotalHours        : 0.00278263977777778
TotalMinutes      : 0.166958386666667
TotalSeconds      : 10.0175032
TotalMilliseconds : 10017.5032

Parallel Execution of 10 tasks of 1 seconds
Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 2
Milliseconds      : 210
Ticks             : 22101352
TotalDays         : 2.55802685185185E-05
TotalHours        : 0.000613926444444444
TotalMinutes      : 0.0368355866666667
TotalSeconds      : 2.2101352
TotalMilliseconds : 2210.1352

Parallel Execution (throttlelimit = 10 ) of 10 tasks of 1 seconds
Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 1
Milliseconds      : 240
Ticks             : 12402164
TotalDays         : 1.43543564814815E-05
TotalHours        : 0.000344504555555556
TotalMinutes      : 0.0206702733333333
TotalSeconds      : 1.2402164
TotalMilliseconds : 1240.2164
```
One important thing to note is the -ThrottleLimit parameter, which has a default value of 5. It caps how many script blocks run at the same time, so depending on your case you may need to change it, as shown in my example above, to reduce the total execution time of your script.
One article that I found very useful was written by Paul Higinbotham, and I recommend reading it.
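The measured times above (about 2 seconds with the default limit, about 1 second with a limit of 10) fit a simple back-of-the-envelope model. The sketch below is only a rough estimate: it assumes every task takes the same time and ignores the start-up cost of each runspace. `Get-EstimatedParallelSeconds` is a hypothetical helper written for this illustration, not a built-in cmdlet.

```powershell
# Hypothetical helper: estimates wall-clock time for equally long tasks.
function Get-EstimatedParallelSeconds {
    param(
        [int]$TaskCount,
        [double]$TaskSeconds,
        [int]$ThrottleLimit = 5   # same default as ForEach-Object -Parallel
    )
    # With equal durations, tasks complete in "waves" of at most $ThrottleLimit.
    $waves = [int][math]::Ceiling([double]$TaskCount / $ThrottleLimit)
    return $waves * $TaskSeconds
}

# 10 one-second tasks with the default limit of 5: two waves
Get-EstimatedParallelSeconds -TaskCount 10 -TaskSeconds 1

# 10 one-second tasks with a limit of 10: a single wave
Get-EstimatedParallelSeconds -TaskCount 10 -TaskSeconds 1 -ThrottleLimit 10
```

The real timings are a little higher than the estimate (2.2 s and 1.2 s in the run above) because creating the runspaces has a cost of its own.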
Parallel Execution using JOBS
Background jobs are another way of parallelising your workflows. The code may be harder to read than ForEach-Object -Parallel in PowerShell 7, but you don’t need PowerShell Core to run it.
```powershell
# THIS SCRIPT TESTS PARALLEL EXECUTION USING JOBS
# Paolo Frigo, https://www.scriptinglibrary.com

$ServerList = "www.google.com", "www.bing.com", "www.yahoo.com"

# CLEAR THE JOB LIST
Get-Job | Remove-Job

# START A LIST OF JOBS
$ServerList | ForEach-Object {
    Start-Job -Name "$_" -ScriptBlock { param ($Target) Test-Connection -ComputerName $Target -Count 1 } -ArgumentList $_
}

# Note that the job states relevant here are: RUNNING, COMPLETED, FAILED

# WAIT FOR ALL JOBS TO COMPLETE, UP TO THE TIMEOUT LIMIT,
# TO PREVENT THE SCRIPT FROM RUNNING FOREVER
$Timeout = 60 # seconds
$Counter = 0
do { Start-Sleep -Seconds 1; $Counter += 1 } while ((Get-Job).State -contains "Running" -and $Timeout -gt $Counter)

# GET ALL THE RESULTS WITH KEEP (WITHOUT DELETING THEM)
$ServerList | ForEach-Object { Receive-Job -Name $_ -Keep }

# CLEAR THE JOB LIST
Get-Job | Remove-Job # removes the job objects; without the KEEP flag only the results are discarded on retrieval
```
A background job can be in several states; the ones that matter here are RUNNING, COMPLETED and FAILED. Once the jobs are started, we need to monitor them to know when they have completed.
In my script, I’ve used a do/while loop that checks once per second whether any job is still in the RUNNING state, with a counter compared against a Timeout value to avoid waiting forever. To make the script more robust, you might also handle the case where a job’s state is FAILED and treat that error appropriately for your needs.
The last important note concerns the -Keep flag of the Receive-Job cmdlet. By default, without -Keep, the job’s results are deleted once retrieved. The job object itself stays in the session until you remove it, so remember to clean up with Remove-Job rather than leaving jobs and their results lingering on your system.
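As a possible variation on the polling loop, the sketch below uses Wait-Job with its -Timeout parameter and then branches on each job's state. It is only an illustration under stated assumptions: the job names 'ok' and 'boom' and their script blocks are made up, and the failure is simulated with a throw.

```powershell
# Two placeholder jobs: one succeeds, one fails on purpose (simulated).
$jobs = @(
    Start-Job -Name 'ok'   -ScriptBlock { 'done' }
    Start-Job -Name 'boom' -ScriptBlock { throw 'simulated failure' }
)

# Returns when all jobs have finished or after 60 seconds, whichever comes first.
$null = $jobs | Wait-Job -Timeout 60

foreach ($job in $jobs) {
    switch ($job.State) {
        'Completed' {
            Receive-Job -Job $job
        }
        'Failed' {
            # The failure reason is recorded on the child job.
            $reason = $job.ChildJobs[0].JobStateInfo.Reason.Message
            Write-Warning "Job '$($job.Name)' failed: $reason"
        }
        default {
            # Still running after the timeout (or in another state): stop it.
            Write-Warning "Job '$($job.Name)' is in state $($job.State); stopping it."
            Stop-Job -Job $job
        }
    }
}

# Clean up the job objects.
$jobs | Remove-Job -Force
```

How a failed job should be handled (retry, log, abort the whole run) depends on your workflow; the warning here is just a placeholder.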
The script produces this result:
```
PowerShell git:(master) pwsh Test-Job.ps1

Id     Name            PSJobTypeName   State         HasMoreData     Location             Command
--     ----            -------------   -----         -----------     --------             -------
1      www.google.com  BackgroundJob   Running       True            localhost            param ($Target) Test-con…
3      www.bing.com    BackgroundJob   Running       True            localhost            param ($Target) Test-con…
5      www.yahoo.com   BackgroundJob   Running       True            localhost            param ($Target) Test-con…

RunspaceId     : 24648b2e-f5ab-4c3d-8242-cde4d125f03c
Ping           : 1
Source         : Paolos-MBP.lan
Destination    : www.google.com
Address        : 142.250.66.196
DisplayAddress : 142.250.66.196
Latency        : 24
Status         : Success
BufferSize     : 32
Reply          : System.Net.NetworkInformation.PingReply

RunspaceId     : d7afc6d9-8a18-4911-b7f2-002b999f5a7f
Ping           : 1
Source         : Paolos-MBP.lan
Destination    : www.bing.com
Address        : 13.107.21.200
DisplayAddress : 13.107.21.200
Latency        : 27
Status         : Success
BufferSize     : 32
Reply          : System.Net.NetworkInformation.PingReply

RunspaceId     : f55b105e-eb1b-4b22-966d-f07dc627a732
Ping           : 1
Source         : Paolos-MBP.lan
Destination    : www.yahoo.com
Address        : 106.10.250.10
DisplayAddress : 106.10.250.10
Latency        : 450
Status         : Success
BufferSize     : 32
Reply          : System.Net.NetworkInformation.PingReply
```
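To see what the -Keep flag of Receive-Job changes in practice, here is a minimal sketch with a single throwaway job. The script block and variable names are made up for the illustration.

```powershell
# A trivial placeholder job that returns one string.
$job = Start-Job -ScriptBlock { 'hello' }
$null = Wait-Job -Job $job

$first  = Receive-Job -Job $job -Keep   # results are returned and stay cached
$second = Receive-Job -Job $job         # results are returned and then discarded
$third  = Receive-Job -Job $job         # nothing is left to return

"first=$first second=$second third=$third"

# The job object is still in the session even though its results are gone.
"Job still listed: $([bool](Get-Job -Id $job.Id))"

# Removing the job is a separate, explicit step.
Remove-Job -Job $job
```

The third call comes back empty, while Get-Job still lists the job until Remove-Job runs.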
As usual, if you like these scripts you can find them on my GitHub repository.