Handling large data sets in SharePoint Online requires efficient querying, pagination, and batch processing to avoid performance issues and throttling. PnP PowerShell helps IT admins automate and optimize these tasks.
Key Challenges with Large Data Sets
🔹 SharePoint List View Threshold: queries returning more than 5,000 items are blocked
🔹 Query Throttling: too many requests in a short window are rejected (HTTP 429/503)
🔹 Slow Data Retrieval: unpaged reads against large lists are slow
🔹 Exporting Large Data: large exports are time-consuming and memory-intensive
This guide provides optimized PnP PowerShell techniques to handle large data efficiently.
Step 1: Connecting to SharePoint Online
1.1 Install PnP PowerShell
Install-Module PnP.PowerShell -Scope CurrentUser
1.2 Connect to SharePoint Online
Connect-PnPOnline -Url "https://yourtenant.sharepoint.com" -Interactive
Result: Establishes a connection to SharePoint Online.
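1.3 Connecting for Unattended Scripts (Optional)
Recent PnP.PowerShell releases require registering your own Entra ID application and passing its client ID even for interactive sign-in. For unattended runs (such as the scheduled task in Step 8), a certificate-based app-only connection avoids the login prompt entirely. A minimal sketch; the ClientId, Tenant, and Thumbprint values below are placeholders for your own app registration:
# App-only connection for unattended scripts. All three IDs below are
# placeholders - substitute your own Entra ID app registration details.
Connect-PnPOnline -Url "https://yourtenant.sharepoint.com" `
    -ClientId "00000000-0000-0000-0000-000000000000" `
    -Tenant "yourtenant.onmicrosoft.com" `
    -Thumbprint "CERTIFICATE_THUMBPRINT"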
Step 2: Efficiently Retrieving Large Lists with Pagination
2.1 Using Pagination to Retrieve Items in Batches
$ListName = "LargeDocuments"
$BatchSize = 1000
# -PageSize makes Get-PnPListItem fetch the list in pages of $BatchSize,
# keeping every request under the 5,000-item list view threshold.
$Items = Get-PnPListItem -List $ListName -PageSize $BatchSize -Fields "Title", "ID"
Write-Host "Total Items Retrieved: $($Items.Count)"
Result: Retrieves the full list in pages of 1,000 items, so no single request exceeds the 5,000-item threshold.
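2.2 Filtering Large Lists with an Indexed Column
Often you only need a subset of a large list. A CAML query that filters on an indexed column is evaluated server-side without tripping the view threshold; filtering on a non-indexed column is blocked once the list exceeds 5,000 items. A sketch assuming a hypothetical indexed "Status" field; recent PnP.PowerShell versions page through results automatically when RowLimit is marked Paged, but verify this on your version:
# CAML filter on an indexed column - "Status" is a hypothetical field here.
$Query = @"
<View>
    <Query>
        <Where>
            <Eq><FieldRef Name='Status'/><Value Type='Text'>Active</Value></Eq>
        </Where>
    </Query>
    <RowLimit Paged='TRUE'>1000</RowLimit>
</View>
"@
$ActiveItems = Get-PnPListItem -List "LargeDocuments" -Query $Query
Write-Host "Active items: $($ActiveItems.Count)"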
Step 3: Handling SharePoint Throttling
3.1 Implementing Auto-Retry for Throttled Queries
function Get-PnPDataWithRetry {
    param (
        [string]$ListName,
        [int]$BatchSize
    )
    $MaxRetries = 5
    $RetryCount = 0
    while ($true) {
        try {
            # -PageSize keeps each paged request under the view threshold
            return Get-PnPListItem -List $ListName -PageSize $BatchSize
        }
        catch {
            $RetryCount++
            if ($RetryCount -ge $MaxRetries) { throw }  # out of retries: surface the error
            Write-Host "Request failed (possibly throttled), retrying ($RetryCount/$MaxRetries)..."
            # Exponential backoff: 2, 4, 8, 16 seconds
            Start-Sleep -Seconds ([math]::Pow(2, $RetryCount))
        }
    }
}
$ListData = Get-PnPDataWithRetry -ListName "LargeDocuments" -BatchSize 1000
Write-Host "Total Items Retrieved: $($ListData.Count)"
Result: Implements auto-retry for throttled queries with exponential backoff.
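3.2 Retrying Only Throttled Requests
The function above retries on any failure. To back off only for genuine throttling, inspect the error first; SharePoint Online signals throttling with HTTP 429 (and occasionally 503). A hedged sketch, assuming the status code appears in the exception text surfaced by your PnP.PowerShell version:
try {
    $Items = Get-PnPListItem -List "LargeDocuments" -PageSize 1000
}
catch {
    # Assumption: the HTTP status code is present in the exception message
    if ($_.Exception.Message -match "429|503") {
        Write-Host "Throttled - cooling down before one retry..."
        Start-Sleep -Seconds 30
        $Items = Get-PnPListItem -List "LargeDocuments" -PageSize 1000
    }
    else {
        throw   # not throttling, so surface the real error immediately
    }
}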
Step 4: Exporting Large SharePoint Data to CSV
4.1 Exporting in Chunks to Avoid Memory Issues
$ListName = "LargeDocuments"
$BatchSize = 2000
$FilePath = "C:\Exports\LargeListExport.csv"
$Items = Get-PnPListItem -List $ListName -PageSize $BatchSize -Fields "Title", "Created", "Modified"
# Field values live in the item indexer, so project them into flat objects for Export-Csv
$ExportData = $Items | ForEach-Object {
    [PSCustomObject]@{ Title = $_["Title"]; Created = $_["Created"]; Modified = $_["Modified"] }
}
$ExportData | Export-Csv -Path $FilePath -NoTypeInformation
Write-Host "Export completed. File saved at $FilePath"
Result: Exports the list to CSV; -PageSize keeps each request small, though all items are still buffered in memory before writing (see the streaming variant below).
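4.2 Streaming Pages Straight to CSV
For lists too large to buffer entirely, each page can be appended to the CSV as it arrives. A sketch using the -ScriptBlock parameter, which runs after every page request; this assumes the block receives the current page of items, so verify that behavior on your version before relying on it:
$FilePath = "C:\Exports\LargeListExport.csv"
Get-PnPListItem -List "LargeDocuments" -PageSize 2000 -Fields "Title", "Created", "Modified" -ScriptBlock {
    param($PageItems)
    # Project the current page and append it to the CSV immediately,
    # so the full list is never held in memory at once.
    $PageItems | ForEach-Object {
        [PSCustomObject]@{ Title = $_["Title"]; Created = $_["Created"]; Modified = $_["Modified"] }
    } | Export-Csv -Path $FilePath -NoTypeInformation -Append
}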
Step 5: Bulk Updating Large SharePoint Lists
5.1 Processing Items in Batches
$ListName = "LargeDocuments"
$BatchSize = 1000
$Items = Get-PnPListItem -List $ListName -PageSize $BatchSize -Fields "ID", "Title"
foreach ($Item in $Items) {
Set-PnPListItem -List $ListName -Identity $Item["ID"] -Values @{"Title" = "Updated - $($Item['Title'])"}
}
Write-Host "Batch update completed!"
Result: Updates every item; retrieval is paged, but each update is still sent as an individual request (see the batched variant below).
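5.2 Queuing Updates with New-PnPBatch
PnP PowerShell can also queue write operations and submit them in far fewer round trips. A sketch using New-PnPBatch and Invoke-PnPBatch, which Set-PnPListItem accepts through its -Batch parameter:
$ListName = "LargeDocuments"
$Items = Get-PnPListItem -List $ListName -PageSize 1000 -Fields "ID", "Title"
$Batch = New-PnPBatch
foreach ($Item in $Items) {
    # Queue the update instead of sending it immediately
    Set-PnPListItem -List $ListName -Identity $Item.Id -Values @{"Title" = "Updated - $($Item['Title'])"} -Batch $Batch
}
# Submit all queued updates at once
Invoke-PnPBatch -Batch $Batch
For very large lists, consider invoking the batch every few hundred items and starting a fresh one, rather than queuing everything in a single batch.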
Step 6: Deleting Large Data Sets in Bulk
6.1 Deleting Large List Items with Throttling Control
$ListName = "OldDocuments"
$BatchSize = 500
$Items = Get-PnPListItem -List $ListName -PageSize $BatchSize -Fields "ID"
foreach ($Item in $Items) {
Remove-PnPListItem -List $ListName -Identity $Item["ID"] -Force
Start-Sleep -Milliseconds 200 # Brief pause to reduce the risk of throttling
}
Write-Host "Bulk delete completed!"
Result: Deletes large datasets while avoiding throttling.
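6.2 Batched Deletes
Deletes can be queued the same way, which is usually faster than sleeping between single-item calls. A sketch; -Recycle sends items to the recycle bin instead of deleting them permanently:
$ListName = "OldDocuments"
$Items = Get-PnPListItem -List $ListName -PageSize 500 -Fields "ID"
$Batch = New-PnPBatch
foreach ($Item in $Items) {
    # Queue each delete; nothing is sent until the batch is invoked
    Remove-PnPListItem -List $ListName -Identity $Item.Id -Recycle -Batch $Batch
}
Invoke-PnPBatch -Batch $Batch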
Step 7: Migrating Large Data to Another List
7.1 Copying Items in Batches
$SourceList = "OldDocuments"
$TargetList = "NewDocuments"
$BatchSize = 500
$Items = Get-PnPListItem -List $SourceList -PageSize $BatchSize -Fields "Title", "Created"
foreach ($Item in $Items) {
Add-PnPListItem -List $TargetList -Values @{
"Title" = $Item["Title"]
"Created" = $Item["Created"]
}
}
Write-Host "Migration completed!"
Result: Efficiently migrates large lists to a new location.
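7.2 Copying Between Site Collections
If the source and target lists live in different site collections, keep two connections open and pass the right one to each cmdlet. A sketch using -ReturnConnection; the site URLs below are placeholders:
# Two independent connections - the URLs are placeholders
$Source = Connect-PnPOnline -Url "https://yourtenant.sharepoint.com/sites/Archive" -Interactive -ReturnConnection
$Target = Connect-PnPOnline -Url "https://yourtenant.sharepoint.com/sites/Active" -Interactive -ReturnConnection
$Items = Get-PnPListItem -List "OldDocuments" -PageSize 500 -Fields "Title" -Connection $Source
foreach ($Item in $Items) {
    Add-PnPListItem -List "NewDocuments" -Values @{ "Title" = $Item["Title"] } -Connection $Target
}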
Step 8: Automating Large Data Handling with Task Scheduler
1️⃣ Save the script as HandleLargeData.ps1
2️⃣ Open Task Scheduler
3️⃣ Create a new task
4️⃣ Set trigger to daily/weekly execution
5️⃣ Set action to run PowerShell script:
powershell.exe -ExecutionPolicy Bypass -File "C:\Scripts\HandleLargeData.ps1"
Result: Runs the large-data script automatically on a recurring schedule.
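The task can also be registered from PowerShell itself using the Windows ScheduledTasks module, as sketched below. Note that a scheduled run is unattended, so the script must use the app-only connection from Step 1.3 rather than -Interactive:
$Action  = New-ScheduledTaskAction -Execute "powershell.exe" -Argument '-ExecutionPolicy Bypass -File "C:\Scripts\HandleLargeData.ps1"'
$Trigger = New-ScheduledTaskTrigger -Daily -At 2am
# Registers the task to run daily at 2:00 AM
Register-ScheduledTask -TaskName "HandleLargeData" -Action $Action -Trigger $Trigger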