Broken links in SharePoint Online can:
Lead to poor user experience
Cause navigation issues
Affect search and SEO performance
With PnP PowerShell, we can identify and fix broken links in SharePoint pages, documents, and lists.
Step 1: Connect to SharePoint Online
$siteUrl = "https://yourtenant.sharepoint.com/sites/YourSite"
Connect-PnPOnline -Url $siteUrl -Interactive
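Write-Host "Connected to SharePoint Online"
✔ Establishes a secure connection. Recent PnP.PowerShell releases may also require -ClientId with the ID of your own Entra ID app registration when signing in with -Interactive.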
Step 2: Retrieve Pages and Documents
$libraryName = "Site Pages"
$pages = Get-PnPListItem -List $libraryName -Fields "FileRef", "PublishingPageContent"
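Write-Host "Retrieved pages from '$libraryName'"
✔ Fetches all site pages and their content. PublishingPageContent is the field used by classic publishing pages; modern pages usually keep their HTML in CanvasContent1 and wiki pages in WikiField, so request the field that matches your page type.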
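Which field holds the page HTML depends on the page type. As a rough sketch, the hypothetical helper below (Get-PageHtml is not a PnP cmdlet) returns the first non-empty value among CanvasContent1, WikiField, and PublishingPageContent. The field order is an assumption, and the sample hashtables stand in for $page.FieldValues so the snippet can be tried without a connection.

```powershell
# Hypothetical helper: pick the HTML body of a page from whichever field is populated.
function Get-PageHtml {
    param(
        [System.Collections.IDictionary] $FieldValues
    )
    # Assumed order: modern pages, wiki pages, classic publishing pages.
    foreach ($name in 'CanvasContent1', 'WikiField', 'PublishingPageContent') {
        if ($FieldValues.Keys -contains $name) {
            $value = [string]$FieldValues[$name]
            if (-not [string]::IsNullOrWhiteSpace($value)) { return $value }
        }
    }
    return ''
}

# Made-up sample data standing in for $page.FieldValues
$modern  = @{ FileRef = '/sites/YourSite/SitePages/Home.aspx'; CanvasContent1 = '<a href="https://example.com">x</a>' }
$classic = @{ FileRef = '/sites/YourSite/Pages/News.aspx'; PublishingPageContent = '<p>Hello</p>' }

Get-PageHtml -FieldValues $modern
Get-PageHtml -FieldValues $classic
```

With real list items, the same fields would have to be requested through -Fields on Get-PnPListItem before they can be read.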
For documents:
$docLibrary = "Documents"
$files = Get-PnPListItem -List $docLibrary -Fields "FileRef"
Write-Host "Retrieved documents from '$docLibrary'"
✔ Lists all document URLs.
Step 3: Extract Links from Pages and Documents
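# Return every href value found in a block of HTML.
Function Extract-LinksFromHtml($html) {
    # Avoid the name $matches here: it is a PowerShell automatic variable.
    $hrefMatches = [regex]::Matches($html, "(?i)href\s*=\s*['""](.*?)['""]")
    return $hrefMatches | ForEach-Object { $_.Groups[1].Value }
}
$brokenLinks = @()
foreach ($page in $pages) {
    $htmlContent = $page["PublishingPageContent"]
    $links = Extract-LinksFromHtml $htmlContent
    foreach ($link in $links) {
        # Only absolute http(s) URLs can be requested directly; skip anchors, mailto: links, and relative paths.
        if ($link -notmatch '^https?://') { continue }
        $status = $null
        try {
            $response = Invoke-WebRequest -Uri $link -Method Head -UseBasicParsing -ErrorAction Stop
            $status = [int]$response.StatusCode
        } catch {
            # HTTP errors such as 404 end up here; DNS or connection failures carry no response object.
            if ($_.Exception.Response) { $status = [int]$_.Exception.Response.StatusCode } else { $status = "No response" }
        }
        If ($status -ne 200) {
            $brokenLinks += [PSCustomObject]@{
                Page   = $page["FileRef"]
                Link   = $link
                Status = $status
            }
        }
    }
}
If ($brokenLinks) {
    Write-Host "Broken links detected!"
    $brokenLinks | Format-Table -AutoSize
} Else {
    Write-Host "No broken links found."
}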
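✔ Extracts links from HTML content.
✔ Checks if links are broken. The requests are unauthenticated, so links to content inside your own tenant may return 401 or 403, or be redirected to a sign-in page, even when the target exists; review those results before treating them as broken.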
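Page HTML often contains server-relative links (for example /sites/YourSite/...), in-page anchors, and mailto: links, none of which can be passed to Invoke-WebRequest as they are. The sketch below shows one possible way to normalize an href before checking it. Resolve-LinkTarget is a hypothetical helper, not part of PnP PowerShell, and the base URL is assumed to be the site URL from Step 1.

```powershell
# Hypothetical helper: turn an href value into an absolute http(s) URL,
# or return $null when the link is not something a web request can check.
function Resolve-LinkTarget {
    param(
        [string] $Href,
        [string] $BaseUrl
    )
    if ([string]::IsNullOrWhiteSpace($Href)) { return $null }
    $Href = $Href.Trim()
    # In-page anchors have no separate target to request.
    if ($Href.StartsWith('#')) { return $null }

    $base = [System.Uri]$BaseUrl
    if ($Href.StartsWith('//')) {
        # Protocol-relative link: reuse the scheme of the base URL.
        $Href = $base.Scheme + ':' + $Href
    } elseif ($Href.StartsWith('/')) {
        # Server-relative link: prepend the scheme and host of the base URL.
        $Href = $base.GetLeftPart([System.UriPartial]::Authority) + $Href
    }

    $resolved = $null
    if (-not [System.Uri]::TryCreate($base, $Href, [ref]$resolved)) { return $null }
    # mailto:, javascript:, and similar schemes are skipped.
    if ($resolved.Scheme -notin 'http', 'https') { return $null }
    return $resolved.AbsoluteUri
}

$siteUrl = 'https://yourtenant.sharepoint.com/sites/YourSite'
Resolve-LinkTarget -Href '/sites/YourSite/SitePages/Home.aspx' -BaseUrl $siteUrl
Resolve-LinkTarget -Href 'https://example.com/file.pdf' -BaseUrl $siteUrl
Resolve-LinkTarget -Href 'mailto:someone@example.com' -BaseUrl $siteUrl   # returns nothing
```

Links that come back as $null can simply be skipped by the checking loop, which keeps the report focused on URLs that were actually requested.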
Step 4: Generate a Report of Broken Links
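$reportPath = "C:\Reports\BrokenLinksReport.csv"
# Export-Csv does not create missing folders, so make sure the target folder exists first.
New-Item -ItemType Directory -Path (Split-Path $reportPath) -Force | Out-Null
$brokenLinks | Export-Csv -Path $reportPath -NoTypeInformation
Write-Host "Broken links report saved at: $reportPath"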
✔ Saves the broken links report to a CSV file.
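A flat CSV can be hard to scan when many links are broken. As a small illustration, the rows can be grouped per page and per status code before or after exporting. The sample rows below are made up and only mimic the shape of the $brokenLinks objects (Page, Link, Status).

```powershell
# Made-up sample rows shaped like the $brokenLinks objects from Step 3.
$sampleBrokenLinks = @(
    [PSCustomObject]@{ Page = '/sites/YourSite/SitePages/Home.aspx';  Link = 'https://example.com/a'; Status = 404 }
    [PSCustomObject]@{ Page = '/sites/YourSite/SitePages/Home.aspx';  Link = 'https://example.com/b'; Status = 500 }
    [PSCustomObject]@{ Page = '/sites/YourSite/SitePages/About.aspx'; Link = 'https://example.com/c'; Status = 404 }
)

# Count broken links per page, pages with the most broken links first.
$perPage = $sampleBrokenLinks |
    Group-Object -Property Page |
    Sort-Object -Property Count -Descending |
    Select-Object @{ Name = 'Page'; Expression = { $_.Name } }, Count

# Count broken links per status code.
$perStatus = $sampleBrokenLinks |
    Group-Object -Property Status |
    Select-Object @{ Name = 'Status'; Expression = { $_.Name } }, Count

$perPage   | Format-Table -AutoSize
$perStatus | Format-Table -AutoSize
```

Either summary can be written to its own file with Export-Csv in the same way as the full report.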
Step 5: Fixing Broken Links
Option 1: Replace Broken Links Manually
$oldUrl = "https://oldsite.com/file.pdf"
$newUrl = "https://newsite.com/file.pdf"
# -match and -replace treat their pattern as a regular expression, so escape the old URL to match it literally.
$oldUrlPattern = [regex]::Escape($oldUrl)
foreach ($page in $pages) {
    $htmlContent = $page["PublishingPageContent"]
    If ($htmlContent -match $oldUrlPattern) {
        $updatedContent = $htmlContent -replace $oldUrlPattern, $newUrl
        Set-PnPListItem -List $libraryName -Identity $page.Id -Values @{"PublishingPageContent" = $updatedContent}
        Write-Host "Fixed link in: $($page['FileRef'])"
    }
}
✔ Finds and replaces broken links in pages.
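Because -match and -replace interpret their pattern as a regular expression, an unescaped URL can match more than intended: the "." in oldsite.com matches any character. The sketch below, using a hypothetical helper named Replace-LiteralUrl and made-up HTML, shows a literal replacement that also protects any "$" in the new URL from being read as a group reference.

```powershell
# Hypothetical helper: replace one URL with another, treating both as plain text.
function Replace-LiteralUrl {
    param(
        [string] $Html,
        [string] $OldUrl,
        [string] $NewUrl
    )
    # Escape regex metacharacters in the URL being searched for.
    $pattern = [regex]::Escape($OldUrl)
    # In a regex replacement string, "$$" stands for a literal "$".
    $replacement = $NewUrl.Replace('$', '$$')
    return $Html -replace $pattern, $replacement
}

# Made-up HTML: the second link only looks similar and should stay untouched.
$html  = '<a href="https://oldsite.com/file.pdf">Spec</a> <a href="https://oldsiteXcom/fileXpdf">Other</a>'
$fixed = Replace-LiteralUrl -Html $html -OldUrl 'https://oldsite.com/file.pdf' -NewUrl 'https://newsite.com/file.pdf'
$fixed
```

Without the escaping step, the second link in the sample would also be rewritten, since "." would match the "X" characters.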
Option 2: Remove Broken Links Automatically
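foreach ($page in $pages) {
    $htmlContent = $page["PublishingPageContent"]
    $originalContent = $htmlContent
    # Only process the broken links that were found on this page.
    foreach ($broken in ($brokenLinks | Where-Object { $_.Page -eq $page["FileRef"] })) {
        # Escape the URL so that characters such as "." and "?" are matched literally.
        $escapedLink = [regex]::Escape($broken.Link)
        $htmlContent = $htmlContent -replace "(?s)<a[^>]+?href=['""]$($escapedLink)['""][^>]*>.*?</a>", ""
    }
    # Save only when something changed, so untouched pages do not receive a new version.
    If ($htmlContent -ne $originalContent) {
        Set-PnPListItem -List $libraryName -Identity $page.Id -Values @{"PublishingPageContent" = $htmlContent}
        Write-Host "Removed broken links from: $($page['FileRef'])"
    }
}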
✔ Removes broken links from pages (the whole <a> element, including its link text).
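Deleting the whole <a> element also deletes the visible link text, which can leave sentences incomplete. One possible alternative, sketched below with a hypothetical helper named Remove-AnchorKeepText and made-up HTML, strips only the anchor tags and keeps the text between them. Regex-based HTML editing is a simplification here and may not cope with unusual markup.

```powershell
# Hypothetical helper: remove the <a> wrapper for one URL but keep the link text.
function Remove-AnchorKeepText {
    param(
        [string] $Html,
        [string] $Url
    )
    $escaped = [regex]::Escape($Url)
    # (?s) lets .*? span line breaks, (?i) ignores case; group 1 captures the link text.
    $pattern = "(?si)<a\b[^>]*?href\s*=\s*['""]$($escaped)['""][^>]*>(.*?)</a>"
    return $Html -replace $pattern, '$1'
}

# Made-up HTML with one broken and one healthy link.
$html  = '<p>See <a href="https://oldsite.com/file.pdf" target="_blank">the spec</a> and <a href="https://example.com/ok">this page</a>.</p>'
$clean = Remove-AnchorKeepText -Html $html -Url 'https://oldsite.com/file.pdf'
$clean
```

Only the anchor pointing at the given URL is unwrapped; other links on the page are left as they are.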
Step 6: Automate Broken Link Checks
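Schedule the PowerShell script to run weekly. The task runs unattended under the SYSTEM account, so the script it launches cannot sign in with -Interactive and needs a non-interactive connection method, such as app-only authentication with a certificate:
$taskName = "SharePoint Broken Link Checker"
$scriptPath = "C:\Scripts\BrokenLinksChecker.ps1"
# Quote the script path so that a path containing spaces is passed as a single argument.
$action = New-ScheduledTaskAction -Execute "PowerShell.exe" -Argument "-File `"$scriptPath`""
$trigger = New-ScheduledTaskTrigger -Weekly -DaysOfWeek Sunday -At 3AM
Register-ScheduledTask -TaskName $taskName -Action $action -Trigger $trigger -User "SYSTEM" -RunLevel Highest
Write-Host "Automated broken link checker scheduled."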
✔ Ensures ongoing monitoring of broken links.
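A scheduled run has no user present to complete a browser sign-in. As a minimal sketch, assuming an Entra ID app registration with a certificate has already been set up, the script could open its connection as shown below. Connect-ForScheduledRun is a hypothetical wrapper, and the client ID, tenant name, and thumbprint are placeholders you would supply yourself.

```powershell
# Hypothetical wrapper for the top of BrokenLinksChecker.ps1 when it runs as a scheduled task.
function Connect-ForScheduledRun {
    param(
        [Parameter(Mandatory)] [string] $SiteUrl,
        [Parameter(Mandatory)] [string] $ClientId,    # assumed: ID of your own Entra ID app registration
        [Parameter(Mandatory)] [string] $Tenant,      # assumed: for example yourtenant.onmicrosoft.com
        [Parameter(Mandatory)] [string] $Thumbprint   # assumed: certificate in a store the task account can read
    )
    # Certificate-based app-only sign-in needs no browser prompt, unlike -Interactive.
    Connect-PnPOnline -Url $SiteUrl -ClientId $ClientId -Tenant $Tenant -Thumbprint $Thumbprint
}

# Example call (placeholder values, not run here):
# Connect-ForScheduledRun -SiteUrl 'https://yourtenant.sharepoint.com/sites/YourSite' `
#     -ClientId '<app-client-id>' -Tenant 'yourtenant.onmicrosoft.com' -Thumbprint '<certificate-thumbprint>'
```

The app registration needs SharePoint permissions that cover reading and updating the pages involved; which permissions are appropriate depends on your tenant's policies.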