Identifying and Fixing Broken Links in SharePoint Online using PnP PowerShell

Broken links in SharePoint Online can:
Lead to poor user experience
Cause navigation issues
Affect search and SEO performance

With PnP PowerShell, we can identify and fix broken links in SharePoint pages, documents, and lists.


Step 1: Connect to SharePoint Online

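$siteUrl = "https://yourtenant.sharepoint.com/sites/YourSite"
# Recent PnP.PowerShell releases also expect -ClientId with the ID of your own Entra ID app registration.
Connect-PnPOnline -Url $siteUrl -Interactive
Write-Host "Connected to SharePoint Online"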

✔ Establishes a secure connection.


Step 2: Retrieve Pages and Documents

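$libraryName = "Site Pages"
# PublishingPageContent holds the HTML of classic publishing pages; modern pages store theirs in CanvasContent1.
# -PageSize retrieves items in batches, which keeps large libraries under the list view threshold.
$pages = Get-PnPListItem -List $libraryName -Fields "FileRef", "PublishingPageContent" -PageSize 500
Write-Host "Retrieved $($pages.Count) pages from '$libraryName'"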

✔ Fetches all site pages and their content.

For documents:

$docLibrary = "Documents"
$files = Get-PnPListItem -List $docLibrary -Fields "FileRef"
Write-Host " Retrieved documents from '$docLibrary'"

✔ Lists all document URLs.


Step 3: Extract Links from Pages and Documents

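Function Extract-LinksFromHtml($html) {
    # Pages without body content return $null; there is nothing to scan.
    if ([string]::IsNullOrEmpty($html)) { return @() }
    # Avoid the name $matches: it is an automatic variable that -match overwrites.
    $hrefMatches = [regex]::Matches($html, "(?i)href\s*=\s*['""](.*?)['""]")
    return $hrefMatches | ForEach-Object { $_.Groups[1].Value }
}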

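$brokenLinks = @()

foreach ($page in $pages) {
    $htmlContent = $page["PublishingPageContent"]
    if (-not $htmlContent) { continue }   # page has no body content to scan
    $links = Extract-LinksFromHtml $htmlContent

    foreach ($link in $links) {
        # Only absolute http(s) URLs can be requested as-is; anchors, mailto: links, and relative paths are skipped.
        if ($link -notmatch '^https?://') { continue }

        # Failed requests throw, so capture the status in try/catch instead of reading a stale $response.
        # Links into SharePoint itself need authentication, so this anonymous request does not reliably reflect their real status.
        try {
            $response = Invoke-WebRequest -Uri $link -Method Head -UseBasicParsing -ErrorAction Stop
            $status = [int]$response.StatusCode
        } catch {
            $status = if ($_.Exception.Response) { [int]$_.Exception.Response.StatusCode } else { "No response" }
        }

        If ($status -ne 200) {
            $brokenLinks += [PSCustomObject]@{
                Page   = $page["FileRef"]
                Link   = $link
                Status = $status
            }
        }
    }
}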

If ($brokenLinks) {
    Write-Host "Broken links detected!"
    $brokenLinks | Format-Table -AutoSize
} Else {
    Write-Host "No broken links found."
}

Extracts links from HTML content.
Checks if links are broken.
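
Links stored in page HTML are often server-relative (for example /sites/YourSite/SitePages/Home.aspx) or are not web addresses at all (mailto:, #anchor), while Invoke-WebRequest needs an absolute http(s) URL. Below is a minimal sketch of a helper that normalizes a link before it is checked. The name Resolve-LinkUrl and its parameters are hypothetical and are not part of PnP PowerShell:

```powershell
# Hypothetical helper: turn an href value into an absolute http(s) URL,
# or return $null when the link cannot be tested with an HTTP request.
function Resolve-LinkUrl {
    param(
        [string]$Link,
        [string]$BaseUrl   # for example, the site URL passed to Connect-PnPOnline
    )

    if ([string]::IsNullOrWhiteSpace($Link)) { return $null }
    $Link = $Link.Trim()

    # In-page anchors and non-web schemes are not HTTP resources.
    if ($Link -match '^(#|mailto:|tel:|javascript:)') { return $null }

    # Resolve relative links against the base URL; absolute links pass through unchanged.
    $resolved = $null
    $base = [System.Uri]$BaseUrl
    if ([System.Uri]::TryCreate($base, $Link, [ref]$resolved) -and
        $resolved.Scheme -in @('http', 'https')) {
        return $resolved.AbsoluteUri
    }
    return $null
}
```

A link checker could call Resolve-LinkUrl -Link $link -BaseUrl $siteUrl for each extracted link and skip any link for which it returns $null.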


Step 4: Generate a Report of Broken Links

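$reportPath = "C:\Reports\BrokenLinksReport.csv"
# Export-Csv fails if the target folder is missing, so create it first.
New-Item -ItemType Directory -Path (Split-Path $reportPath) -Force | Out-Null
$brokenLinks | Export-Csv -Path $reportPath -NoTypeInformation
Write-Host "Broken links report saved at: $reportPath"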

✔ Saves the broken links report to a CSV file.


Step 5: Fixing Broken Links

Option 1: Replace Broken Links Manually

$oldUrl = "https://oldsite.com/file.pdf"
$newUrl = "https://newsite.com/file.pdf"

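# -match and -replace take a regex, so escape the URL to make "." and "?" match literally.
$pattern = [regex]::Escape($oldUrl)

foreach ($page in $pages) {
    $htmlContent = $page["PublishingPageContent"]
    If ($htmlContent -match $pattern) {
        # Double any "$" in the new URL so it is not read as a regex group reference.
        $updatedContent = $htmlContent -replace $pattern, $newUrl.Replace('$', '$$')
        Set-PnPListItem -List $libraryName -Identity $page.Id -Values @{"PublishingPageContent" = $updatedContent}
        Write-Host "Fixed link in: $($page['FileRef'])"
    }
}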

Finds and replaces broken links in pages.
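
When several URLs have moved, the same literal replacement can be driven by a lookup table. This is a minimal sketch, assuming a hypothetical helper named Update-LinksInHtml and a hashtable of placeholder old-to-new URLs. It only transforms a string, so writing the result back with Set-PnPListItem stays in the loop above:

```powershell
# Hypothetical helper: apply a set of old-to-new URL replacements to an HTML string.
function Update-LinksInHtml {
    param(
        [string]$Html,
        [hashtable]$UrlMap   # keys are old URLs, values are their replacements
    )

    foreach ($oldUrl in $UrlMap.Keys) {
        # String.Replace matches literally (and case-sensitively), so "." and "?"
        # in a URL are not treated as regex metacharacters.
        $Html = $Html.Replace($oldUrl, $UrlMap[$oldUrl])
    }
    return $Html
}

# Example mapping with placeholder URLs.
$urlMap = @{
    "https://oldsite.com/file.pdf"   = "https://newsite.com/file.pdf"
    "https://oldsite.com/guide.docx" = "https://newsite.com/guide.docx"
}

$sample  = '<a href="https://oldsite.com/file.pdf">Spec</a>'
$updated = Update-LinksInHtml -Html $sample -UrlMap $urlMap
```

Comparing the returned string with the original shows whether a page needs to be saved at all.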


Option 2: Remove Broken Links Automatically

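foreach ($page in $pages) {
    $htmlContent = $page["PublishingPageContent"]
    if (-not $htmlContent) { continue }
    $original = $htmlContent

    # Only process links that were reported for this page.
    foreach ($broken in ($brokenLinks | Where-Object { $_.Page -eq $page["FileRef"] })) {
        # Escape the URL so its regex metacharacters match literally; (?s) lets the anchor text span lines.
        $escaped = [regex]::Escape($broken.Link)
        $htmlContent = $htmlContent -replace "(?s)<a[^>]+?href=['""]${escaped}['""][^>]*>.*?</a>", ""
    }

    # Save only when something changed, so untouched pages do not get a new version.
    if ($htmlContent -cne $original) {
        Set-PnPListItem -List $libraryName -Identity $page.Id -Values @{"PublishingPageContent" = $htmlContent}
        Write-Host "Removed broken links from: $($page['FileRef'])"
    }
}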

Removes all broken links from pages.
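
Deleting the whole <a> element also deletes its visible text, which can leave gaps in a sentence. A gentler variant, sketched here on the assumption that the anchor text should stay, unwraps the link instead. Remove-AnchorKeepText is a hypothetical name, and a regex is only an approximation of real HTML parsing:

```powershell
# Hypothetical helper: remove <a> tags that point at $Url but keep the text they wrap.
function Remove-AnchorKeepText {
    param(
        [string]$Html,
        [string]$Url
    )

    # Escape the URL so characters such as "." and "?" match literally.
    $escaped = [regex]::Escape($Url)

    # (?is): ignore case and let "." span line breaks.
    # Group 1 captures whatever sits between the opening and closing tag.
    $pattern = "(?is)<a\b[^>]*?href\s*=\s*['""]${escaped}['""][^>]*>(.*?)</a>"

    # '$1' puts the captured inner content back in place of the whole element.
    return [regex]::Replace($Html, $pattern, '$1')
}

$sampleHtml = 'See <a href="https://oldsite.com/file.pdf" target="_blank">the spec</a> for details.'
$unwrapped  = Remove-AnchorKeepText -Html $sampleHtml -Url "https://oldsite.com/file.pdf"
```

Here $unwrapped keeps the words "the spec" as plain text, while anchors that point at other URLs are left untouched.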


Step 6: Automate Broken Link Checks

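Schedule the PowerShell script to run weekly. A scheduled task runs unattended, so the script it calls cannot sign in with Connect-PnPOnline -Interactive; it needs a non-interactive method such as app-only authentication with a certificate.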

$taskName = "SharePoint Broken Link Checker"
$scriptPath = "C:\Scripts\BrokenLinksChecker.ps1"

# Quote the script path so it still works if the path contains spaces.
$action = New-ScheduledTaskAction -Execute "PowerShell.exe" -Argument "-NoProfile -File `"$scriptPath`""
$trigger = New-ScheduledTaskTrigger -Weekly -DaysOfWeek Sunday -At 3AM
Register-ScheduledTask -TaskName $taskName -Action $action -Trigger $trigger -User "SYSTEM" -RunLevel Highest

Write-Host " Automated broken link checker scheduled."

Ensures ongoing monitoring of broken links.
