Automating document classification in SharePoint Online improves content organization, retrieval, and compliance. Using PnP PowerShell, we can:
Assign content types dynamically
Apply metadata and tags based on file properties
Set up retention labels for compliance
Move or categorize documents automatically
Step 1: Prerequisites
1.1 Install and Update PnP PowerShell
Ensure PnP PowerShell is installed:
Install-Module PnP.PowerShell -Scope CurrentUser
Update if needed:
Update-Module PnP.PowerShell
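If the setup needs to be repeatable (for example on a build server), the install can be guarded so it only runs when the module is missing. This is a minimal sketch; the helper names Test-ModuleAvailable and Install-ModuleIfMissing are illustrative assumptions, not part of PnP PowerShell.

```powershell
# Hypothetical helper: returns $true when a module with this name is installed.
function Test-ModuleAvailable {
    param([Parameter(Mandatory)][string]$Name)
    return [bool](Get-Module -ListAvailable -Name $Name)
}

# Hypothetical helper: installs the module for the current user only if it is absent.
function Install-ModuleIfMissing {
    param([Parameter(Mandatory)][string]$Name)
    if (-not (Test-ModuleAvailable -Name $Name)) {
        Install-Module $Name -Scope CurrentUser
    }
}

# Usage:
# Install-ModuleIfMissing -Name 'PnP.PowerShell'
```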
1.2 Connect to SharePoint
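Authenticate with your SharePoint site. Since September 2024, interactive sign-in requires the client ID of your own Entra ID app registration (the value below is a placeholder):
Connect-PnPOnline -Url "https://yourtenant.sharepoint.com/sites/YourSite" -Interactive -ClientId "<your-app-client-id>"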
Result: You are now connected to your SharePoint environment.
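For scripts that run without a user present, connection settings are often kept out of the script itself. The sketch below builds a parameter set for certificate-based app-only sign-in from environment variables. The function name and the environment variable names (PNP_SITE_URL and so on) are assumptions made up for this example; the Url, ClientId, Tenant, and Thumbprint parameters belong to Connect-PnPOnline.

```powershell
# Hypothetical helper: collects Connect-PnPOnline parameters from environment variables.
function Get-PnPConnectionSplat {
    # Map of Connect-PnPOnline parameter name -> assumed environment variable name
    $names = @{
        Url        = 'PNP_SITE_URL'
        ClientId   = 'PNP_CLIENT_ID'
        Tenant     = 'PNP_TENANT'
        Thumbprint = 'PNP_CERT_THUMBPRINT'
    }
    $splat = @{}
    foreach ($key in $names.Keys) {
        $value = [Environment]::GetEnvironmentVariable($names[$key])
        if ([string]::IsNullOrWhiteSpace($value)) {
            throw "Environment variable $($names[$key]) is not set."
        }
        $splat[$key] = $value
    }
    return $splat
}

# Usage (requires the variables to be set and the certificate to be installed):
# $conn = Get-PnPConnectionSplat
# Connect-PnPOnline @conn
```

Failing early on a missing variable keeps a misconfigured job from running half of the classification before it notices.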
Step 2: Retrieve Documents from a Library
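To get all files from a document library:
$LibraryName = "Documents"
# -PageSize retrieves items in batches; the filter drops folders, which are also list items
$Files = Get-PnPListItem -List $LibraryName -PageSize 500 | Where-Object { $_.FileSystemObjectType -eq "File" }
# Field values are read through the item indexer, not as object properties
$Files | ForEach-Object { [PSCustomObject]@{ Title = $_["Title"]; FileLeafRef = $_["FileLeafRef"]; FileDirRef = $_["FileDirRef"]; ContentTypeId = [string]$_["ContentTypeId"] } }
Result: Lists each document's title, file name, folder path, and content type ID.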
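Before writing classification rules, it can help to see which file types the library actually contains. This is a small sketch that counts file names by extension; Get-ExtensionSummary is a hypothetical helper name, and it works on plain file names so it can be tried without a SharePoint connection.

```powershell
# Hypothetical helper: counts file names by (lower-cased) extension, most common first.
function Get-ExtensionSummary {
    param([string[]]$FileName)
    $FileName |
        Group-Object { [System.IO.Path]::GetExtension($_).ToLowerInvariant() } |
        Sort-Object Count -Descending |
        Select-Object @{ Name = 'Extension'; Expression = { $_.Name } }, Count
}

# Usage with the items retrieved above:
# Get-ExtensionSummary -FileName ($Files | ForEach-Object { $_["FileLeafRef"] })
```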
Step 3: Assign Content Types Based on File Types
To automatically classify documents by file type:
$LibraryName = "Documents"
$Files = Get-PnPListItem -List $LibraryName
# The content type IDs below are placeholders; use the IDs of content types already added to the library
foreach ($File in $Files) {
    $FileName = $File["FileLeafRef"]
    if ($FileName -like "*.pdf") {
        Set-PnPListItem -List $LibraryName -Identity $File.Id -Values @{"ContentTypeId"="0x010100ABC123"} # Content Type ID for "Reports"
    }
    elseif ($FileName -like "*.xlsx") {
        Set-PnPListItem -List $LibraryName -Identity $File.Id -Values @{"ContentTypeId"="0x010100DEF456"} # Content Type ID for "Financials"
    }
}
Result: Assigns content types based on document format.
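As the number of file types grows, the if/elseif chain can be replaced with a lookup table. The following is a sketch of that idea, not a definitive implementation: the variable $ContentTypeByExtension and the function Get-ContentTypeIdForFile are hypothetical names, and the IDs are the same placeholders used above.

```powershell
# Assumed mapping from file extension to content type ID (placeholder IDs).
$ContentTypeByExtension = @{
    '.pdf'  = '0x010100ABC123'   # "Reports"
    '.xlsx' = '0x010100DEF456'   # "Financials"
}

# Hypothetical helper: returns the mapped content type ID, or $null when no rule applies.
function Get-ContentTypeIdForFile {
    param(
        [Parameter(Mandatory)][string]$FileName,
        [Parameter(Mandatory)][hashtable]$Map
    )
    $ext = [System.IO.Path]::GetExtension($FileName).ToLowerInvariant()
    if ($Map.ContainsKey($ext)) { return $Map[$ext] }
    return $null
}

# Usage inside the loop:
# $id = Get-ContentTypeIdForFile -FileName $File["FileLeafRef"] -Map $ContentTypeByExtension
# if ($id) { Set-PnPListItem -List $LibraryName -Identity $File.Id -Values @{ "ContentTypeId" = $id } }
```

Adding a new file type then means adding one line to the table rather than another branch.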
Step 4: Apply Metadata Automatically
To classify files based on their names or properties, use:
foreach ($File in $Files) {
    $FileName = $File["FileLeafRef"]
    if ($FileName -like "*Invoice*") {
        Set-PnPListItem -List $LibraryName -Identity $File.Id -Values @{"Category"="Finance"; "Department"="Accounts"}
    }
    elseif ($FileName -like "*Contract*") {
        Set-PnPListItem -List $LibraryName -Identity $File.Id -Values @{"Category"="Legal"; "Department"="Compliance"}
    }
}
Result: Assigns metadata tags based on document keywords.
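The keyword checks can also be expressed as an ordered list of rules, where the first matching pattern wins, just as in the if/elseif version. This is a minimal sketch; $MetadataRules and Get-MetadataForFile are hypothetical names, and the Category and Department columns are assumed to exist in the library.

```powershell
# Ordered rules: the first pattern that matches the file name decides the metadata.
$MetadataRules = @(
    @{ Pattern = '*Invoice*';  Values = @{ Category = 'Finance'; Department = 'Accounts' } }
    @{ Pattern = '*Contract*'; Values = @{ Category = 'Legal';   Department = 'Compliance' } }
)

# Hypothetical helper: returns the Values hashtable of the first matching rule, or $null.
function Get-MetadataForFile {
    param(
        [Parameter(Mandatory)][string]$FileName,
        [Parameter(Mandatory)][object[]]$Rules
    )
    foreach ($rule in $Rules) {
        if ($FileName -like $rule.Pattern) { return $rule.Values }
    }
    return $null
}

# Usage inside the loop:
# $values = Get-MetadataForFile -FileName $File["FileLeafRef"] -Rules $MetadataRules
# if ($values) { Set-PnPListItem -List $LibraryName -Identity $File.Id -Values $values }
```

Because -like is case-insensitive, "invoice_2024.pdf" matches the first rule as well.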
Step 5: Assign Retention Labels Automatically
To enforce data compliance using retention labels:
foreach ($File in $Files) {
    if ($File["FileLeafRef"] -like "*Confidential*") {
        # -Label applies a retention label; the label must already be published to this site
        Set-PnPListItem -List $LibraryName -Identity $File.Id -Label "Confidential-5Years"
    }
}
Result: Applies retention labels automatically.
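When the script runs repeatedly, it is usually worth skipping files that already carry the right label, so unchanged items are not updated on every run. The sketch below decides whether a label change is needed. Get-RetentionLabelChange is a hypothetical helper, and reading the current label from the "_ComplianceTag" field is an assumption about the library's internal column name that should be verified in your tenant.

```powershell
# Hypothetical helper: returns the label to apply, or $null when nothing needs to change.
function Get-RetentionLabelChange {
    param(
        [string]$FileName,
        [string]$CurrentLabel   # label currently on the item; empty when none is set
    )
    $desired = if ($FileName -like '*Confidential*') { 'Confidential-5Years' } else { $null }
    if ($desired -and $desired -ne $CurrentLabel) { return $desired }
    return $null
}

# Usage inside the loop (field name "_ComplianceTag" is an assumption):
# $label = Get-RetentionLabelChange -FileName $File["FileLeafRef"] -CurrentLabel $File["_ComplianceTag"]
# if ($label) { Set-PnPListItem -List $LibraryName -Identity $File.Id -Label $label }
```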
Step 6: Move Documents to Folders Based on Classification
To organize documents by moving them to folders:
foreach ($File in $Files) {
    $FileName = $File["FileLeafRef"]
    # FileRef is the file's current server-relative URL; files already in the HR folder are skipped
    if ($FileName -like "*HR*" -and $File["FileDirRef"] -notlike "*/HR") {
        Move-PnPFile -SourceUrl $File["FileRef"] -TargetUrl "/sites/YourSite/Shared Documents/HR/$FileName" -Force
    }
}
Result: Moves documents based on classification.
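Building the target path is easy to get wrong, so it can be isolated in a small function and checked without touching SharePoint. This is a sketch under the assumption that each file goes into a folder directly below the library root; Get-MoveTarget is a hypothetical helper name, and the target folder is assumed to exist already.

```powershell
# Hypothetical helper: returns the server-relative target URL, or $null if the file is already there.
function Get-MoveTarget {
    param(
        [Parameter(Mandatory)][string]$FileRef,      # current server-relative URL of the file
        [Parameter(Mandatory)][string]$LibraryUrl,   # server-relative URL of the library root
        [Parameter(Mandatory)][string]$FolderName    # destination folder below the library root
    )
    $name   = $FileRef.Substring($FileRef.LastIndexOf('/') + 1)
    $target = "$($LibraryUrl.TrimEnd('/'))/$FolderName/$name"
    if ($target -eq $FileRef) { return $null }       # -eq on strings is case-insensitive
    return $target
}

# Usage inside the loop:
# $target = Get-MoveTarget -FileRef $File["FileRef"] -LibraryUrl "/sites/YourSite/Shared Documents" -FolderName "HR"
# if ($target) { Move-PnPFile -SourceUrl $File["FileRef"] -TargetUrl $target -Force }
```

Note that "*HR*" with -like is case-insensitive and also matches names such as "Three.docx"; a stricter pattern (for example "HR-*") avoids such accidental moves.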
Step 7: Automate with a Scheduled Task
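To run this classification script daily, schedule it in Windows. A scheduled task cannot answer a sign-in prompt, so the script must connect with an unattended method such as app-only certificate authentication instead of -Interactive. Then: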
1️⃣ Save the script as ClassifyDocuments.ps1
2️⃣ Open Task Scheduler in Windows
3️⃣ Create a new task
4️⃣ Set the trigger to Daily
5️⃣ Set the action to Run PowerShell script:
powershell.exe -ExecutionPolicy Bypass -File "C:\Scripts\ClassifyDocuments.ps1"
Result: Automates document classification daily.
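The same task can be created from PowerShell instead of the Task Scheduler UI. The sketch below uses the ScheduledTasks cmdlets that ship with Windows; the function names, the task name "ClassifyDocuments", and the 02:00 start time are assumptions chosen for illustration.

```powershell
# Hypothetical helper: builds the argument string passed to powershell.exe.
function Get-TaskArgument {
    param([Parameter(Mandatory)][string]$ScriptPath)
    return "-NoProfile -ExecutionPolicy Bypass -File `"$ScriptPath`""
}

# Hypothetical helper: registers a daily task for the script (Windows only, not invoked here).
function Register-ClassificationTask {
    param(
        [string]$ScriptPath = 'C:\Scripts\ClassifyDocuments.ps1',
        [string]$At = '02:00'
    )
    $action  = New-ScheduledTaskAction -Execute 'powershell.exe' -Argument (Get-TaskArgument -ScriptPath $ScriptPath)
    $trigger = New-ScheduledTaskTrigger -Daily -At $At
    Register-ScheduledTask -TaskName 'ClassifyDocuments' -Action $action -Trigger $trigger
}

# Usage (run in an elevated session if the task should run for another account):
# Register-ClassificationTask
```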