Dmitry Porotnikov / Reading Microsoft Purview Sensitivity Labels from Office Files

Created Thu, 19 Mar 2026 00:00:00 +0000 Modified Thu, 19 Mar 2026 00:00:00 +0000
710 Words

Reading Microsoft Purview Sensitivity Labels from Office Files

When working with Microsoft Purview (now Microsoft 365 compliance), sensitivity labels applied to Office documents are embedded directly into the file structure. This can be useful for auditing, compliance verification, or automation scenarios where you need to extract label information from multiple files without opening each one.

Office Open XML files (.docx, .xlsx, .pptx, etc.) store label metadata in two locations within their ZIP structure: docProps/custom.xml and docMetadata/LabelInfo.xml. This script reads both locations and returns structured label information.

Add-Type -AssemblyName System.IO.Compression.FileSystem

function Get-XmlPropertyValue {
    param([System.Xml.XmlElement]$PropertyNode)

    foreach ($child in $PropertyNode.ChildNodes) {
        if ($child.NodeType -eq [System.Xml.XmlNodeType]::Element) {
            return $child.InnerText
        }
    }

    return $null
}

function Get-PurviewLabelFromOfficeFile {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)]
        [string]$Path
    )

    if (-not (Test-Path -LiteralPath $Path)) {
        throw "File not found: $Path"
    }

    $resolvedPath = (Resolve-Path -LiteralPath $Path).Path
    $results = @()

    try {
        $zip = [System.IO.Compression.ZipFile]::OpenRead($resolvedPath)
    }
    catch {
        throw "File is not a readable Office Open XML package (or it is encrypted/protected in a way this ZIP-based reader can't inspect): $resolvedPath"
    }

    try {
        $sensitivityProperty = $null
        $customPropsByLabel = @{}

        # -------- Read docProps/custom.xml --------
        $customEntry = $zip.Entries | Where-Object { $_.FullName -eq 'docProps/custom.xml' } | Select-Object -First 1
        if ($customEntry) {
            $stream = $customEntry.Open()
            try {
                $xml = New-Object System.Xml.XmlDocument
                $xml.Load($stream)

                $propertyNodes = $xml.SelectNodes("//*[local-name()='property']")
                foreach ($prop in $propertyNodes) {
                    $name = $prop.GetAttribute("name")
                    $value = Get-XmlPropertyValue -PropertyNode $prop

                    if ([string]::IsNullOrWhiteSpace($name)) {
                        continue
                    }

                    if ($name -eq 'Sensitivity') {
                        $sensitivityProperty = $value
                        continue
                    }

                    if ($name -match '^MSIP_Label_([0-9a-fA-F-]{36})_(.+)$') {
                        $labelId = $matches[1].ToLowerInvariant()
                        $attrName = $matches[2]

                        if (-not $customPropsByLabel.ContainsKey($labelId)) {
                            $customPropsByLabel[$labelId] = @{}
                        }

                        $customPropsByLabel[$labelId][$attrName] = $value
                    }
                }
            }
            finally {
                $stream.Dispose()
            }
        }

        # -------- Read docMetadata/LabelInfo.xml --------
        $labelInfoEntry = $zip.Entries | Where-Object { $_.FullName -eq 'docMetadata/LabelInfo.xml' } | Select-Object -First 1
        if ($labelInfoEntry) {
            $stream = $labelInfoEntry.Open()
            try {
                $xml = New-Object System.Xml.XmlDocument
                $xml.Load($stream)

                $labelNodes = $xml.SelectNodes("//*[local-name()='label']")
                foreach ($node in $labelNodes) {
                    $labelId  = $node.GetAttribute("id")
                    $tenantId = $node.GetAttribute("siteId")
                    $enabled  = $node.GetAttribute("enabled")
                    $removed  = $node.GetAttribute("removed")
                    $method   = $node.GetAttribute("method")

                    $customData = $null
                    if ($labelId -and $customPropsByLabel.ContainsKey($labelId.ToLowerInvariant())) {
                        $customData = $customPropsByLabel[$labelId.ToLowerInvariant()]
                    }

                    $results += [pscustomobject]@{
                        FilePath            = $resolvedPath
                        Source              = "LabelInfo.xml"
                        LabelId             = $labelId
                        TenantId            = $tenantId
                        Enabled             = $enabled
                        Removed             = $removed
                        Method              = $method
                        SetDate             = if ($customData) { $customData["SetDate"] } else { $null }
                        Name                = if ($customData) { $customData["Name"] } else { $null }
                        ContentBits         = if ($customData) { $customData["ContentBits"] } else { $null }
                        SensitivityProperty = $sensitivityProperty
                    }
                }
            }
            finally {
                $stream.Dispose()
            }
        }

        # -------- Fallback to custom.xml only --------
        if ($results.Count -eq 0 -and $customPropsByLabel.Count -gt 0) {
            foreach ($labelId in $customPropsByLabel.Keys) {
                $data = $customPropsByLabel[$labelId]

                $results += [pscustomobject]@{
                    FilePath            = $resolvedPath
                    Source              = "custom.xml"
                    LabelId             = $labelId
                    TenantId            = $data["SiteId"]
                    Enabled             = $data["Enabled"]
                    Removed             = $null
                    Method              = $data["Method"]
                    SetDate             = $data["SetDate"]
                    Name                = $data["Name"]
                    ContentBits         = $data["ContentBits"]
                    SensitivityProperty = $sensitivityProperty
                }
            }
        }

        # -------- Last fallback: only Sensitivity property --------
        if ($results.Count -eq 0 -and $sensitivityProperty) {
            $results += [pscustomobject]@{
                FilePath            = $resolvedPath
                Source              = "Sensitivity property only"
                LabelId             = $sensitivityProperty
                TenantId            = $null
                Enabled             = $null
                Removed             = $null
                Method              = $null
                SetDate             = $null
                Name                = $null
                ContentBits         = $null
                SensitivityProperty = $sensitivityProperty
            }
        }

        if ($results.Count -eq 0) {
            $results += [pscustomobject]@{
                FilePath            = $resolvedPath
                Source              = "None found"
                LabelId             = $null
                TenantId            = $null
                Enabled             = $null
                Removed             = $null
                Method              = $null
                SetDate             = $null
                Name                = $null
                ContentBits         = $null
                SensitivityProperty = $null
            }
        }

        return $results
    }
    finally {
        $zip.Dispose()
    }
}

Get-PurviewLabelFromOfficeFile -Path C:\Users\user\Desktop\ConfTest.docx

Output

The function returns a PSCustomObject with:

  • FilePath - Path to the analyzed file
  • Source - Which XML location the label was found in
  • LabelId - The unique identifier for the sensitivity label
  • TenantId - The Microsoft 365 tenant identifier
  • Enabled - Whether the label is enabled
  • Removed - Whether the label was removed
  • Method - How the label was applied (manual, automatic, etc.)
  • SetDate - When the label was applied
  • Name - The label display name
  • ContentBits - Content watermark information
  • SensitivityProperty - The raw sensitivity property value

Usage

To use the script, call Get-PurviewLabelFromOfficeFile with the -Path parameter pointing to an Office file:

Get-PurviewLabelFromOfficeFile -Path "C:\path\to\document.docx"

The script works with .docx, .xlsx, .pptx, and other Office Open XML format files. It will not work with encrypted or rights-protected files.