Overview
SCU Cost Estimate
This agent typically consumes 0.01 to 1 SCUs per analysis run, depending on the volume of classification data and the analysis mode selected. Deep analysis mode may use more SCUs.
Introduction
Classification Optimizer helps you get the most out of your Microsoft Purview data classification. If you've ever wondered why certain sensitive information keeps slipping through, or why you're drowning in false positives, this agent is for you. It analyzes how your Sensitive Information Types (SITs) are actually performing in the real world, finds patterns you didn't know existed, and tells you exactly how to improve your classification accuracy.
What It Does
Analyzes real-world SIT performance based on actual detection data, not just theory
Finds co-occurrence patterns showing which sensitive types appear together
Identifies classification gaps where important data isn't being detected
Spots redundant classifiers that overlap and create unnecessary complexity
Recommends new composite SITs based on patterns that consistently appear together
Suggests parameter tuning to reduce false positives and improve accuracy
Identifies trainable classifier candidates for complex, context-dependent patterns
Provides prioritized recommendations with statistical backing (support, lift, confidence)
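The three metrics named above are standard association-rule measures. As a reference sketch only, assuming N is the number of analyzed items, n(A) is the number of items where SIT A was detected, and n(A, B) is the number where both A and B were detected (the agent's exact definitions are not documented here), they are commonly written as:

```latex
\begin{align*}
\mathrm{support}(A) &= \frac{n(A)}{N}, \qquad
\mathrm{support}(A, B) = \frac{n(A, B)}{N} \\
\mathrm{confidence}(A \Rightarrow B) &= \frac{n(A, B)}{n(A)} \\
\mathrm{lift}(A, B) &= \frac{\mathrm{support}(A, B)}{\mathrm{support}(A)\,\mathrm{support}(B)}
  = \frac{N \, n(A, B)}{n(A)\, n(B)}
\end{align*}
```

A lift above 1 means the two SITs appear together more often than independent detections would predict. A lift near 1 suggests no meaningful association.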
Use Cases
1. Reducing False Positives
Your DLP policies are firing constantly, but half the alerts aren't real issues. Users are getting frustrated with blocking that doesn't make sense. Classification Optimizer analyzes which SITs are causing problems and recommends specific parameter adjustments (confidence levels, instance counts, thresholds) to improve precision without sacrificing protection.
2. Improving Detection Accuracy
Important sensitive data is getting through your policies. You suspect your classifiers aren't catching everything they should. The agent identifies coverage gaps, analyzes detection patterns, and recommends new SIT combinations or trainable classifiers to catch what you're currently missing.
3. Simplifying Complex Classification Schemes
Over time, you've accumulated dozens or hundreds of SITs, and nobody knows which ones are actually valuable anymore. Classification Optimizer shows you which classifiers are redundant, which consistently appear together (and should be combined), and which aren't detecting anything useful. Finally clean up that classifier sprawl.
4. Building Better Composite Classifiers
You know certain types of sensitive data tend to appear together (like passport numbers with birth dates), but creating the right composite SITs manually is guesswork. The agent analyzes co-occurrence patterns with statistical metrics, then recommends exactly which SITs should be combined and with what parameters.
5. Meeting Regulatory Requirements More Effectively
Compliance frameworks require specific data protections, but your current SITs aren't aligned with those requirements. Classification Optimizer identifies strategic gaps tied to regulatory needs and recommends new classifiers or trainable classifiers to close those gaps.
Why Classification Optimizer?
Challenge: False positive overload: DLP alerts everywhere, but most aren't real issues
Solution: Precision tuning: Specific parameter recommendations to reduce noise while maintaining protection
Challenge: Important data slipping through: Your policies aren't catching everything they should
Solution: Gap analysis: Identifies what's being missed and recommends new detection patterns
Challenge: Classifier chaos: Too many SITs, unclear which ones matter
Solution: Usage analytics: Shows which classifiers are actually valuable vs redundant
Challenge: Manual guesswork: Building composite SITs based on intuition instead of data
Solution: Pattern discovery: Statistical analysis reveals which SITs consistently co-occur
Challenge: Time-consuming analysis: Manually reviewing classification effectiveness takes forever
Solution: Automated insights: Complete analysis with prioritized recommendations in minutes
Challenge: No strategic direction: Unclear where to focus classification improvement efforts
Solution: Prioritized roadmap: Recommendations ranked by impact with supporting metrics
How It Works
What goes in:
Purview alert data and detection events from your specified time range (default 30 days)
Existing SIT configurations and detection patterns
Classification analytics and usage data
Security events showing actual SIT detections
SharePoint, Exchange, and file classification data
What it does:
Calculates baseline metrics for each SIT (detection frequency, distribution)
Builds a co-occurrence matrix showing which SITs appear together
Applies statistical analysis (support, lift, conditional probability)
Identifies patterns, gaps, and optimization opportunities
Generates recommendations with technical justification and priority ranking
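The co-occurrence and statistics steps above can be illustrated with a minimal Python sketch. This is not the agent's implementation: the input shape (one set of SIT names per scanned item), the sample data, and the function name `cooccurrence_stats` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample data: each set lists the SITs detected in one item.
detections = [
    {"Passport Number", "Date of Birth"},
    {"Passport Number", "Date of Birth", "Credit Card Number"},
    {"Credit Card Number"},
    {"Passport Number", "Date of Birth"},
    {"Credit Card Number", "Date of Birth"},
]


def cooccurrence_stats(detections):
    """Return support, confidence, and lift for every SIT pair seen together."""
    n = len(detections)
    single = Counter()  # baseline: how many items each SIT appears in
    pair = Counter()    # co-occurrence counts, keyed by a sorted (a, b) tuple
    for sits in detections:
        single.update(sits)
        pair.update(combinations(sorted(sits), 2))

    stats = {}
    for (a, b), count in pair.items():
        stats[(a, b)] = {
            "support": count / n,                         # share of items with both
            "confidence": count / single[a],              # P(b detected | a detected)
            "lift": count * n / (single[a] * single[b]),  # > 1: more often than chance
        }
    return stats


stats = cooccurrence_stats(detections)

# Rank pairs by lift, strongest association first.
ranked = sorted(stats.items(), key=lambda kv: kv[1]["lift"], reverse=True)
for (a, b), m in ranked:
    print(f"{a} + {b}: support={m['support']:.2f}, "
          f"confidence={m['confidence']:.2f}, lift={m['lift']:.2f}")
```

In this toy data, the passport number and date of birth pair has the highest lift, which is the kind of signal that would make it a candidate for a composite SIT. Real analysis would also need minimum-support thresholds so that rare pairs with inflated lift are not over-weighted.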
What you get:
Baseline SIT performance metrics
Co-occurrence patterns with statistical significance
New composite SIT recommendations with suggested parameters
Parameter tuning guidance for existing SITs
Trainable classifier candidates for complex patterns
Policy optimization suggestions (scoping, rule tuning, groupings)
Prioritized action plan with expected impact
Debug output (optional) showing detailed analysis steps