# Overview

> **SCU Cost Estimate**
>
> This agent typically consumes **0.01 to 1 SCUs** per analysis run, depending on the volume of classification data and the depth of analysis mode selected. Deep analysis mode may use more SCUs.

### Introduction

Classification Optimizer helps you get the most out of your Microsoft Purview data classification. If you've ever wondered why certain sensitive information keeps slipping through, or why you're drowning in false positives, this agent is for you. It analyzes how your Sensitive Information Types (SITs) are actually performing in the real world, finds patterns you didn't know existed, and tells you exactly how to improve your classification accuracy.

<figure><img src="/files/Bke8KpkqROcIXpUGAayr" alt=""><figcaption></figcaption></figure>

<div><figure><img src="/files/cTDaONgLyog1MwRygTYt" alt=""><figcaption></figcaption></figure> <figure><img src="/files/u5HcJYhTNeBm85f5Xjju" alt=""><figcaption></figcaption></figure> <figure><img src="/files/kqIU3kyBaXhZneJM0h8b" alt=""><figcaption></figcaption></figure></div>

### What It Does

* **Analyzes real-world SIT performance** based on actual detection data, not just theory
* **Finds co-occurrence patterns** showing which sensitive types appear together
* **Identifies classification gaps** where important data isn't being detected
* **Spots redundant classifiers** that overlap and create unnecessary complexity
* **Recommends new composite SITs** based on patterns that consistently appear together
* **Suggests parameter tuning** to reduce false positives and improve accuracy
* **Identifies trainable classifier candidates** for complex, context-dependent patterns
* **Provides prioritized recommendations** with statistical backing (support, lift, confidence)

### Use Cases

#### 1. Reducing False Positives

Your DLP policies are firing constantly, but half the alerts aren't real issues. Users are getting frustrated with blocking that doesn't make sense. Classification Optimizer analyzes which SITs are causing problems and recommends specific parameter adjustments (confidence levels, instance counts, thresholds) to improve precision without sacrificing protection.

#### 2. Improving Detection Accuracy

Important sensitive data is getting through your policies. You suspect your classifiers aren't catching everything they should. The agent identifies coverage gaps, analyzes detection patterns, and recommends new SIT combinations or trainable classifiers to catch what you're currently missing.

#### 3. Simplifying Complex Classification Schemes

Over time, you've accumulated dozens or hundreds of SITs, and nobody knows which ones are actually valuable anymore. Classification Optimizer shows you which classifiers are redundant, which consistently appear together (and should be combined), and which aren't detecting anything useful. Finally, you can clean up that classifier sprawl.

#### 4. Building Better Composite Classifiers

You know certain types of sensitive data tend to appear together (like passport numbers with birth dates), but creating the right composite SITs manually is guesswork. The agent analyzes co-occurrence patterns with statistical metrics, then recommends exactly which SITs should be combined and with what parameters.

#### 5. Meeting Regulatory Requirements More Effectively

Compliance frameworks require specific data protections, but your current SITs aren't aligned with those requirements. Classification Optimizer identifies strategic gaps tied to regulatory needs and recommends new classifiers or trainable classifiers to close those gaps.

### Why Classification Optimizer?

| The Problem You're Dealing With                                                            | How This Helps                                                                                        |
| ------------------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------- |
| **False positive overload**: DLP alerts everywhere, but most aren't real issues            | **Precision tuning**: Specific parameter recommendations to reduce noise while maintaining protection |
| **Important data slipping through**: Your policies aren't catching everything they should  | **Gap analysis**: Identifies what's being missed and recommends new detection patterns                |
| **Classifier chaos**: Too many SITs, unclear which ones matter                             | **Usage analytics**: Shows which classifiers are actually valuable vs redundant                       |
| **Manual guesswork**: Building composite SITs based on intuition instead of data           | **Pattern discovery**: Statistical analysis reveals which SITs consistently co-occur                  |
| **Time-consuming analysis**: Manually reviewing classification effectiveness takes forever | **Automated insights**: Complete analysis with prioritized recommendations in minutes                 |
| **No strategic direction**: Unclear where to focus classification improvement efforts      | **Prioritized roadmap**: Recommendations ranked by impact with supporting metrics                     |

### How It Works

**What goes in:**

* Purview alert data and detection events from your specified time range (default 30 days)
* Existing SIT configurations and detection patterns
* Classification analytics and usage data
* Security events showing actual SIT detections
* SharePoint, Exchange, and file classification data

**What it does:**

* Calculates baseline metrics for each SIT (detection frequency, distribution)
* Builds a co-occurrence matrix showing which SITs appear together
* Applies statistical analysis (support, lift, and confidence, i.e. conditional probability)
* Identifies patterns, gaps, and optimization opportunities
* Generates recommendations with technical justification and priority ranking
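
To make the co-occurrence metrics concrete, here is a minimal Python sketch of how support, confidence, and lift can be computed for pairs of SITs. It is not the agent's actual implementation, which this page does not document. The function name `cooccurrence_stats`, the input shape (one set of detected SIT names per item), and the sample data are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_stats(detections):
    """Compute support, confidence, and lift for every co-occurring SIT pair.

    `detections` is a list of sets; each set holds the SIT names detected in
    one item (document, email, ...). This input shape is hypothetical.
    """
    n = len(detections)
    single = Counter()  # number of items containing each SIT
    pair = Counter()    # number of items containing both SITs of a pair
    for sits in detections:
        single.update(sits)
        # Sort so each pair is always stored in the same (a, b) order.
        pair.update(combinations(sorted(sits), 2))

    stats = {}
    for (a, b), count in pair.items():
        support = count / n                  # share of items with both a and b
        confidence = count / single[a]       # conditional probability P(b | a)
        lift = confidence / (single[b] / n)  # > 1: together more often than chance
        stats[(a, b)] = {
            "support": support,
            "confidence": confidence,
            "lift": lift,
        }
    return stats


# Hypothetical sample: four items and the SITs detected in each.
sample = [
    {"Passport Number", "Date of Birth"},
    {"Passport Number", "Date of Birth"},
    {"Passport Number"},
    {"Credit Card Number"},
]
result = cooccurrence_stats(sample)
```

In this sample, the pair ("Date of Birth", "Passport Number") appears in 2 of 4 items, giving a support of 0.5. Every item with a birth date also contains a passport number, so the confidence is 1.0. The lift of about 1.33 means the two appear together more often than independent occurrence would predict, which is the kind of signal that could justify a composite SIT.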

**What you get:**

* Baseline SIT performance metrics
* Co-occurrence patterns with statistical significance
* New composite SIT recommendations with suggested parameters
* Parameter tuning guidance for existing SITs
* Trainable classifier candidates for complex patterns
* Policy optimization suggestions (scoping, rule tuning, groupings)
* Prioritized action plan with expected impact
* Debug output (optional) showing detailed analysis steps


