Anomaly flagging

The "what stands out" view for a single company. Surfaces metrics where the subject is significantly different from its peer set — using robust statistics so the outliers we're flagging don't blow up the threshold itself.

Robust z-score

For each metric independently, with $s$ = subject's value and $p = (p_{1}, \dots, p_{n})$ = peer values:

$$z = \frac{s - \tilde{p}}{1.4826 \cdot \mathrm{MAD}(p)}$$

where $\tilde{p}$ is the median of the peer values and:

$$\mathrm{MAD}(p) = \mathrm{median}\left( | p_{i} - \tilde{p} | \right)$$

is the median absolute deviation. The constant $1.4826 \approx 1 / \Phi^{-1}(0.75)$ rescales MAD so that, for Gaussian peer values, $1.4826 \cdot \mathrm{MAD}$ is a consistent estimator of the population standard deviation. This lets you read $z$ on the same scale as a classical z-score (i.e. $| z | \geq 2$ ≈ "two standard deviations away") even though the underlying statistics are robust.
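As a minimal sketch of the formula above (not the project's actual implementation), the robust z-score can be computed with the standard library alone. The function name `robust_z` and the sample peer values are illustrative assumptions. Note that `statistics.median` averages the two middle values for even $n$, which may differ from the convention the real code uses.

```python
from statistics import median

MAD_SCALE = 1.4826  # approximately 1 / Phi^{-1}(0.75)


def robust_z(subject: float, peers: list[float]) -> float:
    """Robust z-score of `subject` against `peers` (hypothetical helper)."""
    center = median(peers)                          # peer median, p-tilde
    mad = median(abs(p - center) for p in peers)    # median absolute deviation
    return (subject - center) / (MAD_SCALE * mad)   # undefined when mad == 0


# Illustrative peer set: median 12, MAD 1.
peers = [10.0, 11.0, 12.0, 13.0, 14.0]
z = robust_z(18.0, peers)
print(round(z, 2))  # 4.05
```

Here the subject sits 6 units above the peer median and the scaled MAD is 1.4826, so $z = 6 / 1.4826 \approx 4.05$.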

Why robust statistics, not mean + stdev

The whole point of anomaly flagging is to find outliers. Using mean + standard deviation defeats that:

A single huge outlier inflates the standard deviation, which raises the threshold, which masks the next-most-anomalous value.
The mean itself shifts toward the outlier, distorting "typical."

Median + MAD has a breakdown point of 50%: just under half the peers can be arbitrarily corrupted before the statistic can be pushed arbitrarily far. Mean + stdev has a breakdown point of 0%, so a single sufficiently extreme value is enough to distort both. Cross-sectional finance peer sets routinely contain post-merger goodwill spikes, NOL-distorted ROEs, and accounting one-offs; non-robust statistics flag the wrong things.
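The masking effect described above can be illustrated with a small, made-up peer set containing one extreme value (500) and one moderate outlier (30). This is a sketch under assumed data, scoring each value against the full set purely for demonstration; the helper names are hypothetical.

```python
from statistics import mean, median, stdev

# Made-up values: a tight cluster, a moderate outlier (30), an extreme one (500).
values = [10.0, 11.0, 12.0, 13.0, 14.0, 30.0, 500.0]


def classical_z(x: float, data: list[float]) -> float:
    """Classical z-score using mean and sample standard deviation."""
    return (x - mean(data)) / stdev(data)


def robust_z(x: float, data: list[float]) -> float:
    """Robust z-score using median and scaled MAD."""
    center = median(data)
    mad = median(abs(v - center) for v in data)
    return (x - center) / (1.4826 * mad)


# The extreme value inflates mean and stdev, so 30 looks unremarkable
# under the classical score (|z| well below 2) ...
print(abs(classical_z(30.0, values)) < 2.0)  # True
# ... while the robust score still separates it clearly from the cluster.
print(abs(robust_z(30.0, values)) > 2.0)     # True
```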

Threshold and gates

EqtyTrk uses $| z | \geq 2.0$ as the default flag threshold (configurable via the ?threshold= query param). The endpoint also enforces two skip conditions before computing $z$ :

Min peers. A metric is skipped when fewer than 5 peers have valid (non-None) values for that metric. Below this, both the median and MAD estimates are too noisy to be meaningful.
MAD = 0. When all valid peer values are identical, MAD is zero and $z$ would be undefined (or $\pm \infty$ for any non-equal subject value). The metric is skipped rather than flagged.

The response is sorted by $| z |$ descending so the most-anomalous metrics surface first.
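The two skip conditions and the final sort can be sketched as follows. This is an illustration under assumptions, not the endpoint's code: `score_metric`, `flag_anomalies`, and the sample data are hypothetical, and `statistics.median` is used for simplicity.

```python
from statistics import median
from typing import Optional


def score_metric(subject: float, peer_values: list, *, min_peers: int = 5) -> Optional[float]:
    """Return the robust z-score, or None when a gate skips the metric."""
    valid = [p for p in peer_values if p is not None]
    if len(valid) < min_peers:      # gate 1: too few valid peers
        return None
    center = median(valid)
    mad = median(abs(p - center) for p in valid)
    if mad == 0:                    # gate 2: z would be undefined
        return None
    return (subject - center) / (1.4826 * mad)


def flag_anomalies(subject_metrics: dict, peer_metrics: dict, *,
                   threshold: float = 2.0, min_peers: int = 5) -> list:
    """Collect (metric, z) pairs with |z| >= threshold, most anomalous first."""
    flagged = []
    for name, subject in subject_metrics.items():
        z = score_metric(subject, peer_metrics.get(name, []), min_peers=min_peers)
        if z is not None and abs(z) >= threshold:
            flagged.append((name, z))
    return sorted(flagged, key=lambda item: abs(item[1]), reverse=True)


subject = {"roe": 0.40, "gross_margin": 0.50, "payout": 0.30, "net_debt_to_ebitda": 6.0}
peers = {
    "roe": [0.10, 0.12, 0.11, 0.13, 0.09, None],     # 5 valid peers: scored
    "gross_margin": [0.4, 0.4, 0.4, 0.4, 0.4],       # MAD = 0: skipped
    "payout": [0.20, 0.25, None, None, 0.30, 0.35],  # 4 valid peers: skipped
    "net_debt_to_ebitda": [1.0, 2.0, 3.0, 4.0, 5.0],
}
flags = flag_anomalies(subject, peers)
print([name for name, _ in flags])  # ['roe', 'net_debt_to_ebitda']
```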

Direction-aware interpretation

Whether an anomaly is good news or concerning depends on the metric's direction (see MetricDoc.direction in src/eqtytrk/metrics/metadata.py):

Metric direction	Anomaly above peers ( $z > 0$ )	Anomaly below peers ( $z < 0$ )
`higher_better` (e.g. ROE, FCF margin)	Good — green	Concerning — red
`lower_better` (e.g. net debt / EBITDA, P/E)	Concerning — red	Good — green
`context` (e.g. dividend yield, growth rates)	Neutral — gray	Neutral — gray

The frontend AnomaliesPanel colors each row using this rule. A high-leverage company whose net_debt_to_ebitda has a robust z-score of +3 (three scaled-MAD units above the peer median) gets a red badge; a high-margin company whose gross_margin has a z-score of +3 gets green.
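The coloring rule in the table above amounts to a small lookup on direction and the sign of $z$. The sketch below expresses it in Python for illustration only; the real rule lives in the frontend component, and the function name `badge_color` is an assumption.

```python
def badge_color(direction: str, z: float) -> str:
    """Map a metric direction and signed z-score to a badge color."""
    if direction == "context":
        return "gray"                       # neutral either way
    above_peers = z > 0
    if direction == "higher_better":
        return "green" if above_peers else "red"
    if direction == "lower_better":
        return "red" if above_peers else "green"
    raise ValueError(f"unknown direction: {direction!r}")


print(badge_color("lower_better", 3.0))    # red   (e.g. net_debt_to_ebitda above peers)
print(badge_color("higher_better", 3.0))   # green (e.g. gross_margin above peers)
print(badge_color("context", -2.5))        # gray
```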

Endpoint

GET /v1/companies/{ticker}/anomalies

Param	Default	Notes
`period`	`5y`	Used only for FY-end resolution.
`method`	`sub_industry`	Peer construction method.
`index`	(auto-recommend)	Index for peer pool.
`threshold`	`2.0`	Minimum $| z |$ to flag.
`min_peers`	`5`	Below this, metric is skipped.

Response: each entry returns subject_value, peer_median, peer_mad, peer_count, signed z_score, and direction so the caller can color appropriately and explain "subject is X vs peer median Y, n=Z peers."
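A caller might build the request URL and the explanatory sentence along these lines. This is a sketch only: the base URL `https://api.example.com`, the ticker, the helper names, and the sample entry values are all hypothetical, and the entry is shown as a plain dict holding the fields listed above.

```python
from urllib.parse import quote, urlencode


def anomalies_url(base_url: str, ticker: str, **params) -> str:
    """Build the anomalies endpoint URL with optional query parameters."""
    path = f"/v1/companies/{quote(ticker, safe='')}/anomalies"
    query = urlencode(params)
    return f"{base_url.rstrip('/')}{path}" + (f"?{query}" if query else "")


def describe(entry: dict) -> str:
    """Render the 'subject is X vs peer median Y, n=Z peers' explanation."""
    return (f"subject is {entry['subject_value']} vs peer median "
            f"{entry['peer_median']}, n={entry['peer_count']} peers")


url = anomalies_url("https://api.example.com", "ACME", threshold=2.5, min_peers=8)
print(url)  # https://api.example.com/v1/companies/ACME/anomalies?threshold=2.5&min_peers=8

# Illustrative entry; the values are made up.
entry = {"subject_value": 0.4, "peer_median": 0.11, "peer_mad": 0.01,
         "peer_count": 5, "z_score": 19.56, "direction": "higher_better"}
print(describe(entry))  # subject is 0.4 vs peer median 0.11, n=5 peers
```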

Limitations

Peer choice dominates. A subject with sub_industry peers (mono-sector by construction) is being compared to a tighter group than one with size_band peers. Anomalies surfaced under one peer method may vanish under another. Try both for confidence.
Cross-sectional only. This is a snapshot at the latest FY end, not a time series. A persistent anomaly across multiple periods is more informative than a one-off; the current view doesn't compute that.
No multiple-comparisons correction. Scanning 60+ metrics and reporting those with $| z | \geq 2$ implicitly inflates the false-positive rate. Treat the panel as a starting list of hypotheses, not as confirmed signals.
MAD scaling assumption. The $1.4826$ factor assumes peer values are roughly Gaussian. For metrics with heavy-tailed distributions (e.g. P/E across a peer set with both growth and value names), $z$ values are still ordinally meaningful but the "two standard deviations" reading is approximate.
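To put a rough number on the multiple-comparisons point, the sketch below computes the two-sided Gaussian tail probability at $| z | = 2$ and scales it to 60 metrics. It assumes Gaussian peer values and, for the "at least one flag" figure, independent metrics; real financial metrics are correlated, so treat the result as an order-of-magnitude guide.

```python
from math import erf, sqrt


def two_sided_tail(z: float) -> float:
    """P(|Z| >= z) for a standard normal Z."""
    return 1.0 - erf(z / sqrt(2.0))


n_metrics = 60
p_flag = two_sided_tail(2.0)                 # about 0.0455 per metric
expected_false_flags = n_metrics * p_flag    # about 2.7 flags by chance alone
p_at_least_one = 1 - (1 - p_flag) ** n_metrics  # about 0.94, assuming independence

print(round(p_flag, 4), round(expected_false_flags, 1), round(p_at_least_one, 2))
```

In other words, even for a company with nothing unusual about it, a panel of this size would be expected to show a handful of flags at the default threshold.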

Implementation notes

Endpoint handler in src/eqtytrk/api/routers/analysis.py
Per-metric core: _compute_anomaly_for_metric(subject_value, peer_values, *, threshold, min_peers) — pure function, no DB dependency, returns (peer_median, peer_mad, z_score, peer_count) | None
Median computed in pure Python via the "lower of the two middles for even $n$ " convention (the low median, rather than the more common average of the two middles); same for MAD
Frontend: AnomaliesPanel component on CompanyPage, useAnomalies hook in frontend/src/api/hooks.ts
Direction lookup: METRIC_DOCS in src/eqtytrk/metrics/metadata.py
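The "lower of the two middles" convention mentioned above can be sketched in a few lines of pure Python. The function names `low_median` and `low_mad` are hypothetical; the standard library's `statistics.median_low` follows the same convention and is shown for comparison with `statistics.median`.

```python
from statistics import median, median_low


def low_median(values: list[float]) -> float:
    """Median that picks the lower of the two middle values for even n."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("low_median requires at least one value")
    return ordered[(len(ordered) - 1) // 2]


def low_mad(values: list[float]) -> float:
    """Median absolute deviation using the low-median convention throughout."""
    center = low_median(values)
    return low_median([abs(v - center) for v in values])


data = [1.0, 2.0, 3.0, 10.0]
print(low_median(data))   # 2.0
print(median_low(data))   # 2.0  (same convention)
print(median(data))       # 2.5  (averages the two middles)
print(low_mad(data))      # 1.0
```

One consequence of this convention is that the result is always an observed peer value, never an interpolated one.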

References

Hampel, F.R., Ronchetti, E.M., Rousseeuw, P.J., & Stahel, W.A. (1986). Robust Statistics: The Approach Based on Influence Functions. Wiley. (Foundational reference for breakdown points and MAD.)
Leys, C., Ley, C., Klein, O., Bernard, P., & Licata, L. (2013). "Detecting outliers: Do not use standard deviation around the mean, use absolute deviation around the median." Journal of Experimental Social Psychology, 49(4), 764–766.
