Sector Aggregation

The sector heatmap gives a one-page view of how each GICS sector stacks up across eight fundamental metrics. Each cell is a single aggregate number — median, mean, or market-cap-weighted mean — representing that sector's typical value for that metric. Direction-aware shading makes the best and worst cells immediately obvious without requiring the reader to remember which metrics are "higher is better."

Aggregation modes

Three aggregation modes are supported, selectable via the toggle in the heatmap UI:

Median

The 50th-percentile value across all companies in the sector that have a computable metric value for the period. Median is the default mode and the most robust: it is unaffected by single-company extremes, which are common in fundamentals data (negative-equity firms, one-time write-downs, micro-cap outliers).

Mean

Simple arithmetic mean across the same set of companies. Every company counts equally regardless of size, so the mean tracks the sector total (the mean times the company count recovers the sum) but is sensitive to outliers. A single company with an anomalous value can move the mean significantly while the median stays stable.
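To make the difference between the two modes concrete, here is a minimal sketch using Python's standard library. The margin values are invented for illustration and are not EqtyTrk data.

```python
from statistics import mean, median

# Hypothetical operating margins for five companies in one sector.
# The last value stands in for a one-time write-down.
margins = [0.12, 0.15, 0.14, 0.13, -2.50]

sector_median = median(margins)  # middle of the sorted values: 0.13
sector_mean = mean(margins)      # dragged negative by the single outlier

print(f"median={sector_median:.3f} mean={sector_mean:.3f}")
```

Four of the five companies have margins between 12% and 15%, yet the mean is strongly negative, while the median still lands inside that range.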

Weighted market cap

Each company's metric value is weighted by its market cap:

$$\bar{x}_{\text{sector}} = \frac{\sum_i mc_i \, x_i}{\sum_i mc_i}$$

where the sum runs over all companies in the sector with both a non-None metric value and a positive market cap. This mode answers the question "what is the typical metric value for a dollar invested in this sector?" — large-cap leaders dominate the estimate. If fewer than two companies in a sector-metric cell have both a metric value and a market cap (the _MIN_WEIGHTED_N = 2 guard), the cell falls back to the simple median.
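The weighting and its fallback can be sketched as follows. This is an illustration under stated assumptions, not the code in src/eqtytrk/sectors/aggregation.py: the function name weighted_mcap_cell and the (metric_value, market_cap) tuple layout are hypothetical, and only the _MIN_WEIGHTED_N = 2 guard comes from the text above.

```python
from statistics import median
from typing import Optional, Sequence, Tuple

_MIN_WEIGHTED_N = 2  # guard value quoted in the text above

Row = Tuple[Optional[float], Optional[float]]  # (metric_value, market_cap), hypothetical layout


def weighted_mcap_cell(rows: Sequence[Row]) -> Optional[float]:
    """Market-cap-weighted mean for one sector-metric cell, with median fallback."""
    # Keep only companies with a metric value and a positive market cap.
    usable = [(x, mc) for x, mc in rows
              if x is not None and mc is not None and mc > 0]
    if len(usable) < _MIN_WEIGHTED_N:
        # Too little cap coverage: fall back to the simple median of metric values.
        values = [x for x, _ in rows if x is not None]
        return median(values) if values else None
    total_cap = sum(mc for _, mc in usable)
    return sum(x * mc for x, mc in usable) / total_cap


# Invented example: one large company dominates the weighted estimate.
covered = [(0.40, 3000.0), (0.10, 100.0), (0.20, None)]
# Only one company has a market cap, so the cell falls back to the median.
thin = [(0.40, 3000.0), (0.10, None), (0.20, None)]
```

In the covered example the result sits close to the large company's 0.40, which is the "dollar invested" interpretation described above. In the thin example the function returns the median of all three metric values.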

Minimum-N rule

A cell is suppressed (displayed as "—" with no background color) when fewer than three companies in the sector have a computable value for that metric. This threshold (MIN_N = 3) is applied after aggregation, in the cache-write step. The purpose is to prevent a one- or two-company sector from producing a cell that looks like an industry representative but is actually a single-name view.

For weighted_mcap, the minimum-N check is on companies with metric values (same as median/mean), not on companies with both metric values and market caps. A warning is logged when a sector falls back from weighted_mcap to median due to insufficient cap coverage.
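A minimal sketch of the suppression step, assuming each aggregated cell carries the count of companies that had a metric value. The function name suppress_thin_cells and the dictionary layout are hypothetical; only MIN_N = 3 and the "applied after aggregation" ordering come from the text.

```python
from typing import Dict, Optional, Tuple

MIN_N = 3  # threshold quoted in the text above

CellKey = Tuple[str, str]      # (sector, metric), hypothetical layout
RawCell = Tuple[float, int]    # (aggregate value, companies with a metric value)


def suppress_thin_cells(cells: Dict[CellKey, RawCell]) -> Dict[CellKey, Optional[float]]:
    """Blank out cells backed by fewer than MIN_N companies.

    Runs after aggregation. The count is of companies with a metric value,
    regardless of aggregation mode, so a weighted_mcap cell is judged on the
    same basis as a median or mean cell.
    """
    return {key: (value if n >= MIN_N else None)
            for key, (value, n) in cells.items()}


# Invented example: the two-company cell is suppressed (None is shown as a dash).
raw = {
    ("Utilities", "roic"): (0.07, 28),
    ("Real Estate", "roic"): (0.05, 2),
}
cached = suppress_thin_cells(raw)
```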

The eight heatmap metrics

The heatmap covers eight metrics, chosen to span the key dimensions of fundamental quality:

| Metric | Dimension | Direction |
| --- | --- | --- |
| gross_margin | Pricing power | Higher is better |
| operating_margin | Operating efficiency | Higher is better |
| roic | Capital allocation | Higher is better |
| roe | Equity returns | Higher is better |
| fcf_margin | Cash generation | Higher is better |
| revenue_growth_3y_cagr | Growth | Higher is better |
| debt_to_ebitda | Leverage | Lower is better |
| pe_ratio | Valuation | Lower is better |

Direction-aware shading

Within each column, cells are ranked by percentile and colored on a continuous scale from forest green (best) to brick red (worst). The direction of "best" is metric-specific:

  • For six of the eight metrics, higher values are better (forest = highest, brick = lowest).
  • For debt_to_ebitda and pe_ratio, lower values are better; the percentile scale is inverted so the sector with the lowest debt load and the cheapest valuation gets the forest shading.

The color intensity is proportional to distance from the column midpoint (the 50th percentile). Cells within ±5 percentile points of the midpoint, that is, between the 45th and 55th percentiles, are rendered in a neutral gray; cells above the 55th percentile shade toward forest and cells below the 45th toward brick, with intensity increasing toward the extremes.

A small "↓ better" label appears below the value in debt_to_ebitda and pe_ratio cells to signal the inverted convention.
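The shading rule can be sketched in Python, although the actual logic lives in the frontend (SectorsPage.tsx). Everything here is an assumption for illustration: the function name shade_column, the rank-based percentile definition, and the linear intensity. Only the LOWER_BETTER members and the 45th/55th percentile thresholds are taken from the text.

```python
from typing import Dict, Optional, Tuple

LOWER_BETTER = {"debt_to_ebitda", "pe_ratio"}


def shade_column(metric: str,
                 values: Dict[str, Optional[float]]) -> Dict[str, Tuple[str, float]]:
    """Map each sector to (shade, intensity) for one heatmap column."""
    present = {s: v for s, v in values.items() if v is not None}
    # Order from worst to best. For lower-is-better metrics the largest
    # value is the worst, so the sort is reversed.
    ordered = sorted(present, key=present.get, reverse=metric in LOWER_BETTER)
    n = len(ordered)
    # Suppressed cells get no background color.
    shades: Dict[str, Tuple[str, float]] = {s: ("none", 0.0) for s in values}
    for rank, sector in enumerate(ordered):
        pct = 100.0 * rank / (n - 1) if n > 1 else 50.0
        intensity = abs(pct - 50.0) / 50.0  # 0 at the midpoint, 1 at the extremes
        if pct > 55.0:
            shades[sector] = ("forest", intensity)
        elif pct < 45.0:
            shades[sector] = ("brick", intensity)
        else:
            shades[sector] = ("gray", intensity)
    return shades


# Invented leverage figures: the lowest debt load should get the forest shade.
leverage = {"Tech": 1.0, "Energy": 3.0, "Utilities": 5.0, "Thin": None}
shaded = shade_column("debt_to_ebitda", leverage)
```

Passing the same values under a higher-is-better metric such as roic flips the result, so the sector with the value 5.0 becomes forest instead of brick.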

Cache architecture

Computing sector aggregates requires pulling facts for ~500 companies, computing metrics in memory for each, and aggregating by sector — a multi-second operation unsuitable for a hot API path. EqtyTrk uses a persistent cache table (sector_medians_cache) keyed on (index_id, aggregation, period).

The cache key format is "{index_id}:{aggregation}:{period}". The period field is always "fy_latest" (latest completed fiscal year per company). Each aggregation mode is cached separately; the UI toggle switches between pre-computed rows rather than triggering a recompute.
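As a small illustration of the key format, the helper below builds one key per aggregation mode. The helper name, the index identifier "sp500", and the mode strings "median" and "mean" are assumptions; weighted_mcap and fy_latest are the names used in this document.

```python
AGGREGATIONS = ("median", "mean", "weighted_mcap")  # first two names are assumed


def cache_key(index_id: str, aggregation: str, period: str = "fy_latest") -> str:
    """Build a key in the "{index_id}:{aggregation}:{period}" format."""
    if aggregation not in AGGREGATIONS:
        raise ValueError(f"unknown aggregation mode: {aggregation!r}")
    return f"{index_id}:{aggregation}:{period}"


# One pre-computed row per mode; the UI toggle only changes which key is read.
keys = [cache_key("sp500", mode) for mode in AGGREGATIONS]
```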

Refresh cadence

The cache is refreshed by the recompute Lambda (eqtytrk-recompute-cache) via two pathways:

  1. Daily EventBridge backstop. Fires at 08:00 UTC (04:00 EDT, or 03:00 EST in winter, after the prior trading day's data has settled). This is the safety-net path that guarantees at most 24-hour staleness.
  2. Post-ingest async invoke. After each successful ingest job, the ingest worker fires an asynchronous Lambda invocation of the recompute handler. This means the heatmap is typically updated within minutes of new filings being ingested.

A self-debounce mechanism (240-second window) prevents redundant or overlapping recomputes when multiple ingest jobs complete in a short burst.
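One way such a debounce check could look is sketched below. Where the handler stores its last-run timestamp is not described here, so the last_started_at argument and the function name should_recompute are hypothetical; only the 240-second window comes from the text.

```python
import time
from typing import Optional

DEBOUNCE_SECONDS = 240  # window quoted in the text above


def should_recompute(last_started_at: Optional[float],
                     now: Optional[float] = None) -> bool:
    """Return True if no recompute has started within the debounce window.

    last_started_at is a Unix timestamp of the previous run, or None if the
    handler has never run. Its storage location is an assumption.
    """
    current = time.time() if now is None else now
    if last_started_at is None:
        return True
    return (current - last_started_at) >= DEBOUNCE_SECONDS


# Invented burst: three ingest jobs finish within two minutes of each other.
first = should_recompute(None, now=1000.0)     # runs
second = should_recompute(1000.0, now=1060.0)  # skipped, 60 s after the first
third = should_recompute(1000.0, now=1120.0)   # skipped, 120 s after the first
```

A skipped invocation loses nothing permanently under this scheme, since the daily EventBridge backstop still bounds staleness at 24 hours.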

The cache is written after both the metrics cache (company_metrics_cache) and betas (companies.beta_5y_monthly_vs_spy) have been refreshed, ensuring that pe_ratio (which requires market cap from companies) and roic (which requires prior-year equity facts) are computed from fresh inputs.

Limitations

  • Latest FY only. The heatmap represents each company's most recently completed fiscal year. Sectors with significant calendar-year misalignment (some companies' FY ends in March, others in December) mix snapshots from different points in the macroeconomic cycle.
  • Equal-weight count for median/mean. A 500-company sector and a 20-company sector both produce one cell per metric, but the statistical reliability of the 20-company sector median is much lower.
  • PE ratio sign convention. pe_ratio requires a positive market cap and positive net income. Companies with negative net income (loss-making) are excluded from the PE cell, which biases the sector PE upward (only profitable companies contribute). Cyclical sectors with many loss-makers in downturns will show artificially high PE.
  • GICS source lag. Sector classification is sourced from iShares CSV files (updated as ETF holdings change). Companies that were recently reclassified may sit in the wrong sector bucket for up to a day after the CSV refreshes.
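The PE ratio sign convention in the list above can be illustrated with a short sketch. It assumes pe_ratio is computed as market cap divided by net income, which is the standard definition but is not spelled out in this document; the function name and the figures are invented.

```python
from statistics import median
from typing import List, Optional, Tuple


def pe_ratio(market_cap: Optional[float], net_income: Optional[float]) -> Optional[float]:
    """Return market_cap / net_income, or None when either input is missing or non-positive."""
    if market_cap is None or net_income is None:
        return None
    if market_cap <= 0 or net_income <= 0:
        return None
    return market_cap / net_income


# Invented (market_cap, net_income) pairs; the third company is loss-making.
companies: List[Tuple[float, float]] = [(1000.0, 100.0), (500.0, 25.0), (800.0, -40.0)]

ratios = [pe_ratio(mc, ni) for mc, ni in companies]
included = [r for r in ratios if r is not None]
sector_pe = median(included)  # built from the two profitable companies only
```

The loss-making company drops out before aggregation, so the sector cell reflects only the profitable names.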

Implementation notes

  • compute_sector_medians() (pure aggregation logic) in src/eqtytrk/sectors/aggregation.py
  • compute_and_cache_sector_matrix() (full compute + DB write path) in src/eqtytrk/sectors/cache.py
  • HEATMAP_METRICS list and MIN_N = 3 constant defined in src/eqtytrk/sectors/cache.py
  • LOWER_BETTER set (containing debt_to_ebitda and pe_ratio) and direction-aware color logic in frontend/src/routes/SectorsPage.tsx
  • Recompute Lambda entry point: src/eqtytrk/worker/recompute_handler.py
  • EventBridge schedule: cron(0 8 * * ? *) in infra/ingest-worker.yml

EqtyTrk methodology reference. Data from SEC EDGAR.