
Cache freshness

EqtyTrk pre-computes three expensive data sets and stores them in the database so that API requests return in milliseconds instead of seconds. This page documents what each cache contains, which job populates it, and how stale it can get before a refresh fires.

company_metrics_cache

One row per CIK. Columns: cik, period_end, metrics (JSONB — a flat {metric_id: value} object for every metric in the engine), market_cap, computed_at.

Populated by the eqtytrk-recompute-cache Lambda (eqtytrk.worker.recompute_handler). That Lambda walks every distinct CIK in xbrl_facts, runs compute_all_metrics_for_cik, and upserts the result. With ~500 CIKs and 60+ metrics each, the metrics phase takes roughly four minutes.
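The upsert step can be sketched as follows. This is only an illustration: it uses an in-memory SQLite table as a stand-in for the real database, stores the JSONB column as JSON text, and assumes cik is the conflict key because the table holds one row per CIK. The helper names upsert_metrics and make_demo_db and the metric ID in the usage note are hypothetical.

```python
import json
import sqlite3
from datetime import datetime, timezone


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for company_metrics_cache (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE company_metrics_cache ("
        "cik TEXT PRIMARY KEY, period_end TEXT, metrics TEXT, "
        "market_cap REAL, computed_at TEXT)"
    )
    return conn


def upsert_metrics(conn, cik, period_end, metrics, market_cap):
    """Insert the row for a CIK, or overwrite it if one already exists."""
    conn.execute(
        """
        INSERT INTO company_metrics_cache
            (cik, period_end, metrics, market_cap, computed_at)
        VALUES (?, ?, ?, ?, ?)
        ON CONFLICT (cik) DO UPDATE SET
            period_end  = excluded.period_end,
            metrics     = excluded.metrics,
            market_cap  = excluded.market_cap,
            computed_at = excluded.computed_at
        """,
        (
            cik,
            period_end,
            json.dumps(metrics),  # flat {metric_id: value} object
            market_cap,
            datetime.now(timezone.utc).isoformat(),
        ),
    )
```

Calling upsert_metrics twice for the same CIK, for example with {"gross_margin": 0.44} and then {"gross_margin": 0.45}, leaves a single row holding the second value.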

Two triggers fire the Lambda:

  1. Post-ingest async invoke: after each successful CIK ingest, the eqtytrk-ingest-worker Lambda invokes eqtytrk-recompute-cache asynchronously and does not wait for the result. This keeps the cache warm within ~5 minutes of new data landing.
  2. Daily EventBridge backstop: eqtytrk-recompute-daily fires at cron(0 8 * * ? *) (08:00 UTC, which is 04:00 EDT or 03:00 EST) as a safety net for any tickers that landed overnight without triggering a post-ingest refresh.
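The post-ingest trigger might look roughly like the sketch below. It assumes the ingest worker holds a boto3 Lambda client and passes it in; the function name trigger_recompute and the payload fields are hypothetical, and the real handler may ignore the payload entirely.

```python
import json


def trigger_recompute(lambda_client, cik: str) -> int:
    """Queue an asynchronous eqtytrk-recompute-cache run; return the HTTP status.

    `lambda_client` is expected to behave like boto3.client("lambda").
    """
    response = lambda_client.invoke(
        FunctionName="eqtytrk-recompute-cache",
        # "Event" makes the call asynchronous: Lambda queues the event and
        # returns without waiting for the handler to finish.
        InvocationType="Event",
        # Hypothetical payload shape, for illustration only.
        Payload=json.dumps({"source": "post-ingest", "cik": cik}).encode("utf-8"),
    )
    # A successfully queued async invocation is answered with 202 Accepted.
    return response["StatusCode"]
```

In the worker this would be called as trigger_recompute(boto3.client("lambda"), cik) once the ingest for that CIK has committed.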

The Lambda self-debounces: if both company_metrics_cache and sector_medians_cache are younger than RECOMPUTE_DEBOUNCE_SECONDS (default 240 s), the handler logs "debounce" and exits early. This keeps a burst of ten consecutive ingest invocations from hammering Neon with ten back-to-back full-table scans.
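The debounce check described above reduces to a comparison of two timestamps against a cutoff. A minimal sketch, assuming the handler looks at the newest computed_at in each table and treats an empty table as stale; the function name should_debounce is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

RECOMPUTE_DEBOUNCE_SECONDS = 240  # documented default


def should_debounce(
    metrics_computed_at: Optional[datetime],
    sectors_computed_at: Optional[datetime],
    now: datetime,
    debounce_seconds: int = RECOMPUTE_DEBOUNCE_SECONDS,
) -> bool:
    """Return True only if BOTH caches were refreshed within the window."""
    # Assumption: a cache with no rows yet always forces a recompute.
    if metrics_computed_at is None or sectors_computed_at is None:
        return False
    cutoff = now - timedelta(seconds=debounce_seconds)
    return metrics_computed_at > cutoff and sectors_computed_at > cutoff
```

Requiring both caches to be fresh means a run that failed after the metrics phase but before the sector phase is not masked by the debounce.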

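Reserved concurrency is set to 1 (ReservedConcurrentExecutions: 1 in infra/ingest-worker.yml), so AWS throttles any invocation that arrives while another is running. Both triggers invoke the function asynchronously, so Lambda re-queues a throttled event and retries it with exponential backoff rather than letting two runs race.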

sector_medians_cache

One row per (index_id, aggregation, period) key. Columns: cache_key, index_id, aggregation, period, matrix (JSONB — sector × metric grid), warnings, computed_at.

Populated in the same eqtytrk-recompute-cache Lambda run as company_metrics_cache, after the metrics phase and the beta phase have finished. The sector phase calls compute_and_cache_sector_matrix (in src/eqtytrk/sectors/cache.py) three times, once for each of the median, mean, and weighted_mcap aggregations, all scoped to sp500 / fy_latest. The sector phase reads from xbrl_facts and companies.market_cap directly (not from company_metrics_cache) and takes roughly 70 s total.

The /v1/sectors/medians endpoint reads from this table. If no cached row exists for the requested key, it falls through to a live recompute (same compute_and_cache_sector_matrix function) and upserts before returning.
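The warm path and the fall-through path can be sketched together. In this illustration a plain dict stands in for the sector_medians_cache table and compute is any callable standing in for the matrix computation; the names warm_sector_cache and get_sector_matrix are hypothetical and are not the signatures used in src/eqtytrk/sectors/cache.py.

```python
from typing import Callable, Dict, Tuple

CacheKey = Tuple[str, str, str]  # (index_id, aggregation, period)
Compute = Callable[[str, str, str], dict]

AGGREGATIONS = ("median", "mean", "weighted_mcap")


def warm_sector_cache(
    cache: Dict[CacheKey, dict],
    compute: Compute,
    index_id: str = "sp500",
    period: str = "fy_latest",
) -> None:
    """Recompute and overwrite one row per aggregation (the Lambda's sector phase)."""
    for aggregation in AGGREGATIONS:
        key = (index_id, aggregation, period)
        cache[key] = compute(*key)


def get_sector_matrix(
    cache: Dict[CacheKey, dict], key: CacheKey, compute: Compute
) -> dict:
    """Serve from the cache; on a miss, compute live and store before returning."""
    row = cache.get(key)
    if row is None:
        row = compute(*key)
        cache[key] = row
    return row
```

Only the three sp500 / fy_latest keys are warmed ahead of time, so a request for any other key pays the live-recompute cost once and is served from the cache afterwards.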

Freshness follows the same two-trigger model as company_metrics_cache: post-ingest async invoke (common case) and daily EventBridge backstop at 08:00 UTC.

companies.beta_5y_monthly_vs_spy

A column group on the companies table, added in migration 008_companies_beta_cache.sql: beta_5y_monthly_vs_spy (numeric), beta_computed_at (timestamptz), beta_window_start (date), beta_window_end (date).

Populated by recompute_all_betas in src/eqtytrk/correlation/beta.py, called from the same eqtytrk-recompute-cache Lambda run after the metrics phase and before the sector phase. For each company the function pulls five years of daily prices from prices_daily, converts them to month-end returns, and runs an OLS regression against SPY monthly returns over the same window. A minimum of 24 aligned monthly observations is required; companies with fewer get NULL. A NULL beta is handled the same way as a fresh per-request regression that returned None: the peer is dropped from the CAPM-adjusted correlation ranking.
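The regression itself is small enough to sketch in plain Python. The version below assumes simple (not log) returns computed from the last available close in each calendar month, and it returns None where the cache would store NULL. All function names here are hypothetical; this is not the code in beta.py.

```python
from datetime import date
from typing import Dict, Optional, Sequence, Tuple

MIN_OBSERVATIONS = 24  # documented minimum of aligned monthly observations
Month = Tuple[int, int]  # (year, month)


def month_end_returns(daily: Sequence[Tuple[date, float]]) -> Dict[Month, float]:
    """Simple returns between consecutive month-end closes."""
    last_close: Dict[Month, float] = {}
    for day, close in sorted(daily):
        last_close[(day.year, day.month)] = close  # later days overwrite earlier
    months = sorted(last_close)
    # Assumption: months with no prices are skipped rather than interpolated.
    return {
        cur: last_close[cur] / last_close[prev] - 1.0
        for prev, cur in zip(months, months[1:])
    }


def ols_beta(
    stock: Sequence[float], spy: Sequence[float], min_obs: int = MIN_OBSERVATIONS
) -> Optional[float]:
    """OLS slope of stock returns on SPY returns: cov(spy, stock) / var(spy)."""
    if len(stock) != len(spy):
        raise ValueError("return series must be aligned")
    n = len(stock)
    if n < min_obs:
        return None  # too little history
    mean_x = sum(spy) / n
    mean_y = sum(stock) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(spy, stock))
    var = sum((x - mean_x) ** 2 for x in spy)
    return cov / var if var else None


def beta_vs_spy(stock_daily, spy_daily) -> Optional[float]:
    """Align both series on months present in each, then regress."""
    stock = month_end_returns(stock_daily)
    spy = month_end_returns(spy_daily)
    months = sorted(stock.keys() & spy.keys())
    return ols_beta([stock[m] for m in months], [spy[m] for m in months])
```

Aligning on the intersection of months before counting observations is what makes the 24-observation floor apply to aligned data rather than to either series alone.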

Used by _apply_etsr_capm in the analysis router to compute CAPM-adjusted excess returns without regressing per-peer on every /correlate request.

EqtyTrk methodology reference. Data from SEC EDGAR.