
Recommender (engine)

The recommendation algorithm: one patient, one week. Sectioned by algorithm step — bootstrap / repeat / MVT criterion / substitute search / update / top-up — producing an introspectable RecommendationResult.

recommender

Recommendation engine — substrate-agnostic.

The engine takes an EngineState (any substrate — pandas DataFrame adapter, in-memory dict, future polars/xarray) and a SimilarityMatrix and produces a weekly schedule for ONE patient at a time. The schedule is n distinct protocols laid out across days × protocols_per_day slots.
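As a rough illustration of the slot grid described above, the sketch below spreads a handful of distinct protocol ids over days × protocols_per_day slots. The helper name layout_schedule and the round-robin fill are assumptions made for this example only; the engine's own layout and top-up rules are not shown on this page and may differ.

```python
from itertools import cycle, islice


def layout_schedule(protocol_ids, days=7, protocols_per_day=2):
    """Spread distinct protocol ids over a days x protocols_per_day grid.

    Hypothetical helper: the round-robin fill is an assumption for
    illustration, not the engine's actual rule.
    """
    unique = list(dict.fromkeys(protocol_ids))  # keep order, drop duplicates
    if not unique:
        raise ValueError("need at least one protocol id")
    # Cycle through the ids until every slot of the week is filled.
    slots = list(islice(cycle(unique), days * protocols_per_day))
    # Cut the flat slot list into one sub-list per day.
    return [
        slots[d * protocols_per_day:(d + 1) * protocols_per_day]
        for d in range(days)
    ]


grid = layout_schedule([101, 102, 103], days=7, protocols_per_day=2)
```

Each inner list is one day. With three protocols and two slots per day, the ids simply rotate through the week.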

After phase F2, none of the engine internals depend on pandas. The engine works on list[ProtocolRow] throughout. pandas only appears at two boundaries:

- INPUT: production callers pass a pd.DataFrame for scoring; coerce_engine_state wraps it into a PatientState (see engine.py).
- OUTPUT: the final per-protocol schedule is materialized as a pd.DataFrame for backwards-compatible consumption (and for RecommendationResult.recommendations).
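The boundary pattern described above can be sketched with typing.Protocol: the core code depends only on a small interface, and raw inputs are wrapped once on the way in. Row, State, DictState, and coerce_state are hypothetical stand-ins for ProtocolRow, EngineState, an in-memory adapter, and coerce_engine_state; their real definitions live in engine.py and are not reproduced here. Only the has_data attribute is taken from the source shown on this page.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@dataclass(frozen=True)
class Row:  # hypothetical stand-in for ProtocolRow
    protocol_id: int
    score: float


@runtime_checkable
class State(Protocol):  # hypothetical stand-in for EngineState
    @property
    def has_data(self) -> bool: ...

    def rows(self) -> list[Row]: ...  # assumed accessor, not a documented API


class DictState:
    """In-memory substrate backed by a {protocol_id: score} dict."""

    def __init__(self, scores: dict[int, float]) -> None:
        self._scores = dict(scores)

    @property
    def has_data(self) -> bool:
        return bool(self._scores)

    def rows(self) -> list[Row]:
        return [Row(pid, score) for pid, score in self._scores.items()]


def coerce_state(obj) -> State:
    """Wrap raw input once at the boundary; pass conforming objects through."""
    if isinstance(obj, State):
        return obj
    if isinstance(obj, dict):
        return DictState(obj)
    raise TypeError(f"unsupported substrate: {type(obj).__name__}")


state = coerce_state({1: 0.5, 2: 0.9})
```

A pandas adapter would be one more class satisfying the same interface, so nothing past the coercion call needs to import pandas.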

This file is organized in sections that mirror the algorithm's steps:

1.  trace               — build the structured audit dict
2.  bootstrap strategy  — first-ever schedule
3.  repeat strategy     — week was skipped, copy prior
4.  MVT criterion       — which prescribed protocols to swap
5.  substitute search   — two-tier pick: unused / least-used-similar
6.  update strategy     — swap loop assembly
7.  topup               — fill the 7×ppd grid post-step
8.  RecommendationResult — introspectable output (PCA-style)
9.  Recommender         — entry-point class wiring everything

Behavior is byte-for-byte identical to v0.3.1; all unit tests pass.
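Step 5 above names a two-tier pick (unused / least-used-similar). One plausible reading, sketched under stated assumptions: first prefer a protocol the patient has never used, choosing the one most similar to the removed protocol; failing that, fall back to the least-used remaining protocol, breaking ties by similarity. The function name find_substitute, the tier labels, and the input shapes are all invented for this example and are not the engine's API.

```python
def find_substitute(removed_id, similarity, usage, scheduled):
    """Hypothetical two-tier substitute search.

    similarity: {(removed_id, candidate_id): float}
    usage:      {protocol_id: number of times prescribed so far}
    scheduled:  ids already in this week's schedule
    Returns (protocol_id or None, tier label, candidates considered).
    """
    candidates = sorted(
        pid for pid in usage
        if pid != removed_id and pid not in scheduled
    )
    if not candidates:
        return None, "none", []

    def sim(pid):
        # Missing pairs count as zero similarity (an assumption).
        return similarity.get((removed_id, pid), 0.0)

    # Tier 1: never-used protocols, most similar first.
    unused = [pid for pid in candidates if usage[pid] == 0]
    if unused:
        return max(unused, key=sim), "unused", candidates

    # Tier 2: least-used protocol, ties broken by higher similarity.
    best = min(candidates, key=lambda pid: (usage[pid], -sim(pid)))
    return best, "least_used_similar", candidates


similarity = {(1, 2): 0.3, (1, 3): 0.8, (1, 4): 0.9}
first = find_substitute(1, similarity, {1: 5, 2: 0, 3: 0, 4: 2}, {1})
second = find_substitute(1, similarity, {1: 5, 2: 3, 4: 2}, {1})
```

In the first call protocols 2 and 3 are unused, and 3 is the more similar of the two. In the second call nothing is unused, so the least-used protocol (4) is chosen.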

SubstituteResult dataclass

SubstituteResult(
    protocol_id: int | None,
    tier: str,
    candidates: list[int],
    removed_id: int | None = None,
    similarity: float | None = None,
    reason: str = "",
)

Outcome of a substitute search. Auditable: carries which tier matched, which candidates were considered, and (when materialized from a trace event) the removed protocol, similarity, and reason.
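To show how such records might be audited, the sketch below redeclares the documented fields locally (so it runs without the package) and summarizes a small list of decisions. The tier strings, reasons, and values are made up for illustration; the real tier labels are not listed on this page.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class SubstituteResult:  # local mirror of the documented signature
    protocol_id: Optional[int]
    tier: str
    candidates: list[int]
    removed_id: Optional[int] = None
    similarity: Optional[float] = None
    reason: str = ""


# Illustrative data only; tier names and reasons are assumptions.
decisions = [
    SubstituteResult(3, "unused", [2, 3, 4],
                     removed_id=1, similarity=0.8, reason="example reason"),
    SubstituteResult(None, "none", [],
                     removed_id=7, reason="no candidates"),
]

# How often each tier matched.
tier_counts = Counter(d.tier for d in decisions)
# Removed protocols for which no substitute was found.
unresolved = [d.removed_id for d in decisions if d.protocol_id is None]
```

Because protocol_id is optional, a search that finds nothing can still be recorded alongside the candidates it considered.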

RecommendationResult dataclass

RecommendationResult(
    recommendations: DataFrame,
    trace: dict[str, Any],
    patient_state: EngineState,
    branch: str,
    swap_decisions: list[SubstituteResult],
    topup_events: list[dict[str, Any]],
    mvt_mean: float | None,
    swap_targets: list[int],
    swap_reasons: dict[int, str],
    scoring_attrs: dict[str, Any],
)

All artifacts the engine produced for one (patient, week).
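The mvt_mean, swap_targets, and swap_reasons fields suggest that the MVT criterion compares each prescribed protocol against a mean. The sketch below assumes exactly that: a protocol becomes a swap target when its score falls below the mean score of the prescribed set. This rule is a guess based on the field names, not something this page confirms, and mvt_targets is a hypothetical helper.

```python
from statistics import fmean


def mvt_targets(scores):
    """Hypothetical below-mean criterion.

    scores: {protocol_id: score} for the currently prescribed protocols.
    Returns (mean or None, swap target ids, {id: human-readable reason}).
    """
    if not scores:
        return None, [], {}
    mean = fmean(scores.values())
    # Flag every protocol scoring strictly below the mean.
    targets = [pid for pid, score in scores.items() if score < mean]
    reasons = {
        pid: f"score {scores[pid]:.2f} < mean {mean:.2f}"
        for pid in targets
    }
    return mean, targets, reasons


mvt_mean, swap_targets, swap_reasons = mvt_targets({1: 1.0, 2: 0.25, 3: 0.75})
```

Returning the mean together with the targets and reasons keeps the decision inspectable afterwards, in the same spirit as the result fields listed above.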

Recommender

Recommender(
    scoring: DataFrame | EngineState,
    n: int = N,
    days: int = N_DAYS,
    protocols_per_day: int = PROTOCOLS_PER_DAY,
)

Clinical Decision Support System core.

Recommends a 7-day × protocols_per_day schedule for ONE patient.

Substrate-agnostic: scoring can be a pd.DataFrame (production path) or any EngineState (synthetic / dict / future polars). Similarly, the protocol_similarity argument of recommend accepts a pd.DataFrame, a dict, or a SimilarityMatrix.

Source code in src\ai_cdss\recommender.py
def __init__(
    self,
    scoring: pd.DataFrame | EngineState,
    n: int = N,
    days: int = N_DAYS,
    protocols_per_day: int = PROTOCOLS_PER_DAY,
) -> None:
    self._scoring = scoring
    self.n = n
    self.days = days
    self.protocols_per_day = protocols_per_day

recommend

recommend(
    patient_id: int, protocol_similarity: Any
) -> RecommendationResult

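Full pipeline. Accepts any substrate; coerces scoring and protocol_similarity to the engine's interfaces at the boundary (via coerce_engine_state and coerce_similarity). Raises ValueError if the patient has no data.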

Source code in src\ai_cdss\recommender.py
def recommend(
    self,
    patient_id: int,
    protocol_similarity: Any,
) -> RecommendationResult:
    """Full pipeline. Accepts any substrate; coerces to the engine
    protocols at the boundary."""
    state = coerce_engine_state(self._scoring, patient_id)
    sim   = coerce_similarity(protocol_similarity)

    if not state.has_data:
        raise ValueError(f"Patient {patient_id} has no data.")

    trace = _init_trace(
        state=state, n=self.n,
        n_days=self.days, protocols_per_day=self.protocols_per_day,
    )

    rows = self._run_strategy(state, sim, trace)
    rows = _top_up_schedule(
        state, rows,
        n_days=self.days,
        protocols_per_day=self.protocols_per_day,
        n=self.n, trace=trace,
    )

    recommendations = self._rows_to_dataframe(rows, state)
    trace["final"] = _serialize_final_rows(rows)
    attrs = dict(state.scoring_attrs)
    attrs["trace"] = trace
    recommendations.attrs = attrs

    return self._assemble_result(
        state=state, rows=rows,
        recommendations=recommendations, trace=trace,
    )