Scored Heuristic Predictor
A set-level recommendation engine for workout logging. It estimates a credible working target from up to five recent matching sessions, the current exercise position, elapsed time since last exposure, and the number of visible set rows in the logger. The system is deliberately heuristic: explicit about what it uses, conservative about what it cannot know, and tuned for stable recommendations rather than theatrical precision.
Introduction / overview
The Scored Heuristic Predictor exists to answer a narrow but persistent product question inside workout logging: what is the most credible working-set starting point for this exercise right now?
In practice, users usually remember fragments of prior performance rather than a precise set-level target. They may remember last week's top set, a lifetime best, or whether the lift felt easy or terrible. What they often do not remember is the most believable target for the first visible working set under today's conditions.
The predictor therefore tries to recover a plausible anchor set from a small set of observed signals and then extend that estimate across the remaining visible sets. It is not trying to announce a maximum capability. It is trying to put the user close enough to start well.
Why naive prediction is not enough
Simple rules are appealing because they are legible, but they collapse different training conditions into a single shortcut. "Repeat last time" treats the previous session as if it were a neutral sample. "Add five pounds" assumes linear progression. "Use the best historical set" ignores today's context entirely.
- Last-time prediction. This is overly sensitive to one unusually good or unusually poor session.
- Highest-ever prediction. A lifetime best is not the same thing as today's likely working target.
- Fixed progression prediction. A blanket increment rule ignores whether the exercise is earlier or later than usual and whether the exposure gap is short, normal, or unusually long.
The product problem is therefore not simply "predict the weight." It is "recommend a credible working-set starting point under the conditions that meaningfully shape logged performance."
Why the system is heuristic-based
The predictor is heuristic on purpose because the underlying workout data is useful but materially incomplete. The system observes the exercise match, set order, reps, load, workout date, exercise order inside the workout, and the number of visible set rows. That is enough to build a grounded recommendation engine. It is not enough to infer internal effort or physiology with high confidence.
The system does not observe RPE, repetitions in reserve, rest interval length, sleep, body-mass change, travel, illness, or whether a set was intentionally conservative. Any product that claims full intelligence from this data would be overstating what the evidence supports.
A heuristic system is preferable here because it is explicit about what it uses, explicit about what it cannot know, and easy to inspect when the methodology needs tuning. The goal is not to hide behind a black box. The goal is to make a bounded, believable recommendation from observable workout behavior.
Inputs used by the model
Recent matching exercise history. Up to the five most recent sessions where the normalized exercise name matches the current exercise. With less history, the system can still emit an output, but confidence is reduced.
Current exercise position. The current exercise index inside the draft workout compared against the user's historical median position for the same exercise.
Recovery from time since last exposure. The number of days between the draft workout date and the most recent logged exposure to the same exercise.
Visible set count. The number of set rows currently shown in the logger, which determines how many predicted sets the system needs to generate.
The model predicts the anchor set first and then derives later visible sets from the user's own historical backoff pattern. If later-set history is thin, it falls back to a conservative stepped pattern instead of pretending to know more than the data supports.
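As a sketch, the inputs above can be grouped into a small structure. The names and field types here are illustrative, not the product's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class LoggedSet:
    weight_lb: float   # stored load in pounds; 0.0 for bodyweight sets
    reps: int

@dataclass
class Session:
    days_ago: int              # recency: 0 = most recent matching session
    exercise_index: int        # position of the exercise in that workout
    sets: list[LoggedSet] = field(default_factory=list)

@dataclass
class PredictorInput:
    history: list[Session]     # up to five most recent matching sessions
    current_index: int         # exercise position in the draft workout
    days_since_last: int       # gap to the most recent exposure, in days
    visible_sets: int          # number of set rows shown in the logger
```

Keeping the inputs this explicit makes the "explicit about what it uses" claim testable: anything not in this structure is, by construction, not part of the recommendation.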
Core notation
- load for a set, expressed in stored pounds
- repetitions performed
- capped strength signal for a set
- recency index, where i = 0 is most recent
- days since last exercise exposure
- exercise position within the workout
- baseline strength from recency-weighted anchors
- recovery multiplier from days since last exposure
- position multiplier relative to historical median placement
- damped short-term trend multiplier
- confidence score for the final recommendation
1. Capped strength calculation
Raw load and raw reps do not compare cleanly across nearby working sets. The model therefore converts each candidate set into a capped strength signal. This preserves the useful intuition behind e1RM-style rep adjustment while preventing very high-rep sets from dominating the estimate.
The rep cap is intentional. High-rep sets can distort strength proxies and create unstable anchor selection. The predictor is more reliable when unusually long sets are allowed to inform the score without overwhelming it.
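A minimal sketch of the capped signal, assuming an Epley-style rep adjustment and a cap of 10 reps (the document specifies neither the exact formula nor the cap value):

```python
REP_CAP = 10  # assumed cap; the document only says high-rep sets are capped

def capped_strength(weight_lb: float, reps: int) -> float:
    """e1RM-style strength proxy with reps capped, so very long sets
    can inform the score without dominating it."""
    effective_reps = min(reps, REP_CAP)
    return weight_lb * (1.0 + effective_reps / 30.0)
```

Under this form, a 20-rep set scores the same as a 10-rep set at the same load, which is exactly the "inform without overwhelming" behavior the cap is meant to produce.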
2. Anchor set selection
Each historical session is reduced to one representative anchor working set. Because the product does not store explicit warmup flags, the implementation defines the anchor as the loaded (non-bodyweight) set with the highest capped strength score. Ties break toward the earliest set, which keeps the rule deterministic and inspectable.
If a session contains only bodyweight sets, the model falls back to the highest-rep set. Those predictions remain useful, but they are treated more conservatively and cannot earn the same confidence ceiling as weighted history.
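The selection rule can be sketched as follows. The strength formula and cap are the same assumptions as before; `sets` is a list of `(weight_lb, reps)` pairs in logged order:

```python
def pick_anchor(sets):
    """Return the index of the anchor set: highest capped strength among
    loaded sets, ties broken toward the earliest; bodyweight-only sessions
    fall back to the highest-rep set."""
    def strength(w, r):
        return w * (1.0 + min(r, 10) / 30.0)  # assumed Epley form, cap 10

    loaded = [(i, w, r) for i, (w, r) in enumerate(sets) if w > 0]
    if loaded:
        # max() returns the first maximal element, so ties break earliest
        return max(loaded, key=lambda t: strength(t[1], t[2]))[0]
    # bodyweight-only session: highest reps, earliest on ties
    return max(enumerate(sets), key=lambda t: t[1][1])[0]
```

Relying on `max()` returning the first of equal-score candidates is what makes the tie-break deterministic without any extra bookkeeping.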
3. Recency weighting and baseline strength
The engine looks at up to five recent anchors. More recent sessions matter more, but the weighting decays smoothly enough that one unusual day does not take over the estimate. The result is a baseline that stays responsive without becoming jumpy.
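One plausible shape for this weighting is a geometric decay over the anchor strengths, most recent first. The 0.8 decay factor is an assumption; the document only says the weighting decays smoothly:

```python
def baseline_strength(anchor_strengths):
    """Recency-weighted average of up to five anchor strengths,
    where anchor_strengths[0] is the most recent session."""
    recent = anchor_strengths[:5]
    weights = [0.8 ** i for i in range(len(recent))]
    return sum(w * s for w, s in zip(weights, recent)) / sum(weights)
```

With this decay, a single outlier session moves the baseline but cannot dominate it, which is the "responsive without becoming jumpy" behavior described above.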
4. Trend adjustment
The predictor includes only a mild trend term. Short-term strength behavior is noisy, so trend is allowed to nudge the anchor estimate rather than drive it. In the implementation, this term only activates once at least three anchor sessions exist.
This clamp is deliberately tight. The model is allowed to acknowledge a recent rise or cooling-off period, but not to turn a short run of sessions into an aggressive extrapolation.
5. Recovery curve and the goldilocks window
Recovery is not modeled as a linear reward for more time away from the lift. Very short gaps often imply residual fatigue. Very long gaps often imply some detraining or loss of recent movement groove. The model therefore uses a centered recovery window rather than a monotonic rule. The product assumption is simple: there is a normal range where repeat exposure is timely, neither so close that residual fatigue dominates nor so distant that recent specificity is lost.
| Days since last exposure | Recovery multiplier |
|---|---|
| 0-1 | 0.94 |
| 2 | 0.98 |
| 3-5 | 1.00 |
| 6-8 | 0.99 |
| 9-14 | 0.97 |
| 15-28 | 0.94 |
| 29-60 | 0.90 |
| 61+ | 0.85 |
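The table maps directly onto a piecewise lookup:

```python
def recovery_multiplier(days: int) -> float:
    """Centered recovery window: peak in the 3-5 day band, tapering
    on both sides, per the table above."""
    if days <= 1:
        return 0.94
    if days == 2:
        return 0.98
    if days <= 5:
        return 1.00
    if days <= 8:
        return 0.99
    if days <= 14:
        return 0.97
    if days <= 28:
        return 0.94
    if days <= 60:
        return 0.90
    return 0.85
```

Because the multiplier is a plain lookup rather than a fitted curve, tuning the window is a table edit, which keeps the methodology inspectable.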
6. Exercise position adjustment
The same exercise performed first is not directly comparable to the same exercise performed third. Position therefore enters the model as a relative adjustment against the user's own historical median placement. If the system has no meaningful positional history for the lift, this term defaults to neutral rather than inventing a penalty or bonus.
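A sketch of the relative adjustment. The per-slot size and clamp are assumptions; the neutral default with no positional history is from the text:

```python
def position_multiplier(current_index, historical_indices,
                        per_slot=0.015, clamp=0.05):
    """Adjust relative to the user's historical median placement:
    later than usual nudges the estimate down, earlier nudges it up.
    With no positional history, the term is neutral (1.0)."""
    if not historical_indices:
        return 1.0
    ordered = sorted(historical_indices)
    median = ordered[len(ordered) // 2]
    shift = (median - current_index) * per_slot
    return 1.0 + max(-clamp, min(clamp, shift))
```

Anchoring the adjustment to the user's own median, rather than an absolute slot number, is what makes the same exercise comparable across differently structured workouts.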
7. Historical backoff profile
The predictor does not solve every visible set from scratch. It predicts the anchor set first and then reconstructs later sets using the user's historical backoff structure. This keeps the output closer to how the user actually trains instead of treating each later set as an isolated forecasting problem.
Across recent sessions, the model takes the median ratio and median rep delta for each set index. If no stable history exists for a later set, it falls back to a conservative stepped pattern rather than inventing unnecessary precision. Missing information makes the model simpler, not more confident.
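A sketch of the per-index profile. Each session is assumed to be ordered anchor-first as a list of `(weight_lb, reps)` pairs; the 5%-per-set fallback step is an assumption, since the document only calls it a conservative stepped pattern:

```python
import statistics

def backoff_profile(sessions, visible_sets):
    """For each visible set index, take the median load ratio and median
    rep delta relative to the session's anchor (first) set. Indices with
    no loaded history fall back to a stepped pattern."""
    profile = []
    for idx in range(visible_sets):
        ratios, deltas = [], []
        for sets in sessions:
            if idx < len(sets) and sets[0][0] > 0:
                ratios.append(sets[idx][0] / sets[0][0])
                deltas.append(sets[idx][1] - sets[0][1])
        if ratios:
            profile.append((statistics.median(ratios),
                            int(statistics.median(deltas))))
        else:
            # assumed fallback: drop 5% of anchor load per later set
            profile.append((max(0.0, 1.0 - 0.05 * idx), 0))
    return profile
```

Medians rather than means keep one unusual session from reshaping the whole backoff structure, matching the conservatism the text describes.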
8. Final anchor prediction
The final anchor estimate is the weighted baseline after applying recovery, position, and trend adjustments. That strength estimate is then translated back into load using a target rep count from the user's recent anchor history, clamped to a practical working range.
Before the recommendation is surfaced, the implementation applies a few guardrails. The predicted load is rounded to the available gym increment, then sanity-clamped against the most recent anchor. Upward movement is intentionally limited to one increment. Downward movement is also bounded, with a little more room allowed after a longer layoff.
This produces a recommendation that behaves like a plausible workout for today rather than a disconnected estimate for one set in isolation.
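Putting the pieces together, one plausible shape for the final step is below. The inverse Epley conversion, the clamp percentages, and the default increment are all assumptions; the one-increment upward cap and the looser downward bound after a layoff are from the text:

```python
def finalize_load(baseline, recovery, position, trend,
                  target_reps, last_anchor_load,
                  increment=5.0, long_layoff=False):
    """Apply the multipliers, convert strength back to load via the
    target rep count, round to the gym increment, then clamp against
    the most recent anchor."""
    strength = baseline * recovery * position * trend
    load = strength / (1.0 + min(target_reps, 10) / 30.0)
    load = round(load / increment) * increment
    ceiling = last_anchor_load + increment       # at most one increment up
    floor_pct = 0.85 if long_layoff else 0.90    # assumed downward bounds
    return max(last_anchor_load * floor_pct, min(ceiling, load))
```

The guardrails run last on purpose: whatever the multipliers say, the surfaced number can never jump more than one plate increment above what the user actually lifted most recently.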
Confidence scoring
A recommendation without confidence invites false precision. The model therefore scores the output using four dimensions: history depth, internal consistency of recent sessions, recency of last exposure, and positional match. The confidence score is not meant to impress. It is meant to communicate how much trust the available evidence deserves.
The confidence label is intentionally coarse. High confidence means the data is dense and coherent. Medium confidence means the recommendation is useful but should be treated with some caution. Low confidence means the product is surfacing a best-effort starting point, not a strong claim.
The implementation also applies hard conservative rules on top of the weighted score. A prediction based on only one matching session is forced to low confidence, and bodyweight-only predictions cannot exceed medium confidence.
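The scoring shape can be sketched like this. The dimension weights and label thresholds are assumptions; the two hard rules (single-session history forces low, bodyweight-only caps at medium) are from the text. Each dimension is assumed to arrive normalized to 0-1:

```python
def confidence(n_sessions, consistency, recency, position_match,
               bodyweight_only=False):
    """Weighted score over history depth, session consistency, recency,
    and positional match, with hard conservative overrides."""
    depth = min(n_sessions, 5) / 5.0
    score = (0.35 * depth + 0.25 * consistency
             + 0.25 * recency + 0.15 * position_match)
    if n_sessions <= 1:
        return "low"  # hard rule: one matching session is never enough
    label = "high" if score >= 0.75 else "medium" if score >= 0.45 else "low"
    if bodyweight_only and label == "high":
        label = "medium"  # hard rule: bodyweight-only history caps out
    return label
```

Applying the hard rules after the weighted score means no combination of strong dimensions can talk the system out of its conservative floors.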
Limitations and product framing
The predictor is intentionally narrow. It performs best when the user is repeating a familiar exercise with somewhat stable logging patterns. It becomes less certain when history is sparse, when bodyweight-only work dominates, when training intent changes sharply, or when an exercise has not been performed in a long time.
It also cannot distinguish between different reasons for the same logged output. A conservative set, a fatigued set, and a set performed during a calorie deficit may all appear similar in the stored data. That ambiguity is not a model bug. It is a property of the information the product currently observes.
For that reason, the Scored Heuristic Predictor is framed as a recommendation engine rather than a perfect predictor, a programming oracle, or a hidden physiology model. Its job is to provide a credible place to start. Its job is not to claim certainty where the evidence does not support it.