research

scored heuristic predictor

A set-level recommendation engine for workout logging. It estimates a credible working target from up to five recent matching sessions, the current exercise position, elapsed time since last exposure, and the number of visible set rows in the logger. The system is deliberately heuristic: explicit about what it uses, conservative about what it cannot know, and tuned for stable recommendations rather than theatrical precision.

Introduction / overview

The Scored Heuristic Predictor exists to answer a narrow but persistent product question inside workout logging: what is the most credible working-set starting point for this exercise right now?

In practice, users usually remember fragments of prior performance rather than a precise set-level target. They may remember last week's top set, a lifetime best, or whether the lift felt easy or terrible. What they often do not remember is the most believable target for the first visible working set under today's conditions.

The predictor therefore tries to recover a plausible anchor set from a small set of observed signals and then extend that estimate across the remaining visible sets. It is not trying to announce a maximum capability. It is trying to put the user close enough to start well.

Why naive prediction is not enough

Simple rules are appealing because they are legible, but they collapse different training conditions into a single shortcut. "Repeat last time" treats the previous session as if it were a neutral sample. "Add five pounds" assumes linear progression. "Use the best historical set" ignores today's context entirely.

  1. Last-time prediction. This is overly sensitive to one unusually good or unusually poor session.
  2. Highest-ever prediction. A lifetime best is not the same thing as today's likely working target.
  3. Fixed progression prediction. A blanket increment rule ignores whether the exercise is earlier or later than usual and whether the exposure gap is short, normal, or unusually long.

The product problem is therefore not simply "predict the weight." It is "recommend a credible working-set starting point under the conditions that meaningfully shape logged performance."

Why the system is heuristic-based

The predictor is heuristic on purpose because the underlying workout data is useful but materially incomplete. logit observes the exercise match, set order, reps, load, workout date, exercise order inside the workout, and the number of visible set rows. That is enough to build a grounded recommendation engine. It is not enough to infer internal effort or physiology with high confidence.

The system does not observe RPE, repetitions in reserve, rest interval length, sleep, body-mass change, travel, illness, or whether a set was intentionally conservative. Any product that claims full intelligence from this data would be overstating what the evidence supports.

A heuristic system is preferable here because it is explicit about what it uses, explicit about what it cannot know, and easy to inspect when the methodology needs tuning. The goal is not to hide behind a black box. The goal is to make a bounded, believable recommendation from observable workout behavior.

Inputs used by the model

  1. Recent matching exercise history. Up to the five most recent sessions where the normalized exercise name matches the current exercise. With less history, the system can still emit an output, but confidence is reduced.
  2. Current exercise position. The current exercise index inside the draft workout compared against the user's historical median position for the same exercise.
  3. Recovery from time since last exposure. The number of days between the draft workout date and the most recent logged exposure to the same exercise.
  4. Visible set count. The number of set rows currently shown in the logger, which determines how many predicted sets the system needs to generate.

The model predicts the anchor set first and then derives later visible sets from the user's own historical backoff pattern. If later-set history is thin, it falls back to a conservative stepped pattern instead of pretending to know more than the data supports.

Core notation

  w    load for a set, expressed in stored pounds
  r    repetitions performed
  S    capped strength signal for a set
  i    recency index, where i = 0 is most recent
  d    days since last exercise exposure
  p    exercise position within the workout
  B    baseline strength from recency-weighted anchors
  R    recovery multiplier from days since last exposure
  P    position multiplier relative to historical median placement
  T    damped short-term trend multiplier
  C    confidence score for the final recommendation

1. Capped strength calculation

Raw load and raw reps do not compare cleanly across nearby working sets. The model therefore converts each candidate set into a capped strength signal. This preserves the useful intuition behind e1RM-style rep adjustment while preventing very high-rep sets from dominating the estimate.

$$r_{\mathrm{eff}} = \min(r, 12)$$
$$S = w \times \left(1 + \frac{r_{\mathrm{eff}}}{30}\right)$$
Interpretation. $r_{\mathrm{eff}}$ is the effective rep count after capping high-rep sets at twelve. $S$ is the capped strength score used by the predictor.

The rep cap is intentional. High-rep sets can distort strength proxies and create unstable anchor selection. The predictor is more reliable when unusually long sets are allowed to inform the score without overwhelming it.
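As a minimal sketch of the formula above, the capped strength signal can be computed in a few lines of Python. The function and constant names are illustrative and are not taken from the product's code.

```python
REP_CAP = 12  # reps beyond this no longer raise the score


def capped_strength(load_lb: float, reps: int) -> float:
    """Capped strength signal S = w * (1 + min(r, 12) / 30)."""
    effective_reps = min(reps, REP_CAP)
    return load_lb * (1 + effective_reps / 30)


# A 20-rep set scores the same as a 12-rep set at the same load,
# so very long sets inform the score without dominating it.
moderate = capped_strength(200, 6)
long_set = capped_strength(200, 20)
capped = capped_strength(200, 12)
```

With these inputs, the six-rep set scores roughly 240 and both the twelve-rep and twenty-rep sets score roughly 280.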

2. Anchor set selection

Each historical session is reduced to one representative anchor working set. Because the product does not store explicit warmup flags, the implementation defines the anchor as the weighted (externally loaded) set with the highest capped strength score. Ties break toward the earliest set, which keeps the rule deterministic and inspectable.

$$a_k = \operatorname*{arg\,max}_j S_{k,j}$$
Interpretation. For session $k$, the anchor $a_k$ is the set index that maximizes capped strength within that session.

If a session contains only bodyweight sets, the model falls back to the highest-rep set. Those predictions remain useful, but they are treated more conservatively and cannot earn the same confidence ceiling as weighted history.
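The selection rule can be sketched as follows. The `LoggedSet` type, the convention that a load of 0 marks a bodyweight set, and the earliest-set tie-break in the bodyweight fallback are assumptions made for illustration; the source only specifies the tie-break for weighted sets.

```python
from typing import NamedTuple, Optional, Sequence


class LoggedSet(NamedTuple):
    load_lb: float  # assumed encoding: 0 means a bodyweight set
    reps: int


def capped_strength(s: LoggedSet) -> float:
    return s.load_lb * (1 + min(s.reps, 12) / 30)


def select_anchor(sets: Sequence[LoggedSet]) -> Optional[int]:
    """Return the index of the anchor set, or None for an empty session."""
    if not sets:
        return None
    weighted = [i for i, s in enumerate(sets) if s.load_lb > 0]
    if weighted:
        # max() returns the first maximal item, so ties go to the earliest set.
        return max(weighted, key=lambda i: capped_strength(sets[i]))
    # Bodyweight-only session: fall back to the highest-rep set.
    return max(range(len(sets)), key=lambda i: sets[i].reps)


session = [LoggedSet(135, 8), LoggedSet(185, 5), LoggedSet(185, 5), LoggedSet(165, 8)]
anchor_index = select_anchor(session)  # index 1: first of the two tied 185 x 5 sets
```

Because no warmup flag exists, a light warmup set simply loses to heavier working sets on capped strength, which is why the rule needs no separate warmup filter.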

3. Recency weighting and baseline strength

The engine looks at up to five recent anchors. More recent sessions matter more, but the weighting decays smoothly enough that one unusual day does not take over the estimate. The result is a baseline that stays responsive without becoming jumpy.

$$\omega_i = e^{-0.35\,i}$$
$$B = \frac{\sum_i S_i\,\omega_i}{\sum_i \omega_i}$$
Interpretation. $\omega_i$ is the recency weight for historical anchor $i$. $B$ is the weighted baseline strength before recovery, trend, and position adjustments.
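A short sketch of the recency-weighted baseline, assuming the anchor strengths arrive ordered most recent first (matching i = 0 as the most recent session). The function name and the error on empty input are illustrative choices.

```python
import math
from typing import Sequence

DECAY = 0.35      # decay rate in the recency weight
MAX_ANCHORS = 5   # the engine looks at up to five recent anchors


def baseline_strength(anchor_strengths: Sequence[float]) -> float:
    """Weighted mean of anchor strengths; index 0 is the most recent session."""
    recent = list(anchor_strengths[:MAX_ANCHORS])
    if not recent:
        raise ValueError("at least one anchor session is required")
    weights = [math.exp(-DECAY * i) for i in range(len(recent))]
    return sum(s * w for s, w in zip(recent, weights)) / sum(weights)


# One strong recent session pulls the baseline up without taking it over.
baseline = baseline_strength([220.0, 200.0, 200.0, 200.0, 200.0])
```

With a decay of 0.35 the second-most-recent session still carries about 70% of the newest session's weight, which is what keeps a single unusual day from dominating.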

4. Trend adjustment

The predictor includes only a mild trend term. Short-term strength behavior is noisy, so trend is allowed to nudge the anchor estimate rather than drive it. In the implementation, this term only activates once at least three anchor sessions exist.

$$\tau = \frac{S_0 - S_2}{S_2}$$
$$T = \operatorname{clamp}(1 + 0.35\,\tau,\ 0.97,\ 1.03)$$
Interpretation. $\tau$ measures the relative change between the most recent anchor and the third-most-recent anchor. $T$ is a damped trend multiplier.

This clamp is deliberately tight. The model is allowed to acknowledge a recent rise or cooling-off period, but not to turn a short run of sessions into an aggressive extrapolation.
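The trend term can be sketched as below. Returning a neutral 1.0 when fewer than three anchors exist is an assumption about what "does not activate" means, and the guard against a non-positive third anchor is added only to keep the sketch safe.

```python
from typing import Sequence


def clamp(value: float, low: float, high: float) -> float:
    return max(low, min(value, high))


def trend_multiplier(anchor_strengths: Sequence[float]) -> float:
    """Damped trend multiplier T; anchor_strengths[0] is the most recent."""
    if len(anchor_strengths) < 3:
        return 1.0  # assumption: an inactive trend term is neutral
    s0, s2 = anchor_strengths[0], anchor_strengths[2]
    if s2 <= 0:
        return 1.0  # defensive guard, not described in the source
    tau = (s0 - s2) / s2
    return clamp(1 + 0.35 * tau, 0.97, 1.03)


mild_rise = trend_multiplier([210.0, 205.0, 200.0])   # tau = 0.05, inside the clamp
sharp_rise = trend_multiplier([260.0, 230.0, 200.0])  # tau = 0.30, clamped to 1.03
```

Since the clamp is 3% in either direction and the slope is 0.35, any relative change larger than about 8.6% between the first and third anchors produces the same multiplier as an 8.6% change.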

5. Recovery curve and the Goldilocks window

Recovery is not modeled as a linear reward for more time away from the lift. Very short gaps often imply residual fatigue. Very long gaps often imply some detraining or loss of recent movement groove. The model therefore uses a centered recovery window rather than a monotonic rule. The product assumption is simple: there is a normal range where repeat exposure is timely, neither so close that residual fatigue lingers nor so distant that recent specificity is lost.

$$R = f(d)$$
Interpretation. $R$ is the recovery multiplier selected from a piecewise function of elapsed days $d$.

Days since last exposure    Recovery multiplier
0-1                         0.94
2                           0.98
3-5                         1.00
6-8                         0.99
9-14                        0.97
15-28                       0.94
29-60                       0.90
60+                         0.85
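The piecewise lookup can be sketched as a small table scan. The table lists day 60 in both of its last two rows; this sketch assumes day 60 belongs to the 29-60 row and that the final row starts at day 61. The names are illustrative.

```python
# (inclusive upper bound in days, multiplier), in ascending order
RECOVERY_TABLE = [
    (1, 0.94),
    (2, 0.98),
    (5, 1.00),
    (8, 0.99),
    (14, 0.97),
    (28, 0.94),
    (60, 0.90),  # assumption: day 60 falls in this row
]
LONG_LAYOFF_MULTIPLIER = 0.85  # anything beyond the last bound


def recovery_multiplier(days_since_last: int) -> float:
    """Recovery multiplier R = f(d) for d elapsed days."""
    if days_since_last < 0:
        raise ValueError("days_since_last must be non-negative")
    for upper_bound, multiplier in RECOVERY_TABLE:
        if days_since_last <= upper_bound:
            return multiplier
    return LONG_LAYOFF_MULTIPLIER
```

The multiplier peaks at 1.00 for gaps of three to five days and falls off on both sides, which is the non-monotonic shape the section describes.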

6. Exercise position adjustment

The same exercise performed first is not directly comparable to the same exercise performed third. Position therefore enters the model as a relative adjustment against the user's own historical median placement. If the system has no meaningful positional history for the lift, this term defaults to neutral rather than inventing a penalty or bonus.

$$\Delta_p = p_{\mathrm{current}} - \operatorname{median}(p_{\mathrm{history}})$$
$$P = \operatorname{clamp}(1 - 0.015\,\Delta_p,\ 0.92,\ 1.05)$$
Interpretation. $\Delta_p$ is the difference between the current exercise position and the historical median position. $P$ is the resulting position multiplier.
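A minimal sketch of the position adjustment. Treating an empty position history as "no meaningful positional history" is an assumption; the product may apply a stricter test. Function names are illustrative.

```python
from statistics import median
from typing import Sequence


def clamp(value: float, low: float, high: float) -> float:
    return max(low, min(value, high))


def position_multiplier(current_position: int, historical_positions: Sequence[int]) -> float:
    """Position multiplier P relative to the user's historical median placement."""
    if not historical_positions:
        return 1.0  # neutral when there is no positional history
    delta = current_position - median(historical_positions)
    return clamp(1 - 0.015 * delta, 0.92, 1.05)


# Usually performed first, now performed third: delta = 2, so P is about 0.97.
later_than_usual = position_multiplier(3, [1, 1, 2, 1])
```

Each slot later than usual costs 1.5%, and each slot earlier earns 1.5%, with the asymmetric clamp allowing a larger penalty (8%) than bonus (5%).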

7. Historical backoff profile

The predictor does not solve every visible set from scratch. It predicts the anchor set first and then reconstructs later sets using the user's historical backoff structure. This keeps the output closer to how the user actually trains instead of treating each later set as an isolated forecasting problem.

$$\rho_j = \frac{w_j}{w_{\mathrm{anchor}}}$$
$$\delta_j = r_j - r_{\mathrm{anchor}}$$
Interpretation. $\rho_j$ is the weight ratio for later set $j$ relative to the anchor. $\delta_j$ is the rep delta relative to the anchor.

Across recent sessions, the model takes the median ratio and median rep delta for each set index. If no stable history exists for a later set, it falls back to a conservative stepped pattern rather than inventing unnecessary precision. Missing information makes the model simpler, not more confident.
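One way to sketch the backoff profile is shown below. Several details are assumptions made for illustration: sets are keyed by their offset after the anchor rather than by absolute set index, "stable history" is approximated by a hypothetical `min_samples` threshold, and the stepped fallback pattern is left to the caller because the source does not specify it.

```python
from statistics import median
from typing import Dict, List, Sequence, Tuple

SetRow = Tuple[float, int]  # (load in pounds, reps)


def backoff_profile(
    sessions: Sequence[Tuple[int, Sequence[SetRow]]],
    min_samples: int = 2,  # hypothetical stability threshold
) -> Dict[int, Tuple[float, float]]:
    """Map offset-after-anchor to (median weight ratio, median rep delta).

    Each session is an (anchor_index, sets) pair.
    """
    ratios: Dict[int, List[float]] = {}
    deltas: Dict[int, List[int]] = {}
    for anchor_index, sets in sessions:
        anchor_load, anchor_reps = sets[anchor_index]
        if anchor_load <= 0:
            continue  # a weight ratio is undefined for bodyweight anchors
        for j in range(anchor_index + 1, len(sets)):
            load, reps = sets[j]
            offset = j - anchor_index
            ratios.setdefault(offset, []).append(load / anchor_load)
            deltas.setdefault(offset, []).append(reps - anchor_reps)
    return {
        offset: (median(ratios[offset]), median(deltas[offset]))
        for offset in sorted(ratios)
        if len(ratios[offset]) >= min_samples
    }


history = [
    (0, [(200, 5), (180, 8), (160, 10)]),
    (0, [(205, 5), (184.5, 8), (164, 9)]),
    (0, [(200, 6), (170, 8)]),
]
profile = backoff_profile(history)
```

Offsets that fail the threshold are simply absent from the result, so the caller can tell "no stable history" apart from a measured ratio.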

8. Final anchor prediction

The final anchor estimate is the weighted baseline after applying recovery, position, and trend adjustments. That strength estimate is then translated back into load using a target rep count from the user's recent anchor history, clamped to a practical working range.

$$S_{\mathrm{pred}} = B \times R \times P \times T$$
$$w_{\mathrm{pred}} = \frac{S_{\mathrm{pred}}}{1 + \frac{\min(r_{\mathrm{target}},\, 12)}{30}}$$
Interpretation. $S_{\mathrm{pred}}$ is predicted anchor strength. The resulting load $w_{\mathrm{pred}}$ is recovered from the target anchor reps, rounded to the available gym increment, and sanity-clamped against recent reality.

Before the recommendation is surfaced, the implementation applies a few guardrails. The predicted load is rounded to the available gym increment, then sanity-clamped against the most recent anchor. Upward movement is intentionally limited to one increment. Downward movement is also bounded, with a little more room allowed after a longer layoff.
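The anchor prediction and its guardrails might look like the sketch below. The 5 lb increment and the `max_down_increments` bound are hypothetical placeholders: the source says upward movement is limited to one increment but does not give the downward bound or how it widens after a layoff.

```python
def predict_anchor_load(
    baseline: float,
    recovery: float,
    position: float,
    trend: float,
    target_reps: int,
    last_anchor_load: float,
    increment: float = 5.0,        # hypothetical gym increment in pounds
    max_down_increments: int = 2,  # hypothetical downward bound
) -> float:
    """Convert predicted strength back to a rounded, sanity-clamped load."""
    s_pred = baseline * recovery * position * trend
    raw_load = s_pred / (1 + min(target_reps, 12) / 30)
    rounded = round(raw_load / increment) * increment
    upper = last_anchor_load + increment  # at most one increment up
    lower = last_anchor_load - max_down_increments * increment
    return max(lower, min(rounded, upper))


# A strong baseline still moves the recommendation up by only one increment.
capped_up = predict_anchor_load(300.0, 1.0, 1.0, 1.0, 6, last_anchor_load=200.0)
```

A caller wanting more downward room after a long layoff could pass a larger `max_down_increments` when the recovery multiplier is low; that policy is not specified here.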

$$\hat{w}_j = \hat{w}_{\mathrm{anchor}} \times \operatorname{median}(\rho_j)$$
$$\hat{r}_j = \hat{r}_{\mathrm{anchor}} + \operatorname{median}(\delta_j)$$
Interpretation. The predicted weight and reps for later set $j$ are derived from the anchor prediction and the user's median backoff structure.

This produces a recommendation that behaves like a plausible workout for today rather than a disconnected estimate for one set in isolation.
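Extending the anchor across the visible rows can be sketched as follows. This assumes the anchor occupies the first visible row and that the profile is keyed by offset after the anchor. The `fallback` value (repeat the anchor) is only a placeholder, since the source does not define its conservative stepped pattern, and the 5 lb increment is likewise hypothetical.

```python
from typing import Dict, List, Tuple


def predict_visible_sets(
    anchor_load: float,
    anchor_reps: int,
    profile: Dict[int, Tuple[float, float]],  # offset -> (median ratio, median rep delta)
    visible_sets: int,
    increment: float = 5.0,                      # hypothetical gym increment
    fallback: Tuple[float, float] = (1.0, 0.0),  # placeholder, not the product's pattern
) -> List[Tuple[float, int]]:
    """Return (load, reps) for each visible row, anchor first."""
    if visible_sets <= 0:
        return []
    sets: List[Tuple[float, int]] = [(anchor_load, anchor_reps)]
    for offset in range(1, visible_sets):
        ratio, rep_delta = profile.get(offset, fallback)
        load = round(anchor_load * ratio / increment) * increment
        reps = max(1, round(anchor_reps + rep_delta))  # never suggest zero reps
        sets.append((load, reps))
    return sets


plan = predict_visible_sets(200, 5, {1: (0.9, 3), 2: (0.8, 4)}, visible_sets=4)
```

Here the fourth row has no profile entry, so it takes the placeholder fallback while the second and third rows follow the user's own backoff history.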

Confidence scoring

A recommendation without confidence invites false precision. The model therefore scores the output using four dimensions: history depth, internal consistency of recent sessions, recency of last exposure, and positional match. The confidence score is not meant to impress. It is meant to communicate how much trust the available evidence deserves.

$$C = 0.35H + 0.30K + 0.20E + 0.15M$$
Interpretation. $C$ is the final confidence score. $H$ is history depth, $K$ is consistency, $E$ is exposure recency, and $M$ is positional match.

The confidence label is intentionally coarse. High confidence means the data is dense and coherent. Medium confidence means the recommendation is useful but should be treated with some caution. Low confidence means the product is surfacing a best-effort starting point, not a strong claim.

The implementation also applies hard conservative rules on top of the weighted score. A prediction based on only one matching session is forced to low confidence, and bodyweight-only predictions cannot exceed medium confidence.
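The weighted score and the hard rules can be sketched together. The source does not define how the four components are computed or where the label cutoffs sit, so this sketch assumes each component is already scaled to the range 0 to 1 and uses hypothetical cutoffs of 0.75 and 0.50.

```python
from typing import Tuple


def confidence(
    history_depth: float,      # H, assumed to lie in [0, 1]
    consistency: float,        # K, assumed to lie in [0, 1]
    exposure_recency: float,   # E, assumed to lie in [0, 1]
    positional_match: float,   # M, assumed to lie in [0, 1]
    matching_sessions: int,
    bodyweight_only: bool,
    high_cutoff: float = 0.75,   # hypothetical threshold
    medium_cutoff: float = 0.50, # hypothetical threshold
) -> Tuple[float, str]:
    """Return the confidence score C and a coarse label."""
    score = (
        0.35 * history_depth
        + 0.30 * consistency
        + 0.20 * exposure_recency
        + 0.15 * positional_match
    )
    if score >= high_cutoff:
        label = "high"
    elif score >= medium_cutoff:
        label = "medium"
    else:
        label = "low"
    # Hard conservative rules applied on top of the weighted score.
    if matching_sessions <= 1:
        label = "low"
    elif bodyweight_only and label == "high":
        label = "medium"
    return score, label


score, label = confidence(1.0, 1.0, 1.0, 1.0, matching_sessions=1, bodyweight_only=False)
```

In this example the weighted score is at its maximum, yet the label is still forced to low because only one matching session exists.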

Limitations and product framing

The predictor is intentionally narrow. It performs best when the user is repeating a familiar exercise with somewhat stable logging patterns. It becomes less certain when history is sparse, when bodyweight-only work dominates, when training intent changes sharply, or when an exercise has not been performed in a long time.

It also cannot distinguish between different reasons for the same logged output. A conservative set, a fatigued set, and a set performed during a calorie deficit may all appear similar in the stored data. That ambiguity is not a model bug. It is a property of the information the product currently observes.

For that reason, the Scored Heuristic Predictor is framed as a recommendation engine rather than a perfect predictor, a programming oracle, or a hidden physiology model. Its job is to provide a credible place to start. Its job is not to claim certainty where the evidence does not support it.