Analysis Modules
19 analysis modules organized into 5 tiers — from classical statistics to AI/ML prediction.
| # | Book | Author | Primary Contribution |
|---|---|---|---|
| 1 | Secrets of Winning Lotto & Lottery | Avery Cardoza | Positional analysis, cluster analysis, bankroll management |
| 2 | Lottery Master Guide | Gail Howard | Hot/cold tracking, wheeling systems, pattern recognition |
| 3 | Lotto: How to Wheel a Fortune | Gail Howard | Abbreviated wheels, key number wheels, coverage systems |
| 4 | The Mathematics of Lottery | Catalin Barboianu | Formal probability, combinatorial design, covering theory |
| 5 | Lottery Winning Strategies & 70% Win Formula | Gail Howard | 70% sum range rule, 9+5 selection/group tips |
| 6 | AI and the Lottery: Defying Odds | Gary Covella | LSTM prediction, ensemble ML, Monte Carlo simulation |
| 7 | Lottery: The Algorithm that Beat Chance | José Proaño | Mandel covering designs, expected value, syndicate strategy |
Tier 1 — Core Statistical Analysis
Tracks how often each number has been drawn across three time windows:
- Short-term — last 10 drawings
- Medium-term — last 30 drawings
- Long-term — last 100 drawings
Numbers are classified as Hot (top 20%), Cold (bottom 20%), or Overdue. Trend direction (Rising / Falling) is derived by comparing short-term vs. long-term rates. Output feeds the heatmap grid on the dashboard.
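The windowed counting and Hot/Cold labelling described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the "Neutral" and "Flat" labels, and the choice to rank on the medium-term window are all assumptions.

```python
from collections import Counter

def window_rates(draws, window):
    """Per-number hit rate over the most recent `window` draws."""
    recent = draws[-window:]
    counts = Counter(n for draw in recent for n in draw)
    return {n: c / len(recent) for n, c in counts.items()}

def classify(draws, pool_size, window=30, share=0.20):
    """Label the top `share` of numbers Hot and the bottom `share` Cold."""
    rates = window_rates(draws, window)
    ranked = sorted(range(1, pool_size + 1),
                    key=lambda n: rates.get(n, 0.0), reverse=True)
    cut = max(1, int(pool_size * share))
    labels = {n: "Neutral" for n in ranked}  # "Neutral" is a hypothetical label
    for n in ranked[:cut]:
        labels[n] = "Hot"
    for n in ranked[-cut:]:
        labels[n] = "Cold"
    return labels

def trend(draws, n, short=10, long=100):
    """Compare the short-term rate with the long-term rate for number `n`."""
    short_rate = window_rates(draws, short).get(n, 0.0)
    long_rate = window_rates(draws, long).get(n, 0.0)
    if short_rate > long_rate:
        return "Rising"
    if short_rate < long_rate:
        return "Falling"
    return "Flat"  # hypothetical label for equal rates
```

Ties in the ranking are broken by number order here; a real implementation would need an explicit tie rule.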
Each winning draw is sorted ascending. Module 2 builds a positional frequency matrix: how often each number appears in position 1, 2, 3 … 6 (Lotto) or 1–4 (Two Step) or 1–5 (Powerball).
When generating picks, numbers are preferentially placed into their statistically strongest position. This is Cardoza's key innovation — it constrains the search space dramatically.
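One way to build the positional frequency matrix is sketched below. The names `positional_matrix` and `strongest_position` are hypothetical, and positions are 0-based here for convenience.

```python
from collections import defaultdict

def positional_matrix(draws):
    """matrix[position][number] -> count, with each draw sorted ascending."""
    matrix = defaultdict(lambda: defaultdict(int))
    for draw in draws:
        for pos, n in enumerate(sorted(draw)):
            matrix[pos][n] += 1
    return matrix

def strongest_position(matrix, n):
    """0-based position where `n` appeared most often, or None if never drawn."""
    best = max(matrix, key=lambda pos: matrix[pos].get(n, 0), default=None)
    if best is None or matrix[best].get(n, 0) == 0:
        return None
    return best
```

A pick generator could then prefer candidates for slot `pos` whose `strongest_position` equals `pos`.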
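Counts how often every pair (i, j) appears in the same draw, and extends the same count to triplets. Computes an Affinity Score: pair_count / expected_count, where expected_count is the count the pair would reach by chance alone. A value above 1.5 marks a strong cluster.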
Anti-clusters are pairs that almost never appear together — these are excluded during pick generation. The pair network chart shows the top 20 pairs visually.
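The Affinity Score can be sketched as below. The source does not say how expected_count is computed, so this example assumes a uniform model in which every pair is equally likely: expected_count = draws × k(k−1) / (n(n−1)) for k numbers drawn from a pool of n. The function name is hypothetical.

```python
from collections import Counter
from itertools import combinations

def pair_affinity(draws, pool_size):
    """Map each observed pair to pair_count / expected_count (uniform model)."""
    picks = len(draws[0])
    # Assumption: under uniform drawing, each pair is expected this many times.
    expected = len(draws) * picks * (picks - 1) / (pool_size * (pool_size - 1))
    counts = Counter(
        pair for draw in draws for pair in combinations(sorted(draw), 2)
    )
    return {pair: count / expected for pair, count in counts.items()}

def strong_clusters(affinity, threshold=1.5):
    """Pairs whose affinity exceeds the strong-cluster threshold."""
    return sorted(pair for pair, score in affinity.items() if score > threshold)
```

Pairs that never appear together are simply absent from the result, which corresponds to an affinity of 0.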
Filters picks to match historical distribution patterns. ~80% of Texas Lotto jackpots had a 3/3, 4/2, or 2/4 odd-to-even split. Combinations of 6/0 or 5/1 are penalized.
High/Low: The Texas Lotto range splits at 28, and picks with roughly equal counts of high and low numbers dominate winners. The same logic is applied per game, and the check acts as a hard filter gate.
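A small sketch of both balance checks follows. It assumes that "splits at 28" means 1–27 counts as low and 28–54 as high, and that the allowed odd counts for a 6-number pick are 2, 3, or 4 (the 3/3, 4/2, and 2/4 splits). The function names are hypothetical.

```python
def odd_even_split(pick):
    """Return (odd_count, even_count) for a pick."""
    odd = sum(n % 2 for n in pick)
    return odd, len(pick) - odd

def high_low_split(pick, first_high=28):
    """Return (high_count, low_count); `first_high` is the assumed boundary."""
    high = sum(n >= first_high for n in pick)
    return high, len(pick) - high

def passes_odd_even(pick, allowed_odd=(2, 3, 4)):
    """True when the odd count matches one of the favoured splits."""
    odd, _ = odd_even_split(pick)
    return odd in allowed_odd
```

For example, `[3, 11, 18, 29, 40, 52]` has three odd and three high numbers, so it passes both checks.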
The sum of a 6-number pick from 1–54 can range from 21 (1+2+3+4+5+6) to 309 (49+50+51+52+53+54). But real winners cluster tightly in the middle.
70% Rule: Following Gail Howard's 70% sum range rule, the module targets the middle 70% of historical winning sums, which is the band between the 15th and 85th percentiles of the sum distribution. Any generated pick outside this band is flagged and penalized in the composite score. This is a primary validation gate.
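The percentile band can be sketched as below. The nearest-rank percentile method and the function names are assumptions; the source does not specify how the percentiles are computed.

```python
def percentile(sorted_values, q):
    """Nearest-rank percentile for an integer q in 1..100."""
    rank = max(1, (q * len(sorted_values) + 99) // 100)  # integer ceiling
    return sorted_values[rank - 1]

def sum_band(draws, lo=15, hi=85):
    """Return (low_sum, high_sum) bounding the middle band of historical sums."""
    sums = sorted(sum(draw) for draw in draws)
    return percentile(sums, lo), percentile(sums, hi)

def in_band(pick, band):
    """True when the pick's sum falls inside the band (inclusive)."""
    return band[0] <= sum(pick) <= band[1]
```

The band is recomputed from whatever history is loaded, so it shifts as new draws are added.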
For each number, computes: average skip (mean draws between appearances), current skip (draws since last seen), and due score:
due_score = current_skip / average_skip
- Due score > 1.5 → Overdue — consider including
- Due score < 0.5 → Recently hit — may cool off
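The skip statistics and due score can be sketched as follows. This example assumes a skip is the number of draws missed between two appearances, and it returns `None` when a number has fewer than two appearances or an average skip of zero. The "Unknown" and "Neutral" labels are hypothetical.

```python
def skip_stats(draws, n):
    """Return (average_skip, current_skip) for number `n`, or None."""
    hits = [i for i, draw in enumerate(draws) if n in draw]
    if len(hits) < 2:
        return None  # not enough appearances to define an average skip
    gaps = [b - a - 1 for a, b in zip(hits, hits[1:])]  # draws missed between hits
    average_skip = sum(gaps) / len(gaps)
    current_skip = len(draws) - 1 - hits[-1]  # draws since last seen
    return average_skip, current_skip

def due_score(draws, n):
    """current_skip / average_skip, or None when it is undefined."""
    stats = skip_stats(draws, n)
    if stats is None or stats[0] == 0:
        return None
    return stats[1] / stats[0]

def due_label(score):
    """Apply the 1.5 and 0.5 thresholds from the list above."""
    if score is None:
        return "Unknown"
    if score > 1.5:
        return "Overdue"
    if score < 0.5:
        return "Recently hit"
    return "Neutral"
```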
Texas Lotto numbers are grouped into decades: 1–9, 10–19, 20–29, 30–39, 40–49, 50–54. Most jackpot winners span 4–5 different groups.
Picks that concentrate in only 1–2 groups (e.g., all numbers 30–39) are penalized. The pick generator enforces minimum group spread based on historical data.
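Group spread is simple to compute with integer division, as this sketch shows. The default minimum of 4 groups is an assumption taken from the "4–5 different groups" observation above; the real threshold is said to come from historical data.

```python
def decade_group(n):
    """Map 1-9 -> 0, 10-19 -> 1, ..., 50-54 -> 5."""
    return n // 10

def group_spread(pick):
    """Number of distinct decade groups the pick touches."""
    return len({decade_group(n) for n in pick})

def has_min_spread(pick, min_groups=4):
    """True when the pick spans at least `min_groups` groups (assumed default)."""
    return group_spread(pick) >= min_groups
```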
Calculates what percentage of historical winning draws contained at least one consecutive pair (e.g., 12 and 13 both drawn), and what percentage contained two or more.
These rates calibrate the pick generator: it does not blindly avoid all consecutive pairs, but limits them to match the historical rate.
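The two historical rates can be computed as in this sketch (function names are hypothetical). A run such as 5, 6, 7 is counted as two consecutive pairs.

```python
def consecutive_pairs(draw):
    """Count adjacent values that differ by exactly 1 after sorting."""
    ordered = sorted(draw)
    return sum(b - a == 1 for a, b in zip(ordered, ordered[1:]))

def consecutive_rates(draws):
    """Return (share with >= 1 consecutive pair, share with >= 2)."""
    counts = [consecutive_pairs(draw) for draw in draws]
    total = len(counts)
    at_least_one = sum(c >= 1 for c in counts) / total
    two_or_more = sum(c >= 2 for c in counts) / total
    return at_least_one, two_or_more
```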
Tier 2 — Advanced Mathematical Analysis
Implements Barboianu's formal lottery matrix model L(n, k, p, t) for exact combinatorial probability at every prize tier. Uses the hypergeometric formula:
P = C(k,t) × C(n-k, p-t) / C(n,p)
Outputs an exact odds table for Texas Lotto, Two Step, and Powerball. For multi-ticket plays, calculates how probability changes with N tickets.
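The hypergeometric formula maps directly onto `math.comb`. This sketch assumes the symbols mean: n = pool size, k = numbers drawn, p = numbers played on a ticket, t = matches. It covers single-pool games only; bonus-ball games such as Powerball need an extra factor for the second pool, which is not shown here.

```python
from math import comb

def match_probability(n, k, p, t):
    """P(exactly t matches) = C(k,t) * C(n-k, p-t) / C(n,p)."""
    return comb(k, t) * comb(n - k, p - t) / comb(n, p)

def odds_table(n, k, p):
    """Probability of each match count from 0 to min(k, p)."""
    return {t: match_probability(n, k, p, t) for t in range(min(k, p) + 1)}

def jackpot_probability(n, k, tickets):
    """Jackpot chance with `tickets` distinct tickets of k numbers each."""
    return tickets / comb(n, k)
```

For a 6-of-54 game, `comb(54, 6)` is 25,827,165, so a single ticket's jackpot probability is 1 in 25,827,165, and N distinct tickets raise it linearly to N in 25,827,165.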
Given a pool of N candidate numbers and a budget of M tickets, generates ticket sets that maximize pair coverage using a greedy algorithm.
- Full Wheel — every combination
- Abbreviated — budget-optimized subset
- Key Number — fixed numbers on every ticket
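A greedy pair-coverage wheel can be sketched as below. This is an illustrative brute-force version that enumerates every candidate ticket, so it is only practical for small pools; the function name and parameters are hypothetical. A budget at least as large as the number of combinations yields a full wheel, a smaller budget yields an abbreviated wheel, and `key_numbers` fixes numbers on every ticket.

```python
from itertools import combinations

def greedy_wheel(pool, ticket_size, budget, key_numbers=()):
    """Pick up to `budget` tickets, each time taking the one that covers
    the most pairs not yet covered by earlier tickets."""
    key = tuple(key_numbers)
    free = [n for n in pool if n not in key]
    candidates = [
        tuple(sorted(key + combo))
        for combo in combinations(free, ticket_size - len(key))
    ]
    uncovered = set(combinations(sorted(pool), 2))
    tickets = []
    for _ in range(min(budget, len(candidates))):
        best = max(
            candidates,
            key=lambda t: len(uncovered.intersection(combinations(t, 2))),
        )
        tickets.append(best)
        candidates.remove(best)
        uncovered.difference_update(combinations(best, 2))
    return tickets
```

Greedy selection does not guarantee an optimal covering design; it only makes the locally best choice at each step.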
Calculates EV per ticket at any jackpot level:
EV = Σ(prize × probability) − cost
Accounts for taxes, annuity discount, and jackpot-split probability at high amounts. Outputs a FAVORABLE / CONSIDER / HOLD signal with a breakeven threshold.
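The EV formula and breakeven threshold can be sketched as follows. The `keep` parameter is a hypothetical simplification that folds taxes, the annuity discount, and the expected jackpot split into a single retained fraction of the advertised jackpot; the actual module may model these separately. The signal thresholds are not given in the source, so they are not reproduced here.

```python
def lower_tier_value(lower_tiers):
    """Sum of prize * probability over the fixed (non-jackpot) tiers."""
    return sum(prize * prob for prize, prob in lower_tiers)

def expected_value(jackpot, p_jackpot, lower_tiers, cost, keep=1.0):
    """EV = sum(prize * probability) - cost, with the jackpot scaled by `keep`."""
    return keep * jackpot * p_jackpot + lower_tier_value(lower_tiers) - cost

def breakeven_jackpot(p_jackpot, lower_tiers, cost, keep=1.0):
    """Advertised jackpot at which the expected value is exactly zero."""
    return (cost - lower_tier_value(lower_tiers)) / (keep * p_jackpot)
```

With a toy game where the jackpot probability is 1/1000, one lower tier pays 10 with probability 0.01, and a ticket costs 1, the breakeven jackpot is 900.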
Tier 3 — Machine Learning & AI
A Bidirectional LSTM (Long Short-Term Memory) model trained on sliding windows of 10–20 historical draws. Each draw is one-hot encoded. The network outputs a probability score for each number appearing in the next draw.
Trained on 80% of data, validated on 20%. Models are saved to ml_models/ and reused until new draws are uploaded.
Important: On truly random data, LSTM predictions converge toward a uniform distribution. Any value lies in detecting short-term artifacts, not in predicting truly random outcomes.
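The data preparation for such a model can be sketched in plain Python; the network itself is not shown. Two assumptions are made here: each draw is encoded as a vector with one slot per number (1 where the number was drawn), and the 80/20 split is chronological so that validation data is always newer than training data. The function names are hypothetical.

```python
def encode_draw(draw, pool_size):
    """Vector of length `pool_size` with 1 at each drawn number's slot."""
    vec = [0] * pool_size
    for n in draw:
        vec[n - 1] = 1
    return vec

def make_windows(draws, pool_size, window=10):
    """Inputs are `window` consecutive encoded draws; the target is the next draw."""
    encoded = [encode_draw(draw, pool_size) for draw in draws]
    count = len(encoded) - window
    inputs = [encoded[i:i + window] for i in range(count)]
    targets = [encoded[i + window] for i in range(count)]
    return inputs, targets

def chronological_split(inputs, targets, train_share=0.8):
    """Oldest `train_share` of samples for training, the rest for validation."""
    cut = int(len(inputs) * train_share)
    return (inputs[:cut], targets[:cut]), (inputs[cut:], targets[cut:])
```

The resulting nested lists have shape (samples, window, pool_size) for inputs and (samples, pool_size) for targets, which is the layout a sequence model would typically consume.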
Ensemble: Random Forest + Gradient Boosting trained on features (frequency, recency, positional strength, gap, momentum). Averaged with LSTM for final ML score.
Monte Carlo: Generates 100,000+ synthetic draws using historical frequency distributions. Numbers and combinations that appear most in simulations get a higher score.
Monte Carlo acts as a validation layer — if a generated pick rarely appears in simulation, it is flagged for review.
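A frequency-weighted simulation can be sketched with the standard library. The add-one smoothing (so never-drawn numbers can still appear), the fixed seed, and the function names are assumptions, not details from the source.

```python
import random
from collections import Counter

def simulate_draw(weights, picks, rng):
    """Weighted sampling without replacement of `picks` numbers."""
    numbers = list(weights)
    w = [weights[n] for n in numbers]
    draw = []
    for _ in range(picks):
        i = rng.choices(range(len(numbers)), weights=w)[0]
        draw.append(numbers.pop(i))  # remove so the number cannot repeat
        w.pop(i)
    return sorted(draw)

def monte_carlo_counts(history, pool_size, picks, n_sims=100_000, seed=0):
    """Count how often each number appears across simulated draws."""
    freq = Counter(n for draw in history for n in draw)
    # Assumption: add-one smoothing keeps every number's weight positive.
    weights = {n: freq.get(n, 0) + 1 for n in range(1, pool_size + 1)}
    rng = random.Random(seed)
    simulated = Counter()
    for _ in range(n_sims):
        simulated.update(simulate_draw(weights, picks, rng))
    return simulated
```

Because the weights come from historical frequencies, the simulated counts largely mirror those frequencies; the simulation adds information mainly at the combination level.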
Tier 4–5 — Pick Generation & Jackpot Intelligence
| Factor | Default Weight | Source Module |
|---|---|---|
| Medium-term frequency | 15% | Module 1 |
| Positional strength | 12% | Module 2 |
| Cluster affinity | 10% | Module 3 |
| Due score | 12% | Module 7 |
| Trend momentum | 8% | Module 1 |
| Short-term heat | 8% | Module 1 |
| Group balance contribution | 5% | Module 8 |
| LSTM prediction score | 10% | Module 12 |
| Ensemble ML score | 10% | Module 13 |
| Monte Carlo frequency | 5% | Module 13 |
| Coverage value | 5% | Module 11 |
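The default weights in the table sum to 100%, so the composite score can be read as a weighted average. The sketch below assumes every factor has already been normalized to the range 0 to 1; the dictionary keys and function name are hypothetical.

```python
# Default weights from the table above; key names are illustrative only.
DEFAULT_WEIGHTS = {
    "medium_term_frequency": 0.15,
    "positional_strength": 0.12,
    "cluster_affinity": 0.10,
    "due_score": 0.12,
    "trend_momentum": 0.08,
    "short_term_heat": 0.08,
    "group_balance": 0.05,
    "lstm_score": 0.10,
    "ensemble_score": 0.10,
    "monte_carlo_frequency": 0.05,
    "coverage_value": 0.05,
}

def composite_score(factors, weights=DEFAULT_WEIGHTS):
    """Weighted sum of normalized factor scores; missing factors count as 0."""
    return sum(weight * factors.get(name, 0.0) for name, weight in weights.items())
```

A pick that scores 1.0 on every factor gets a composite of 1.0, and one that scores 1.0 only on medium-term frequency gets 0.15.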