Data & Methods — Upstart Clearing Simulator

How the simulator was built, where the data comes from, and how borrowers flow through the risk model.

The borrowers on this page are generated by the same algorithm as the simulator, so you can export from either page and compare results side by side.

Generating Synthetic Borrowers

The simulator generates borrowers using statistical distributions calibrated to public market data. Each borrower has 1,800+ potential features (like real Model 18); we display the core fields here.

1) FICO Score Distribution

Distribution: Normal(μ=667, σ=65) · Bounds: [520, 820]

Real Upstart population is bimodal (prime + near-prime). We simplified to unimodal for clarity.
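// Box-Muller transform: draws one sample from Normal(m, s)
function gauss(m, s) {
  let u = 0, v = 0;
  // Math.random() can return 0; redraw u so Math.log(u) stays finite (v is redrawn the same way)
  while (!u) u = Math.random();
  while (!v) v = Math.random();
  return m + s * Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}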
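
The listing above produces unbounded draws, while the distribution is stated with bounds of [520, 820]. A minimal sketch of one way to apply those bounds follows. It assumes clamping (rather than rejection resampling) and rounding to whole scores; `clamp` and `sampleFico` are hypothetical names, not confirmed simulator functions.

```javascript
// Box-Muller sampler, repeated from the listing above so this sketch runs on its own.
function gauss(m, s) {
  let u = 0, v = 0;
  while (!u) u = Math.random();
  while (!v) v = Math.random();
  return m + s * Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Hypothetical helper: restrict x to the closed interval [lo, hi].
function clamp(x, lo, hi) {
  return Math.min(hi, Math.max(lo, x));
}

// Hypothetical helper: one FICO draw from Normal(667, 65),
// rounded to an integer and clamped to the documented bounds [520, 820].
function sampleFico() {
  return clamp(Math.round(gauss(667, 65)), 520, 820);
}

// Example: a 25-borrower batch of FICO scores.
const ficoScores = Array.from({ length: 25 }, sampleFico);
console.log(ficoScores);
```

Clamping piles a small amount of probability mass onto the two bounds (roughly 2% of draws in total for these parameters). Resampling out-of-range draws would avoid that at the cost of a loop.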

2) Loan Amount Distribution

Distribution: LogNormal(median=$12,000, σ=0.45) · Bounds: [$2,000, $45,000]

Most loans cluster $8K–$18K. This aligns with Upstart-era personal-loan averages.
// LogNormal sampling: m is the median (exp of the log-scale mean), s is the log-scale σ
function lognorm(m, s) {
  return Math.exp(Math.log(m) + s * gauss(0, 1));
}
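
To connect the sampler to the stated bounds and to the "$8K–$18K" cluster, here is a short sketch. It assumes the bounds are applied by clamping, and `sampleLoanAmount` and `oneSigmaBand` are hypothetical names. Because the first argument of `lognorm` enters as `Math.log(m)`, $12,000 acts as the median, and about 68% of unclamped draws fall within one log-scale σ of it.

```javascript
// Samplers repeated from the listings above so this sketch runs on its own.
function gauss(m, s) {
  let u = 0, v = 0;
  while (!u) u = Math.random();
  while (!v) v = Math.random();
  return m + s * Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

function lognorm(m, s) {
  return Math.exp(Math.log(m) + s * gauss(0, 1));
}

// Hypothetical helper: one loan amount, clamped to the documented bounds [$2,000, $45,000].
function sampleLoanAmount() {
  return Math.min(45000, Math.max(2000, lognorm(12000, 0.45)));
}

// One-sigma band around the median: median * exp(-σ) to median * exp(+σ).
const oneSigmaBand = [12000 * Math.exp(-0.45), 12000 * Math.exp(0.45)];
console.log(oneSigmaBand.map(Math.round)); // roughly $7.7K to $18.8K
```

The band is in line with the $8K–$18K cluster described above.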

3) Feature Engineering: From FICO to Risk Features

Representative engineered features (real model uses many more):

FICO · risk_fico_tier · risk_score_normalized · employment_tier · income_volatility_est · cash_flow_signal · dti_proxy

The Hidden-Prime Discovery

In the simulation, 28% of borrowers with FICO 580–720 are flagged hidden-prime on the basis of non-bureau signals (cash-flow consistency, tenure, payment behavior). Model 18 can then offer a materially lower APR while preserving partner return floors.
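
A minimal sketch of how the 28% rate could be applied as a simulation parameter: a single random draw per borrower inside the 580–720 band. This stands in for the non-bureau signals rather than modeling them; `flagHiddenPrime` and the injectable `rand` argument are hypothetical.

```javascript
const HIDDEN_PRIME_RATE = 0.28; // share of FICO 580–720 borrowers flagged hidden-prime
const BAND_MIN = 580;
const BAND_MAX = 720;

// Hypothetical helper: returns true when the borrower is flagged hidden-prime.
// `rand` defaults to Math.random and can be replaced to make runs reproducible.
function flagHiddenPrime(fico, rand = Math.random) {
  const inBand = fico >= BAND_MIN && fico <= BAND_MAX;
  return inBand && rand() < HIDDEN_PRIME_RATE;
}

// Example: flag a few borrowers.
console.log([560, 650, 700, 790].map((fico) => flagHiddenPrime(fico)));
```

Borrowers outside the band are never flagged in this sketch, which matches the rate being defined only for FICO 580–720.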

Classic FICO APR (650): 23.5%
Model 18 APR (Hidden Prime): 15.0%
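
The gap between the two APRs above equals the mean of the APR-reduction parameter listed under Data Assumptions & Calibration, Gauss(8.5%, 1.5%) clamped to [5, 12]. The sketch below assumes the reduction is in percentage points and is subtracted from the classic APR; `model18Apr` is a hypothetical name.

```javascript
// Box-Muller sampler, repeated so this sketch runs on its own.
function gauss(m, s) {
  let u = 0, v = 0;
  while (!u) u = Math.random();
  while (!v) v = Math.random();
  return m + s * Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Hypothetical helper: APR reduction in percentage points, Gauss(8.5, 1.5) clamped to [5, 12].
function sampleAprReduction() {
  return Math.min(12, Math.max(5, gauss(8.5, 1.5)));
}

// Hypothetical helper: hidden-prime APR = classic APR minus the reduction.
// Passing `reduction` explicitly makes the arithmetic reproducible.
function model18Apr(classicApr, reduction = sampleAprReduction()) {
  return classicApr - reduction;
}

console.log(model18Apr(23.5, 8.5)); // mean reduction applied to the classic 23.5% APR
console.log(model18Apr(23.5));      // one random draw
```

With the mean reduction of 8.5 points, a classic APR of 23.5% becomes 15.0%, matching the figures above.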

Dataset summary: total borrowers in dataset (25), average FICO, hidden-prime rate.

Dataset columns: # · Name · FICO · Income · Loan Amt · Purpose · Hidden Prime · Risk Score · P(Default) · Clearing APR · Outcome · Details


Case Study 1: Maria — Hidden-Prime Discovery

Borrower Profile: FICO 650, Income $78K, Loan $14K, Debt Consolidation, hidden-prime ON.
Layer 1 (Eligibility): Eligible partners: Eltura, Aperture. WestBank and the remaining partner slots fail on FICO floors.
Layer 2 (Pricing): Classic ≈ 23.5%; Model 18 lowers the APR materially using the cash-flow-derived hidden-prime signal. The lower payment reduces the modeled default probability, and the loan still clears the partner return floor.
Layer 3 (Routing): Waterfall checks partner APR floors + capacity in priority order; first tier that passes receives the loan.
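
The waterfall in Layer 3 can be sketched as a loop over partners in priority order. The partner names come from this case study, but the APR floors and capacities below are placeholder values, capacity is assumed to be counted in loans, and `routeLoan` is a hypothetical name.

```javascript
// Hypothetical partner tiers in priority order; floors and capacities are placeholders.
const waterfall = [
  { name: "Eltura", aprFloor: 14.0, capacity: 0 },   // full: no remaining capacity
  { name: "Aperture", aprFloor: 12.0, capacity: 3 },
];

// Returns the name of the first tier whose APR floor and capacity both pass,
// and consumes one unit of that tier's capacity. Returns null if no tier passes;
// the caller decides what happens next (Case Study 3 discusses balance-sheet placement).
function routeLoan(apr, tiers) {
  for (const tier of tiers) {
    const clearsFloor = apr >= tier.aprFloor;
    const hasCapacity = tier.capacity > 0;
    if (clearsFloor && hasCapacity) {
      tier.capacity -= 1;
      return tier.name;
    }
  }
  return null;
}

console.log(routeLoan(15.0, waterfall)); // Eltura is full, so the loan goes to Aperture
console.log(routeLoan(10.0, waterfall)); // below every floor, so null
```

Ordering the array by priority keeps the routing rule itself trivial: the first match wins, so changing partner priority is a data change rather than a logic change.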

Case Study 2: Carlos — Supply-Side Failure

Borrower Profile: FICO 590, Income $92K, Loan $20K, Purpose: Small Business.
Layer 1 (Eligibility): Small Business purpose is unsupported by marketplace partners in this simulation → 0/5 eligible.
Outcome: Fails before pricing. PM implication: this is a product/inventory gap, not necessarily a borrower-quality problem.
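
A minimal sketch of the Layer 1 filter behind this outcome. Only three of the five partners are named in this document, so the list below is partial, and the FICO floors and supported purposes are placeholder values chosen to reproduce the case-study outcomes. `eligiblePartners` is a hypothetical name.

```javascript
// Hypothetical eligibility rules; minFico and purposes are placeholders.
const partnerRules = [
  { name: "Eltura", minFico: 620, purposes: ["Debt Consolidation", "Home Improvement"] },
  { name: "Aperture", minFico: 640, purposes: ["Debt Consolidation", "Home Improvement"] },
  { name: "WestBank", minFico: 700, purposes: ["Debt Consolidation", "Home Improvement"] },
];

// Returns the names of partners whose FICO floor and purpose list both accept the borrower.
function eligiblePartners(borrower, rules) {
  return rules
    .filter((p) => borrower.fico >= p.minFico && p.purposes.includes(borrower.purpose))
    .map((p) => p.name);
}

const maria = { fico: 650, purpose: "Debt Consolidation" };
const carlos = { fico: 590, purpose: "Small Business" };

console.log(eligiblePartners(maria, partnerRules));  // Eltura and Aperture pass; WestBank's floor fails
console.log(eligiblePartners(carlos, partnerRules)); // empty: no partner supports the purpose
```

In this sketch Carlos would be rejected on purpose alone regardless of score, which is the supply-side gap the case study describes.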

Case Study 3: James — Clean Approval / Low-APR Placement

Borrower Profile: FICO 790, Income $140K, Loan $25K, Home Improvement, hidden-prime OFF.
Pricing: Both models produce low APR because baseline risk is already low.
Routing Insight: Extremely low APR can miss partner return floors, increasing probability of balance-sheet placement.

Primary Sources

Upstart IR: S-1 filing (2020), Q1/Q2 2022 earnings calls (Model 18, approval lift, APR delta, balance sheet discussions).

Federal/Industry: Federal Reserve consumer credit, TransUnion/Experian benchmarking.

Academic/Technical: XGBoost (Chen & Guestrin 2016), fairness in ML credit literature, Plaid cash-flow intelligence materials.

Data Assumptions & Calibration

Parameter | Source | Value | Notes
FICO distribution | Fed + TransUnion | Normal(μ≈660, σ=65) | Simplified from bimodal reality
Loan amount | S-1 + Experian | LogNormal(median=$12K, σ=0.45) | Clamped to [$2K, $45K]
Hidden-prime rate | IR-derived concept | 28% in FICO 580–720 | Simulation parameter
APR reduction | Model 18 narrative | Gauss(8.5%, 1.5%) | Clamped to [5%, 12%]
P(default) | Industry-shaped proxy | Linear FICO mapping | Didactic, not production
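
The table names a linear FICO mapping for P(default) without giving its endpoints. The sketch below shows the general shape only: the endpoint probabilities (25% at FICO 520, 2% at FICO 820) are made-up placeholders, not simulator values, and `pDefault` is a hypothetical name.

```javascript
const FICO_MIN = 520;
const FICO_MAX = 820;

// Hypothetical linear mapping from FICO to default probability.
// pAtMin and pAtMax are placeholder endpoints, not values taken from the simulator.
function pDefault(fico, pAtMin = 0.25, pAtMax = 0.02) {
  const bounded = Math.min(FICO_MAX, Math.max(FICO_MIN, fico));
  const t = (bounded - FICO_MIN) / (FICO_MAX - FICO_MIN); // 0 at FICO_MIN, 1 at FICO_MAX
  return pAtMin + t * (pAtMax - pAtMin);
}

console.log(pDefault(520)); // placeholder upper endpoint
console.log(pDefault(670)); // midpoint of the FICO range
console.log(pDefault(820)); // placeholder lower endpoint
```

A straight line is easy to reason about but ignores the curvature that real default curves typically show, which is consistent with the "didactic, not production" note in the table.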

Known Simplifications

Unimodal FICO, simplified default modeling, estimated partner capacities, no fraud/income verification layer, no macro-rate regime shifts.

Contact & Attribution

This is an educational approximation for PM interview preparation, not an official Upstart model or internal tool. Public sources only; proprietary underwriting logic is not replicated.

Further Reading

Upstart Investor Relations: https://ir.upstart.com/

XGBoost paper: Chen, T. & Guestrin, C. (2016). XGBoost: A Scalable Tree Boosting System. KDD.

Plaid Cash Flow Intelligence: https://plaid.com/

TransUnion Consumer Research: https://www.transunion.com/
