AI Model Inference
Load the ONNX model, run a forward pass, convert logits to probabilities, and map the prediction to an ATC-20 safety placard.
ModelManager class
```python
# app/models/manager.py
import numpy as np
import onnxruntime as ort


class ModelManager:
    LABELS = ["no-damage", "minor-damage", "major-damage", "destroyed"]

    def __init__(self, onnx_path: str):
        # Load the ONNX graph once; the session is reused for every request.
        self.session = ort.InferenceSession(
            onnx_path, providers=["CPUExecutionProvider"])
        # Name of the graph's first input tensor.
        self.input_name = self.session.get_inputs()[0].name

    def predict(self, tensor: np.ndarray) -> np.ndarray:
        # run() returns a list of outputs; [0] selects the first output.
        logits = self.session.run(None, {self.input_name: tensor})[0]
        return logits[0]  # drop the batch dimension -> shape (4,)
```
Plain-English explanation
A small class that owns the loaded model and exposes one method: predict(tensor) → logits.
Why it matters
Encapsulating the model in a class means main.py loads it once at startup rather than on every request, so individual calls do not pay the model-loading cost again.
Line by line
- `ort.InferenceSession(path)`: Loads the .onnx graph into memory.
- `providers=["CPUExecutionProvider"]`: Run on CPU; swap for CUDA on GPU machines.
- `session.get_inputs()[0].name`: Look up the input tensor name baked into the ONNX graph.
- `session.run(None, feed)`: `None` = return all outputs. `feed` maps input name → numpy array.
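To make the double indexing in predict concrete, here is a small sketch with a stand-in session. StubSession is hypothetical: it only mimics the shape of what session.run returns (a list with one entry per graph output, written here as nested lists rather than NumPy arrays) and is not part of onnxruntime.

```python
class StubSession:
    """Hypothetical stand-in that mimics the shape of InferenceSession.run output."""

    def run(self, output_names, feed):
        # One graph output, batch size 1, four class logits (made-up values).
        return [[[2.0, 0.5, -1.0, 0.1]]]


def predict(session, input_name, tensor):
    outputs = session.run(None, {input_name: tensor})  # list of all outputs
    first_output = outputs[0]  # shape (1, 4): a batch of one
    return first_output[0]     # shape (4,): logits for the single image


logits = predict(StubSession(), "input", tensor=None)
print(logits)  # [2.0, 0.5, -1.0, 0.1]
```

The first index picks an output from the list, the second drops the batch dimension, which is why predict hands back exactly four numbers.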
Load model at startup
```python
# app/main.py
from fastapi import FastAPI

from app.models.manager import ModelManager

app = FastAPI()
model: ModelManager | None = None  # set once by the startup hook


@app.on_event("startup")
def load_model():
    global model
    model = ModelManager("app/models/damage_v1.onnx")
    print("Model loaded:", model.LABELS)
```
Plain-English explanation
Run the model loader exactly once when Uvicorn starts. Store the instance in a module-level variable.
Why it matters
Startup hooks guarantee the model is ready before the first /classify call arrives.
Line by line
- `@app.on_event("startup")`: FastAPI runs this function during server boot. Newer FastAPI releases mark `on_event` as deprecated in favor of lifespan handlers.
- `global model`: Allow the function to assign to the module-level name.
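Because model starts out as None, any code that runs before the startup hook (a unit test, an import-time call) would fail with an unhelpful AttributeError. One possible guard, sketched below on the assumption that handlers fetch the model through a helper, is to fail with a clear message instead. get_model and FakeManager are hypothetical names used only for illustration.

```python
class FakeManager:
    """Hypothetical stand-in for ModelManager so the sketch runs without a model file."""
    LABELS = ["no-damage", "minor-damage", "major-damage", "destroyed"]


model = None  # module-level slot, filled by the startup hook


def load_model():
    global model
    model = FakeManager()


def get_model():
    # Fail loudly if a caller runs before the startup hook has filled the slot.
    if model is None:
        raise RuntimeError("Model not loaded yet; has the startup hook run?")
    return model


try:
    get_model()
    loaded_before_startup = True
except RuntimeError:
    loaded_before_startup = False

load_model()
first = get_model()
second = get_model()
print(loaded_before_startup, first is second)  # False True
```

Every call after startup returns the same instance, which is the point of keeping the session in a module-level variable.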
Softmax — convert logits to probabilities
```python
import numpy as np


def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())  # numerical stability
    return e / e.sum()
```
Plain-English explanation
Turn raw network outputs (any real numbers) into a probability distribution that sums to 1.
Why it matters
ATC-20 logic compares probabilities, not raw logits. Without softmax you cannot threshold a confidence.
Line by line
- `x - x.max()`: Subtracting the max prevents overflow when logits are large.
- `np.exp(...)`: Exponentiate so values become positive.
- `/ e.sum()`: Normalize so the result sums to 1.0.
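A short worked example of why the max is subtracted, written in plain Python with the math module so the arithmetic is easy to follow (the function names are illustrative, not the tutorial's). With logits around 1000 the naive form overflows, while the shifted form gives the same answer it would for logits [-1, 0]. In NumPy the naive form would typically produce inf and then nan rather than raising an exception.

```python
import math


def softmax_naive(logits):
    exps = [math.exp(v) for v in logits]  # math.exp(1000.0) overflows a float
    total = sum(exps)
    return [e / total for e in exps]


def softmax_stable(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]  # largest exponent is now 0
    total = sum(exps)
    return [e / total for e in exps]


big = [1000.0, 1001.0]

try:
    softmax_naive(big)
    naive_ok = True
except OverflowError:
    naive_ok = False

probs = softmax_stable(big)
print(naive_ok)                      # False
print([round(p, 4) for p in probs])  # [0.2689, 0.7311]
```

Shifting every logit by the same constant leaves the output unchanged because the constant cancels between numerator and denominator.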
Confidence and entropy
```python
import numpy as np


def confidence(probs: np.ndarray) -> float:
    return float(probs.max())


def entropy(probs: np.ndarray) -> float:
    # Shannon entropy in nats, normalized by log(N)
    p = probs.clip(1e-9, 1.0)
    h = -(p * np.log(p)).sum()
    return float(h / np.log(len(p)))
```
Plain-English explanation
Confidence is the highest probability. Entropy measures how spread-out the distribution is — high entropy = uncertain.
Why it matters
A confident prediction with entropy 0.1 is far more trustworthy than one with entropy 0.9, even if the top class is the same.
Line by line
- `probs.clip(1e-9, 1.0)`: Avoid log(0).
- `-(p * np.log(p)).sum()`: Shannon entropy formula.
- `/ np.log(len(p))`: Normalize to [0, 1] so different class counts are comparable.
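Two hand-checkable cases illustrate the normalized entropy: a uniform distribution scores 1.0 and a sharply peaked one scores close to 0. The sketch uses plain Python instead of NumPy, and normalized_entropy is an illustrative name rather than the tutorial's function; the probabilities are made up.

```python
import math


def normalized_entropy(probs):
    # Clip to avoid log(0), then divide by log(N) to rescale to roughly [0, 1].
    clipped = [min(max(p, 1e-9), 1.0) for p in probs]
    h = -sum(p * math.log(p) for p in clipped)
    return h / math.log(len(clipped))


uniform = [0.25, 0.25, 0.25, 0.25]  # the model has no preference
peaked = [0.97, 0.01, 0.01, 0.01]   # the model strongly prefers one class

print(round(normalized_entropy(uniform), 3))  # 1.0
print(round(normalized_entropy(peaked), 3))   # 0.121
```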
ATC-20 mapping
```python
def to_placard(label: str, conf: float) -> tuple[str, str]:
    if conf < 0.55:  # low-confidence default
        return "YELLOW", "Confidence low — engineer inspection required."
    if label == "no-damage":
        return "GREEN", "Safe to enter — routine inspection only."
    if label == "minor-damage":
        return "YELLOW", "Restricted use — limited entry permitted."
    if label == "major-damage":
        return "ORANGE", "Engineer inspection required before re-entry."
    return "RED", "Unsafe — do NOT enter."
```
Plain-English explanation
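Map the model's class label to an ATC-20-style placard color plus a human-readable engineer recommendation. ATC-20 itself defines three placards, INSPECTED (green), RESTRICTED USE (yellow), and UNSAFE (red); the ORANGE tier used here for major damage is this project's own addition.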
Why it matters
The placard is what gets posted on the actual building. The mapping is conservative: low confidence always escalates to a human.
Line by line
- `if conf < 0.55`: Default to YELLOW whenever the model is not confident.
- `tuple[str, str]`: Returns (placard, recommendation).
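The branch order matters: the confidence check runs first, so even a "destroyed" prediction is reported as YELLOW when confidence is below the threshold, and a confidence of exactly 0.55 counts as confident. The table-driven sketch below restates the same color logic to make that boundary easy to check. placard_color, THRESHOLD, and COLOR_BY_LABEL are illustrative names, and the recommendation strings are left out for brevity.

```python
THRESHOLD = 0.55  # same cut-off as in to_placard

COLOR_BY_LABEL = {
    "no-damage": "GREEN",
    "minor-damage": "YELLOW",
    "major-damage": "ORANGE",
}


def placard_color(label: str, conf: float) -> str:
    if conf < THRESHOLD:
        return "YELLOW"  # low confidence overrides the predicted class
    # Any label not in the table (including "destroyed") falls through to RED,
    # mirroring the final return in to_placard.
    return COLOR_BY_LABEL.get(label, "RED")


cases = [("destroyed", 0.54), ("destroyed", 0.55),
         ("no-damage", 0.90), ("major-damage", 0.70)]
for label, conf in cases:
    print(label, conf, placard_color(label, conf))
```

One consequence worth noticing is that the final fallback also sends any unrecognized label to RED, which keeps the default on the cautious side.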
Real /classify response
```python
# router, model, preprocess, validate_filename, and the response schemas
# are defined elsewhere in the app.
from fastapi import File, UploadFile


@router.post("", response_model=ClassifyResponse)
async def classify(file: UploadFile = File(...)):
    raw = await file.read()
    validate_filename(file.filename, len(raw))
    tensor = preprocess(raw, file.filename)  # from Week 2

    logits = model.predict(tensor)
    probs = softmax(logits)
    idx = int(probs.argmax())
    label = ModelManager.LABELS[idx]
    conf = confidence(probs)
    placard, rec = to_placard(label, conf)

    return ClassifyResponse(
        filename=file.filename, placard=placard, top_class=label,
        confidence=conf, entropy=entropy(probs),
        classes=[DamageClass(label=l, probability=float(p))
                 for l, p in zip(ModelManager.LABELS, probs)],
        recommendation=rec,
    )
```
Plain-English explanation
The complete endpoint: preprocess → predict → softmax → placard. Every line builds on Weeks 1 and 2.
Why it matters
This is the production handler. Everything else (batch, compare, visualise) wraps this same core.
Line by line
- `model.predict(tensor)`: Calls the ONNX session and returns the raw logits.
- `probs.argmax()`: Index of the most likely class.
- `DamageClass(label=l, probability=float(p))`: Build the per-class list for the frontend chart.
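The handler's keyword arguments imply the shape of the response. As a rough sketch, the two schemas could look like the dataclasses below. This is an assumption inferred from the call site, not the project's actual definitions: the field types are guesses, and in a FastAPI app they would more likely be Pydantic models. The file name and probabilities are made up for illustration.

```python
import math
from dataclasses import asdict, dataclass


@dataclass
class DamageClass:
    label: str
    probability: float


@dataclass
class ClassifyResponse:
    filename: str
    placard: str
    top_class: str
    confidence: float
    entropy: float
    classes: list[DamageClass]
    recommendation: str


labels = ["no-damage", "minor-damage", "major-damage", "destroyed"]
probs = [0.05, 0.10, 0.70, 0.15]  # made-up probabilities

response = ClassifyResponse(
    filename="example.png",  # hypothetical file name
    placard="ORANGE",
    top_class=labels[probs.index(max(probs))],
    confidence=max(probs),
    # Normalized Shannon entropy, same formula as the entropy() helper.
    entropy=-sum(p * math.log(p) for p in probs) / math.log(len(probs)),
    classes=[DamageClass(label=l, probability=p) for l, p in zip(labels, probs)],
    recommendation="Engineer inspection required before re-entry.",
)
print(asdict(response)["classes"][2])  # {'label': 'major-damage', 'probability': 0.7}
```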