Image Upload + Preprocessing
Validate uploaded files, read 16-bit GeoTIFF satellite imagery with Rasterio, normalize bands, and shape the tensor the model expects.
.tif from xBD, Sentinel, or Maxar.
File validation by extension and size
```python
from fastapi import HTTPException

ALLOWED = {".tif", ".tiff", ".jpg", ".jpeg", ".png"}
MAX_BYTES = 50 * 1024 * 1024  # 50 MB

def validate_filename(name: str, size: int):
    ext = "." + name.rsplit(".", 1)[-1].lower()
    if ext not in ALLOWED:
        raise HTTPException(415, f"Unsupported file type: {ext}")
    if size > MAX_BYTES:
        raise HTTPException(413, "File larger than 50 MB")
```
Plain-English explanation
Reject uploads that are not images, or are too large, before they reach the model.
Why it matters
If you skip validation, a 4 GB file or a Word document can crash the server. Fail fast with a clear HTTP error.
Line by line
ALLOWED = {...}: A set of extensions we will accept.
HTTPException(415, ...): 415 = Unsupported Media Type. 413 = Payload Too Large.
name.rsplit(".", 1): Splits the filename once from the right so 'foo.bar.tif' → 'tif'.
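The same check can be exercised without a running server. The sketch below is a standard-library variant, not the tutorial's own code: UploadRejected, check_upload, and status_for are hypothetical names, and pathlib's suffix stands in for the rsplit call so that a name with no extension yields an empty string instead of the whole filename.

```python
from pathlib import Path

ALLOWED = {".tif", ".tiff", ".jpg", ".jpeg", ".png"}
MAX_BYTES = 50 * 1024 * 1024  # 50 MB


class UploadRejected(Exception):
    """Hypothetical stand-in for HTTPException: carries a status code."""

    def __init__(self, status: int, detail: str):
        super().__init__(detail)
        self.status = status
        self.detail = detail


def check_upload(name: str, size: int) -> str:
    """Return the lowercased extension, or raise UploadRejected."""
    ext = Path(name).suffix.lower()  # "" when the name has no extension
    if ext not in ALLOWED:
        raise UploadRejected(415, f"Unsupported file type: {ext or '(none)'}")
    if size > MAX_BYTES:
        raise UploadRejected(413, "File larger than 50 MB")
    return ext


def status_for(name: str, size: int) -> int:
    """Return 200 if the upload would be accepted, else the rejection code."""
    try:
        check_upload(name, size)
    except UploadRejected as err:
        return err.status
    return 200
```

Calling status_for("report.docx", 1024) gives the 415 path, and a .tif one byte over MAX_BYTES gives the 413 path.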
Which HTTP status means 'unsupported media type'?
Magic-byte detection
```python
from fastapi import HTTPException

def detect_type(b: bytes) -> str:
    # Compare the leading bytes against known file signatures.
    if b[:4] == b"\x89PNG": return "png"
    if b[:3] == b"\xff\xd8\xff": return "jpeg"
    if b[:4] in (b"II*\x00", b"MM\x00*"): return "tiff"
    raise HTTPException(415, "Not an image")
```
Plain-English explanation
Read the first few bytes of the file and compare them to the standard signatures for PNG, JPEG, and TIFF.
Why it matters
A user can rename mal.exe to sat.tif. Trusting the extension is a security hole; the magic bytes show what the file actually contains.
Line by line
b[:4] == b"\x89PNG": Every PNG starts with the bytes 89 50 4E 47.
"II*\x00" or "MM\x00*": TIFF signatures. II = little-endian, MM = big-endian.
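As a quick way to see the signatures at work, here is a table-driven sketch that needs only the standard library. SIGNATURES and sniff are hypothetical names; unlike detect_type it returns None instead of raising, and it matches the full 8-byte PNG signature instead of only the first four bytes.

```python
from typing import Optional

# (signature, label) pairs, matched against the leading bytes of the file.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),  # full 8-byte PNG signature
    (b"\xff\xd8\xff", "jpeg"),
    (b"II*\x00", "tiff"),           # little-endian TIFF
    (b"MM\x00*", "tiff"),           # big-endian TIFF
]


def sniff(head: bytes) -> Optional[str]:
    """Return 'png', 'jpeg', or 'tiff' if a known signature matches, else None."""
    for signature, label in SIGNATURES:
        if head.startswith(signature):
            return label
    return None


# A Windows executable renamed to sat.tif still starts with b"MZ".
renamed_exe = sniff(b"MZ\x90\x00")
real_tiff = sniff(b"II*\x00\x08\x00\x00\x00")
```

Here renamed_exe is None and real_tiff is "tiff", whatever extension the upload carried.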
Why check magic bytes after the extension?
Read a GeoTIFF with Rasterio
```python
import rasterio
import numpy as np

def read_geotiff(path: str):
    with rasterio.open(path) as src:
        arr = src.read()  # shape: (bands, H, W) dtype=uint16
        meta = {
            "crs": str(src.crs),
            "transform": list(src.transform)[:6],
            "bounds": list(src.bounds),
            "width": src.width,
            "height": src.height,
            "count": src.count,
        }
    return arr, meta
```
Plain-English explanation
Open the GeoTIFF, read every band into a NumPy array, and capture the geospatial metadata.
Why it matters
Rasterio preserves the full 16-bit range and the CRS — both needed for accurate inference and for showing the user where on Earth the tile came from.
Line by line
with rasterio.open(path) as src: Context manager closes the file when the block exits.
src.read(): Returns shape (bands, H, W). Note: bands first, not last.
str(src.crs): Coordinate reference system, e.g. 'EPSG:4326'.
src.transform: Affine matrix mapping pixel (col, row) → geographic (x, y).
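To make the affine transform concrete: the six coefficients kept in meta["transform"] are (a, b, c, d, e, f), and a pixel position maps to x = a*col + b*row + c and y = d*col + e*row + f. The sketch below applies that formula in plain Python. pixel_to_xy is a hypothetical helper, and the coefficients are made-up values for a north-up tile with 10-unit pixels.

```python
def pixel_to_xy(transform, col, row):
    """Map a pixel position to CRS coordinates using six affine coefficients."""
    a, b, c, d, e, f = transform
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


# Made-up north-up transform: 10-unit pixels, origin at the top-left corner.
transform = [10.0, 0.0, 500000.0, 0.0, -10.0, 4000000.0]

corner = pixel_to_xy(transform, 3, 2)      # top-left corner of pixel (col=3, row=2)
center = pixel_to_xy(transform, 3.5, 2.5)  # center of the same pixel
```

Integer (col, row) values give the pixel's top-left corner; adding 0.5 to each gives its center. The negative e coefficient is why y decreases as the row index grows.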
Why use Rasterio instead of OpenCV for GeoTIFF?
Select bands 5, 3, 2 (false-color)
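```python
import numpy as np

# False-color composite from bands 5, 3, 2 (1-indexed).
# In Landsat 8/9 numbering these are NIR, Green, Blue; other sensors number
# bands differently, so confirm the band order of your file.
def select_532(arr: np.ndarray) -> np.ndarray:
    b5, b3, b2 = arr[4], arr[2], arr[1]  # 0-indexed: band n is arr[n - 1]
    rgb = np.stack([b5, b3, b2], axis=-1)  # H, W, 3
    return rgb
```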
Plain-English explanation
Pick the three bands that highlight vegetation, urban damage, and water, then stack them as an RGB-like image.
Why it matters
The team's model was trained on this exact 5-3-2 false-color composite. Feeding raw bands would produce meaningless predictions.
Line by line
arr[4], arr[2], arr[1]: Bands are 1-indexed in remote sensing literature; NumPy is 0-indexed.
np.stack([...], axis=-1): Stack along the last axis so the result is (H, W, 3).
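The off-by-one between band numbers and array indices is easy to get wrong, so here is a small sketch that takes 1-indexed band numbers and does the subtraction in one place. select_bands is a hypothetical generalization of select_532, and the input is a synthetic 6-band array, not real imagery.

```python
import numpy as np


def select_bands(arr: np.ndarray, bands) -> np.ndarray:
    """Stack the given 1-indexed bands of a (bands, H, W) array into (H, W, n)."""
    indices = [b - 1 for b in bands]  # band n lives at arr[n - 1]
    if min(indices) < 0 or max(indices) >= arr.shape[0]:
        raise ValueError(f"bands {tuple(bands)} out of range for {arr.shape[0]} bands")
    return np.stack([arr[i] for i in indices], axis=-1)


# Synthetic 6-band, 2x2 tile: every pixel in band n starts at (n - 1) * 4.
arr = np.arange(6 * 2 * 2, dtype=np.uint16).reshape(6, 2, 2)
rgb = select_bands(arr, (5, 3, 2))
```

The result has shape (2, 2, 3), and its first pixel holds the first values of arr[4], arr[2], and arr[1] in that order.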
If bands are 1-indexed in docs, what NumPy index is band 5?
Normalize 16-bit imagery to [0, 1]
```python
import numpy as np

def normalize(img: np.ndarray) -> np.ndarray:
    img = img.astype(np.float32)
    lo, hi = np.percentile(img, (2, 98))  # robust min / max
    img = np.clip((img - lo) / (hi - lo + 1e-6), 0, 1)
    return img
```
Plain-English explanation
Stretch the 2nd to 98th percentile of pixel values to fill the 0–1 range that the model expects.
Why it matters
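Raw uint16 values can span 0–65535, but a real scene usually occupies only a small part of that range. Fed unscaled, the values fall far outside the 0–1 range the model expects; divided naively by 65535, the image looks almost uniformly dark. Either way the network predicts garbage.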
Line by line
img.astype(np.float32): Convert from uint16 so subtraction does not underflow.
np.percentile(img, (2, 98)): Robust min/max that ignores outliers like sensor saturation.
np.clip(..., 0, 1): Values outside the percentile band get pinned.
+ 1e-6: Prevents division-by-zero on flat tiles.
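A small synthetic comparison shows what the percentiles buy. The sketch below builds a 10×10 tile with values near 1000 and a single saturated pixel, then scales it two ways. minmax and percentile_stretch are hypothetical helper names; percentile_stretch repeats the logic of normalize so the block runs on its own.

```python
import numpy as np


def minmax(img: np.ndarray) -> np.ndarray:
    """Plain min-max scaling, shown only for comparison."""
    img = img.astype(np.float32)
    return (img - img.min()) / (img.max() - img.min() + 1e-6)


def percentile_stretch(img: np.ndarray, lo_pct=2, hi_pct=98) -> np.ndarray:
    """Same logic as normalize: stretch the lo..hi percentile band to 0..1."""
    img = img.astype(np.float32)
    lo, hi = np.percentile(img, (lo_pct, hi_pct))
    return np.clip((img - lo) / (hi - lo + 1e-6), 0, 1)


# Synthetic tile: values 1000..1099 with one saturated pixel.
tile = np.arange(1000, 1100, dtype=np.uint16).reshape(10, 10)
tile[0, 0] = 65535

mm = minmax(tile)              # the outlier squeezes everything else near 0
ps = percentile_stretch(tile)  # the outlier is clipped; the rest spans 0..1

# A flat tile stays finite thanks to the 1e-6 term.
flat = percentile_stretch(np.full((4, 4), 7, dtype=np.uint16))
```

With min-max scaling the median pixel lands below 0.01, while the percentile stretch puts it near the middle of the range. The flat tile comes back as zeros, with no NaN or infinity.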
Why use the 2nd and 98th percentiles instead of min and max?
Capture geospatial metadata
```python
def extract_meta(src) -> dict:
    return {
        "crs": str(src.crs),
        "resolution": src.res,  # (x_size, y_size) in CRS units
        "bounds": tuple(src.bounds),
        "height": src.height,
        "width": src.width,
        "bands": src.count,
    }
```
Plain-English explanation
Return everything an engineer needs to locate the tile on a map and reproduce the read.
Why it matters
Inspection reports must cite WHERE a damaged building is. Without CRS and bounds the prediction is useless.
Line by line
src.res: Pixel size in CRS units (often meters).
src.bounds: (left, bottom, right, top) in CRS coordinates.
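Resolution and bounds are tied to the pixel grid: for a north-up raster with no rotation, the extent divided by the pixel size gives the width and height. The sketch below checks that relationship in plain Python. expected_shape is a hypothetical helper, and the bounds and resolution are made-up values, not read from a real file.

```python
def expected_shape(bounds, res):
    """Return (height, width) implied by bounds and pixel size.

    Assumes a north-up raster with no rotation.
    """
    left, bottom, right, top = bounds
    x_size, y_size = res
    width = round((right - left) / x_size)
    height = round((top - bottom) / y_size)
    return height, width


# Made-up tile: 5120 x 5120 CRS units at 10 units per pixel.
bounds = (500000.0, 3994880.0, 505120.0, 4000000.0)
res = (10.0, 10.0)
shape = expected_shape(bounds, res)
```

If the computed shape disagrees with the height and width in the metadata, the raster is probably rotated or the metadata was altered after the read.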
What does src.res represent?
Quality checks
```python
import numpy as np

# meta is the dict returned by extract_meta (it uses the "bands" key).
def quality_checks(arr: np.ndarray, meta: dict) -> list[str]:
    warnings = []
    if meta["bands"] < 3:
        warnings.append("Fewer than 3 bands — false-color disabled.")
    if arr.std() < 1.0:
        warnings.append("Flat image — possible cloud or fill.")
    if np.isnan(arr).any():
        warnings.append("NaNs detected — interpolation recommended.")
    return warnings
```
Plain-English explanation
Catch obvious data problems before they reach the model and confuse the engineer.
Why it matters
A confident wrong prediction on a cloudy tile is dangerous. Surface uncertainty up front.
Line by line
arr.std() < 1.0: Low standard deviation = flat = cloud / fill / corruption.
np.isnan(arr).any(): NaN propagates through softmax and produces undefined output.
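To see why a single NaN matters, the sketch below runs a minimal softmax over clean and contaminated scores, and measures the standard deviation of a constant tile. The softmax here is a generic textbook version written for illustration; it is not the model's output layer.

```python
import numpy as np


def softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over a 1-D array of scores."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()


clean = softmax(np.array([1.0, 2.0, 3.0]))
poisoned = softmax(np.array([1.0, np.nan, 3.0]))  # one NaN score

# A constant tile has zero spread, so it trips the arr.std() < 1.0 check.
flat_std = float(np.full((3, 8, 8), 1200, dtype=np.uint16).std())
```

The clean probabilities sum to 1, while every entry of poisoned is NaN: one bad value wipes out all class probabilities, not only its own.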
Why warn on low standard deviation?
Standard image preprocessing with PIL
```python
from PIL import Image
import numpy as np

def load_standard(path: str, size=(512, 512)) -> np.ndarray:
    img = Image.open(path).convert("RGB").resize(size)
    arr = np.asarray(img, dtype=np.float32) / 255.0  # H, W, 3
    return arr
```
Plain-English explanation
For JPG/PNG photos (drone or phone shots), use Pillow to load, resize, and convert to a normalized array.
Why it matters
Phone uploads are 8-bit RGB and need a simpler path than GeoTIFF. Same downstream shape, different reader.
Line by line
.convert("RGB"): Drops alpha/grayscale variations to a consistent 3-channel image.
.resize(size): The model expects exactly 512×512 input.
/ 255.0: Normalize 8-bit values to [0, 1].
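The effect of the conversion can be checked without any files on disk. The sketch below builds an RGBA image and a grayscale image in memory with Pillow and pushes both through the same steps as load_standard. to_model_array is a hypothetical helper that takes an already-open image instead of a path.

```python
import numpy as np
from PIL import Image


def to_model_array(img: Image.Image, size=(512, 512)) -> np.ndarray:
    """Convert any Pillow image to a float32 (H, W, 3) array in [0, 1]."""
    rgb = img.convert("RGB").resize(size)
    return np.asarray(rgb, dtype=np.float32) / 255.0


# In-memory test images: one with an alpha channel, one single-channel.
rgba = Image.new("RGBA", (640, 480), color=(255, 0, 0, 128))
gray = Image.new("L", (300, 200), color=128)

rgba_arr = to_model_array(rgba)
gray_arr = to_model_array(gray)
```

Both arrays come out as (512, 512, 3) float32, so the rest of the pipeline does not need to know what the original mode was. Pillow's size argument is (width, height), which matters once the target is not square.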
Why .convert("RGB") before resizing?
Reshape to (1, 3, H, W)
```python
import numpy as np

def to_tensor(img_hwc: np.ndarray) -> np.ndarray:
    # H, W, C → C, H, W → 1, C, H, W
    chw = np.transpose(img_hwc, (2, 0, 1))
    batch = np.expand_dims(chw, axis=0).astype(np.float32)
    return batch  # ready for ONNX Runtime
```
Plain-English explanation
Switch from height-width-channel to channel-first, then add a batch dimension. This is the exact shape ONNX expects.
Why it matters
PyTorch (and therefore ONNX export) uses NCHW. Feeding NHWC typically fails with a shape-mismatch error, and if the dimensions happen to line up, the outputs are silently wrong.
Line by line
np.transpose(img_hwc, (2, 0, 1)): Reorder axes: (H,W,C) → (C,H,W).
np.expand_dims(chw, axis=0): Add a leading dim so shape becomes (1, C, H, W).
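A tiny array makes the axis shuffle easy to verify by hand. The sketch below uses a 4×5×3 array of consecutive numbers, chosen so that no two dimensions are equal, and checks that the same value is reachable under both layouts. The sizes and variable names are arbitrary.

```python
import numpy as np

h, w, c = 4, 5, 3
img_hwc = np.arange(h * w * c, dtype=np.float32).reshape(h, w, c)

chw = np.transpose(img_hwc, (2, 0, 1))  # (C, H, W)
batch = np.expand_dims(chw, axis=0)     # (1, C, H, W)

# The pixel at row 2, column 3, channel 1 moves to batch[0, 1, 2, 3].
y, x, ch = 2, 3, 1
same_value = bool(batch[0, ch, y, x] == img_hwc[y, x, ch])
```

Using unequal sizes for H, W, and C is a handy habit when testing layout code, because a wrong permutation then shows up as a wrong shape instead of passing unnoticed.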
Which tensor layout does ONNX (from PyTorch) expect?
Update /classify with real preprocessing
```python
import os

from fastapi import File, UploadFile

# router and ClassifyResponse are defined elsewhere in the app.
@router.post("", response_model=ClassifyResponse)
async def classify(file: UploadFile = File(...)):
    raw = await file.read()
    validate_filename(file.filename, len(raw))
    kind = detect_type(raw)

    # Keep only the base name so a crafted filename cannot escape uploads/.
    os.makedirs("uploads", exist_ok=True)
    tmp = os.path.join("uploads", os.path.basename(file.filename))
    with open(tmp, "wb") as f:  # closes the handle once the bytes are written
        f.write(raw)

    if kind == "tiff":
        arr, meta = read_geotiff(tmp)
        rgb = select_532(arr)
    else:
        rgb = load_standard(tmp) * 65535  # match GeoTIFF scale
        meta = {"crs": None, "bands": 3}

    norm = normalize(rgb)
    tensor = to_tensor(norm)
    # Week 3: model.run(tensor) goes here.
    return ClassifyResponse(filename=file.filename, placard="GREEN",
        top_class="no-damage", confidence=0.72, entropy=0.83,
        classes=[...], recommendation="Stub — model wires up Week 3.")
```
Plain-English explanation
Wire every preprocessing function into the endpoint. The model call is the only missing piece — that arrives in Week 3.
Why it matters
By the end of this cell, your server accepts both GeoTIFF and standard images and produces the exact tensor the AI model will consume.
Line by line
raw = await file.read(): Read the upload into memory once so size and magic bytes can both be checked.
validate_filename / detect_type: Two-layer validation, extension and content.
to_tensor(norm): Final NCHW float32 tensor ready for ONNX Runtime.
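The readers in this section take a file path, which is why the endpoint writes the upload to disk first. One possible variation, sketched below with the standard library only, is to let tempfile pick the name so the client-supplied filename never becomes part of a path. save_upload is a hypothetical helper, and the caller is responsible for removing the file afterwards.

```python
import os
import tempfile


def save_upload(raw: bytes, suffix: str) -> str:
    """Write raw bytes to a uniquely named temp file and return its path."""
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
        tmp.write(raw)
        return tmp.name


# Round trip with placeholder bytes (not a valid image).
path = save_upload(b"II*\x00example", ".tif")
with open(path, "rb") as f:
    saved = f.read()
os.remove(path)  # clean up once the reader is done with the file
```

The suffix keeps the extension that path-based readers may look at, while the rest of the name is generated, so two uploads with the same filename cannot overwrite each other.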
Why save the upload to disk before reading?