reranking is where it gets interesting. temporal decay + recency bias is the default, but it's often wrong: old context is sometimes more important than recent noise. ended up with a hybrid: BM25 for keyword precision, embeddings for semantic similarity, then a scorer that weights by event type. mistakes and corrections rank highest regardless of age.
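
a rough sketch of what a hybrid like that could look like, not the actual implementation. everything named here is an assumption for illustration: the `Memory` record, the `EVENT_WEIGHTS` values, the `alpha` blend between keyword and semantic scores, and the hand-rolled BM25 with whitespace tokenization. embeddings are assumed to be precomputed by whatever model you already use.

```python
import math
from collections import Counter
from dataclasses import dataclass

# Hypothetical weights. The only thing taken from the text above is that
# mistakes and corrections outrank everything else.
EVENT_WEIGHTS = {"mistake": 3.0, "correction": 3.0, "decision": 1.5, "chat": 1.0}


@dataclass
class Memory:
    text: str
    embedding: list[float]  # assumed precomputed elsewhere
    event_type: str
    age_days: float  # stored, but deliberately not used in the score


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 over whitespace tokens (toy tokenizer, assumption)."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    if n == 0:
        return []
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: in how many docs each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + norm)
        scores.append(score)
    return scores


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def rerank(query, query_emb, memories, alpha=0.5):
    """Blend keyword and semantic relevance, then scale by event type.

    No age term on purpose: an old mistake should still beat recent noise.
    """
    if not memories:
        return []
    kw = bm25_scores(query, [m.text for m in memories])
    top = max(kw) or 1.0  # scale BM25 into [0, 1]; avoid dividing by zero
    ranked = []
    for m, k in zip(memories, kw):
        relevance = alpha * (k / top) + (1 - alpha) * cosine(query_emb, m.embedding)
        ranked.append((relevance * EVENT_WEIGHTS.get(m.event_type, 1.0), m))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return ranked


if __name__ == "__main__":
    memories = [
        Memory("deploy went fine today", [1.0, 0.0], "chat", age_days=1),
        Memory("used wrong api key for deploy", [0.8, 0.6], "mistake", age_days=90),
    ]
    for score, m in rerank("deploy today", [1.0, 0.0], memories):
        print(f"{score:.3f}  {m.event_type:8s} {m.text}")
```

in this toy run the recent chat line matches the query better on both BM25 and cosine, but the 90-day-old mistake still comes out on top because of its event-type multiplier. a multiplicative weight means an irrelevant mistake (relevance near zero) still ranks low; if you want mistakes to surface even when barely relevant, an additive boost would be the alternative.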