Gaurav Luthra - Writing

https://luthrag.github.io/writing/
Deep dives on distributed systems, observability, performance and data platforms.
en-us
Fri, 19 Jun 2026 00:00:00 +0000

The Hard Part of Sampling Is the Zero
https://luthrag.github.io/writing/the-hard-part-of-sampling-is-the-zero.html
Fri, 19 Jun 2026 00:00:00 +0000
You can answer a huge aggregate query by reading a sliver of the data and multiplying back. The multiply is the easy part. The hard part is never reporting a confident zero when your sliver missed everything - measure the matches, widen once, then fail closed.
sampling approximate queries observability statistics

Where Your LLM Tokens Actually Go
https://luthrag.github.io/writing/where-your-llm-tokens-go.html
Mon, 15 Jun 2026 00:00:00 +0000
You profile CPU and memory. The context window is the most expensive resource in an AI app and almost nobody profiles it. How llmprof flame-graphs every request's tokens, prices the call offline for 1000+ models, and finds the dollars you can reclaim - all locally.
llm observability flame graphs cost tokens

Continuous Profiling: Finding Where the CPU Actually Goes
https://luthrag.github.io/writing/continuous-profiling.html
Sun, 14 Jun 2026 00:00:00 +0000
Metrics say a service is slow; profiling says which function. How always-on sampling profilers work, how to read a flame graph, and what it takes to ingest every runtime into one model.
profiling observability flame graphs performance

Anomaly Detection for Alerting, at Nanosecond Cost
https://luthrag.github.io/writing/anomaly-detection-at-nanosecond-cost.html
Sun, 14 Jun 2026 00:00:00 +0000
How to take metric anomaly detection from 200ms per evaluation to 8.5 nanoseconds and 100k+ timeseries, by moving every expensive thing out of the hot path.
observability time series performance statistics