Intermediate
Situational
Given a REST API that begins to suffer high latency under load, outline the steps you would take to identify the bottleneck and describe at least three concrete code or infrastructure changes you might implement to improve throughput.
Backend Developer
General

Sample Answer

When a public API I owned spiked from 200 ms to 1.2 s at peak load (10k RPS), I first reproduced the load in a staging cluster and collected distributed traces along with CPU and memory profiles. The traces showed long database queries and frequent GC pauses caused by unmarshalling oversized JSON payloads. I prioritized fixes by expected impact: 1) add a read replica pool and offload heavy reads (reduced DB p99 by ~60%), 2) introduce request-level caching with Redis for hot endpoints (~45% cache hit rate), and 3) replace blocking JSON parsing with a streaming parser and add connection pooling in the HTTP client, cutting CPU usage and latency. I also added circuit breakers and rate limiting so failures wouldn’t amplify, and monitored throughput and the error budget until latency stabilized.
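As a rough illustration of fix 1), a read/write split over two connection pools might look like the following Go sketch using database/sql; the Postgres driver, DSNs, and pool sizes are assumptions for illustration, not details from the original incident.

```go
package main

import (
	"context"
	"database/sql"
	"log"
	"time"

	_ "github.com/lib/pq" // Postgres driver; an assumption, any database/sql driver works
)

// DB keeps separate connection pools for the primary (writes) and a read replica.
type DB struct {
	primary *sql.DB
	replica *sql.DB
}

func open(dsn string) (*sql.DB, error) {
	db, err := sql.Open("postgres", dsn)
	if err != nil {
		return nil, err
	}
	// Pool sizing is illustrative; tune it against measured connection usage.
	db.SetMaxOpenConns(50)
	db.SetMaxIdleConns(25)
	db.SetConnMaxLifetime(30 * time.Minute)
	return db, nil
}

// NewDB opens both pools from their DSNs.
func NewDB(primaryDSN, replicaDSN string) (*DB, error) {
	p, err := open(primaryDSN)
	if err != nil {
		return nil, err
	}
	r, err := open(replicaDSN)
	if err != nil {
		return nil, err
	}
	return &DB{primary: p, replica: r}, nil
}

// QueryRead routes read-only queries to the replica, keeping the primary
// free for writes and transactional reads.
func (d *DB) QueryRead(ctx context.Context, q string, args ...any) (*sql.Rows, error) {
	return d.replica.QueryContext(ctx, q, args...)
}

// ExecWrite keeps all writes on the primary.
func (d *DB) ExecWrite(ctx context.Context, q string, args ...any) (sql.Result, error) {
	return d.primary.ExecContext(ctx, q, args...)
}

func main() {
	// Placeholder DSNs for illustration only.
	db, err := NewDB("postgres://app@primary.internal/app", "postgres://app@replica.internal/app")
	if err != nil {
		log.Fatal(err)
	}
	defer db.primary.Close()
	defer db.replica.Close()
}
```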
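Fix 2), request-level caching for hot endpoints, could be sketched as a small middleware, assuming a Go service and the github.com/redis/go-redis/v9 client; the key scheme, 30-second TTL, and /users route are hypothetical.

```go
package main

import (
	"log"
	"net/http"
	"time"

	"github.com/redis/go-redis/v9"
)

var rdb = redis.NewClient(&redis.Options{Addr: "localhost:6379"}) // address is a placeholder

// cacheHotEndpoint serves cached JSON for a hot read endpoint, falling back
// to the real handler (and populating the cache) on a miss.
func cacheHotEndpoint(compute func(r *http.Request) ([]byte, error)) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		ctx := r.Context()
		key := "cache:" + r.URL.RequestURI()

		// Cache hit: return the stored body directly.
		if body, err := rdb.Get(ctx, key).Bytes(); err == nil {
			w.Header().Set("Content-Type", "application/json")
			w.Write(body)
			return
		}

		// Cache miss: compute the response, then store it with a short TTL
		// so hot keys stay warm without serving stale data for long.
		body, err := compute(r)
		if err != nil {
			http.Error(w, "upstream error", http.StatusBadGateway)
			return
		}
		rdb.Set(ctx, key, body, 30*time.Second)

		w.Header().Set("Content-Type", "application/json")
		w.Write(body)
	}
}

func main() {
	// Hypothetical hot endpoint whose real handler would hit the database.
	http.Handle("/users", cacheHotEndpoint(func(r *http.Request) ([]byte, error) {
		return []byte(`{"status":"ok"}`), nil
	}))
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```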
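Fix 3) combines two standard-library changes in Go: decode the response body as a stream instead of buffering and unmarshalling it in one call, and reuse keep-alive connections through an explicitly configured http.Transport. The Item payload type and upstream URL below are placeholders.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"time"
)

// Item is a placeholder for the real payload element type.
type Item struct {
	ID   int    `json:"id"`
	Name string `json:"name"`
}

// A shared client with an explicitly configured transport so requests reuse
// keep-alive connections instead of opening a new one per call.
var client = &http.Client{
	Timeout: 5 * time.Second,
	Transport: &http.Transport{
		MaxIdleConns:        100,
		MaxIdleConnsPerHost: 100,
		IdleConnTimeout:     90 * time.Second,
	},
}

// fetchItems decodes a JSON array element by element from the response
// stream, so the whole payload never has to be buffered and unmarshalled
// in one blocking call.
func fetchItems(url string) ([]Item, error) {
	resp, err := client.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	dec := json.NewDecoder(resp.Body)
	if _, err := dec.Token(); err != nil { // consume the opening '['
		return nil, fmt.Errorf("decode %s: %w", url, err)
	}

	var items []Item
	for dec.More() {
		var it Item
		if err := dec.Decode(&it); err != nil {
			return nil, fmt.Errorf("decode %s: %w", url, err)
		}
		items = append(items, it)
	}
	return items, nil
}

func main() {
	items, err := fetchItems("http://upstream.internal/items") // hypothetical upstream
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("fetched", len(items), "items")
}
```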
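For the operational protections, a rate-limiting middleware is straightforward to sketch with golang.org/x/time/rate; the 100 RPS limit and burst of 200 are assumed starting points, and the circuit breakers mentioned above would typically come from a dedicated library rather than a hand-rolled version like this.

```go
package main

import (
	"log"
	"net/http"

	"golang.org/x/time/rate"
)

// limiter allows roughly 100 requests per second with bursts of up to 200.
// These numbers are illustrative; in practice they come from measured
// capacity and the service's error budget.
var limiter = rate.NewLimiter(rate.Limit(100), 200)

// rateLimit sheds excess traffic with 429 responses so overload is rejected
// early instead of amplifying latency for every caller.
func rateLimit(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !limiter.Allow() {
			http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/hot", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		w.Write([]byte(`{"status":"ok"}`))
	})
	log.Fatal(http.ListenAndServe(":8080", rateLimit(mux)))
}
```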

Keywords

Reproduce under load and gather telemetry (traces, CPU, GC, DB slow logs)
Prioritize fixes that give biggest latency/throughput improvement (DB replicas, caching, parsing optimizations)
Add operational protections (circuit breakers, rate limits) and measure impact