Intermediate
Situational
Given a REST API that begins to suffer high latency under load, outline the steps you would take to identify the bottleneck and describe at least three concrete code or infrastructure changes you might implement to improve throughput.
Backend Developer
General

Sample Answer

When a public API I owned spiked from 200 ms to 1.2 s at peak load (10k RPS), I first reproduced the load in a staging cluster and collected distributed traces along with CPU and memory profiles. The traces showed long database queries and frequent GC pauses caused by unmarshalling oversized JSON payloads. I prioritized fixes by expected impact: 1) add a read replica pool and offload heavy reads (reduced DB p99 by ~60%), 2) introduce request-level caching with Redis for hot endpoints (~45% cache hit rate), and 3) replace blocking JSON parsing with a streaming parser and add connection pooling in the HTTP client, cutting CPU usage and latency. I also added circuit breakers and rate limiting so failures wouldn’t amplify, and monitored throughput and the error budget until latency stabilized.
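As a rough illustration of fix 1), a read/write split over two connection pools might look like the following Go sketch using database/sql; the Postgres driver, DSNs, and pool sizes are assumptions for illustration, not details from the original incident.

```go
package main

import (
	"context"
	"database/sql"
	"log"
	"time"

	_ "github.com/lib/pq" // Postgres driver; an assumption, any database/sql driver works
)

// DB keeps separate connection pools for the primary (writes) and a read replica.
type DB struct {
	primary *sql.DB
	replica *sql.DB
}

func open(dsn string) (*sql.DB, error) {
	db, err := sql.Open("postgres", dsn)
	if err != nil {
		return nil, err
	}
	// Pool sizing is illustrative; tune it against measured connection usage.
	db.SetMaxOpenConns(50)
	db.SetMaxIdleConns(25)
	db.SetConnMaxLifetime(30 * time.Minute)
	return db, nil
}

// NewDB opens both pools from their DSNs.
func NewDB(primaryDSN, replicaDSN string) (*DB, error) {
	p, err := open(primaryDSN)
	if err != nil {
		return nil, err
	}
	r, err := open(replicaDSN)
	if err != nil {
		return nil, err
	}
	return &DB{primary: p, replica: r}, nil
}

// QueryRead routes read-only queries to the replica, keeping the primary
// free for writes and transactional reads.
func (d *DB) QueryRead(ctx context.Context, q string, args ...any) (*sql.Rows, error) {
	return d.replica.QueryContext(ctx, q, args...)
}

// ExecWrite keeps all writes on the primary.
func (d *DB) ExecWrite(ctx context.Context, q string, args ...any) (sql.Result, error) {
	return d.primary.ExecContext(ctx, q, args...)
}

func main() {
	// Placeholder DSNs for illustration only.
	db, err := NewDB("postgres://app@primary.internal/app", "postgres://app@replica.internal/app")
	if err != nil {
		log.Fatal(err)
	}
	defer db.primary.Close()
	defer db.replica.Close()
}
```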
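Fix 2), request-level caching for hot endpoints, could be sketched as a small middleware, assuming a Go service and the github.com/redis/go-redis/v9 client; the key scheme, 30-second TTL, and /users route are hypothetical.

```go
package main

import (
	"log"
	"net/http"
	"time"

	"github.com/redis/go-redis/v9"
)

var rdb = redis.NewClient(&redis.Options{Addr: "localhost:6379"}) // address is a placeholder

// cacheHotEndpoint serves cached JSON for a hot read endpoint, falling back
// to the real handler (and populating the cache) on a miss.
func cacheHotEndpoint(compute func(r *http.Request) ([]byte, error)) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		ctx := r.Context()
		key := "cache:" + r.URL.RequestURI()

		// Cache hit: return the stored body directly.
		if body, err := rdb.Get(ctx, key).Bytes(); err == nil {
			w.Header().Set("Content-Type", "application/json")
			w.Write(body)
			return
		}

		// Cache miss: compute the response, then store it with a short TTL
		// so hot keys stay warm without serving stale data for long.
		body, err := compute(r)
		if err != nil {
			http.Error(w, "upstream error", http.StatusBadGateway)
			return
		}
		rdb.Set(ctx, key, body, 30*time.Second)

		w.Header().Set("Content-Type", "application/json")
		w.Write(body)
	}
}

func main() {
	// Hypothetical hot endpoint whose real handler would hit the database.
	http.Handle("/users", cacheHotEndpoint(func(r *http.Request) ([]byte, error) {
		return []byte(`{"status":"ok"}`), nil
	}))
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```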
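Fix 3) combines two standard-library changes in Go: decode the response body as a stream instead of buffering and unmarshalling it in one call, and reuse keep-alive connections through an explicitly configured http.Transport. The Item payload type and upstream URL below are placeholders.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"time"
)

// Item is a placeholder for the real payload element type.
type Item struct {
	ID   int    `json:"id"`
	Name string `json:"name"`
}

// A shared client with an explicitly configured transport so requests reuse
// keep-alive connections instead of opening a new one per call.
var client = &http.Client{
	Timeout: 5 * time.Second,
	Transport: &http.Transport{
		MaxIdleConns:        100,
		MaxIdleConnsPerHost: 100,
		IdleConnTimeout:     90 * time.Second,
	},
}

// fetchItems decodes a JSON array element by element from the response
// stream, so the whole payload never has to be buffered and unmarshalled
// in one blocking call.
func fetchItems(url string) ([]Item, error) {
	resp, err := client.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	dec := json.NewDecoder(resp.Body)
	if _, err := dec.Token(); err != nil { // consume the opening '['
		return nil, fmt.Errorf("decode %s: %w", url, err)
	}

	var items []Item
	for dec.More() {
		var it Item
		if err := dec.Decode(&it); err != nil {
			return nil, fmt.Errorf("decode %s: %w", url, err)
		}
		items = append(items, it)
	}
	return items, nil
}

func main() {
	items, err := fetchItems("http://upstream.internal/items") // hypothetical upstream
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("fetched", len(items), "items")
}
```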
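For the operational protections, a rate-limiting middleware is straightforward to sketch with golang.org/x/time/rate; the 100 RPS limit and burst of 200 are assumed starting points, and the circuit breakers mentioned above would typically come from a dedicated library rather than a hand-rolled version like this.

```go
package main

import (
	"log"
	"net/http"

	"golang.org/x/time/rate"
)

// limiter allows roughly 100 requests per second with bursts of up to 200.
// These numbers are illustrative; in practice they come from measured
// capacity and the service's error budget.
var limiter = rate.NewLimiter(rate.Limit(100), 200)

// rateLimit sheds excess traffic with 429 responses so overload is rejected
// early instead of amplifying latency for every caller.
func rateLimit(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !limiter.Allow() {
			http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/hot", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		w.Write([]byte(`{"status":"ok"}`))
	})
	log.Fatal(http.ListenAndServe(":8080", rateLimit(mux)))
}
```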

Keywords

Reproduce under load and gather telemetry (traces, CPU, GC, DB slow logs)
Prioritize fixes that give biggest latency/throughput improvement (DB replicas, caching, parsing optimizations)
Add operational protections (circuit breakers, rate limits) and measure impact