Go context timeouts: a practical recipe for fast APIs

Why slow requests can melt an API

A single slow request is rarely "just slow." While it waits, it keeps a goroutine alive, holds memory for buffers and response objects, and often occupies a database connection or a slot in a pool. When enough slow requests pile up, your API stops doing useful work because its limited resources are stuck waiting.

You usually feel it in three places. Goroutines accumulate and scheduling overhead rises, so latency gets worse for everyone. Database pools run out of free connections, so even fast queries start queueing behind slow ones. Memory climbs from in-flight data and partially built responses, which increases GC work.

Adding more servers often doesn't fix it. If each instance hits the same bottleneck (a small DB pool, one slow upstream, shared rate limits), you just move the queue around and pay more while errors still spike.

Picture a handler that fans out: it loads a user from PostgreSQL, calls a payments service, then calls a recommendation service. If the recommendation call hangs and nothing cancels it, the request never finishes. The DB connection might get returned, but the goroutine and HTTP client resources stay tied up. Multiply that by hundreds of requests and you get a slow meltdown.

The goal is simple: set a clear time limit, stop work when time is up, free resources, and return a predictable error. Go context timeouts give every step a deadline so work stops when the user is no longer waiting.

Context timeouts in plain terms

A context.Context is a small object you pass down your call chain so every layer agrees on one thing: when this request must stop. Timeouts are the common way to prevent one slow dependency from tying up your server.

A context can carry three kinds of information: a deadline (when work must stop), a cancellation signal (someone decided to stop early), and a few request-scoped values (use this sparingly, and never for big data).

Cancellation isn't magic. A context exposes a Done() channel. When it closes, the request is canceled or its time is up. Code that respects context checks Done() (often with a select) and returns early. You can also check ctx.Err() to learn why it ended, usually context.Canceled or context.DeadlineExceeded.
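As a minimal sketch of that check, the snippet below waits on either a simulated piece of work or ctx.Done(), then inspects the error to see why it stopped. The names waitOrStop and stopReason are illustrative and do not come from any library.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// waitOrStop simulates a blocking step that takes `work` to finish.
// It returns nil if the work completes first, or ctx.Err() if the
// context ends first.
func waitOrStop(ctx context.Context, work time.Duration) error {
	timer := time.NewTimer(work)
	defer timer.Stop()

	select {
	case <-timer.C:
		return nil // work finished in time
	case <-ctx.Done():
		return ctx.Err() // context.Canceled or context.DeadlineExceeded
	}
}

// stopReason runs a 200ms step under a timeout given in milliseconds
// and reports how it ended.
func stopReason(timeoutMs int) string {
	ctx, cancel := context.WithTimeout(context.Background(),
		time.Duration(timeoutMs)*time.Millisecond)
	defer cancel()

	err := waitOrStop(ctx, 200*time.Millisecond)
	switch {
	case err == nil:
		return "finished"
	case errors.Is(err, context.DeadlineExceeded):
		return "deadline exceeded"
	case errors.Is(err, context.Canceled):
		return "canceled"
	default:
		return "other"
	}
}

func main() {
	fmt.Println(stopReason(20))   // timeout shorter than the work
	fmt.Println(stopReason(1000)) // timeout longer than the work
}
```

The first call stops on the deadline and the second lets the work finish. Real code would replace the timer with a DB or HTTP call that accepts ctx.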

Use context.WithTimeout for "stop after X seconds." Use context.WithDeadline when you already know the exact cutoff time. Use context.WithCancel when a parent condition should stop the work (client disconnected, user navigated away, you already have the answer).
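To make the difference concrete, here is a small sketch that exercises all three constructors. It also shows one property worth remembering: a child context cannot outlive its parent, so a later child deadline is capped by the parent's. The function names are made up for illustration.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// cancelReason shows WithCancel: the context ends when cancel is called,
// and Err reports context.Canceled.
func cancelReason() string {
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // for example, "we already have the answer"
	<-ctx.Done()
	return ctx.Err().Error()
}

// timeoutReason shows WithTimeout: the context ends on its own after
// the given duration.
func timeoutReason() string {
	ctx, cancel := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer cancel()
	<-ctx.Done()
	return ctx.Err().Error()
}

// childCappedByParent shows WithDeadline under a shorter parent: the
// child's effective deadline is the parent's, not the later cutoff.
func childCappedByParent() bool {
	parent, cancelParent := context.WithTimeout(context.Background(), 100*time.Millisecond)
	defer cancelParent()

	child, cancelChild := context.WithDeadline(parent, time.Now().Add(time.Hour))
	defer cancelChild()

	parentDeadline, _ := parent.Deadline()
	childDeadline, _ := child.Deadline()
	return childDeadline.Equal(parentDeadline)
}

func main() {
	fmt.Println(cancelReason())        // context canceled
	fmt.Println(timeoutReason())       // context deadline exceeded
	fmt.Println(childCappedByParent()) // true
}
```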

When a context is canceled, the correct behavior is boring but important: stop doing work, stop waiting on slow I/O, and return a clear error. If a handler is waiting on a database query and the context ends, return quickly and let the database call abort if it supports context.

Set time limits at the API boundary

The safest place to stop slow requests is the boundary where traffic enters your service. If a request is going to time out, you want it to happen predictably and early, not after it has tied up goroutines, DB connections, and memory.

Start at the edge (load balancer, API gateway, reverse proxy) and set a hard cap for how long any request is allowed to live. That protects your Go service even if a handler forgets to set a timeout.

Inside your Go server, set HTTP timeouts so the server doesn't wait forever for a slow client or a stalled response. At a minimum, configure timeouts for reading headers, reading the full request body, writing the response, and keeping idle connections alive.
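A sketch of those four settings on http.Server follows. The durations are placeholders to show the shape of the config, not recommendations, and newServer is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newServer returns an http.Server with explicit timeouts.
// The durations are illustrative starting points only.
func newServer(addr string, h http.Handler) *http.Server {
	return &http.Server{
		Addr:              addr,
		Handler:           h,
		ReadHeaderTimeout: 2 * time.Second,  // time allowed to read request headers
		ReadTimeout:       5 * time.Second,  // time allowed to read the whole request, body included
		WriteTimeout:      10 * time.Second, // time allowed to write the response
		IdleTimeout:       60 * time.Second, // how long an idle keep-alive connection may stay open
	}
}

func main() {
	srv := newServer(":8080", http.NewServeMux())
	fmt.Println(srv.ReadHeaderTimeout, srv.ReadTimeout, srv.WriteTimeout, srv.IdleTimeout)
	// In a real program: log.Fatal(srv.ListenAndServe())
}
```

These settings bound connection I/O. They do not cancel the handler's context, so the handler still needs its own deadline.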

Pick a default request budget that matches your product. For many APIs, 1 to 3 seconds is a reasonable starting point for typical requests, with a higher limit for known slow operations like exports. The exact number matters less than being consistent, measuring it, and having a clear rule for exceptions.

Streaming responses need extra care. It's easy to create an accidental infinite stream where the server keeps the connection open and writes tiny chunks forever, or waits forever before the first byte. Decide upfront whether an endpoint is truly a stream. If it isn't, enforce a maximum total time and a maximum time-to-first-byte.

Once the boundary has a clear deadline, it's much easier to propagate that deadline through the whole request.

Step by step: add a timeout to an HTTP handler

The simplest place to start is the HTTP handler. It's where one request enters your system, so it's a natural place to set a hard limit.

1) Derive a timed context

Create a new context with a deadline, and make sure you cancel it. Then pass that context to anything that might block: database work, HTTP calls, or slow computations.

func (s *Server) GetUser(w http.ResponseWriter, r *http.Request) {
	ctx, cancel := context.WithTimeout(r.Context(), 2*time.Second)
	defer cancel()

	userID := r.URL.Query().Get("id")
	if userID == "" {
		http.Error(w, "missing id", http.StatusBadRequest)
		return
	}

	user, err := s.loadUser(ctx, userID)
	if err != nil {
		writeError(w, ctx, err)
		return
	}

	writeJSON(w, http.StatusOK, user)
}

2) Pass ctx into every blocking call

A good rule: if a function can wait on I/O, it should accept a context.Context. Keep handlers readable by pushing details into small helper functions like loadUser.

func (s *Server) loadUser(ctx context.Context, id string) (User, error) {
	return s.repo.GetUser(ctx, id) // repo should use QueryRowContext/ExecContext
}

3) Return early when the context is done

If the deadline is hit (or the client disconnects), stop work and return a user-friendly response. A common mapping is context.DeadlineExceeded to 504 Gateway Timeout, and context.Canceled to "client is gone" (often no response body).

func writeError(w http.ResponseWriter, ctx context.Context, err error) {
	if errors.Is(err, context.DeadlineExceeded) {
		http.Error(w, "request timed out", http.StatusGatewayTimeout)
		return
	}
	if errors.Is(err, context.Canceled) {
		// Client went away. Avoid doing more work.
		return
	}
	http.Error(w, "internal error", http.StatusInternalServerError)
}

This pattern prevents pile-ups. Once the timer expires, every context-aware function down the chain gets the same stop signal and can exit quickly.
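One way to check the three steps together is to run a handler in-process against a deliberately slow dependency. The sketch below uses net/http/httptest for that. slowLoad, userHandler, and statusFor are hypothetical stand-ins, and the handler is a simplified version of GetUser rather than the exact code above.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// slowLoad stands in for a dependency call that takes `work` to finish
// and gives up as soon as ctx ends.
func slowLoad(ctx context.Context, work time.Duration) error {
	timer := time.NewTimer(work)
	defer timer.Stop()

	select {
	case <-timer.C:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// userHandler derives a timed context, passes it down, and maps a
// deadline error to 504.
func userHandler(timeout, work time.Duration) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		ctx, cancel := context.WithTimeout(r.Context(), timeout)
		defer cancel()

		if err := slowLoad(ctx, work); err != nil {
			switch {
			case errors.Is(err, context.DeadlineExceeded):
				http.Error(w, "request timed out", http.StatusGatewayTimeout)
			case errors.Is(err, context.Canceled):
				// Client went away: write nothing.
			default:
				http.Error(w, "internal error", http.StatusInternalServerError)
			}
			return
		}
		w.WriteHeader(http.StatusOK)
	}
}

// statusFor runs the handler in-process and returns the recorded status.
func statusFor(timeout, work time.Duration) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/user?id=1", nil)
	userHandler(timeout, work).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor(50*time.Millisecond, 500*time.Millisecond)) // slow dependency: 504
	fmt.Println(statusFor(500*time.Millisecond, 10*time.Millisecond)) // fast dependency: 200
}
```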

Propagate deadlines to PostgreSQL calls

Once your handler has a context with a deadline, the most important rule is simple: use that same ctx all the way into the database call. That's how timeouts stop work instead of only stopping your handler from waiting.

With database/sql, prefer the context-aware methods:

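func (s *Server) getUser(w http.ResponseWriter, r *http.Request) {
	// Assumes a deadline was already attached to the request context,
	// for example by middleware or context.WithTimeout as shown earlier.
	ctx := r.Context()

	row := s.db.QueryRowContext(ctx,
		"SELECT id, email FROM users WHERE id = $1",
		r.URL.Query().Get("id"),
	)

	var id int64
	var email string
	if err := row.Scan(&id, &email); err != nil {
		// Timeouts, cancellations, and query errors are told apart
		// in the error-handling notes below.
		writeError(w, ctx, err)
		return
	}
	// ... encode id and email into the response
}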

Pick a DB timeout that fits your request budget

If the handler budget is 2 seconds, the database should get only a slice of that. Leave time for JSON encoding, other dependencies, and error handling. A simple starting point is to give Postgres 30% to 60% of the total budget. With a 2 second handler deadline, that is 600ms to 1.2s.
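If you prefer to compute the slice rather than hard-code it, a small helper can derive it from the time left on the parent context. This is a sketch under assumptions: sliceOf and dbBudgetDemo are hypothetical names, and the 50% fraction is only an example.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// sliceOf returns the given fraction of the time left on ctx.
// If ctx has no deadline it returns fallback; if the deadline has
// already passed it returns 0.
func sliceOf(ctx context.Context, fraction float64, fallback time.Duration) time.Duration {
	deadline, ok := ctx.Deadline()
	if !ok {
		return fallback
	}
	remaining := time.Until(deadline)
	if remaining <= 0 {
		return 0
	}
	return time.Duration(float64(remaining) * fraction)
}

// dbBudgetDemo derives a DB timeout as 50% of a 2 second handler budget.
func dbBudgetDemo() time.Duration {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	return sliceOf(ctx, 0.5, time.Second)
}

func main() {
	// Roughly one second, slightly less because some time has already elapsed.
	fmt.Println(dbBudgetDemo())
}
```

The result would then feed context.WithTimeout(ctx, d) right before the query, so the DB call gets its own earlier deadline while still inheriting cancellation from the request.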

What happens when you cancel a query

When the context is canceled, a context-aware driver typically asks Postgres to stop the query; the exact mechanism depends on the driver. Usually the connection returns to the pool and can be reused. If cancellation happens during a bad network moment, the driver may discard that connection and open a new one later. Either way, you avoid a goroutine waiting forever.

When checking errors, treat timeouts differently from real DB failures. If errors.Is(err, context.DeadlineExceeded), you ran out of time and should return a timeout. If errors.Is(err, context.Canceled), the client went away and you should stop quietly. Other errors are normal query problems (bad SQL, missing row, permissions).
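A sketch of that classification is shown below. The labels and the classifyDBError name are illustrative. sql.ErrNoRows is the standard database/sql sentinel for a missing row, and the wrapped errors mimic what a repository layer might return.

```go
package main

import (
	"context"
	"database/sql"
	"errors"
	"fmt"
)

// classifyDBError sorts an error from a context-aware query into the
// buckets described above. The labels are hypothetical.
func classifyDBError(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, context.DeadlineExceeded):
		return "timeout" // ran out of time: return a timeout response
	case errors.Is(err, context.Canceled):
		return "client_gone" // caller went away: stop quietly
	case errors.Is(err, sql.ErrNoRows):
		return "not_found" // normal query outcome
	default:
		return "db_error" // bad SQL, permissions, connection trouble
	}
}

// classifyDemo runs the classifier on a few representative errors.
func classifyDemo() []string {
	return []string{
		classifyDBError(fmt.Errorf("query users: %w", context.DeadlineExceeded)),
		classifyDBError(fmt.Errorf("query users: %w", context.Canceled)),
		classifyDBError(sql.ErrNoRows),
		classifyDBError(errors.New("permission denied for table users")),
	}
}

func main() {
	fmt.Println(classifyDemo()) // [timeout client_gone not_found db_error]
}
```

Using errors.Is rather than == matters here because repository code often wraps the underlying error with extra context.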

Propagate deadlines to external HTTP requests

If your handler has a deadline, your outbound HTTP calls should respect it too. Otherwise the client gives up, but your server keeps waiting on a slow upstream and ties up goroutines, sockets, and memory.

Build outbound requests with the parent context so cancellation travels automatically:

func fetchUser(ctx context.Context, url string) ([]byte, error) {
	// Add a small per-call cap, but never exceed the parent deadline.
	ctx, cancel := context.WithTimeout(ctx, 800*time.Millisecond)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}

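	// http.DefaultClient has no timeouts of its own; in production paths use a
	// client configured with transport timeouts instead.
	resp, err := http.DefaultClient.Do(req)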
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close() // always close, even on non-200

	return io.ReadAll(resp.Body)
}

That per-call timeout is a safety net. The parent request deadline is still the real boss. One clock for the whole request, plus smaller caps for risky steps.

Also set timeouts at the transport level. Context cancels the request, but transport timeouts protect you from slow handshakes and servers that never send headers.
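A possible client setup is sketched here. It clones http.DefaultTransport so proxy and HTTP/2 defaults are preserved, then tightens the connect, TLS, and response-header phases. The durations are placeholders and newOutboundClient is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"time"
)

// newOutboundClient builds an http.Client with transport-level timeouts.
// All durations are illustrative and should be tuned per upstream.
func newOutboundClient() *http.Client {
	transport := http.DefaultTransport.(*http.Transport).Clone()
	transport.DialContext = (&net.Dialer{
		Timeout:   1 * time.Second, // TCP connect
		KeepAlive: 30 * time.Second,
	}).DialContext
	transport.TLSHandshakeTimeout = 1 * time.Second   // TLS setup
	transport.ResponseHeaderTimeout = 2 * time.Second // wait for response headers
	transport.IdleConnTimeout = 90 * time.Second      // recycle idle pooled connections

	return &http.Client{
		Transport: transport,
		// Overall backstop for calls made without a context deadline.
		Timeout: 5 * time.Second,
	}
}

func main() {
	client := newOutboundClient()
	fmt.Println(client.Timeout)
}
```

Create one such client at startup and reuse it. Building a new transport per request would defeat connection pooling.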

One detail that bites teams: response bodies must be closed on every path. If you return early (status code check, JSON decode error, context timeout), still close the body. Leaking bodies can quietly exhaust connections in the pool and turn into "random" latency spikes.

A concrete scenario: your API calls a payment provider. The client times out after 2 seconds, but the upstream hangs for 30 seconds. Without request cancellation and transport timeouts, you keep paying for that 30 second wait for every abandoned request.

Timeout budgeting through the whole request

A single request usually touches more than one slow thing: handler work, a database query, and one or more external APIs. If you give each step a generous timeout, the total time quietly grows until users feel it and your server piles up.

Budgeting is the simplest fix. Set one parent deadline for the whole request, then give each dependency a smaller slice. Child deadlines should be earlier than the parent so you fail fast and still have time to return a clean error.

Rules of thumb that hold up in real services:

Pick a total budget first (for example, 2 seconds for a user-facing endpoint).
Reserve a small buffer for handler overhead and formatting the response (for example, 100 to 200ms).
Split the remaining time across dependencies.
If you have multiple external calls, cap each one instead of letting one eat the whole budget.
If the parent context has only 120ms left, don't start a dependency call that normally needs 300ms.
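The last rule can be sketched as a small guard that compares the time left on the context with what a call normally needs. hasTimeFor is a hypothetical helper, and the 120ms and 300ms figures mirror the example above.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// hasTimeFor reports whether ctx has at least `need` left before its
// deadline. A context without a deadline is treated as having enough time.
func hasTimeFor(ctx context.Context, need time.Duration) bool {
	deadline, ok := ctx.Deadline()
	if !ok {
		return true
	}
	return time.Until(deadline) >= need
}

// guardDemo checks a 120ms budget against a 300ms call and a 50ms call.
func guardDemo() (slowCallFits, fastCallFits bool) {
	ctx, cancel := context.WithTimeout(context.Background(), 120*time.Millisecond)
	defer cancel()
	return hasTimeFor(ctx, 300*time.Millisecond), hasTimeFor(ctx, 50*time.Millisecond)
}

func main() {
	slow, fast := guardDemo()
	fmt.Println(slow, fast) // false true
}
```

When the guard returns false, a handler could skip an optional dependency or return a timeout right away instead of starting work it cannot finish.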

Avoid stacking timeouts that fight each other. If your handler context has a 2 second deadline and your HTTP client has a 10 second timeout, you're safe but it's confusing. If it's the other way around, the client may cut off early for unrelated reasons.

For background work (audit logs, metrics, emails), don't reuse the request context. Use a separate context with its own short timeout so client cancellations don't kill important cleanup.
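One way to build such a context is context.WithoutCancel, which is available from Go 1.21. It keeps request-scoped values but drops the parent's cancellation and deadline, and WithTimeout then adds a fresh limit. The detached helper below is a hypothetical name. On older Go versions, context.Background() with a timeout is the usual substitute, at the cost of losing the request values.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// detached returns a context that keeps parent's values but ignores its
// cancellation, bounded by its own timeout d. Requires Go 1.21 or later.
func detached(parent context.Context, d time.Duration) (context.Context, context.CancelFunc) {
	return context.WithTimeout(context.WithoutCancel(parent), d)
}

// detachDemo cancels the request context and reports the error state of
// both contexts afterwards.
func detachDemo() (requestErr, backgroundErr error) {
	reqCtx, cancelReq := context.WithCancel(context.Background())

	bgCtx, cancelBg := detached(reqCtx, 500*time.Millisecond)
	defer cancelBg()

	cancelReq() // simulate the client disconnecting

	return reqCtx.Err(), bgCtx.Err()
}

func main() {
	reqErr, bgErr := detachDemo()
	fmt.Println(reqErr) // context canceled
	fmt.Println(bgErr)  // <nil>: background work may continue until its own timeout
}
```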

Common mistakes that keep requests hanging

Most timeout bugs aren't in the handler. They happen one or two layers down, where the deadline quietly gets lost. If you set timeouts at the edge but ignore them in the middle, you can still end up with goroutines, DB queries, or HTTP calls that keep running after the client has gone.

The patterns that show up most often are simple:

Dropping the request context and calling lower layers with context.Background() (or TODO). That disconnects work from the client cancel and the handler deadline.
Sleeping, retrying, or looping without checking ctx.Done(). The request is canceled, but your code keeps waiting.
Wrapping every helper in its own context.WithTimeout. You end up with many timers and confusing deadlines.
Forgetting to attach ctx to blocking calls (DB queries, outbound HTTP, message publishes). A handler timeout does nothing if the dependency call ignores it.
Treating timeouts like generic failures and returning 500. Clients need a clear signal that the request ran out of time.

A classic failure: you add a 2 second timeout in the handler, then your repository uses context.Background() for the database query. Under load, a slow query continues running even after the client gave up, and the pile grows.

Fix the basics: pass ctx as the first argument through your call stack. Inside long work, add quick checks like select { case <-ctx.Done(): return ctx.Err() default: }. Map context.DeadlineExceeded to a timeout response (often 504) and context.Canceled to a client-cancel style response (often 408 or the non-standard 499, depending on your conventions). Because a disconnected client never reads the response, that status mainly serves your logs and metrics.
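For the sleep-and-retry case specifically, a context-aware loop might look like the sketch below. sleepCtx and retry are hypothetical helpers, not library functions, and the attempt count and backoff are arbitrary.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// sleepCtx waits for d or until ctx ends, whichever comes first.
func sleepCtx(ctx context.Context, d time.Duration) error {
	timer := time.NewTimer(d)
	defer timer.Stop()

	select {
	case <-timer.C:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// retry calls fn up to `attempts` times with a fixed backoff, and stops
// as soon as ctx ends instead of sleeping through a dead request.
func retry(ctx context.Context, attempts int, backoff time.Duration, fn func(context.Context) error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(ctx); err == nil {
			return nil
		}
		if ctx.Err() != nil {
			return ctx.Err()
		}
		if i < attempts-1 {
			if sleepErr := sleepCtx(ctx, backoff); sleepErr != nil {
				return sleepErr
			}
		}
	}
	return err
}

// retryDemo retries a call that always fails under a 50ms deadline.
func retryDemo() (calls int, timedOut bool) {
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()

	alwaysFail := func(ctx context.Context) error {
		calls++
		return errors.New("upstream unavailable")
	}
	err := retry(ctx, 10, 30*time.Millisecond, alwaysFail)
	return calls, errors.Is(err, context.DeadlineExceeded)
}

func main() {
	calls, timedOut := retryDemo()
	fmt.Println(calls < 10, timedOut) // true true: the loop stopped early on the deadline
}
```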

Testing and observing timeouts in practice

Timeouts only help if you can see them happen and confirm the system recovers cleanly. When something is slow, the request should stop, resources should be released, and the API should stay responsive.

Log what matters (and keep it consistent)

For every request, log the same small set of fields so you can compare normal requests vs timeouts. Include the context deadline (if it exists) and what ended the work.

Useful fields include the deadline (or "none"), total elapsed time, cancellation reason (timeout vs client canceled), a short operation label ("db.query users", "http.call billing"), and a request ID.

A minimal pattern looks like this:

start := time.Now()
deadlineStr := "none" // logged when the context carries no deadline
if d, ok := ctx.Deadline(); ok {
	deadlineStr = d.Format(time.RFC3339Nano)
}
err := doWork(ctx)
log.Printf("op=%s deadline=%s elapsed=%s err=%v",
	"getUser", deadlineStr, time.Since(start), err)

Metrics that show timeouts before users complain

Logs help you debug one request. Metrics show trends.

Track a few signals that usually spike early when timeouts are wrong: count of timeouts by route and dependency, in-flight requests (should level off under load), DB pool wait time, and latency percentiles (p95/p99) split by success vs timeout.

Reproduce timeouts locally (so you trust your fix)

Make slowness predictable. Add a debug-only delay to one handler, slow a DB query with a deliberate wait, or wrap an external call with a test server that sleeps. Then verify two things: you see the timeout error, and the work stops soon after cancellation.

A small load test helps too. Run 20 to 50 concurrent requests for 30 to 60 seconds with one forced slow dependency. Goroutine count and in-flight requests should rise and then level off. If they keep climbing, something is ignoring context cancellation.
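For the external-call case, httptest.NewServer makes a slow upstream easy to fake. The sketch below starts a test server that waits 2 seconds, calls it with a 100ms context deadline, and checks both that the error is a deadline error and that the call returned promptly. callWithTimeout and slowServerTimesOut are illustrative names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// callWithTimeout performs a GET bounded by the given timeout.
func callWithTimeout(url string, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	return nil
}

// slowServerTimesOut reports whether a call to a slow test server failed
// with a deadline error, and how long the call took.
func slowServerTimesOut() (bool, time.Duration) {
	slow := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(2 * time.Second): // simulated slow upstream
			w.WriteHeader(http.StatusOK)
		case <-r.Context().Done(): // caller gave up; stop early
		}
	}))
	defer slow.Close()

	start := time.Now()
	err := callWithTimeout(slow.URL, 100*time.Millisecond)
	return errors.Is(err, context.DeadlineExceeded), time.Since(start)
}

func main() {
	timedOut, elapsed := slowServerTimesOut()
	fmt.Println(timedOut, elapsed < time.Second) // true true
}
```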

Quick checklist before you ship

Timeouts only help if they're applied everywhere a request can wait. Before you deploy, do one pass over your codebase and confirm the same rules are followed in every handler.

Every incoming request has a clear time budget at the boundary (router, middleware, handler). No infinite handlers.
Every database query receives the request context (or a child context) and checks errors for context.DeadlineExceeded and context.Canceled.
Every outbound HTTP call uses http.NewRequestWithContext (or req = req.WithContext(ctx)) and the client has transport timeouts (dial, TLS, response header). Avoid relying on http.DefaultClient in production paths.
Errors are mapped consistently: timeouts become the same API response every time (often 504), client cancellations map cleanly (often 499 or 408 depending on your convention), and internal timeouts don't leak raw driver errors.
Logs and metrics make timeouts obvious: handler name, elapsed time, and which dependency timed out.

A quick "slow dependency" drill before release is worth it. Add an artificial 2 second delay to one SQL query and confirm three things: the handler returns on time, the DB call actually stops (not just the handler), and your logs clearly say it was a DB timeout.

A realistic example: one request, three slow dependencies

Imagine an endpoint like GET /v1/account/summary. One user action triggers three things: a PostgreSQL query (account plus recent activity) and two external HTTP calls (for example, a billing status check and a profile enrichment lookup).

Give the whole request a hard 2 second budget. Without a budget, one slow dependency can keep goroutines, DB connections, and memory tied up until your API starts timing out everywhere.

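A simple split might be 800ms for the DB query, 600ms for external call A, and 600ms for external call B. If the three run concurrently, the slowest cap (800ms) bounds the wait and leaves ample buffer. If they run one after another, the caps add up to the full 2 seconds, so trim them to keep 100 to 200ms for writing the response.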

Once you know the overall deadline, pass it down. Each dependency gets its own smaller timeout, but it still inherits cancellation from the parent.

func AccountSummary(w http.ResponseWriter, r *http.Request) {
  ctx, cancel := context.WithTimeout(r.Context(), 2*time.Second)
  defer cancel()

  dbCtx, dbCancel := context.WithTimeout(ctx, 800*time.Millisecond)
  defer dbCancel()

  aCtx, aCancel := context.WithTimeout(ctx, 600*time.Millisecond)
  defer aCancel()

  bCtx, bCancel := context.WithTimeout(ctx, 600*time.Millisecond)
  defer bCancel()

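  // Use dbCtx for QueryContext, aCtx/bCtx for outbound HTTP requests.
  // Each timer starts when its context is created, so this layout fits calls
  // that all start right away. For sequential calls, create each child
  // context just before the call it guards.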
}

If external call B slows down and takes 2.5 seconds, your handler should stop waiting at 600ms, cancel the in-flight work, and return a clear timeout response to the client. The client sees a fast failure instead of a hanging spinner.

Your logs should make it obvious what used the budget, for example: DB finished quickly, external A succeeded, external B hit its cap and returned context deadline exceeded.

Next steps: standardize timeouts across your API

Once one real endpoint works well with timeouts and cancellation, turn it into a repeatable pattern. Apply it end to end: handler deadline, DB calls, and outbound HTTP. Then copy the same structure to the next endpoint.

You'll move faster if you centralize the boring parts: a boundary timeout helper, wrappers that ensure ctx is passed into DB and HTTP calls, and one consistent error mapping and log format.

If you want to prototype this pattern quickly, Koder.ai (koder.ai) can generate Go handlers and service calls from a chat prompt, and you can export the source code to apply your own timeout helpers and budgets. The goal is consistency: slow calls stop early, errors look the same, and debugging doesn't depend on who wrote the endpoint.

FAQ

Why can a few slow requests bring down an otherwise healthy API?

A slow request holds onto limited resources while it waits: a goroutine, memory for buffers and response objects, and often a database connection or HTTP client connection. When enough requests wait at the same time, queues form, latency rises for all traffic, and the service can fail even if each request would eventually finish.

What’s the simplest way to prevent slow requests from piling up in Go?

Set a clear deadline at the request boundary (proxy/gateway and in the Go server), derive a timed context in the handler, and pass that ctx into every blocking call (database and outbound HTTP). When the deadline is hit, return quickly with a consistent timeout response and stop any in-flight work that supports cancellation.

When should I use WithTimeout vs WithDeadline vs WithCancel?

Use context.WithTimeout(parent, d) when you want “stop after this duration,” which is the most common in handlers. Use context.WithDeadline(parent, t) when you already have a fixed cutoff time to honor. Use context.WithCancel(parent) when some internal condition should stop work early, like “we already have an answer” or “the client disconnected.”

Why do I need to call cancel() if the timeout will happen anyway?

Always call the cancel function, typically with defer cancel() right after creating the derived context. Canceling releases the timer and lets any child work get a clear stop signal, especially in code paths that return early before the deadline triggers.

How do I make sure my deadline actually reaches lower layers?

Create the request context once in the handler and pass it down as the first argument to functions that may block. A quick check is to search for context.Background() or context.TODO() in request code paths; those often break cancellation propagation by disconnecting work from the request’s deadline.

How do I apply timeouts to PostgreSQL queries properly?

Use context-aware database methods like QueryContext, QueryRowContext, and ExecContext (or the equivalents in your driver). When the context ends, the driver can ask PostgreSQL to cancel the query so you don’t keep burning time and connections after the request is already over.

How do I stop outbound HTTP calls from hanging after the client gave up?

Attach the parent request context to the outbound request using http.NewRequestWithContext(ctx, ...), and also configure client/transport timeouts so you’re protected during connect, TLS, and waiting for response headers. Even on errors or non-200 responses, always close the response body so connections return to the pool.

How should I “budget” time across DB work and multiple external calls?

Pick one total budget for the request first, then give each dependency a smaller slice that fits within it, leaving a small buffer for handler overhead and response encoding. If the parent context has little time left, avoid starting expensive work that can’t realistically finish before the deadline.

What status code should I return on context timeouts and cancellations?

A common default is mapping context.DeadlineExceeded to 504 Gateway Timeout with a short message like “request timed out.” For context.Canceled, it usually means the client disconnected; often the best action is to stop work and return without writing a body, so you don’t waste more resources.

What are the most common mistakes that make handlers ignore timeouts?

The most frequent ones are dropping the request context by using context.Background(), starting retries or sleeps without checking ctx.Done(), and forgetting to attach ctx to blocking calls. Another subtle issue is stacking many unrelated timeouts everywhere, which makes failures hard to reason about and can cause surprising early cutoffs.