At Hamkee, we’ve developed Gatekeeper to solve a deceptively difficult infrastructure problem: preventing duplicate HTTP requests from reaching your backend. Gatekeeper is a lightweight C service that sits behind NGINX, computes a SHA-256 hash of every incoming request body as it streams through, and uses atomic filesystem locks to guarantee that only the first request with a given body is allowed past the gate. It is built for environments where duplicates are not just wasteful, but dangerous.
The Duplicate Request Problem
Consider a payment processing pipeline. A mobile client submits a charge request, but the connection drops before it receives the response. The client retries. Now two identical charge requests are in flight. If both reach the processing backend, the customer is charged twice.
Or consider an IoT telemetry ingestion endpoint receiving sensor data from thousands of devices. Network hiccups, retry logic in embedded firmware, and load balancer failovers routinely produce duplicate submissions. At modest scale, this is a nuisance. At high volume, duplicates consume processing capacity, inflate storage costs, and corrupt aggregation results.
The standard approach is to deduplicate at the application layer, usually with a database-backed idempotency key. This works, but it pushes the problem deep into the stack, where every backend service must implement and coordinate deduplication logic. It also means duplicate requests still consume network bandwidth, pass through TLS termination, get parsed, and hit the application before being rejected.
We wanted to stop duplicates earlier, at the reverse proxy layer, before they ever reach the backend. And we wanted to do it without requiring clients to send special headers or idempotency tokens. The request body itself is the identity.
Our Approach: Hash the Body, Lock the Hash
Gatekeeper operates on a simple principle: if two requests carry the same body, they produce the same SHA-256 digest. The first request to produce a given digest acquires an exclusive lock. Every subsequent request with the same digest is rejected.
The request flow through the system looks like this:
- A client sends a POST request to NGINX.
- NGINX proxies the raw body to Gatekeeper's /gate endpoint.
- Gatekeeper computes SHA-256(body) incrementally as bytes arrive: no buffering, bounded memory.
- It attempts to create a lock file keyed by the hex digest using an atomic filesystem operation.
- If the lock is acquired: 202 Accepted with X-Gate-Decision: ALLOW.
- If the lock already exists: 409 Conflict with X-Gate-Decision: DROP.
- NGINX acts on the response, either forwarding the allowed request or silently dropping the duplicate.
This design pushes deduplication to the outermost layer of the stack, before any application logic executes. It requires zero changes to backend services and zero cooperation from clients.
Streaming Hash Computation
A critical design decision was to avoid buffering the entire request body in memory. Gatekeeper uses OpenSSL’s EVP interface to compute the SHA-256 digest incrementally as libmicrohttpd delivers chunks of upload data:
// Gate: stream body into SHA-256; finalize when *upload_data_size == 0
if (ctx->path_gate && strcmp(method, "POST") == 0) {
    if (*upload_data_size != 0) {
        // streaming hash update (bounded memory)
        if (EVP_DigestUpdate(ctx->mdctx, upload_data, *upload_data_size) != 1) {
            ctx->error_code = "digest_update_failed";
            atomic_fetch_add(&gk_metrics.error_count, 1);
        }
        *upload_data_size = 0;
        return MHD_YES;
    }
    // finalize
    int rc_dec = handle_gate_finalize(ctx, cfg->lock_root);
    // ...
}
Each call to EVP_DigestUpdate feeds a chunk of body data into the running hash state. When the upload is complete (upload_data_size == 0), the digest is finalized and the lock acquisition begins. This means Gatekeeper’s memory usage per connection is constant regardless of body size — a 1 KB payload and a 50 MB payload consume the same amount of RAM during hashing.
Atomic Lock Acquisition
The lock mechanism is the heart of Gatekeeper’s correctness guarantee. We use the O_CREAT|O_EXCL flags on the open() system call, which is the classic POSIX pattern for atomic file creation: the kernel guarantees that if two processes (or threads) attempt to create the same file simultaneously, exactly one will succeed and the other will receive EEXIST.
static int handle_gate_finalize(struct req_ctx *ctx, const char *lock_root) {
    EVP_DigestFinal_ex(ctx->mdctx, ctx->digest, NULL);
    sha256_hex(ctx->digest, ctx->sha_hex);

    char lock_path[1024];
    if (build_lock_path(lock_path, sizeof(lock_path), lock_root, ctx->sha_hex) != 0) {
        ctx->error_code = "build_lock_path";
        atomic_fetch_add(&gk_metrics.error_count, 1);
        return -1;
    }
    if (ensure_shard_dirs(lock_path) != 0) {
        ctx->error_code = "mkdir_shards";
        atomic_fetch_add(&gk_metrics.error_count, 1);
        return -1;
    }

    // Atomic acquire
    int fd = open(lock_path, O_CREAT|O_EXCL|O_WRONLY|O_CLOEXEC, GK_LOCK_MODE);
    if (fd >= 0) {
        (void)write_lock_metadata(fd);
        close(fd);
        ctx->decision = DEC_ALLOW;
        return 0;
    }
    if (errno == EEXIST) {
        // stale recovery...
        ctx->decision = DEC_DROP;
        return 0;
    }
    // ...
}
There is no advisory locking, no database, no distributed consensus protocol. The filesystem itself is the coordination mechanism. This gives us atomicity on any POSIX-compliant system without introducing external dependencies.
Sharded Lock Directory
Storing all lock files in a single directory would degrade performance as the number of entries grows — directory lookups in most Linux filesystems slow down beyond tens of thousands of entries. We shard locks into a two-level directory tree based on the first four hex characters of the digest:
/var/lib/gatekeeper/locks/2c/f2/2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
This distributes locks across up to 65,536 directories (256 x 256), keeping each directory small even under heavy load.
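The sharding scheme is a few lines of snprintf. This sketch uses a hypothetical build_shard_path helper that reproduces the layout above; Gatekeeper's own function is the build_lock_path seen in the earlier listing:

```c
#include <stdio.h>
#include <string.h>

// Sketch: /<root>/<hex[0..1]>/<hex[2..3]>/<full hex digest>
// Returns 0 on success, -1 if the digest is too short or `out` too small.
static int build_shard_path(char *out, size_t outsz,
                            const char *root, const char *hex) {
    if (strlen(hex) < 4) return -1;
    // %.2s prints at most two characters: the two shard levels
    int n = snprintf(out, outsz, "%s/%.2s/%.2s/%s", root, hex, hex + 2, hex);
    return (n > 0 && (size_t)n < outsz) ? 0 : -1;
}
```

Because the digest bytes are uniformly distributed, the first four hex characters spread locks evenly across the 65,536 shard directories with no balancing logic at all.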
Stale Lock Recovery
In production, processes crash. If a backend finishes processing but never calls /release, the lock persists forever, permanently blocking that body hash. Gatekeeper handles this with a stale lock TTL: if a lock file’s mtime is older than the configured threshold (default 300 seconds), Gatekeeper treats it as abandoned, removes it, and allows the new request through.
static int is_stale_lock(const char *path) {
    struct stat st;
    if (stat(path, &st) != 0) return 0;
    uint64_t now = now_unix_sec();
    uint64_t mtime = (uint64_t)st.st_mtime;
    return (now > mtime) && ((now - mtime) > (uint64_t)GK_STALE_TTL_SECONDS);
}
This is a deliberate trade-off: after the TTL expires, a true duplicate could slip through. But the alternative, permanent lock retention, is worse in practice, because it means a single process crash can permanently block a class of requests until an operator intervenes by hand.
NGINX Integration
Gatekeeper is designed to work as an NGINX auth subrequest backend. The included NGINX configuration proxies the ingestion endpoint to Gatekeeper and translates duplicate rejections into silent connection drops:
location = /ingest {
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    proxy_set_header X-Real-IP $remote_addr;
    proxy_request_buffering off;
    proxy_buffering off;
    proxy_pass http://gatekeeper_backend/gate;
    error_page 409 = @drop_duplicate;
}

location @drop_duplicate {
    return 444;
}
The proxy_request_buffering off directive is important: it tells NGINX to stream the body directly to Gatekeeper rather than spooling it to disk first. This keeps latency low and avoids unnecessary disk I/O for large payloads.
The 444 response is an NGINX-specific status code that closes the TCP connection with no response at all. For retry-happy clients, this is often preferable to returning an explicit error, which some client libraries interpret as a signal to retry more aggressively.
Operational Visibility
Gatekeeper exposes a Prometheus-compatible /metrics endpoint with four counters:
static int handle_metrics(int *out_status, char *out_txt, size_t outsz) {
    unsigned long long allow = atomic_load(&gk_metrics.allow_count);
    unsigned long long drop  = atomic_load(&gk_metrics.drop_count);
    unsigned long long stale = atomic_load(&gk_metrics.stale_recovered_count);
    unsigned long long err   = atomic_load(&gk_metrics.error_count);

    *out_status = 200;
    int n = snprintf(out_txt, outsz,
        "# TYPE gatekeeper_allow_total counter\n"
        "gatekeeper_allow_total %llu\n"
        "# TYPE gatekeeper_drop_total counter\n"
        "gatekeeper_drop_total %llu\n"
        "# TYPE gatekeeper_stale_recovered_total counter\n"
        "gatekeeper_stale_recovered_total %llu\n"
        "# TYPE gatekeeper_error_total counter\n"
        "gatekeeper_error_total %llu\n",
        allow, drop, stale, err);
    // ...
}
The stale_recovered_total counter is particularly worth monitoring. A steady increase indicates that backends are failing to release locks, which may point to crashes, timeouts, or a misconfigured TTL. Every request is also logged as a JSON line to stdout, including the remote IP, SHA-256 digest, decision, and latency in milliseconds — making it straightforward to integrate with any log aggregation system.
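For illustration, a log line of that shape can be produced with a single snprintf. The field names below are assumptions, not Gatekeeper's exact schema:

```c
#include <stdio.h>

// Sketch: format one request's outcome as a JSON log line.
// Field names (remote_ip, sha256, decision, latency_ms) are illustrative.
// Returns 0 on success, -1 if the buffer was too small.
static int format_log_line(char *out, size_t outsz, const char *ip,
                           const char *sha_hex, const char *decision,
                           double latency_ms) {
    int n = snprintf(out, outsz,
        "{\"remote_ip\":\"%s\",\"sha256\":\"%s\","
        "\"decision\":\"%s\",\"latency_ms\":%.2f}",
        ip, sha_hex, decision, latency_ms);
    return (n > 0 && (size_t)n < outsz) ? 0 : -1;
}
```

One JSON object per line on stdout is deliberately boring: it works unmodified with journald, Fluent Bit, Vector, or a plain grep.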
Fail-Open vs. Fail-Closed
One of the more deliberate design decisions in Gatekeeper is the error handling policy. By default, if an internal error prevents lock acquisition (a filesystem permission error, a full disk, a failed mkdir), Gatekeeper allows the request through and sets an X-Gate-Error header to flag the problem. This is fail-open: availability is preserved at the cost of potentially admitting a duplicate.
For environments where correctness is more important than availability — financial transactions, for instance — Gatekeeper can be compiled in fail-closed mode with -DGK_FAIL_CLOSED=1, which returns a 503 Service Unavailable on internal errors instead.
This is a compile-time toggle rather than a runtime flag, which is a deliberate choice. The failure policy is an architectural decision that should be made once during deployment, not something that changes at runtime.
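As a sketch of what that toggle can look like in code, assuming the GK_FAIL_CLOSED macro set by the -DGK_FAIL_CLOSED=1 build flag (the helper name is illustrative):

```c
// Sketch: pick the HTTP status for an internal error at compile time.
// With GK_FAIL_CLOSED defined, the gate rejects when it cannot decide;
// otherwise it admits the request and relies on X-Gate-Error for visibility.
static int status_on_internal_error(void) {
#ifdef GK_FAIL_CLOSED
    return 503;   // fail-closed: correctness over availability
#else
    return 202;   // fail-open: availability over correctness
#endif
}
```

Because the branch is resolved by the preprocessor, the untaken policy is not even present in the compiled binary, which is the point of making it a build-time decision.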
Practical Applications
Gatekeeper fits naturally into several common infrastructure patterns:
Webhook ingestion. Third-party services (payment providers, CI/CD platforms, SaaS integrations) often deliver webhooks with at-least-once semantics. Gatekeeper deduplicates these at the edge before they reach your processing queue.
API gateway protection. For any POST endpoint where the same body should not be processed twice — form submissions, order placements, file uploads — Gatekeeper provides a transparent deduplication layer with no application changes required.
IoT and telemetry pipelines. Embedded devices with aggressive retry logic are a common source of duplicate data. Gatekeeper’s streaming hash design handles arbitrarily large payloads without memory pressure, making it suitable for high-throughput sensor data ingestion.
Event-driven architectures. In systems where events are published via HTTP, Gatekeeper can serve as a deduplication gate at the ingestion boundary, ensuring each unique event is processed exactly once regardless of delivery retries.
Conclusion
Gatekeeper demonstrates that effective request deduplication does not require complex distributed systems or database-backed idempotency layers. By combining streaming SHA-256 hashing with atomic POSIX file creation, we built a solution that is fast, correct under high concurrency, and operationally transparent. It runs as a single binary with two library dependencies, integrates cleanly with NGINX, and provides the observability hooks that production systems demand.
Gatekeeper was developed by the engineering team at Hamkee, where we specialize in high-performance Unix/Linux solutions. The project is open source under the MIT license — we invite you to explore the repository, test it against your own workloads, and contribute.