Show HN: I built an S3 proxy that combines storage from multiple S3 clouds into one target

  • Posted 10 hours ago by munch-o-man
  • 2 points
https://github.com/afreidah/s3-orchestrator
I wanted offsite copies of my Nomad cluster backups without paying for storage, so I started thinking about how to maximize the free S3 tiers from multiple providers. What if I could just stack them all together and treat them as one big bucket? I started hacking out a simple proxy, and then I was having fun, so I kept building. My weekend project turned into what is basically a fully production-ready S3 orchestration service. You configure multiple S3-compatible backends (AWS, OCI, Backblaze, R2, MinIO, whatever), set a quota on each one, and the orchestrator presents them to your apps as a single S3 endpoint. Clients just see one bucket, with no idea files are being spread across multiple S3 providers.

What makes it useful:

- Combine free tiers — set per-backend quotas to match each provider's free limit and the proxy fills them in order (pack mode) or evenly (spread mode). 10GB + 10GB + 10GB = 30GB of free offsite storage
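Not the project's actual code, but the pack-vs-spread placement decision might look roughly like this (backend names and the `used`/`quota` fields are illustrative):

```python
# Sketch of pack vs. spread placement across quota-limited backends.
# Field names and backends are illustrative, not the project's schema.

def pick_backend(backends, size, mode="pack"):
    """Return the backend an object of `size` bytes should land on."""
    candidates = [b for b in backends if b["used"] + size <= b["quota"]]
    if not candidates:
        raise RuntimeError("all backends full")
    if mode == "pack":
        # Fill backends in configured order: first one with room wins.
        return candidates[0]
    # Spread: put the object on the emptiest backend (by fraction used).
    return min(candidates, key=lambda b: b["used"] / b["quota"])

backends = [
    {"name": "backblaze", "used": 9_000_000_000, "quota": 10_000_000_000},
    {"name": "r2",        "used": 1_000_000_000, "quota": 10_000_000_000},
    {"name": "oci",       "used": 0,             "quota": 10_000_000_000},
]

print(pick_backend(backends, 500_000_000, mode="pack")["name"])    # backblaze
print(pick_backend(backends, 500_000_000, mode="spread")["name"])  # oci
```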

- Multi-cloud replication — set replication.factor: 2 and every object automatically lands on two different providers. Instant redundancy, zero client-side changes
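With `replication.factor: 2`, each write needs two distinct backends with room. A minimal sketch of that selection (my own illustrative logic, not the project's):

```python
def pick_replicas(backends, size, factor=2):
    """Choose `factor` distinct backends with capacity, emptiest first."""
    candidates = sorted(
        (b for b in backends if b["used"] + size <= b["quota"]),
        key=lambda b: b["used"],
    )
    if len(candidates) < factor:
        raise RuntimeError("not enough backends for replication factor")
    return [b["name"] for b in candidates[:factor]]

backends = [
    {"name": "aws",       "used": 0,             "quota": 5_000_000_000},
    {"name": "backblaze", "used": 2_000_000_000, "quota": 10_000_000_000},
    {"name": "r2",        "used": 1_000_000_000, "quota": 10_000_000_000},
]

print(pick_replicas(backends, 100_000_000, factor=2))  # ['aws', 'r2']
```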

- Full S3 API — works with aws cli, rclone, boto3, any S3 SDK. SigV4 auth, multipart uploads, range reads, batch deletes, the works

- Virtual buckets — multiple apps can share the orchestrator with isolated namespaces and independent credentials

- Monthly usage limits — cap API requests, egress, and ingress per backend so you never blow past a free tier

- Write safety — all metadata and quota updates happen inside PostgreSQL transactions. Object location inserts and quota counter changes are atomic — if anything fails, the whole operation rolls back. Orphaned objects from partial failures get caught by a persistent cleanup queue with exponential backoff retry instead of silently leaking storage
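The project uses PostgreSQL; as a minimal illustration of the all-or-nothing pattern (object-location insert and quota update succeed or fail together), here is the same idea with stdlib sqlite3 and invented table names:

```python
import sqlite3

# Invented schema, just to show the transactional pattern.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE objects (key TEXT PRIMARY KEY, backend TEXT, size INTEGER);
    CREATE TABLE quotas (backend TEXT PRIMARY KEY, used INTEGER);
    INSERT INTO quotas VALUES ('backblaze', 0);
""")

def record_upload(key, backend, size):
    try:
        with db:  # opens a transaction; commits on success, rolls back on error
            db.execute("INSERT INTO objects VALUES (?, ?, ?)", (key, backend, size))
            db.execute("UPDATE quotas SET used = used + ? WHERE backend = ?",
                       (size, backend))
    except sqlite3.Error:
        pass  # rolled back: neither the object row nor the counter changed

record_upload("backups/db.tar.gz", "backblaze", 1024)
record_upload("backups/db.tar.gz", "backblaze", 1024)  # duplicate key -> rollback
print(db.execute("SELECT used FROM quotas").fetchone()[0])  # 1024, not 2048
```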

- TLS and mTLS — native TLS termination with configurable min version (1.2/1.3), plus mutual TLS support for environments where you want to restrict access to clients with a valid certificate. Certificate reload on SIGHUP for zero-downtime rotation
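In Python terms (the server itself is a different codebase; file paths here are placeholders), that TLS policy amounts to:

```python
import ssl

# Server-side context: TLS >= 1.2 only, and for mTLS, clients must
# present a certificate signed by our CA. Paths are placeholders.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
ctx.minimum_version = ssl.TLSVersion.TLSv1_2
ctx.verify_mode = ssl.CERT_REQUIRED  # mTLS: reject clients without a valid cert
# ctx.load_cert_chain("server.crt", "server.key")
# ctx.load_verify_locations("client-ca.crt")

print(ctx.minimum_version, ctx.verify_mode)
```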

- Multi-instance / split-mode deployment — run with -mode all (default), -mode api (request serving only), or -mode worker (background tasks only). Scale API instances independently from workers behind a load balancer.

- Trusted proxy awareness — configure trusted CIDR ranges so rate limiting targets real client IPs from X-Forwarded-For, not your load balancer

- Single-flight background tasks — the background workers (rebalancer, replicator, cleanup queue, lifecycle) use PostgreSQL advisory locks so only one worker runs each task at a time — no duplicate work, no extra coordination service needed
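PostgreSQL's `pg_try_advisory_lock` returns immediately with true/false, so a worker simply skips a task another worker already holds. An in-process analogue of that non-blocking pattern (the real thing is a database call, not Python):

```python
import threading

# Stand-in for pg_try_advisory_lock(task_id): at most one holder per task,
# and acquisition never blocks -- you either get the lock or skip the task.
_locks = {t: threading.Lock() for t in ("rebalancer", "replicator", "cleanup")}

def try_run(task, fn):
    lock = _locks[task]
    if not lock.acquire(blocking=False):
        return False  # another worker owns this task; skip it
    try:
        fn()
        return True
    finally:
        lock.release()

print(try_run("rebalancer", lambda: None))  # True: we got the lock and ran
_locks["cleanup"].acquire(blocking=False)   # simulate another worker holding it
print(try_run("cleanup", lambda: None))     # False: task skipped, no duplicate work
```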

- Circuit breaker — if the metadata DB goes down, reads keep working via broadcast to all backends. Writes fail cleanly
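A sketch of that degraded-read path with fake in-memory backends (names and the lookup signature are invented; the real lookup is a metadata-DB query):

```python
class DBDown(Exception):
    """Raised when the metadata database is unreachable."""

def get_object(key, db_lookup, backends):
    """Normally ask the metadata DB which backend holds `key`;
    if the DB is down, fall back to asking every backend."""
    try:
        backend = db_lookup(key)
        return backends[backend].get(key)
    except DBDown:
        for store in backends.values():  # broadcast read to all backends
            if key in store:
                return store[key]
        raise KeyError(key)

backends = {"backblaze": {"a.txt": b"hello"}, "r2": {}}

def broken_lookup(key):
    raise DBDown()  # simulate a metadata-DB outage

print(get_object("a.txt", broken_lookup, backends))  # b'hello'
```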

- Automatic rebalancing — if you add a new backend, the rebalancer redistributes objects across all of them

- Backend draining — need to remove a provider? s3-orchestrator admin drain <backend> live-migrates all objects off that backend to the remaining pool with progress tracking. Once drained, admin remove-backend cleans up the database records (optionally purging the S3 objects too). No downtime, no manual file shuffling — swap providers without your clients noticing

- Web dashboard — storage summary, backend status, file browser, upload/delete, log viewer

- Production observability — Prometheus metrics (60+ gauges/counters), OpenTelemetry tracing, structured audit logging with request ID correlation

- Lifecycle rules — auto-expire objects by prefix and age
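A minimal sketch of evaluating a prefix-plus-age rule (the rule shape here is my guess, not the project's config format):

```python
from datetime import datetime, timedelta, timezone

def expired(key, uploaded_at, rules, now):
    """True if any (prefix, max_age) rule matches `key` and the object is older."""
    return any(
        key.startswith(prefix) and now - uploaded_at > max_age
        for prefix, max_age in rules
    )

rules = [("tmp/", timedelta(days=7)), ("logs/", timedelta(days=30))]
now = datetime(2024, 6, 1, tzinfo=timezone.utc)

print(expired("tmp/scratch.bin", datetime(2024, 5, 1, tzinfo=timezone.utc), rules, now))  # True
print(expired("data/keep.bin",   datetime(2024, 1, 1, tzinfo=timezone.utc), rules, now))  # False
```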

- Config hot-reload — update credentials, quotas, rate limits, replication, and rebalance settings without restarting via SIGHUP

- Comes with production-ready Kubernetes and Nomad manifests/jobs, plus a custom Grafana dashboard built on the exported metrics

A bit nervous to share this but I think it is ready to be seen and maybe somebody else would find it useful.

2 comments
