Ceph · Production consulting

Make Ceph boring again.

Evidence-first consulting for Ceph clusters in production. I collaborate with your engineers to diagnose issues, reduce risk, and ship changes safely—no guessing.

No-guessing strategy
Observe → form a hypothesis → validate with metrics and controlled changes.
Deliverables
Prioritized plan (P0/P1/P2), runbooks & guardrails, validation checklist.
Engagements
Assessment, incident/performance support, upgrade planning & execution.

I don’t sell “one-size-fits-all” configs or vague advice. If the right solution is “don’t change anything yet—collect evidence first”, that’s what I’ll recommend.

This keeps production safe and avoids expensive “tuning-by-superstition”.

Ceph public telemetry

Snapshots from the public community dashboard.

Total Storage Capacity by Version

Version by Cluster Count

Active Clusters (daily, last 7d)

View the full public telemetry dashboard

Why Ceph

Unified storage
One system that can serve block, object, and file.
Scale-out economics
Grow capacity and throughput by adding nodes.
Resilience
Designed for failure: recovery and rebalancing are core behaviors.
Vendor flexibility
Open ecosystem that avoids hard vendor lock-in.
What I optimize for (no-bullshit)
  • Stability under failure and during recovery, not "benchmark-only" tuning.
  • Measurable changes with rollback paths.
  • Runbooks your team can execute without guesswork.