Platform Engineer
Lead DevOps Engineer
Some of the largest companies on the planet trust us to make sure doors open when they should—and stay shut when they shouldn't. When your platform controls physical access for tens of thousands of employees, "we'll fix it Monday" isn't an option.
We're a small team (you'd be leading 2 engineers) running the same platform across Azure, air-gapped on-prem clusters, and customer-provided Kubernetes environments we've never seen before. If you're the kind of engineer who gets a kick out of explaining to customers why their "active-active" setup is actually cold standby, keep reading.
What you'll actually work on
Real examples from the past few months:
Air-gapped deployments — Building Zarf packages for Fortune 500 enterprises that can't touch the internet. Debugging why flags break your scripts in a datacenter you can't SSH into.
Multi-cloud Kubernetes at scale — Same platform running in Azure, on-prem RKE2, and customer-provided clusters with creative networking. You'll need to make it all work.
Observability that actually matters — Building Grafana dashboards that show what's happening in production, not vanity metrics. We recently migrated our stack to a new Mimir cluster.
Incident response with real stakes — When a service is eating 40GB of RAM and a customer's badge readers stop working, you're the one figuring out why.
Who this role is for
- You actually enjoy YAML. Not ironically.
- You've debugged production issues where "just restart it" wasn't an option
- You're comfortable owning decisions without waiting for detailed specs
- You can context-switch between cloud-native tooling and air-gapped on-prem constraints
Who this role is NOT for
- People who want to manage a large team (it's 2 people, not 20)
- People who need detailed requirements before starting work
- People who don't want to touch application code
- People who only want pure cloud (we do a lot of on-prem)
Must-haves
- 2-4 years in DevOps/Platform Engineering (we care more about depth than years)
- Kubernetes—you've run clusters, not just deployed to them
- Infrastructure as Code (Terraform/OpenTofu preferred)
- Strong debugging skills and comfort with Linux systems
- Located in the US or Canada (UTC-4 to UTC-6 preferred)
- Fluent English—you'll talk to customers directly
Nice-to-haves
- Experience with air-gapped/disconnected deployments (Zarf, Rancher, etc.)
- TypeScript, C#/.NET, or Python—we write application code too
- Observability stack experience (Grafana, Prometheus, Loki)
- Any CNCF or Linux Foundation certifications
Our stack
- Infrastructure & Platform: Azure, OpenTofu, Docker, Kubernetes, ArgoCD, GitOps, Rancher RKE2, Zarf
- Observability: Grafana, Prometheus, Mimir, Loki, Tempo
- Backend: TypeScript, Node.js, C#/.NET, Python, Postgres, MSSQL, Redis
- Pipelines: GitLab CI, Azure DevOps, GitHub Actions
The tech evolves constantly. Ability to learn matters more than knowing everything already.
Interview process
- 30-min intro call — We'll tell you more about the role, you'll tell us about yourself
- 60-min technical deep-dive — Architecture discussion plus a hands-on troubleshooting scenario (we give you a broken cluster, you fix it)
- Final call with Director — Meet the team, ask the hard questions
Compensation & Benefits
- Salary: $140,000-$150,000
- Flexible hours — We care about output, not butts in seats
- Fully remote — US/Canada. Occasional travel for team meetups.
- Small team, real ownership — No layers of management, your work ships
About RightCrowd
Two decades of experience, startup-like spirit. Small teams, low corporate overhead. You'll work directly with engineering leadership and talk to actual customers.
We're not looking for perfect resumes. We want people who are curious, willing to learn, and comfortable with ambiguity. If you've read this far and thought "that sounds like me," apply.