Staff Software Engineer - AI Orchestration ML, AI 🏆 - CoreWeave - Sunnyvale, CA

💰 Salary: $188,000 - $275,000 per year

At CoreWeave, we are looking for an ML/AI engineer!

🛠️ Our tech stack:
AI, Cloud, Support, Kubernetes, Machine-Learning

📝 Requirements:
- Over 8 years of experience in software engineering, particularly in distributed systems or cloud platforms
- Strong knowledge of the Go programming language and experience building large-scale, long-lived production systems
- In-depth understanding of Kubernetes internals, scheduling strategies, and controller-based architectures
- Proven experience in designing or enhancing orchestration, scheduling, or resource-management platforms
- Ability to lead technical projects across teams without direct authority
- Strong operational mindset with a background in managing mission-critical systems at scale
- Preferred experience with orchestration frameworks such as Kueue, Volcano, Ray, or similar Kubernetes-native tools
- Background in AI infrastructure, ML platforms, HPC, or extensive batch and streaming systems
- Comprehensive understanding of scheduling principles, including fairness, preemption, quota management, and multi-tenant isolation
- Experience in defining and managing SLOs, capacity models, and significant reliability enhancements
- Contributions to open-source infrastructure or orchestration initiatives

👩‍💻👨‍💻 Your responsibilities are:
- Drive the technical vision and architecture for key aspects of the AI Workload Orchestration Platform
- Create scalable and reliable orchestration components for AI workloads across diverse schedulers and environments
- Lead cross-functional architectural reviews and ensure alignment among infrastructure, CKS, and managed inference teams
- Establish platform standards for reliability, observability, capacity management, and operational excellence
- Identify and address systemic performance, scalability, and fairness challenges across large GPU clusters
- Mentor senior engineers and foster technical leadership