Software Engineer 2

Bengaluru, Karnataka, India
Aug 27, 2024
Sep 22, 2025
Hybrid
Full-Time
4 Years
Job Description

As a Software Engineer 2 on the Azure ML Infrastructure team, you will play a crucial role in developing and enhancing components for Azure ML’s cutting-edge AI infrastructure. You will work alongside top-tier engineers to advance cluster orchestration, job scheduling, storage, networking, containerization, and OS integration. Your work will facilitate distributed deep learning training and inference, ensuring high availability and scalability of Microsoft Service Fabric and Kubernetes clusters.

Key Responsibilities

  1. Container Orchestration. Deliver a robust container orchestration platform for Singularity, enhancing the efficiency of AI workloads.
  2. Scheduling Sub-System. Design and implement a scheduling sub-system that meets SLAs for AI training and inference workloads.
  3. Storage and Caching. Build and optimize storage and caching systems to support efficient deep neural network (DNN) training and inference.
  4. Control Plane APIs. Develop APIs for creating and managing training jobs and inference model metadata, streamlining operations.
  5. Node Management. Implement node management, fault detection, and repair services to improve job and model reliability.
  6. Monitoring Systems. Create world-class monitoring systems and telemetry pipelines to boost service observability and performance.
  7. Security and Compliance. Strengthen system defenses against malicious attacks and ensure compliance with security requirements.
  8. Performance Optimization. Utilize performance and profiling tools to identify and address bottlenecks across hardware and software layers, optimizing end-to-end job performance.

Qualifications

  • Experience. Minimum of 4 years of coding experience in C#, C, C++, Rust, or Go.
  • Technical Skills. Proficiency with Linux and Kubernetes cluster orchestration. Experience improving service operations and engineering fundamentals.
  • Collaboration. Strong teamwork and collaboration skills, with a proven track record of delivering engineering excellence.
  • Education. Master’s or Bachelor’s degree in Computer Science or a related field.
  • Production Experience. At least 3 years of experience in building and shipping production software or services.

Why Join Us?

  1. Innovative Projects. Work on cutting-edge AI infrastructure that challenges the limits of technology and delivers real-world impact.
  2. Collaborative Environment. Partner with top engineering talent and contribute to a team dedicated to excellence and innovation.
  3. Career Growth. Enhance your skills and career trajectory in a dynamic and supportive environment.
  4. Inclusive Culture. Microsoft is committed to fostering a diverse and inclusive workplace. We welcome applicants from all backgrounds and provide reasonable accommodations for those with disabilities.

Apply Now. If you are driven by solving complex problems and advancing AI infrastructure, we would love to hear from you. Apply today to join our team and help us build the future of AI at Microsoft.
