Job Description
Roles & Responsibilities
- Design, build, and maintain scalable ML pipelines for training, testing, and deployment
- Deploy and maintain machine learning models, and monitor their performance and reliability in production
- Collaborate with data scientists and engineers to streamline experimentation and deployment workflows
- Implement CI/CD practices for ML systems
- Manage and optimize cloud-based infrastructure for ML workloads
- Develop monitoring, logging, and alerting systems for model performance and data drift
- Ensure reproducibility, versioning, and governance of ML models and datasets
- Advocate for best practices in MLOps, DevOps, and software engineering
Desired Candidate Profile
- 5+ years of experience in software engineering, DevOps, or MLOps roles
- Strong programming skills in Python; familiarity with Java or Go is a plus
- Experience with ML frameworks such as TensorFlow, PyTorch, or similar
- Hands-on experience with containerization and orchestration tools (Docker, Kubernetes)
- Experience with cloud platforms (AWS, GCP, or Azure)
- Familiarity with CI/CD tools (e.g., GitHub Actions, Jenkins, GitLab CI)
- Strong understanding of data pipelines, distributed systems, and API development
- Experience with monitoring tools and logging frameworks