- Design, implement, and maintain scalable AI infrastructure solutions to support machine learning and data-driven applications.
 - Collaborate with data scientists, software engineers, and IT teams to optimize AI workflows and deployment pipelines.
 - Monitor system performance, troubleshoot issues, and ensure high availability and reliability of AI platforms.
 - Automate the provisioning, configuration, and management of AI infrastructure using modern DevOps tools and practices.
 - Evaluate and integrate new hardware and software technologies to enhance AI infrastructure capabilities.
 - Develop and enforce security best practices for AI systems, ensuring data privacy and compliance.
 - Manage cloud and on-premises resources for AI workloads, optimizing for cost, performance, and scalability.
 - Document infrastructure architecture, processes, and best practices for internal knowledge sharing.
 - Provide technical support and guidance to internal teams regarding AI infrastructure usage and optimization.
 - Stay current with industry trends and emerging technologies in AI infrastructure, recommending improvements as needed.
 
 - Bachelor’s degree in Computer Science, Information Technology, Engineering, or a related field.
 - 3 to 5 years of experience in IT infrastructure, with a focus on supporting AI or machine learning environments.
 - Proficiency in managing cloud platforms such as AWS, Azure, or Google Cloud for AI workloads.
 - Hands-on experience with containerization technologies like Docker and orchestration tools such as Kubernetes.
 - Strong understanding of networking, storage, and security principles in distributed computing environments.
 - Familiarity with infrastructure-as-code tools (e.g., Terraform, Ansible, or similar).
 - Experience with monitoring, logging, and performance tuning of AI systems.
 - Proficiency in Linux/UNIX or Windows environments, including scripting and administration.
 - 2+ years of experience deploying VMware and/or Red Hat cloud technologies.
 - Familiarity with cloud data services and big data processing tools.
 - Strong automation development skills, with proficiency in one or more automation languages (e.g., Ansible, CloudFormation, Chef, Puppet, Terraform).
 - Professional knowledge of cloud computing delivery models (IaaS, PaaS, and SaaS) and of deployment models for public, private, and hybrid cloud services.
 - Strong background in network architecture, database programming (SQL/NoSQL), and data modeling.
 - Deep understanding of cloud component architecture: microservices, containers, IaaS, storage, and security.
 - Knowledge of routing/switching technologies.