AI Software Engineer (LLM & Full-Stack Systems)
About the Role
- We are seeking an AI Software Engineer who is equally strong in software engineering and applied AI: someone who can fine-tune models, build scalable AI agents, and ship and operate production-grade applications end to end.
- The ideal candidate has built and deployed LLM-powered systems that serve real users, handling everything from prompt orchestration and fine-tuning to inference optimization, caching, latency reduction, logging, monitoring, and post-deployment model evaluation.
- You’ll work closely with our AI, product, and infrastructure teams to deliver high-performance, secure, and scalable AI applications across web and enterprise environments.
Key Responsibilities
1. Generative AI Engineering
- Fine-tune, quantize, and deploy open-source LLMs (e.g., Llama-3, Mistral, Falcon, DeepSeek, Phi) for specific domains and clients.
- Build and maintain multi-agent orchestration systems using frameworks such as LangChain or LlamaIndex, or custom Python pipelines.
- Create and maintain retrieval-augmented generation (RAG) pipelines with vector databases (Qdrant, Pinecone, Weaviate, FAISS); a minimal retrieval sketch follows this list.
- Design architectures for AI services and optimize prompt templates, context windows, and chain execution for efficiency and precision.
- Integrate and evaluate speech, vision, and multimodal models for production-ready use cases.
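For illustration, here is a minimal sketch of the retrieval step of a RAG pipeline of the kind described above. It assumes sentence-transformers and FAISS are available; the corpus, the embedding model choice, and the final generation step are placeholders rather than part of this posting's required stack.

```python
# Minimal RAG retrieval sketch (illustrative only).
import faiss
from sentence_transformers import SentenceTransformer

corpus = [
    "Qdrant is a vector database for similarity search.",
    "vLLM serves LLMs with high-throughput batched inference.",
    "FAISS provides in-memory nearest-neighbour indexes.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # example model choice
vectors = embedder.encode(corpus, normalize_embeddings=True)

index = faiss.IndexFlatIP(vectors.shape[1])  # inner product == cosine on normalized vectors
index.add(vectors)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Embed the question and return the k most similar corpus chunks."""
    q = embedder.encode([question], normalize_embeddings=True)
    _, ids = index.search(q, k)
    return [corpus[i] for i in ids[0]]

context = "\n".join(retrieve("What serves LLMs efficiently?"))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: What serves LLMs efficiently?"
# The assembled prompt would then be sent to an LLM endpoint (vLLM, Ollama, etc.) for generation.
```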
2. LLM Platform Management & Production Operations
- Deploy and maintain LLM inference servers (vLLM, Ollama, TGI, or custom microservices).
- Apply prompt caching, embedding pre-computation, and batch inference to optimize performance (a caching sketch follows this list).
- Evaluate model performance and quality using automated feedback loops and analytics dashboards.
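As an illustration of the caching work mentioned above, here is a minimal prompt-cache sketch assuming a local Redis instance and a vLLM server exposing an OpenAI-compatible endpoint; the host names, model id, and one-hour TTL are illustrative choices.

```python
# Prompt-cache sketch in front of an OpenAI-compatible inference server (illustrative only).
import hashlib

import redis
from openai import OpenAI

cache = redis.Redis(host="localhost", port=6379, decode_responses=True)
llm = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed-for-local-vllm")

def cached_completion(prompt: str, model: str = "meta-llama/Meta-Llama-3-8B-Instruct") -> str:
    """Return a cached answer when the exact prompt was seen before, otherwise call the model."""
    key = "prompt:" + hashlib.sha256(f"{model}:{prompt}".encode()).hexdigest()
    hit = cache.get(key)
    if hit is not None:
        return hit
    resp = llm.chat.completions.create(
        model=model,  # model id is an example, not a requirement
        messages=[{"role": "user", "content": prompt}],
    )
    answer = resp.choices[0].message.content
    cache.set(key, answer, ex=3600)  # cache for one hour
    return answer
```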
3. Full-Stack & Systems Development
- Architect, develop, and maintain frontend interfaces (React / Next.js) and backend services (FastAPI / Node.js).
- Design robust REST / GraphQL APIs for interaction with AI microservices (an example endpoint sketch follows this list).
- Integrate vector databases, relational DBs, and cloud storage in scalable data architectures.
- Contribute to both rapid prototyping (Streamlit / Gradio) and enterprise-grade platforms.
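As a small example of the backend side of this work, here is a minimal FastAPI endpoint fronting an AI microservice; the route, the request/response models, and the answer_question() stub are hypothetical.

```python
# Minimal FastAPI gateway sketch for an AI microservice (illustrative only).
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="AI gateway")

class AskRequest(BaseModel):
    question: str

class AskResponse(BaseModel):
    answer: str

def answer_question(question: str) -> str:
    # Placeholder for the real call into the RAG/LLM service.
    return f"(stub) you asked: {question}"

@app.post("/ask", response_model=AskResponse)
def ask(req: AskRequest) -> AskResponse:
    """Validate the request, delegate to the AI service, and return a typed response."""
    return AskResponse(answer=answer_question(req.question))

# Run locally with: uvicorn main:app --reload  (assuming this file is main.py)
```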
4. Collaboration & Product Development
- Work cross-functionally with AI researchers, product designers, and cloud architects to deliver complete user experiences.
- Participate in technical planning, code reviews, and performance audits.
- Document internal tools, APIs, and best practices for future developers.
- Contribute to architectural decisions for scalability, cost optimization, and compliance.
Required Skills & Experience
- Languages: Python, TypeScript/JavaScript, SQL, Bash.
- Frameworks: FastAPI, Flask, Django, React, Next.js, LangChain, LlamaIndex.
- Databases: PostgreSQL, MongoDB, Redis, Qdrant, Pinecone.
- AI & ML Stack: PyTorch, Transformers, Hugging Face, vLLM, Ollama, OpenAI SDK.
- DevOps & Cloud: Docker, Kubernetes, GCP, GitHub Actions, Terraform.
- Observability: Prometheus, Grafana, OpenTelemetry, Sentry, ELK stack.
- Preferred Add-ons: Experience with edge inference, GPU orchestration, or local on-prem deployments.
Qualifications
Experience Required
- Minimum 4–5 years of professional experience in software development, including hands-on work with Generative AI applications in production environments.
- Bachelor’s or Master’s degree in Computer Science, Artificial Intelligence, or a related field.
- Proven experience in building and scaling AI-powered applications in production.
- Demonstrated ability to own the full lifecycle, from ideation to deployment and monitoring.
Soft Skills
- Ownership mentality and strong debugging instincts.
- Ability to design clean architectures and maintain production-grade code.
- Deep curiosity for emerging AI trends and optimization methods.
- Excellent communication, documentation, and collaboration skills.