Gramian Consulting Group

Job Details

Gramian Consultancy is a boutique consultancy specializing in IT professional services and engineering talent solutions.
With a strong background in software engineering and leadership, we help companies build high-performing teams by matching them with professionals who truly fit their needs.
Role Overview
We are building large-scale evaluation and training datasets to help LLMs solve realistic software engineering problems.
As part of this initiative, we create verifiable software engineering tasks derived from public repository histories, using a synthetic, human-in-the-loop approach.
Our goal is to expand dataset coverage across programming languages, difficulty levels, and task types.
We’re looking for experienced Software Engineers who are familiar with high-quality public GitHub repositories and can actively contribute to this effort.
The role is hands-on.
NOTE: In this project you will not be building a software project; you will specifically be generating data to improve model performance for one of the biggest foundation model companies.
Duration: 3 months
Commitment: 40h/week, 4h/day overlap with PST
Model: Contract, time and material
Location: 100% Remote: India, Pakistan, Nigeria, Kenya, Egypt, Ghana, Bangladesh, Turkey, Mexico
Interview: 75 min async test

Key Responsibilities
Review and prioritize GitHub issues from popular open-source projects.
Clone, configure, and containerize repositories (including Docker setup) for local execution.
Analyze unit test coverage and test reliability.
Run and adapt codebases locally to evaluate LLM behavior in debugging and bug-fix scenarios.
Partner closely with research teams to surface repositories and problems that meaningfully challenge LLMs.
Potentially lead small groups of junior engineers and guide collaborative project work.

Work in a fully remote environment.
Opportunity to work on cutting-edge AI projects with leading LLM companies.
Minimum of 3 years of overall experience.
Strong experience with at least one of the following languages: C++.
Proficiency with Git, Docker, and basic software pipeline setup.
Ability to understand and navigate complex codebases.
Comfortable running, modifying, and testing real-world projects locally.
Experience contributing to or evaluating open-source projects is a plus.
Nice-to-Haves:
Experience with dataset creation, annotation, evaluation, or ML pipelines.
Familiarity with benchmarks like SWE-bench or Terminal-Bench.
Background in QA automation, DevOps, ML systems, or data engineering.
