Job Description
Roles & Responsibilities
At Sana Commerce, we're looking for a Data Engineer (ML/AI) to design, build, and scale data systems that power our analytics and machine learning initiatives. Your work will ensure high-quality, reliable, and ML-ready data pipelines that enable both traditional analytics and advanced AI-driven solutions across the business.
What you'll be doing
- Designing and maintaining data pipelines optimized for ML/AI workloads, including handling of large-scale, unstructured, and semi-structured data.
- Building feature pipelines and feature stores that ensure reusability and consistency of data used by machine learning models.
- Collaborating with Data Scientists and ML Engineers to understand data requirements for training, validation, and production deployment.
- Ensuring data quality, lineage, and governance meet standards required for AI/ML applications.
- Supporting MLOps practices by integrating data pipelines with model training, monitoring, and deployment workflows.
- Leveraging distributed processing frameworks (e.g., Spark, Databricks, Azure Synapse) for scalable ML data processing.
What you bring
- 6+ years of experience as a Data Engineer, working with Azure and Databricks, ideally with exposure to ML/AI-related data workflows.
- A college degree that demonstrates your analytical abilities, such as Econometrics, Computer Science, Mathematics, or similar.
- Excellent analytical and problem-solving skills.
- Experience with data preparation for ML/AI: managing large datasets, feature engineering, and real-time or batch data pipelines.
- Familiarity with MLOps concepts and how data engineering supports model lifecycle management.
- Experience with orchestration frameworks (Airflow, Prefect, or Azure Data Factory) for complex ML pipelines.
- Knowledge of unstructured data processing (text, images, logs) is a plus.
- Strong SQL and Python skills; experience with distributed data processing (PySpark, Dask, etc.) is a plus.