Easygenerator

Job Details

Job Description

Roles & Responsibilities

What you'll be doing

  • Designing and maintaining data pipelines optimized for ML/AI workloads, including handling of large-scale, unstructured, and semi-structured data.
  • Building feature pipelines and feature stores that ensure reusability and consistency of data used by machine learning models.
  • Collaborating with Data Scientists and ML Engineers to understand data requirements for training, validation, and production deployment.
  • Ensuring data quality, lineage, and governance meet standards required for AI/ML applications.
  • Supporting MLOps practices by integrating data pipelines with model training, monitoring, and deployment workflows.
  • Leveraging distributed processing frameworks and platforms (e.g., Spark, Databricks, Azure Synapse) for scalable ML data processing.

Desired Candidate Profile

What you bring

  • 6+ years of experience as a Data Engineer, working with Azure and Databricks, ideally with exposure to ML/AI-related data workflows.
  • College degree that demonstrates your analytical abilities, such as Econometrics, Computer Science, Mathematics, or similar.
  • Excellent analytical and problem-solving skills.
  • Experience with data preparation for ML/AI: managing large datasets, feature engineering, and real-time or batch data pipelines.
  • Familiarity with MLOps concepts and how data engineering supports model lifecycle management.
  • Experience with orchestration frameworks (Airflow, Prefect, or Azure Data Factory) for complex ML pipelines.
  • Knowledge of unstructured data processing (text, images, logs) is a plus.
  • Strong SQL and Python skills; experience with distributed data processing (PySpark, Dask, etc.) is a plus.

About Easygenerator
Egypt, Alexandria