Bosta -
Egypt, Cairo

Job Details

Job Description

Roles & Responsibilities

We're rebuilding Bosta's data platform end-to-end: from MongoDB at the source, through a governed semantic layer that LLM-native tools (NL-to-SQL agents, AI analysts, embedded copilots) can sit on top of safely and cheaply. You will help own that rebuild.

This is a hands-on role. You'll set patterns, write the foundational code, and ship across the stack, from CDC and ingestion through dbt, the semantic layer, and the interfaces that BI tools and AI agents consume.

Job Responsibilities

  • End-to-end pipeline work: MongoDB → CDC → ingestion → lakehouse → warehouse → dbt → semantic layer → BI/AI consumers
  • Co-ownership of architecture decisions with the Data Engineering Lead
  • CDC from production MongoDB without degrading operational DB performance; ingestion patterns that make adding a new source a config change, not a project
  • Orchestration that's observable end-to-end (Airflow, Dagster, or Prefect)
  • The dbt project: structure, conventions, tests, contracts, exposures, CI
  • The semantic / metrics layer (dbt Semantic Layer, Cube, or equivalent): one canonical definition per business metric
  • LLM-readiness: column-level documentation, PII tagging, query cost guardrails, materialized metric tables, and evals on AI-generated SQL
  • Migration of existing logic out of the Tableau and Metabase sprawl into modeled, governed sources

Desired Candidate Profile

Job Qualifications

  • 4+ years across data engineering and/or analytics engineering; you've spent meaningful time on both sides
  • Comfort spanning the stack: shipping a Debezium connector one week and a dbt mart the next
  • Deep dbt and SQL: you've owned a non-trivial project, not just contributed to one
  • Production CDC experience (Debezium, Kafka Connect, Airbyte, or hand-rolled) against operational databases; bonus if that database was MongoDB
  • A cloud warehouse you know deeply (Redshift, Snowflake, BigQuery, or Databricks)
  • Strong Python; comfortable in Linux, infra-as-code, and CI/CD
  • Working understanding of how LLM tooling (RAG, NL-to-SQL, embedded agents) consumes a data platform, and what breaks when the platform isn't ready
  • Strong opinions on modeling, lightly held; bias toward observability

Bonus:

  • MongoDB schema evolution at scale
  • Production semantic-layer rollouts (dbt Semantic Layer, Cube, LookML, MetricFlow)
  • Lakehouse formats (Iceberg, Delta, Hudi) or streaming experience (Kafka, Flink, Kinesis)
  • Data catalog / lineage tooling (DataHub, Atlan, Collibra)
  • Logistics, marketplaces, or other operationally heavy domains
  • Evals on AI-generated SQL or analytical reasoning

About Bosta
Egypt, Cairo
Management Consulting