Key Responsibilities:

- Data Architecture Design: Architect and design large-scale, end-to-end data solutions that meet the needs of business and technical stakeholders, focusing on scalability, security, and performance.
- ETL/ELT Pipeline Development: Lead the design, development, and optimization of efficient ETL/ELT pipelines that extract, transform, and load data from various sources into structured formats, ready for business intelligence and advanced analytics.
- Data Source Analysis: Perform detailed analysis of structured and unstructured data sources, providing strategic insights on how best to ingest, process, and classify data to create a high-quality data layer.
- Data Layer Development: Design and build a robust, scalable data layer that integrates seamlessly with business analytics platforms and supports both batch and real-time data processing.
- Data Ingestion, Cleansing, and Classification: Develop and implement data ingestion strategies for real-time and batch processes, ensuring data is thoroughly cleansed, validated, and classified in alignment with business goals and governance policies (a minimal illustrative sketch follows this list).
- Query Optimization: Take ownership of optimizing complex SQL queries and enhancing database performance, with a focus on reducing query execution times and improving resource efficiency across large datasets.
- Data Migration: Lead and manage large-scale data migration efforts from legacy systems to modern cloud-based platforms, ensuring data accuracy, integrity, and minimal downtime.
- Performance Monitoring & Troubleshooting: Proactively monitor the performance of data systems, troubleshoot bottlenecks, and implement solutions that maximize the speed and efficiency of data processing workflows.
- Collaboration: Work closely with data analysts, data scientists, and business teams to understand data requirements, build data models, and ensure the data architecture is aligned with both technical and business objectives.
- Data Governance & Security: Ensure that all data solutions comply with data governance and security standards, implementing best practices around data classification, data quality, and regulatory compliance (e.g., GDPR, HIPAA).
- Mentorship: Provide technical leadership and mentorship to junior data engineers, fostering a culture of innovation, best practices, and continuous improvement.
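For illustration only, a minimal Python sketch (using pandas, which this posting does not specifically require) of the kind of batch ingestion, cleansing, validation, and classification step described above; the file path, column names, and classification threshold are all hypothetical.

```python
import pandas as pd

REQUIRED_COLUMNS = ["customer_id", "event_ts", "amount"]  # hypothetical source schema

def cleanse_batch(path: str) -> pd.DataFrame:
    """Load a raw CSV extract, validate required columns, and apply basic cleansing."""
    df = pd.read_csv(path)

    # Validation: fail fast if the source file is missing expected columns.
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")

    # Cleansing: normalize types, drop exact duplicates, discard rows without a key.
    df["event_ts"] = pd.to_datetime(df["event_ts"], errors="coerce")
    df["amount"] = pd.to_numeric(df["amount"], errors="coerce")
    df = df.drop_duplicates().dropna(subset=["customer_id", "event_ts"])

    # Classification: tag records so downstream layers can route and govern them.
    df["record_class"] = df["amount"].apply(
        lambda x: "high_value" if x >= 10_000 else "standard"  # hypothetical rule
    )
    return df
```

In practice, logic like this would typically run inside a managed pipeline (for example, Azure Data Factory or Apache Airflow, both named below) rather than as a standalone script.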
Requirements:

Education:
- Bachelor's or Master's degree in Computer Science, Data Science, Information Technology, or a related field.

Experience:
- 8+ years of experience in data engineering, with a proven track record in architecting and building large-scale ETL/ELT pipelines and data systems.
- Extensive experience in analyzing data sources, designing data layers, and managing large-scale data migrations.
- Strong hands-on experience in optimizing complex SQL queries and improving the performance of databases handling large datasets.
- Deep expertise with cloud platforms (Azure, AWS, GCP), big data technologies (e.g., Hadoop, Spark), and data warehouse solutions (Snowflake, Azure Synapse).

Technical Skills:
- Advanced SQL skills with a focus on query optimization, indexing, partitioning, and performance tuning.
- Hands-on experience with ETL/ELT tools (Azure Data Factory, Apache Airflow, SSIS) and designing complex data pipelines (a minimal Airflow sketch follows this list).
- Proficiency in working with relational databases (SQL Server, PostgreSQL) and NoSQL databases (MongoDB, CosmosDB).
- Strong knowledge of data governance, data quality management, and regulatory frameworks (GDPR, HIPAA).
- Experience with big data processing frameworks (Databricks, Apache Spark) and integrating them with modern data warehouses.
- Expertise in data ingestion from diverse sources (databases, APIs, flat files, streaming data) and transforming raw data into structured formats for analysis.
- Strong familiarity with CI/CD pipelines and automation of data workflows using DevOps practices.
- Experience with data visualization tools (Power BI, Tableau) and their integration with data platforms.
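For illustration, a minimal Apache Airflow DAG sketch of the extract-transform-load pattern this role works with; it assumes Airflow 2.x, and the DAG name, schedule, and task bodies are placeholders.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    # Pull raw records from a source system (placeholder).
    ...

def transform():
    # Cleanse and reshape the extracted data (placeholder).
    ...

def load():
    # Write the structured output to the warehouse (placeholder).
    ...

with DAG(
    dag_id="example_etl_pipeline",   # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    # The >> operator sets ordering: extraction completes before transform and load run.
    extract_task >> transform_task >> load_task
```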