Detail-oriented Data Engineer who designs, develops, and maintains highly scalable, secure, and reliable data systems. Accustomed to working closely with system architects, software architects, and design analysts to translate business and industry requirements into comprehensive data models. Proficient in developing database architecture strategies across the modeling, design, and implementation stages.
Overview
4 years of professional experience
8 years of post-secondary education
Work History
Data Engineer
Moove
Dubai
04.2023 - Current
Collaborated on an ETL (Extract, Transform, Load) pipeline to ingest the organization's Google Drive events using the Google API, maintaining data integrity and verifying pipeline stability.
Worked on API automation for the OpenMetadata data catalog tool.
Created dynamic and scalable data quality reports on the OpenMetadata platform, leveraging Kubernetes for execution, S3 for storing partitioned data, Glue Catalog for schema retrieval, and QuickSight with Redshift external tables for data visualization.
Created an end-to-end pipeline for ingesting Firebase app events, utilizing BigQuery tables and the BigQuery Storage API for efficient data ingestion, storing the data in S3, and building QuickSight dashboards to track and analyze app events and usage over time.
Data Scientist Engineer
Hyke Digital Distribution Platform; Dicetek LLC
Dubai
06.2021 - 04.2023
Handled all aspects of the product from POC phase to production, including data collection, analysis, modeling, deployment, and understanding the underlying "why"s.
Designed technical solutions and roadmaps, led POCs within the team, and shared findings and recommendations with upper management and stakeholders.
Generated sales predictions using time-series algorithms such as FBProphet and PyCaret ensembles, along with boosting algorithms like LightGBM and CatBoost, resulting in an improved forecast score metric.
Developed the Hyke Forecasting System (HFS) pipeline to replace AWS Forecast legacy code, consolidating 30 Lambda functions, 4 Step Functions, and after-hours manual support into a single automated Airflow DAG, considerably reducing the manual work involved in the process.
Increased the efficiency of sales-level forecasting by analyzing variables such as sales trend and the product being forecasted, and implemented features like Pareto classification, safety stock, and ABC-XYZ classification, which helped the system achieve maximum stock availability.
Worked closely with the Inventory team to understand problems such as out-of-stock forecasting and SLA reduction, and incorporated solutions into the system replenishment process to address them.
Developed an automated rule- and user-based solution to resolve legacy problems in allocating stock to NPI (New Product Introduction) SKUs, which removed manual effort from stakeholders during off-hours and increased availability.
Built and managed several Airflow DAGs to enhance and automate inventory operations processes such as System Replenishment, NPI Seed Allocation, and Marketplace Allocations. Transformed code with rule-based automation to reduce the chance of errors, hotfixes, and manual interventions.
Data Engineer
To The New
02.2019 - 06.2021
Developed 20+ end-to-end incremental ETL pipelines to ingest data from transactional databases into the data warehouse, creating generic NiFi flows for data ingestion to S3 and Lambda + Step Functions to load data into Redshift incrementally, robust enough to handle data redundancy and back-dated entries while maintaining a history of updates to the data.
Created a Service Improvement Program (SIP) to check the health of the data warehouse and aid with data quality checks of ETL flows
Designed a code utility library for common operations performed in Redshift.
Designed BI and fact tables for commercial and inventory reports after applying analytical transformations to data stored in the warehouse.
Developed a near real-time email parsing solution, which performs ETL operations for use cases like EDI Reporting/Acknowledgement, Shipment and Activation Data Ingestion.