Natallia Lahun

Abu Dhabi

Summary

Dynamic Data Engineer with extensive experience at EPAM Systems, excelling in building scalable ETL pipelines using PySpark and AWS. Proven leadership in managing teams and enhancing data processing efficiency. Skilled in web scraping and containerization with Docker, driving impactful data solutions that align with business objectives.

Overview

20 years of professional experience
1 Certification

Work History

Data Engineer

Integrated Data Intelligence
06.2023 - Current

Palantir Foundry responsibilities:

  • Led end-to-end work in Palantir Foundry, covering platform administration, pipeline development, ontology modeling, and application delivery.
  • Managed workspace setup, user onboarding, permissioning (object ACLs, code repos, ontology layers), and role-based access controls across domains.
  • Designed and implemented data pipelines using Code Workbooks, Pipelines, and custom logic in SQL/Python, ingesting from APIs, cloud storage, and relational sources.
  • Modeled business domains using Foundry Ontology, mapping complex entities with inheritance, relationships, and access layers.
  • Used the DevOps app for CI/CD workflows: pipeline promotion, testing, and deployment across environments.
  • Built internal tools and dashboards using Slate, Workshop, Time Series, Contour, Writebacks, etc.
  • Developed custom applications using Foundry SDK (Python) and integrated Action API to trigger workflows, update datasets, and interact with external systems in real time.
  • Implemented data validation, audit trails, version control, and modular reusable logic for large-scale analytics projects.
  • Published reusable templates, documentation, and standards for teams using Foundry across engineering, data science, and business units.

Other experience:

  • Built and maintained end-to-end data pipelines using Spark SQL and PySpark, supporting scalable ETL processes for structured and unstructured data.
  • Leveraged AWS (Lambda, Elasticsearch, CloudWatch) to create serverless and monitored pipelines for real-time and batch processing.
  • Created containerized workflows with Docker for reliable deployments.
  • Developed full-cycle Python-based web scraping pipelines using Beautiful Soup and Selenium to enrich training datasets.
  • Collaborated with data scientists to align data transformations with model training requirements.

Technologies: PySpark, Spark SQL, Python, Palantir Foundry, REST API, TypeScript, AWS, Web Scraping

Lead Data Engineer

EPAM Systems
09.2021 - 05.2023

Outsourcing to a FinTech company.

  • Designed and deployed scalable ETL workflows using PySpark, Spark SQL, and Palantir Foundry to support predictive analytics and model evaluation.
  • Developed custom APIs and automation scripts in Python and TypeScript to streamline data ingestion.
  • Supported collaborative development with global teams of data scientists, ensuring high-quality datasets and governance compliance.
  • Developed and maintained data pipelines to ingest and store large volumes of data from multiple sources.

Technologies: PySpark, Spark SQL, Python, Palantir Foundry, REST API, TypeScript, MySQL, Beautiful Soup, Web Scraping, Docker.

Internal role - Resource Manager.

  • Led unit of 25+ team members across multiple sub-units, supporting workforce planning and project resource allocation.
  • Collaborated with project managers to assess staffing needs and align talent with delivery priorities.
  • Participated in hiring and onboarding processes to ensure team capacity matched evolving project demands.
  • Oversaw task assignment and helped optimize team structure to meet delivery and quality goals.

Senior Database Developer

EPAM Systems
06.2017 - 09.2021

Outsourcing to a FinTech company in the USA.

  • Supported existing SQL code, developed new features at the client's request, and created pipelines to migrate project data from an MS SQL database to Oracle.
  • Participated in building a real-time, high-load solution to sync data between two systems, one based on an MS SQL database and the other on Oracle DB.
  • Technologies: MS SQL, Oracle, Pentaho DI, SOAP, Crystal Reports, VB 6, Git, Gerrit, Jira.

Senior Database Developer

EPAM Systems
01.2016 - 05.2017
  • Built a Java and Groovy application for uploading data to a MarkLogic database, and created data pipelines for data transformation.
  • Outsourcing for Thomson Reuters.
  • Technologies: Java, MarkLogic DB, Groovy

Senior Database Developer

EPAM Systems
03.2015 - 01.2016

Outsourcing for a telecom company in Germany.

  • Implemented analytics scripts in Hive SQL.
  • Maintained existing Pig scripts and created new ones for data transformation and storage.
  • Maintained and wrote Bash scripts for automated deployment of Apache Hadoop, Pig, and Hive.
  • Worked on rewriting a Logstash output plugin in Ruby.
  • Researched BI visualization tools for the customer and created dashboards in Pentaho.
  • Technologies: Bash, Logstash, Apache Hadoop, Pig, Hive, Pentaho BI, Kafka.

Senior Database Developer

EPAM Systems
05.2014 - 02.2015
  • Worked with business analysts and participated in requirements discussions.
  • Implemented data transformations in Hive and Pig, and helped a Java developer implement MapReduce jobs.
  • Configured Sqoop jobs and implemented Oozie workflows.
  • Maintained the development Cloudera Hadoop cluster and supported the Git repository.
  • Implemented Python tools for test automation and coordinated the team.
  • Outsourcing to a US oil and gas company.
  • Technologies: Cloudera Hadoop, Impala, Hive, Sqoop, Oozie, Pig, Java, Python

Senior Database Developer

EPAM Systems
04.2013 - 05.2014
  • Analyzed the old database model to remove obsolete functionality, redesigned the database, designed new databases for project purposes, developed stored procedures and triggers, and communicated with the customer.
  • Outsourcing to Thomson Reuters.
  • Technologies: MS SQL database

Senior Database Developer

EPAM Systems
01.2012 - 04.2013
  • Implemented ETL tools in Perl, performed data modeling, and created data pipelines.
  • Optimized SQL server performance through indexing and query optimization techniques.
  • Documented all database changes so that they can be easily tracked and audited when required.
  • Technologies: Perl, Vectorwise DB, Bash.

Database Developer

EPAM Systems
05.2011 - 01.2012
  • Analyzed the data domain, customer requirements, and input data sources; performed data modeling; and developed ETL pipelines using Q scripts.
  • Outsourcing to Thomson Reuters.
  • Financial project: KDB+ database development.
  • Technologies: KDB+ DB, Q Language, Bash

Senior DB Developer

System Technologies
05.2008 - 05.2011
  • Responsible for developing new functionality on the database side (Sybase): SPs, views, functions, etc.
  • Data modeling.
  • Analyzed the data domain and customers' requirements.

DB Developer

Vitiaz
09.2005 - 05.2008
  • Worked on an internal financial project that helps control the organization's finances.

Education

Bachelor of Science - Information Technology in Economics

Belarusian State University of Informatics and Radioelectronics
Minsk, Belarus
06.2009

Skills

  • Palantir Foundry Expert
  • Apache Spark (PySpark) and Big Data processing
  • Python programming
  • ETL and ELT processes
  • Database development and SQL
  • NoSQL database management
  • Web scraping techniques
  • Amazon Web Services (AWS)
  • Containerization with Docker
  • SOAP web services
  • Data pipeline architecture
  • Team leadership skills

Certification

  • AWS Certified Solutions Architect - Professional, 2021 - 2024
  • AWS Certified DevOps Engineer - Professional, 2021 - 2024
  • AWS Certified Data Analytics - Specialty, 2021 - 2024
  • Microsoft Certified: Azure Data Engineer Associate, 2021 - 2024

Languages

English
Polish
Russian
