Palanikumar Balasubramanian

Dubai, UAE

Summary

Experienced data engineer with expertise in Informatica BDM/DEI, Azure Databricks, and Azure Synapse Analytics. Proficient in big data technologies across the Hadoop stack, including HDFS, Sqoop, Flume, MRv2, HQL, Spark, Hive, Kudu, and Oozie. Skilled in Unix shell scripting and version control with Git and Bitbucket, and experienced with ticketing tools such as BMC Remedy and ServiceNow. Committed to leveraging advanced data engineering skills to drive innovative solutions in cloud environments.

Overview

  • 10 years of professional experience
  • 4 years of post-secondary education
  • 4 certifications

Work history

Informatica BDM Data Engineer

WIPRO
Dubai, UAE
07.2022 - 11.2024
  • Company Overview: Wipro / Cloud Data Warehouse Migration / First Abu Dhabi Bank (FAB)
  • Objective: build a cloud data warehouse on the Azure Databricks platform and in Azure Synapse Analytics, integrating disparate FAB source systems using Informatica BDM (an illustrative Spark sketch follows this list)
  • Created source/target connections
  • Imported source/target tables as physical data objects
  • Created mappings for ETL pipelines covering data transformation and transport
  • Created workflows and parameterization for automated end-to-end integration
  • Created applications and deployed them to DIS services
  • Scheduled jobs through the ICS Orchestrator (Informatica Ingestion Control System, the built-in scheduling framework)
  • Handled UAT and the deployment process for production go-live
  • Skilled Tools: Informatica BDM, Azure Databricks, Azure Synapse Analytics, Unix Shell Scripting
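For illustration only: a minimal Spark (Scala) sketch of the kind of curated-table load these mappings performed on Azure Databricks. The storage path, table, and column names are hypothetical, and the production pipelines were built in Informatica BDM rather than hand-written code.

```scala
// Sketch of a Databricks (Scala) load step: raw ADLS extract -> curated Delta table.
// The abfss path, column names, and table name are hypothetical placeholders.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder.appName("fab-load-sketch").getOrCreate()

// Read a raw extract landed in ADLS Gen2 by the ingestion layer.
val raw = spark.read.parquet("abfss://raw@datalake.dfs.core.windows.net/accounts/")

// Standardize the data and stamp the load time.
val curated = raw
  .filter(col("account_id").isNotNull)
  .withColumn("load_ts", current_timestamp())

// Persist as a Delta table for downstream Synapse-facing models.
curated.write.format("delta").mode("overwrite").saveAsTable("curated.accounts")
```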

Informatica BDM Big Data Hadoop Developer

WIPRO
Dubai, UAE
10.2019 - 06.2022
  • Company Overview: Wipro / Big Data Lake / Roads and Transport Authority (RTA)
  • Objective: build a data lake platform on the Cloudera Hive data warehouse, integrating disparate RTA source systems using Informatica BDM
  • Managed and administered the on-premises Informatica BDM platform, including the upgrade from 10.2.1 to 10.4.1
  • Developed and managed data pipelines using Informatica BDM and Hadoop stack tools to integrate various source systems into the big data lake
  • Integrated the WOJHATI (Journey Planner) system into Hive targets via TIBCO web services
  • Integrated the ETRANSPORT MS SQL Server system into Hive targets using Informatica BDM
  • Parsed Google Protocol Buffers (protobuf) data using core Java and dependent libraries (see the sketch after this list)
  • Checked development and production code into the respective Git branches
  • Migrated developed systems into production in compliance with the defined CRQ processes
  • Managed all production jobs and handled support tickets within the SLA
  • Skilled Tools: Informatica BDM, Hive, Impala, Hue, Flume, Sqoop, Unix Shell Scripting
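As a hedged illustration of the protobuf parsing mentioned above, the Scala sketch below assumes a message class generated by protoc; `VehicleEvent` and its fields are hypothetical stand-ins for the actual RTA message schemas.

```scala
// Sketch: parse a Google Protocol Buffers payload with a protoc-generated class.
// VehicleEvent is hypothetical; the real .proto schemas came from RTA sources.
import java.nio.file.{Files, Paths}
import com.example.rta.proto.VehicleEvent // hypothetical generated class

object ProtobufParseSketch {
  def main(args: Array[String]): Unit = {
    val bytes = Files.readAllBytes(Paths.get("event.bin"))
    // parseFrom is generated for every protobuf message type.
    val event = VehicleEvent.parseFrom(bytes)
    println(s"vehicle=${event.getVehicleId}, ts=${event.getTimestamp}")
  }
}
```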

Big Data Hadoop Developer

WIPRO
Abu Dhabi, UAE
04.2019 - 09.2019
  • Company Overview: Wipro / Big Data Lake / Ministry of Finance (MOF)
  • Objective: build and manage a data lake platform integrating disparate MOF source systems using Informatica BDM and the Cloudera Hadoop stack
  • Performed initial-load ingestion from RDBMS systems (Oracle and MS SQL Server) into HDFS in Avro format using Sqoop
  • Created external Hive Avro tables pointing to the ingested data locations
  • Parsed Avro binary data from Kafka topics using the Avro schema
  • Built a real-time streaming Kafka-to-Kudu data pipeline using Spark (see the sketch after this list)
  • Developed shell scripts to automate validation of ingested tables in RZ [HDFS] and CZ [Kudu] and to generate reports
  • Skilled Tools: Informatica BDM, Hive, Impala, Hue, Kafka, Sqoop, Kudu, Unix Shell Scripting
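A minimal sketch of the Kafka-to-Kudu streaming pattern described above, assuming Spark Structured Streaming with the spark-avro and kudu-spark connectors; the broker, topic, Avro schema, and table names are placeholders, not the project's actual configuration.

```scala
// Sketch: stream Avro records from Kafka and upsert them into Kudu.
// Broker, topic, schema, and table names are hypothetical placeholders.
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.avro.functions.from_avro
import org.apache.spark.sql.functions._
import org.apache.kudu.spark.kudu.KuduContext

val spark = SparkSession.builder.appName("kafka-to-kudu-sketch").getOrCreate()
val kudu  = new KuduContext("kudu-master:7051", spark.sparkContext)

// Writer schema for the topic payload (normally fetched from a schema registry).
val schema =
  """{"type":"record","name":"Txn","fields":[
    |{"name":"id","type":"long"},{"name":"amount","type":"double"}]}""".stripMargin

val stream = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")
  .option("subscribe", "mof-transactions")
  .load()
  .select(from_avro(col("value"), schema).as("rec"))
  .select("rec.*")

// Upsert each micro-batch into the curated-zone Kudu table.
def upsertBatch(batch: DataFrame, batchId: Long): Unit =
  kudu.upsertRows(batch, "impala::cz.transactions")

stream.writeStream.foreachBatch(upsertBatch _).start().awaitTermination()
```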

Big Data Hadoop Developer

WIPRO
Chennai, INDIA
02.2019 - 03.2019
  • Company Overview: Wipro / PlanetM20 / Land and Transport Authority
  • Objective: develop a Hadoop application to subscribe to and parse real-time streaming data of type JMS (Java Message Service) and GPB (Google Protocol Buffers) from external source systems (SOLACE and MQTT-Mosquitto servers, respectively), then process the data and store the results in a date-partitioned Parquet Hive table daily
  • Developed a Spark application to subscribe to real-time streaming GPB data from MQTT-Mosquitto servers
  • Created JKS (Java KeyStore) certificates from other certificate types such as .cer and .crt
  • Established secure connections from the application to the servers using TLSv1.2/SSL with valid certificates
  • Parsed/deserialized GPB data using the ScalaPB API
  • Processed the parsed data and stored it in a date-partitioned Parquet Hive table
  • Developed a Flume application to collect real-time streaming JMS data from Solace systems
  • Developed Flume interceptors (static, and JSON using core Java and Morphline) to process events and store them in Avro format on HDFS
  • Developed a Spark application that merges each day's collected data, removes any duplicates, and loads the result into the date-partitioned Parquet Hive table (see the sketch after this list)
  • Automated the above process, with an alerting mechanism on failure, using Unix shell scripting
  • Skilled Tools: Flume, Spark, Core Java, Scala
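A hedged Scala sketch of the daily merge-and-deduplicate step described above; the paths, database, and table names are illustrative, and the real job also raised the failure alerts mentioned in the last bullet.

```scala
// Sketch: merge one day's Flume-collected Avro files, drop duplicates, and
// rebuild that day's partition of a Parquet Hive table. Names are placeholders.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder
  .appName("daily-merge-sketch")
  .enableHiveSupport()
  .getOrCreate()

val dt = "2019-02-15" // the daily partition being rebuilt

// Read everything Flume landed for the day.
// The built-in "avro" source requires the spark-avro package.
val collected = spark.read.format("avro").load(s"/data/landing/jms_events/dt=$dt")

// Overwrite only the partitions present in the incoming data, not the whole table.
spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

collected.dropDuplicates()       // remove exact duplicate events, if any
  .withColumn("dt", lit(dt))     // partition column, expected last by insertInto
  .write.mode("overwrite")
  .insertInto("lake.jms_events") // pre-created table: PARTITIONED BY (dt), STORED AS PARQUET
```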

Big Data Hadoop Developer

TCS
Chennai, India
04.2017 - 01.2019
  • Company Overview: TCS / Walmart / Retail
  • Objective: keep the applications running in the production environment stable while incorporating process improvements and automation
  • Supported various Hadoop-based applications running in production
  • Identified root causes of issues arising during production runs by troubleshooting the run logs, and drove them to resolution
  • Applied process-improvement techniques to meet client requirements quickly
  • Developed automation tooling for client requests
  • Identified application bugs and applied hotfixes after change-request approval
  • Addressed client issues based on their requests
  • Tracked code changes and moved them to Git
  • Developed complex MapReduce algorithms to enhance data extraction processes
  • Skilled Tools: Sqoop, Hive, Oozie, Unix Shell Scripting

Big Data Hadoop Developer

TCS
Chennai, India
04.2015 - 03.2017
  • Company Overview: TCS / Comcast / Telecommunications
  • Objective: perform quality analysis of system-generated bills and invoices in PDF format for the telecom services provided to customers
  • Developed a Java web-service tool that auto-downloads bills from the server using REST API calls
  • Developed a parser class to read PDF text from bills using PDFBox in a custom RecordReader within the MapReduce framework (see the sketch after this list)
  • Transformed the unstructured extracted PDF data into structured data held in Java class variables for validation against DB information per client requirements
  • Imported DB information into HDFS using Sqoop
  • Passed the DB information into the MapReduce framework using the distributed cache
  • Developed the functionality to cross-validate the PDF data held in Java variables against the DB extract as per requirements
  • Reported the data discrepancies between the PDFs and the DB to the client via email so the bills could be regenerated correctly
  • Skilled Tools: HDFS, MapReduce, Core Java
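A minimal Scala sketch of the PDFBox text extraction at the core of the custom RecordReader; in the real project this logic ran inside the MapReduce framework, and the file name and field layout here are illustrative.

```scala
// Sketch: extract text from a bill PDF with Apache PDFBox (2.x API).
// In production this ran inside a custom RecordReader so MapReduce could
// treat each PDF as one input record. The file name is a placeholder.
import java.io.File
import org.apache.pdfbox.pdmodel.PDDocument
import org.apache.pdfbox.text.PDFTextStripper

object BillTextSketch {
  def main(args: Array[String]): Unit = {
    val doc = PDDocument.load(new File("invoice-0001.pdf"))
    try {
      val text = new PDFTextStripper().getText(doc)
      // Downstream, labelled lines (e.g. "Amount Due: 59.99") were mapped to
      // typed fields and cross-validated against the Sqoop-imported DB extract.
      text.linesIterator.filter(_.contains(":")).take(5).foreach(println)
    } finally doc.close()
  }
}
```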

Education

Bachelor of Engineering

SRM EASWARI ENGG. COLLEGE
08.2010 - 04.2014

Skills

  • Informatica BDM (Big Data Management) / DEI (Data Engineering Integration)
  • Azure Databricks
  • Azure Synapse Analytics Data Warehouse
  • Azure Storage (Blob, ADLS Gen2)
  • Big Data - Hadoop Stack:
    • Distributed storage - HDFS
    • Ingestion tools - Sqoop, Flume
    • Distributed processing - MRv2, HQL
    • Distributed in-memory processing - Spark
    • Big data SQL data warehouse - Hive
    • Big data SQL analytics warehouse - Kudu
    • Scheduler - Oozie

  • Unix Shell Scripting
  • Version Control Tools - Git, Bitbucket
  • Ticketing Tool - BMC Remedy, ServiceNow

Certifications

  • Java SE 6
  • MapR Certified Hadoop Developer
  • Databricks Accredited Lakehouse Fundamentals
  • Databricks Accredited Platform Administrator


Languages

  • English - Fluent
  • Tamil - Fluent

Onsite Client Experiences

  • Ministry of Finance, Abu Dhabi, 6 Months
  • Roads & Transport Authority, Dubai, 3+ Years
  • First Abu Dhabi Bank, Dubai, 2+ Years

Timeline

Informatica BDM Data Engineer

WIPRO
07.2022 - 11.2024

Informatica BDM Big Data Hadoop Developer

WIPRO
10.2019 - 06.2022

Big Data Hadoop Developer

WIPRO
04.2019 - 09.2019

Big Data Hadoop Developer

WIPRO
02.2019 - 03.2019

Big Data Hadoop Developer

TCS
04.2017 - 01.2019

Big Data Hadoop Developer

TCS
04.2015 - 03.2017

Bachelor of Engineering

SRM EASWARI ENGG. COLLEGE
08.2010 - 04.2014