I'm

Adith Prabukumar

Data Engineer, Machine Learning Engineer, Data Scientist

About

About Me

Hi!

I love playing around with data 📊 and extracting insights to drive better decisions 🚀 and improve operations. With a strong background in data analysis and data engineering, I'm your go-to person for uncovering trends, patterns, and opportunities within large and complex datasets 🧐💡.

I'm currently working in the Data Analytics and Engineering team at Frontier Science as a Data Engineer.

Degree: Master of Science
Major: Data Science
Domains
  • Data Engineering
  • Software Development
  • Data Science
  • Machine Learning

Resume

Objective

Skilled engineer with experience in cloud technologies, data pipeline development, and CI/CD automation. Seeking a software role where I can apply my coding and analytics expertise to develop innovative solutions and contribute to an organization's success.


Education

Master of Science in Data Science

University at Buffalo, NY, USA

Aug 2021 - Feb 2023 | GPA: 3.8/4

Bachelor of Engineering

Anna University, TN, India

Aug 2017 - Feb 2021 | GPA: 3.7/4


Certifications

Professional Experience

Data Engineer

Frontier Science And Technology | Mar 2023 - Present

  • Worked on an LLM-based automation model to solve complex statistical problems during the annotation process, reducing statistician workload by 43%.
  • Developed comprehensive dashboards in Grafana with evaluation metrics such as RMSE and BLEU scores to assess the performance of various ML models.
  • Used TensorFlow to create a DeepPDF ML model that parses large volumes of medical data records.
  • Worked with DevOps team to containerize the models using Docker and push them to the production server.
  • Engineered an automated Python bot to extract data from Excel files and upload it to a local PostgreSQL database, cutting manual data entry time by 30%.
  • Worked on a real-time data transfer pipeline using Kafka, moving 1M data records from a local server to an AWS Redshift data warehouse, easing access to data and increasing productivity by 80%.
  • Implemented optimized PostgreSQL queries on legacy projects, reducing query processing time by 40% and improving cross-team functionality.
  • Spearheaded the design and implementation of robust CI/CD pipelines using GitHub Actions and other version control methodologies across all projects.

Data Analyst

AINQA Group | Aug 2020 - Jul 2021

  • Achieved recognition as the leading performer within the team by consistently resolving existing tickets and collaborating with colleagues.
  • Applied OCR with OpenCV to extract data and build a comprehensive dataset, increasing the team's productivity by 40%.
  • Engineered a web application using Python's Flask framework, enabling employees to effortlessly upload CSV files for real-time interactive data analysis, improving data accessibility by 30%.
  • Collaborated with Senior Enterprise Architects, Solution Architects, and Product Managers to understand workflows and define standards and rationalization plans, improving visibility into strategic outcomes and driving product impact by around 70%.
  • Introduced UAT methods by planning and executing key test cases, improving the error visibility process by 30%.

Projects

News Summarizer

Built a user-friendly website that uses an NLP-based BERT transformer model to summarize news articles. We developed the site with the Streamlit package and hosted it on an AWS EC2 Instance. The entire project is packaged in Docker for easy deployment, and our deployment process is automated using GitHub Actions in a CI/CD pipeline.

Audiolytics: A Speech to Text Conversion Tool

Developed an end-to-end ML model that performs speech-to-text on audio files and returns the transcribed text along with a timestamp for each word. Applications include podcast and audiobook editing. The model is built with wav2vec2 and served through a Flask endpoint. It is packaged with Docker and maintained using GitHub Actions.

2022 Data Science Job Analysis

Data Science Job Analysis Dashboard - Applied NLP-based transformations to data science job data scraped from Glassdoor. Developed using Python (Selenium) and TF-IDF vectorization. Visualization and analysis done in Tableau.

Analyzing the 2020 Presidential Election

Performed a comprehensive Exploratory Data Analysis of news articles published during the 2020 Presidential Election, collecting the data with Python packages including Selenium and BeautifulSoup. Applied Natural Language Processing (NLP) techniques to extract insights and conducted a sentiment analysis of the news coverage.

Skills


Application Development & Scripting









Cloud & Big Data






Other Skills




Designed by Adith Prabukumar.