
Sai Chaitanya Pachipulusu

Machine Learning Engineer | M.S. in Machine Learning | Ex-SE @ CGI

United States · 500+ connections · LinkedIn · Updated 10 months ago

About Sai Chaitanya

Passionate about machine learning modeling, math, and software engineering.
Spent most of graduate school working through research papers and understanding ML.
Personal website: https://pachipulusu.vercel.app/

Core Competencies

  • Proficient in programming: Python, MATLAB, Java; project experience using R, SQL, C, PyTorch, Keras and TensorFlow
  • 6+ years' professional and academic experience in computer vision, large language models, reinforcement learning, machine learning, deep learning, natural language processing, web mining
  • Proficient in statistics and machine learning models
  • Implementing scientific papers; writing project reports and documentation
  • Interdisciplinary background supports collaboration with cross-functional teams

Programming Languages: Python, R, C, C++, Scala

Databases: PostgreSQL, MySQL, MariaDB

Libraries, Frameworks and Models

  • Generative AI: LangChain, LlamaIndex, Pinecone, Chroma, Mixtral 8x7b MoE, Llama
  • Machine Learning: NumPy, SciPy, Pandas, Scikit-Learn, TensorFlow, PyTorch, Keras, CUDA
  • NLP: NLTK, spaCy, Hugging Face Transformers
  • Data Engineering: Airflow, Spark, Hive, Hadoop, MapReduce
  • Ops: MLflow, Kubeflow, Docker, Kubernetes
  • Visualization: Matplotlib, Seaborn, Plotly, ggplot, Tableau
  • Web: Flask, Scrapy, Beautiful Soup, Selenium

Cloud: AWS (S3, EC2, Lambda), Snowflake
Tools and OS: PowerBI, Tableau, Jupyter Notebook, VS Code, WEKA, RStudio, Databricks, Git, GitHub Actions, macOS, Windows, Linux

Experience

  1. CORtracker 360

    • Data Analyst
      Jun 2025 - Present · 3 mos · United States · Current

      • Developed scalable data pipelines and reporting solutions with APIs, Tableau/PowerBI interactive dashboards, and Python (Pandas, NumPy) automation, cutting reporting timelines by 86% and enabling real-time analytics in high-volume environments
      • Performed statistical analyses with K-means clustering, seasonality modeling with time-series techniques (ARIMA), and A/B testing in Python (Scikit-learn, Statsmodels), increasing creator participation by 13% and optimizing engagement strategies
      • Created Tableau/PowerBI dashboards with ML insights (XGBoost) reducing reporting time 35%; maintained KPI dashboards via SQL/Python automation for metric tracking and decisions in Snowflake
      • Optimized ETL pipelines using SQL, Snowflake warehousing, Airflow orchestration, and ML anomaly detection (Autoencoders), improving data quality 30% for cloud initiatives in AWS Redshift
      • Partnered with 3 engineers to define requirements and build SQL/DBT pipelines with PySpark, boosting transformation speed 30%; developed SQL/PySpark/DBT workflows with ML preprocessing (Pandas/NumPy) for AWS Redshift analytics
      • Led A/B testing frameworks with SQL and Python (Statsmodels, SciPy) to evaluate feature impacts, resulting in 15% uplift in user retention; mentored junior analysts on experimental design in Agile environments

  2. Community Dreams Foundation · Full-time

    • Machine Learning Engineer
      Jul 2024 - Jun 2025 · 1 yr · United States

      ◦ Architected an HR tool using BERT, RoBERTa, and Sentence-BERT embeddings to match resumes with job descriptions, cutting manual screening by 89% (from 10 hours to 1 hour per job opening) and speeding up hiring by 40% through context-aware ranking
      ◦ Built a cloud-native pipeline with Python, FastAPI, and Kubeflow on Kubernetes for automated interview scheduling, achieving 92% candidate selection precision, measured in a pilot with 50 companies
      ◦ Automated personalized rejection emails with sentiment-aware templates (VADER score >0.6), handling 200+ weekly communications and reducing admin work by 90% (from 15 hours/week to 1.5 hours/week) while ensuring empathetic, bias-free communication
      ◦ Designed a RAG chatbot with Mistral on AWS SageMaker, achieving 90% response relevance as measured by BLEU and ROUGE-L scores against human answers on a benchmark set of 500 historical support cases, reducing ticket escalations by 25%

  3. CGI · Full-time

    • Associate Software Engineer
      Sep 2020 - Jun 2022 · 1 yr 10 mos · Bengaluru, Karnataka, India

      ◦ Developed a real-time ingest layer using Kafka Connect to capture 8 sensor data streams (400 events/second) from factory equipment, reducing data availability lag from overnight batch to <5 minutes for maintenance teams
      ◦ Wrote and maintained 15+ Python ETL scripts to process daily Shell refinery data files (CSV, JSON) into a centralized SQL data warehouse, enabling dashboard KPIs previously unavailable to operating teams
      ◦ Led migration of 52 legacy servers to AWS EC2 (t3.xlarge instances) using Terraform, reducing monthly costs by $18k and ensuring 99.9% uptime over 6-month period
      ◦ Redesigned Databricks medallion architecture by implementing automatic schema validation, data quality checks, and incremental processing patterns, reducing job failures from 12 weekly incidents to 3 (75% decrease) and cutting average recovery time from 4 hours to 45 minutes for Shell’s refinery sensor data lake
      ◦ Achievement: Awarded Best Employee for Q4 2021 for exceptional contributions to project efficiency and innovation, leading 30% of the overall data migration effort

  4. ImbueDesk ENS Pvt Ltd · Full-time

    • Machine Learning Engineer
      May 2018 - Aug 2020 · 2 yrs 4 mos · Hyderabad, Telangana, India · On-site

      ◦ Developed a facial expression recognition system using OpenCV and TensorFlow, achieving 97% accuracy on the FER2013 dataset
      ◦ Designed an image processing pipeline with Tesseract OCR for vehicle ID recognition (50k plates/day), orchestrated with Kubernetes
      ◦ Created and deployed predictive maintenance dashboards with Python visualization tools on AWS Beanstalk that reduced equipment downtime by 28%
      ◦ Built a Kafka-based image processing pipeline processing 35MB/hour, reducing processing latency from 1.2s to 0.4s per image with a 3-node consumer topology; implemented back-pressure handling for peak traffic periods (7AM-9AM), when processing volume increased by 300%

Education

  1. Stevens Institute of Technology · Master's degree, Machine Learning · Sep 2022 - May 2024

    Grade: 3.9

  2. Sreenidhi Institute of Science and Technology · Bachelor of Technology, Information Technology · 2016 - 2020

Skills

Other

Algorithms, Terraform, Apache, MySQL, LangChain, Object-Oriented Programming (OOP), Retrieval-Augmented Generation (RAG), Problem Solving, Language Modeling, Snowflake, Data Structures, C (Programming Language), Transformers, Programming, C++, Pattern Recognition, Python (Programming Language), Fine Tuning, Java, TensorFlow, Kubernetes, Python, AWS, Pandas, scikit-learn, XGBoost, Airflow, Cloud, FastAPI, Kubeflow, SageMaker, Kafka, EC2, Mathematics, PostgreSQL, MariaDB, LlamaIndex, Pinecone, Chroma, NLTK, spaCy, Hugging Face, MLflow, Docker, Matplotlib, Seaborn, Plotly, Tableau, Flask, Selenium, S3, Lambda, VS Code, RStudio, Git, GitHub, Linux

At a glance

Experience: 6 years
Currently at: CORtracker 360
Based in: United States
Studied at: Stevens Institute of Technology
Roles: 4
Skills: 57

This profile is based on publicly available information. Sai Chaitanya is not affiliated with or endorsed by Clera.