I'm a GenAI/ML Engineer with 5+ years of experience building production-grade data infrastructure that powers real ML systems. I hold a Master of Science in Computer Systems Analysis from Pace University and have built my career specializing in end-to-end MLOps on AWS, from high-throughput feature ingestion pipelines using SageMaker and AWS Glue to RAG framework deployment and LLM guardrail systems.
My background spans financial and healthcare data at scale, with hands-on experience reducing model inference latency by 50%, eliminating data drift, and compressing data-to-model lifecycle times by 40%. I've worked across Spring Health, Discover Financial Services, and Deloitte: clean data in, reliable models out, monitored in production.
Pace University
New York, NY
GPA: 3.92
Coursework: Data Analytics, Advanced Machine Learning, Fundamentals of Information Security, Software Design Quality, Web Application Development, Information Visualization, Usability Engineering, and Social Media Analytics.
Vellore Institute of Technology (Business School)
Vellore, India
CGPA: 8.25/10
Scalable LLM & RAG Infrastructure: Architected high-throughput feature ingestion pipelines using AWS Glue and Amazon SageMaker, processing 5M+ records into optimized Parquet formats to power a RAG (Retrieval-Augmented Generation) framework, reducing model inference latency by 50%.
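The core layout idea behind that ingestion work, partitioning records and writing them in fixed-size batches for fast downstream reads, can be sketched without any AWS dependencies. This is a minimal, library-free illustration; the production pipeline used AWS Glue and SageMaker writing Parquet, and the field names here are hypothetical.

```python
from collections import defaultdict

def partition_records(records, partition_key, batch_size):
    """Group records by a partition key and split each group into
    fixed-size batches -- the same layout idea behind writing a
    partitioned Parquet dataset for efficient downstream scans."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[partition_key]].append(rec)
    # Split each partition into batches of at most batch_size records.
    return {
        key: [recs[i:i + batch_size] for i in range(0, len(recs), batch_size)]
        for key, recs in groups.items()
    }

# Example: partition synthetic records by ingest date.
records = (
    [{"date": "2024-01-01", "id": i} for i in range(5)]
    + [{"date": "2024-01-02", "id": i} for i in range(3)]
)
out = partition_records(records, "date", batch_size=2)
# 5 records at batch_size 2 -> 3 batches; 3 records -> 2 batches.
```

Partitioning on a query-predicate column (here, date) is what lets readers prune whole file groups instead of scanning everything.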
NLP Text Preprocessing & Feature Engineering: Built scalable text preprocessing pipelines using AWS Glue and SageMaker, transforming 5M+ raw clinical records into tokenized, embedding-ready Parquet formats. Reduced data preparation overhead by 40%, accelerating downstream LLM training and inference workflows.
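The shape of that preprocessing step, normalizing raw text and mapping tokens to integer ids ready for embedding, looks roughly like the following. This is a toy sketch with a trivial regex tokenizer and a growable vocabulary; the production version handled clinical text at scale on Glue/SageMaker.

```python
import re

def preprocess(text, vocab):
    """Lowercase, strip punctuation, tokenize on alphanumeric runs,
    and map each token to an integer id, assigning new ids on first
    sight -- the basic shape of an embedding-ready encoding step."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [vocab.setdefault(tok, len(vocab)) for tok in tokens]

vocab = {}
ids = preprocess("Patient reports mild headache; patient stable.", vocab)
# "patient" appears twice and maps to the same id both times.
```

Keeping the vocabulary external to the function is what lets id assignments stay consistent across millions of records processed in separate batches.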
Enterprise MLOps & Orchestration: Developed automated CI/CD/CE pipelines using SageMaker Pipelines and AWS Step Functions, ensuring 100% reproducibility of LLM prompts, training datasets, and feature sets across development and production environments.
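One way to make reproducibility concrete is to fingerprint every run's inputs, prompt, dataset, and feature set, with a deterministic hash, so dev and prod runs can be compared byte-for-byte. A minimal sketch of that idea (the function name and payload shape are illustrative, not the actual pipeline code):

```python
import hashlib
import json

def artifact_fingerprint(prompt, dataset_rows, feature_names):
    """Deterministically hash a run's prompt, dataset, and feature set.
    Sorting and sort_keys=True make the digest order-independent, so
    the same logical inputs always yield the same fingerprint."""
    payload = json.dumps(
        {
            "prompt": prompt,
            "data": sorted(str(row) for row in dataset_rows),
            "features": sorted(feature_names),
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()

fp1 = artifact_fingerprint("Summarize:", [1, 2], ["age", "bmi"])
fp2 = artifact_fingerprint("Summarize:", [2, 1], ["bmi", "age"])  # same inputs, reordered
```

Recording such a fingerprint as pipeline metadata (e.g. a SageMaker Pipelines parameter or tag) is what lets a production run be pinned to the exact artifacts it was built from.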
Credit Risk & Fraud Analytics: Built SQL and PySpark anomaly detection models to flag suspicious transaction patterns across 10M+ monthly credit card records, reducing false positive rates by 22% and supporting near-real-time fraud intervention workflows.
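The simplest form of that anomaly logic is a z-score rule: flag any transaction amount far from the mean in standard-deviation units. The sketch below is a stdlib stand-in for illustration; the production rules ran in SQL and PySpark over 10M+ monthly records with richer features.

```python
from statistics import mean, stdev

def flag_anomalies(amounts, z_threshold=3.0):
    """Return amounts whose z-score exceeds z_threshold -- a minimal
    statistical stand-in for distributed anomaly-detection rules."""
    mu, sigma = mean(amounts), stdev(amounts)
    return [a for a in amounts if sigma and abs(a - mu) / sigma > z_threshold]

amounts = [20, 25, 22, 19, 24, 21, 23, 5000]
suspicious = flag_anomalies(amounts, z_threshold=2.0)
```

Tuning the threshold is exactly the false-positive lever: raising it suppresses noisy flags at the cost of missing smaller anomalies.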
Complex Financial ETL: Designed AWS Glue and PySpark pipelines to ingest and standardize 3M+ records from disparate source systems, improving data readiness for actuarial risk modeling by 30%.
Business Intelligence & Reporting: Developed Tableau and Power BI dashboards tracking delinquency rates, charge-off trends, and approval rate KPIs, giving risk and product leadership a single source of truth for lending strategy decisions.
Snowflake Data Governance: Implemented zero-copy cloning to enable secure cross-functional data access across risk, compliance, and product teams, reducing storage overhead by 60% while maintaining strict SOX and Basel III audit controls.
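Zero-copy cloning is a single DDL statement in Snowflake: the clone shares the source table's underlying micro-partitions, so no data is physically copied until either side is modified, which is where the storage savings come from. A small helper composing that statement (the table names here are hypothetical):

```python
def clone_statement(source_table, clone_name):
    """Compose a Snowflake zero-copy CLONE statement. The clone is
    metadata-only at creation time; storage diverges only as rows
    are later modified on either the source or the clone."""
    return f"CREATE TABLE {clone_name} CLONE {source_table}"

stmt = clone_statement("risk.prod.transactions", "risk.sandbox.transactions_clone")
```

In practice the clone lives in a separate schema with its own access grants, which is what makes it useful for handing compliance or product teams a safe, auditable copy.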
Stakeholder Data Translation: Partnered with risk, compliance, and product teams to convert business requirements into analytical frameworks, delivering ad-hoc and recurring reports that directly informed credit policy adjustments.