AI Engineer · Toronto, ON

Bhaven Naik

LLM Infrastructure · RAG Systems · MLOps · AI Platform Engineering

Building production-grade LLM systems, RAG pipelines, and ML infrastructure. 3+ years delivering enterprise GenAI — from data ingestion to scalable inference in regulated and air-gapped environments.

Core stack

vLLM · LangChain · Hugging Face · Elasticsearch · FAISS · Sentence Transformers · FastAPI · PyTorch · TensorFlow · Airflow · MLflow · Docker

About

Building AI that ships to production

AI Engineer with 3+ years of production experience building and deploying end-to-end LLM applications, RAG systems, and scalable ML infrastructure, with deep expertise in Generative AI, LLM orchestration, retrieval-augmented generation, and MLOps. I bridge research and real-world impact across the full lifecycle: data ingestion, model development, scalable inference, monitoring, and operationalization in enterprise and air-gapped environments.

I'm particularly interested in designing robust AI platforms and cloud-native ML infrastructure, with a focus on reliable, scalable, maintainable solutions that work in regulated and constrained environments.

Based in downtown Toronto.

3+ years production AI

Enterprise RAG systems

Education

Master of Applied Computer Science

St. Francis Xavier University · 2020 – 2022

Bachelor of Computer Engineering

University of Mumbai · 2016 – 2020

Certifications

IBM Machine Learning Essentials

IBM · 2022

Experience

Where I've built things

▸ Architected a production-grade, deterministic RAG Q&A system using Elasticsearch, vLLM, Sentence Transformers, and Docling, deployed in secure air-gapped enterprise environments.
▸ Engineered a GraphRAG-powered due diligence platform using Neo4j knowledge graphs, integrating multi-document summarization and multi-source ETL via LLM pipelines, meaningfully reducing manual research and analyst time.
▸ Built full-stack GenAI applications with an Angular frontend and FastAPI backend, delivering AI interfaces to enterprise clients.
▸ Automated end-to-end ML training workflows using Python, Apache Airflow, and MLflow, enabling reproducible experiment tracking and one-click model promotion.
▸ Provisioned scalable cloud infrastructure via Terraform and Ansible on AWS, significantly improving deployment repeatability and reducing lead time through IaC standardization.
▸ Designed distributed RL environments using multi-GPU setups to accelerate training pipelines.

vLLM · Elasticsearch · Neo4j · FastAPI · Airflow · MLflow · Terraform · Docker · Kubernetes · AWS

Projects

Things I've built

Production systems, research projects, and open-source work. NDA-protected systems are described architecturally — the patterns are mine to share.

Production · Featured · NDA

Air-Gapped RAG Platform

Production Q&A system for regulated enterprise environments

Architected and deployed a production-grade, deterministic RAG system that runs entirely inside secure air-gapped enterprise environments, with zero external API calls. The pipeline combines Docling-powered document ingestion, semantic chunking, vector embedding via Sentence Transformers, and vLLM-served LLM inference behind FastAPI microservices.

vLLM · Elasticsearch · Sentence Transformers · Docling · FastAPI · Docker · GitLab CI

Production · Featured · NDA

GraphRAG Due Diligence Tool

Knowledge-graph-powered due diligence document analysis

Analysis platform for due diligence workflows, built on Neo4j knowledge graphs. It automates multi-source ETL and multi-document summarization via LLM pipelines, meaningfully reducing manual research and analyst time.

Neo4j · Python · FastAPI · Docker

Research · Featured

DCGAN Video Augmentation

GAN-based data augmentation for human action recognition

Research implementation of a Deep Convolutional GAN for Human Action Recognition data augmentation on the HMDB51 dataset. Scalable training pipeline with multi-GPU support, benchmarked against pre-trained PyTorchVideo classifiers.

PyTorch · PyTorch Lightning · PyTorchVideo · Python · GANs

ML Application

Diabetic Retinopathy Classifier

End-to-end medical AI with web deployment on AWS

Fine-tuned VGG16 to classify 5 severity grades of diabetic retinopathy from fundus images. Built a Flask web interface and deployed on AWS EC2 for clinical accessibility.

TensorFlow · Keras · VGG16 · Flask · OpenCV · AWS EC2

Skills

What I work with

Generative AI & LLMs

LangChain · LlamaIndex · vLLM · Hugging Face · Prompt Engineering · Fine-tuning

RAG & Search

Elasticsearch · FAISS · Sentence Transformers · Docling · GraphRAG · Neo4j

MLOps & Infrastructure

Airflow · MLflow · DVC · Docker · Kubernetes · Terraform · Ansible

Cloud & DevOps

AWS · Azure · GitLab CI/CD · FastAPI · Flask · Microservices

ML & Deep Learning

PyTorch · TensorFlow · Scikit-Learn · Reinforcement Learning · GANs

Data & Streaming

Kafka · PySpark · PostgreSQL · MongoDB · Redis · Hadoop

Contact

Let's build something

Working on something interesting in LLMs, RAG, or ML infrastructure? Always happy to connect with people building in this space — whether it's a technical question, a collaboration, or just a conversation.

Email: naikbhaven11@gmail.com

LinkedIn: in/bhaven-naik

GitHub: github.com/bhaven123

X / Twitter: @bhavennaik

AI Engineer

AI Research Assistant