Less is More

BUILDING AI SYSTEMS THAT SCALE


Integrated Neo4j for a live knowledge graph and Milvus for vector search. Deployed microservices via FastAPI and NGINX on GCP & AWS.

Implemented RAG pipelines with context persistence and chain-of-thought reasoning — enabling real-time, accurate responses and reducing hallucinations.

Performance

Real-time vector retrieval, dynamic prompt routing, and multi-agent orchestration ensure high-fidelity outputs and low latency across all system layers.

Engineering Agentic LLM Systems

Built Agentic AI Chatbot, a multi-agent mental health chatbot with 20+ agents orchestrated across 100+ nodes using Kafka and Redis Streams.

AGENTIC AI: Autonomous Intelligence Systems
Sense, Reason, Plan, Act
Core: Neural Agency

Real-Time Speech AI & TTS

Fine-tuned Whisper, Indic Conformer, and TTS models on A100 GPUs. Developed real-time transcription services with low-latency, high-accuracy results.

Retrieval-Augmented Generation and Vector Search

Designed and implemented production-grade RAG pipelines powering real-time, context-aware conversational AI. Leveraged semantic search, embeddings, and multi-agent orchestration for dynamic and accurate information retrieval.

Vector DB

Milvus, Qdrant, FAISS

Frameworks

LangChain, LlamaIndex, LangGraph, mem0

LLMs

Meta Llama, Mistral, DeciLM, OpenAI, Claude, Gemini, Qwen

Infra

Kafka, Redis, FastAPI, GCP, Docker, AWS

83% Shorter Field Generation Time

Cut field generation time from 30 to 5 minutes through async pipelines and message brokers.
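As a rough illustration of the figure above, the sketch below checks the 30 to 5 minute arithmetic and shows the general shape of running independent generation calls concurrently with `asyncio`. The function names, the field names, and the simulated delay are hypothetical stand-ins; this is not the production pipeline, which also relied on message brokers.

```python
import asyncio


def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, as a percentage."""
    return (before - after) / before * 100.0


async def generate_field(name: str, delay: float = 0.01) -> str:
    # Hypothetical stand-in for one I/O-bound generation call (e.g. an LLM request).
    await asyncio.sleep(delay)
    return f"{name}:done"


async def generate_all(names: list[str]) -> list[str]:
    # Start every call at once; gather returns results in input order.
    return await asyncio.gather(*(generate_field(n) for n in names))


if __name__ == "__main__":
    print(round(percent_reduction(30, 5), 1))  # 83.3
    print(asyncio.run(generate_all(["summary", "history", "plan"])))
```

With concurrent execution, total wall-clock time approaches that of the slowest single call rather than the sum of all calls, which is the usual source of gains like this when the work is I/O-bound.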

Dynamic Vector APIs

Real-time indexing, semantic search & adaptive embeddings at scale.

20+ AI Agents

Orchestrated across 100+ microservices with LangGraph, Redis & Kafka.

Professional Journey

My journey through education, research, and professional experience in AI and machine learning.

Education

B.Tech in Computer Science and Engineering

Techno Engineering College, Banipur

2019 – 2023

Focused on AI, ML, and Software Engineering; completed projects in stock prediction, plant disease detection, and emotion analysis.

Central Board of Secondary Education (AISSCE)

Kendriya Vidyalaya, Mumbai

2018 – 2019

Central Board of Secondary Education (SSC)

Kendriya Vidyalaya, Mumbai

2016 – 2017

Experience

Applied AI Scientist & Founding Member

United We Care X Shunya Labs AI

06/2025 – Present

Fine-tuned large-scale Whisper, Indic Conformer, and TTS models using LoRA (PEFT) and Seq2SeqTrainer on A100 GPUs

Developed and deployed a real-time speech transcription service handling 250+ concurrent requests

Fine-tuned a Whisper ASR model (shunyalabs/pingala-v1-en-verbatim), achieving 2.94% WER and surpassing industry benchmarks

Contributed to Speech AI and Voice AI research, including dataset preparation, model optimization, and GPU utilization
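For readers unfamiliar with the WER metric cited above, here is a minimal sketch of how word error rate is commonly computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name and sample sentences are illustrative only, and this is not necessarily the evaluation setup behind the reported 2.94%; real evaluations usually apply text normalization first, which this sketch omits.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix seen
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```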

AI Engineer

United We Care

04/2024 – 06/2025

Built "Stella 2.0", a multi-agent mental health chatbot ecosystem with 15 autonomous agents across 60+ nodes

Developed Clinical Copilot for healthcare professionals using advanced prompt engineering and Chain-of-Thought methods

Managed deployment of microservices on AWS & GCP, using FastAPI and NGINX for scalable and reliable APIs

Implemented RAG & vector search pipelines with Redis and Milvus, reducing field generation time by 83%

Deployed multiple HuggingFace LLMs (Meta-Llama-3/3.1, DeciLM-7B, Mistral-7B) for production NLP tasks

AI Research Intern

United We Care

12/2023 – 03/2024

Conducted research on AI-driven mental health solutions, resulting in three international publications

Built POCs with Metarank, Docker, Kafka, Redis, and vector databases (Milvus, Qdrant, FAISS, ChromaDB)

Experimented with LLMs (Mixtral-8x7B, Meta-Llama-3/3.1, DeciLM-7B) using LangChain, LlamaIndex, and LangFlow

About Me

I'm an AI Engineer specializing in Speech AI, Voice AI, and Large Language Models (LLMs), passionate about building scalable, production-grade AI systems that bridge research and real-world applications. My work focuses on fine-tuning large-scale ASR, TTS, and LLM systems using LoRA (PEFT) and distributed training frameworks to achieve state-of-the-art (SoTA) performance.

I have led the development of real-time speech transcription and conversational AI systems, optimizing for low-latency inference, high concurrency, and GPU efficiency. I've built agentic AI ecosystems, retrieval-augmented generation (RAG) pipelines, and cloud-deployed AI microservices on AWS and GCP, integrating tools like LangChain, LangGraph, Redis, Neo4j, and Milvus to deliver reliable, intelligent automation at scale.

Driven by a balance of research rigor and engineering execution, I enjoy pushing the boundaries of what's possible with AI — from speech recognition to intelligent assistants — while actively contributing to open-source and research communities.

theboringai.work