Kunj Rathod
CS Researcher & AI Engineer at the University of Utah. Building AI systems that range from HIPAA-compliant hospital platforms to spatial memory for embodied agents and materials-discovery pipelines.
Experience & Education
From hospital AI platforms and legal research tools to aerospace materials discovery and embodied agents
- Building scalable cloud solutions for distributed data systems on the Azure Data team.
- Focusing on full-stack software development and distributed systems within the Azure ecosystem.
- Built and deployed a HIPAA-compliant AI chat platform for 90+ hospital executives using React/TypeScript, Flask middleware, and AWS Bedrock microservices with event-driven Lambda orchestration.
- Shipped 6 full-stack features across 4 sprints; integrated AWS Bedrock Agents, Knowledge Bases, and Guardrails for production clinical workflows.
- Reduced inference latency by 40% and data query speed by 60% via Bedrock pipeline optimization, API caching, and a DynamoDB–RDS hybrid database strategy.
- Implemented token-streaming LLM responses (p95 <200ms TTFT) with resilient fallback handling and distributed session persistence for 1,000+ conversations.
- Integrated interactive data visualization tools into the LLM chat interface enabling real-time analytics on hospital data.
- Built a multi-agent, graph-augmented pipeline to extract and normalize material-property data from 1,000+ materials-science papers into a physics-aware graph for automated Ashby plot generation.
- Developed a constraint-based 'design region' engine (temperature, creep, pressure limits) and benchmarking suite to identify feasible materials for extreme aerospace environments.
- Explored LLMs and multi-agent AI to streamline knowledge sharing across interdisciplinary stakeholders including engineers, scientists, and DoD partners.
- Built Ref-RAG, a custom RAG chatbot using LangChain and Chainlit to extract structured information from large unorganized PDF datasets for materials researchers.
- Scaled hybrid legal-document retrieval to 10M+ indexed Indian legal documents (statutes, court orders), supporting 5,000+ daily queries.
- Improved retrieval accuracy by 28% and reduced hallucinations by 35% via hybrid RAG (dense vectors + BM25 + reranking) and context-grounding optimizations for Legal-NER tasks.
- Built production ETL ingesting 500k+ documents/week and benchmarked 8 LLM families on 4 legal benchmarks including LegalBench and NyayaAnumana.
- Used the benchmark analysis to guide model-routing decisions, reducing projected inference spend by $50k+/year.
- Co-authored a comparative analysis paper synthesizing insights from 15+ research papers on legal AI.
- Led development of BioGraphRAG: a Graph Retrieval-Augmented Generation platform combining biomedical knowledge graphs with LLMs for explainable biomedical Q&A.
- Engineered distributed GraphRAG system managing 1M+ biomedical entities (proteins, genes, diseases) integrating UniProt, AlphaFold, and RXNav with NebulaGraph.
- Improved factual accuracy by 40%; optimized graph traversal 3× through strategic caching and high-degree node pruning, achieving sub-500ms query latency at p95.
- Designed automated ETL pipelines processing 2M+ entity updates monthly with schema validation.
- Presented at an international AI panel attended by experts from India and the US, and received commendation for technical leadership.
- Spearheaded campus-wide outreach programs to drive adoption of Perplexity's AI-powered search platform among students, faculty, and university clubs.
- Onboarded 150+ Perplexity Pro users and supported their sustained long-term engagement.
- Supported the safety and well-being of a 200+ resident housing community through conflict mediation, crisis response, and student support services.
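The hybrid retrieval mentioned in the legal-search bullets above (dense vectors + BM25 + reranking) needs some way to merge two ranked lists before reranking. The source does not say how that was done. The sketch below uses reciprocal rank fusion purely as an illustrative assumption; the function name, the constant `k=60`, and the document IDs are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical IDs returned by a dense retriever and a BM25 retriever.
dense_hits = ["order_17", "statute_3", "order_42"]
bm25_hits = ["statute_3", "order_99", "order_17"]

fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
print(fused)
```

Documents that both retrievers rank highly rise to the top, and the fused list would then be passed to a reranker.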
Undergraduate scholarship awarded for societal impact through AI research and production systems in healthcare, legal-tech, and embodied AI. Funded by a $15M endowment from The Kahlert Foundation; recognizes students with a compelling track record of translating computing research into real-world societal benefit.
Relevant Coursework
Featured Projects
AI-driven systems, hackathon winners, and research tools at scale
- Built a fully autonomous recruiting backend with specialized AI agents (Enrichment, Scheduling, Interview, Evaluation) to manage the end-to-end hiring lifecycle, from GitHub sourcing to live candidate screening.
- Engineered complex integrations with Twilio for real-time voice AI interviews, Google Calendar for automated slot scheduling, and Slack/Resend for manager approvals and multichannel outreach.
- A full-stack, model-agnostic AI orchestrator embedded in my portfolio that lets users generate serverless AWS applications and deploy them directly to their own live AWS account.
- Generates highly structured React SPAs, Node.js Lambda functions, and SAM CloudFormation templates via Vercel AI SDK (OpenAI/Anthropic).
- Uses the AWS SDK for JavaScript and JSZip to dynamically package Lambda binaries, create S3 artifacts, and execute CloudFormation templates, streaming real-time logs to the user interface over Server-Sent Events (SSE).
- Full-stack deployment monitoring and incident response system tracking Vercel deployments, classifying build/runtime failures, and triggering Slack alerts with approval workflows.
- AI-assisted root-cause analysis with FastAPI and ChromaDB vector search over logs, generating structured fix suggestions for downstream coding agents.
- Real-time React/TypeScript dashboard for live metrics, incident status, and agent health; deployed on Vercel with CI/CD pipeline.
- Engineered a high-performance macOS desktop application using Rust and Tauri, delivering a zero-trust, local-only memory assistant with full data sovereignty — no cloud, no telemetry.
- Optimized on-device inference for LLMs (Llama 3.2) and VLMs (SmolVLM) with Metal-accelerated backends, achieving low-latency RAG on M-series Apple Silicon.
- Architected a real-time screen extraction pipeline using Apple Vision Framework for high-speed OCR and CLIP-based visual embeddings to reconstruct temporal context from screen snapshots.
- Designed a Graphiti-style Temporal Search Engine modeling semantic relationships across user activities, web sessions, and meeting transcripts, enabling proactive entity extraction and multi-hop reasoning.
- Implemented automated meeting intelligence with local Whisper-based transcription (Parakeet) and segmented audio processing integrated into the global memory index.
- Developed a Model Context Protocol (MCP) server for secure, local interoperability between the memory store and external AI agents or IDEs.
- Unified AI intelligence layer integrating Gmail, Google Calendar, Slack, and FNDR private memory to eliminate context switching — generates Smart Todos, schedules meetings via natural language, and retrieves personal context on demand.
- Deep integrations with GitHub and Apple Services; supports real-time voice interaction and autonomous multi-step workflow orchestration across the entire digital stack.
- Designed as a universal interface that shifts from a passive assistant to a proactive digital companion that anticipates user needs.
- iOS personal assistant with voice, chat, and image input integrating GPT-4o and Whisper APIs for context-aware responses with RAG-enhanced memory.
- Offline-first architecture with Firebase sync supporting real-time message streaming and persistent conversation history.
- Production-grade distributed GraphRAG system for healthcare professionals requiring trustworthy biomedical information retrieval.
- Integrated UniProt, AlphaFold, RXNav, and BioKG into a unified NebulaGraph store with automated ETL processing 2M+ entity updates monthly.
- Improved factual accuracy by 40%; optimized graph traversal 3× through caching and high-degree node pruning (sub-500ms at p95).
- Vehicle-to-Everything (V2X) traffic optimization platform combining V2V, V2I, and V2N communication for real-time adaptive traffic management.
- LSTM-based traffic flow prediction models with live SPaT signal data; full system stack from OBD-II hardware to cloud ML backend.
- AES-256 encrypted communication with rotating vehicle identifiers and edge-first architecture for privacy and ultra-low latency.
- Investment recommendation system combining DistilBERT-based sentiment analysis on financial news with DQN and PPO for portfolio optimization.
- Demonstrated measurable outperformance on backtested portfolio allocation tasks.
- Collaborative AI system with specialized agents (Analyst, Trader, Risk Advisor) using CrewAI and LangChain for real-time financial analysis.
- Designed inter-agent communication protocols enabling parallel analysis and consensus-driven output generation.
- Custom RAG chatbot for the STARS Lab to extract structured information from large, unorganized PDF corpora of materials-science research papers.
- Enabled researchers to query domain-specific knowledge across 1,000+ documents through a conversational interface.
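The BioGraphRAG bullets above credit caching and high-degree node pruning for the faster graph traversal. As a rough illustration only, and not the project's actual code, the sketch below runs a bounded breadth-first search over a toy in-memory graph. The graph contents, the `MAX_DEGREE` cutoff, and the function names are all hypothetical, and `functools.lru_cache` stands in for whatever cache sits in front of the real graph store.

```python
from collections import deque
from functools import lru_cache

# Hypothetical toy adjacency list; "hub" stands in for a high-degree node.
GRAPH = {
    "TP53": ("hub", "MDM2"),
    "MDM2": ("TP53", "hub"),
    "hub": ("TP53", "MDM2", "A", "B", "C"),
    "A": ("hub",),
    "B": ("hub",),
    "C": ("hub",),
}
MAX_DEGREE = 3  # assumed cutoff above which a node is not expanded


@lru_cache(maxsize=4096)
def neighbors(node):
    """Return a node's neighbors; cached so repeat lookups skip the store."""
    return GRAPH.get(node, ())


def k_hop(start, hops):
    """Collect nodes within `hops` edges of `start`, without expanding hubs."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue
        nbrs = neighbors(node)
        # Prune: keep a high-degree node in the result but do not fan out from it.
        if node != start and len(nbrs) > MAX_DEGREE:
            continue
        for nbr in nbrs:
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, depth + 1))
    return seen


result = k_hop("TP53", 2)
print(sorted(result))
```

Here the two-hop neighborhood of `TP53` includes `hub` but skips `A`, `B`, and `C`, which are reachable only through it. That is the trade-off pruning makes: less fan-out per query at the cost of some recall.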
Technical Skills
Languages, frameworks, databases, and tools I work with daily
Writing & Publications
Technical articles, research reports, and open knowledge sharing
Kunj's Substack
Co-authored with Niraj Kumar Singh (ML Engineer) · GMG Summer of Code
Full technical article presenting BioGraphRAG: system architecture, GraphRAG algorithm, node-degree performance analysis (low/mid/high-degree nodes), multi-stage answer enrichment pipeline integrating UniProt, AlphaFold, and RXNav, and future directions.
Kunj's Substack
Solo-authored technical article covering FlowVía, a V2X urban traffic optimization system. Details V2V, V2I, V2N protocols, DSRC and C-V2X standards, real-time speed recommendation algorithms, LSTM-based traffic flow prediction, data privacy/security design, and scalability challenges.
Comparative Analysis: LLM Families on Legal Benchmarks
Internal Technical Report · CourtEasy.ai / Nugen
Co-authored with team at CourtEasy.ai / Nugen
Co-authored comparative analysis of InLegalBERT, InLegalLLaMA, and GPT-4o-mini on LegalBench and NyayaAnumana benchmarks, synthesizing insights from 15+ research papers to inform production RAG workflow design and evaluation protocols.
Latest Blog Posts
Thoughts and insights on AI, cloud technologies, and software development
Lessons learned from developing BioGraphRAG and optimizing retrieval for complex medical knowledge graphs.
How to design collaborative AI systems with specialized agents for complex tasks like financial analysis.
Exploring DQN and PPO algorithms for portfolio optimization and investment risk assessment.
Get In Touch
Open to new opportunities and collaborations in AI and software engineering
I'm currently open to new opportunities and collaborations. Feel free to reach out!