Founding AI Engineer at Vavi Labs

Building AI-native products, vertical workflow apps, creative tools, dev tools, fine-tuned models, and technical education systems at Vavi Labs.

Role: Founding AI Engineer
Company: Vavi Labs
Location: Mangalore, India

Key Achievements

  • Building and beta testing Arbiter, a CLI-first decision-intelligence platform that scores engineering decisions using repo, ticket, and incident context
  • Built BatSwing, a PWA and computer-vision cricket coaching product that currently processes a few dozen reports weekly for one academy while the analysis pipeline is being tuned
  • Building Creative Collab OS, a beta creative AI studio with workflow graphs, direction checkpoints, taste memory, and multimodal artifact generation
  • Built Tech Abstractions, an AI/ML interview-prep platform with a few dozen users on the waitlist
  • Created dev tools, plugins for coding agents, illustrated explainers, and corporate training material for agentic AI, MLOps, and LLM infrastructure

Overview

At Vavi Labs, I build AI-native products, vertical workflow apps, creative tools, dev tools, plugins for coding agents, fine-tuned models, and technical education systems for engineers and AI teams.

The work spans five connected areas:

  • Decision intelligence and governance tools that make AI-assisted engineering choices auditable, reviewable, and grounded in company context
  • Vertical AI products that combine lightweight capture workflows, async workers, computer vision, and human-readable reports
  • Creative AI systems with workflow graphs, checkpoints, reusable taste profiles, streaming outputs, and multimodal artifacts
  • Dev tools, plugins, and workflows for coding agents
  • Fine-tuned models, corporate training material, and illustrated technical education

Arbiter

Arbiter is a CLI-first decision-intelligence platform for engineering teams, currently in beta testing.

The product helps teams evaluate architecture and implementation decisions using company-specific context from GitHub, Jira, PagerDuty, prior decisions, and incident history. It is designed around a shared API-as-truth architecture, tenant-isolated audit trails, deterministic validation, LLM confidence scoring, and Slack-based async review loops.

The technical goal is to make AI-assisted engineering decisions visible, reviewable, and grounded in local organizational context rather than generic model output.
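As a rough illustration of how deterministic validation and LLM confidence scoring can work together, the sketch below runs rule-based checks first and only then uses the confidence score to decide whether a human review is needed. It is not Arbiter's actual code: the `Decision` fields, the `REVIEW_THRESHOLD` value, and the outcome labels are all hypothetical, and the model call that would produce `llm_confidence` is not shown.

```python
from dataclasses import dataclass, field

# Everything here is a hypothetical illustration, not Arbiter's real schema.
REQUIRED_FIELDS = ("title", "rationale", "linked_tickets")
REVIEW_THRESHOLD = 0.75  # assumed cut-off below which a human reviews the decision


@dataclass
class Decision:
    title: str
    rationale: str
    linked_tickets: list[str] = field(default_factory=list)
    llm_confidence: float = 0.0  # score in [0, 1] produced by a model call (not shown)


def validate(decision: Decision) -> list[str]:
    """Deterministic checks: the same input always yields the same errors."""
    errors = []
    for name in REQUIRED_FIELDS:
        if not getattr(decision, name):
            errors.append(f"missing {name}")
    if not 0.0 <= decision.llm_confidence <= 1.0:
        errors.append("llm_confidence out of range")
    return errors


def route(decision: Decision) -> str:
    """Reject invalid records first, then use confidence to pick a review path."""
    if validate(decision):
        return "rejected"
    if decision.llm_confidence < REVIEW_THRESHOLD:
        return "needs_human_review"
    return "accepted"
```

Running the deterministic checks before looking at the score keeps the model out of the pass/fail decision for malformed records, so an audit trail can record exactly which rule failed.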

BatSwing

BatSwing is a cricket academy product that turns batting clips captured on a mobile phone into branded, parent-facing assessment reports.

The product combines a mobile-first PWA capture flow, video quality checks, MediaPipe-based pose and phase analysis, async report generation, academy roster workflow, WhatsApp-ready summaries, bilingual report language, and consent-aware player media handling.

It is currently processing a few dozen reports weekly from one academy while I tune the analysis pipeline, reference-range comparisons, and parent-facing coaching report format.
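The reference-range comparison mentioned above can be sketched as follows, assuming three 2D landmark coordinates (for example shoulder, elbow, and wrist) have already been extracted from a pose model. The function names and the `FRONT_ELBOW_RANGE` values are hypothetical placeholders, not BatSwing's real thresholds.

```python
import math

# Hypothetical reference range in degrees; real ranges would come from coaching data.
FRONT_ELBOW_RANGE = (90.0, 150.0)


def joint_angle(a, b, c):
    """Angle in degrees at point b between segments b->a and b->c; points are (x, y)."""
    raw = math.degrees(
        math.atan2(c[1] - b[1], c[0] - b[0]) - math.atan2(a[1] - b[1], a[0] - b[0])
    )
    angle = abs(raw)
    # Fold reflex angles back into the 0..180 range used for joint flexion.
    return 360.0 - angle if angle > 180.0 else angle


def compare_to_range(value, low, high):
    """Label a measurement relative to a reference range (bounds inclusive)."""
    if value < low:
        return "below"
    if value > high:
        return "above"
    return "within"


if __name__ == "__main__":
    shoulder, elbow, wrist = (0.0, 1.0), (0.0, 0.0), (1.0, 0.0)
    angle = joint_angle(shoulder, elbow, wrist)
    print(round(angle, 1), compare_to_range(angle, *FRONT_ELBOW_RANGE))
```

A label such as "within" or "above" is easier to turn into parent-facing report language than a raw angle, which is the reason for separating measurement from comparison in this sketch.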

Creative Collab OS

Creative Collab OS is a creative AI studio for creating stand-up routines, songs, comics, sitcoms, and memes.

It is built around LangGraph/FastAPI workflow modes, reference voices, direction checkpoints, reusable taste profiles, long-lived project context, artifact streaming, and multimodal generation. The system uses checkpointed ideation and revision loops so creative work can compound across runs instead of resetting with each prompt.
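The idea of work compounding across runs can be illustrated with a minimal checkpoint loop. This sketch uses a plain JSON file as a stand-in for the real persistence layer and does not use LangGraph's own checkpointing; the state keys (`taste_notes`, `drafts`) and function names are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_checkpoint(path: Path) -> dict:
    """Return the saved project state, or a fresh state if no checkpoint exists."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"taste_notes": [], "drafts": []}


def save_checkpoint(path: Path, state: dict) -> None:
    """Persist the project state so the next run can resume from it."""
    path.write_text(json.dumps(state, ensure_ascii=False, indent=2), encoding="utf-8")


def run_revision(path: Path, draft: str, direction_note: str) -> dict:
    """One revision pass: load prior context, add to it, and checkpoint the result."""
    state = load_checkpoint(path)
    state["taste_notes"].append(direction_note)
    state["drafts"].append(draft)
    save_checkpoint(path, state)
    return state


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        checkpoint = Path(tmp) / "project.json"
        run_revision(checkpoint, "Draft 1 of the opening bit", "drier delivery")
        state = run_revision(checkpoint, "Draft 2 of the opening bit", "shorter setup")
        print(len(state["drafts"]), state["taste_notes"])
```

Because each pass starts from the saved state rather than an empty prompt, direction notes from earlier runs remain available to later ones.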

Creative Collab OS is currently in beta testing, with a workflow for each of these formats.

Tech Abstractions

Tech Abstractions is an AI/ML interview-prep platform focused on production reasoning, system design, LLM infrastructure, and applied practice.

The platform combines structured learning paths, interview-style questions, expert walkthroughs, workbook-style exercises, and course-linked practice. It currently has a few dozen users on the waitlist.

Dev Tools and Illustrated Explainers

I build dev tools, plugins for coding agents, and repeatable workflows for agentic AI engineering and production MLOps.

The agentic AI engineering toolkit packages senior engineering judgment into coding-agent workflows for Claude Code, Codex, and Gemini CLI. It includes 24 skills, 5 specialist agents, and 11 commands across strategy, architecture, reliability, production readiness, security, human-in-the-loop workflows, context engineering, multi-agent orchestration, eval design, observability, and handoff artifacts.

The production MLOps toolkit does the same for ML systems: 11 skills, 4 specialist agents, and 7 commands covering problem framing, ML system design, data lineage, feature platform design, training and evaluation, model deployment, monitoring, governance, and production readiness. The tools generate artifact-grade outputs such as architecture reviews, evaluation scorecards, rollout-readiness reviews, governance checklists, and project handoffs.

The public education layer includes illustrated explainers on coding agents, LLM inference, RLHF, statistics for MLOps, and related production AI topics. These explainers translate infrastructure-heavy AI concepts into visual, interactive learning experiences.

Fine-tuning Projects

I use fine-tuning projects to test the full loop from dataset design to training, evaluation, qualitative analysis, and public model publishing.

  • Sitcom Scriptwriter - fine-tuned Gemma-3 1B for The Office-style screenplay generation using a custom reasoning-to-screenplay dataset. The pipeline compares base, SFT, and RFT models. Training samples include structured creative reasoning traces: storyline goal, character objectives, character dynamics, writer's-room meta reasoning, comedy engine, beat sheet, talking-head strategy, and final screenplay. The RFT stage uses LLM-as-judge rewards to optimize for character voice, humor, pacing, format, and multi-step comedic setup/payoff.

  • Feynman-style Kannada Physics Tutor - built a pipeline that combines low-resource language modeling with domain reasoning for Kannada physics explanations. The staged approach starts from Gemma-3 1B, adds Kannada fluency SFT, then physics-domain SFT for Feynman-style explanations, and finally RAG grounding for more reliable factual answers. The system uses Kannada physics reasoning datasets, step-by-step explanation targets, LLM-as-judge evaluation, quantitative score tracking, qualitative output analysis, and published Hugging Face datasets/models.
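To make the LLM-as-judge reward in the Sitcom Scriptwriter project more concrete, here is a minimal sketch of collapsing per-criterion judge scores into one scalar reward. The criteria names follow the list above, but the weights, the 0 to 10 scoring scale, and the `reward` function are assumptions for illustration, not the actual training configuration.

```python
# Hypothetical rubric weights; the criteria come from the text, the numbers do not.
WEIGHTS = {
    "character_voice": 0.30,
    "humor": 0.30,
    "pacing": 0.15,
    "format": 0.10,
    "setup_payoff": 0.15,
}


def reward(judge_scores: dict[str, float], max_score: float = 10.0) -> float:
    """Weighted average of judge scores, normalized to the range [0, 1]."""
    missing = set(WEIGHTS) - set(judge_scores)
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    total = 0.0
    for name, weight in WEIGHTS.items():
        # Clamp each score so one malformed judge output cannot dominate the reward.
        clamped = min(max(judge_scores[name], 0.0), max_score)
        total += weight * clamped
    return total / (max_score * sum(WEIGHTS.values()))
```

Normalizing by the weight sum keeps the reward in [0, 1] even if the weights are later retuned and no longer add up to one.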

Corporate Training Material

I also build corporate training material that turns the same engineering work into teachable systems for technical and leadership audiences.

The training library covers agentic AI for leaders, engineering AI agents, MLOps production systems, harness engineering, and LLM inference engineering. The material is structured as course outlines, module-based curricula, slide decks, assessments, and workbook-style learning artifacts.

The technical tracks cover agent architectures, memory and context engineering, tool use, multi-agent coordination, evaluation frameworks, guardrails, observability, security, cost/latency optimization, ML problem framing, production monitoring, incident response, and LLM inference topics such as KV cache, batching, memory bottlenecks, throughput, compression, and serving economics.

Engineering Themes

Across these projects, the recurring engineering themes are evaluation-first AI systems, cost-aware product architecture, human-in-the-loop review, reliable developer workflows, and practical education for production AI teams.

Technologies Used

LLMs, Agentic AI, LangGraph, FastAPI, MediaPipe, MLOps, Fine-tuning, Evaluation, Coding Agents, Corporate Training, Product Engineering
