Chef AI - ElevenLabs Worldwide Hackathon
AI Tinkerers - Taipei
Hackathon Showcase

Chef AI

The team includes an AI Engineer at Gogoro who builds RAG pipelines (PyTorch) and a Backend Engineer (Kafka, Go) who specializes in distributed systems.

4 members

Chef AI is a real-time, multimodal agent that turns cooking from passive recipe reading into active coaching. By combining continuous voice interaction with visual monitoring at regular intervals, the agent guides users through complex recipes (demonstrated with a French omelette) with professional precision. The working prototype manages bi-directional state: it listens to user queries, analyzes visual cues from the camera feed (such as butter foaming or egg texture), and responds in real time with context-aware advice. The project aims to digitize culinary intuition, make professional cooking skills widely accessible, and reduce food waste through proactive error prevention and gamified feedback.
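The combination of continuous voice input and interval-based visual monitoring described above can be illustrated with a small asyncio sketch. This is not the team's implementation: the function names, the event tuples, and the interval value are all assumptions, and the two sources are plain stand-ins for streamed speech-to-text and periodic image analysis.

```python
import asyncio

# Assumed sampling interval; kept short so the sketch finishes quickly.
FRAME_INTERVAL_S = 0.05


async def voice_source(queue, utterances):
    """Stand-in for streamed speech-to-text: push each utterance as it arrives."""
    for text in utterances:
        await queue.put(("voice", text))
        await asyncio.sleep(0.01)


async def frame_source(queue, observations):
    """Stand-in for periodic image analysis: push one observation per interval."""
    for label in observations:
        await asyncio.sleep(FRAME_INTERVAL_S)
        await queue.put(("frame", label))


async def run_session(utterances, observations):
    """Merge both input streams into one ordered event list."""
    queue = asyncio.Queue()
    events = []

    async def consumer():
        # A single consumer sees voice and vision events in arrival order.
        while True:
            event = await queue.get()
            if event is None:  # sentinel: both sources are finished
                break
            events.append(event)

    consumer_task = asyncio.create_task(consumer())
    await asyncio.gather(
        voice_source(queue, utterances),
        frame_source(queue, observations),
    )
    await queue.put(None)
    await consumer_task
    return events
```

Calling `asyncio.run(run_session(["next step"], ["butter foaming"]))` returns the merged events. Funneling both sources into one queue means the agent logic only has to reason about a single stream, which is one plausible way to keep voice and vision state consistent.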
Theme Alignment & Technical Stack
Theme Alignment
Chef AI aligns with the theme by turning the mobile device from a static tool into an autonomous agent that “sees” and “hears.” It bridges the gap between cloud intelligence and the physical kitchen by:
Turning Voices into Actions: Converting natural language queries into state changes using speech-to-text.
Turning Clouds into Eyes: Utilizing Vision Language Models (VLMs) to analyze physical world states (cooking progress) in real time.
Turning Tools into Teachers: Orchestrating these inputs to provide human-like, empathetic voice guidance and scoring.
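The three points above can be sketched as a small session object: a transcript changes state, a visual cue is checked against the current step, and the result is a line of spoken advice. Everything here is hypothetical and for illustration only, including the step list, the cue table, the advice wording, and the keyword matching, which stands in for whatever intent handling the language model performs in the real prototype.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical recipe steps and cue-to-advice table, for illustration only.
STEPS = ["melt butter", "pour eggs", "stir and shake", "roll and plate"]
CUE_ADVICE = {
    ("melt butter", "butter foaming"): "The butter is foaming. Pour in the eggs now.",
    ("melt butter", "butter browning"): "The butter is browning. Lower the heat.",
    ("stir and shake", "eggs dry"): "The eggs look dry. Take the pan off the heat.",
}


@dataclass
class CoachSession:
    step_index: int = 0

    @property
    def current_step(self) -> str:
        return STEPS[self.step_index]

    def on_transcript(self, transcript: str) -> str:
        """Voice to action: a recognized command changes the session state."""
        words = transcript.lower()
        if "next" in words:
            if self.step_index < len(STEPS) - 1:
                self.step_index += 1
                return f"Next step: {self.current_step}."
            return "That was the last step."
        if "repeat" in words:
            return f"Current step: {self.current_step}."
        return "Sorry, I did not catch that."

    def on_observation(self, cue: str) -> Optional[str]:
        """Vision to advice: speak only when the cue matters for the current step."""
        return CUE_ADVICE.get((self.current_step, cue))
```

Keying the cue table on the current step means the same observation can trigger advice at one stage and be ignored at another, which is the "context-aware" part of the guidance.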
Specific Technologies & Tools Used
Agent Intelligence & Vision: OpenAI GPT-4 (Reasoning & Conversation), OpenAI GPT-4V (Real-time Image Analysis).
Voice Orchestration: OpenAI Whisper API (Speech-to-Text), ElevenLabs API (Text-to-Speech).
Backend & Connectivity: Python FastAPI, WebSockets (for bi-directional real-time streaming).
Frontend Interface: Flutter (Cross-platform mobile UI), Camera & Audio stream management.
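With audio, camera frames, and advice all travelling over one WebSocket connection, the client and server need some way to tell the message kinds apart. The sketch below shows one possible JSON envelope. The message type names, field names, and base64 encoding are assumptions rather than the project's actual wire format, and the code uses only the Python standard library, so it is independent of FastAPI or Flutter.

```python
import base64
import json
from typing import Tuple, Union

# Assumed message kinds; the real prototype may use different names.
MESSAGE_TYPES = {"audio_chunk", "camera_frame", "advice"}


def encode_message(msg_type: str, payload: Union[bytes, str], seq: int) -> str:
    """Wrap a payload in a JSON envelope suitable for a text WebSocket frame."""
    if msg_type not in MESSAGE_TYPES:
        raise ValueError(f"unknown message type: {msg_type}")
    if isinstance(payload, bytes):
        # Binary data (audio, images) is base64-encoded to stay JSON-safe.
        body = {"encoding": "base64", "data": base64.b64encode(payload).decode("ascii")}
    else:
        body = {"encoding": "text", "data": payload}
    return json.dumps({"type": msg_type, "seq": seq, **body})


def decode_message(raw: str) -> Tuple[str, int, Union[bytes, str]]:
    """Parse an envelope back into (type, sequence number, payload)."""
    msg = json.loads(raw)
    if msg.get("type") not in MESSAGE_TYPES:
        raise ValueError(f"unknown message type: {msg.get('type')}")
    data = msg["data"]
    if msg["encoding"] == "base64":
        data = base64.b64decode(data)
    return msg["type"], msg["seq"], data
```

The sequence number lets the receiver drop stale camera frames or reorder advice. A binary WebSocket frame would avoid the base64 overhead, at the cost of a slightly more involved header format.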

ChatGPT 4V, VLM Model, ElevenLabs