SVC—009 ● LIVE AI AUDIO APR 2025

Interview
Intelligence

AI Audio Analyst

Drop in an interview recording, get a full candidate report in seconds. Produces multi-dimensional scoring across technical depth, communication clarity, and problem solving — with radar charts, behavioral hesitation signals, topic sentiment, and a timestamped interview timeline. Built on a two-stage AI pipeline using OpenAI Whisper and LLaMA 3.1 8B, orchestrated by a FastAPI backend.

5
SCORING DIMS
<2min
REPORT TIME
95%+
TRANSCRIPT ACC
OpenAI Whisper LLaMA 3.1 8B Groq API FastAPI Next.js React Python
// UI PREVIEW — CANDIDATE ANALYSIS DASHBOARD
INTERVIEW INTELLIGENCE Audio Analysis
Analysis complete
CANDIDATE EVALUATION
TECH DEPTH COMM PROB. SOLV SYS HESIT.
BEHAVIORAL ANALYSIS
Confidence70
Clarity80
"Yes, I think as soon as I, you know, again, just making sure I understand the question."
TOPIC SENTIMENT
Collaboration 90% AI agent 80% Problem 70% gRPC -20% Timeouts -80%
INTERVIEW TIMELINE
  • 00:00–
    00:12
    Introduction to technical interview
  • 00:12–
    00:27
    Expectations for AI coding agent
  • 00:27–
    00:39
    Collaborative problem-solving
  • 00:39–
    00:50
    Understanding the gRPC service
  • 00:50–
    01:06
    Identifying the problem
AUDIO INPUT

The Pipeline

USER
NextJS
FastAPI
STAGE 1: TRANSCRIPT GENERATION (LOCAL GPU)
Audio(*.mp3)
Transcript
STAGE 2: SUMMARIZATION (GROQ API)
Transcript(*.txt)
LLaMA 3.1 8B
JSON
Summarized JSON
with scores for UI
STEP 01
Upload
User uploads an MP3 via the Next.js frontend, which sends it to FastAPI via REST.
STEP 02
Transcription
FastAPI passes audio to OpenAI Whisper running on local GPU. Outputs raw .txt transcript.
STEP 03
Summarization
Transcript sent to LLaMA 3.1 8B via Groq API. Extracts scores, hesitation signals, sentiments.
STEP 04
JSON Response
LLaMA returns structured JSON with all numbered scores and categorized data, ready to render.
STEP 05
UI Render
FastAPI sends JSON back to Next.js. Dashboard renders radar charts, bars, sentiment chips, timeline.
03 // STACK

Built with

TRANSCRIPTION
OpenAI Whisper
Runs locally on GPU. Converts MP3 audio to text with 95%+ accuracy across accents and noise conditions.
SUMMARIZATION
LLaMA 3.1 8B
Served via Groq API for ultra-fast inference. Extracts structured evaluation data from raw transcripts.
BACKEND
FastAPI + Python
Orchestrates the two-stage pipeline. Handles file upload, processing, and structured JSON response.
FRONTEND
Next.js + React
Renders the full analytics dashboard — radar charts, behavioral progress bars, sentiment chips, and interview timeline.
INFERENCE API
Groq API
Provides millisecond-latency LLaMA inference, drastically cutting report generation time vs. local inference.
DATA FORMAT
Structured JSON
All scores, sentiments, timestamps, and quotes returned in a consistent schema. Directly consumed by the React dashboard.
See the rest
of the deployments.
← ALL PROJECTS