SVC—009 ● LIVE AI AUDIO APR 2025

Interview
Intelligence

AI Audio Analyst

Drop in an interview recording, get a full candidate report in seconds. Produces multi-dimensional scoring across technical depth, communication clarity, and problem solving — with radar charts, behavioral hesitation signals, topic sentiment, and a timestamped interview timeline. Built on a two-stage AI pipeline using OpenAI Whisper and LLaMA 3.1 8B, orchestrated by a FastAPI backend.

SCORING DIMS

<2min

REPORT TIME

95%+

TRANSCRIPT ACC

OpenAI Whisper LLaMA 3.1 8B Groq API FastAPI Next.js React Python

// UI PREVIEW — CANDIDATE ANALYSIS DASHBOARD

INTERVIEW INTELLIGENCE Audio Analysis

Analysis complete

CANDIDATE EVALUATION

Technical Depth65

BEHAVIORAL ANALYSIS

Confidence70

Clarity80

"Yes, I think as soon as I, you know, again, just making sure I understand the question."

TOPIC SENTIMENT

Collaboration 90% AI agent 80% Problem 70% gRPC -20% Timeouts -80%

INTERVIEW TIMELINE

00:00–
00:12 Introduction to technical interview
00:12–
00:27 Expectations for AI coding agent
00:27–
00:39 Collaborative problem-solving
00:39–
00:50 Understanding the gRPC service
00:50–
01:06 Identifying the problem

AUDIO INPUT

02 // ARCHITECTURE

The Pipeline

USER

→

NextJS

↔

FastAPI

→

STAGE 1: TRANSCRIPT GENERATION (LOCAL GPU)

Audio(*.mp3)

→

Transcript

↓

STAGE 2: SUMMARIZATION (GROQ API)

Transcript(*.txt)

←

LLaMA 3.1 8B

←

JSON

Summarized JSON
with scores for UI

←

STEP 01

Upload

User uploads an MP3 via the Next.js frontend, which sends it to FastAPI via REST.

STEP 02

Transcription

FastAPI passes audio to OpenAI Whisper running on local GPU. Outputs raw .txt transcript.

STEP 03

Summarization

Transcript sent to LLaMA 3.1 8B via Groq API. Extracts scores, hesitation signals, sentiments.

STEP 04

JSON Response

LLaMA returns structured JSON with all numbered scores and categorized data, ready to render.

STEP 05

UI Render

FastAPI sends JSON back to Next.js. Dashboard renders radar charts, bars, sentiment chips, timeline.

03 // STACK

Built with

TRANSCRIPTION

OpenAI Whisper

Runs locally on GPU. Converts MP3 audio to text with 95%+ accuracy across accents and noise conditions.

SUMMARIZATION

LLaMA 3.1 8B

Served via Groq API for ultra-fast inference. Extracts structured evaluation data from raw transcripts.

BACKEND

FastAPI + Python

Orchestrates the two-stage pipeline. Handles file upload, processing, and structured JSON response.

FRONTEND

Next.js + React

Renders the full analytics dashboard — radar charts, behavioral progress bars, sentiment chips, and interview timeline.

INFERENCE API

Groq API

Provides millisecond-latency LLaMA inference, drastically cutting report generation time vs. local inference.

DATA FORMAT

Structured JSON

All scores, sentiments, timestamps, and quotes returned in a consistent schema. Directly consumed by the React dashboard.

InterviewIntelligence

The Pipeline

Built with

Interview
Intelligence