Music & Speech Detection

Detect music and speech

Overall: 90% music, 77% speech


Timestamp	Music	Speech
0.00s – 1.15s	95.6%	7.9%
1.15s – 2.11s	94.6%	3.2%
2.11s – 3.07s	91.4%	2.1%
3.07s – 4.03s	84.7%	1.6%
4.03s – 5.18s	79.3%	99%
5.18s – 6.14s	79.8%	99.9%
6.14s – 7.10s	80.3%	97.5%
7.10s – 8.06s	82.7%	100%
8.06s – 9.02s	83.1%	90.4%
9.02s – 10.18s	78.2%	98.2%
10.18s – 11.14s	74.6%	99.6%
11.14s – 12.10s	76.1%	95.3%
12.10s – 13.06s	81.4%	71.8%
13.06s – 14.02s	86%	33.5%
14.02s – 15.17s	90.2%	12.4%
15.17s – 16.13s	92.7%	6.1%
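
The per-segment scores above can be turned into coarse labels. The sketch below is only an illustration: the 50% threshold and the names `Segment`, `parse_row`, and `label` are assumptions, not something the page documents. Music and speech are scored independently, so a segment can be both (for example, a voiceover on top of a music bed).

```python
from typing import NamedTuple


class Segment(NamedTuple):
    start_s: float
    end_s: float
    music: float   # music score as a fraction in [0, 1]
    speech: float  # speech score as a fraction in [0, 1]


def parse_row(row: str) -> Segment:
    """Parse one tab-separated table row such as '4.03s – 5.18s<TAB>79.3%<TAB>99%'."""
    span, music, speech = row.split("\t")
    start, end = (float(t.strip().rstrip("s")) for t in span.split("–"))
    return Segment(start, end,
                   float(music.rstrip("%")) / 100,
                   float(speech.rstrip("%")) / 100)


def label(seg: Segment, threshold: float = 0.5) -> str:
    """Label a segment; the 0.5 threshold is an assumed cut-off, not the service's."""
    has_music = seg.music >= threshold
    has_speech = seg.speech >= threshold
    if has_music and has_speech:
        return "music+speech"
    if has_music:
        return "music"
    if has_speech:
        return "speech"
    return "neither"


rows = [
    "0.00s – 1.15s\t95.6%\t7.9%",
    "4.03s – 5.18s\t79.3%\t99%",
]
for row in rows:
    seg = parse_row(row)
    print(f"{seg.start_s:.2f}s to {seg.end_s:.2f}s: {label(seg)}")
```

With this cut-off, the rows from 4.03s to 13.06s come out as music plus speech, and the rest as music only.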

Raw JSON

{
  "deepfake_score": 0.9788,
  "utterances": [
    {
      "utterance_uid": "ec69b9e7-8fac-4b3e-a4da-ea9773d56aed",
      "text": "Track package.",
      "start_ms": 91620,
      "duration_ms": 960,
      "speaker": 2,
      "language": "en",
      "emotion": "Excited",
      "accent": "American",
      "deepfake_score": 1.1000000000000001
    },
    {
      "utterance_uid": "4003b94d-b15d-46b1-9276-80366dc178fc",
      "text": "Thank you. Did you say you'd like to place an order?",
      "start_ms": 3000,
      "duration_ms": 3800,
      "speaker": 1,
      "language": "en",
      "emotion": "Interested",
      "accent": "American",
      "deepfake_score": 0.9723
    },
    {
      "utterance_uid": "a1c82f3d-09e2-4f7a-b831-22de94a05c61",
      "text": "I need to check on my delivery, it's been two weeks.",
      "start_ms": 8200,
      "duration_ms": 4100,
      "speaker": 2,
      "language": "en",
      "emotion": "Frustrated",
      "accent": "American",
      "deepfake_score": 0.1042
    },
    {
      "utterance_uid": "7fd301ea-c44b-48d2-a96e-5b3c71d0f882",
      "text": "I understand your frustration. Let me pull up your order right now.",
      "start_ms": 12400,
      "duration_ms": 3600,
      "speaker": 1,
      "language": "en",
      "emotion": "Calm",
      "accent": "American",
      "deepfake_score": 0.9812
    }
  ]
}
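
A response of this shape can be post-processed with a few lines of Python. This is a minimal sketch that relies only on the field names visible in the JSON above (`utterances`, `start_ms`, `speaker`, `deepfake_score`); the helper name `summarize_utterances` is hypothetical. Utterances are sorted by `start_ms` because the array is not necessarily in chronological order.

```python
import json
from collections import defaultdict


def summarize_utterances(response_text: str):
    """Return utterances in chronological order and the mean deepfake score per speaker."""
    data = json.loads(response_text)
    utterances = sorted(data["utterances"], key=lambda u: u["start_ms"])

    scores_by_speaker = defaultdict(list)
    for u in utterances:
        scores_by_speaker[u["speaker"]].append(u["deepfake_score"])

    mean_by_speaker = {
        speaker: sum(scores) / len(scores)
        for speaker, scores in scores_by_speaker.items()
    }
    return utterances, mean_by_speaker


# Trimmed sample using two utterances from the response above.
sample = """
{
  "utterances": [
    {"text": "I need to check on my delivery, it's been two weeks.",
     "start_ms": 8200, "speaker": 2, "deepfake_score": 0.1042},
    {"text": "Thank you. Did you say you'd like to place an order?",
     "start_ms": 3000, "speaker": 1, "deepfake_score": 0.9723}
  ]
}
"""
ordered, means = summarize_utterances(sample)
for u in ordered:
    print(u["start_ms"], u["speaker"], u["text"])
print(means)
```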

General Statistics

Speakers: 2
Languages: en
Deepfake analyzed: 17 / 19 utterances
Avg deepfake score: 0.6368
Max deepfake score: 0.9810

Audio

File Name: AIAgentFrustration.mp3
File Size: 1.87 MB
File Type: audio/mpeg
Audio Duration: 1m 37.3s

Request

HTTP: 200 OK
Endpoint: /api/velma-2-stt-batch
Response Size: 5.8 KB

Performance

Processing Time: 2.66s
Processing Factor: 36.6x real-time
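
The processing factor appears to be the audio duration divided by the processing time. Assuming that definition (the page does not spell it out), the figures above reproduce the reported value; the function name `processing_factor` is illustrative.

```python
def processing_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Audio duration divided by processing time (how many times faster than real time)."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


audio_seconds = 1 * 60 + 37.3  # Audio Duration: 1m 37.3s
factor = processing_factor(audio_seconds, 2.66)  # Processing Time: 2.66s
print(f"{factor:.1f}x real-time")  # prints "36.6x real-time"
```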

Duration	Conversation	Category	Industry
07:48	Gender-role argument ends the relationship	Social	Personal Relationships
05:33	Elderly caller needs login for surgery payment	Support	Banking
06:10	Sales rep fumbles MFA setup with IT	Support	IT services
04:31	Customer fights for refund in delivery fraud	Support	E-commerce
07:52	Youtuber describes personal stalker experience	Social	Online media
01:37	AI bot can't find customer's order	Support	E-commerce
03:49	User demands MFA reset for drive access	Support	IT services
04:24	Streamer rants on politics and censorship	Social	Media & broadcasting
04:28	Angry caller demands update on late delivery	Support	E-commerce
06:25	Manager pushes IT for password reset	Support	IT services
05:14	Customer tries account recovery without security steps	Support	E-commerce

Neon City ad spot with music and voiceover

Preloaded demo recordings