What are 50,000 of your viewers actually asking?
Creators publish content, read a handful of comments, and move on. But buried in thousands of comments is some of the most valuable audience research available anywhere — questions people keep asking, frustrations that keep surfacing, topics they're desperate for more of.
What you get
The agent ingests large volumes of YouTube comments, generates semantic embeddings, clusters the comments by meaning, and runs an insight agent over the clusters to surface actionable findings.
Demo output coming soon — this project is currently In Progress.
Want something like this for your channel?
Let's talk →
Creator Comment Intelligence Agent
Tier 2 · In Progress · Embeddings · Clustering · Insight Extraction
Architecture
Data Ingestion: YouTube API → Comment Ingestion → Preprocessing
DigitalOcean: Embedding Generation → Clustering (HDBSCAN) → Cluster Labeling → Insight Extraction
Vercel: Dashboard
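The hand-off from clustering to cluster labeling can be illustrated with a small sketch. It assumes the labeling agent is shown a few representative comments per cluster, chosen by cosine similarity to the cluster centroid; that selection strategy, the `representatives` function, and the toy two-dimensional vectors are illustrative assumptions, not the project's confirmed design. The only convention borrowed from HDBSCAN is that noise points carry the label `-1`.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def representatives(comments, embeddings, labels, k=2):
    """Return up to k comments per cluster that sit closest to the centroid.

    Hypothetical helper: points labeled -1 (HDBSCAN noise) are skipped.
    """
    groups = defaultdict(list)
    for idx, label in enumerate(labels):
        if label != -1:
            groups[label].append(idx)

    picked = {}
    for label, members in groups.items():
        dim = len(embeddings[members[0]])
        centroid = [
            sum(embeddings[i][d] for i in members) / len(members)
            for d in range(dim)
        ]
        ranked = sorted(
            members,
            key=lambda i: cosine(embeddings[i], centroid),
            reverse=True,
        )
        picked[label] = [comments[i] for i in ranked[:k]]
    return picked


# Toy data: two clusters of three comments each, plus one noise comment.
comments = [
    "How do I export?",
    "Can you show how to export to CSV?",
    "Export tutorial please",
    "Audio is too quiet",
    "The volume is really low",
    "Please fix the sound",
    "first!",
]
embeddings = [
    [1.0, 0.0], [0.8, 0.2], [0.6, 0.4],
    [0.0, 1.0], [0.2, 0.8], [0.4, 0.6],
    [0.5, 0.5],
]
labels = [0, 0, 0, 1, 1, 1, -1]

reps = representatives(comments, embeddings, labels, k=1)
```

Sending only a few representatives per cluster keeps the labeling prompt short regardless of cluster size, at the cost of possibly missing minority themes inside a large cluster.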
Tech Stack
| Layer | Technology |
|---|---|
| Data source | YouTube Data API v3 |
| Embeddings | OpenAI text-embedding-3-small |
| Clustering | HDBSCAN / scikit-learn |
| Orchestration | Python + asyncio |
| Insight agent | OpenAI GPT-4o + Claude API |
| Output | Next.js dashboard |
| Infrastructure | DigitalOcean App Platform |
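As a rough illustration of the Python + asyncio orchestration layer, the sketch below embeds comments in batches while capping the number of in-flight requests. `fake_embed` is a hypothetical stand-in for a real embeddings API call, and the batch size and concurrency limit are arbitrary assumptions rather than values from this project.

```python
import asyncio


async def fake_embed(batch):
    """Hypothetical stand-in for an embeddings API call.

    Returns one toy vector per input text (its character count).
    """
    await asyncio.sleep(0)  # yield control, as a real network call would
    return [[float(len(text))] for text in batch]


async def embed_all(texts, embed=fake_embed, batch_size=3, max_concurrency=2):
    """Embed texts in batches with bounded concurrency, preserving input order."""
    semaphore = asyncio.Semaphore(max_concurrency)
    batches = [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]

    async def run(batch):
        # At most max_concurrency batches are being embedded at any moment.
        async with semaphore:
            return await embed(batch)

    # gather returns results in the order the awaitables were passed in.
    results = await asyncio.gather(*(run(b) for b in batches))
    return [vector for batch_vectors in results for vector in batch_vectors]


texts = ["great video", "how do I export?", "more like this", "audio is quiet"]
vectors = asyncio.run(embed_all(texts))
```

Swapping `fake_embed` for a coroutine that calls the chosen embeddings provider would leave the batching and ordering logic unchanged; retry and rate-limit handling are omitted here.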
Pipeline
1. YouTube API
2. Comment Ingestion
3. Preprocessing & Deduplication
4. Embedding Generation
5. Clustering (HDBSCAN)
6. Cluster Labeling Agent
7. Insight Extraction Agent
8. Dashboard Output
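The preprocessing and deduplication step might look something like the following minimal sketch. The specific rules (Unicode normalization, URL stripping, a minimum length, and a case- and punctuation-insensitive duplicate key) are assumptions chosen for illustration; the project's actual rules are not documented here.

```python
import re
import unicodedata

URL_RE = re.compile(r"https?://\S+")
SPACE_RE = re.compile(r"\s+")
PUNCT_RE = re.compile(r"[^\w\s]")


def normalize(text):
    """Normalize Unicode, drop URLs, and collapse runs of whitespace."""
    text = unicodedata.normalize("NFKC", text)
    text = URL_RE.sub("", text)
    return SPACE_RE.sub(" ", text).strip()


def dedup_key(text):
    """Case- and punctuation-insensitive key used to detect near-identical comments."""
    stripped = PUNCT_RE.sub("", text.casefold())
    return SPACE_RE.sub(" ", stripped).strip()


def preprocess(comments, min_chars=3):
    """Return cleaned comments, keeping the first occurrence of each duplicate."""
    seen = set()
    cleaned = []
    for raw in comments:
        text = normalize(raw)
        key = dedup_key(text)
        # Skip very short comments and anything already seen.
        if len(key) < min_chars or key in seen:
            continue
        seen.add(key)
        cleaned.append(text)
    return cleaned


raw = [
    "Great video!!",
    "great video",
    "  How do I   export?  https://example.com ",
    "ok",
    "How do I export?",
]
cleaned = preprocess(raw)
```

Deduplicating before embedding avoids paying for repeated vectors and keeps copy-pasted comments from forming artificial clusters.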