Your RAG pipeline is only as good as the data feeding it. VidProxy delivers full YouTube transcripts to your vector store the moment a new video posts — no scraping, no polling, no manual work.
YouTube is one of the richest sources of expert knowledge on the internet — technical tutorials, conference talks, product demos, research walkthroughs, earnings calls. But getting that content into a vector store requires someone to find the video, pull the transcript, clean it, chunk it, embed it, and upsert it. Every. Single. Time.
Most teams skip it or batch it weekly. The result is a knowledge base that's always three steps behind the conversation your users want to have.
Someone on your team monitors channels manually. They find new videos, download transcripts (if they can), clean the text, paste it into a script, run the embedding, and update the store. Or you write a cron job that scrapes YouTube, handles errors when it's blocked, parses the page HTML, extracts captions from a JavaScript object buried in the page source, and tries to keep up with YouTube's changing markup. This breaks constantly.
You subscribe to a channel. VidProxy polls it on your behalf — every 2 minutes on Pro. When a new video is detected, it fetches the transcript, formats it as structured JSON, and POSTs it to your webhook URL. Your webhook receiver gets the full text plus timestamped segments. It chunks the transcript, calls your embedding API, and upserts to your vector store. Done. No human involvement.
From zero to automated transcript ingestion in under ten minutes.
Paste a YouTube channel URL or handle into the VidProxy dashboard. Set a webhook URL pointing to your ingestion endpoint. Label the subscription so your receiver knows which topic cluster to target.
The moment a new video is detected, VidProxy POSTs a JSON payload to your webhook. The payload includes the full plain-text transcript, per-sentence timestamped segments, and (on Pro) an AI-generated summary and topic tags you can use as metadata.
Your endpoint chunks the transcript, calls your embedding API (OpenAI, Cohere, or a local model), and upserts the vectors to Pinecone, Supabase pgvector, Chroma, or Weaviate. Your knowledge base is current within minutes of each publish.
A minimal Node.js handler that receives a VidProxy webhook and upserts the transcript to Pinecone.
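// POST /webhooks/vidproxy
const express = require('express');
const OpenAI = require('openai');
const { Pinecone } = require('@pinecone-database/pinecone');

const app = express();
app.use(express.json()); // parse the JSON webhook body
const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
const index = new Pinecone().index('transcripts'); // index name is a placeholder

// Simple word-window splitter; swap in your own chunking strategy
const chunkText = (text, maxWords) => {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  for (let i = 0; i < words.length; i += maxWords) {
    chunks.push(words.slice(i, i + maxWords).join(' '));
  }
  return chunks;
};

app.post('/webhooks/vidproxy', async (req, res) => {
  // Acknowledge immediately: VidProxy retries on non-2xx
  res.sendStatus(200);

  try {
    const { video, transcript, subscription } = req.body;
    if (!transcript?.available) return;

    // Chunk the transcript into ~500-word windows
    const chunks = chunkText(transcript.text, 500);

    // Embed all chunks in parallel
    const embeddings = await Promise.all(
      chunks.map(chunk =>
        openai.embeddings.create({ model: 'text-embedding-3-small', input: chunk })
      )
    );

    // Upsert to Pinecone
    await index.upsert(
      chunks.map((chunk, i) => ({
        id: `${video.url}-chunk-${i}`,
        values: embeddings[i].data[0].embedding,
        metadata: {
          videoTitle: video.title,
          videoUrl: video.url,
          publishedAt: video.published,
          channel: subscription.channel_name,
          tag: subscription.label,
          chunk: chunk
        }
      }))
    );

    console.log(`Indexed ${chunks.length} chunks from "${video.title}"`);
  } catch (err) {
    // The 200 has already been sent, so log the failure instead of throwing
    console.error('VidProxy ingestion failed:', err);
  }
});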
# POST /webhooks/vidproxy
import openai  # reads OPENAI_API_KEY from the environment
from flask import Flask, make_response, request
from pinecone import Pinecone

app = Flask(__name__)
pinecone_index = Pinecone().Index("transcripts")  # index name is a placeholder

# chunk_text(text, max_words) is a word-window helper you provide


@app.route('/webhooks/vidproxy', methods=['POST'])
def handle_vidproxy():
    payload = request.get_json()

    # Prepare the 200 acknowledgement; Flask sends it when this function returns
    response = make_response('', 200)

    transcript = payload.get('transcript', {})
    if not transcript.get('available'):
        return response

    video = payload['video']
    subscription = payload['subscription']
    chunks = chunk_text(transcript['text'], max_words=500)

    # Embed via OpenAI (one request for all chunks)
    embeddings = openai.embeddings.create(
        model="text-embedding-3-small",
        input=chunks
    ).data

    # Upsert to Pinecone
    vectors = [
        {
            "id": f"{video['url']}-chunk-{i}",
            "values": emb.embedding,
            "metadata": {
                "title": video["title"],
                "url": video["url"],
                "channel": subscription["channel_name"],
                "text": chunk
            }
        }
        for i, (chunk, emb) in enumerate(zip(chunks, embeddings))
    ]
    pinecone_index.upsert(vectors=vectors)

    return response
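The Python handler above calls chunk_text without defining it. One minimal way to write it, assuming a plain word-window split with no overlap (the function name and max_words parameter mirror the handler; the splitting strategy itself is an assumption, not something VidProxy prescribes):

```python
def chunk_text(text: str, max_words: int = 500) -> list[str]:
    """Split text into consecutive windows of at most max_words words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    # Step through the word list one window at a time
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]
```

Fixed windows are the simplest option. Overlapping windows or sentence-aware splitting may retrieve better, at the cost of more vectors per video.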
On Pro and Agency plans, every transcript is automatically processed by Claude to produce a summary, key takeaways, and topic tags. These land in the webhook payload under the enrichment key — no extra API calls on your side.
Store the summary and topics as vector metadata. Your retrieval layer can filter by topic tag before running the semantic search, dramatically cutting noise in multi-topic knowledge bases.
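As a rough sketch of that pattern: the helper below copies the enrichment fields into vector metadata and builds a topic filter for query time. The field names summary and topics under the enrichment key are assumptions (check a real payload for the exact names), and both function names are hypothetical.

```python
from typing import Any


def enrichment_metadata(payload: dict[str, Any]) -> dict[str, Any]:
    """Extract summary and topic tags, tolerating payloads without enrichment."""
    enrichment = payload.get("enrichment") or {}
    metadata: dict[str, Any] = {}
    # "summary" and "topics" are assumed field names under the enrichment key
    if enrichment.get("summary"):
        metadata["summary"] = enrichment["summary"]
    if enrichment.get("topics"):
        # Lowercase the tags so filters do not depend on casing
        metadata["topics"] = [str(t).lower() for t in enrichment["topics"]]
    return metadata


def topic_filter(topics: list[str]) -> dict[str, Any]:
    """Build a Pinecone-style metadata filter that matches any listed topic."""
    return {"topics": {"$in": [t.lower() for t in topics]}}
```

With Pinecone, the filter dictionary can be passed as the filter argument of a query call. Other stores use their own filter syntax, so topic_filter would need adapting.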
Not every video you want to index comes from a channel you're monitoring. Use GET /api/transcript?url= to fetch the transcript for any YouTube video by URL or ID — no subscription required. One API call, instant result. This is useful for backfilling a specific video someone shared, indexing a one-off talk, or enriching your store with content from channels you don't need to watch continuously. Available on all paid plans (500–10,000 lookups/month).
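A sketch of how such a lookup request might be built with the Python standard library. Only the /api/transcript path and the url parameter come from this page. The host name and the bearer-token header are placeholders, so substitute the real values from your VidProxy account.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder host: replace with the real VidProxy API base URL
API_BASE = "https://api.vidproxy.example"


def transcript_request(video: str, api_key: str) -> Request:
    """Build a GET request for one video's transcript (URL or ID)."""
    query = urlencode({"url": video})  # percent-encodes the video URL
    return Request(
        f"{API_BASE}/api/transcript?{query}",
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
        method="GET",
    )
```

Sending the request with urllib.request.urlopen and decoding the body with json.loads should give you a payload you can pass through the same chunk, embed, and upsert steps as a webhook delivery, assuming the response shape matches.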
On a broad channel, not every video is relevant to your domain. Use Keyword Alerts (Pro) to specify a comma-separated list of topics per subscription. VidProxy will only fire your webhook if the transcript contains at least one match. Your ingestion pipeline stays focused — and your vector store stays signal-dense.
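This page does not spell out how VidProxy matches keywords (case sensitivity, word boundaries, and so on). If you want the same kind of gate inside your own receiver, for example as a second filter, a minimal sketch assuming case-insensitive substring matching could look like this. The function name is hypothetical.

```python
def matches_keywords(transcript_text: str, keywords_csv: str) -> bool:
    """Return True if the transcript contains at least one listed keyword."""
    # Split the comma-separated list, dropping blanks and surrounding spaces
    keywords = [k.strip().lower() for k in keywords_csv.split(",") if k.strip()]
    haystack = transcript_text.lower()
    return any(keyword in haystack for keyword in keywords)
```

Substring matching is deliberately loose: "rag" would also match "storage". Switch to word-boundary regular expressions if that matters for your topics.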
VidProxy outputs standard JSON. Everything downstream is your choice.
Pinecone, Supabase pgvector, Weaviate, Chroma, Qdrant, Milvus — any store that accepts an embedding vector and a metadata object.
OpenAI text-embedding-3-small, Cohere embed-v3, Voyage AI, or any local model via Ollama or Hugging Face. VidProxy is model-agnostic — it hands you the text.
LangChain, LlamaIndex, Haystack, or a hand-rolled retriever. VidProxy is upstream of your framework — it feeds the ingestion pipeline, not the query layer.
GPT-4o, Claude, Gemini — your vector store is model-agnostic. You get better answers at query time because your knowledge base has current, expert-level YouTube content.
n8n, Make, and Zapier all support custom webhook triggers. Wire VidProxy as the trigger and your embedding step as the action — no custom code required.
Use GET /api/videos?since=7d to bootstrap a new knowledge base or backfill a gap. One call returns everything from the window you specify.
The transcript.text field is plain text, ready to chunk and embed. The transcript.segments array provides per-sentence entries with start (seconds), duration, and text, which is useful for building time-indexed retrieval or for storing segment-level vectors with precise video timestamps.
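One way to use the segments for time-indexed retrieval is to group them into chunks that remember where they start and end in the video. The sketch below assumes each segment is a dictionary with start, duration, and text keys as described above. The function names and the 200-word default are illustrative choices.

```python
from typing import Any


def _close_chunk(segs: list[dict[str, Any]]) -> dict[str, Any]:
    """Merge consecutive segments into one chunk with start and end times."""
    last = segs[-1]
    return {
        "text": " ".join(s["text"] for s in segs),
        "start": segs[0]["start"],
        "end": last["start"] + last["duration"],
    }


def chunk_segments(
    segments: list[dict[str, Any]], max_words: int = 200
) -> list[dict[str, Any]]:
    """Group segments into chunks of at most max_words words, never splitting a segment."""
    chunks: list[dict[str, Any]] = []
    current: list[dict[str, Any]] = []
    word_count = 0
    for seg in segments:
        n = len(seg["text"].split())
        # Start a new chunk when adding this segment would exceed the budget
        if current and word_count + n > max_words:
            chunks.append(_close_chunk(current))
            current, word_count = [], 0
        current.append(seg)
        word_count += n
    if current:
        chunks.append(_close_chunk(current))
    return chunks
```

Storing start as vector metadata lets your retriever link an answer back to the right moment in the video, for instance by appending a t parameter with the start time in seconds to a standard watch URL.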
When a video has no transcript, transcript.available is false. Your webhook receiver should check this field and skip embedding. VidProxy still delivers the payload so you can log the video's metadata for record-keeping.
Use the video URL (or the video_id from the URL) as the namespace prefix for your vector IDs, as shown in the code example above. If the same video ID is upserted again, your vector store will overwrite the existing vectors, so you get no duplicates.
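Deduplication by overwrite only works if the vector IDs are stable. If you prefer the short video ID over the full URL as the prefix, a sketch like the following can extract it from common YouTube URL shapes. The set of URL patterns handled here is an assumption about what your payloads contain, and both function names are hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def video_id_from_url(url: str) -> str:
    """Extract the video ID from watch, youtu.be, shorts, embed, or live URLs."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        return parsed.path.lstrip("/").split("/")[0]
    if host == "youtube.com" or host.endswith(".youtube.com"):
        if parsed.path == "/watch":
            ids = parse_qs(parsed.query).get("v", [])
            if ids:
                return ids[0]
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 2 and parts[0] in ("shorts", "embed", "live"):
            return parts[1]
    raise ValueError(f"no video ID found in {url!r}")


def vector_id(video_url: str, chunk_index: int) -> str:
    """Build a deterministic vector ID so repeat deliveries overwrite in place."""
    return f"{video_id_from_url(video_url)}-chunk-{chunk_index}"
```

One caveat: if a re-delivered transcript produces fewer chunks than before, the higher-numbered vectors from the earlier run are not overwritten and would need to be deleted separately.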
The free tier includes 3 channels and webhook delivery, and it stays free forever. No credit card required.