
How this archive is built and maintained
The Cult Codex is a structured archive of the Cult of Psyche podcast. It uses a combination of automated transcription, AI-assisted analysis, and human curation to catalog episodes, identify participants, extract quotes, and map recurring topics and lore.
This page documents the processes, tools, and editorial standards used to build and maintain the archive.
Transcripts are sourced from YouTube captions where available. When captions are disabled or otherwise unavailable, the audio is downloaded and transcribed locally using OpenAI Whisper (medium model). All transcripts are stored as timestamped segments.
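The caption-first, Whisper-fallback order can be sketched roughly as follows. This is only an illustration: the Segment fields, the source labels, and the two callables (fetch_captions, transcribe_audio) are hypothetical stand-ins, not the archive's actual code or any library's API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Segment:
    """One timestamped piece of a transcript (field names are assumed)."""
    start: float  # seconds from the start of the episode
    end: float    # seconds from the start of the episode
    text: str


def get_transcript(
    video_id: str,
    fetch_captions: Callable[[str], Optional[List[Segment]]],
    transcribe_audio: Callable[[str], List[Segment]],
) -> Tuple[str, List[Segment]]:
    """Return (source, segments), preferring captions over local transcription."""
    segments = fetch_captions(video_id)
    if segments:
        return "youtube_captions", segments
    # Captions missing or empty: fall back to the local transcription step.
    return "whisper_medium", transcribe_audio(video_id)


# Usage with stand-in callables in place of real caption and audio tooling.
def no_captions(video_id: str) -> Optional[List[Segment]]:
    return None


def fake_transcriber(video_id: str) -> List[Segment]:
    return [Segment(start=0.0, end=2.5, text="Welcome back.")]


source, segments = get_transcript("example-id", no_captions, fake_transcriber)
```

Passing the two steps in as callables keeps the fallback logic testable without network access or a speech model installed.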
Each transcript is processed by an AI model (Claude) to extract structured data: episode summaries, guest identifications, topic tags, notable quotes, lore references, and content type classification. The AI is prompted with explicit editorial guidelines emphasizing neutral, factual language.
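As a rough illustration of what "structured data" can mean here, the sketch below parses a model response into a typed record and rejects responses with missing fields. It assumes the model is asked to reply with a JSON object; the field names and the Enrichment class are hypothetical, chosen to mirror the categories listed above, and do not describe the archive's real schema or prompt.

```python
import json
from dataclasses import dataclass
from typing import List

# Assumed field names, one per category of extracted data.
REQUIRED_FIELDS = ("summary", "guests", "topics", "quotes", "lore", "content_type")


@dataclass
class Enrichment:
    summary: str
    guests: List[str]
    topics: List[str]
    quotes: List[str]
    lore: List[str]
    content_type: str


def parse_enrichment(raw: str) -> Enrichment:
    """Parse a JSON reply and fail loudly if any expected field is absent."""
    data = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"model output is missing fields: {missing}")
    # Keep only the expected fields so stray keys do not leak into the record.
    return Enrichment(**{name: data[name] for name in REQUIRED_FIELDS})


example_reply = json.dumps({
    "summary": "A conversation about dream symbolism.",
    "guests": ["Example Guest"],
    "topics": ["dreams"],
    "quotes": ["An example quote."],
    "lore": [],
    "content_type": "interview",
})
record = parse_enrichment(example_reply)
```

Validating at this boundary means a malformed reply stops the pipeline before anything is written to the database.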
Enriched data is imported into a PostgreSQL database (Neon) via Prisma ORM. The import process handles deduplication of people, topics, and lore entries using slug-based matching.
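Slug-based matching can be illustrated with a small in-memory sketch. It stands in for the database step and does not use Prisma; the slugify rules and the upsert_by_slug helper are assumptions made for the example, and the archive's real normalization may differ.

```python
import re
import unicodedata
from typing import Dict


def slugify(name: str) -> str:
    """Lowercase, strip accents, and collapse non-alphanumerics to hyphens."""
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")


def upsert_by_slug(table: Dict[str, dict], name: str) -> dict:
    """Return the existing row for this slug, or create it if absent."""
    slug = slugify(name)
    if slug not in table:
        table[slug] = {"slug": slug, "name": name}
    return table[slug]


people: Dict[str, dict] = {}
first = upsert_by_slug(people, "Carl Jung")
second = upsert_by_slug(people, "  carl   JUNG! ")  # same slug, so same row
```

Matching on slugs merges variants that differ only in case, spacing, punctuation, or accents. It would not merge genuinely different spellings such as "C. G. Jung" and "Carl Jung", which would still need human curation.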
Each episode displays provenance badges indicating whether its data is “transcript-backed” (derived from a full transcript) or “inferred” (generated from title and metadata only). This helps users gauge the reliability of each entry.
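The badge choice amounts to a single check, sketched below. The function name is hypothetical, and treating any non-empty list of segments as a "full transcript" is a simplifying assumption; the archive may apply stricter criteria.

```python
from typing import Optional, Sequence


def provenance_badge(transcript_segments: Optional[Sequence[object]]) -> str:
    """Pick the badge label from whether transcript segments exist."""
    if transcript_segments:
        return "transcript-backed"
    # No transcript: the entry was built from title and metadata only.
    return "inferred"


with_transcript = provenance_badge(["segment one", "segment two"])
without_transcript = provenance_badge(None)
```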