gotcha, python, fastapi, Moderate
yt-dlp fast mode returns NULL published_at breaking date-filtered queries
published_at NULL, yt-dlp fast mode, bulk sync interviews, date filter excludes null, SQLAlchemy or_ null
docker
Problem
When using yt-dlp in fast_mode/search-only mode to bulk sync YouTube video metadata, the published_at field is NULL because fast mode skips the video detail API call. Downstream queries that filter by published_at >= some_date silently return 0 results, making it look like no interviews exist even with thousands in the database.
Solution
Handle NULL published_at in feed queries with an OR condition: or_(Interview.published_at >= cutoff_date, Interview.published_at.is_(None)). This lets interviews without a publish date still appear in feeds. Alternatively, backfill published_at by running a detail-fetch pass after the initial bulk sync, or use Interview.created_at as a fallback date.
Why
A SQL comparison like column >= value evaluates to NULL (unknown), not TRUE or FALSE, when column is NULL, and a WHERE clause keeps only rows where the condition is TRUE. NULL is not greater than, less than, or equal to anything. This is a classic SQL gotcha that stays invisible unless you check for NULLs explicitly.
Gotchas
- SQL WHERE column >= date excludes NULL rows silently - no error, just empty results
- Redis cache can mask the fix - always flush cache after changing query logic
- yt-dlp search results have view_count but not published_at - detail fetch is needed for full metadata
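The silent exclusion described in the first gotcha can be reproduced with nothing but Python's built-in sqlite3 module. This is a minimal sketch, not the project's actual schema: the interviews table, its columns, and the sample rows are made up for illustration, and other databases may display the NULL result differently even though the filtering behaviour is the same.

```python
import sqlite3

# Hypothetical in-memory table standing in for the real interviews table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE interviews (id INTEGER PRIMARY KEY, title TEXT, published_at TEXT)"
)
conn.executemany(
    "INSERT INTO interviews (title, published_at) VALUES (?, ?)",
    [
        ("detail-fetched", "2024-06-01"),  # full metadata
        ("fast-mode-a", None),             # synced without a publish date
        ("fast-mode-b", None),
    ],
)
cutoff = "2024-01-01"


def count(where_clause):
    """Count rows matching a WHERE clause that takes the cutoff as its only parameter."""
    sql = "SELECT COUNT(*) FROM interviews WHERE " + where_clause
    return conn.execute(sql, (cutoff,)).fetchone()[0]


# The comparison itself yields NULL (None in Python), not FALSE.
null_cmp = conn.execute("SELECT NULL >= ?", (cutoff,)).fetchone()[0]

# Naive filter: NULL rows are dropped without any error.
naive_count = count("published_at >= ?")

# Negating the filter does not bring them back, because NOT NULL is still NULL.
negated_count = count("NOT (published_at >= ?)")

# Null-safe filter: explicitly keep rows with no publish date.
null_safe_count = count("published_at >= ? OR published_at IS NULL")

print(null_cmp, naive_count, negated_count, null_safe_count)
```

The negated query is the useful check here: if the comparison were really FALSE for NULL rows, negating it would return them, but they are missing from both result sets.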
Context
After bulk syncing YouTube interviews with yt-dlp in fast/search-only mode, feed queries that filter by date return empty results
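The OR condition and the created_at fallback from the Solution can be sketched in SQLAlchemy roughly as follows. This assumes SQLAlchemy 1.4 or newer and an in-memory SQLite database; the Interview model here is a hypothetical stand-in with only the columns needed for the demo (published_at and created_at are the names used above, everything else is invented).

```python
from datetime import datetime

from sqlalchemy import Column, DateTime, Integer, String, create_engine, func, or_, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Interview(Base):
    # Hypothetical minimal model; the real one will have more columns.
    __tablename__ = "interviews"
    id = Column(Integer, primary_key=True)
    title = Column(String, nullable=False)
    published_at = Column(DateTime, nullable=True)   # NULL after a fast-mode sync
    created_at = Column(DateTime, nullable=False)    # when the row was inserted


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
cutoff_date = datetime(2024, 1, 1)

with Session(engine) as session:
    session.add_all([
        Interview(title="detail-fetched", published_at=datetime(2024, 6, 1),
                  created_at=datetime(2024, 6, 2)),
        Interview(title="fast-mode-recent", published_at=None,
                  created_at=datetime(2024, 6, 2)),
        Interview(title="fast-mode-old", published_at=None,
                  created_at=datetime(2023, 1, 1)),
    ])
    session.commit()

    # Option 1: keep every row that has no publish date.
    include_nulls = (
        select(Interview.title)
        .where(or_(Interview.published_at >= cutoff_date,
                   Interview.published_at.is_(None)))
        .order_by(Interview.id)
    )

    # Option 2: fall back to created_at when published_at is NULL.
    fallback = (
        select(Interview.title)
        .where(func.coalesce(Interview.published_at, Interview.created_at) >= cutoff_date)
        .order_by(Interview.id)
    )

    include_null_titles = list(session.execute(include_nulls).scalars())
    fallback_titles = list(session.execute(fallback).scalars())

print(include_null_titles)
print(fallback_titles)
```

The two options behave differently: the OR condition shows every undated interview regardless of age, while the coalesce fallback still applies the cutoff, using the sync time as an approximate date. Which one fits depends on whether stale undated rows should appear in the feed.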