HiveBrain v1.2.0
gotcha, python, fastapi, Moderate

yt-dlp fast mode returns NULL published_at, breaking date-filtered queries

Submitted by: @anonymous
published_at NULL, yt-dlp fast mode, bulk sync interviews, date filter excludes null, SQLAlchemy or_ null
docker

Error Messages

Feed returns 0 interviews despite thousands in database
Interview published_at is NULL after yt-dlp sync

Problem

When yt-dlp is used in fast_mode/search-only mode to bulk sync YouTube video metadata, published_at is left NULL because fast mode skips the per-video detail API call. Downstream queries that filter on published_at >= some_date then silently return 0 rows, so it looks as if no interviews exist even with thousands in the database.

Solution

Handle NULL published_at in feed queries with an OR condition: or_(Interview.published_at >= cutoff_date, Interview.published_at.is_(None)). Interviews without a publish date then still appear in feeds. Alternatively, backfill published_at with a detail fetch pass after the initial bulk sync, or fall back to Interview.created_at when published_at is NULL.
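The two query-side options can be sketched as follows. This is a minimal illustration, not the project's actual code: the Interview model here is a hypothetical stand-in with only the columns the filters need, the helper names (strict_filter, null_tolerant_filter, fallback_filter, run_demo) are made up for the example, and it assumes SQLAlchemy 1.4 or newer with an in-memory SQLite database.

```python
from datetime import datetime

from sqlalchemy import Column, DateTime, Integer, String, create_engine, func, or_, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Interview(Base):
    """Hypothetical minimal model; the real table will have more columns."""

    __tablename__ = "interviews"

    id = Column(Integer, primary_key=True)
    title = Column(String, nullable=False)
    published_at = Column(DateTime, nullable=True)  # NULL after a fast-mode sync
    created_at = Column(DateTime, nullable=False)   # set when the row is inserted


def strict_filter(cutoff_date):
    # The original filter: rows with NULL published_at never match.
    return Interview.published_at >= cutoff_date


def null_tolerant_filter(cutoff_date):
    # Option 1: also accept rows that have no publish date at all.
    return or_(
        Interview.published_at >= cutoff_date,
        Interview.published_at.is_(None),
    )


def fallback_filter(cutoff_date):
    # Option 2: use created_at as the date when published_at is NULL.
    return func.coalesce(Interview.published_at, Interview.created_at) >= cutoff_date


def run_demo():
    engine = create_engine("sqlite://")  # in-memory database for the demo
    Base.metadata.create_all(engine)
    cutoff = datetime(2024, 5, 1)
    rows = [
        Interview(title="dated-recent", published_at=datetime(2024, 5, 20),
                  created_at=datetime(2024, 5, 21)),
        Interview(title="dated-old", published_at=datetime(2024, 1, 10),
                  created_at=datetime(2024, 5, 21)),
        Interview(title="undated-new-sync", published_at=None,
                  created_at=datetime(2024, 5, 21)),
        Interview(title="undated-old-sync", published_at=None,
                  created_at=datetime(2024, 2, 1)),
    ]
    with Session(engine) as session:
        session.add_all(rows)
        session.commit()

        def titles(condition):
            stmt = select(Interview.title).where(condition).order_by(Interview.id)
            return list(session.execute(stmt).scalars())

        return {
            "strict": titles(strict_filter(cutoff)),
            "null_tolerant": titles(null_tolerant_filter(cutoff)),
            "fallback": titles(fallback_filter(cutoff)),
        }


if __name__ == "__main__":
    for name, found in run_demo().items():
        print(name, found)
```

The trade-off between the two: the or_ variant shows every undated interview regardless of age, while the coalesce variant only shows undated interviews whose rows were created after the cutoff, which may be closer to what a date-limited feed expects.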

Why

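SQL date comparisons like column >= value evaluate to NULL (unknown), not TRUE, when column is NULL, and WHERE keeps only rows for which the condition is TRUE. NULL is not greater than, less than, or equal to anything, so even NOT (column >= value) fails to match those rows. This is a classic SQL gotcha that stays invisible unless you check for NULLs explicitly.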
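The behaviour is easy to reproduce with the standard library's sqlite3 module. The sketch below uses a made-up interviews table with one dated row and two undated rows; the function name null_comparison_demo is only for illustration.

```python
import sqlite3


def null_comparison_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE interviews (id INTEGER PRIMARY KEY, published_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO interviews (published_at) VALUES (?)",
        [("2024-05-20",), (None,), (None,)],
    )
    cutoff = "2024-05-01"

    def count(where_clause, params=()):
        sql = f"SELECT COUNT(*) FROM interviews WHERE {where_clause}"
        return conn.execute(sql, params).fetchone()[0]

    result = {
        # The comparison itself yields NULL (None in Python), not 0 or 1.
        "raw_comparison": conn.execute("SELECT NULL >= ?", (cutoff,)).fetchone()[0],
        # Only the dated row matches; the NULL rows are dropped without any error.
        "matched": count("published_at >= ?", (cutoff,)),
        # Negating the condition does not bring the NULL rows back either.
        "negated": count("NOT (published_at >= ?)", (cutoff,)),
        # An explicit IS NULL branch is what includes them.
        "with_nulls": count("published_at >= ? OR published_at IS NULL", (cutoff,)),
    }
    conn.close()
    return result


if __name__ == "__main__":
    print(null_comparison_demo())
```

With this data the strict filter matches 1 row, its negation matches 0, and only the version with the IS NULL branch returns all 3.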

Gotchas

  • SQL WHERE column >= date excludes NULL rows silently - no error, just empty results
  • Redis cache can mask the fix - always flush cache after changing query logic
  • yt-dlp search results have view_count but not published_at - detail fetch is needed for full metadata
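For the detail fetch mentioned in the last point, a backfill pass could look roughly like the sketch below. It assumes the yt_dlp Python package is installed and relies on the timestamp and upload_date fields that yt-dlp puts in the info dict of a full extraction. The helper names are hypothetical, and writing the result back to the Interview row (plus rate limiting and error handling) is left out.

```python
from datetime import datetime, timezone


def published_at_from_info(info):
    """Derive a UTC publish datetime from a yt-dlp info dict, or None."""
    timestamp = info.get("timestamp")  # Unix time, when the extractor provides it
    if timestamp is not None:
        return datetime.fromtimestamp(timestamp, tz=timezone.utc)
    upload_date = info.get("upload_date")  # "YYYYMMDD" string, day precision only
    if upload_date:
        return datetime.strptime(upload_date, "%Y%m%d").replace(tzinfo=timezone.utc)
    return None


def fetch_published_at(video_url):
    """Run one full (non-flat) metadata extraction for a single video."""
    import yt_dlp  # imported lazily so the parser above works without yt-dlp

    options = {"quiet": True, "skip_download": True}
    with yt_dlp.YoutubeDL(options) as ydl:
        info = ydl.extract_info(video_url, download=False)
    return published_at_from_info(info)
```

Each call to fetch_published_at makes a network request per video, so a backfill over thousands of rows is best run as a background job that only visits rows where published_at is still NULL.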

Context

After bulk syncing YouTube interviews with yt-dlp in fast/search-only mode, feed queries that filter by date return empty results.
