Aura
A native macOS app that turns weeks of manual video curation into a fully automated pipeline — from clip discovery to beat-synced export.

Built & run by PixelVision
What it does
Your editing team, automated
Content creators building dance and lifestyle compilations spend days hunting clips, filtering out bad quality, cutting to the beat, and adding lyrics — then doing it all again next week. Aura replaces that loop entirely. It discovers clips from TikTok, YouTube, and Instagram; scores them with computer vision; assembles them to the beat grid; and burns lyrics in — all from a single macOS window.
Multi-platform crawler across TikTok, YouTube, Instagram and curated Collections
MediaPipe pose scoring and CLIP visual classification reject low-quality clips automatically
BPM detection and a manual beat-grid editor align every cut to the music's beat grid
ffmpeg encodes the final video with burned-in lyrics and optional AI-generated scenes
The problem
Days of work per video, every week
For a creator publishing 3–4 compilations a week, manual curation isn't a workflow — it's a full-time job. Each clip needs to be found, watched, judged, trimmed, and sequenced. We needed to automate this without losing the editorial quality that makes compilations worth watching.
Manually judging pose quality, motion, and aesthetic consistency for every clip
Under the hood
Three layers of intelligence
Intelligent Discovery
A five-stage pipeline: TikTok → Sound → YouTube → Instagram → Creators. Accepted clips feed their songs back into the next run, so the discovery signal strengthens with every compilation published.
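The feedback loop described above can be sketched roughly as follows. This is a minimal illustration, not Aura's actual implementation: the `song_id` field, the `next_run_seeds` function, and the ranking by acceptance count are all assumptions made for the example.

```python
from collections import Counter


def next_run_seeds(accepted_clips, already_searched, limit=5):
    """Pick song IDs to seed the next discovery run.

    accepted_clips: list of dicts carrying a hypothetical "song_id" key.
    already_searched: set of song IDs already used as seeds in earlier runs.
    """
    counts = Counter(
        clip["song_id"]
        for clip in accepted_clips
        if clip.get("song_id") and clip["song_id"] not in already_searched
    )
    # Songs with the most accepted clips come first; ties are broken by ID
    # so the output is stable between runs.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [song_id for song_id, _ in ranked[:limit]]


accepted = [
    {"song_id": "song-a"},
    {"song_id": "song-b"},
    {"song_id": "song-a"},
    {"song_id": "song-c"},
    {"song_id": None},  # clip with no identifiable sound is skipped
]
seeds = next_run_seeds(accepted, already_searched={"song-c"})
```

Skipping songs that were already searched keeps each run from re-crawling the same sounds, while ranking by acceptance count favours the songs that have produced the most usable clips.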
Computer Vision Filtering
MediaPipe validates full-body dance form. CLIP zero-shot classification rejects clips with text overlays, scene cuts, or the wrong visual aesthetic for the category — no human review needed.
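As a rough sketch of the zero-shot rejection step: CLIP scores an image against a set of text prompts, and the highest-probability prompt decides the label. The example below covers only the decision logic. The similarity scores are hard-coded stand-ins for what a CLIP model would return, and the prompt wording, `reject_labels`, and `min_confidence` threshold are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw similarity scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_clip(label_logits, reject_labels, min_confidence=0.5):
    """Return (accepted, best_label, confidence) for one frame.

    label_logits: dict mapping a text prompt to its image-text similarity
    score (assumed to come from a CLIP model; hard-coded in this sketch).
    """
    labels = list(label_logits)
    probs = softmax([label_logits[label] for label in labels])
    best_label, confidence = max(zip(labels, probs), key=lambda pair: pair[1])
    accepted = best_label not in reject_labels and confidence >= min_confidence
    return accepted, best_label, confidence


REJECT = {"a video with a text overlay", "a video with a scene cut"}

clean_frame = {
    "a full-body dance video": 25.0,
    "a video with a text overlay": 20.0,
    "a video with a scene cut": 18.0,
}
overlay_frame = {
    "a full-body dance video": 18.0,
    "a video with a text overlay": 24.0,
    "a video with a scene cut": 17.0,
}

clean_result = classify_clip(clean_frame, REJECT)
overlay_result = classify_clip(overlay_frame, REJECT)
```

Because the labels are plain text prompts, adding a new category or rejection reason in a scheme like this means editing a prompt list rather than retraining a model.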
Beat Assembly & AI Generation
BPM detection snaps clips to a beat grid. Lyrics are pulled from LrcLib or Genius, force-aligned with stable-ts, and burned in. Optionally, Suno and Runway generate AI music video scenes from prompts.
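Snapping to a beat grid comes down to simple arithmetic: at a given tempo, beats fall every 60 / BPM seconds, and each cut point moves to the nearest beat. The sketch below assumes a constant tempo and a known first-beat offset; the function names are illustrative and do not come from Aura's codebase.

```python
def beat_grid(bpm, duration_s, first_beat_s=0.0):
    """Return beat timestamps (seconds) from first_beat_s up to duration_s."""
    if bpm <= 0:
        raise ValueError("bpm must be positive")
    interval = 60.0 / bpm  # seconds between consecutive beats
    beats = []
    index = 0
    # Small epsilon so a beat landing exactly on duration_s is included.
    while first_beat_s + index * interval <= duration_s + 1e-9:
        beats.append(round(first_beat_s + index * interval, 6))
        index += 1
    return beats


def snap_to_beat(time_s, beats):
    """Return the beat timestamp closest to time_s."""
    return min(beats, key=lambda beat: abs(beat - time_s))


grid = beat_grid(bpm=120, duration_s=2.0)
cut = snap_to_beat(1.3, grid)
```

At 120 BPM the grid has a beat every 0.5 seconds, so a rough cut at 1.3 s snaps to the beat at 1.5 s. Tracks with tempo changes would need a per-section grid, which is the kind of case a manual beat-grid editor can cover.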
In production
Running daily, across 8 categories
What we learned
The best automation is invisible
Every intelligence layer — pose detection, CLIP scoring, beat alignment, lyric sync — is invisible to the operator. The UI shows accepted clips and a progress bar. Nothing more. That invisibility is the hardest engineering constraint. It forces every feature to earn its place not by being visible, but by making the output undeniably better.
