UPDATED: youtube-dl + aria2 NOW: yt-dlp

Note: I started having trouble with the youtube-dl + aria2 setup described further down. It would appear YouTube learned to throttle that as well. There is now a WAY better option that does all of this in one tool, and I finally have the holy grail I was looking for: archiving disappearing YouTube content before it disappears on me again!

Tool: https://github.com/yt-dlp/yt-dlp

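Sample command I use to download (this is the batch file form, so every % is doubled to %%; use a single % if you type it straight at the command prompt)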

yt-dlp --write-info-json --download-archive "F:/Libraries/Youtube Archive/James Bruton/archive.txt" -N 4 -o "F:/Libraries/Youtube Archive/%%(uploader)s/%%(upload_date)s-%%(title)s.%%(ext)s" https://www.youtube.com/c/jamesbruton/videos

Explained

yt-dlp
   --write-info-json                                                         <- Save each video's metadata to a .info.json file
   --download-archive "F:/Libraries/Youtube Archive/James Bruton/archive.txt" <- Record of what's already been downloaded
   -N 4                                                                      <- Number of simultaneous fragment downloads
   -o "F:/Libraries/Youtube Archive/%%(uploader)s/%%(upload_date)s-%%(title)s.%%(ext)s"  <- Filename format
   https://www.youtube.com/c/jamesbruton/videos                              <- Link to the videos
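
If you would rather drive this from a script than a batch file, here is a minimal sketch of the same command assembled as an argument list in Python. The helper name build_yt_dlp_command and its parameters are my own invention for illustration; the flags are exactly the ones explained above. Passing a list to subprocess skips the shell, so the % signs stay single.

```python
import subprocess


def build_yt_dlp_command(archive_file, output_root, url, fragments=4):
    """Assemble the yt-dlp argument list (hypothetical helper, not part of yt-dlp)."""
    return [
        "yt-dlp",
        "--write-info-json",                 # save metadata next to each video
        "--download-archive", archive_file,  # record of what's already been downloaded
        "-N", str(fragments),                # simultaneous fragment downloads
        # Single % here: no batch file is involved, so nothing needs doubling.
        "-o", output_root + "/%(uploader)s/%(upload_date)s-%(title)s.%(ext)s",
        url,
    ]


cmd = build_yt_dlp_command(
    "F:/Libraries/Youtube Archive/James Bruton/archive.txt",
    "F:/Libraries/Youtube Archive",
    "https://www.youtube.com/c/jamesbruton/videos",
)
print(cmd)
# To actually run it (assumes yt-dlp is installed and on the PATH):
# subprocess.run(cmd, check=True)
```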

I’ve been using it for months, reliably, unattended. It just works. #Beautiful

Original Content Continues Here…

In the course of planning out some potential projects, I was about halfway through a video series when it disappeared from YouTube. In fact, I had even purchased a part I was planning on using for this project (a diffraction grating tube). And then it was gone. For 2 days. And then it came back. It doesn’t always come back.

OK. Lesson learned. If I get to the point where I’m really “investing” in a video project: DOWNLOAD IT!!! I mean, we git clone anything we are even remotely interested in looking at. Where’s my git clone for YouTube? I do have a video downloader tool that I think is great for grabbing a video from *any* site (even https sites!). But I want a tool I can point at YouTube that will:

  • Download a YouTuber’s entire video library, or all their playlists, or a single playlist, or just a video. I want some searching/filtering to make it easy to “grab the new videos just since date x”
  • I want it to get me the best possible audio and video resolution by default, and I want to be able to even splice and resync audio/video on the fly if the “best” A/V wasn’t a default
  • I want to be able to grab *any* data element about that video at the command line: updated date, description, etc.
  • I want to be able to grab and associate any closed captioning in any or all languages associated with the video

Kewl wish list, eh? Go check out: https://github.com/ytdl-org/youtube-dl This tool does all that. And a ton more.

I have used it a lot in the past and I love it. It just rocks.

Until today…

Today, I got throttled down to 50 kb/s on some of the downloads, whether it was the 1st download or the 100th. Did some Googling. It turns out youtube-dl can easily be set to swap out its own internal downloader and hand the transfer to an external one. I love pipelines! The command looks a little ugly, I’ll admit. But it’s boilerplate. The only part you change is the URL at the end you want to point to. Cut/paste your way to slick downloading.

About aria2: https://aria2.github.io/ Even though a single connection was being throttled to 50 kb/s, there’s no reason I can’t have 10 going at once after the same file, just different segments. That got me to 500 kb/s. It’s all I need. I don’t care if it’s slow. But if it takes weeks to complete, that’s time where the content could disappear. #balance
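
To put rough numbers on that trade-off, here is a back-of-the-envelope sketch in Python. It assumes the throttle applies per connection and that throughput scales linearly with connection count (which is what I saw going from 50 to 500). It also assumes the rates are kilobytes per second, and the 100 GB library size is a made-up figure purely for illustration.

```python
SECONDS_PER_DAY = 86_400


def aggregate_rate(per_connection_rate, connections):
    """Total rate if every connection is throttled independently (assumption)."""
    return per_connection_rate * connections


def days_to_download(total_gb, rate_kb_per_s):
    """Days needed to pull total_gb at rate_kb_per_s (decimal units: 1 GB = 1,000,000 KB)."""
    total_kb = total_gb * 1_000_000
    return total_kb / rate_kb_per_s / SECONDS_PER_DAY


library_gb = 100  # hypothetical library size, not a measured number
single = aggregate_rate(50, 1)
parallel = aggregate_rate(50, 10)
print(f"1 connection:   {single} KB/s, {days_to_download(library_gb, single):.1f} days")
print(f"10 connections: {parallel} KB/s, {days_to_download(library_gb, parallel):.1f} days")
```

Under those assumptions it is roughly three weeks versus a couple of days, which is the whole point of splitting the file across connections.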

youtube-dl --external-downloader aria2c --external-downloader-args "-c -j 10 -x 10 -s 10 -k 1M" -o "F:/Libraries/Youtube Archive/%(uploader)s/%(upload_date)s-%(title)s.%(ext)s" https://www.youtube.com/c/jamesbruton/videos

I know. Ugly. And powerful. It’s not that bad. What I want is a directory structure like:

X:
   \Libraries
      \Youtube Archive
         \James Bruton
            \20190729-Completing the Robots' Mechanical Build.mp4
             20190805-Simulating a Compliant Dampened Spring with Current Control.mp4
             20190812-Building the Robots' Electronics.mp4
             ...
         \Gear Down For What
            \20170127-50_1 Compound Planetary 3D Printable Gearbox.mp4
             20170309-3D Printed Plastic Robotic Arm Demo.mp4

You get the idea. So, I wrote this:

youtube-dl 
--external-downloader aria2c           <- Use the aria2 downloader
--external-downloader-args             <- pass these args to aria2
     "-c                               <- continue downloading a partially downloaded file
      -j 10                            <- maximum number of parallel downloads for every file
      -x 10                            <- maximum number of connections to 1 server for each download
      -s 10                            <- download a file using N connections
      -k 1M"                           <- do not split less than 2*SIZE byte range
-o "F:/Libraries/Youtube Archive/      <- Set output template for youtube-dl archive location
      %(uploader)s                     <- Name of the uploader (channel owner) string template
      /                                <- literal for the directory symbol
      %(upload_date)s                  <- Upload date for the file string template (YYYYMMDD)
      -                                <- Literal dash
      %(title)s                        <- Title of the video string template
      .                                <- Literal period
      %(ext)s"                         <- Video file type extension
https://www.youtube.com/c/jamesbruton/videos  <- URL for youtube-dl to examine
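
To see how the output template turns into the directory structure I wanted, here is a small sketch. As far as I can tell, the %(name)s fields follow Python's own string formatting, so plain Python can mimic the substitution. The sample metadata is lifted from the directory listing above; the function name expand_template is mine. The real tool also cleans up characters that are not allowed in filenames, which this sketch skips.

```python
TEMPLATE = "F:/Libraries/Youtube Archive/%(uploader)s/%(upload_date)s-%(title)s.%(ext)s"


def expand_template(template, info):
    """Fill in the %(field)s placeholders from a metadata dict (illustration only)."""
    return template % info


sample_video = {
    "uploader": "James Bruton",
    "upload_date": "20190812",  # YYYYMMDD, so files sort chronologically
    "title": "Building the Robots' Electronics",
    "ext": "mp4",
}

print(expand_template(TEMPLATE, sample_video))
# F:/Libraries/Youtube Archive/James Bruton/20190812-Building the Robots' Electronics.mp4
```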

If you want to use it in a batch file, just be a little careful about the % symbol that youtube-dl uses to start a substitution string. It has to be escaped with an extra %. It caused me grief. Here’s the batch file version of the command above:

youtube-dl --external-downloader aria2c --external-downloader-args "-c -j 10 -x 10 -s 10 -k 1M" -o "F:/Libraries/Youtube Archive/%%(uploader)s/%%(upload_date)s-%%(title)s.%%(ext)s" https://www.youtube.com/c/GearDownForWhat/videos
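
If you keep the command-prompt version around and want the batch file version generated for you, the escaping is just a string replace. A minimal sketch; the function name escape_for_batch is hypothetical. Only run it on the unescaped command, since running it twice would double the % signs again.

```python
def escape_for_batch(command):
    """Double every % so a Windows batch file passes a literal % through to the tool."""
    return command.replace("%", "%%")


prompt_version = (
    'youtube-dl --external-downloader aria2c '
    '--external-downloader-args "-c -j 10 -x 10 -s 10 -k 1M" '
    '-o "F:/Libraries/Youtube Archive/%(uploader)s/%(upload_date)s-%(title)s.%(ext)s" '
    'https://www.youtube.com/c/GearDownForWhat/videos'
)

print(escape_for_batch(prompt_version))
```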

Final note: I’ve dealt with 2 errors:

  1. youtube-dl occasionally exits with a 403 error. Apparently it’s a known issue. Clear the youtube-dl cache with “youtube-dl --rm-cache-dir” and re-run the download command. It’ll skip anything it’s already done. In 600 videos, I had to do this 10-15 times.
  2. aria2 occasionally aborts with error 22 (HTTP response header was bad or unexpected). Not much I can do about that! Re-run the download command. This one isn’t as frequent as the youtube-dl 403 issue.
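
Since both fixes boil down to "clear the cache and run it again", the babysitting can be scripted. This is only a sketch of how I might do it, not something built into youtube-dl: the function name, the retry limit, and the runner parameter (there so the loop can be tried without touching the network) are all my own assumptions. It treats any non-zero exit code as a reason to retry.

```python
import subprocess


def run_with_retries(download_cmd, max_attempts=20, runner=subprocess.run):
    """Re-run download_cmd until it exits cleanly; returns the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        result = runner(download_cmd)
        if result.returncode == 0:
            return attempt
        # The 403 fix from item 1. It is unnecessary for the aria2 error 22 case,
        # where a plain re-run is enough, but clearing the cache does no harm.
        runner(["youtube-dl", "--rm-cache-dir"])
    raise RuntimeError(f"still failing after {max_attempts} attempts")


# Example call (assumes youtube-dl and aria2c are installed and on the PATH):
# run_with_retries([
#     "youtube-dl",
#     "--external-downloader", "aria2c",
#     "--external-downloader-args", "-c -j 10 -x 10 -s 10 -k 1M",
#     "-o", "F:/Libraries/Youtube Archive/%(uploader)s/%(upload_date)s-%(title)s.%(ext)s",
#     "https://www.youtube.com/c/jamesbruton/videos",
# ])
```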

PS: youtube-dl? It downloads from thousands of other sites too. 🙂