{"id":231,"date":"2021-12-11T16:25:15","date_gmt":"2021-12-12T00:25:15","guid":{"rendered":"https:\/\/varmintworks.com\/varmint-blog\/?p=231"},"modified":"2021-12-11T17:20:14","modified_gmt":"2021-12-12T01:20:14","slug":"youtube-dl-aria2","status":"publish","type":"post","link":"https:\/\/varmintworks.com\/varmint-blog\/2021\/12\/11\/youtube-dl-aria2\/","title":{"rendered":"UPDATED: youtube-dl + aria2 NOW: yt-dlp"},"content":{"rendered":"\n<p>Note: I started having trouble with this. It would appear they learned to throttle this as well. There is now a WAY better option that does all of this in one, and I finally have the holy grail I was looking for: archiving disappearing YouTube content before it disappears on me again!<\/p>\n\n\n\n<p>Tool: <strong>https:\/\/github.com\/yt-dlp\/yt-dlp<\/strong><\/p>\n\n\n\n<p>Sample command I use to download (from a batch file, hence the doubled %%): <\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">yt-dlp --write-info-json --download-archive \"F:\/Libraries\/Youtube Archive\/James Bruton\/archive.txt\" -N 4 -o \"F:\/Libraries\/Youtube Archive\/%%(uploader)s\/%%(upload_date)s-%%(title)s.%%(ext)s\" https:\/\/www.youtube.com\/c\/jamesbruton\/videos<br><\/pre>\n\n\n\n<p>Explained<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">yt-dlp \n   --write-info-json                                                          &lt;- Save each video's metadata to a .info.json file\n   --download-archive \"F:\/Libraries\/Youtube Archive\/James Bruton\/archive.txt\" &lt;- What's already been downloaded\n   -N 4                                                                       &lt;- Number of simultaneous fragment downloads\n   -o \"F:\/Libraries\/Youtube Archive\/%%(uploader)s\/%%(upload_date)s-%%(title)s.%%(ext)s\"  &lt;- Filename Format\n   https:\/\/www.youtube.com\/c\/jamesbruton\/videos                               &lt;- Link to the videos<\/pre>\n\n\n\n<p>I&#8217;ve been using it for months, reliably, unattended. It just works. 
#Beautiful <\/p>\n\n\n\n<figure class=\"wp-block-video\"><video height=\"1080\" style=\"aspect-ratio: 1920 \/ 1080;\" width=\"1920\" controls src=\"https:\/\/varmintworks.com\/varmint-blog\/wp-content\/uploads\/2021\/12\/2021-12-11-17-13-47.mp4\"><\/video><\/figure>\n\n\n\n<p><strong>Original Content Continues Here&#8230;<\/strong><\/p>\n\n\n\n<p>In the course of planning out some potential projects, I was about halfway through a video series when it disappeared from YouTube. In fact, I had even purchased a part I was planning on using for this project (a diffraction grating tube). And then it was gone. For 2 days. And then it came back. It doesn&#8217;t always come back.<\/p>\n\n\n\n<p>OK. Lesson learned. If I am at a point where I&#8217;m really &#8220;investing&#8221; in a video project: DOWNLOAD IT!!! I mean, we git clone anything we are even remotely interested in looking at. <strong>Where&#8217;s my git clone for YouTube?<\/strong> I do have a video downloader tool that I think is great for grabbing a video from *any* site (even https sites!). But I want a tool I can point at a channel that will:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Download a YouTuber&#8217;s entire video library, or all their playlists, or a single playlist, or just a video. I want some searching\/filtering to make it easy to &#8220;grab just the new videos since date x&#8221;<\/li><li>I want it to get me the best possible audio and video resolution by default, and I want to be able to even splice and resync audio\/video on the fly if the &#8220;best&#8221; A\/V wasn&#8217;t a default<\/li><li>I want to be able to grab *any* data element about that video at the command line: updated date, description, etc.<\/li><li>I want to be able to grab and associate any closed captioning in any or all languages associated with the video<\/li><\/ul>\n\n\n\n<p>Kewl wish list, eh? 
Go check out: <a href=\"https:\/\/github.com\/ytdl-org\/youtube-dl\">https:\/\/github.com\/ytdl-org\/youtube-dl<\/a> This tool does all that. And a ton more.<\/p>\n\n\n\n<p>I have used it a lot in the past and I love it. It just rocks. <\/p>\n\n\n\n<p>Until today&#8230;<\/p>\n\n\n\n<p>Today, I got throttled down to 50 kb\/s on some of the downloads&#8230; whether it was 1 or 100 downloads. Did some Googling. So, youtube-dl can easily be set to hand off the downloading from its own internal downloader to an external one. I love pipelines! The command looks a little ugly, I&#8217;ll admit. But it&#8217;s boilerplate. The only part you change is the URL at the end that you want to point it at. Cut\/paste your way to slick downloading. About aria2: <a href=\"https:\/\/aria2.github.io\/\">https:\/\/aria2.github.io\/<\/a> Even though a connection was being throttled to 50 kb\/s, there&#8217;s no reason I can&#8217;t have 10 connections at once going after the same file, just different segments. That got me to 500 kb\/s. It&#8217;s all I need. I don&#8217;t care if it&#8217;s slow. But if it takes weeks to complete, that&#8217;s time when the content could disappear. #balance<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">youtube-dl --external-downloader aria2c --external-downloader-args \"-c -j 10 -x 10 -s 10 -k 1M\" -o \"F:\/Libraries\/Youtube Archive\/%(uploader)s\/%(upload_date)s-%(title)s.%(ext)s\" https:\/\/www.youtube.com\/c\/jamesbruton\/videos<\/pre>\n\n\n\n<p>I know. Ugly. And powerful. It&#8217;s not that bad. 
What I want is a directory structure like:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">X:\n   \\Libraries\n      \\Youtube Archive\n         \\James Bruton\n            \\20190729-Completing the Robots' Mechanical Build.mp4\n             20190805-Simulating a Compliant Dampened Spring with Current Control.mp4\n             20190812-Building the Robots' Electronics.mp4\n             ...\n         \\Gear Down For What\n            \\20170127-50_1 Compound Planetary 3D Printable Gearbox.mp4\n             20170309-3D Printed Plastic Robotic Arm Demo.mp4<\/pre>\n\n\n\n<p>You get the idea. So, I wrote this:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">youtube-dl \n--external-downloader aria2c           &lt;- Use the aria2 downloader\n--external-downloader-args             &lt;- pass these args to aria2\n     \"-c                               &lt;- continue downloading a partially downloaded file\n      -j 10                            &lt;- maximum number of parallel downloads for every queue item\n      -x 10                            &lt;- maximum number of connections to 1 server for each download\n      -s 10                            &lt;- download a file using 10 connections\n      -k 1M\"                           &lt;- do not split ranges smaller than 2*SIZE bytes (2 MB here)\n-o \"F:\/Libraries\/Youtube Archive\/      &lt;- Set output template for youtube-dl archive location\n      %(uploader)s                     &lt;- Name of the uploader (channel owner) string template\n      \/                                &lt;- Literal directory separator\n      %(upload_date)s                  &lt;- Upload date for the file string template (YYYYMMDD)\n      -                                &lt;- Literal dash\n      %(title)s                        &lt;- Title of the video string template\n      .                                
&lt;- Literal period\n      %(ext)s\"                         &lt;- Video file type extension\nhttps:\/\/www.youtube.com\/c\/jamesbruton\/videos  &lt;- URL for youtube-dl to examine<\/pre>\n\n\n\n<p>If you want to use it in a batch file, just be a little careful about the % symbol that youtube-dl uses to start a substitution string. It has to be escaped with an extra %. It caused me grief. Here&#8217;s the batch file version of the command above:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">youtube-dl --external-downloader aria2c --external-downloader-args \"-c -j 10 -x 10 -s 10 -k 1M\" -o \"F:\/Libraries\/Youtube Archive\/%%(uploader)s\/%%(upload_date)s-%%(title)s.%%(ext)s\" https:\/\/www.youtube.com\/c\/GearDownForWhat\/videos<\/pre>\n\n\n\n<p>Final note: I&#8217;ve dealt with 2 errors:<\/p>\n\n\n\n<ol class=\"wp-block-list\"><li>youtube-dl occasionally exits with a 403 error. Apparently it&#8217;s a known issue. Clear the youtube-dl cache with &#8220;youtube-dl --rm-cache-dir&#8221;, and re-run the download command. It&#8217;ll skip anything it&#8217;s already done. In 600 videos, I had to do this 10-15 times?<\/li><li>aria2 occasionally aborts with error 22 (HTTP response header was bad or unexpected). Not much I can do about that! Re-run the download command. This one isn&#8217;t as frequent as the youtube-dl 403 issue.<\/li><\/ol>\n\n\n\n<p>PS: youtube-dl? It downloads from thousands of other sites too. \ud83d\ude42 <\/p>\n","protected":false},"excerpt":{"rendered":"<p>Note: I started having trouble with this. It would appear they learned to throttle this as well. 
There is now a WAY better option that does all of this in one, and I finally have the holy grail I was looking for: archiving disappearing YouTube content before it disappears on [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":234,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-231","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"jetpack_featured_media_url":"https:\/\/i0.wp.com\/varmintworks.com\/varmint-blog\/wp-content\/uploads\/2021\/10\/pngaaa.com-808842.png?fit=481%2C401&ssl=1","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/varmintworks.com\/varmint-blog\/wp-json\/wp\/v2\/posts\/231","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/varmintworks.com\/varmint-blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/varmintworks.com\/varmint-blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/varmintworks.com\/varmint-blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/varmintworks.com\/varmint-blog\/wp-json\/wp\/v2\/comments?post=231"}],"version-history":[{"count":13,"href":"https:\/\/varmintworks.com\/varmint-blog\/wp-json\/wp\/v2\/posts\/231\/revisions"}],"predecessor-version":[{"id":252,"href":"https:\/\/varmintworks.com\/varmint-blog\/wp-json\/wp\/v2\/posts\/231\/revisions\/252"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/varmintworks.com\/varmint-blog\/wp-json\/wp\/v2\/media\/234"}],"wp:attachment":[{"href":"https:\/\/varmintworks.com\/varmint-blog\/wp-json\/wp\/v2\/media?parent=231"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/varmintworks.com\/varmint-blog\/wp-json\/wp\/v2\/categories?post=231"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/varmintwo
rks.com\/varmint-blog\/wp-json\/wp\/v2\/tags?post=231"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}