See linked posting. I've commented there with a link to a CLI tool in Python that allows downloading of IA collections. I've submitted a patch to enable specifying start and end points so that it's easier to resume downloading a huge collection, or to allow multiple people to split up the work.
https://archive.org/details/georgeblood
https://archive.org/details/78rpm_bowling_green
F*ck the RIAA and absurdly long copyright.
EDIT: There is more than one collection of 78s on IA, so I updated the title.
The issue with these collections is that they're absolutely HUGE. And yes, IA offers torrents for them, but as a separate torrent for every. single. album. And the torrents have all data in them -- FLAC, fixed-rate MP3, VBR MP3, PDF liner notes, etc. etc... There may be some extremely hardcore data-hoarders out there who want everything, but IMHO, as these are scratchy old 78 records, FLAC is overkill just to save the audio in a listenable format. The George Blood collection, just the VBR MP3s, is looking to be about 6TB. With ALL data it might be over 40TB! I can't afford that many hard drives :)
So, my approach at the moment is to save just the VBR MP3s (they seem to be done at up to 320kbps VBR) and the JPEG album cover. If I have a chance and any storage left afterwards, I can make a separate pass to get the album liner PDFs...
Tool used: https://github.com/jjjake/internetarchive
Patch to allow setting start and end item indices for downloads: https://github.com/jjjake/internetarchive/pull/605
Example usage to grab just the VBR MP3 and record label JPG for each (note the --start-idx and --end-idx arguments):
#ia download --start-idx=4001 --end-idx=8000 -a -i --format="VBR MP3" --format="JPEG" --search collection:georgeblood
I'm going to concentrate on the George Blood collection for now. I'm starting at item 1. It would be great if others started at index 50,000, 100,000, 150,000, ... and others started at the end and worked backwards in similarly sized chunks, so that we can be sure someone gets each of them.
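If it helps with splitting up the work, here's a rough Python sketch that carves an index range into chunks and prints one `ia download` command per chunk. Some assumptions on my part: the indices are treated as 1-based and inclusive (which is how I read the example above), the total item count is a placeholder you'd have to supply yourself, and `--start-idx` / `--end-idx` only exist if you're running a build with the patch from the pull request linked above. The helper names `chunk_ranges` and `build_command` are made up for illustration.

```python
import shlex


def chunk_ranges(total_items, chunk_size):
    """Yield (start, end) pairs, 1-based and inclusive, covering total_items.

    Assumption: the patched CLI treats both indices as inclusive.
    """
    if total_items < 0 or chunk_size <= 0:
        raise ValueError("total_items must be >= 0 and chunk_size must be > 0")
    start = 1
    while start <= total_items:
        end = min(start + chunk_size - 1, total_items)
        yield start, end
        start = end + 1


def build_command(start, end, collection="georgeblood"):
    """Build the shell command for one chunk, mirroring the example above."""
    args = [
        "ia", "download",
        f"--start-idx={start}",   # flag from the patch, not stock `ia`
        f"--end-idx={end}",       # flag from the patch, not stock `ia`
        "-a", "-i",
        "--format=VBR MP3",
        "--format=JPEG",
        "--search", f"collection:{collection}",
    ]
    # shlex.join quotes arguments that contain spaces (Python 3.8+).
    return shlex.join(args)


if __name__ == "__main__":
    # Placeholder total: replace with the real number of items in the collection.
    hypothetical_total = 200_000
    for start, end in chunk_ranges(hypothetical_total, 50_000):
        print(build_command(start, end))
```

Each person can then claim one printed line and run it; if a download dies partway through, rerun with a smaller range starting where it stopped.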
Or a renewal step. If it's not worth renewing, let it into the public domain.
This is why It's A Wonderful Life became a Christmas classic. Because it was in the public domain, it was used as late-night filler.
The MPAA and RIAA miss the point. If It's A Wonderful Life had still been under copyright, it wouldn't have become a classic.
It's like the concept of abandonware. If video games had a large copyright clearinghouse like the MPAA or RIAA, abandonware wouldn't work, and the abandoned media would just disappear. Heck, non-abandoned media also disappears, because profits don't reward preservation.
Ok but then how will my ~~kids~~ record company benefit into perpetuity?
In all seriousness, I think copyright law is the best example of how captured our government is to large corporate interests.