There are tons of books you can borrow on archive.org.
I’ve been purging my paper collection and moving to ebooks, and while there are a lot of books I can replace with Kindle and PDFs, a fair number of books never made it to ebook form.
archive.org has an offering where they’ve scanned zillions of books and made them available for check-out on a free, renewable lending basis. But this service has four key limitations:
- It’s using the archive.org page-turning interface which is not as pleasant as having an actual PDF you can use in your favorite software.
- You can’t mark up or bookmark the book and keep those notes for the future.
- Only works when you’re connected to the Internet. Going on a cruise or camping? You’re out of luck.
- And it might not be around much longer. archive.org already lost round one against a gang of publishers who are angry that books they have no interest in republishing can be checked out of a library.
Code to the Rescue
Fortunately, there is a way you can download these books as PDFs. All you need is a little JavaScript and the ability to pay close attention to instructions.
First, head over to this GitHub repository, which has full instructions.
Some advice:
- Use Firefox as your browser. It works consistently.
- Uncheck “Always ask you where to save files”. You’ll be downloading a couple hundred or more files and you don’t want to hit return for each one.
- Zoom in on the image after you check the book out, and do it at least two times. I usually do 4. Otherwise you’re going to get tiny JPGs that are fuzzy when you try to read them.
- Follow the instructions closely. It probably won’t work the first time, and when you go back over the instructions you’ll realize you missed a small step.
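Since under-zoomed pages come out as tiny, fuzzy JPGs, it can help to scan the download folder for suspiciously small files before going further. Here is a minimal sketch of that check. The function name find_small_jpgs and the 100 KB default threshold are my own assumptions, not part of the GitHub instructions, so tune the number to whatever your good pages look like.

```python
import os


def find_small_jpgs(directory, min_bytes=100_000):
    """Return the sorted names of .jpg files in `directory` under `min_bytes`.

    The 100 KB default is an arbitrary, assumed threshold; pick a value
    based on the size of pages you know were downloaded at full zoom.
    """
    small = []
    for fname in os.listdir(directory):
        path = os.path.join(directory, fname)
        # Only consider regular files with a .jpg extension.
        if not fname.lower().endswith(".jpg") or not os.path.isfile(path):
            continue
        if os.path.getsize(path) < min_bytes:
            small.append(fname)
    return sorted(small)
```

Calling find_small_jpgs("my_book") returns a list of filenames worth a second look. An empty list means nothing fell under the threshold, not that every page is sharp.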
Once you have all the JPGs, you can assemble them into a PDF in various ways. Here’s a quick Python script that does it with the img2pdf module (installable with pip install img2pdf). Just save all the JPGs into one folder and call the script as:
make_pdf.py <directory name>
Code:
#!/usr/bin/python3
# Assemble every .jpg in a directory into a single PDF using img2pdf.

import os
import subprocess
import sys

import img2pdf


def fail(message):
    """Print an error message and exit with a non-zero status."""
    print("%s\n" % message)
    sys.exit(1)


if len(sys.argv) != 2:
    fail("Usage: make_pdf.py <directory>")

# Strip any trailing slash so the PDF name is derived cleanly.
img_dir = sys.argv[1].rstrip("/")
if not os.path.isdir(img_dir):
    fail("ERROR: directory '%s' does not exist" % img_dir)
print("%-30s: %s" % ("Directory", img_dir))

pdf_name = "%s.pdf" % img_dir
print("%-30s: %s" % ("PDF to Create", pdf_name))

# Collect the .jpg files (skipping subdirectories) in sorted filename order.
images = []
for fname in os.listdir(img_dir):
    if not fname.endswith(".jpg"):
        continue
    path = os.path.join(img_dir, fname)
    if os.path.isdir(path):
        continue
    images.append(path)
images.sort()

# Bail out early instead of crashing on images[0] below.
if not images:
    fail("ERROR: no .jpg files found in '%s'" % img_dir)

print("%-30s: %d" % ("Num Images", len(images)))
print("%-30s: %s" % ("First Image", images[0]))
print("%-30s: %s" % ("Last Image", images[-1]))

with open(pdf_name, "wb") as f:
    f.write(img2pdf.convert(images))

# Report the size of the finished PDF (argument list avoids shell quoting issues).
subprocess.run(["du", "-sh", pdf_name])
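One thing to watch: the script orders pages with a plain images.sort(), which is alphabetical. If the downloaded filenames are numbered without zero-padding, page 10 will land before page 2. I don’t know for certain how your files will be named, so treat the following as a hedged sketch of a "natural" sort key; the name natural_key and the sample filenames are made up for illustration.

```python
import re


def natural_key(path):
    """Build a sort key where runs of digits compare as integers.

    re.split with a capturing group alternates text and digit chunks,
    so odd-indexed chunks are always the numeric ones.
    """
    chunks = re.split(r"(\d+)", path)
    return [int(c) if i % 2 else c.lower() for i, c in enumerate(chunks)]


# Hypothetical filenames; the real download names may differ.
names = ["book/page10.jpg", "book/page2.jpg", "book/page1.jpg"]
print(sorted(names))                   # alphabetical: page10 sorts before page2
print(sorted(names, key=natural_key))  # numeric: page1, page2, page10
```

To use it, replace images.sort() in the script with images.sort(key=natural_key). If your filenames are already zero-padded, both orderings come out the same and you can skip this.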