There are tons of books you can borrow on archive.org.
I’ve been purging my paper collection and moving to ebooks, and while there are a lot of books I can replace with Kindle editions and PDFs, a fair number of books never made it to ebook form.
archive.org has an offering where they’ve scanned zillions of books and make them available for check-out on a free renewable lending basis. But this service has four key limitations:
- It’s using the archive.org page-turning interface which is not as pleasant as having an actual PDF you can use in your favorite software.
- You can’t mark up or bookmark the book and keep those notes for the future.
- Only works when you’re connected to the Internet. Going on a cruise or camping? You’re out of luck.
- And it might not be around much longer. archive.org already lost round one against a gang of publishers who are angry that books they have no interest in republishing can be checked out of a library.
Code to the Rescue
Fortunately, there is a way you can download these books as PDFs. All you need is a little JavaScript and the ability to pay close attention to instructions.
First, head over to this GitHub, which has full instructions.
Some advice:
- Use Firefox as your browser. It works consistently.
- Uncheck “Always ask you where to save files”. You’ll be downloading a couple hundred or more files and you don’t want to hit return for each one.
- Zoom in on the image after you check the book out, and do it at least two times. I usually do 4. Otherwise you’re going to get tiny JPGs that are fuzzy when you try to read them.
- Follow the instructions closely. It probably won’t work for you the first time, and when you go over the instructions again you’ll realize you missed a small step.
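Before assembling anything, it can be worth checking that no page came out far smaller than the rest, which may mean it was saved at a low zoom level. The following is a minimal sketch and not part of the GitHub instructions: the `find_small_jpgs` helper and the 50% threshold are assumptions made for illustration, and file size is only a rough stand-in for image resolution (blank pages and covers can be legitimately small).

```python
import os
import statistics

def find_small_jpgs(img_dir, ratio=0.5):
    """Return the names of .jpg files smaller than ratio * the median file size.

    Hypothetical helper: the 0.5 default is an arbitrary starting point.
    """
    sizes = {}
    for fname in os.listdir(img_dir):
        path = os.path.join(img_dir, fname)
        if fname.lower().endswith(".jpg") and os.path.isfile(path):
            sizes[fname] = os.path.getsize(path)
    if not sizes:
        return []
    median = statistics.median(sizes.values())
    # Anything well below the typical page size is worth a second look
    return sorted(name for name, size in sizes.items() if size < ratio * median)
```

Calling `find_small_jpgs("my_book")` on your download folder gives a list of pages to eyeball before you build the PDF.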
Once you have all the JPGs, you can assemble them into a PDF in various ways. Here’s a quick Python script that does it with the img2pdf module. Just save all the JPGs into one folder and call the script as:
make_pdf.py <directory name>
Code:
#!/usr/bin/python3
import img2pdf, os, re, sys

def fail(message):
    # Print an error message and exit with a non-zero status
    print("%s\n" % (message))
    sys.exit(1)

if len(sys.argv) != 2:
    fail("Usage: make_pdf.py <directory>")

img_dir = sys.argv[1]
img_dir = re.sub('/$', '', img_dir)  # drop a trailing slash so the PDF name is clean
if not os.path.isdir(img_dir):
    fail("ERROR: directory '%s' does not exist" % (img_dir))
print("%-30s: %s" % ("Directory", img_dir))

pdf_name = "%s.pdf" % (img_dir)
print("%-30s: %s" % ("PDF to Create", pdf_name))

# Collect every .jpg file in the directory, skipping subdirectories
images = []
for fname in os.listdir(img_dir):
    if not fname.endswith(".jpg"):
        continue
    path = os.path.join(img_dir, fname)
    if os.path.isdir(path):
        continue
    images.append(path)
if not images:
    fail("ERROR: no .jpg files found in '%s'" % (img_dir))

images.sort()  # pages go into the PDF in filename order
print("%-30s: %d" % ("Num Images", len(images)))
print("%-30s: %s" % ("First Image", images[0]))
print("%-30s: %s" % ("Last Image", images[-1]))

# Write all images into a single PDF named after the directory
with open(pdf_name, "wb") as f:
    f.write(img2pdf.convert(images))

# Report the size of the finished PDF without shelling out to du
print("%-30s: %.1f MB" % ("PDF Size", os.path.getsize(pdf_name) / 1e6))
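One thing to watch: `images.sort()` sorts the names as text. If your downloaded files are numbered without zero padding, a name like `page10.jpg` sorts before `page2.jpg` and the PDF pages come out shuffled. Whether this applies depends on how your files are actually named, so treat the `page2.jpg` style names below as hypothetical. Here is a minimal sketch of a natural sort key (the `natural_key` name is made up for this example) that could be passed as `images.sort(key=natural_key)`:

```python
import re

def natural_key(path):
    """Split a name into text and number chunks so 'page10' sorts after 'page2'."""
    parts = re.split(r"(\d+)", path)
    # re.split with a capture group puts the digit runs at odd indexes
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]

names = ["page10.jpg", "page2.jpg", "page1.jpg"]
print(sorted(names))                   # ['page1.jpg', 'page10.jpg', 'page2.jpg']
print(sorted(names, key=natural_key))  # ['page1.jpg', 'page2.jpg', 'page10.jpg']
```

If your files are already zero padded (`0001.jpg`, `0002.jpg`, and so on), the plain sort is fine and you can skip this.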