Catholic Info

Traditional Catholic Faith => Computers, Technology, Websites => Topic started by: Cryptinox on February 21, 2022, 11:15:57 PM

Title: Copying text from archive.org books?
Post by: Cryptinox on February 21, 2022, 11:15:57 PM
I am curious whether there is a way to get all the text of an Archive.org book on one page so I can copy it without flipping through every single page of the book. I want to do this so I can copy How Christ Said the First Mass and make an audio version to listen to with a tolerable text-to-speech program.
Title: Re: Copying text from archive.org books?
Post by: Mark 79 on February 21, 2022, 11:19:36 PM
https://archive.org/details/howchristsaidfir00meag

Scroll down, right sidebar "Download options," pick your choice.
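
For anyone who would rather script the download than click through the sidebar, here is a minimal Python sketch. It assumes the plain-text file follows the common Archive.org naming pattern "<identifier>_djvu.txt" under the item's download folder. That pattern is an assumption, so check the file list for the item if the request fails. The function names are made up for illustration.

```python
from urllib.request import urlopen

# Base of Archive.org's per-item download folder.
ARCHIVE_DOWNLOAD_BASE = "https://archive.org/download"


def full_text_url(identifier: str) -> str:
    """Build the assumed URL of an item's plain-text (OCR) file."""
    # Assumption: the full-text file is named "<identifier>_djvu.txt".
    return f"{ARCHIVE_DOWNLOAD_BASE}/{identifier}/{identifier}_djvu.txt"


def fetch_full_text(identifier: str, timeout: float = 30.0) -> str:
    """Download the text file and return it as a string (needs network access)."""
    with urlopen(full_text_url(identifier), timeout=timeout) as response:
        # OCR output can contain odd bytes, so replace anything undecodable.
        return response.read().decode("utf-8", errors="replace")
```

Calling fetch_full_text("howchristsaidfir00meag") should return the whole book as one string, which can then be saved to a file and handed to a text-to-speech program.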
Title: Re: Copying text from archive.org books?
Post by: Cryptinox on February 22, 2022, 12:08:19 AM
Thanks. Exactly what I was looking for.
Title: Re: Copying text from archive.org books?
Post by: B from A on February 22, 2022, 07:39:10 AM
It looks like a photographed book.  Is there an easy way to change that to text, or straight to speech from photo?  Or am I wrong about it being a photo? 

Title: Re: Copying text from archive.org books?
Post by: Marion on February 22, 2022, 07:58:25 AM
The PDF contains both page images and text.
Title: Re: Copying text from archive.org books?
Post by: Mithrandylan on February 22, 2022, 10:54:01 AM
You will need to invest in OCR software. Then you can process the page images into searchable text files. Adobe is one option. If you prefer non-SaaS options, ABBYY FineReader is very good. I think they still provide a perpetual license.
Title: Re: Copying text from archive.org books?
Post by: Cryptinox on February 22, 2022, 11:06:50 AM
Archive.org has the text file available for download.
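
The downloaded text file usually keeps the printed book's line breaks, end-of-line hyphens, and stray page numbers, which a text-to-speech program will stumble over. Below is a rough Python sketch of a cleanup pass. The function name and the rules are my own assumptions about what typical OCR output looks like, not anything Archive.org provides, so adjust them after looking at the actual file.

```python
import re


def clean_for_speech(raw: str) -> str:
    """Tidy OCR text so a text-to-speech program reads it more smoothly."""
    # Normalize Windows and old Mac line endings to plain newlines.
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Rejoin words split by a hyphen at the end of a line: "sacri-\nfice" -> "sacrifice".
    # Caveat: this also joins real compounds that happened to break at a line end.
    text = re.sub(r"(?<=\w)-\n(?=\w)", "", text)
    cleaned = []
    # Blank lines are treated as paragraph boundaries.
    for paragraph in re.split(r"\n\s*\n", text):
        # Collapse line breaks and repeated spaces inside the paragraph.
        paragraph = " ".join(paragraph.split())
        # Skip empty chunks and chunks that are only digits (likely page numbers).
        if not paragraph or paragraph.isdigit():
            continue
        cleaned.append(paragraph)
    return "\n\n".join(cleaned)
```

For example, clean_for_speech("The sacri-\nfice of the\nMass.\n\n12\n\nNext para-\ngraph.") gives two paragraphs, "The sacrifice of the Mass." and "Next paragraph.", with the page number 12 dropped.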