this post was submitted on 05 Jul 2025

31 points (94.3% liked)

Are there any tools I can use for translating a ~400-page scanned book? (piefed.social)

submitted 22 hours ago* (last edited 22 hours ago) by morto@piefed.social to c/asklemmy@lemmy.world

Situation: I have a scanned book in Chinese that I'd like to read, and no translation is available. I really want to read it, because it would probably help a lot with my university project.

What I tried: I tried creating a version with OCR to get a text layer and then running a translation tool on it, but found no way to make the OCR text visible. I also tried this tool, but the OCR didn't work for me, and I found no way to use it with a local model.

Have any of you ever done a similar task? I'd appreciate any kind of suggestions and tips.

[–] starlinguk@lemmy.world 1 points 8 hours ago

Yes. Pay a translator.

[–] andrew0@lemmy.dbzer0.com 5 points 17 hours ago (1 children)

If you find that OCR doesn't get you very far, maybe try a small VLM (vision-language model) to parse PNGs of the pages. For example, Nanonets OCR will do this, although it's quite slow if you don't have a GPU. It will give you a Markdown version of each page, which you can then translate with another tool.

PaddleOCR might also be useful, since it focuses on Chinese, but it's more difficult to set up. To add to this, some other options are MinerU and Mistral OCR (this one is paid, but you can test it for free by uploading the file to Mistral's library).
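A rough sketch of the page-to-PNG step mentioned above, in case it helps. It assumes poppler's `pdftoppm` is installed and on the PATH; the function names, `book.pdf`, and the `pages/` directory are placeholders made up for illustration, not part of any tool named in this thread.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftoppm_cmd(pdf_path, out_prefix, dpi=300):
    """Build the pdftoppm command that renders every PDF page to a PNG."""
    # -png selects PNG output, -r sets the resolution in DPI.
    return ["pdftoppm", "-png", "-r", str(dpi), str(pdf_path), str(out_prefix)]


def render_pages(pdf_path, out_dir, dpi=300):
    """Render all pages into out_dir and return the PNG paths, sorted by name."""
    if shutil.which("pdftoppm") is None:
        raise RuntimeError("pdftoppm (poppler-utils) was not found on PATH")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # pdftoppm writes files named <prefix>-<page number>.png
    subprocess.run(build_pdftoppm_cmd(pdf_path, out_dir / "page", dpi), check=True)
    return sorted(out_dir.glob("page-*.png"))


if __name__ == "__main__":
    # Hypothetical file names; replace with your own.
    for png in render_pages("book.pdf", "pages"):
        print(png)
```

Each PNG can then be handed to whichever OCR or VLM tool you settle on. A higher DPI tends to help with small characters, at the cost of bigger files and slower processing.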

[–] morto@piefed.social 3 points 15 hours ago (1 children)

That PaddleOCR looks very interesting. It will even extract images and formulas and somewhat preserve formatting in the output! I will try this one, even if it takes more than a day to process the book with my low-end CPU. Thank you for the suggestion!

[–] andrew0@lemmy.dbzer0.com 1 points 11 hours ago

Be warned that their docs are so-so. Nanonets OCR, Mistral OCR and MinerU will also extract formulas and images.

One other tool I forgot to mention is Docling. This one is quite quick to set up in a Docker container, and will have a web interface ready to go where you can upload documents. It sort of follows the PaddleOCR pipeline, but also allows you to use VLMs.

Good luck!

[–] bitofarambler@crazypeople.online 7 points 21 hours ago (1 children)

I did this with a Chinese book, but I have to check what I used.

The translation was entirely readable.

I think I used Tesseract.

No, gImageReader!

That was it.

Tesseract was also very straightforward, but gImageReader had a GUI, and all I had to do was import the file and then click export and it did the whole thing.

[–] morto@piefed.social 3 points 15 hours ago (1 children)

I used Tesseract, but the output PDF didn't have visible text, and I found no way to change it. Maybe I don't know how to properly use it, or it's not intended to keep formatting.
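On the invisible-text problem: Tesseract's PDF output is a searchable PDF, where the recognized text sits in a hidden layer over the page image, so not seeing it is expected. For translation, plain-text output is usually easier to work with. A minimal sketch, assuming the `tesseract` CLI and the `chi_sim` (Simplified Chinese) language data are installed; swap in `chi_tra` for Traditional Chinese. The function names and file paths are hypothetical.

```python
import subprocess
from pathlib import Path


def build_tesseract_cmd(image_path, lang="chi_sim"):
    """Build a tesseract call that prints recognized text to stdout."""
    # Using "stdout" as the output base prints the text instead of writing a file.
    return ["tesseract", str(image_path), "stdout", "-l", lang]


def ocr_image(image_path, lang="chi_sim"):
    """Run tesseract on one page image and return the recognized text."""
    result = subprocess.run(
        build_tesseract_cmd(image_path, lang),
        check=True,
        capture_output=True,
        encoding="utf-8",
    )
    return result.stdout


def ocr_pages(image_paths, out_path, lang="chi_sim"):
    """OCR each page in order and write one UTF-8 text file with page markers."""
    with Path(out_path).open("w", encoding="utf-8") as out:
        for number, image_path in enumerate(image_paths, start=1):
            out.write(f"\n\n=== page {number} ===\n\n")
            out.write(ocr_image(image_path, lang))


if __name__ == "__main__":
    # Hypothetical input: page images in a "pages" directory.
    ocr_pages(sorted(Path("pages").glob("*.png")), "book.txt")
```

The resulting `book.txt` is visible, copyable text that can be pasted or piped into a translation tool. It will not keep the page layout.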

[–] bitofarambler@crazypeople.online 1 points 14 hours ago* (last edited 12 hours ago)

Try gImageReader.

It's a frontend to Tesseract and is more workable via its GUI and option menus.

Load the file, execute the program.

That's all I had to do for a successful OCR.

[–] BurgerBaron@piefed.social 3 points 19 hours ago (1 children)

This is more intended for real time usage, but might work for you:

https://github.com/Artikash/Textractor

https://github.com/Crivella/ocr_translate

I watch Macaw45 play full-fledged Japanese retro RPGs using Textractor; it'd probably be good for books too.

[–] morto@piefed.social 3 points 15 hours ago

Thanks for the suggestions. That OCR_translate looks interesting. I will prioritize other recommended tools that seem to be more focused on books, but I bookmarked it for future needs.

[–] BlameThePeacock@lemmy.ca 4 points 21 hours ago (3 children)

You can literally just feed the images into ChatGPT at this point.

[–] morto@piefed.social 3 points 15 hours ago

I'm giving preference to open source tools, but that's a good thing to know, thanks

[–] mesamunefire@piefed.social 2 points 17 hours ago

Every time I've done it, the result has been pretty bad. OCR is much better.

[–] thebestaquaman@lemmy.world 2 points 20 hours ago (1 children)

This doesn't work once the PDF reaches a certain max size.

[–] BlameThePeacock@lemmy.ca 3 points 19 hours ago

You could just break it up into chapters or something; it's pretty easy to split a PDF.
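The splitting step could be sketched like this. It uses fixed-size page ranges rather than real chapter boundaries (an assumption, since the chapter page numbers aren't known here) and builds `qpdf` commands. `qpdf` has to be installed separately, and the function and file names are placeholders.

```python
import subprocess


def chunk_ranges(total_pages, pages_per_chunk):
    """Return 1-based inclusive (first, last) page ranges covering the document."""
    if total_pages < 1 or pages_per_chunk < 1:
        raise ValueError("total_pages and pages_per_chunk must be positive")
    return [
        (start, min(start + pages_per_chunk - 1, total_pages))
        for start in range(1, total_pages + 1, pages_per_chunk)
    ]


def build_qpdf_cmd(src, first, last, dst):
    """Build a qpdf command that copies pages first..last of src into dst."""
    return ["qpdf", "--empty", "--pages", str(src), f"{first}-{last}", "--", str(dst)]


def split_pdf(src, total_pages, pages_per_chunk, prefix="part"):
    """Write one PDF per page range and return the output file names."""
    outputs = []
    for index, (first, last) in enumerate(chunk_ranges(total_pages, pages_per_chunk), 1):
        dst = f"{prefix}-{index:02d}.pdf"
        subprocess.run(build_qpdf_cmd(src, first, last, dst), check=True)
        outputs.append(dst)
    return outputs


if __name__ == "__main__":
    # Hypothetical: a 400-page scan split into 50-page pieces.
    print(split_pdf("book.pdf", total_pages=400, pages_per_chunk=50))
```

Smaller pieces also make it easier to retry a single part if an upload or a processing run fails.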

[–] gramie@lemmy.ca 1 points 19 hours ago (1 children)

Would Google Lens work? Take a picture of each page and feed it to the Google Translate engine. It might be the easiest way.

[–] morto@piefed.social 2 points 15 hours ago

I'm not sure if it would be viable for a long book, and I'm also avoiding Google, but thanks for helping. I got some nice suggestions in this thread.

[–] lemmyuser68@sopuli.xyz 1 points 21 hours ago (1 children)

NotebookLM (Google)

[–] morto@piefed.social 2 points 15 hours ago

Well, I'm avoiding Google, but I will keep it in mind as a last, last resort, thanks.