-
It is technically already possible to translate files. Another possibility would be to have a separate Python script that reads images from a folder, sends the requests to the server, and then uses the responses to draw the textboxes on the images and save them (a rough sketch follows below). This is beyond the scope of this tool/repo and, if ever done, will probably be done in a separate repository.
As far as PDFs go, this would either have to be done on the side of the extension, which right now is not really capable of detecting images in a PDF. It might be worth for the future to add a
The POST requests are not made to the image hosting site, but from the browser extension to the ocr_translate server. The thing that could be improved right now is that the extension does not take into account all the possible JS that could inject images into the page, and in some cases it can stop images from loading.
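A minimal sketch of such a folder-based script, assuming a locally running ocr_translate server. The endpoint path, payload fields, and response shape below are hypothetical placeholders and would need to be checked against the actual server API:

```python
import base64
from pathlib import Path

import requests
from PIL import Image, ImageDraw

# Hypothetical endpoint; the real ocr_translate route and payload may differ.
SERVER_URL = "http://127.0.0.1:4000/run_tsl"


def translate_folder(src: Path, dst: Path) -> None:
    """Send every image in `src` to the server and save annotated copies in `dst`."""
    dst.mkdir(parents=True, exist_ok=True)
    for path in sorted(src.glob("*.png")):
        payload = {"contents": base64.b64encode(path.read_bytes()).decode()}
        resp = requests.post(SERVER_URL, json=payload, timeout=120)
        resp.raise_for_status()
        # Assumed response shape: list of {"box": [left, top, right, bottom], "text": "..."}
        boxes = resp.json().get("result", [])

        img = Image.open(path).convert("RGB")
        draw = ImageDraw.Draw(img)
        for item in boxes:
            left, top, right, bottom = item["box"]
            draw.rectangle([left, top, right, bottom], fill="white", outline="black")
            draw.text((left + 2, top + 2), item["text"], fill="black")
        img.save(dst / path.name)


if __name__ == "__main__":
    translate_folder(Path("raw_pages"), Path("translated_pages"))
```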
-
Sorry, I thought that the app was getting images from the website via POST requests.
Exactly so, the site I am trying to access does a similar thing. Either the extension is already blocked, or even if somehow reloading the website gets the extension to try to fetch the images, the website just becomes completely blank. I have even tried disabling the right-click protection, but nothing worked. Will open an issue on the extension repo. Apart from this, the project is really good! I am thinking of building a manga translator myself, on a much smaller scale though. I've come across an interesting paper proposing a multimodal context-aware translation framework that does not translate based only on the text but also takes the image context into account!
-
If you can point me to the site, I will try to have a look into it.
That seems interesting. On my todo list I want to introduce context-aware translation by extracting context with something like a CLIP model and then using a translation tool, such as an LLM, that can take context into account. The difficult thing, which I do not think is trivial with the way this server works, is keeping context between images (I do not think there is a surefire way to know which images are tied together without introducing batch translation). Will move this to a discussion as it seems more appropriate.
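As a rough illustration of that idea (not anything the server currently does): a zero-shot CLIP pass could tag each page with a scene description that gets prepended to the translation prompt. The scene labels and prompt wording below are placeholder assumptions:

```python
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Placeholder scene labels; a real system would need a much richer taxonomy.
SCENE_LABELS = [
    "a comedic scene",
    "a fight scene",
    "a romantic conversation",
    "a classroom setting",
]

clip_model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
clip_processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")


def describe_page(image_path: str) -> str:
    """Return the scene label CLIP scores highest for the given page."""
    image = Image.open(image_path).convert("RGB")
    inputs = clip_processor(text=SCENE_LABELS, images=image, return_tensors="pt", padding=True)
    logits = clip_model(**inputs).logits_per_image
    return SCENE_LABELS[logits.softmax(dim=-1).argmax().item()]


def build_prompt(image_path: str, source_text: str) -> str:
    """Assemble a context-aware prompt for a translation-capable LLM (not called here)."""
    context = describe_page(image_path)
    return (
        f"Scene context: {context}\n"
        f"Translate the following manga dialogue into English, keeping the tone of the scene:\n"
        f"{source_text}"
    )
```

Keeping context across pages would still require some notion of which images belong to the same chapter, which is exactly the batching problem mentioned above.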
-
Translation of local files should absolutely be possible by slightly modifying the existing script, I think. But it would be great to also have the option to translate a local PDF if it is loaded into the browser, skipping the whole HTTPS request part in that scenario. Many websites do not support accessing manga images via POST requests and actively block it. In those cases we can simply download the images with a different script and feed them to the models individually, in a batch, or as a PDF (see the sketch below for the PDF case).
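For the local-PDF case, the embedded images could be pulled out first and then pushed through the same folder-based script as above. A sketch using PyMuPDF, with arbitrary file names and output layout:

```python
from pathlib import Path

import fitz  # PyMuPDF


def extract_pdf_images(pdf_path: str, out_dir: str = "raw_pages") -> list[Path]:
    """Dump every embedded image in the PDF so it can be fed to the translation script."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    doc = fitz.open(pdf_path)
    for page_index, page in enumerate(doc):
        for img_index, img in enumerate(page.get_images(full=True)):
            xref = img[0]
            info = doc.extract_image(xref)
            path = out / f"page{page_index:03d}_{img_index}.{info['ext']}"
            path.write_bytes(info["image"])
            saved.append(path)
    return saved
```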