Can I upload PDFs for translation? | Communitytranslation

Post by SussexSoleil on Feb 7, 2017 22:17:28 GMT

Please don't!

Firstly, there is a charge for this service.

Secondly, the results are not good.

SmartCAT treats PDFs (and some other document formats; please read the Help) as images. It then carries out OCR (optical character recognition) on the document to extract the text.

OCR is not totally reliable. Depending on the typeface used, it can't easily tell the difference between "i" and "l", or "rn" and "m", for instance. The technology relies on specialist dictionaries to interpret what it thinks it sees and clean up the text to the most likely words.
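
To give a rough idea of what that dictionary clean-up involves, here is a minimal Python sketch. It is not how SmartCAT or any particular OCR engine actually works; the confusion table, the tiny word list and the function names are all made up for illustration, and it assumes lowercase words.

```python
from itertools import product

# Hypothetical table of look-alike glyphs: what the OCR printed -> what it might be.
CONFUSIONS = {
    "rn": ["rn", "m"],
    "m": ["m", "rn"],
    "i": ["i", "l"],
    "l": ["l", "i"],
}


def candidates(word):
    """Yield every spelling obtained by swapping confusable glyphs in word."""
    pieces = []
    i = 0
    while i < len(word):
        if word[i:i + 2] == "rn":  # treat "rn" as a single confusable unit
            pieces.append(CONFUSIONS["rn"])
            i += 2
        else:
            ch = word[i]
            pieces.append(CONFUSIONS.get(ch, [ch]))
            i += 1
    for combo in product(*pieces):
        yield "".join(combo)


def correct(word, dictionary):
    """Return the first candidate found in the dictionary, else the word unchanged."""
    if word in dictionary:
        return word
    for cand in candidates(word):
        if cand in dictionary:
            return cand
    return word  # unknown word: left for a human to sort out


# Illustrative word list only; a real one would be far larger.
WORDS = {"modern", "translation", "burn"}

fixed_rn = correct("rnodern", WORDS)      # "rn" misread for "m"
fixed_il = correct("translatlon", WORDS)  # "l" misread for "i"
unknown = correct("qwzx", WORDS)          # nothing matches, stays as-is
```

If the word list contained both "modern" and "modem", the sketch would simply take whichever candidate it generates first, which is exactly the kind of guess that lets wrong words slip through.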

This can still result in lots of unknown words and missing punctuation in the source document, which aren't that easy to resolve, even for native speakers. And the SmartCAT translation segments themselves can become fragmented when the OCR'ed material has no clear sentence breaks (periods or full stops).
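
A toy illustration of the fragmenting problem, in Python. This is only an assumption about how a segmenter might behave (split at sentence-ending punctuation or at line breaks), not SmartCAT's actual rules, and the sample sentences are invented.

```python
import re


def segment(text):
    """Split after . ! ? followed by whitespace, or at any line break."""
    parts = re.split(r"(?<=[.!?])\s+|\n+", text)
    return [p.strip() for p in parts if p and p.strip()]


# Clean source text: punctuation intact, no stray line breaks.
clean = "The cat sat on the mat. It was warm."

# The same text as it might come out of OCR: periods lost,
# and the page's line breaks kept in the middle of sentences.
ocr = "The cat sat on\nthe mat It was\nwarm"

clean_segments = segment(clean)
ocr_segments = segment(ocr)
```

The clean text gives two tidy sentences to translate. The OCR version gives three fragments ("The cat sat on", "the mat It was", "warm"), none of which is a complete sentence.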

So an OCR document is not the best source for a language learner or SmartCAT beginner to start with.