SmartCAT treats PDFs, and some other document formats (see the Help for the list), as images. It then runs OCR (optical character recognition) on the document to extract the text.
OCR is not totally reliable. Depending on the typeface used, it can't easily tell the difference between "i" and "l", or between "rn" and "m", for instance. The technology relies on specialist dictionaries to interpret what it thinks it sees and clean up the text to the most likely words.
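To make the dictionary clean-up idea concrete, here is a rough sketch in Python. It is not how SmartCAT or any particular OCR engine works internally; the word list, the confusion pairs and the function names are all made up for illustration. It only shows the general principle: if a word as read is not in the dictionary, try swapping look-alike characters until a known word turns up.

```python
# Hypothetical mini-dictionary; a real OCR engine would use a far larger word list.
DICTIONARY = {"modern", "corner", "little", "illness", "time"}

# Look-alike confusions mentioned above, each tried in both directions.
CONFUSIONS = [("rn", "m"), ("m", "rn"), ("l", "i"), ("i", "l")]


def candidates(word):
    """Yield variants of word with exactly one look-alike substitution."""
    for wrong, right in CONFUSIONS:
        start = word.find(wrong)
        while start != -1:
            yield word[:start] + right + word[start + len(wrong):]
            start = word.find(wrong, start + 1)


def correct(word):
    """Return word if it is known, else the first known word one swap away."""
    if word in DICTIONARY:
        return word
    for candidate in candidates(word):
        if candidate in DICTIONARY:
            return candidate
    return word  # no match found: the unknown word stays as it was read


if __name__ == "__main__":
    for raw in ["comer", "iittle", "xyzzy"]:
        print(raw, "->", correct(raw))
```

Note that a word with no near match in the dictionary is simply left as it is, which is one way unknown words end up in the extracted text.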
This can still leave many unknown words and missing punctuation marks in the source document, which are not easy to resolve, even for native speakers. The SmartCAT translation segments themselves can also become fragmented when the OCR'ed material lacks clear sentence endings (periods, or full stops).
So an OCR'ed document is not the best source for a language learner or SmartCAT beginner to start with.