To ensure the best possible data extraction and OCR results, we recommend:
Documents should be scanned at a resolution of at least 300 dpi.
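As a rough way to check a scan against the 300 dpi recommendation, the effective resolution can be estimated from the image's pixel dimensions and the physical size of the page. The sketch below is only an illustration: the function names and the US Letter example are assumptions, not part of Athento.

```python
def effective_dpi(width_px: int, height_px: int,
                  page_width_in: float, page_height_in: float) -> float:
    """Return the lower of the horizontal and vertical pixel densities."""
    if page_width_in <= 0 or page_height_in <= 0:
        raise ValueError("page dimensions must be positive")
    return min(width_px / page_width_in, height_px / page_height_in)


def meets_recommendation(dpi: float, minimum: float = 300.0) -> bool:
    """True when the scan reaches the recommended minimum resolution."""
    return dpi >= minimum


# Example: a US Letter page (8.5 x 11 inches) scanned at two sizes.
good_scan = effective_dpi(2550, 3300, 8.5, 11.0)
poor_scan = effective_dpi(1275, 1650, 8.5, 11.0)

print(good_scan, meets_recommendation(good_scan))
print(poor_scan, meets_recommendation(poor_scan))
```

Taking the lower of the two densities is a conservative choice: a scan that was stretched or cropped in one direction is judged by its weaker axis.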
What elements can affect image quality?
- Stains on original documents.
- Output format and color depth selected when scanning.
What output formats do we recommend?
However, Athento supports the formats listed in the OCR Extraction documentation.
What happens when the image quality is low?
Two processes are particularly affected:
- Classification based on textual expressions: Because the OCR text cannot be extracted correctly, the textual expressions are not recognized reliably either, so classification may fail.
- Metadata extraction: Metadata may not be extracted or may be extracted incorrectly.
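To illustrate why poor OCR output breaks classification based on textual expressions, the sketch below applies a simple regular expression to a clean and a noisy version of the same text. The pattern, the sample strings, and the function name are hypothetical and do not reflect how Athento implements classification.

```python
import re

# Hypothetical rule: a document is treated as an invoice
# when its text contains the expression "invoice number".
INVOICE_PATTERN = re.compile(r"\binvoice\s+number\b", re.IGNORECASE)


def matches_rule(ocr_text: str) -> bool:
    """Return True when the expression is found in the OCR text."""
    return INVOICE_PATTERN.search(ocr_text) is not None


# Made-up sample text: the same line read from a good and a poor image.
clean_text = "Invoice Number: 2024-0017"
noisy_text = "lnvo1ce Nurnber: 2024-0017"  # typical character confusions

print(matches_rule(clean_text))
print(matches_rule(noisy_text))
```

The noisy text contains the same information to a human reader, but confusions such as "I" read as "l" or "m" read as "rn" are enough to make the expression miss.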
Is processing a photo the same as processing a scanned image?
No. Processing photographs is much more complex, because camera-generated images can introduce additional defects, such as:
- Perspective distortion.
- Luminosity problems.
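One simple way to spot luminosity problems such as a shadow across a photographed page is to compare the average brightness of the four quadrants of the image. The sketch below assumes the image is already available as rows of grayscale values (0 to 255). The function names are illustrative, and this is not the method Athento uses.

```python
from statistics import mean


def quadrant_means(pixels):
    """Average brightness of each quadrant of a grayscale image.

    `pixels` is a list of equally long rows of values in 0..255.
    """
    height, width = len(pixels), len(pixels[0])
    if height < 2 or width < 2:
        raise ValueError("image must be at least 2x2 pixels")
    mid_y, mid_x = height // 2, width // 2
    quadrants = {
        "top_left": [row[:mid_x] for row in pixels[:mid_y]],
        "top_right": [row[mid_x:] for row in pixels[:mid_y]],
        "bottom_left": [row[:mid_x] for row in pixels[mid_y:]],
        "bottom_right": [row[mid_x:] for row in pixels[mid_y:]],
    }
    return {
        name: mean(value for row in rows for value in row)
        for name, rows in quadrants.items()
    }


def lighting_spread(pixels):
    """Difference between the brightest and darkest quadrant."""
    means = quadrant_means(pixels).values()
    return max(means) - min(means)


# Evenly lit page versus a page whose right half is in shadow.
even_page = [[200] * 4 for _ in range(4)]
shadowed_page = [[200, 200, 90, 90] for _ in range(4)]

print(lighting_spread(even_page))
print(lighting_spread(shadowed_page))
```

A large spread suggests uneven lighting. What counts as "large" depends on the documents being processed and would need to be tuned on real samples.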
Can Athento correct image defects?
Athento provides operations that can correct certain defects. For example, it can:
- Correct document orientation.
- Delete blank pages.
- Clean up OCR, etc.
However, these corrections have limitations, and in many cases they are not sufficient to obtain optimal results.
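As an illustration of how a correction such as blank page removal can work, and why it has limits, the sketch below flags a page as blank when only a tiny fraction of its pixels are dark. The thresholds and function names are assumptions chosen for the example. This is not Athento's implementation.

```python
def dark_pixel_ratio(pixels, dark_threshold=128):
    """Fraction of pixels darker than `dark_threshold` (values 0..255)."""
    total = sum(len(row) for row in pixels)
    if total == 0:
        raise ValueError("image has no pixels")
    dark = sum(1 for row in pixels for value in row if value < dark_threshold)
    return dark / total


def is_blank_page(pixels, dark_threshold=128, max_dark_ratio=0.005):
    """Treat a page as blank when almost none of its pixels are dark."""
    return dark_pixel_ratio(pixels, dark_threshold) <= max_dark_ratio


# A white 100x100 page with a single dark speck.
speck_page = [[255] * 100 for _ in range(100)]
speck_page[0][0] = 0

# A page where 10 of every 100 pixels in each row are dark (text-like).
text_page = [[0] * 10 + [255] * 90 for _ in range(100)]

print(is_blank_page(speck_page))
print(is_blank_page(text_page))
```

A fixed threshold like this shows the limitation in practice: a stained but otherwise empty page may exceed the ratio and be kept, while a page with a single faint line may fall under it and be removed.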
You can also consult