To ensure the best possible data extraction and OCR results, we recommend:
Recommendations
Documents should be scanned at a resolution of at least 300 dpi.
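As a rough illustration of the 300 dpi recommendation, the sketch below estimates the effective resolution of a scan from its pixel dimensions and the physical size of the page. The function names and the US Letter example are assumptions made for illustration; they are not part of Athento.

```python
def effective_dpi(pixel_width, pixel_height, page_width_in, page_height_in):
    """Return the lower of the horizontal and vertical resolution, in dots per inch."""
    if page_width_in <= 0 or page_height_in <= 0:
        raise ValueError("page dimensions must be positive")
    horizontal = pixel_width / page_width_in
    vertical = pixel_height / page_height_in
    return min(horizontal, vertical)


def meets_minimum(pixel_width, pixel_height, page_width_in, page_height_in, min_dpi=300):
    """Check a scan against a minimum resolution (300 dpi by default)."""
    return effective_dpi(pixel_width, pixel_height, page_width_in, page_height_in) >= min_dpi


# Hypothetical example: a US Letter page (8.5 x 11 inches).
print(meets_minimum(2550, 3300, 8.5, 11.0))  # scanned at 2550 x 3300 pixels
print(meets_minimum(1275, 1650, 8.5, 11.0))  # scanned at 1275 x 1650 pixels
```

A Letter page scanned at 2550 x 3300 pixels works out to 300 dpi, while the same page at 1275 x 1650 pixels is only 150 dpi and falls below the recommendation.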
What elements can affect image quality?
- Stains on original documents.
- Folds.
- Output format and color depth chosen when scanning.
What output formats do we recommend?
- PDF.
- TIFF.
However, Athento also supports the formats listed in the OCR Extraction article.
What happens when the image quality is low?
Two processes are particularly affected:
- Classification based on textual expressions: Because the text cannot be extracted correctly by OCR, the textual expressions are not recognized either, so classification may fail.
- Metadata extraction: Metadata may not be extracted or may be extracted incorrectly.
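The effect on classification can be sketched with a simple example. The rule, the pattern and the sample texts below are hypothetical and only show the general idea of matching a textual expression against OCR output; they do not reproduce how Athento classifies documents.

```python
import re

# Hypothetical rule: a document is treated as an invoice when its OCR text
# contains the expression "invoice number".
INVOICE_PATTERN = re.compile(r"\binvoice\s+number\b", re.IGNORECASE)


def classify(ocr_text):
    """Return "invoice" if the expression is found, otherwise "unclassified"."""
    return "invoice" if INVOICE_PATTERN.search(ocr_text) else "unclassified"


# Text as OCR might return it from a clean 300 dpi scan.
clean_text = "Invoice Number: 2024-0042"
# The same line as OCR might return it from a low-quality image:
# "I" read as "l", "i" read as "1", "m" read as "rn".
noisy_text = "lnvo1ce Nurnber: 2024-0042"

print(classify(clean_text))
print(classify(noisy_text))
```

The clean text matches the expression, while the noisy text does not, so the second document would be left unclassified even though it is the same invoice.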
Is processing a photo the same as processing a scanned image?
No, it is not the same. Processing photographs is considerably harder, because camera-generated images often have additional defects:
- Perspective distortion.
- Blur.
- Lighting problems.
Can Athento correct image defects?
Athento has operations that can correct defects such as the following:
- Correct document orientation.
- Delete blank pages.
- Clean up OCR.
However, these corrections have limitations, and in many cases they are not enough to obtain optimal results.
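To give an idea of what an operation such as deleting blank pages involves, and why it has limits, here is a minimal sketch. It assumes each page is already available as rows of grayscale values (0 is black, 255 is white). The function names and thresholds are hypothetical and this is not Athento's implementation.

```python
def ink_ratio(page, dark_threshold=128):
    """Return the fraction of pixels darker than dark_threshold."""
    total = sum(len(row) for row in page)
    if total == 0:
        raise ValueError("page has no pixels")
    dark = sum(1 for row in page for value in row if value < dark_threshold)
    return dark / total


def is_blank(page, max_ink_ratio=0.005, dark_threshold=128):
    """Treat a page as blank when almost none of its pixels are dark."""
    return ink_ratio(page, dark_threshold) <= max_ink_ratio


def drop_blank_pages(pages, max_ink_ratio=0.005, dark_threshold=128):
    """Return only the pages that are not considered blank."""
    return [p for p in pages if not is_blank(p, max_ink_ratio, dark_threshold)]
```

A fixed threshold like this is easily misled: a stained or shadowed blank page may be kept, and a page with very little text may be dropped, which is one reason automatic corrections cannot replace a good-quality scan.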