|
Features |
|
Technical Notes |
- Sample code is included for: VB.NET, C#, VB, Delphi, VC++, HTML
- Object-oriented API for .NET users
- Deploys within .NET as a managed control and is fully compliant with .NET 1.1 and above (see "Building Robust Imaging Components for the Microsoft .NET Platform" white paper)
- Can also be used in any development environment that hosts ActiveX COM controls
- Can be used in a multi-threaded environment and performs synchronous, thread-safe processing
- Supports user-specified debug logging levels
- Suitable for client/server Web applications
- Performs operations on bitonal and color images, including the ability to extract color images and insert them back into the searchable document
- Supports documents containing up to 999 pages
- Free, full-featured trial version available for immediate download. Output files created by a trial version of OCR Xpress are watermarked.
|
|
Text and Font Style Recognition |
- Perform OCR on a digital image, delivering the text in:
  - The appropriate Serif, Sans Serif, or Monospace font style
  - A font that is closest in form to the recognized font in normal, bold, italic, or bold italic
  - Scaled over a wide range of font sizes
|
|
Language Recognition |
- Recognize text in English, French, German, Italian, Spanish, Portuguese, Danish, Dutch, Swedish, Norwegian, Hungarian, Polish, and Finnish
- Recognize one language at a time
- Includes dictionaries for all supported languages
- Accepts and applies user-defined words in a custom dictionary
|
|
Auto Rotation |
- Accepts input images in any orientation and automatically rotates them by 0, 90, 180, or 270 degrees
- Returns the amount of rotation applied
- Uses the text to determine orientation
- Highly optimized for speed
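
The auto-rotation behavior described above can be illustrated with a minimal sketch. It is not the OCR Xpress implementation or API: the image is assumed to be a plain list of pixel rows, and `score_text` is a hypothetical callback standing in for whatever text-based orientation measure an engine uses. The sketch tries the four right-angle orientations and returns both the rotated image and the amount of rotation applied.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]  # assumption: an image is a list of pixel rows


def rotate_90_clockwise(image: Image) -> Image:
    """Rotate a row-major image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def auto_rotate(image: Image,
                score_text: Callable[[Image], float]) -> Tuple[Image, int]:
    """Try 0, 90, 180, and 270 degrees; keep the best-scoring orientation.

    score_text is a hypothetical scorer: higher means the text in the
    image looks more upright.
    """
    best_image, best_angle, best_score = image, 0, score_text(image)
    candidate = image
    for angle in (90, 180, 270):
        candidate = rotate_90_clockwise(candidate)
        score = score_text(candidate)
        if score > best_score:
            best_image, best_angle, best_score = candidate, angle, score
    return best_image, best_angle
```

A caller would pass its own scorer and read the second element of the result to learn how much rotation was applied.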
|
|
Character Position Information |
- Returns character position information for all characters (recognized with high and low confidence)
- Use this feature to redact or highlight text in the original image using the included NotateXpress component.
- Use this feature to build your own PDF files, using the position information to place the hidden text in the correct location.
- Use the per-character recognition confidence reported by OCR Xpress together with the results of other OCR engines, such as SmartZone, to perform voting between engines and improve overall recognition accuracy.
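
As an illustration of confidence-based voting, the sketch below merges two engines' results character by character and keeps whichever character has the higher confidence. The `CharResult` type and `vote` function are hypothetical names introduced here, not part of OCR Xpress or SmartZone, and the sketch assumes both engines return the same number of characters in the same order with confidences on a common 0.0 to 1.0 scale.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CharResult:
    """Hypothetical per-character result from one OCR engine."""
    char: str
    confidence: float  # assumed scale: 0.0 (none) to 1.0 (certain)


def vote(engine_a: List[CharResult], engine_b: List[CharResult]) -> str:
    """Pick, at each position, the character with the higher confidence.

    Ties go to engine_a. Both inputs must already be aligned.
    """
    if len(engine_a) != len(engine_b):
        raise ValueError("results must be aligned character by character")
    chosen = []
    for a, b in zip(engine_a, engine_b):
        chosen.append(a.char if a.confidence >= b.confidence else b.char)
    return "".join(chosen)
```

In practice the two result streams would first need to be aligned using the character position information, which this sketch leaves out.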
|
|
Text Correction Capability |
- Identify characters recognized with low confidence
- Make corrections to text prior to outputting to the document
- Build text proofing and character replacement functions into applications
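
A text proofing step of the kind described above might look like the following sketch. It assumes recognition results are available as `(character, confidence)` pairs; the function names and the 0.6 threshold are illustrative choices, not OCR Xpress API calls or defaults.

```python
from typing import Dict, List, Tuple

CharConf = Tuple[str, float]  # assumption: (character, confidence 0.0-1.0)


def find_low_confidence(chars: List[CharConf],
                        threshold: float = 0.6) -> List[int]:
    """Return the indexes of characters recognized below the threshold."""
    return [i for i, (_, conf) in enumerate(chars) if conf < threshold]


def apply_corrections(chars: List[CharConf],
                      corrections: Dict[int, str]) -> str:
    """Build the output text, replacing characters at corrected indexes."""
    return "".join(corrections.get(i, ch) for i, (ch, _) in enumerate(chars))
```

An application could show the flagged positions to an operator, collect replacements, and apply them before the text is written to the output document.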
|
|
Image Binarization |
- Create black and white images from 24-bit color and 8-bit grayscale images, with image input and conversion support provided by ImagXpress Document v8
- Retain non-text color regions for reinsertion into the output document
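
To show what binarization involves, here is a minimal sketch using a fixed global threshold. This is a generic textbook approach, not the algorithm used by OCR Xpress or ImagXpress, and the function names are introduced for illustration only. Color pixels are first reduced to 8-bit gray with the ITU-R BT.601 luma weights, then each gray value is mapped to 0 (black) or 1 (white).

```python
from typing import List, Tuple


def to_gray(rgb: Tuple[int, int, int]) -> int:
    """Convert a 24-bit RGB pixel to 8-bit gray (BT.601 luma weights)."""
    r, g, b = rgb
    return (299 * r + 587 * g + 114 * b) // 1000


def binarize(gray_rows: List[List[int]],
             threshold: int = 128) -> List[List[int]]:
    """Map each 8-bit gray value to 1 (white) if >= threshold, else 0 (black)."""
    return [[1 if p >= threshold else 0 for p in row] for row in gray_rows]
```

Production engines typically use adaptive thresholds that vary across the page; the fixed value of 128 here is only a placeholder.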
|
|
Deskew |
- Full-page deskew on images with up to 15 degrees of skew
|
|
Image Input |
|
|
File Output Formats |
- The output from OCR Xpress is a digital file containing unformatted text, formatted text, or formatted text plus image data, delivered in a variety of file formats. OCR Xpress Professional outputs all file types listed below, INCLUDING PDF. OCR Xpress Standard outputs all file types listed below, EXCEPT PDF.
- ASCII
  - ASCII with no line breaks
  - ASCII with line breaks
  - ASCII with smart formatting (positioned with spaces)
  - ASCII, comma-delimited (one line per field)
  - ASCII, tab-delimited
- Excel v2.x (compatible with later versions)
- HTML, with a sub-folder of the same name containing images
- PDF
  - PDF – Searchable Image (Original Image with Hidden Text), PDF version 1.4 file (Professional edition only)
  - PDF – Formatted Text and Graphics (Normal), PDF version 1.4 file (Professional edition only)
  - PDF – Image only, PDF version 1.4 file (Professional edition only)
- RTF – Used for import to Word, WordPerfect, etc.
- WordPerfect 5.0, WordPerfect 5.1
- All image-only file formats supported by ImagXpress (see "File Format Support" of ImagXpress Document)
|
|
Segmentation |
- Automatically or manually locate regions of the input image and identify them as either images (whose color can be preserved) or areas containing recognizable text
- Access various regions separately, or recombine into fully-formatted documents such as RTF or PDF files
|
|
Edition Descriptions |
| The Standard Edition is a full-featured full-page OCR toolkit that supports many different export formats. The Professional Edition adds support for exporting PDF formats. |
|
|