Proactive DLP
v5.10.1
Optical Character Recognition (OCR)
OCR is a commonly used technology for recognizing text inside images. It examines the text in a document and converts the characters into machine-readable code that can be used for data processing. Proactive DLP can use this technology to detect and redact sensitive information in images.
Supported file types
- Portable Document Format: PDF
- Microsoft Office: doc, docx, MS Word XML, xls, xlsx, ppt, pptx, rtf
- Standalone image: jpg, png, tiff, bmp, jp2, jpg2, jpf, jpx, mj2, mjp2, jpm, jpgm
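As a rough illustration of the list above, the following sketch checks whether a file name carries one of the listed extensions before it is submitted for OCR. The function name `ocr_file_group` and the group keys are hypothetical and are not part of Proactive DLP. The product may well identify files by content rather than by extension, so treat this only as a pre-filter idea.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the supported file types list.
# "MS Word XML" is omitted because the list does not give its extension.
# The group keys ("pdf", "office", "image") are illustrative names only.
OCR_SUPPORTED_EXTENSIONS = {
    "pdf": {"pdf"},
    "office": {"doc", "docx", "xls", "xlsx", "ppt", "pptx", "rtf"},
    "image": {
        "jpg", "png", "tiff", "bmp", "jp2", "jpg2",
        "jpf", "jpx", "mj2", "mjp2", "jpm", "jpgm",
    },
}


def ocr_file_group(filename: str) -> Optional[str]:
    """Return the group a file name falls into, or None if its extension is not listed."""
    # Normalize: take the last suffix, lowercase it, and drop the leading dot.
    ext = Path(filename).suffix.lower().lstrip(".")
    for group, extensions in OCR_SUPPORTED_EXTENSIONS.items():
        if ext in extensions:
            return group
    return None
```

For example, `ocr_file_group("scan.JPG")` returns `"image"`, while `ocr_file_group("notes.txt")` returns `None`.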
Supported languages
- English
Enabling OCR
Policies > Workflow rules > "Workflow name" > Proactive DLP > Optical character recognition (OCR)

OCR Quality:
- Normal: detects information without pre-processing images
- Best: pre-processes images before detection to improve the detection rate, at the cost of slower processing
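This page does not say which pre-processing steps the Best setting applies. Purely to illustrate why pre-processing can raise the detection rate while costing extra time, the sketch below applies two generic steps often used before OCR, contrast stretching and binarization, to a small grayscale pixel grid. The function names are hypothetical, and this is not the product's implementation.

```python
def stretch_contrast(pixels):
    """Linearly rescale grayscale values (0-255) so they span the full range."""
    flat = [value for row in pixels for value in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A uniform image has no contrast to stretch; return a copy unchanged.
        return [row[:] for row in pixels]
    return [[round((value - lo) * 255 / (hi - lo)) for value in row] for row in pixels]


def binarize(pixels, threshold=128):
    """Map each pixel to black (0) or white (255) around a fixed threshold."""
    return [[255 if value >= threshold else 0 for value in row] for row in pixels]


# A low-contrast 2x2 image: every value sits in a narrow band.
low_contrast = [[100, 110], [120, 130]]
stretched = stretch_contrast(low_contrast)
cleaned = binarize(stretched)
```

Each step makes an additional pass over every pixel, which is the kind of extra work that explains the performance trade-off noted for the Best setting.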
Example output

System requirements
Factors that can affect accuracy
- Low contrast documents
- Documents with small text
- Documents with blurry images
- Colored paper or background in documents
- Handwritten text
- Unusual or script-type fonts
