Proactive DLP release notes

v2.19.1

Release date: 06/13/2024

  • PDF metadata dictionary processing
  • Detection Policy improvements (operator precedence, new operators, and parentheses)
  • Updating the custom regex file no longer requires an engine restart
  • Fixed a date regex issue in Excel files when using TSV files

v2.19.0

Release date: 04/11/2024

  • DLP now supports multi-frame GIF files with OCR
  • DLP now supports .loc files
  • DLP now supports FDF files
  • DLP now supports Acrobat Forms
  • DLP now supports recursive PDF processing (only embedded PDF)
  • Improved Office 2007 object processing, including linked objects, tracked changes, background images, Smart Art
  • Improved PDF stamp annotation and form processing
  • Watermark scan bugfix

v2.18.1

Release date: 02/15/2024

  • Improved TSV file handling for the “custom regexes from file” feature
  • Refined detection of SWIFT codes, reducing both false positives and false negatives
  • Resolved interruption issues related to hit limits during embedded image processing
  • Removed duplicate results that occurred when using the best quality OCR
  • Fixed an issue so that redaction settings are independent of TSV file loading
  • Fixed memory leak
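
The notes above do not describe how SWIFT code detection was refined. As a rough illustration of why such detection is prone to false positives, the sketch below matches the commonly described BIC layout (four-letter institution code, two-letter country code, two-character location, optional three-character branch) and then filters on the country code. The names `BIC_SHAPE`, `SAMPLE_COUNTRIES`, and `find_bic_candidates` are hypothetical and are not part of Proactive DLP.

```python
import re

# Assumed BIC layout: 4-letter institution, 2-letter country,
# 2-character location, optional 3-character branch.
BIC_SHAPE = re.compile(r"\b[A-Z]{4}([A-Z]{2})[A-Z0-9]{2}(?:[A-Z0-9]{3})?\b")

# Illustrative subset only; a real check would use the full ISO 3166-1 list.
SAMPLE_COUNTRIES = {"DE", "FR", "GB", "JP", "US"}


def find_bic_candidates(text, countries=SAMPLE_COUNTRIES):
    """Return tokens that have the BIC shape and a known country code."""
    return [
        m.group(0)
        for m in BIC_SHAPE.finditer(text)
        if m.group(1) in countries
    ]
```

Without the country filter, any eight-letter uppercase word (for example "SOFTWARE") would match the shape alone, which is the kind of false positive a stricter check is meant to reduce.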

v2.18

Release date: 01/04/2024

  • Importing custom regular expressions from file is now supported

  • New predefined sensitive information types are supported for detection and redaction

    • ABA Routing Number
    • U.S. Bank Account
    • International Banking Account Number (IBAN)
    • International Securities Identification Number (ISIN)
    • SWIFT Code
  • OCR capabilities have been improved

  • DLP now accepts SSE4.2, in addition to SSE4.1, as the required CPU instruction set for OCR

  • File metadata handling for scan requests has been improved

  • DLP’s UI workflow settings are now unified on every officially supported Core version

  • DLP’s UI workflow settings for detection have been redesigned for a more compact look
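
Two of the identifier types added in this release, ABA routing numbers and IBANs, carry publicly documented check digits. The sketch below shows those two checksum rules only. It is not a description of how Proactive DLP detects or validates these types, and the function names `aba_checksum_ok` and `iban_checksum_ok` are hypothetical.

```python
def aba_checksum_ok(routing: str) -> bool:
    """Check the 3-7-1 weighted checksum of a 9-digit ABA routing number."""
    if len(routing) != 9 or not (routing.isascii() and routing.isdigit()):
        return False
    d = [int(c) for c in routing]
    total = (
        3 * (d[0] + d[3] + d[6])
        + 7 * (d[1] + d[4] + d[7])
        + (d[2] + d[5] + d[8])
    )
    return total % 10 == 0


def iban_checksum_ok(iban: str) -> bool:
    """Check the mod-97 rule: rearranged IBAN, read as a number, is 1 mod 97."""
    compact = iban.replace(" ", "").upper()
    if not (compact.isascii() and compact.isalnum()):
        return False
    if not 15 <= len(compact) <= 34:
        return False
    # Move country code and check digits to the end.
    rearranged = compact[4:] + compact[:4]
    # Map digits to themselves and letters A..Z to 10..35.
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1
```

A checksum pass only shows that a candidate is well formed; it does not confirm that the account or bank exists, and country-specific IBAN lengths are not checked here.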

v2.17

Release date: 10/20/2023

  • Announcing Document Identification

  • Not Safe For Work (NSFW)

    • Detect "Not Safe For Work" content in text and images
    • Redact textual hits and blur images with NSFW content
  • Personal document classification

    • Detect personal ID on images
  • Detection of Generic Password and Generic API Key secrets has been improved

v2.16

Release date: 07/18/2023

  • Announcing DICOM Anonymization:

    • Anonymize patient information according to the Basic Attribute Confidentiality Profile
    • Remove sensitive burned-in annotations from DICOM images
  • Patterns for US SSN numbers have been updated

  • Fixed an issue where the DLP engine crashed when files were sent for scanning right after engine initialization

v2.15.1

Release date: 05/22/2023

  • Improved generic password and generic API token detection (secret detection)
  • Fixed an issue where processing large text files took a very long time
  • Fixed an issue where CCN hits contained an extra character at the end
  • Fixed sensitive info substitution in hyperlinks
  • Fixed an issue where Proactive DLP was unable to detect certain software secrets in XML files
  • Allowlist and custom validators now work properly with XML files

v2.15

Release date: 03/31/2023

  • Support more secrets:

    • Generic passwords
    • Generic API tokens
    • PostgreSQL credentials
    • MySQL credentials
  • DOCM and XLSM file types are now supported in scanning

  • Redact sensitive information in CSV and XML files

  • Text substitution can be configured instead of redaction in the following document types:

    • MS Office documents (Word, Excel, PowerPoint)
    • PDF files
    • Text files
  • Fixed a bug that could cause valid hits to be invalidated

  • Fixed an issue with the duplicate character validator that resulted in more false positives

v2.14

Release date: 12/19/2022

  • Support more secrets:

    • Private keys (PEM, PPK)
    • IBM Cloud key
    • IBM API Connect Credentials
    • IBM COS HMAC Credentials
  • Reorganized the workflow configurations

  • Improved MS Word processing to reduce false positives

  • Fixed issue when valid hits could be lost when other hits are redacted during PDF processing

  • Fixed log retention

v2.13.1

Release date: 11/2/2022

  • Allowlist feature has been added to all sensitive info types
  • Log retention period can now be configured from the engine configuration
  • Encoding detection can be configured in the Workflow rule setting instead of the engine configuration
  • Fixed issue where DLP engine updates were failing permanently due to unrecognized OS
  • Fixed issue when encoding detection settings were lost during configuration export and import
  • Fixed issue where an empty regular expression field in the workflow rule settings caused DLP engine update to fail permanently

v2.13

Release date: 9/31/2022

  • Secret detection in text files, supporting AWS, Azure, and GCP secrets

  • New optional validators for sensitive data

    • Exclude prefix
    • Exclude suffix
    • Exclude beginning characters
    • Exclude ending characters
    • Duplicate characters (only for custom regexes)
  • DLP log level is now selectable in the engine configuration

  • External dependencies are checked before engine initialization

  • Some descriptions have been streamlined in the workflow rule settings

  • Fixed issue when local scan fails on read-only PDF files

  • Fixed issue when SSN localization could be saved without value

v2.12

Release date: 7/5/2022

  • Improved PDF processing speed
  • General performance improvements from the update from .NET Core 3.1 to .NET 6
  • Improved OCR capabilities
  • Improved product logging to enhance product diagnostics
  • Fixed an issue where the quality of JPEG images dropped after metadata removal
  • Fixed an issue where DLP failed to process EMF images embedded in a document
  • Fixed an issue where embedded files could not be opened during recursive PPT processing
  • Fixed an issue where XLS files could not be processed during MD Core local scan

v2.11.1

Release date: 5/11/2022

  • Added a configuration option to choose a default fallback encoding
  • Optimize system resource usage for PDF processing
  • PDF Watermark: Added line breaks to long texts

v2.11

Release date: 3/30/2022

  • Support watermark feature for MS Word
  • Customers can now set the certainty for regular expressions
  • Support encoding detection (for Japanese, Hebrew, and UTF-8 encodings)
  • Processing all annotation object types in PDFs
  • Processing all stamp object types in PDFs
  • Keeping the original image quality for output files when removing Metadata

v2.10.1

Release date: 2/16/2022

  • Enhance PPT file processing
  • Improve metadata processing for image file types
  • Fix line break issue with DOC/DOCX when applying watermark
  • Fix metadata removal with TIFF file format
  • Support for more plain text file types
  • Improved image processing (performance)

v2.10

Release date: 12/22/2021

  • Add more supported encodings (for Email Security Gateway use case)
  • Added a configuration option to set a file size limit per workflow
  • Scan and remove Metadata recursively
  • Support processing cropped images for XLS/XLSX/PPT/PPTX
  • Fixed memory leak issue

v2.9.1

Release date: 12/1/2021

  • Detection Policy is available on MetaDefender Core v5 or newer
  • A configuration to allow a file if it is redacted (MetaDefender Core v5 or newer)
  • Optimize memory usage
  • Better encoding handling between Email Security Gateway and Proactive DLP

v2.9

Release date: 10/13/2021

  • Support watermarks for EMF/WMF/SVG

  • Detect and Redact sensitive info in several objects

    • MS Excel sheet name
    • Defined Name object in MS Excel
    • Image alternative text in MS Office
    • Comment Author in MS Word and Excel
    • Header and Footer in MS Word and Excel
    • Track changes in MS Word and Excel
    • Alternative images in PDF
    • Form fields in PDF
  • Improve recursive processing

    • Support RTF as an embedded file
    • More details about hit location
  • Upgraded Qt framework to version 5.15.2

v2.8.1

Release date: 8/18/2021

  • Improve performance for several file types (MS Excel, text, etc.)
  • Fix MS Excel redaction failure in some cases

v2.8

Release date: 7/8/2021

  • Added a configuration option to remove embedded objects if recursive processing fails
  • Fixed FILE SIZE LIMIT configuration issue
  • Fixed embedded image OCR processing
  • Improve large plain text file processing

v2.7.1

Release date: 6/3/2021

  • Support unlimited depth and number of objects in recursive document processing
  • Allow users to set a limit on the number of returned sensitive info hits
  • Added "," to the default delimiter list
  • Improve Excel processing
  • Fixed Chinese regular expression in text files (CSV, TXT, ...)
  • Fixed missing keyword configuration when upgrading to DLP 2.7

v2.7

Release date: 4/27/2021

  • Dropped support for MetaDefender Core versions older than 4.17.1
  • Recursive scan and redaction of embedded files in MS Office files
  • Localization support for Japanese SSN
  • Support watermark, metadata detection, OCR for BMP format
  • Support "Delimiter" as an optional validator
  • Context detection around hits in PDF files has been improved
  • Chart detection and redaction has been introduced in Excel
  • Improve OCR detection quality
  • Improve redaction function for MS Word files

v2.6.1

Release date: February 8, 2021

  • Fix detection issue when an empty cell has a comment (Excel)
  • Improve MS Office validation in some regular expression cases

v2.6.0

Release date: January 11, 2021

  • Process the hidden areas in a cropped image (DOCX, DOC)
  • Support OCR for standalone image file types (JPG, PNG, TIFF)
  • Support OCR for embedded images in DOC, DOCX, XLS, XLSX
  • Support metadata removal for document files (PDF, DOC, DOCX, XLS, XLSX)
  • Support redaction for RTF

v2.5.1

Release date: November 12, 2020

  • Improve Japanese string detection/redaction in PDF
  • Fix a detection issue when a regular expression contains a Hebrew string
  • Fix a crash issue when scanning DOC file on Linux

v2.5

Release date: October 1, 2020

  • Metadata removal, Watermark, OCR are available on Linux

  • Advanced watermark configurations: font size, text opacity, text position

  • New configurations

    • Stop processing once enough sensitive info has been found
    • Quality configurations for OCR (Normal, Best)
  • Support HTML and TXT redaction

  • 10x faster text file processing

v2.4.1

Release date: August 8, 2020

  • Improved memory usage
  • Improved IPv4 and CIDR search
  • Added threaded comment search and redaction in Excel files
  • Up to 40% speedup when scanning Excel files

v2.4

Release date: July 7, 2020

  • Utilize column and row header to improve certainty level in Excel
  • Detect sensitive info in file properties with regular expressions
  • Custom keyword list for regular expression
  • Support redaction feature on Linux
  • Performance improvement: faster processing, less resource usage
  • New system requirements on Linux
  • End of support for CentOS 6 and Debian 8

v2.3.2

Release date: May 20, 2020

  • Better context calculation for Excel and PDF
  • Improve IPv4 detection in TXT
  • Distinguish between "Failed to detect" and others

v2.3.1

Release date: April 21, 2020

  • Threaded comment redaction in Excel files.
  • Slightly increased PDF scan performance.
  • Improved certainty calculation for MS Office and PDF files.
  • Fixed wrong context when a single cell in an Excel file contained the same hit multiple times.

v2.3.0

Release date: April 7, 2020

  • Support Optical Character Recognition (OCR) for PDF (Windows only)
  • Redact sensitive information for Microsoft Office Excel (XLS/XLSX)
  • Better detection method that reduces false positives

v2.2.1

Release date: Feb 12, 2020

  • Improve IPv4/CIDR detection performance
  • Better handling of temp files
  • Remove "Parse Binary" option

v2.2

Release date: Jan 6, 2020

  • Supports watermark addition for PDF
  • Redact sensitive information for Microsoft Office Word (DOC/DOCX)
  • Support DLP on Linux with limited functionality (works with MetaDefender Core 4.17.1 or newer)
  • Redact sensitive information based on certainty level (works with MetaDefender Core 4.17.1 or newer)
  • Sample regular expressions to detect personally identifiable information (PII): email, address, full name, date of birth, driver license, phone number, bank account number

v2.1.2

Release date: November 27, 2019

  • Better error message when an input PDF file is corrupted

v2.1.1

Release date: October 31, 2019

  • Better display of the words before and after a hit in PDF

v2.1

Release date: September 8, 2019

  • Supports IPv4, Classless Inter-Domain Routing (CIDR) detection
  • Supports metadata removal for TIFF and GIF files
  • Better CCN detection
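
The notes do not say how IPv4 and CIDR detection is implemented. One common approach, sketched here under that caveat, is to find dotted-quad candidates with a loose regular expression and then validate each one with Python's standard `ipaddress` module. The names `CANDIDATE` and `find_ipv4_and_cidr` are hypothetical.

```python
import ipaddress
import re

# Loose candidate pattern: four dot-separated groups of 1-3 digits with an
# optional /prefix. Range checks are left to the ipaddress module.
CANDIDATE = re.compile(
    r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?:/\d{1,2})?(?!\d|\.\d)"
)


def find_ipv4_and_cidr(text):
    """Return candidates that parse as an IPv4 address or an IPv4 CIDR block."""
    hits = []
    for match in CANDIDATE.finditer(text):
        token = match.group(0)
        try:
            if "/" in token:
                # strict=False accepts host bits set, e.g. 10.0.0.5/8.
                ipaddress.IPv4Network(token, strict=False)
            else:
                ipaddress.IPv4Address(token)
        except ValueError:
            continue  # octet out of range or invalid prefix length
        hits.append(token)
    return hits
```

Splitting the work this way keeps the pattern simple and leaves octet and prefix range checks to a well-tested library.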

v2.0.1

Release date: August 15, 2019

  • Better watermark and redaction handling when a system is under high load
  • Improve CCN detection
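
The notes do not describe what changed in CCN detection. For background, credit card numbers end in a check digit computed with the publicly documented Luhn algorithm, and applying that check is a standard way to discard random digit runs. The sketch below shows the Luhn check only; `luhn_ok` is a hypothetical name and the 13 to 19 digit length range is an assumption.

```python
def luhn_ok(candidate: str) -> bool:
    """Return True if the digits in candidate pass the Luhn checksum."""
    compact = candidate.replace(" ", "").replace("-", "")
    if not (compact.isascii() and compact.isdigit()):
        return False
    if not 13 <= len(compact) <= 19:  # assumed plausible card number lengths
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, ch in enumerate(reversed(compact)):
        digit = int(ch)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0
```

A passing checksum means a number is well formed, not that it belongs to a real card, so a detector would normally combine it with other signals such as surrounding context.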

v2.0

Release date: June 28, 2019

  • Product renamed to Proactive DLP
  • Certainty score for sensitive data detection
  • Redact sensitive information for text-based PDF files
  • Watermark addition for JPEG, TIFF, PNG, GIF
  • Supports metadata removal for JPG and PNG files

v1.0.3

Release date: February 18, 2019

  • Improve detection for Microsoft Access format
  • Improve context for hits
  • Improve processing speed (20%)