Custom detection

Custom detection is an advanced feature that lets users define their own rules for identifying specific patterns within files. It allows users to add detection for their own file types quickly, without waiting for official support in the FileType engine.

Enable custom detection

This feature is disabled by default. To enable it:

  • Under Inventory > Modules > Utilities > FileType, tick Enable custom detection.
  • Under Inventory > Modules > Utilities > FileType, in the Enable custom detection section, specify the paths to XML rule files and/or directories that contain XML rule files.

When rule files or rule directories are updated, the engine must be restarted for the rules to take effect.

When new entries are added to the configuration, their rules are loaded automatically, along with any changes inside the existing files or directories.

Custom rules

Rule definitions

The file types detected by custom rules, and the rules themselves, are defined in XML using the fields described in the table below.

Field                  Mandatory   Meaning

File type info
description            Required    File type description used in the output.
id                     Required    File type ID.
mime                   Optional    MIME type used in the output. Default value: application/octet-stream.
group                  Optional    Group ID used in the output. Default value: O. See the list of group IDs below.
extension              Optional    Extension(s) for the file format. This value is used to check for extension mismatches. Default value: empty.
score                  Optional    Confidence score for the custom file type. Value range [0, 1]. Default value: 0.25.

Patterns for detection
FrontBlock             Required    Defines patterns at specific offsets.
FrontBlock.Pattern     Required    Defines the offset (stored in Pos) and the hex pattern to compare (stored in Bytes).
GlobalStrings          Optional    Defines patterns at arbitrary offsets.
GlobalStrings.String   Optional    Defines a string pattern to match.
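
To illustrate how the two kinds of patterns in the table differ, here is a minimal Python sketch. It is not the FileType engine's implementation. The function names are hypothetical, and the sketch assumes that every FrontBlock pattern and every global string must match, which the table does not state.

```python
def front_block_matches(data, patterns):
    """Each (Pos, Bytes) pair must match: the hex bytes appear at that exact offset."""
    for pos, hex_bytes in patterns:
        expected = bytes.fromhex(hex_bytes)
        if data[pos:pos + len(expected)] != expected:
            return False
    return True


def global_strings_match(data, strings):
    """Each string must occur somewhere in the data, at any offset."""
    return all(s in data for s in strings)


def rule_matches(data, patterns, strings=()):
    # Assumption (not confirmed by the documentation): all patterns and
    # all strings are required for the rule to match.
    return front_block_matches(data, patterns) and global_strings_match(data, strings)


# Hypothetical sample file: "MYFT" at offset 0, bytes 01 02 at offset 16,
# and a marker string somewhere later in the content.
sample = b"MYFT" + b"\x00" * 12 + b"\x01\x02" + b"....payload-marker...."
matched = rule_matches(
    sample,
    patterns=[(0, "4D594654"), (16, "0102")],
    strings=[b"payload-marker"],
)
```

Here the FrontBlock patterns are tied to fixed positions, while the global string is found wherever it appears in the file.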

Group ID and name

Group                        Group                                  Group
A: Archive Files             G: Image Files                         T: Text Files
AP: Application Files        I: Disk Image Files                    Z: Email Files
D: Office Documents          M: Media Files                         O: Other
D_ENC: Encrypted Documents   OPENSSL_ENC: OpenSSL Encrypted Files
E: Executable Files          P: Adobe Files

The current use case is to turn a file type that the engine's native rules detect as unknown (DATA) or uncertain (non-DATA with a score below 1.0) into a user-defined type with a higher score.

A file can match both a custom rule and a built-in rule. To prioritize the detection result from the custom rule, define the custom rule with a high confidence score, e.g., 1.1.
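
The reasoning behind a high score can be sketched in a few lines of Python. This assumes that the result with the highest score is preferred, which is implied by the text but not spelled out. The type IDs and the `source` field are hypothetical.

```python
def pick_detection(candidates):
    """Return the candidate with the highest score (assumed selection rule)."""
    return max(candidates, key=lambda c: c["score"])


candidates = [
    {"id": "BUILTIN_TYPE", "score": 1.0, "source": "built-in"},  # hypothetical
    {"id": "MY_CUSTOM_TYPE", "score": 1.1, "source": "custom"},  # hypothetical
]
winner = pick_detection(candidates)
```

With a custom score of 1.1, the custom result outranks a built-in result scored at 1.0.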

The "detection score" can be found in the JSON scan result: filetype_info.file_info.likely_type_ids.score
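
As a rough illustration, the score can be read from a scan result with a short Python helper. The sample JSON is hypothetical. The sketch assumes that `likely_type_ids` is a list of objects and that each object carries an `id` key next to `score`. Only the path itself comes from the text above.

```python
import json


def likely_type_scores(scan_result):
    """Return (id, score) pairs from filetype_info.file_info.likely_type_ids."""
    entries = (
        scan_result.get("filetype_info", {})
        .get("file_info", {})
        .get("likely_type_ids", [])
    )
    # Assumption: the field is a list of objects; a single object is wrapped.
    if isinstance(entries, dict):
        entries = [entries]
    return [(entry.get("id"), entry.get("score")) for entry in entries]


# Hypothetical scan result, trimmed to the relevant path.
raw = '{"filetype_info": {"file_info": {"likely_type_ids": [{"id": "MY_CUSTOM_TYPE", "score": 1.1}]}}}'
scores = likely_type_scores(json.loads(raw))
```

If the path is missing from a result, the helper returns an empty list instead of raising an error.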

Example rules

The example XML rules cover the following cases:

  • Rule 1 - Pattern at offset 0
  • Rule 2 - Pattern at offset 16
  • Rule 3 - Have global string
  • Rule 4 - Multiple global strings