Similarity Search - Introduction
Our ML-Based Similarity Search leverages advanced feature extraction techniques to identify and correlate unknown threats with known malware families. By analyzing behavioral patterns, code structures, and static attributes, our machine learning models detect even evasive or zero-day threats that traditional signature-based methods may miss.
This capability enables security teams to quickly pivot between related threats, uncover hidden malware clusters, and enhance threat hunting efficiency—making it a powerful tool for identifying and responding to emerging cyber threats.
Portable Executable type
For Portable Executable (PE) files, Similarity Search uses the features listed below. They are selected for their ability to produce accurate and relevant results, and they are updated continuously to keep pace with current malware trends and techniques.
| Field name | Type | Description |
|---|---|---|
| Language | String | Natural (spoken) language that the binary targets |
| Entry point section name | String | Name of the section where the entry point of the PE resides. This is a calculated value based on the supplied entry point address and the section details. |
| Pdb path | String | Path of the PDB file on the machine where the binary was compiled |
| DetectItEasyInfo | String | Information extracted using Detect It Easy |
| Malware config | String | The settings and parameters within malicious software that dictate its behavior |
| File size | Number | Size of the input file |
| Unix timestamp | Number | A timestamp showing when the file was compiled |
| Subsystem | Number | Indicates whether the PE is built as a console or GUI application |
| Section number | Number | Number of sections present in the PE |
| Resource number | Number | Number of resources present in the PE |
| Resources to file ratio | Number | Ratio of the size of the resources to the size of the file itself |
| Digitally Signed | Boolean | Whether the file is digitally signed or not |
| Packed | Boolean | Whether the input file is packed or not |
| Total exported functions | Number | Indicates the number of exported functions in a PE |
| Total imported functions | Number | Indicates the number of imported functions in a PE |
| Digital signature verification | String | Result of verifying the digital signature |
| Pdb guid | String | GUID of the PDB associated with the binary |
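
The "Entry point section name" and "Resources to file ratio" rows describe calculated values. The following is a minimal sketch of how such values could be derived, not the product's actual implementation. The `Section` type, the function names, and the sample section layout are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Section:
    """Hypothetical stand-in for a PE section header."""
    name: str
    virtual_address: int  # RVA at which the section starts in memory
    virtual_size: int     # size of the section in memory, in bytes


def entry_point_section_name(entry_point_rva: int,
                             sections: List[Section]) -> Optional[str]:
    """Return the name of the section whose address range contains the entry point."""
    for section in sections:
        start = section.virtual_address
        end = start + section.virtual_size
        if start <= entry_point_rva < end:
            return section.name
    return None  # entry point lies outside every listed section


def resources_to_file_ratio(resource_bytes: int, file_size: int) -> float:
    """Return the total resource size divided by the file size."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    return resource_bytes / file_size


# Illustrative section layout (made-up values).
sections = [
    Section(".text", 0x1000, 0x5000),
    Section(".rdata", 0x6000, 0x2000),
    Section(".rsrc", 0x8000, 0x1000),
]

print(entry_point_section_name(0x1A2B, sections))
print(resources_to_file_ratio(0x1000, 0x10000))
```

An entry point that falls outside every section yields `None` in this sketch; how the product reports that case is not stated here.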
Similarity Search Filters
In addition to the feature-based matching described above, Similarity Search supports multiple filter parameters. Combining filters narrows the result set so that users receive the results most relevant to their specific needs.
| Field name | Type | Possible values | Example | Description | Required |
|---|---|---|---|---|---|
| SHA-256 | String | | Number | | Yes |
| Submission date | Date | 2023-01-17T12:17:20.000Z | Number | | Optional |
| Final Verdict | String | MALICIOUS, LIKELY-MALICIOUS, NO-THREAT, SUSPICIOUS, BENIGN, UNKNOWN | MALICIOUS | Verdict of a file | Optional |
| Tags | String | | peexe,xml | Tags of a file | Optional |
| Threshold | Number | Any integer from 1 to 100 | Number | Similarity threshold (0% to 100%). A higher score means higher similarity. | Optional |
| Limit | Number | Any integer from 1 to 100 | Number | Number of results to return | Optional |
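
As an illustration of the constraints in the table above, the following sketch validates a set of filter values on the client side before they are used in a search. It is an assumption-laden example: the function name `build_filters` and the dictionary keys are hypothetical and do not represent the actual request format, and the threshold range follows the "Possible values" column (1 to 100).

```python
from datetime import datetime

# Verdict values as listed in the filter table.
ALLOWED_VERDICTS = {
    "MALICIOUS", "LIKELY-MALICIOUS", "NO-THREAT",
    "SUSPICIOUS", "BENIGN", "UNKNOWN",
}
HEX_DIGITS = set("0123456789abcdefABCDEF")


def build_filters(sha256, submission_date=None, verdict=None,
                  tags=None, threshold=None, limit=None):
    """Validate filter values and return them as a plain dict (hypothetical keys)."""
    # SHA-256 is the only required filter: 64 hexadecimal characters.
    if len(sha256) != 64 or not set(sha256) <= HEX_DIGITS:
        raise ValueError("sha256 must be 64 hexadecimal characters")
    filters = {"sha256": sha256.lower()}

    if submission_date is not None:
        # Matches the documented example format: 2023-01-17T12:17:20.000Z
        datetime.strptime(submission_date, "%Y-%m-%dT%H:%M:%S.%fZ")
        filters["submission_date"] = submission_date

    if verdict is not None:
        if verdict not in ALLOWED_VERDICTS:
            raise ValueError(f"unknown verdict: {verdict}")
        filters["verdict"] = verdict

    if tags:
        # Tags are joined with commas, as in the example value "peexe,xml".
        filters["tags"] = ",".join(tags)

    for name, value in (("threshold", threshold), ("limit", limit)):
        if value is not None:
            if not isinstance(value, int) or not 1 <= value <= 100:
                raise ValueError(f"{name} must be an integer from 1 to 100")
            filters[name] = value

    return filters


example = build_filters(
    "a" * 64,  # placeholder hash, not a real sample
    submission_date="2023-01-17T12:17:20.000Z",
    verdict="MALICIOUS",
    tags=["peexe", "xml"],
    threshold=80,
    limit=10,
)
print(example)
```

Validating locally in this way catches malformed values, such as an out-of-range threshold, before a search is submitted.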
See the "Technical Datasheet" for a complete list of features: https://docs.opswat.com/filescan/datasheet/technical-datasheet
