Multiscanning – Making Sense of the Numbers

Author: Randy Abrams, Sr. Security Analyst, OPSWAT.

Antimalware companies report seeing hundreds of thousands of brand-new unique malware samples every day. For example, Symantec reported seeing 7.8 million unique malware samples in the month of April 2019. The independent test lab AV-Test reported seeing over 965 million samples between January 1 and October 1, 2019. McAfee reported seeing 504 new samples each minute in Q1 of 2019.

The media often confuses the largest number reported by a single security company with THE total number of new malware samples released each day. This is not correct, nor is the sum of all reported numbers correct. Let’s dive into the numbers and make some sense of what it all means.

Is it hype?

Well, yes and no. While there is no reason to believe the numbers are not honest, few of the polymorphic samples come from new families of malware. Cerber is one family of polymorphic malware. Trend Micro reports that they see as many as 4 new unique Cerber polymorphs each minute. Extrapolating, we arrive at 5,760 brand-new unique samples each day, but they are all variants of the Cerber family. While these are technically unique malware samples, a single well-written heuristic will detect most, if not all, of them. Other families of malware create polymorphic samples even more frequently. According to the 2018 Webroot Threat Report, 94% of the new malware they see is polymorphic. Virtually all of these samples are seen just once in the wild. Let’s look at this number a different way. Ninety-four percent of the new samples seen by Webroot each day will never be seen by any test lab or any other antimalware vendor. This does not mean that no other vendor is able to detect the polymorphic variants; many will be detected sight unseen. It does mean that we do not know how many of these samples might go undetected by other vendors.
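The extrapolation above is simple arithmetic: a per-minute rate multiplied by the 1,440 minutes in a day. Here is a minimal sketch of it; the function name is my own, used only for illustration.

```python
MINUTES_PER_DAY = 24 * 60  # 1,440 minutes in a day


def per_day(samples_per_minute: int) -> int:
    """Extrapolate a per-minute sample rate to a per-day count."""
    return samples_per_minute * MINUTES_PER_DAY


cerber_per_day = per_day(4)    # Trend Micro's Cerber figure
mcafee_per_day = per_day(504)  # McAfee's Q1 2019 figure

print(f"{cerber_per_day:,}")  # 5,760
print(f"{mcafee_per_day:,}")  # 725,760
```

The same multiplication turns McAfee's 504 samples per minute into the 725,760 per day used in Table 1.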

The “per day” column in Table 1 shows the number of unique samples per day reported by each vendor. The vendors with stars by their names did not report numbers, so artificially low numbers were arbitrarily assigned. Vendors A through F represent smaller vendors, and 10,000 was the value assigned to each in order to avoid overstating the numbers. Real-world numbers may be significantly higher.

It is important to note that different vendors may count samples by different criteria, and no correlation can be drawn between the numbers of samples and the quality of the product. 


Report          Per day      .9 multiplier   Date Range    Samples/Day Not Seen
--------------  -----------  --------------  ------------  --------------------
AV-TEST         376,000      338,400         2019          2,086,458
McAfee          725,760      653,184         Q1 2019       1,771,674
Kaspersky       360,000      324,000         2017          2,100,858
Symantec        222,527      200,274         Q1-Q3 2019    2,224,584
*Trend Micro    350,000      315,000         N/A           2,109,858
*ESET           300,000      270,000         N/A           2,154,858
*Bitdefender    300,000      270,000         N/A           2,154,858
**Vendor A      10,000       9,000           N/A           2,415,858
**Vendor B      10,000       9,000           N/A           2,415,858
**Vendor C      10,000       9,000           N/A           2,415,858
**Vendor D      10,000       9,000           N/A           2,415,858
**Vendor E      10,000       9,000           N/A           2,415,858
**Vendor F      10,000       9,000           N/A           2,415,858
Total           2,694,287    2,424,858

Table 1

Researchers I have spoken with from several antimalware companies have indicated that 85 to 95 percent of the samples they see each day are seen only once. The “.9 multiplier” column in Table 1 is the number of each vendor’s samples that no other vendor will see. The “Samples/Day Not Seen” column is the total of the .9 multiplier column minus the vendor’s own entry, which is the number of samples per day that the vendor itself never sees. For example, AV-Test does not see 90% of the combined number of samples seen by all of the other vendors. McAfee will never see 90% of the samples that AV-Test and all other vendors see. The same applies to all vendors.
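The columns of Table 1 can be reproduced with a few lines of arithmetic. This is a sketch under the same assumptions as the table: the starred vendors carry assigned values rather than reported ones, and 90% is the midpoint of the 85 to 95 percent range. The variable names are my own.

```python
# Per-day figures from Table 1. Values for starred vendors were assigned, not reported.
PER_DAY = {
    "AV-TEST": 376_000,
    "McAfee": 725_760,
    "Kaspersky": 360_000,
    "Symantec": 222_527,
    "*Trend Micro": 350_000,
    "*ESET": 300_000,
    "*Bitdefender": 300_000,
    **{f"**Vendor {letter}": 10_000 for letter in "ABCDEF"},
}

# ".9 multiplier" column: samples that only this vendor sees.
# Integer arithmetic (x * 9 // 10) avoids floating-point rounding surprises.
seen_once = {vendor: count * 9 // 10 for vendor, count in PER_DAY.items()}
total_seen_once = sum(seen_once.values())

# "Samples/Day Not Seen" column: every other vendor's seen-once samples.
not_seen = {vendor: total_seen_once - own for vendor, own in seen_once.items()}

for vendor in PER_DAY:
    print(f"{vendor:<14}{PER_DAY[vendor]:>11,}{seen_once[vendor]:>11,}{not_seen[vendor]:>12,}")
print(f"{'Total':<14}{sum(PER_DAY.values()):>11,}{total_seen_once:>11,}")
```

Changing the multiplier to 85% or 95% shifts the totals, but not the conclusion: every vendor misses the large majority of what the others see.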

It is important to note that vendors have heuristic algorithms that can detect a large portion of the polymorphic samples they have never seen. If detection never existed before the malware sample did, no vendor would score 100% in tests. Vendors have their own proprietary heuristic algorithms. These proprietary algorithms may be more, or less, effective against different families and variants of polymorphic malware.

I have not yet talked about the 10% of the malware that is seen multiple times, sometimes hundreds of thousands of times. These threats include malware families such as WannaCry and Emotet. The time it takes to add detection for rapidly spreading threats is your window of vulnerability. The first product to provide detection is the first to offer protection. This means that the more scanners you use, the shorter your window of vulnerability will be. The moment the window of vulnerability opens, employing the best antimalware product is essential. But what is the best antimalware scanner?
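To see why more scanners shorten the window, treat the window of vulnerability as the time between a threat first appearing and the earliest scanner in your set adding detection. The sketch below uses made-up scanner names and timestamps purely for illustration; they are not measurements of any real product.

```python
from datetime import datetime


def window_of_vulnerability(first_seen, detection_added):
    """Time from first appearance until the earliest scanner adds detection."""
    return min(detection_added.values()) - first_seen


# Hypothetical timeline: names and times are invented for this example.
first_seen = datetime(2019, 1, 1, 9, 0)
detection_added = {
    "Scanner 1": datetime(2019, 1, 1, 21, 0),  # 12 hours after first_seen
    "Scanner 2": datetime(2019, 1, 1, 12, 0),  # 3 hours after first_seen
    "Scanner 3": datetime(2019, 1, 2, 9, 0),   # 24 hours after first_seen
}

single = window_of_vulnerability(
    first_seen, {"Scanner 1": detection_added["Scanner 1"]}
)
multi = window_of_vulnerability(first_seen, detection_added)

print(single)  # 12:00:00
print(multi)   # 3:00:00
```

Because the window is a minimum over the set, adding a scanner can shorten it but can never lengthen it.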

Jimmy Kuo, a stalwart of the antimalware industry, defines the best as follows.

“The best antivirus is the one that blocks the viruses that you encounter. And the ones you are most likely to encounter are the ones that you just saw”

Let’s look at a practical application of Jimmy’s definition. In November 2015, a specific Trojan was submitted to MetaDefender Cloud for examination. The only scanner that displayed detection for the threat was from a little-known vendor named Filseclab. If you were using Filseclab and encountered the threat, then you were using the best scanner at that moment. If you weren’t using Filseclab at that moment and you encountered that Trojan, it was probably a bad day for you.

Is a sample from November 2015 too stale? On February 2, 2016, only four scanners detected this threat. These scanners were Antiy, AegisLab, Filseclab, and Zillya! Were you using any of these scanners? No? Perhaps you weren’t using the best scanner that day.

Still too stale? Is October 29, 2019 fresh enough? Same threat, but nearly four years later only 16 of 39 products displayed detection for the threat. If you were not using one of the sixteen scanners, you might not have been using the best if you encountered this Trojan. Many of the best-known products are still not detecting this malware.
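The examples above boil down to combining per-scanner verdicts. Here is a minimal sketch of one possible policy, "block if any engine detects." The four detecting engines are the ones named for February 2, 2016; "Engine X" and "Engine Y" are hypothetical stand-ins for products that missed the threat, and the function is my own illustration, not MetaDefender's implementation.

```python
def summarize(results):
    """Combine per-engine verdicts into a detection ratio and a block decision."""
    detected = sorted(name for name, hit in results.items() if hit)
    return {
        "detected_by": detected,
        "ratio": f"{len(detected)}/{len(results)}",
        "blocked": bool(detected),  # policy: block if at least one engine detects
    }


results = {
    "Antiy": True,
    "AegisLab": True,
    "Filseclab": True,
    "Zillya!": True,
    "Engine X": False,  # hypothetical engine that missed the threat
    "Engine Y": False,  # hypothetical engine that missed the threat
}

multiscan = summarize(results)
single_engine = summarize({"Engine X": results["Engine X"]})

print(multiscan["ratio"], multiscan["blocked"])          # 4/6 True
print(single_engine["ratio"], single_engine["blocked"])  # 0/1 False
```

A stricter policy, such as requiring two or more detections, would trade some protection for fewer false positives.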

I can provide numerous examples of significant delays in detection. But what about rapidly spreading threats, such as WannaCry? That is when you need “the best antivirus.” Unfortunately, you can’t install 20+ scanners on your endpoints. But that does not mean you cannot be using the best scanner. If you are using MetaDefender Cloud, you are using the best scanner for almost all the threats you will encounter. MetaDefender is an API-driven solution that integrates with email servers, ICAP servers, access control solutions, and more.

If you can’t send your files to the cloud, MetaDefender Core can be deployed on-premises and includes additional proactive technologies that do not rely upon detection.

If you’d like to know more, we’d love to hear from you. For now, why don’t you sign up for a free MetaDefender Cloud account at https://go.opswat.com/communityRegistration?

About the author: In 1997 Randy designed, maintained and administered the multiscanning system that Microsoft uses to ensure that no malware is released. Multiscanning worked in 1997, and it works now! 
