How Does Machine Learning Detect Malware? The Explainer Nobody Simplified

The short version: signature-based antivirus is like a bouncer with a photo book — if your face isn't in the book, you walk in. Machine learning detection is like a bouncer who's studied thousands of fights and learned to read body language — even if you're a stranger, the way you carry yourself gives you away. That distinction is what separates "we caught yesterday's virus" from "we caught a zero-day attack nobody had cataloged yet." Below: what signatures actually do, why they're not enough, how ML classifiers work, the three main approaches, what Casper's Cloak does at the network layer specifically, and the honest limitations.

Signature-based vs ML-based detection at a glance

Before we get into the mechanics, here's the high-level comparison. Both approaches have a place; the question is which threats each one handles well.

Factor	Signature-based	ML-based
Detection speed for known threats	Very fast — direct hash/pattern match	Fast — feature extraction + classification
Zero-day detection	None until a signature is written	Can catch novel threats by behavioral pattern
False positive rate	Very low — exact match	Higher — probabilistic classification
Update frequency	Hourly/daily signature DB pushes	Model retrained periodically; inference runs continuously
Resource usage	Low — pattern lookup in a database	Moderate — depends on model complexity and where inference runs
Evasion difficulty	Easy — change one byte, evade the signature	Harder — must change enough features to shift the classification

The real-world answer is "use both." Signatures are fast and precise for known threats; ML catches the novel ones. Most modern security products layer them. The interesting question is how the ML part works — and that's what we'll spend the rest of this article on.

What signature-based detection does (and why it's not enough)

Signature-based detection is conceptually simple: a security researcher analyzes a new piece of malware, extracts a unique byte sequence (the "signature"), and adds it to a database. Every file your antivirus scans is compared against that database. Match found? Quarantined. No match? Treated as clean, whether or not it is.
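
As a rough illustration of that lookup, here is a minimal sketch of a signature scanner. The sample bytes, the byte pattern, and the `scan` function are all made up for this example; real engines use far richer signature formats. It also shows why a one-byte change is enough to slip past a hash-based signature.

```python
import hashlib

# Hypothetical signature database: raw byte patterns an analyst extracted.
KNOWN_BAD_PATTERNS = [b"\xde\xad\xbe\xef\x13\x37"]


def sha256_hex(data: bytes) -> str:
    """Hex digest used as a whole-file signature."""
    return hashlib.sha256(data).hexdigest()


def scan(data: bytes, bad_hashes: set[str], bad_patterns: list[bytes]) -> str:
    """Return 'quarantine' on any signature match, else 'clean'."""
    if sha256_hex(data) in bad_hashes:
        return "quarantine"
    if any(pattern in data for pattern in bad_patterns):
        return "quarantine"
    return "clean"


# A stand-in for a cataloged malware sample (not real malware).
sample = b"MZ\x90\x00 fake payload for illustration"
bad_hashes = {sha256_hex(sample)}

# The cataloged sample is caught by its hash.
verdict_known = scan(sample, bad_hashes, KNOWN_BAD_PATTERNS)

# Change the final byte: the hash no longer matches, so the scan passes it.
variant = sample[:-1] + b"N"
verdict_variant = scan(variant, bad_hashes, KNOWN_BAD_PATTERNS)

print(verdict_known, verdict_variant)
```

The variant is reported as clean only because nothing in the database matches it, which is exactly the blind spot the rest of this section describes.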

This model worked well through the early 2000s, when new malware appeared at a rate of dozens per day and human analysts could keep up. The AV-TEST Institute now registers over 450,000 new malware and potentially unwanted applications per day. No team of human analysts writes 450,000 signatures per day. Even automated signature generation struggles to keep up, because malware authors use polymorphism: they change the code's byte sequence on every distribution while preserving its behavior. A polymorphic virus might have a thousand variants, each with a different hash, each requiring a different signature. The game is rigged against static pattern matching.

The second problem is the window of exposure. Between the moment a new piece of malware appears in the wild and the moment a signature for it ships to your device, there's a gap. That gap averaged 24–48 hours in 2020; with faster pipeline automation it's shorter now, but it's never zero. During that gap, your signature-based antivirus is completely blind to the new threat. That gap is where zero-day attacks live.

The third problem is mobile. Traditional signature-based scanning requires scanning every file against a large database — an operation that drains battery and demands storage for the signature database itself. iOS doesn't even allow apps to scan other apps' files. On mobile, running a traditional AV engine in the background is either impractical (Android battery drain) or architecturally impossible (iOS sandboxing). The detection has to happen somewhere else.

How ML models learn what "malicious" looks like

Machine learning detection takes a fundamentally different approach. Instead of memorizing individual malware specimens, you train a model on features — measurable characteristics that tend to differ between malicious and benign software (or malicious and benign network behavior). The model learns a decision boundary: "things with these feature combinations are probably malicious; things with those combinations are probably safe."

Step 1: Collect a training dataset

You need labeled examples — known-malicious samples and known-benign samples. For file-based ML, this typically means millions of PE (Portable Executable) files from threat intelligence feeds, public repositories like VirusTotal, corporate honeypots, and malware-sharing consortiums. For network-based ML, it means labeled DNS query logs, network flow records, and domain registration metadata — millions of known-malicious domains and millions of known-benign ones.

Step 2: Extract features

Raw data isn't useful to a classifier directly. You transform each sample into a feature vector — a list of numbers that describe its characteristics. For a Windows executable, features might include: the entropy of each code section (packed/encrypted code has higher entropy), imported API calls (malware tends to import process-injection and registry-manipulation APIs), the ratio of read-write to read-only sections, whether the file has a valid digital signature, string patterns in the binary. For a domain name, features might include: domain age, registrar, WHOIS privacy status, TLS certificate attributes, DNS record history, lexical structure of the domain name itself.
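
To make "feature vector" concrete, here is a small sketch of two of the features mentioned above: Shannon entropy of a byte section, and a few lexical features of a domain name. The function names and the choice of three lexical features are assumptions for illustration, not a description of any particular product's feature set.

```python
import math
from collections import Counter


def shannon_entropy(data) -> float:
    """Shannon entropy in bits per symbol; works for bytes or str."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def domain_features(domain: str) -> list[float]:
    """Tiny lexical feature vector for the leftmost label of a domain."""
    label = domain.lower().split(".")[0]
    digits = sum(ch.isdigit() for ch in label)
    return [
        float(len(label)),                      # label length
        shannon_entropy(label),                 # character entropy
        digits / len(label) if label else 0.0,  # fraction of digits
    ]


# Synthetic "sections": zero padding versus data where every byte value
# appears equally often, which is what packed or encrypted code resembles.
padding_section = bytes(1024)
packed_like_section = bytes(range(256)) * 4

print(shannon_entropy(padding_section), shannon_entropy(packed_like_section))
print(domain_features("xk3j7mq9.net"))
print(domain_features("google.com"))
```

Entropy on short strings is a weak signal by itself, since a short label can only reach a limited entropy, which is one reason classifiers combine it with other features.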

Step 3: Train a classifier

With labeled feature vectors in hand, you train a machine learning model: gradient-boosted trees (XGBoost, LightGBM), random forests, deep neural networks, or ensemble combinations of several. The model learns which feature combinations correlate with "malicious" and which correlate with "benign." Cross-validation and holdout test sets are how you check that the model generalizes to unseen samples rather than memorizing the training set.
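
The training loop itself can be sketched in a few lines. To stay dependency-free, this example swaps in plain logistic regression trained by stochastic gradient descent rather than the gradient-boosted trees or neural networks named above. The two features and the synthetic data ranges are invented for illustration.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_dataset(n_per_class: int, rng: random.Random) -> list:
    """Synthetic samples: [scaled section entropy, suspicious-import ratio]."""
    data = []
    for _ in range(n_per_class):
        data.append(([rng.uniform(0.7, 1.0), rng.uniform(0.3, 0.9)], 1))  # malicious
        data.append(([rng.uniform(0.2, 0.6), rng.uniform(0.0, 0.4)], 0))  # benign
    return data


def train(samples: list, epochs: int = 200, lr: float = 0.1, seed: int = 0):
    """Fit weights and bias with per-sample gradient steps on log loss."""
    rng = random.Random(seed)
    samples = list(samples)
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        rng.shuffle(samples)
        for features, label in samples:
            p = sigmoid(bias + sum(w * x for w, x in zip(weights, features)))
            error = p - label  # gradient of log loss with respect to the logit
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


def predict_proba(model, features: list) -> float:
    weights, bias = model
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def accuracy(model, samples: list) -> float:
    hits = sum((predict_proba(model, f) >= 0.5) == bool(y) for f, y in samples)
    return hits / len(samples)


rng = random.Random(42)
dataset = make_dataset(100, rng)
rng.shuffle(dataset)
train_set, holdout_set = dataset[:160], dataset[160:]  # 80/20 split

model = train(train_set)
holdout_accuracy = accuracy(model, holdout_set)
print(f"holdout accuracy: {holdout_accuracy:.2f}")
```

The holdout set is never touched during training, so its accuracy is an estimate of how the model behaves on samples it has not seen. Real datasets are far messier than this synthetic one, so real numbers will be lower.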

Step 4: Deploy for inference

The trained model is deployed where it can score new samples in real time. For file-based detection, that's usually an endpoint agent on the user's device or a cloud sandbox. For network-based detection, it's a classifier sitting inline on the DNS resolver or network proxy. A new sample arrives, features are extracted, the model scores it (typically as a probability between 0.0 and 1.0), and a threshold determines the action — block, allow, or flag for human review.
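
The thresholding step is simple enough to show directly. This is a minimal sketch; the two cutoff values are arbitrary placeholders, since the right thresholds depend on the model and on how much risk of false positives an operator accepts.

```python
def decide(score: float, block_at: float = 0.9, review_at: float = 0.6) -> str:
    """Map a model probability to an action using two hypothetical cutoffs."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be a probability between 0.0 and 1.0")
    if score >= block_at:
        return "block"
    if score >= review_at:
        return "flag_for_review"
    return "allow"


for score in (0.95, 0.7, 0.1):
    print(score, decide(score))
```

Lowering `block_at` catches more threats at the cost of more false positives, and raising it does the reverse.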

The critical difference from signatures: the model has never seen this specific sample before, but it recognizes the pattern. A new polymorphic variant of existing ransomware changes its byte sequence but still imports the same encryption APIs, still has high-entropy sections, still lacks a valid certificate — the model catches it because the features match the malicious pattern, even though no human wrote a rule for this variant.

The three main ML approaches to malware detection

ML-based detection isn't one thing — it's at least three different approaches, each operating at a different layer of the stack and catching different categories of threats.

1. Static analysis: inspecting the file without running it

Static ML models analyze the file's structure, metadata, and contents without executing it. Features include PE header attributes, section entropy, import tables, embedded strings, resource metadata, packer identification, and increasingly, raw byte sequences fed into convolutional neural networks that learn their own features from the binary directly.

Strengths: fast (no sandbox execution needed), scales to millions of files, catches many commodity threats. Weaknesses: can be evaded by obfuscation, packing, or metamorphic code that restructures itself. A sufficiently motivated attacker can transform their binary until the static features look benign. The MITRE ATT&CK framework documents these evasion techniques under Defense Evasion (TA0005).

2. Dynamic/behavioral analysis: running the file and watching what it does

Dynamic analysis executes the suspect file in a sandbox (an instrumented virtual machine) and records its runtime behavior: which files it reads and writes, which registry keys it modifies, which network connections it opens, which processes it spawns, which system calls it makes. An ML model trained on these behavioral traces can classify the sample based on what it does, not what it looks like.
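
One simple way to turn a behavioral trace into model input is to count events by type. The event names and trace format below are hypothetical; real sandboxes log far more detail and use their own schemas.

```python
from collections import Counter

# Hypothetical event types a sandbox might record.
EVENT_TYPES = ["file_write", "registry_set", "process_spawn", "network_connect"]


def trace_to_features(trace: list[dict]) -> list[int]:
    """Count events of each known type; the order matches EVENT_TYPES."""
    counts = Counter(event["type"] for event in trace)
    return [counts[event_type] for event_type in EVENT_TYPES]


# A made-up trace resembling ransomware-style activity.
trace = [
    {"type": "file_write", "target": "C:/Users/demo/report.docx.locked"},
    {"type": "file_write", "target": "C:/Users/demo/photo.jpg.locked"},
    {"type": "registry_set", "target": "HKCU/Software/Run/updater"},
    {"type": "network_connect", "target": "203.0.113.10:443"},
]

features = trace_to_features(trace)
print(dict(zip(EVENT_TYPES, features)))
```

Counts discard ordering, so production systems often add sequence-aware features as well, but a fixed-length vector like this is enough for a classifier to consume.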

Strengths: much harder to evade — even if the binary is obfuscated, the behavior has to unfold for the malware to achieve its goal. Ransomware has to encrypt files; a keylogger has to hook the keyboard; a C2 beacon has to phone home. The behavior is the ground truth. Weaknesses: slow (sandbox execution takes seconds to minutes), expensive (VMs consume compute), and some malware detects sandboxes and refuses to execute. Sandbox evasion is its own cat-and-mouse game.

3. Network traffic analysis: classifying connections without inspecting payloads

Network-level ML operates on the metadata of network connections — DNS queries, TLS handshake attributes, flow statistics, domain registration patterns — without decrypting or inspecting the actual payload. This is the approach most relevant to consumer mobile security, and it's where Casper's Cloak operates.

Strengths: works on any device without installing an endpoint agent that scans files, handles encrypted traffic without breaking encryption, scales to millions of DNS queries per second, catches phishing and C2 infrastructure at the domain-registration level before the malware even reaches the user's device. Weaknesses: can't see what's inside the encrypted payload, can't detect malware that doesn't make network connections, and relies on the quality of domain/network features.

What features does the model actually look at?

This is where the abstract "ML detects patterns" becomes concrete. For network-level ML classification — the approach most relevant to Casper — the features fall into six categories:

Domain age and registration patterns. Malicious domains are overwhelmingly young — registered days or hours before use, often in bulk via automated registrars. A domain registered 72 hours ago through a privacy-WHOIS registrar that also registered 200 other domains in the same batch is statistically far more likely to be malicious than a domain registered eight years ago with a consistent WHOIS record. The classifier weights domain age heavily, and for good reason: NIST's Cybersecurity Framework identifies asset management and supply-chain risk (including domain provenance) as core identify-function controls.
TLS certificate attributes. Many large organizations use organization-validated (OV) certificates from commercial CAs, though plenty of legitimate sites use free certificates too. Phishing and malware infrastructure leans on free, automatically issued DV (Domain Validation) certificates, and the telling pattern is a high volume of Let's Encrypt certificates issued in rapid succession for domains with no prior web presence. The classifier examines certificate issuer, validity period, SAN (Subject Alternative Name) entries, and whether the certificate chain matches the domain's apparent purpose.
DNS behavior. Malware C2 infrastructure often uses fast-flux DNS (rapidly rotating IP addresses behind one domain), DGA-generated domain names (algorithmically generated strings like xk3j7mq9.net), or unusual record types (TXT records used for data exfiltration). The classifier looks at TTL values, the number of IP addresses returned, geographic diversity of resolved IPs, and lexical entropy of the domain name itself.
Hosting infrastructure. Bulletproof hosting providers and specific ASNs (Autonomous System Numbers) are disproportionately associated with malware infrastructure. The model learns which hosting environments correlate with malicious activity — not as a blocklist, but as a weighted feature alongside all the others.
Request patterns. Malware C2 beacons tend to produce distinctive traffic patterns: regular-interval callbacks, small payloads in both directions, connections to infrastructure that has no legitimate web content. The classifier can identify these patterns from flow metadata (timing, size, direction) without inspecting the encrypted content.
Lexical and structural analysis of the domain name. DGA domains have high entropy and low pronounceability. Phishing domains mimic legitimate brands with character substitutions (paypa1.com, arnazon-security.com). NLP-based features measure edit distance to known brands, character n-gram distributions, and whether the domain follows the structural patterns of legitimate registrations.
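
The brand-similarity feature from the last category can be sketched with a standard Levenshtein edit distance. The three-entry brand list and the tokenizing rule are assumptions made for this example; a real system would use a much larger list and handle more lookalike tricks.

```python
import re

# Hypothetical short brand list for illustration.
BRANDS = ["paypal", "amazon", "google"]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def closest_brand(domain: str) -> tuple[str, int]:
    """Return (brand, distance) for the brand nearest to any token in the name."""
    name = domain.lower().rsplit(".", 1)[0]  # drop the final label (the TLD)
    tokens = [t for t in re.split(r"[^a-z0-9]+", name) if t]
    candidates = [(edit_distance(t, b), b) for t in tokens for b in BRANDS]
    distance, brand = min(candidates)
    return brand, distance


print(closest_brand("paypa1.com"))
print(closest_brand("arnazon-security.com"))
```

A distance of zero is the brand itself, so the suspicious range is a small nonzero distance: close enough to fool a reader, but not an exact match.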

No single feature is dispositive — plenty of legitimate domains are new, use Let's Encrypt, or are hosted on shared infrastructure. The classifier's job is to weigh all features simultaneously and produce a risk score. A new domain alone might score 0.3; a new domain with a DGA-like name, a 48-hour-old Let's Encrypt cert, hosted on a known-abused ASN, serving no web content? That scores 0.95+.
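
A toy version of that weighing is a logistic combination of binary signals. The weights and bias below are hand-picked so the example lands near the figures in the paragraph above; they are not learned from data and do not come from any real classifier.

```python
import math

# Hypothetical, hand-picked values for illustration only.
BIAS = -3.0
WEIGHTS = {
    "new_domain": 2.15,
    "dga_like_name": 1.5,
    "fresh_dv_cert": 1.0,
    "abused_asn": 1.5,
    "no_web_content": 1.0,
}


def risk_score(signals: set[str]) -> float:
    """sigmoid(bias + sum of the weights for the signals that are present)."""
    logit = BIAS + sum(WEIGHTS[name] for name in signals)
    return 1.0 / (1.0 + math.exp(-logit))


new_only = risk_score({"new_domain"})
everything = risk_score(set(WEIGHTS))

print(f"new domain alone:    {new_only:.2f}")
print(f"all signals present: {everything:.2f}")
```

With these made-up weights, a new domain alone scores roughly 0.30, while all five signals together score above 0.95. The point is the shape of the calculation: each signal nudges the score, and only the combination pushes it past a blocking threshold.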

How Casper's Cloak uses network-level ML detection

Casper's threat protection operates at the DNS resolution layer. When your device makes a DNS query — any app, any connection — the query passes through Casper's resolver before an IP address is returned. Here's what happens in the milliseconds between query and response:

The DNS query arrives at Casper's resolver. Your phone asks "what's the IP for suspicious-domain.com?" The query travels through the encrypted VPN tunnel, so no one on the local network (coffee-shop WiFi, ISP) can see it.
Static blocklist check. First, a fast lookup against curated blocklists — known phishing domains, known malware C2 domains, known tracker domains. This is the equivalent of the signature check: precise, fast, and limited to known threats. If there's a match, the domain is blocked immediately.
ML classifier scoring. If the domain isn't on any static list, the classifier extracts features: domain age, registrar, certificate metadata, DNS record history, lexical analysis, hosting infrastructure, and recent query patterns for this domain across all Casper users (anonymized and aggregated). The model scores the domain's risk.
Threshold-based decision. If the risk score exceeds the configured threshold, the domain is blocked (the resolver returns NXDOMAIN). If it's below the threshold, the legitimate IP is returned. Borderline scores can trigger a "soft block" — the user sees a warning interstitial rather than a hard block, with the option to proceed.
The entire process completes before the connection is established. The classification happens at DNS resolution time — before the TCP handshake, before the TLS negotiation, before any data is exchanged. The malicious connection never opens.
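
The ordering of those steps can be sketched as a small function. This is an illustrative outline only, not Casper's actual implementation: the blocklist contents, thresholds, warning-page address, and the injected `score_domain` and `lookup_ip` callables are all hypothetical.

```python
from typing import Callable

BLOCKLIST = {"known-phish.example"}  # hypothetical curated list
WARNING_PAGE_IP = "192.0.2.1"        # hypothetical interstitial host


def resolve(domain: str,
            score_domain: Callable[[str], float],
            lookup_ip: Callable[[str], str],
            block_at: float = 0.9,
            warn_at: float = 0.7) -> dict:
    """Blocklist first, then the ML score, then a threshold decision."""
    if domain in BLOCKLIST:
        # Known threat: answer immediately, the classifier is never called.
        return {"action": "block", "answer": "NXDOMAIN", "reason": "blocklist"}
    score = score_domain(domain)
    if score >= block_at:
        return {"action": "block", "answer": "NXDOMAIN", "reason": "classifier"}
    if score >= warn_at:
        # Soft block: send the user to a warning page instead of the site.
        return {"action": "warn", "answer": WARNING_PAGE_IP, "reason": "classifier"}
    return {"action": "allow", "answer": lookup_ip(domain), "reason": "below_threshold"}


# Stubs standing in for the real classifier and upstream DNS lookup.
fake_scores = {"fresh-dga-xk3j7mq9.example": 0.97, "new-startup.example": 0.75}


def stub_score(domain: str) -> float:
    return fake_scores.get(domain, 0.05)


def stub_lookup(domain: str) -> str:
    return "198.51.100.7"


for name in ("known-phish.example", "fresh-dga-xk3j7mq9.example",
             "new-startup.example", "example.org"):
    print(name, resolve(name, stub_score, stub_lookup))
```

Passing the scorer and the lookup in as arguments keeps the decision logic testable without a live resolver or model.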

This is what makes DNS-layer ML detection particularly well-suited to mobile: it protects every app on the device, requires no on-device scanning (the model runs on our resolver infrastructure), doesn't drain battery, and works on both iOS and Android without the architectural limitations that prevent traditional antivirus from functioning on mobile. We covered the phishing side of this in detail in our analysis of three real phishing campaigns.

What Casper's network-layer ML does not do: it does not scan files on your device. It does not inspect the contents of encrypted connections. It does not monitor app behavior on-device. If a malicious file is already on your device and communicates only via IP address (no DNS lookup), Casper's DNS-layer classifier won't see it. This is the honest boundary of network-level detection — it's extremely effective at catching threats that involve domain-based infrastructure (which is the vast majority of phishing, malware distribution, and C2 communication), but it's not a replacement for on-device endpoint security in enterprise environments where targeted, sophisticated attackers might use IP-only C2 channels.

Limitations and honest caveats

ML-based malware detection is a genuine improvement over signatures alone, but it has real limitations that any honest discussion has to name.

False positives are a real cost

Probabilistic classification means the model will sometimes flag legitimate domains as malicious. A brand-new startup with a recently registered domain, a Let's Encrypt cert, and hosting on a budget provider can trip the same features that malware infrastructure trips. The false positive rate for a well-tuned DNS classifier is typically in the 0.01–0.1% range. That is low in percentage terms, but at 10 million queries per day, even 0.01% means about 1,000 legitimate queries flagged. The engineering work is in tuning the threshold, building fast allowlisting mechanisms, and continuously retraining on false-positive feedback to push the rate lower.
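
The scale effect is easy to check with a few lines of arithmetic. This sketch assumes the rate applies per query and that essentially all traffic is benign; both are simplifications, and the query volumes are illustrative.

```python
def expected_false_positives(benign_queries_per_day: int, fp_rate_percent: float) -> float:
    """Expected number of benign queries flagged per day."""
    return benign_queries_per_day * fp_rate_percent / 100.0


for volume in (1_000_000, 10_000_000, 100_000_000):
    low = expected_false_positives(volume, 0.01)
    high = expected_false_positives(volume, 0.1)
    print(f"{volume:>11,} queries/day: {low:,.0f} to {high:,.0f} false positives")
```

At one million queries per day the low end of the range is about a hundred flagged queries; it takes roughly ten million per day for the same rate to reach a thousand.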

Adversarial ML is a real discipline

Attackers who understand ML classifiers can engineer their infrastructure to evade them: aging domains before use, obtaining EV certificates, hosting on reputable providers, using legitimate-looking domain names. This is called adversarial machine learning, and it's an active area of academic research and real-world cat-and-mouse. The MITRE ATLAS framework (Adversarial Threat Landscape for Artificial-Intelligence Systems) catalogs known adversarial techniques used against ML systems.

The defender's advantage: evading one feature is easy; evading all features simultaneously is expensive. Aging a domain costs time (and time is money for attack campaigns). Obtaining an EV cert requires organizational identity verification. Hosting on reputable providers means accepting their abuse-response processes. The ML classifier doesn't need each feature to be individually unbeatable — it needs the combination to be expensive enough to evade that most attackers don't bother, and the ones who do get caught on other features or by the next retraining cycle.

Encrypted payloads are opaque at the network layer

Network-level ML sees metadata — domain names, certificate attributes, flow statistics — not payload content. If a malicious payload is delivered over HTTPS from a legitimate-looking domain (a compromised WordPress site, a malicious file hosted on a major cloud provider's storage), the DNS classifier may not flag it because the domain's features look clean. This is why network-level detection complements but doesn't replace other layers: tracker blocking at the DNS layer stops the data exfiltration channel; browser and OS-level protections (Google Safe Browsing, Apple's XProtect) handle the payload-inspection layer.

Model drift requires continuous retraining

The threat landscape changes constantly. Attackers shift registrars, adopt new hosting providers, change their domain-generation algorithms. A model trained on 2025 data will degrade in accuracy over months if not retrained on fresh labeled data. Continuous retraining on new threat intelligence — plus feedback loops from false positives and false negatives in production — is essential. A deployed ML classifier isn't a "train once, ship forever" artifact; it's a living system that needs ongoing investment.
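
One simple form of that feedback loop is to track recall on recently labeled traffic and flag the model when it falls too far below its baseline. The function name, the data shape, and the 0.05 tolerance are assumptions for this sketch; production monitoring usually tracks several metrics over rolling windows.

```python
def needs_retraining(baseline_recall: float,
                     recent_outcomes: list[tuple[bool, bool]],
                     max_drop: float = 0.05) -> bool:
    """recent_outcomes holds (was_malicious, was_blocked) pairs from labeled traffic."""
    blocked_flags = [blocked for malicious, blocked in recent_outcomes if malicious]
    if not blocked_flags:
        return False  # no labeled malicious samples, so nothing to measure
    recent_recall = sum(blocked_flags) / len(blocked_flags)
    return baseline_recall - recent_recall > max_drop


# Made-up month of labeled outcomes: 8 of 10 malicious domains were blocked.
drifted = [(True, True)] * 8 + [(True, False)] * 2 + [(False, False)] * 50
# Made-up healthy month: 19 of 20 malicious domains were blocked.
healthy = [(True, True)] * 19 + [(True, False)] + [(False, False)] * 50

print(needs_retraining(0.95, drifted))
print(needs_retraining(0.95, healthy))
```

The hard part in practice is getting the labels: missed threats are only discovered later, through threat intelligence or user reports, so recall estimates always lag behind reality.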

What this means for your phone

If you're reading this on a phone (which, statistically, most of you are), here's the practical upshot:

Traditional antivirus doesn't work well on mobile. iOS doesn't allow it architecturally; on Android it drains battery and still can't scan most app data. The signature model was built for desktop computers with full filesystem access.
Network-level ML detection works on every mobile platform. It doesn't need filesystem access, doesn't need to run on-device, and protects every app simultaneously. When your phone tries to reach a phishing domain, the DNS-layer classifier blocks it before the connection opens — regardless of which app initiated the connection.
The combination of static blocklists and ML classification covers both known and unknown threats. Blocklists catch the domains that have already been cataloged; the ML classifier catches the new ones that haven't. Together, they're meaningfully more effective than either alone.
No single layer covers everything. DNS-level ML catches domain-based threats (phishing, malware distribution, C2 communication, tracking). It doesn't catch first-party-hosted threats or threats that bypass DNS entirely. A layered approach — Casper at the network layer, browser protections at the content layer, OS protections at the system layer — gives you the broadest coverage.

The broader point: machine learning in security isn't hype and it isn't magic. It's a specific engineering discipline that recognizes patterns in data — and for the specific problem of "is this domain likely malicious?", the pattern-recognition approach outperforms static lists by a wide margin, especially for the zero-day threats that signatures will never catch in time. The question is whether your security tools use it, and whether they're honest about what it covers and what it doesn't.

Related: Phishing texts in 2026 — anatomy of three campaigns shows the ML classifier in action against real-world phishing infrastructure. Threat protection is the feature page for Casper's ML-based detection. Tracker blocking covers the complementary DNS-level filtering layer. For the academic foundations, the MITRE ATLAS framework catalogs adversarial ML techniques, and NIST's Cybersecurity Framework provides the broader risk-management context these detection systems operate within.
