Netcraft’s Deceptive Domain Score is a tool for domain registrars, registries, and certificate authorities to analyse the likelihood that a domain name will be used for fraudulent activities. The service identifies domain names that are deceptively similar to legitimate websites run by banks and other institutions commonly targeted by phishing attacks.
Netcraft’s extensive experience identifying, validating and eliminating phishing sites provides a wealth of knowledge of the tricks that are used by fraudsters to create a deceptive domain name, such as:
- Using Internationalized Domain Names to introduce lookalike characters from different alphabets.
- Substituting lookalike characters, such as “o” (letter O) and “0” (zero).
- Inserting, deleting or re-ordering characters.
- Adding prefixes and suffixes, such as “update”, “login” and “secure”.
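As an illustration of the first trick in the list above, the following minimal sketch flags a domain label that mixes letters from more than one script. It is not Netcraft's implementation: the function names are hypothetical, and using the first word of each Unicode character name as a stand-in for its script is a simplifying assumption.

```python
import unicodedata


def scripts_in_label(label: str) -> set[str]:
    """Return a coarse set of script names for the letters in a domain label.

    Assumption: the first word of the Unicode character name
    (e.g. "LATIN", "CYRILLIC", "GREEK") is a good enough proxy for the script.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def is_mixed_script(label: str) -> bool:
    """True if the label contains letters from more than one script."""
    return len(scripts_in_label(label)) > 1


# "p\u0430ypal" uses CYRILLIC SMALL LETTER A in place of the Latin "a".
print(is_mixed_script("paypal"))        # False
print(is_mixed_script("p\u0430ypal"))   # True
```

A check like this only catches cross-script substitutions. Accented Latin letters such as “í” stay within one script and need a separate lookalike mapping.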
Exploiting this knowledge, Netcraft’s Deceptive Domain Score provides a metric to quantify the risk of a domain name later being used for phishing or fraud. Scores are given from 0 to 10, with 5 being a common threshold to trigger additional verification of the domain name.
The Deceptive Domain Score for a domain name is derived from the results of two algorithms:
Phish target score
This score compares the domain name to the frequently-phished legitimate domains we have on record. This comparison is done on a per-character basis, and the score is formed by looking at the minimum set of edits required to map from one to the other.
The algorithm recognises the tricks commonly used in domain names to deceive victims, such as double letters (paaypal.com) or confusing characters or combinations of characters (paypa1.com).
We also check against a list of deceptive prefixes and suffixes that are frequently used by phishing sites, including signin and verify. As well as using a set of fixed rules, this algorithm also retains the flexibility to match new mappings and edits that have not been seen before.
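The general idea of a per-character comparison can be sketched as follows. This is a simplified illustration, not Netcraft's algorithm: it strips known prefixes and suffixes, maps a couple of lookalike characters, and then measures plain Levenshtein edit distance to each target. The target list, the affix list, the lookalike map and all function names are assumptions chosen for the example, and only the first label of the domain is examined.

```python
# Illustrative data only: these lists are assumptions, not Netcraft's actual rules.
TARGETS = ("paypal", "dropbox")
AFFIXES = ("update", "login", "secure", "signin", "verify")
CONFUSABLES = str.maketrans({"0": "o", "1": "l"})


def strip_affixes(label: str) -> str:
    """Repeatedly remove known deceptive prefixes/suffixes and stray hyphens."""
    changed = True
    while changed:
        changed = False
        for affix in AFFIXES:
            if label.startswith(affix) and len(label) > len(affix):
                label, changed = label[len(affix):], True
            elif label.endswith(affix) and len(label) > len(affix):
                label, changed = label[:-len(affix)], True
        label = label.strip("-")
    return label


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]


def closest_target(domain: str) -> tuple[str, int]:
    """Return the nearest target and the remaining edit distance to it."""
    label = domain.lower().split(".")[0]
    label = strip_affixes(label).translate(CONFUSABLES)
    best = min(TARGETS, key=lambda t: edit_distance(label, t))
    return best, edit_distance(label, best)


print(closest_target("securepaypa1.com"))  # ('paypal', 0)
print(closest_target("paaypal.com"))       # ('paypal', 1)
```

A small remaining distance after normalisation suggests a deceptive name. How that distance would be converted to a score on the 0 to 10 scale is not specified here and is left out of the sketch.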
String entropy score
The second algorithm works entirely differently. Many phishing domains in our database are essentially random strings of alphanumeric characters, yet very few legitimate sites follow this pattern. The string entropy test checks whether a domain resembles a combination of real dictionary words and plausible names, or whether it looks more like a randomised string. The higher the score, the more random a string appears to be.
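One simple way to approximate "how random does this string look" is character-level Shannon entropy. The sketch below uses that measure as a stand-in; it is not the test described above, which also considers dictionary words and plausible names. The function name is hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(label: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not label:
        return 0.0
    n = len(label)
    counts = Counter(label)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# A label with repeated letters scores lower than one with no repeats.
print(shannon_entropy("paypal") < shannon_entropy("x7k2qz"))  # True
```

Character entropy alone is a weak signal for short labels, since any short string with no repeated characters scores the maximum for its length. A practical test would likely combine it with word or n-gram statistics.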
These two methods work together to give sophisticated and largely independent indicators of the likelihood that a candidate domain may be used to host phishing attacks against a known legitimate target.
The domains in the table below have run phishing attacks and are shown together with their Deceptive Domain Score.
|Domain||Target||Deceptive Domain Score|
Change the registrable point
Remove prefix of 'update'
Used in Dropbox phishing attack
Remove character 'x'
Swap substring 'í' for 'i'
Remove character 'p'
Remove character 'r'
Swap substring 'a' for 'e'
Swap substring 'a' for 'o'
Remove character 'i'
Remove character '-'
Swap substring '-' for 'c'
Swap substring '2' for 'ui'
Swap substring '3' for 's'
For Registries & Registrars
Registries and Registrars are in an excellent position to prevent the purchase of domains intended solely for phishing. Since such registrations are often made using stolen credit cards, there are significant advantages to the registrar in refusing them.
ICANN’s New gTLD Registry Agreement obliges Registry Operators to:
[require] Registrars to include in their Registration Agreements a provision prohibiting Registered Name Holders from distributing malware, abusively operating botnets, phishing, piracy, trademark or copyright infringement, fraudulent or deceptive practices, counterfeiting or otherwise engaging in activity contrary to applicable law, and providing (consistent with applicable law and any related procedures) consequences for such activities including suspension of the domain name
Implementing a robust technical defence against abusive registration helps protect the TLD from fraud and maintain public trust. The Deceptive Domain Score can be used as part of this defence, making life more difficult for fraudsters and simultaneously protecting the registry and registrar against fraudulent payments.
For Certificate Authorities
The CA/Browser Forum’s Baseline Requirements, the rules that publicly-trusted Certificate Authorities are expected to follow, mandate that "High Risk" certificate requests are subject to increased vetting:
The CA SHALL develop, maintain, and implement documented procedures that identify and require additional verification activity for High Risk Certificate Requests prior to the Certificate’s approval
High Risk Certificate Request: A Request that the CA flags for additional scrutiny by reference to internal criteria and databases maintained by the CA, which may include names at higher risk for phishing or other fraudulent usage
Many CAs have failed to detect obviously-fraudulent domain names in certificate requests and issued SSL certificates to fraudsters. While most certificates obtained by fraudsters are domain-validated, one fraudster managed to acquire a higher-assurance OV certificate for paypal-office.com.
A web-based interface to the system is available for evaluation purposes and ad-hoc queries. For automated processes and bulk queries, an API is available that returns domain registration risk information in JSON format. Bespoke formats can be made available on request.
Entering the domain securepaypa1.com into the test system produces the report shown below:
|Domain||The domain name for which the deceptive domain score is being computed.|
|Probable phishing target||A domain operated by a bank or other phishing target to which the domain under test can be reduced using one or more of the transforms that are commonly used in deceptive phishing domains.|
|Overall risk||This overall risk rating is the value we recommend using when trying to determine whether or not to accept a domain registration request. It is a combination of the phish target score and the string entropy score - usually the maximum of the two, unless one method is certain that the domain cannot be used to host a phishing site.|
|Applicable transformations||A list of the transforms required to map the domain under test to the nearest phishing target domain.|
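The combination rule described for the overall risk, together with the commonly used threshold of 5, can be sketched as below. This is only an illustration under stated assumptions: the function names, the `certain_safe` flag, returning 0 when one method rules the domain out, and treating the threshold as inclusive are all choices made for the example rather than documented behaviour.

```python
REVIEW_THRESHOLD = 5.0  # the common threshold on the 0 to 10 scale


def overall_risk(phish_target_score: float,
                 entropy_score: float,
                 certain_safe: bool = False) -> float:
    """Combine the two scores: usually the maximum of the two.

    Assumption: `certain_safe` models the case where one method is certain
    the domain cannot be used for phishing, and the result is then 0.
    """
    if certain_safe:
        return 0.0
    return max(phish_target_score, entropy_score)


def needs_additional_verification(score: float,
                                  threshold: float = REVIEW_THRESHOLD) -> bool:
    """Assumption: a score equal to the threshold also triggers review."""
    return score >= threshold


risk = overall_risk(phish_target_score=7.5, entropy_score=2.0)
print(risk, needs_additional_verification(risk))  # 7.5 True
```

A registrar or CA could call a check like `needs_additional_verification` at registration or certificate request time and route flagged names to manual review.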