The Domain Registration Risk Calculator is a tool for domain registrars to analyse the likelihood that new domains will be used for fraudulent activities. The service identifies domains which are deceptively similar to legitimate websites run by banks and other institutions commonly targeted by phishing attacks.
Since such registrations are often made using stolen credit cards, there are significant advantages to the registrar in refusing them.
Netcraft has blocked well over five million phishing attacks since 2005. Our phishing feed is used by all of the major web browsers, as well as by leading anti-virus companies, domain registrars, registries, certificate authorities and hosting companies. Our extensive experience in identifying, validating and eliminating phishing sites has given us a wealth of knowledge of the tricks that fraudsters use to create deceptive domain names. We analyse our database of over six thousand organisations that have been targeted by phishing attacks to extract a comprehensive set of homoglyphs that could be used to convert bona fide domains into fraudulent ones. Example transformations include substituting visually similar characters from an IDN alphabet, making ASCII substitutions such as replacing “o” (letter O) with “0” (zero) or “l” (lower-case letter L) with “1” (digit one), or simply appending or prepending strings such as update or secure.
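To make the transformations concrete, the following minimal sketch reverses the two ASCII substitutions and strips the two affixes named above. The table and affix list are illustrative assumptions limited to the examples in the text; the function name `canonicalise` is hypothetical and the real homoglyph set is far larger.

```python
# Illustrative only: just the substitutions and affixes named in the text.
HOMOGLYPHS = {"0": "o", "1": "l"}
AFFIXES = ("update", "secure")


def canonicalise(label: str) -> str:
    """Strip known affixes, then undo simple character substitutions.

    `label` is a domain label without its TLD, e.g. "securepaypa1".
    """
    label = label.lower()
    changed = True
    while changed:  # repeat so that stacked affixes are all removed
        changed = False
        for affix in AFFIXES:
            if len(label) <= len(affix):
                continue  # never strip the whole label away
            if label.startswith(affix):
                label = label[len(affix):]
                changed = True
            elif label.endswith(affix):
                label = label[:-len(affix)]
                changed = True
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in label)
```

For example, `canonicalise("securepaypa1")` yields `"paypal"`. The result is meant only for comparison against known targets, since legitimate names can also contain digits.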
A Facebook phishing site, along with its Domain Registration Risk score
The service computes a registration risk score for a proposed domain, which gives a measure of the likelihood that this candidate domain may be used to host a phishing attack. We do this by using the results of two algorithms:
The first algorithm, Phish target score, compares the candidate domain to each of the frequently-phished legitimate domains we have on record. This comparison is done on a per-character basis, and the score is formed by looking at the minimum set of edits required to map from one to the other.
The algorithm recognises certain tricks commonly used in domain names to deceive victims, such as double letters (paaypal.com) or confusing characters or combinations of characters (paypa1.com). We also check against a list of deceptive prefixes and suffixes that are frequently used by phishing sites, including signin and verify.
As well as using a set of fixed rules, this algorithm also retains the flexibility to match new mappings and edits that have not been seen before. Using the suggested cut-off of a minimum score of 5/10, this method identifies 278 (12.7%) out of the 2,191 phishing domains currently blocked by Netcraft.
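One way to picture a per-character comparison of this kind is a weighted edit distance in which known tricks cost less than arbitrary edits. The sketch below is an assumption-laden illustration, not the production algorithm: the confusable pairs, the 0.1 and 1.0 costs, the scaling to a 0 to 10 range, and the function names are all hypothetical.

```python
# Assumed confusable pairs, taken from the examples in the text.
CONFUSABLE = {("0", "o"), ("1", "l")}


def sub_cost(a: str, b: str) -> float:
    """Cost of replacing character a with b (cheap if they look alike)."""
    if a == b:
        return 0.0
    if (a, b) in CONFUSABLE or (b, a) in CONFUSABLE:
        return 0.1
    return 1.0


def weighted_distance(candidate: str, target: str) -> float:
    """Levenshtein-style distance with cheap edits for common tricks."""
    prev = [float(j) for j in range(len(target) + 1)]
    for i, c in enumerate(candidate, 1):
        cur = [float(i)]
        # Dropping a repeated letter (the second "a" in "paaypal") is cheap.
        del_cost = 0.1 if i > 1 and candidate[i - 2] == c else 1.0
        for j, t in enumerate(target, 1):
            cur.append(min(prev[j] + del_cost,             # delete from candidate
                           cur[j - 1] + 1.0,               # insert into candidate
                           prev[j - 1] + sub_cost(c, t)))  # substitute or match
        prev = cur
    return prev[-1]


def phish_target_score(candidate: str, targets: list[str]) -> tuple[float, str]:
    """Return (score out of 10, closest target) for a candidate label."""
    def score(target: str) -> float:
        dist = weighted_distance(candidate, target)
        return max(0.0, 10.0 * (1.0 - dist / len(target)))

    best = max(targets, key=score)
    return score(best), best
```

With these assumed weights, `paypa1` and `paaypal` both sit at a distance of 0.1 from `paypal` and score well above a cut-off of 5, while an unrelated label such as `example` falls below it.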
The second algorithm, String entropy score, works entirely differently. Many phishing domains in our database are essentially random strings of alphanumeric characters, yet very few legitimate sites follow this pattern. The string entropy test checks whether a domain resembles a combination of real dictionary words and plausible names, or whether it looks more like a randomised string. The higher the score, the more random a string appears to be.
Most dictionary strings score zero. The suggested cut-off is a minimum score of 5/10: any domain scoring above this is very likely to be random, whereas below it false positives become increasingly likely.
Using the suggested cut-off identifies 474 (21.6%) of the 2,191 phishing domains currently blocked, and these are substantially non-overlapping with the domains spotted by the first method.
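The idea of separating dictionary-like names from random strings can be sketched with a simple dictionary-coverage heuristic: the share of characters that cannot be explained by any known word, scaled to 0 to 10. This is a stand-in technique chosen for illustration, not the String entropy score itself; the word list and function names are hypothetical.

```python
# Tiny assumed word list; a real one would hold dictionary words and names.
WORDS = {"secure", "pay", "pal", "bank", "online", "shop", "face", "book"}


def uncovered_chars(label: str, words: set[str] = WORDS) -> int:
    """Minimum number of characters not covered by any dictionary word."""
    n = len(label)
    best = [0] * (n + 1)  # best[i] = fewest uncovered chars in label[:i]
    for i in range(1, n + 1):
        best[i] = best[i - 1] + 1  # treat label[i-1] as unexplained
        for w in words:
            # If a word ends exactly at position i, it covers those chars.
            if i >= len(w) and label[i - len(w):i] == w:
                best[i] = min(best[i], best[i - len(w)])
    return best[n]


def randomness_score(label: str) -> float:
    """Score from 0 (all dictionary words) to 10 (nothing recognised)."""
    if not label:
        return 0.0
    return 10.0 * uncovered_chars(label.lower()) / len(label)
```

Under this heuristic `securebank` scores 0 because it splits cleanly into two listed words, while a string such as `x7qk2vz9` scores 10 because no word in the list matches any part of it.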
These two methods work together to give sophisticated and largely independent indicators of the likelihood that a candidate domain may be used to host phishing attacks against a known legitimate target. Using the overall risk rating produced by combining the two scores would presently detect 742 (33.9%) of the 2,191 currently blocked phishing domains.
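The quoted figures can be checked with a few lines of arithmetic. The sketch below also estimates how many domains both methods flag, under the assumption (not stated in the text) that the combined rating detects exactly those domains flagged by at least one of the two scores.

```python
TOTAL = 2191          # phishing domains currently blocked
TARGET_HITS = 278     # flagged by the Phish target score
ENTROPY_HITS = 474    # flagged by the String entropy score
COMBINED_HITS = 742   # flagged by the overall risk rating


def pct(n: int, total: int = TOTAL) -> float:
    """Percentage of the total, rounded to one decimal place."""
    return round(100 * n / total, 1)


# Inclusion-exclusion, valid only if the combined rating is the union
# of the two individual detections (an assumption).
overlap = TARGET_HITS + ENTROPY_HITS - COMBINED_HITS

print(pct(TARGET_HITS), pct(ENTROPY_HITS), pct(COMBINED_HITS), overlap)
```

On that assumption only 10 domains are caught by both methods, which is consistent with the description of the two indicators as largely independent.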
The domains in the table below have run phishing attacks and are shown together with their domain registration risk.
A web-based interface to the system is available for evaluation purposes and ad-hoc queries. For automated processes and bulk queries an API is available to return domain registration risk information in JSON format. Bespoke formats can be made available on request.
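A registrar's automated process might consume the JSON along the following lines. The response format is not documented here, so the field names, the sample payload and its values are placeholders invented purely for illustration; only the 5/10 cut-off comes from the text.

```python
import json

# Hypothetical payload: field names and values are placeholders,
# not real output from the service.
SAMPLE = """
{"domain": "candidate.example",
 "phish_target_score": 7,
 "string_entropy_score": 0}
"""

CUT_OFF = 5  # suggested minimum score out of 10


def should_review(payload: str, cut_off: int = CUT_OFF) -> bool:
    """Flag a registration if either assumed score reaches the cut-off."""
    report = json.loads(payload)
    scores = (report["phish_target_score"], report["string_entropy_score"])
    return max(scores) >= cut_off


if __name__ == "__main__":
    print(should_review(SAMPLE))
```

Taking the higher of the two scores is one simple policy; a registrar could equally act on an overall rating if the response provides one.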
Entering the domain securepaypa1.com into the test system produces the report shown below:
Please get in touch (email@example.com) if you would like to try out this service or for subscription information.