Netcraft has developed a dataset which shows the hosting locations of the million busiest websites, as determined by visits from users of the Netcraft Anti-Phishing Extension. The dataset gives a guide to the market share of companies hosting the sites responsible for the great majority of web traffic, and is uninfluenced by parked domains, personal sites, shared hosting accounts or the majority of blogs. Although the top 1000 sites are concentrated amongst the web superpowers (Google, Microsoft, Facebook and Amazon), the hosting locations of the top million sites are widely fragmented.
The dataset is presented in an Excel spreadsheet and provides a variety of filters and selections. Using the dataset, a hosting company can identify its relative position and closest competitors in each of the top 10,000, top 100,000 and top million tiers of site traffic, and also by region, country, and operating system.
Data for an individual hosting location can be segmented to show the share it has in each order of magnitude band of site traffic, as shown with Google in this example:
- The dataset does not give access to the underlying web site hostnames. Selections of site details including hostnames, hosting location, operating system, web server software, traffic and content technologies may be purchased separately.
- Accesses by the Netcraft Anti-Phishing Extension user community are used to determine site traffic rating.
- Attributing a site to a hosting location requires a successful DNS lookup on the site’s IP address. If the lookup fails, the hosting location will be unknown to us.
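As a rough illustration of the order-of-magnitude segmentation described above, the following minimal Python sketch computes the share a hosting location holds in each traffic band. It is not Netcraft's implementation: the function names `band_of` and `share_by_band`, the input layout of (traffic rank, hosting location) pairs, and the assumption that bands are defined by rank (1 to 10, 11 to 100, and so on) are all hypothetical. Sites whose lookup failed are represented by `None` and counted under "unknown".

```python
from collections import Counter


def band_of(rank: int) -> int:
    """Return the upper bound of the order-of-magnitude band holding `rank`.

    Assumed banding (not confirmed by the source): ranks 1-10 fall in
    band 10, ranks 11-100 in band 100, and so on.
    """
    upper = 10
    while rank > upper:
        upper *= 10
    return upper


def share_by_band(sites, location):
    """Compute the fraction of sites in each band hosted at `location`.

    `sites` is an iterable of (rank, hosting_location) pairs; a hosting
    location of None stands for a failed lookup and is treated as "unknown".
    """
    totals = Counter()  # number of sites seen in each band
    hits = Counter()    # number of those sites hosted at `location`
    for rank, host in sites:
        band = band_of(rank)
        totals[band] += 1
        if (host or "unknown") == location:
            hits[band] += 1
    return {band: hits[band] / totals[band] for band in sorted(totals)}


# Hypothetical sample data, for illustration only.
sample_sites = [
    (1, "Google"), (5, "Facebook"),
    (50, "Google"), (70, None),
    (500, "Google"), (900, "Google"),
]
print(share_by_band(sample_sites, "Google"))
```

Passing "unknown" as the location reports the fraction of sites in each band that could not be attributed to a hosting location.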
The dataset is updated monthly and is available on a company license basis.