How to understand when a proxy is lying: verification of the physical locations of network proxies using an active geolocation algorithm





People around the world use commercial proxies to hide their true location or identity. They do so for a variety of reasons, such as accessing blocked information or protecting their privacy.

But how accurate are proxy providers when they claim that their servers are located in a particular country? This is a fundamentally important question: the answer determines whether a given service is usable at all for clients who care about protecting their personal information.

A group of American scientists from the University of Massachusetts, Carnegie Mellon University and Stony Brook University published a study in which they checked the actual location of the servers of seven popular proxy providers. We have prepared a brief summary of the main results.

Introduction


Proxy operators often provide no information that would confirm their claims about server locations. IP-to-location databases usually agree with these companies' advertising claims, but there is ample evidence of errors in these databases.

In the course of the study, the researchers estimated the locations of 2,269 proxy servers operated by seven proxy companies and advertised as located in a total of 222 countries and territories. The analysis showed that at least one third of all servers are not in the countries that the companies name in their marketing materials. Instead, they are in countries with cheap and reliable hosting: the Czech Republic, Germany, the Netherlands, the UK and the USA.

Server Location Analysis


Commercial VPN and proxy providers can influence the accuracy of IP-to-location databases: for example, they can manipulate the location codes in router names. As a result, marketing materials may advertise a large number of locations, whereas in reality, to save money and improve reliability, the servers are physically located in a small number of countries, even though the IP-to-location databases suggest otherwise.

To check the actual location of the servers, the researchers used an active geolocation algorithm. With its help, they measured the round-trip time of packets sent to the server and to other hosts on the Internet whose locations are known.

However, fewer than 10% of the tested proxies respond to ping, and, for obvious reasons, the researchers could not run any measurement software on the servers themselves. They could only send packets through a proxy, so the measured round-trip time to any destination is the sum of the time a packet takes to travel from the test host to the proxy and from the proxy to the destination.
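The relationship described above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' code: the function names and sample values are hypothetical, and it assumes the round trip between the test host and the proxy can be measured or estimated separately.

```python
from typing import Sequence


def min_rtt_ms(samples: Sequence[float]) -> float:
    """Return the smallest sample; queueing delay can only inflate an RTT."""
    if not samples:
        raise ValueError("at least one RTT sample is required")
    return min(samples)


def proxy_to_target_rtt_ms(via_proxy_ms: Sequence[float],
                           client_to_proxy_ms: Sequence[float]) -> float:
    """Estimate the proxy-to-target round trip.

    Assumption: total = (client <-> proxy) + (proxy <-> target), so the
    second leg is the difference of the two minimum measurements.
    """
    estimate = min_rtt_ms(via_proxy_ms) - min_rtt_ms(client_to_proxy_ms)
    return max(estimate, 0.0)  # clamp: noise can push the difference below zero


# Made-up sample values in milliseconds.
via_proxy = [182.4, 175.0, 190.1]
client_to_proxy = [120.0, 118.5, 125.2]
print(proxy_to_target_rtt_ms(via_proxy, client_to_proxy))  # 56.5
```

Taking the minimum of several samples is a common choice here, because congestion adds delay but never removes it.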



During the study, specialized software was developed based on four active geolocation algorithms: CBG, Octant, Spotter and a hybrid Octant/Spotter. The code is available on GitHub.
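Of these algorithms, CBG (constraint-based geolocation) is the easiest to illustrate: a round-trip time to a landmark at a known location bounds how far away the target can be, and the target must lie inside every such circle. The sketch below is a simplified illustration, not the researchers' implementation. It assumes a fixed propagation speed of about 200 km per millisecond (roughly two thirds of the speed of light), whereas CBG proper calibrates the distance bound for each landmark; the coordinates and round-trip times are made up.

```python
import math

EARTH_RADIUS_KM = 6371.0
KM_PER_MS = 200.0  # assumed upper bound on signal speed, about 2/3 of c


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def max_distance_km(rtt_ms):
    """A packet covers the distance twice in one round trip."""
    return (rtt_ms / 2) * KM_PER_MS


def is_consistent(candidate, measurements):
    """True if candidate (lat, lon) lies inside every landmark's circle.

    measurements: iterable of (landmark_lat, landmark_lon, rtt_ms).
    """
    lat, lon = candidate
    return all(haversine_km(lat, lon, l_lat, l_lon) <= max_distance_km(rtt)
               for l_lat, l_lon, rtt in measurements)


# Hypothetical landmarks: Amsterdam (4 ms) and Frankfurt (6 ms).
measurements = [(52.37, 4.90, 4.0), (50.11, 8.68, 6.0)]
print(is_consistent((50.85, 4.35), measurements))   # Brussels: True
print(is_consistent((30.04, 31.24), measurements))  # Cairo: False
```

A claimed location that falls outside even one circle cannot be correct under these assumptions, which is the kind of reasoning that lets a server be ruled out of a country.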

Since IP-to-location databases could not be relied on, the researchers used the list of RIPE Atlas anchor hosts for their experiments. Information about these hosts is available online and constantly updated, and their documented locations are correct. Moreover, the hosts on the list constantly ping each other and record the round-trip data in a public database.

The tool developed by the researchers is a web application that attempts secure (HTTPS) connections to the unsecured HTTP port 80. If the server is not listening on this port, the connection fails after one round trip. If the server is listening, the browser receives a SYN-ACK and responds with a TLS ClientHello packet. This triggers a protocol error, and the browser reports a failure, but only after the second round trip.



Thus, the web application can measure one or two round-trip times. A similar measurement was also implemented as a command-line program.
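A command-line measurement of this kind can be approximated by timing a TCP connection attempt, since both a successful handshake and a refusal complete after one round trip. The following is a minimal sketch under stated assumptions, not the researchers' program: the function name is hypothetical, and it connects directly to the target rather than through a proxy.

```python
import socket
import time


def tcp_connect_rtt_ms(host: str, port: int = 80, timeout: float = 5.0) -> float:
    """Time one TCP connection attempt, in milliseconds.

    Pass an IP address rather than a hostname, otherwise the DNS lookup
    is included in the measured time. Timeouts and other network errors
    propagate to the caller as OSError.
    """
    start = time.perf_counter()
    try:
        # Returns once the SYN / SYN-ACK exchange has completed.
        with socket.create_connection((host, port), timeout=timeout):
            pass
    except ConnectionRefusedError:
        # A closed port answers with RST, which also takes one round trip.
        pass
    return (time.perf_counter() - start) * 1000.0
```

For example, `tcp_connect_rtt_ms("192.0.2.10")` would time a connection to port 80 of that address (a documentation-only placeholder address). Repeating the call and keeping the smallest value gives a more stable estimate.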

None of the tested providers names the exact location of its proxy servers. At best, cities are mentioned, but most often only the country is given. Even when a city is mentioned, discrepancies occur: for example, the researchers studied a server configuration file called usa.new-york-city.cfg, which contained instructions for connecting to a server called chicago.vpn-provider.example. Therefore, only the country in which a server is located can be verified with reasonable accuracy.

Results


According to the tests using the active geolocation algorithm, the researchers were able to confirm the location of 989 of the 2,269 IP addresses. For 642 addresses this was not possible, and 638 are definitely not in the country claimed by the proxy services. More than 400 of these false addresses are actually on the same continent as the claimed country.
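The counts above can be turned into shares of the total with a short calculation. The category labels are illustrative names chosen for this sketch; the numbers are the ones quoted in the text.

```python
# Counts quoted in the article; the labels are illustrative.
counts = {"confirmed": 989, "uncertain": 642, "false": 638}

total = sum(counts.values())
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

print(total)   # 2269
print(shares)  # {'confirmed': 43.6, 'uncertain': 28.3, 'false': 28.1}
```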



Figure: correct addresses are in the countries that are most often used for hosting servers.

Suspicious hosts were found at each of the seven tested providers. The researchers asked the companies for comment, but all of them declined.
