Proxy
Generally, a proxy server is a device or program that intercepts network connections made from a client to a destination server. In its most basic sense, the word proxy refers to something that acts as a substitute for something else.
In other words, a proxy server is a specific type of application server that routes HTTP requests from clients to the content servers that actually do the work.
Usually, a proxy server sits between a device and the Internet and mediates what would otherwise be direct communication between the two points. Also known as an application-layer gateway between one network and a larger network (e.g., the Internet), a proxy server can be an application or a dedicated device on a network that filters outgoing and incoming data transfers.
This intermediate position enables a range of functions: access control, traffic logging, restriction of certain types of information and traffic, performance improvements, anonymous communication, and web caching, among others. Depending on the context, users, administrators, and service providers may each view the proxy's intermediation differently.
Proxies fall into two classes, depending on who implements the intermediation policy:
- local proxy: the party that sets the policy is the same one that makes the requests, hence the name local. A local proxy usually runs on the same device as the client. It is widely used so that the client can control its own traffic and establish filtering rules, for example to ensure that private information is not disclosed.
- network proxy: the policy is set by an entity and applies to all devices that use its network infrastructure; it is also often called an external proxy. It is mainly used for filtering, content blocking, traffic control, and logging, among other purposes.
Operation
A proxy allows other devices to connect to a network indirectly through it. When a computer on the network wants to access a piece of information or resource, it is actually the proxy that performs the communication and then transfers the result to the initial computer.
Generally, a proxy is an intermediate point between a network device and the Internet. When browsing via a proxy, the client does not contact the destination server directly; instead, the proxy makes the request on the client's behalf and returns the result.
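The indirection described above can be sketched from the client's side with Python's standard library. This is only an illustration: the proxy address `http://127.0.0.1:8080` and the helper names `build_proxied_opener` and `fetch_via_proxy` are assumptions made for the example, not part of any particular proxy product.

```python
import urllib.request

# Hypothetical proxy address; replace it with a proxy you actually control.
PROXY_URL = "http://127.0.0.1:8080"


def build_proxied_opener(proxy_url: str) -> urllib.request.OpenerDirector:
    """Return an opener that sends HTTP and HTTPS requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def fetch_via_proxy(url: str, proxy_url: str = PROXY_URL, timeout: float = 10.0) -> bytes:
    """Fetch url; the connection goes to the proxy, which contacts the server."""
    opener = build_proxied_opener(proxy_url)
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

Calling `fetch_via_proxy("http://example.com/")` only succeeds if a proxy is really listening at the configured address; the destination server then sees the proxy's connection rather than the client's.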
In many setups the proxy server, or the gateway it runs on, is also responsible for network address translation (NAT), also known as IP masquerading. This is what happens when multiple devices share a single Internet connection.
Within the local area network (LAN), computers use IP addresses reserved for private use, and the network generally has a single public IP address for traffic to other networks or the Internet. The proxy translates the private addresses to this single public address when making requests and distributes each response to the internal device that requested it.
This is very common in companies and homes with several networked devices and a single external Internet access. Internet access via NAT provides some security, since there is no direct connection between the outside and the private network, and the computers are not exposed to direct attacks from the outside.
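The address translation described above can be modelled as a small lookup table. The following is a simplified sketch rather than how any real NAT device is implemented: the class name `NatTable`, the starting port 40000, and the example addresses are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


@dataclass
class NatTable:
    public_ip: str
    next_port: int = 40000  # arbitrary starting port (assumption)
    outbound: Dict[Endpoint, int] = field(default_factory=dict)
    inbound: Dict[int, Endpoint] = field(default_factory=dict)

    def translate_outgoing(self, private: Endpoint) -> Endpoint:
        """Map a private endpoint to the shared public IP and a unique port."""
        if private not in self.outbound:
            port = self.next_port
            self.next_port += 1
            self.outbound[private] = port
            self.inbound[port] = private
        return (self.public_ip, self.outbound[private])

    def translate_incoming(self, public_port: int) -> Optional[Endpoint]:
        """Return the private endpoint for a reply, or None if unsolicited."""
        return self.inbound.get(public_port)


nat = NatTable(public_ip="203.0.113.5")
first = nat.translate_outgoing(("192.168.1.10", 51000))
second = nat.translate_outgoing(("192.168.1.11", 51000))
```

Two internal devices end up behind the same public address but on different public ports, which is how replies find their way back. A packet arriving on a port with no table entry has no internal destination, which reflects the limited protection mentioned above.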
In addition, the proxy server usually has a cache that stores copies of the web pages that have been visited. The first access goes to the destination server, while later ones can be served from the cache. For each request, the proxy checks whether the page is already in its cache and whether it has been modified on the server since the proxy last requested it. If the page is cached and unchanged, the proxy sends the cached copy to the user instead of requesting the full page again, which can significantly improve the perceived speed of the Internet connection for the computers behind the proxy.
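The check-then-serve logic above can be sketched as follows. This is a minimal model under stated assumptions: the origin server is represented by a hypothetical callable that receives the version the proxy already holds and returns no body when that version is still current (similar in spirit to an HTTP conditional request). The names `CachingProxy`, `CachedPage`, and `OriginFetch` are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class CachedPage:
    body: bytes
    version: str  # validator, e.g. an ETag or Last-Modified value


# Hypothetical origin interface: given a URL and the version the proxy already
# holds (or None), return (version, body). body is None when the origin
# reports that the cached version is still current.
OriginFetch = Callable[[str, Optional[str]], Tuple[str, Optional[bytes]]]


class CachingProxy:
    def __init__(self, fetch_from_origin: OriginFetch) -> None:
        self._fetch = fetch_from_origin
        self._cache: Dict[str, CachedPage] = {}

    def get(self, url: str) -> bytes:
        cached = self._cache.get(url)
        known_version = cached.version if cached else None
        # Ask the origin, passing along the version already held, if any.
        version, body = self._fetch(url, known_version)
        if body is None:
            if cached is None:
                raise ValueError("origin returned no body for an uncached URL")
            return cached.body  # unchanged on the server: serve the copy
        # First access or modified page: store the fresh copy and return it.
        self._cache[url] = CachedPage(body=body, version=version)
        return body
```

The saving in this sketch comes from not transferring the body again; the proxy still contacts the origin to validate its copy. A real proxy would typically also serve fresh entries without any round trip, which is omitted here.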
Proxy servers have multiple functions. Some of the most common are:
- Interface between the internal and the public network (Internet): to control access to the Internet by devices on a network.
- Bandwidth control: network users can be assigned certain resources and a certain available bandwidth. Control tasks also include server availability monitoring.
- Protection against network attacks: the proxy server is placed between the actual data server and the users. Websites that handle sensitive customer data, such as online stores, often use this arrangement to protect their servers.
- Network logging: Proxy servers are commonly used to log network activity, so that malicious access can be identified more quickly.
- Anonymous traffic: on the Internet, the use of a proxy can anonymize an IP address and thus evade restrictions on access, for example, to a certain website.
- Reduced load on the actual server: a proxy server can store the responses to requests it has already handled and deliver the requested information again without placing load on the actual application server; at the same time, the requesting client receives the information more quickly.
- Content blocking: in public networks, access to the Internet can be restricted through a proxy server, thus limiting browsing of websites according to their content, or at certain times. In addition, filtering functions can also include removal of advertising while browsing.
- Time adjustment: a proxy server can adjust timeouts and time limits for requests and responses to avoid poor network performance. In addition, it can customize the message shown to users when content is denied under different restrictions.
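Two of the functions listed above, content blocking and network logging, can be sketched together as a simple decision step. This is an illustrative example only: the blocked hostnames, the logger name `proxy.access`, and the functions `is_allowed` and `decide` are assumptions, and the actual forwarding of permitted requests is left out.

```python
import logging
from urllib.parse import urlsplit

logger = logging.getLogger("proxy.access")

# Hypothetical policy: hostnames the proxy refuses to fetch.
BLOCKED_HOSTS = frozenset({"ads.example.com", "blocked.example.org"})


def is_allowed(url: str, blocked_hosts=BLOCKED_HOSTS) -> bool:
    """Return False if the URL's host is a blocked host or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return not any(host == b or host.endswith("." + b) for b in blocked_hosts)


def decide(client_ip: str, url: str) -> str:
    """Log the request and return "forward" or "deny"; forwarding is omitted."""
    verdict = "forward" if is_allowed(url) else "deny"
    logger.info("client=%s url=%s verdict=%s", client_ip, url, verdict)
    return verdict
```

Matching on the hostname and its subdomains, rather than on a substring of the URL, avoids blocking unrelated sites whose names merely contain a blocked string.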
Proxy Types
There are many different proxy types, each one with its specific functions and features. Here is a short list of some of the most common ones:
- Web Proxy/Web Cache: this is the most common type. It handles web access on behalf of the devices on a network, replacing their IP addresses with its own when requesting an Internet resource. In addition, it caches downloaded content for all computers on the network, which improves access times for repeated requests and relieves the load on the links to the Internet.
- Reverse proxy: it is usually deployed in front of one or more application servers that receive requests from the Internet, to protect them, for example, from denial-of-service attacks. All incoming Internet traffic destined for one of these web servers passes through the reverse proxy, which acts as an additional layer of security. Moreover, when a secure website (HTTPS) is set up, the encryption is often handled by the proxy rather than by the web server itself. Similarly, the reverse proxy can distribute load among several web servers, rewriting external requests to internal addresses depending on which server holds the requested information, and can cache static content such as images to speed up responses.
- Transparent proxy: it requires no configuration on the client and is applied directly at the network layer. Transparent proxies are generally used by Internet service providers (ISPs) for web filtering, among other things.
- Open proxy: a proxy server that accepts requests from any device, whether or not it is connected to the same network. In this configuration the proxy executes any request as if it were its own. Open proxies are generally discouraged because they are often used as a gateway for sending spam, so some services deny access from them.
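The request rewriting and load distribution attributed to reverse proxies above can be sketched as a small router. This is one possible approach, assumed for illustration: longest-prefix matching on the request path followed by round-robin selection within a pool. The class `ReverseProxyRouter` and the internal addresses are hypothetical.

```python
from itertools import cycle
from typing import Dict, Iterator, List


class ReverseProxyRouter:
    def __init__(self, routes: Dict[str, List[str]]) -> None:
        for prefix, backends in routes.items():
            if not backends:
                raise ValueError(f"no backends configured for prefix {prefix!r}")
        # Try longer prefixes first so "/static/" wins over "/".
        self._prefixes = sorted(routes, key=len, reverse=True)
        self._pools: Dict[str, Iterator[str]] = {p: cycle(b) for p, b in routes.items()}

    def resolve(self, path: str) -> str:
        """Rewrite an external request path to an internal backend URL."""
        for prefix in self._prefixes:
            if path.startswith(prefix):
                # Round-robin: each call takes the next backend in the pool.
                return next(self._pools[prefix]) + path
        raise LookupError(f"no backend configured for {path!r}")


router = ReverseProxyRouter({
    "/static/": ["http://10.0.0.21:8080"],
    "/": ["http://10.0.0.11:8080", "http://10.0.0.12:8080"],
})
```

Successive calls to `router.resolve("/cart")` alternate between the two application backends, while paths under `/static/` always go to the dedicated server. Health checks and TLS termination are outside the scope of this sketch.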
Advantages and Disadvantages
In general, proxy servers make the following possible:
- Control: only the intermediary does the actual work, so you can limit and restrict user rights, and give permissions only to the proxy server.
- Savings: only one party (the proxy) needs the resources and configuration required to do the real work, for example the external IP address used to reach the Internet.
- Speed: if several clients request the same resource, the proxy can cache the response of a request to return it directly when another user needs it, thus improving response time.
- Demand: a single proxy can serve a large number of users who request web content through it.
- Filtering: the proxy may refuse to respond to certain requests if it detects that they are prohibited.
- Anonymity: clients can connect to an external resource without revealing their own IP address, because the resource sees only the IP address the proxy uses to obtain it.
On the other hand, the use of proxy servers can also cause:
- Non-identification: if all users appear as one, it is difficult for the accessed resource to tell them apart, since all requests come from the same IP address. This can be a problem when real identification is required, or for advanced operations that depend on specific ports or protocols.
- Intrusion: storing the pages and objects that clients request can be a privacy violation for some users, since the proxy keeps copies of their data.
- Inconsistency: when caching is used, the displayed web pages may be out of date if they have been modified since they were last loaded. This problem is now largely mitigated, because the proxy contacts the remote server to check that its cached version is still current. However, stale pages do still occasionally occur.
