Overview of Proxy Servers
A proxy is a service that sits between web servers (or, more accurately, "origin web servers") and clients. This service receives requests from clients and makes requests to servers on behalf of the clients.
Reasons to use proxy servers
- Bandwidth conservation
- Statistics
- Filtering
- Remote resource access
Proxies for remote resource access
Basic theory
- Vendors restrict access by IP address, accepting only requests that come from the institution's network
- The site places a proxy server within the range of valid IP addresses, for use by users outside the network
- Just because you can do it doesn't mean it is legal
- Bandwidth considerations: traffic for an off-site user may pass through your Internet connection twice (in from the vendor, then out to the user)!
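The IP-based restriction described above can be sketched in a few lines. This is only an illustration of the idea, not any vendor's actual check: the network range and both addresses are hypothetical values taken from the documentation address blocks.

```python
import ipaddress

# Hypothetical campus range; a real site would use the range registered with the vendor.
CAMPUS_NETWORK = ipaddress.ip_network("192.0.2.0/24")

def vendor_allows(client_ip: str) -> bool:
    """Return True if a vendor restricting by IP would accept this address."""
    return ipaddress.ip_address(client_ip) in CAMPUS_NETWORK

PROXY_IP = "192.0.2.10"        # hypothetical proxy, inside the campus range
HOME_USER_IP = "203.0.113.57"  # hypothetical off-site user

print(vendor_allows(HOME_USER_IP))  # False: a direct request from home is refused
print(vendor_allows(PROXY_IP))      # True: the same request relayed by the proxy is accepted
```

The vendor only ever sees the proxy's address, which is why the proxy itself must decide who is allowed to use it.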
Authentication step
- This is the most difficult step. How do we allow only authorized users to access the proxy server, and consequently the remote databases?
Sources for authentication
- Integrated Library Systems
- Barcode recognition
- Flat-file username/password
- Login test against POP/IMAP servers
- LDAP directory service
- Kerberos, Netware NDS, Microsoft Networking
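As an illustration of the simplest source in the list above, here is a minimal sketch of a flat-file username/password check. The file layout (`username:salt:hash`), the function names, and the sample account are all assumptions made for this example; they do not describe the format of any particular proxy product.

```python
import hashlib
import hmac

def hash_password(password: str, salt: bytes) -> str:
    """Derive a hex digest from the password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000).hex()

def load_users(lines):
    """Parse 'username:salt_hex:hash_hex' lines (assumed layout) into a dict."""
    users = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        username, salt_hex, digest = line.split(":")
        users[username] = (bytes.fromhex(salt_hex), digest)
    return users

def authenticate(users, username: str, password: str) -> bool:
    """Return True only if the user exists and the password matches."""
    entry = users.get(username)
    if entry is None:
        return False
    salt, expected = entry
    # compare_digest avoids leaking information through comparison timing
    return hmac.compare_digest(hash_password(password, salt), expected)

# Build a one-line sample "file" in memory (hypothetical account).
salt = bytes.fromhex("00112233445566778899aabbccddeeff")
sample_file = [f"patron1:{salt.hex()}:{hash_password('s3cret', salt)}"]
users = load_users(sample_file)

print(authenticate(users, "patron1", "s3cret"))  # True
print(authenticate(users, "patron1", "wrong"))   # False
```

The other sources in the list would replace `authenticate` with a lookup against the external system, while the proxy's decision logic stays the same.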
Transparent versus non-transparent proxy servers
- Transparent Proxy
  - A proxy that passes requests and responses unmodified, except as required for proxy authentication.
- Non-transparent Proxy
  - A proxy that somehow modifies the request or response to provide some added value to the client or user.
- Rewriting Proxies
  - A special form of the Non-transparent Proxy server which examines the URLs in HTML documents passing through the proxy, and rewrites them to point back to the proxy server
  - "http://firstsearch.oclc.org/dbname=WorldCat;graphics=low;FSIP" becomes
    "http://proxy.college.edu/firstsearch/dbname=WorldCat;graphics=low;FSIP" or
    "http://proxy.college.edu:2049/dbname=WorldCat;graphics=low;FSIP" or
    "http://80-firstsearch.oclc.org.proxy.college.edu/dbname=WorldCat;graphics=low;FSIP"
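The three rewritten forms in the example above correspond to three ways of encoding the origin server in the proxy URL: in the path, in the port, or in the hostname. The sketch below reproduces each form for the FirstSearch URL. The lookup tables and function names are hypothetical; a real rewriting proxy would maintain its own mapping and may work differently.

```python
from urllib.parse import urlsplit, urlunsplit

PROXY_HOST = "proxy.college.edu"

# Hypothetical mappings from origin host to a path prefix or a proxy port.
PATH_PREFIXES = {"firstsearch.oclc.org": "firstsearch"}
PORTS = {"firstsearch.oclc.org": 2049}

def rewrite_by_path(url: str) -> str:
    """Encode the origin server as a leading path segment."""
    p = urlsplit(url)
    path = f"/{PATH_PREFIXES[p.hostname]}{p.path}"
    return urlunsplit((p.scheme, PROXY_HOST, path, p.query, p.fragment))

def rewrite_by_port(url: str) -> str:
    """Encode the origin server as a dedicated port on the proxy."""
    p = urlsplit(url)
    netloc = f"{PROXY_HOST}:{PORTS[p.hostname]}"
    return urlunsplit((p.scheme, netloc, p.path, p.query, p.fragment))

def rewrite_by_hostname(url: str) -> str:
    """Encode the origin port and host as a prefix of the proxy hostname."""
    p = urlsplit(url)
    port = p.port or (443 if p.scheme == "https" else 80)
    netloc = f"{port}-{p.hostname}.{PROXY_HOST}"
    return urlunsplit((p.scheme, netloc, p.path, p.query, p.fragment))

original = "http://firstsearch.oclc.org/dbname=WorldCat;graphics=low;FSIP"
for rewrite in (rewrite_by_path, rewrite_by_port, rewrite_by_hostname):
    print(rewrite(original))
```

The hostname form needs no per-site lookup table, but it does assume that every name under the proxy's domain resolves to the proxy server.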
Advantages, Disadvantages
- Transparent Proxy servers...
  - ...are less computing intensive because they do not examine the content of each HTML page.
  - ...are easier to program than Rewriting Proxies.
  - ...require users to reconfigure their browsers (a user-education problem).
  - ...may not work with some corporate or commercial Internet Service Providers.
- Rewriting Proxy servers...
  - ...require no changes to the user's browser and work with browsers on firewalled networks.
  - ...are sensitive to "incorrect" HTML.
  - ...may not work with sites that use sophisticated JavaScript.
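The JavaScript limitation can be shown with a deliberately naive rewriter. This sketch assumes a rewriter that only looks for `href="http://..."` attributes and uses the hostname form of rewriting. The pattern, the sample page, and the function name are made up for illustration, and production rewriting proxies are more thorough than this.

```python
import re

PROXY_SUFFIX = ".proxy.college.edu"

# Naive pattern: only matches double-quoted href attributes with an http URL.
HREF = re.compile(r'(href=")http://([^/"]+)')

def rewrite_links(html: str) -> str:
    """Point matching href attributes back at the proxy (hostname form)."""
    return HREF.sub(
        lambda m: f"{m.group(1)}http://80-{m.group(2)}{PROXY_SUFFIX}", html
    )

page = (
    '<a href="http://firstsearch.oclc.org/a">A</a>'
    '<script>location = "http://" + "firstsearch.oclc.org" + "/b";</script>'
)
print(rewrite_links(page))
```

The static link is rewritten, but the URL assembled inside the script passes through untouched, so a browser that follows it bypasses the proxy. The same pattern would also miss unquoted or single-quoted attributes, which is one way "incorrect" HTML defeats a rewriter.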
Selection of Proxy Servers
| Transparent Proxy servers | Rewriting Proxy servers |
| --- | --- |
| Apache | EZproxy |
| Squid | Libproxy |
| Microsoft Internet Security and Acceleration (ISA) Server | Obvia |
| Delegate | Web Access Management |