User Authentication Strategies in Multi-Institutional Environments

Common Practice for Remote Resource Authentication

Overview of Proxy Servers

A proxy is a service that sits between web servers (or, more accurately, "origin web servers") and clients. This service receives requests from clients and makes requests to servers on behalf of the clients.

Reasons to use proxy servers

Proxies for remote resource access

Basic theory

  • Vendors use IP address limitations to restrict access to the institution's network
  • The site places a proxy server within the range of valid IP addresses for use by users outside the network
  • Just because you can do it doesn't mean it is legal
  • Bandwidth considerations: traffic may pass through your Internet connection twice!
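
The IP address restriction described above can be sketched as a simple range check. This is only an illustration of the idea, not any vendor's actual mechanism; the network ranges and the function name is_on_campus are hypothetical (the ranges are reserved documentation addresses).

```python
import ipaddress

# Hypothetical campus ranges; these are reserved documentation networks,
# not real institutional addresses.
CAMPUS_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]

def is_on_campus(client_ip: str) -> bool:
    """Return True when the client address falls inside a licensed range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in CAMPUS_NETWORKS)
```

A proxy server placed at an address inside one of these ranges passes the vendor's check, which is why requests relayed through it are accepted.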

Authentication step

  • This is the most difficult step. How do we allow only authorized users to access the proxy server, and consequently the remote databases?

Sources for authentication

  • Integrated Library Systems
  • Barcode recognition
  • Flat-file username/password
  • Login test against POP/IMAP servers
  • LDAP directory service
  • Kerberos, Netware NDS, Microsoft Networking
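
As one illustration of the simplest source in this list, a flat-file username/password check might look like the sketch below. The file layout (username:salt:hash), the iteration count, and the helper names are assumptions made for this example; the other sources (ILS, LDAP, POP/IMAP, Kerberos) would replace the lookup with a call to the corresponding service.

```python
import hashlib
import hmac

ITERATIONS = 100_000  # assumed work factor for this sketch

def hash_password(password: str, salt: bytes) -> str:
    """Derive a salted hash so plain-text passwords are never stored."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return digest.hex()

def make_entry(username: str, password: str, salt: bytes) -> str:
    """Build one line of the hypothetical flat file: username:salt:hash."""
    return f"{username}:{salt.hex()}:{hash_password(password, salt)}"

def check_login(lines, username: str, password: str) -> bool:
    """Check a username/password pair against the lines of the flat file."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, salt_hex, stored = line.split(":")
        if name == username:
            candidate = hash_password(password, bytes.fromhex(salt_hex))
            # Constant-time comparison avoids leaking how many characters matched.
            return hmac.compare_digest(candidate, stored)
    return False
```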

Transparent versus non-transparent proxy servers

Transparent Proxy
A proxy that passes requests and responses unmodified, except as required for proxy authentication.
Non-transparent Proxy
A proxy that somehow modifies the request or response to provide some added value to the client or user.
Rewriting Proxies
A special form of the Non-transparent Proxy server which examines the URLs in HTML documents passing through the proxy and rewrites them to point back to the proxy server. For example:
"http://firstsearch.oclc.org/dbname=WorldCat;graphics=low;FSIP" becomes
"http://proxy.college.edu/firstsearch/dbname=WorldCat;graphics=low;FSIP" or
"http://proxy.college.edu:2049/dbname=WorldCat;graphics=low;FSIP" or
"http://80-firstsearch.oclc.org.proxy.college.edu/dbname=WorldCat;graphics=low;FSIP"
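
The three rewritten forms above can be reproduced with a short sketch. It only shows how each URL shape is derived from the original; it is not how any particular rewriting proxy is implemented. The function names are hypothetical, and the path name ("firstsearch") and port (2049) are assumed to come from a per-vendor configuration. URL fragments are ignored.

```python
from urllib.parse import urlsplit

PROXY_HOST = "proxy.college.edu"  # proxy host name taken from the examples above

def _tail(parts) -> str:
    """Path plus query string of the original URL."""
    tail = parts.path or "/"
    if parts.query:
        tail += "?" + parts.query
    return tail

def rewrite_by_path(url: str, name: str) -> str:
    """Form 1: the vendor is identified by a leading path segment."""
    return f"http://{PROXY_HOST}/{name}{_tail(urlsplit(url))}"

def rewrite_by_port(url: str, port: int) -> str:
    """Form 2: the vendor is identified by a dedicated proxy port."""
    return f"http://{PROXY_HOST}:{port}{_tail(urlsplit(url))}"

def rewrite_by_hostname(url: str) -> str:
    """Form 3: the original port and host are folded into the proxy host name."""
    parts = urlsplit(url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return f"http://{port}-{parts.hostname}.{PROXY_HOST}{_tail(parts)}"
```

The hostname form needs no per-vendor table to reverse the mapping, but it does require wildcard DNS for the proxy's domain; the path and port forms need a lookup table on the proxy.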

Advantages, Disadvantages

Transparent Proxy servers...
...are less computing intensive because they do not examine the content of each HTML page.
...are easier to program than Rewriting Proxies.
...require users to reconfigure their browsers (education problem).
...may not work with some corporate or commercial Internet Service Providers.
Rewriting Proxy servers...
...require no changes to the user's browser and work with browsers on firewalled networks.
...are sensitive to "incorrect" HTML.
...may not work with sites using sophisticated JavaScripts.

Selection of Proxy Servers

Transparent Proxy servers Rewriting Proxy servers
  • Apache
  • Squid
  • Microsoft Internet Security and Acceleration (ISA) Server
  • Delegate
  • EZproxy
  • Libproxy
  • Obvia
  • Web Access Management

OhioLINK

Matrix of services

                                 OhioLINK-hosted                  Vendor-hosted
Participating campus networks    IP address recognition           IP address recognition
Unrecognized network addresses   OhioLINK Remote Authentication   OhioLINK Remote Authentication plus rewriting proxy

Authentication system

Key points

  • Authenticate users using OhioLINK-member patron databases when they come from an unknown IP address
  • Store an .ohiolink.edu domain cookie in the authenticated user's browser with the credentials
  • Validate the user's credentials against a database of valid cookies at each OhioLINK server

End User's Experience

  1. User requests services from an OhioLINK server
  2. Browser is redirected to an authentication form if IP address is not recognized
  3. User submits credentials and is authenticated via local patron database
  4. Credentials cookie is set in browser and browser is redirected to server from step 1

Technical Process

  1. Authentication form data processed by CGI script
  2. Credentials checked against a cache of patron records. If not found, the record is requested from the campus ILS
  3. Patron record added to cache
  4. One-way hash used to create credentials cookie
  5. Cookie added to valid cookies database
  6. Cookie and a page with a 'META refresh' tag are sent to the browser
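
The six steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the OhioLINK CGI script: the cookie name ohiolink_auth, the inputs to the hash, the choice of SHA-256, and the stand-in function fetch_from_campus_ils are all hypothetical.

```python
import hashlib
import secrets
import time
from html import escape

patron_cache = {}    # barcode -> patron record (steps 2 and 3)
valid_cookies = {}   # cookie value -> issue details (step 5)

def fetch_from_campus_ils(barcode):
    """Hypothetical stand-in for a query to the campus ILS."""
    demo_records = {"21000001": {"institution": "Example College"}}
    return demo_records.get(barcode)

def make_cookie(barcode, client_ip):
    """Step 4: one-way hash; the random nonce keeps the value unguessable."""
    material = f"{barcode}|{client_ip}|{secrets.token_hex(16)}|{time.time()}"
    return hashlib.sha256(material.encode("utf-8")).hexdigest()

def authenticate(barcode, client_ip, return_url):
    """Step 1: process the submitted form data; returns (headers, page) or None."""
    record = patron_cache.get(barcode)              # step 2: check the cache
    if record is None:
        record = fetch_from_campus_ils(barcode)     # step 2: fall back to the ILS
        if record is None:
            return None                             # unknown patron
        patron_cache[barcode] = record              # step 3: add to the cache
    cookie = make_cookie(barcode, client_ip)        # step 4
    valid_cookies[cookie] = {"issued": time.time(), "ip": client_ip}  # step 5
    # Step 6: send the cookie plus a page that bounces the browser back.
    headers = [("Set-Cookie", f"ohiolink_auth={cookie}; Domain=.ohiolink.edu; Path=/")]
    target = escape(return_url, quote=True)
    page = f'<html><head><meta http-equiv="refresh" content="0; url={target}"></head></html>'
    return headers, page
```

Note that the valid-cookie table in this sketch stores only the issue time and client address, not the patron identifier.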

Server Architecture

  • Using Apache HTTPD servers for database servers
  • Each database server has two virtual hosts
    1. IP-based authentication
    2. Cookie-based authentication
  • Mod_rewrite is used to redirect browsers with cookies to cookie-based virtual host
  • Browsers with unrecognized IP addresses are redirected to authentication server
  • Custom Apache module is used for cookie validation
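
The routing decision described in this list can be summarized in a few lines. The real system uses Apache virtual hosts and mod_rewrite rules; the sketch below only models the decision, and the order of the checks, the return labels, and the authentication server URL are assumptions.

```python
AUTH_SERVER = "https://auth.example.edu/login"  # hypothetical authentication server

def route_request(ip_recognized: bool, has_credentials_cookie: bool) -> str:
    """Decide which virtual host (or redirect) should handle a request."""
    if has_credentials_cookie:
        # Browsers presenting a cookie go to the cookie-based virtual host,
        # where the cookie is then validated.
        return "cookie-vhost"
    if ip_recognized:
        # Participating campus networks are served by IP address alone.
        return "ip-vhost"
    # Neither a cookie nor a known address: send the browser to log in.
    return "redirect:" + AUTH_SERVER
```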

Policies and Privacy

  • Patron record at home institution must be valid: member of the institution and not expired
  • Authentication failures are tracked
  • Cookies expire in two hours
  • Cookies are checked to see if they are coming from the same IP address
  • Cookie is not added to the patron record cache; user cannot be identified from cookie
  • For auditing purposes, a one-week record of cookie-to-patron matches is kept
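
The two cookie checks in this list (two-hour expiry and same IP address) might be expressed as in the sketch below. The layout of the valid-cookie table and the function name are assumptions; in the system described here the check is done by a custom Apache module.

```python
import time

COOKIE_LIFETIME_SECONDS = 2 * 60 * 60  # cookies expire in two hours

def is_cookie_valid(cookie, client_ip, valid_cookies, now=None):
    """Check a credentials cookie against the table of issued cookies."""
    now = time.time() if now is None else now
    entry = valid_cookies.get(cookie)
    if entry is None:
        return False                        # never issued, or already removed
    if now - entry["issued"] > COOKIE_LIFETIME_SECONDS:
        del valid_cookies[cookie]           # expired: drop it from the table
        return False
    # The cookie must come from the same address it was issued to.
    return entry["ip"] == client_ip
```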

Proxy servers

Delegate

  • Older of the two servers
  • Rewriting plug-in for Delegate
  • Authorization module can selectively allow access to services based on institutional participation

EZproxy

  • Rewriting proxy server (EZproxy from Useful Utilities) used for vendor-hosted databases common to all consortium participants
  • Custom CGI script detects OhioLINK credentials cookies and, if necessary, redirects to the authentication server

University of Texas system

TexShare

Future Developments

"Authenticate Locally, Authorize Globally" -- Project Shibboleth

"Shibboleth ... is developing architectures, policy structures, practical technologies, and an open source implementation to support inter-institutional sharing of web resources subject to access controls."
http://middleware.internet2.edu/shibboleth/

High-level Overview


http://middleware.internet2.edu/shibboleth/docs/draft-internet2-shibboleth-arch-v05.pdf

Key Project Attributes

  • Federated administration
  • Access control based on attributes
  • Active management of privacy
  • Standards based
  • A framework for multiple, scalable trust and policy sets ('Clubs')
  • A standard AttributeValue vocabulary
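
To make "access control based on attributes" concrete, the sketch below shows a provider deciding on released attributes alone, without learning who the user is. It is not Shibboleth's API or message format; the attribute names, values, and rule structure are hypothetical.

```python
def is_authorized(attributes: dict, rule: dict) -> bool:
    """True when every attribute named in the rule has one of its allowed values."""
    return all(attributes.get(name) in allowed for name, allowed in rule.items())

# Hypothetical rule held by an information provider.
LICENSE_RULE = {
    "institution": {"college.edu"},
    "affiliation": {"student", "faculty", "staff"},
}

# Attributes the home institution chose to release; no name or ID is included.
released = {"institution": "college.edu", "affiliation": "student"}
```

Here is_authorized(released, LICENSE_RULE) grants access to a student of the licensed institution, while a user whose released affiliation is not in the allowed set is refused.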

Why Do It?

Benefits to Institutions

  • Provide information services to users without cumbersome IP address restrictions and proxy servers.
  • Enhance user privacy by protecting the user's identity while allowing authorization via non-personally identifying attributes.
  • Single infrastructure for inter-institutional collaborative relationships.

Benefits to Information Providers

  • Finer granularity of access control.
  • Greatly simplified management of use rules and user bases.
  • No need to maintain databases of authorized users.

For More Information...

  • NISO Annual Meeting. Sunday, 2pm to 4pm at the Hyatt Regency Atlanta Hanover Room C/D. (NCIP, Shibboleth, OpenURL, METS)
  • Shibboleth website. http://middleware.internet2.edu/shibboleth/

Wrap-up / Evaluation