== Representations, Warranties and Disclaimer ==
OSTYN CONSULTING OFFERS THIS WORK AS-IS AND MAKES NO
REPRESENTATIONS OR WARRANTIES OF ANY KIND CONCERNING THIS WORK,
EXPRESS, IMPLIED, STATUTORY OR OTHERWISE, INCLUDING, WITHOUT
LIMITATION, WARRANTIES OF TITLE, MERCHANTABILITY, FITNESS FOR A
PARTICULAR PURPOSE, NONINFRINGEMENT, OR THE ABSENCE OF LATENT OR
OTHER DEFECTS, ACCURACY, OR THE PRESENCE OR ABSENCE OF ERRORS,
WHETHER OR NOT DISCOVERABLE.
== Limitation on Liability ==
EXCEPT TO THE EXTENT REQUIRED BY APPLICABLE LAW, IN NO EVENT WILL
OSTYN CONSULTING OR CLAUDE OSTYN BE LIABLE TO YOU ON ANY LEGAL
THEORY FOR ANY SPECIAL, INCIDENTAL, CONSEQUENTIAL, PUNITIVE OR
EXEMPLARY DAMAGES ARISING OUT OF THE USE OF THIS WORK, EVEN IF
OSTYN CONSULTING OR CLAUDE OSTYN HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.
To prevent various kinds of nasty exploits, browsers do not allow scripting across frames or windows showing content from different hosts. If the runtime component of a runtime service and SCORM content come from two servers, this prevents the SCORM API from functioning. The solution is not to try to hack the browser or the content. It is to make both runtime components and content come from the same server, or at least what appears to the browser as a single server. There are several ways to do this; one of them is to use the reverse proxy capabilities built into Apache. This document describes only this one method; it does not mean that it is necessarily always the best one.
The following is excerpted from the Apache Software Foundation web page "Apache Module mod_proxy". The general description of proxy modes applies to any implementation, whether it is Apache, Microsoft ISA, or any other technology:
Apache can be configured in both a forward and reverse proxy mode.
An ordinary forward proxy is an intermediate server that sits between the client and the origin server. In order to get content from the origin server, the client sends a request to the proxy naming the origin server as the target and the proxy then requests the content from the origin server and returns it to the client. The client must be specially configured to use the forward proxy to access other sites. [...]
A reverse proxy, by contrast, appears to the client just like an ordinary web server. No special configuration on the client is necessary. The client makes ordinary requests for content in the name-space of the reverse proxy. The reverse proxy then decides where to send those requests, and returns the content as if it was itself the origin.
A typical usage of a reverse proxy is to provide Internet users access to a server that is behind a firewall. Reverse proxies can also be used to balance load among several back-end servers, or to provide caching for a slower back-end server. In addition, reverse proxies can be used simply to bring several servers into the same URL space.
A reverse proxy is activated using the ProxyPass directive or the [P] flag to the RewriteRule directive. It is not necessary to turn ProxyRequests on in order to configure a reverse proxy.
Note that unlike an open forward proxy, which is considered a high security risk, a reverse proxy is tightly controlled, since proxying is only allowed for destinations that you specify by a reverse proxy rule.
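The excerpt mentions the [P] flag to RewriteRule as an alternative to ProxyPass. As a rough sketch only, assuming that mod_rewrite and mod_proxy are both loaded and using a hypothetical path /repo/ and a hypothetical origin server origin.example, that form could look like this in the server or virtual host configuration:

```apache
# Hypothetical names: /repo/ and origin.example are placeholders.
RewriteEngine On

# Proxy every request under /repo/ to the origin server ([P] = proxy).
RewriteRule ^/repo/(.*)$ http://origin.example/$1 [P]

# Still needed so that redirects issued by the origin server are
# rewritten to point back at the reverse proxy.
ProxyPassReverse /repo/ http://origin.example/
```

The rest of this document uses the ProxyPass form, which is usually simpler when a whole path prefix maps to one origin server.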
The model I suggest for reverse proxying with ProxyPass and ProxyPassReverse is that a virtual directory on the reverse proxy represents each origin server. When the reverse proxy sees a request for a resource on a specific path, it passes the request on to the corresponding origin server.
For example, let us say that the original URL for a SCO is:
http://somerepository.com/courses/course1/sco1.htm
The URL is modified to a URL relative to the URL of the reverse proxy before launching the SCO. Let us say that the reverse proxy is at 192.168.4.2. The modified URL could be:
http://192.168.4.2/somerepositorycom/courses/course1/sco1.htm
An Apache reverse proxy uses configuration file directives that map the virtual directory somerepositorycom to the actual server at somerepository.com:
ProxyPass /somerepositorycom/ http://somerepository.com/
ProxyPassReverse /somerepositorycom/ http://somerepository.com/
Additional statements can be defined to map various virtual directories to corresponding origin servers. For example,
ProxyPass /somerepositorycom/ http://somerepository.com/
ProxyPassReverse /somerepositorycom/ http://somerepository.com/
ProxyPass /adlnetorg/ http://adlnet.org/
ProxyPassReverse /adlnetorg/ http://adlnet.org/
ProxyPass /lms/ http://lmshostedservice.com/
ProxyPassReverse /lms/ http://lmshostedservice.com/
Note that in the example above, the LMS itself is remapped to a separately hosted LMS server. The net effect is that when the browser requests an LMS page at http://192.168.4.2/lms/, the request actually gets passed to the LMS at http://lmshostedservice.com/, and when it requests a content page at http://192.168.4.2/adlnetorg/, the request actually gets passed to the content repository at http://adlnet.org/. As far as the browser is concerned, everything comes from http://192.168.4.2, and the browser is not aware that there are actually other servers involved.
Of course, there is a catch if the LMS uses dynamic server pages (.NET, .jsp, .cfm, etc.) and the server-side code relies on cookies. More about that below.
The following conventions are proposed for modifying the URLs in a content package to enable relaying through a reverse proxy. The conventions support automated modification of the URLs in manifests, and even automatic generation of the required mapping statements, but can also be applied manually:
For each content repository, a corresponding mapping must exist in a pair of ProxyPass and ProxyPassReverse statements. This is both a disadvantage, because adding a new repository requires updating the Apache configuration, and an advantage, because only known repositories can participate, which avoids most of the abuse problems that may occur with an open proxy.
The URL modification can be done at launch time, when the LMS decides to launch a SCO and discovers that it has a fully qualified URL, or it can be done as part of the preparation of a manifest for launching by the LMS. It could even be done offline before uploading a manifest to an existing, unmodified LMS that is entirely unaware of the existence of a relay.
In every case, the result of the URL modification is to turn fully qualified URLs into relative URLs. These URLs become relative to the server address of whatever HTML page they are launched from, which in this case is the server address of the reverse proxy. In addition, when ProxyPassReverse is used, Apache adjusts the URL in the Location, Content-Location and URI headers on HTTP redirect responses. This is required to prevent HTTP redirects issued by the backend servers, which stay behind the reverse proxy, from bypassing the reverse proxy.
Unless the LMS is in fact collocated at the same IP address as the reverse proxy, it must also be hidden behind the reverse proxy, so that both content and LMS will appear to be collocated in the same server. For example, let us say that companyX has a company-wide LMS, which is mapped into its domain name space as "lms.companyx.com". This subdomain can be redirected to the reverse proxy. The actual LMS may reside at a different address which is known only to the reverse proxy, as explained in the next section.
We saw that in order to avoid the cross-server scripting security block, both the API object provided by the LMS and the SCO must come from the same server. Since the API object must be ready before the SCO is launched, it means that the API object must come from the reverse proxy, and not directly from the LMS. The simplest way to achieve this is probably to conduct the entire LMS session from behind the reverse proxy.
In other words, whenever a user attempts to go to the official published URL for the LMS to begin an LMS session, the request should actually be redirected to the reverse proxy. This is because the stage must be set for same-domain scripting long before a SCO can be launched, and you don't want the LMS to start hopping to a different server during an LMS session.
For example, any attempt to go directly to the URL http://lms.companyx.com should be redirected to the URL of the reverse proxy, and the reverse proxy will then relay the request to the LMS. There is an obvious potential problem here: if the relay itself passes requests to the official, published URL for the LMS (http://lms.companyx.com in our example), this could result in an infinite loop. For this reason, the actual, "private" URL of the LMS should be different from the official, published URL. It should remain hidden, as it will be used only by the reverse proxy relay.
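One way to picture this arrangement is a virtual host on the reverse proxy that answers for the published name and relays to the private one. The following is only a sketch under several assumptions: the published name lms.companyx.com is assumed to resolve (through DNS) to the reverse proxy, the private host name lms-private.companyx.example is hypothetical, and the content mapping is borrowed from the earlier example.

```apache
# Assumption: DNS for lms.companyx.com points at this reverse proxy.
# On Apache 2.2, name-based virtual hosts also need: NameVirtualHost *:80
<VirtualHost *:80>
    ServerName lms.companyx.com

    # Reverse proxy only; never act as a forward proxy.
    ProxyRequests Off

    # Content repositories first: the more specific path must come
    # before the catch-all "/" mapping, because the first match wins.
    ProxyPass        /somerepositorycom/ http://somerepository.com/
    ProxyPassReverse /somerepositorycom/ http://somerepository.com/

    # Everything else goes to the private LMS address (hypothetical name).
    # It must differ from the published name to avoid a proxy loop.
    ProxyPass        / http://lms-private.companyx.example/
    ProxyPassReverse / http://lms-private.companyx.example/
</VirtualHost>
```

With a layout like this, the browser only ever sees lms.companyx.com, so the API object and the SCO appear to come from the same server.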
[Editorial comment - Technical: This section should be reviewed by a cookie expert. It is speculative until tested]
In theory, cookies should not be a problem if they are set by HTML content without specifying a domain and path, because in that case the browser assigns the domain and path, mapping them to the reverse proxy rather than the origin server. However, some LMSs, and possibly some content such as simulations, may set cookies that are hard-wired to the domain of the origin server. Also, server-side scripting of cookies, as in .asp pages or .NET, appears to hard-wire cookies to the origin server on which the server-side scripts are running. This prevents the application from working correctly unless the cookies are also remapped.
The ideal would be to use only client-side get and set functionality, and not hard-wired cookies, since the browser can automatically assign a proper domain and path. But that may not be practical in some cases. To handle such hard-wired cookies, Apache 2.2.x adds the directives ProxyPassReverseCookiePath and ProxyPassReverseCookieDomain.
Only content in which the links are relative can be proxied successfully. For example, if an HTML page contains a link to an absolute URL, following that link bypasses the reverse proxy. See the Apache documentation for a link to an add-on utility that can force a remapping.
In order to enable the reverse proxy, the following must be specified in the httpd.conf configuration file for Apache 2.x:
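A minimal sketch of such a configuration is shown here. It assumes a default Apache 2.2 module layout (the LoadModule paths may differ on your installation), uses the Apache 2.0/2.2 access control syntax, and reuses the example mappings from earlier in this document; it is an illustration, not a tested production configuration.

```apache
# Load the proxy core and its HTTP handler (paths are an assumption).
LoadModule proxy_module      modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so

# A reverse proxy does not need forward proxying; keep it off.
ProxyRequests Off

# Allow clients to use the proxied paths (Apache 2.0/2.2 syntax).
<Proxy *>
    Order deny,allow
    Allow from all
</Proxy>

# One ProxyPass/ProxyPassReverse pair per origin server.
ProxyPass        /somerepositorycom/ http://somerepository.com/
ProxyPassReverse /somerepositorycom/ http://somerepository.com/
ProxyPass        /adlnetorg/ http://adlnet.org/
ProxyPassReverse /adlnetorg/ http://adlnet.org/
ProxyPass        /lms/ http://lmshostedservice.com/
ProxyPassReverse /lms/ http://lmshostedservice.com/
```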
The following applies to Apache 2.2 only. To enable automatic cookie remapping, the following must be specified in the httpd.conf configuration file. Note that, unlike the configuration for ProxyPass and ProxyPassReverse, which does not depend on the actual address of the reverse proxy server, this requires knowing the reverse proxy's public address or domain, because it is a required parameter for ProxyPassReverseCookieDomain.
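As an untested sketch, and in keeping with the speculative status of this section, the cookie directives might be combined with the /lms/ mapping as follows. It assumes that the reverse proxy is published as lms.companyx.com and that the LMS origin server is lmshostedservice.com; substitute your own names.

```apache
# Mapping for the LMS, as in the earlier example.
ProxyPass        /lms/ http://lmshostedservice.com/
ProxyPassReverse /lms/ http://lmshostedservice.com/

# Rewrite the domain in Set-Cookie headers coming from the origin
# server: internal domain first, public (proxy) domain second.
ProxyPassReverseCookieDomain lmshostedservice.com lms.companyx.com

# Rewrite the cookie path: "/" on the origin server corresponds to
# "/lms/" on the reverse proxy.
ProxyPassReverseCookiePath / /lms/
```

Cookies that the LMS sets with an explicit domain and path should then be returned by the browser on requests to the reverse proxy under /lms/.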
An excellent article that discusses the security advantages of using a reverse proxy and that provides specific configuration instructions for locking down your Apache server can be found at http://www.securityfocus.com/infocus/1739
A lot has been written about the security features of Apache, and about the ways to prevent unwanted exploits or abuse. It is not necessary to enable ProxyRequests in order to configure a reverse proxy. In fact, you probably want to turn it off unless you want this server to function as an open proxy.
It is also recommended to evaluate whether the ProxyVia directive should be set to On (it is Off by default). If set to On, each request and reply will get a Via: header line added for the current host. This allows auditing and verification that the route taken by the requests and replies is through this reverse proxy.
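The two settings discussed above can be written as follows; whether to enable ProxyVia is a judgment call for your own deployment.

```apache
# Do not act as a forward (open) proxy.
ProxyRequests Off

# Add a Via: header for this host to proxied requests and replies.
ProxyVia On
```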
Apache Documentation Team, "Apache Module mod_proxy", The Apache Software Foundation, 2005. Retrieved on 2006-05-21 from http://httpd.apache.org/docs/2.2/mod/mod_proxy.html
Ivan Ristic, "Web Security Appliance With Apache and mod_security", SecurityFocus, 2003. Retrieved on 2006-05-21 from http://www.securityfocus.com/infocus/1739