So I'm hacking away on a client's site, which is protected by Basic Authentication via an .htaccess file to keep bots and the general public from checking out this next generation of the site. This setup was working great until I started planning out today's task: integrating Amazon's CloudFront CDN. The CDN is trivial to set up, but I could already see a gotcha developing: the CDN would contact the origin site, which would respond with a 401 (authorization required) error, which in turn would fail the request.
Option A would be to drop the basic authentication on the site altogether. Not a great option, as this would expose my client's site to eyes he didn't want seeing it.
So I'd have to go with Option B: tweak the .htaccess file to require authentication on most, but not all requests. There's also Option C: find out what IPs CloudFront is requesting from, and let those in along with Basic Authentication, as outlined here. But, I wasn't sure how feasible it would be to find these IPs, and besides, I was curious if I could learn yet another bit of Apache configuration magic.
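For the curious, Option C would probably look something like the sketch below. The address range is a placeholder taken from the documentation block (203.0.113.0/24), not a real CloudFront range. The real ranges would have to be looked up and kept current, which is exactly the part I wasn't sure about.

```apache
# Sketch of Option C: Basic Authentication for everyone except listed IPs.
# 203.0.113.0/24 is a placeholder, NOT an actual CloudFront range.
AuthType Basic
AuthName "Restricted Area"
AuthUserFile /home/path/.htpasswd
Require valid-user

Order deny,allow
Deny from all
Allow from 203.0.113.0/24

# Either a matching IP or a valid login is enough
Satisfy any
```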
The solution to Option B turns out to be quite clever and is outlined here. Here's a snippet of code from that site:
# password protection allowing directory and file access
AuthType Basic
AuthName "Restricted Area"
AuthUserFile /home/path/.htpasswd
AuthGroupFile /dev/null
Require valid-user
SetEnvIf Request_URI "(path/to/directory/)$" allow
SetEnvIf Request_URI "(path/to/file\.php)$" allow
Order allow,deny
Allow from env=allow
Satisfy any
Apparently, you can flag a request with an environment variable (in this case named allow) and then use that variable later on to decide if a request should be authenticated. That's a remarkably flexible tool, especially considering you've got access to quite a number of variables.
And I always forget about that important Satisfy Any, which basically tells Apache to chill: let the request through if either the Allow rule matches or the user supplies a valid login, rather than insisting on both.
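One caveat: Order, Allow and Satisfy are Apache 2.2 syntax, and on 2.4 they only work through mod_access_compat. If I'm reading the 2.4 docs right, the same idea can be written with the newer Require directives. This is an untested sketch that reuses the placeholder paths from the snippet above:

```apache
# Apache 2.4 sketch of the same idea (untested)
AuthType Basic
AuthName "Restricted Area"
AuthUserFile /home/path/.htpasswd

SetEnvIf Request_URI "(path/to/directory/)$" allow
SetEnvIf Request_URI "(path/to/file\.php)$" allow

# Let the request in if it was flagged OR the user logs in
<RequireAny>
    Require env allow
    Require valid-user
</RequireAny>
```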
Once I had a way of allowing some requests but not others in, it was just a matter of putting the right regular expressions in place that represented my CDN content.
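As an illustration, if the CDN only served stylesheets, scripts and images, the clauses might look like the following. The /assets/ path and the extension list are made up for the example:

```apache
# Hypothetical patterns: flag anything the CDN is expected to fetch
SetEnvIf Request_URI "^/assets/" allow
SetEnvIf Request_URI "\.(css|js|gif|png|jpe?g)$" allow
```

Everything that doesn't match one of these still gets the password prompt.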
One additional gotcha I ran into relates to RewriteRules. Suppose you've got the following rules in place:
RewriteRule ^static/(.*)$ ./files/$1 [L]
And your request looks like:
/static/logo.gif
You need to make sure you have SetEnvIf clauses that match both /static/logo.gif and /files/logo.gif. With only a /static/ clause in place, the request, which has been rewritten to /files/logo.gif, no longer matches a SetEnvIf clause and so requires authentication. You'd need both of these lines:
SetEnvIf Request_URI "^/static/.*" allow
SetEnvIf Request_URI "^/files/.*" allow
Thank you very much for this post! Been trying to troubleshoot my authentication bypass for a little while now. Turned out I was missing the "Allow from env=allow" line, which I realized after reading through your post. Cheers!