Chuck Kollars` Personal Home Fiddling with Home PCs

Munging (transforming) URIs (web addresses) in Apache

This webpage is specifically about transforming the URIs sent by a user into something that better matches your webserver.

You may instead desire either a thorough general description of .htaccess or an in-depth description of exactly how .htaccess behaves (particularly useful if you're trying to do something arcane and it isn't working).

Alternatives

If your website is hosted on an Apache server, and you (i) want clients to automatically adjust to your revised website layout, or (ii) want clients to see friendly URIs even though your actual local path- and file-names are more complex, or (iii) for better SEO want web search engines to get a cleaner view of your website, you will need to do some sort of rewriting of requested URIs. Apache offers four different ways to do this:

Three of the options provide (but do not require you to use) the full power of PERLish/PCRE regular expressions (also called regex or RE), so that's seldom a differentiator. (What the availability of powerful REs really means is that, no matter which option you use, a regex nerd will likely be able to express the equivalent of your final result in fewer, shorter statements ...but so what?)

(Note that with most alternatives, if you have root access to a dedicated webserver there are more reasonable debugging options. But on a shared webserver, I know of no reasonable debugging options for any of the alternatives. So build up your .htaccess file just a little bit at a time, testing at every step, and stopping immediately whenever something doesn't work quite right or behaves in an unexpected way. Then solve and correct [or at least fully understand] that issue before continuing. By doing things in baby steps you won't wind up with a large mass of code that doesn't work but you have no idea why not and no idea where to start or what to look for.)
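As an illustration of the kind of debugging that root access makes possible, here is a minimal sketch. It assumes Apache 2.4 and that you can edit the server-wide configuration; the trace level shown is an arbitrary choice, and higher levels are more verbose.

```apache
# In httpd.conf (or a virtual host section), NOT in .htaccess.
# Assumes Apache 2.4: write mod_rewrite's decisions to the normal error log,
# while leaving every other module at the "alert" level.
LogLevel alert rewrite:trace3
```

On Apache 2.2 the corresponding directives were RewriteLog and RewriteLogLevel. Either way, turn the tracing back off when you're done, since it slows the server down.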




The rest of this webpage assumes you've selected the mod_rewrite alternative (sometimes derogatorily called "the old way"). To create a simple mod_rewrite recipe in a .htaccess file, following the few simple rules of thumb below should bypass all the potential problems. (But to create a more complex mod_rewrite recipe, understanding more about how mod_rewrite actually works will likely be invaluable.)

(Another way to skim over all the whys and just hit the important "do this" points is to look only at the accented [light plain background like this] portions below.)

Processing of .htaccess files by mod_rewrite is a little odd. There are a couple of good reasons for this. First, because normal Apache rewriting and redirection are all finished long before individual .htaccess files are even reached, rewrite modules must undo then redo then fake out everything. And second, to address the conflicting requirements of providing the best performance in a process that's done for every single request (not just once) while keeping the configuration fairly simple, the processing is sometimes more unusual than what you'd naively expect. And although some things about mod_rewrite are minutely documented, many others are hardly documented at all.

Its quirks can make mod_rewrite seem much harder than it really is (especially if you try to create a simple naive .htaccess file without knowing how to avoid the gotchas). These webpages list standard workarounds for the gotchas, common idioms, and other usage hints.

Remember though that the golden rule of mod_rewrite is KISS (Keep It Simple Stupid). Before you try to implement anything complicated, check again if tweaking your website layout would make the problem go away entirely, if just a couple symlinks could easily solve the problem, if handling a whole subdirectory all at once rather than individual files would greatly simplify the problem, if the extra functionality is really necessary, if the problem can be prized apart into two separate and much much simpler problems, if the problem can be expressed more crisply, and in general if there's any easier way.

(Many times these webpages will refer to the absolute local file directory corresponding to the website root, the document root, which is of course different on different systems. This value is available in .htaccess files as %{DOCUMENT_ROOT}. But typing that is long and awkward [and possibly not as clear as it should be]. So the rest of this webpage will refer to the value of the document root path on your computer as simply rootpath.)
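For instance, here is a rough sketch of %{DOCUMENT_ROOT} in use. The static subdirectory is a made-up name for illustration: the idea is to serve a file from rootpath/static/ whenever a file with the requested name exists there.

```apache
RewriteEngine on
# $1 is whatever the RewriteRule pattern below captured.
# -f is true only if the resulting path names an existing regular file.
RewriteCond %{DOCUMENT_ROOT}/static/$1 -f
RewriteRule ^(.+)$ static/$1 [L]
```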




So what are the most basic rules of thumb for using mod_rewrite?

Rule 1) Use mod_rewrite only in the website root .htaccess file

The rules for handling multiple mod_rewrite .htaccess files, and exactly what's presented to each one (i.e. the per-dir relativizing of filenames), are tricky. While multiple mod_rewrite .htaccess files occasionally have their place, they're not helpful in the vast majority of cases; simply avoid them. (For the infrequent cases where they're actually helpful, understand what per-dir means and exactly how it works.)

In fact, it's best to start out with mod_rewrite statements (even just RewriteEngine on/off) only in the .htaccess file at the root of the whole website. In that case, filenames will always include the full path, and relative-vs.-absolute errors will largely disappear since they'll be the same anyway.
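A small sketch of what this looks like in practice (the docs directory and the two file names are hypothetical):

```apache
# rootpath/.htaccess
RewriteEngine on
# A request for /docs/old.html is tested here as "docs/old.html":
# the whole path from the website root, minus the leading slash.
RewriteRule ^docs/old\.html$ docs/new.html [L]
```

The same pattern placed in rootpath/docs/.htaccess would instead be tested against just "old.html", which is exactly the per-dir relativizing this rule of thumb lets you sidestep.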

Rule 2) Always start with these three lines of boilerplate

    Options +FollowSymLinks
    RewriteEngine on
    RewriteBase /

    RewriteCond %{ENV:REDIRECT_STATUS} \d\d\d [OR]
    RewriteCond %{REQUEST_FILENAME}==%{ENV:SAVED_REQUEST_FILENAME} ^(.*?)==\1$ [OR]
    RewriteCond %{REQUEST_URI} ^.{300}
    RewriteRule ^ - [L]
    RewriteRule ^ - [E=SAVED_REQUEST_FILENAME:%{REQUEST_FILENAME}]

(The argument to RewriteBase should normally be the URI path to whichever subdirectory the .htaccess file is in [or you might think of it as the path from the website's root to the subdirectory the .htaccess file is in]. Use RewriteBase / in the .htaccess file in the website's root directory. If you have a complex setup with .htaccess files in subdirectories, the argument to RewriteBase should be adjusted in each one, probably to something like RewriteBase /grandparentsubdir/parentsubdir/thissubdir.)

The first mod_rewrite statement in a .htaccess file should always be RewriteEngine on, immediately followed by the other two. Even though it may not seem like it, the symlinks option really is relevant to the operation of mod_rewrite. In a few cases, to maintain security, mod_rewrite only works fully correctly if symlinks are enabled within Apache. The easiest resolution is to just turn them on all the time, even though they're often not really required. (Make sure the system-wide Apache configuration [probably file httpd.conf] doesn't try to force symlinks permanently off or disallow overriding of that option, either of which can result in symlinks not being on even though you've specified the correct line in your .htaccess file.)
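If you do have access to the system-wide configuration, the following sketch shows the sort of setting that lets a .htaccess file turn symlinks on and use mod_rewrite at all. The directory path is a placeholder, and your server may well organize this differently.

```apache
# In httpd.conf, NOT in .htaccess. /var/www/html is a placeholder path.
<Directory "/var/www/html">
    # "Options" lets .htaccess files use the Options directive;
    # "FileInfo" lets them use the mod_rewrite directives.
    AllowOverride Options FileInfo
</Directory>
```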

(The RewriteBase / statement [the argument / would usually be different somewhere else besides the webserver's root directory] only affects relative external-redirects, and isn't really necessary because you shouldn't use such redirects anyway. However example code on the Internet uses RewriteBase / so widely it seems prudent to include it.)

(The net effect of RewriteBase / is to make relative external-redirect statements [like RewriteRule ... filename [R=301,L]] return to the client a revised URI of the form http://yourwebserver/filename rather than http://yourwebserver/grandparentpath/parentpath/filename. Unfortunately incorrect descriptions of RewriteBase (including that it has something to do with making absolute filenames relative) are so widespread it can be difficult to figure out what the directive really does. Fortunately, it doesn't matter very much, especially when mod_rewrite appears only in the website's root directory.)
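One way to sidestep the question entirely is to give external redirects a target that starts with a slash (or a full URL), so there's nothing relative for RewriteBase to adjust. A minimal sketch, with hypothetical file names:

```apache
RewriteEngine on
# Relative target: the URI sent back to the client depends on RewriteBase.
#   RewriteRule ^old\.html$ new.html [R=301,L]

# Root-relative target: the client is sent to http://yourwebserver/new.html
# regardless of RewriteBase.
RewriteRule ^old\.html$ /new.html [R=301,L]
```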

If you temporarily change RewriteEngine on to RewriteEngine off, the .htaccess file will still be considered a mod_rewrite .htaccess file. The exact same .htaccess files will still be identified and processed in the exact same order. However, sometimes some mod_rewrite statements will be partially processed anyway (not just completely ignored as expected). For example some RewriteRule statements may act as internal redirects, even though RewriteEngine off has been specified and even though the statements normally act as external redirects.

Rule 3) Avoid LOOPs (public enemy #1 of mod_rewrite)

Loops (either infinite loops or one extra time loops) are very common with naive mod_rewrite instructions in .htaccess files. Looping seldom means anything is coded wrong. Rather loops are simply a gotcha of the way mod_rewrite behaves in .htaccess files. They're easy to corral, and with very little or no change at all to your code.

First a brief explanation of one of the reasons why they happen: In .htaccess files, the [L]ast flag does not do what you probably expect it to do. In an httpd.conf context, both [N]ext and [L]ast behave exactly the way they would in PERL, either looping to the top of the file or exiting the file completely.

But in a .htaccess context, [L]ast behaves a little differently. It ends the current pass (so calling it [L]ast still makes reasonable sense). But it does not necessarily exit the file completely. If the target was changed (redirected) by the current pass (the one that encountered the [L]ast flag), execution starts all over again from the top of the file! (See some of the potential for infinite loops and duplication?)
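To make the danger concrete, here is a sketch of a naive rule that looks harmless but keeps matching its own output. The sub directory and page.html are made-up names.

```apache
RewriteEngine on
# Intent: quietly serve every request from the "sub" subdirectory.
# Pass 1: "page.html"      -> "sub/page.html"      (changed, so start over)
# Pass 2: "sub/page.html"  -> "sub/sub/page.html"  (changed, so start over)
# ...and so on, until Apache gives up and reports a server error.
RewriteRule ^(.*)$ sub/$1 [L]
```

Nothing in the rule is miscoded in the usual sense; the [L] flag simply doesn't stop the next pass from matching the rewritten target again.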

So in a .htaccess context, [N]ext and [L]ast sound pretty much alike at first. Why are there separate flags at all? Do they really behave differently? It turns out they do behave differently ...but in rather subtle ways that may not be immediately obvious. Both always go to the top of the .htaccess file. But [N]ext always makes another pass (with existing environment variables), whereas [L]ast only makes another pass (with greatly modified and even lost environment variables) if the target was changed/redirected by the previous pass.
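For comparison, a typical deliberate use of [N] looks something like the sketch below, which follows the pattern of the example in the Apache documentation. It relies on the rule eventually ceasing to match; otherwise [N] loops just as happily as anything else.

```apache
RewriteEngine on
# Replace one "A" with "B", then run the rules again from the top.
# The passes stop once the pattern no longer matches (no "A" is left).
RewriteRule (.*)A(.*) $1B$2 [N]
```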

And what can you do to prevent this problem? (You should do at least two and perhaps all three of these; don't just choose one.)
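As one illustration, the first guard from the boilerplate under Rule 2 can be shown on its own. This sketch assumes the usual behavior that Apache sets REDIRECT_STATUS to a three-digit status only after an internal redirect, so the variable is empty on the first pass. The sub directory is a made-up name.

```apache
RewriteEngine on
# Guard: if this pass is itself the result of an internal redirect,
# change nothing and stop. Since nothing changed, no further pass is made.
RewriteCond %{ENV:REDIRECT_STATUS} \d\d\d
RewriteRule ^ - [L]

# This rule now runs on the first pass only, so it can't re-match its own output.
RewriteRule ^(.*)$ sub/$1 [L]
```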

Rule 4) Don't mix mod_rewrite with mod_alias

A combination of mod_alias statements (Redirect ..., Alias ...) and mod_rewrite statements (Rewrite...) is not recommended.

It will do something (occasionally even what you want:-), and it will do the same thing every time. But understanding the behavior of the combination thoroughly enough to get it to do tricky things reliably and maintainably is so weird it's better to just avoid the mixture entirely.

Specifying both Alias ... and Rewrite... statements in the same .htaccess file seems so logical: first specify with Alias ... any whole directories that have been relocated internally, then specify with Rewrite... the more esoteric remappings for individual files and uncommon cases. But unfortunately it doesn't actually work. In fact mixing Alias ... and Rewrite... statements will do something different than what you intended so often that it's best simply avoided.
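A sketch of the alternative, using a whole-directory redirect as the example (olddir and newdir are hypothetical names): rather than pairing a mod_alias statement with mod_rewrite ones, express the directory-level change in mod_rewrite too.

```apache
# Mixed (not recommended): a mod_alias statement next to mod_rewrite ones
#   Redirect 301 /olddir /newdir
#   RewriteRule ...

# mod_rewrite only: roughly the same redirect, handled by a single module
RewriteEngine on
RewriteRule ^olddir/(.*)$ /newdir/$1 [R=301,L]
```

With everything in one module, the order in which the statements are applied is simply the order in which they're written.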

Rule 5) Avoid intimately mixing mod_rewrite with mod_env

mod_rewrite ([E=NAME:value]) and mod_env (SetEnv ... and SetEnvIf ...) both have full access to all the environment variables. So it's tempting to switch back and forth between mod_rewrite and mod_env when handling environment variables. But don't.

The relative execution timing of statements belonging to the two different modules is almost certainly not what you wish it were. The net result is often a variable being set after the other module has already read the empty value from it, often leading to subtle logic errors that are hard to debug.

One way to avoid this problem is to mentally assign each variable to either mod_rewrite or mod_env, and never ever set or read that variable with the other module. (Another way to avoid this problem is to not use mod_env at all, setting environment variables exclusively with mod_rewrite.)
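A minimal sketch of the second approach, where mod_rewrite both sets and reads the variable. The variable name IS_STAGING, the staging. host prefix, and the robots file names are all assumptions made up for this example.

```apache
RewriteEngine on
RewriteBase /

# mod_rewrite sets the variable (the "-" means the URI itself is left alone)...
RewriteCond %{HTTP_HOST} ^staging\. [NC]
RewriteRule ^ - [E=IS_STAGING:1]

# ...and mod_rewrite reads it, later in the same pass.
RewriteCond %{ENV:IS_STAGING} =1
RewriteRule ^robots\.txt$ robots-staging.txt [L]
```

Because both statements belong to the same module, the variable is set before it's read simply because the setting rule comes first in the file.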




In simple cases the above are all the rules of thumb you'll ever need. Here's a description of some of the more nitty-gritty details that may be useful when implementing complex uses of mod_rewrite.



Email comments to Chuck Kollars
All content on this Personal Website (including text, photographs, audio files, and any other original works), unless otherwise noted on individual webpages, are available to anyone for re-use (reproduction, modification, derivation, distribution, etc.) for any non-commercial purpose under a Creative Commons License.