The FreeBSD Diary (TM)
Providing practical examples since 1998

Rewriting URLs within Apache
19 January 2000

This article shows you how to rewrite URLs as they arrive at your Apache server. I used this on my website: I moved everything from /freebsd/ to / and renamed all the files from *.htm to *.html, then told Apache how to process the incoming URLs so the correct files were found. This is way-cool stuff!

Note: although I was doing the rewrites for quite some time after reorganizing the website, I have now removed the rewrites.

This solution requires mod_rewrite (which is included in the Apache port I used). Make sure the following are present in your httpd.conf:

    LoadModule rewrite_module libexec/apache/mod_rewrite.so
    AddModule mod_rewrite.c

*** WARNING *** If you are using FrontPage Extensions, you can break things when using mod_rewrite. See "FrontPage doesn't like RewriteRule" for how to avoid this problem.

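On newer Apache releases the module is loaded a little differently. The following is a minimal sketch, assuming Apache 2.x; the module path shown is an assumption (it varies by port and version), so check your own installation before copying it.

```apache
# Assumption: an Apache 2.x layout. The path to mod_rewrite.so differs
# between installations, so adjust it to match yours.
LoadModule rewrite_module libexec/apache24/mod_rewrite.so

# No AddModule line is needed here; that directive was removed in Apache 2.0.
```
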
Different solutions for different situations

If you are using your ISP's webserver, then the .htaccess solution is what you will probably need. If you have renamed the files from .htm to .html, then the simple solution will do. And the more complex solution is what you need if you have changed directory names.

Useful Resources

I found that the following URLs were useful.

This section deals with the renaming of files from *.htm to *.html. Here is what I added to the virtual host section of /usr/local/etc/apache/httpd.conf as an example:

    <Directory "/www/test.freebsddiary.org">
        RewriteEngine on
        RewriteBase /
        RewriteRule ^rewrite\.htm$ rewrite.html [R=permanent]
    </Directory>

When a request arrives for rewrite.htm, it is rewritten to rewrite.html. This is demonstrated by https://test.freebsddiary.org/rewrite.htm which will rewrite the URL to https://test.freebsddiary.org/rewrite.html. This is a simple solution and works well for one file.

All files

We will now look at how to write rewrite rules for all files. But in this example, we'll use a silly file extension name, just because we can.

    <Directory "/www/test.freebsddiary.org">
        RewriteEngine on
        RewriteBase /
        RewriteRule ^(.*)\.xyz$ $1.html [R=permanent]
    </Directory>

The above redirects any request for an .xyz file to the matching .html file. As I had renamed all such files, this is enough for me. It also changes the URL in the user's browser: if they had requested foo.xyz, their browser will display foo.html. If foo.html doesn't exist, they will get the normal error screen.

The "=permanent" indicates to the client that this is a permanent change in the URL. If you don't supply this option, the relocation is deemed temporary.

To demonstrate this rewrite, click on https://test.freebsddiary.org/rewrite.xyz which will rewrite the URL to https://test.freebsddiary.org/rewrite.html. Given the above rule, https://test.freebsddiary.org/rewrite2.xyz will not work because there is no file named rewrite2.html at the test webserver.

I have seen a solution which first checks if foo.html exists and, if it does, returns foo.html. If foo.html does not exist, the URL in the browser remains unchanged at foo.htm. I wrote about it here.

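The R flag also accepts a numeric status code or another symbolic name. Here is a short sketch of the equivalent spellings, assuming the same test directory as in the example for all files; the .abc extension is made up purely for illustration.

```apache
<Directory "/www/test.freebsddiary.org">
    RewriteEngine on
    RewriteBase /

    # R=permanent and R=301 are two spellings of the same thing.
    RewriteRule ^(.*)\.xyz$ $1.html [R=301]

    # A bare R (or R=temp, or R=302) sends a temporary redirect.
    # Hypothetical extension, for illustration only.
    RewriteRule ^(.*)\.abc$ $1.html [R]
</Directory>
```
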
NOTE: In recent testing, I was unable to get this solution to work.

This solution deals with the moving of files from one directory to another as well as the renaming of the extension. Here is what I added to the virtual host section of /usr/local/etc/apache/httpd.conf:

    <Directory "/www/test.freebsddiary.org/example">
        RewriteEngine on
        RewriteBase /
        RewriteRule ^(.*)\.htm$ $1.html [R=permanent]
    </Directory>

Now, https://test.freebsddiary.org/example/rewrite.htm will rewrite the URL to https://test.freebsddiary.org/rewrite.html. Because RewriteBase is /, the rewritten path is relative to the top of the site rather than to /example/.

Redirect and rewrite - the file is on another server in another directory

If you want to redirect racesys to http://www.racingsystem.com/ here is what I used. Within the .htaccess on freebsddiary.org, I placed this:

    Redirect permanent /racesys http://www.racingsystem.com/racesys

This redirects the client to http://www.racingsystem.com/racesys. At that website, I have this:

    <Directory "/www/racingsystem.com/racesys">
        AllowOverride All
        RewriteEngine on
        RewriteBase /
        RewriteRule ^$ / [R=permanent]
    </Directory>

This rewrite says that for an empty string (i.e. ^$), rewrite the URL to be just /. And the URL becomes http://www.racingsystem.com/.

Why did we not redirect straight to http://www.racingsystem.com/ in the first place? Because I also have these types of rewrites on the website in addition to the above:

    RewriteRule ^booksmags\.htm$ booksmags.html [R=permanent]
    RewriteRule ^download\.htm$ download.html [R=permanent]
    RewriteRule ^enhance\.htm$ enhance.html [R=permanent]

I preferred to put those rewrites in the /racesys/ directory rather than in the main directory.

These solutions can also be accomplished with .htaccess entries. This is useful to know if you are not running your own webserver and do not have access to httpd.conf. Here's what I put into my .htaccess for this solution:

    RewriteEngine on
    RewriteBase /
    RewriteRule ^(.*)\.htm$ $1.html [R=permanent]

The virtual host in question must allow FileInfo to be overridden. I used AllowOverride All, which includes FileInfo:

    <Directory "/www/freebsddiary.org/freebsd">
        AllowOverride All
    </Directory>

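If you would rather not open up every override, a narrower setting should also be enough. This is a sketch, assuming the same /freebsd directory as in the .htaccess example; it grants only FileInfo, the override class that covers the mod_rewrite directives. The Options line is an addition of mine: per-directory rewriting also relies on symlink following being enabled, and your server may already set it elsewhere.

```apache
<Directory "/www/freebsddiary.org/freebsd">
    # Grant only the override class that the Rewrite* directives need.
    AllowOverride FileInfo

    # Per-directory rewriting needs FollowSymLinks; the leading "+"
    # adds it without discarding options inherited from elsewhere.
    Options +FollowSymLinks
</Directory>
```
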
This set of rules is based on an example from http://www.engelschall.com/pw/apache/rewriteguide/#ToC21 and can be used when you have renamed files from .htm to .html. It first checks to see if a file with the new extension exists. If it does, it returns that URL. Otherwise, it returns the original URL.

    RewriteEngine on
    RewriteBase /
    # strip the .htm extension, but remember that we did so
    RewriteRule ^(.*)\.htm$ $1 [C,E=WasHTM:yes]
    # if the .html file exists, redirect to it and skip the next rule
    RewriteCond %{REQUEST_FILENAME}.html -f
    RewriteRule ^(.*)$ $1.html [S=1,R]
    # otherwise put the original .htm extension back
    RewriteCond %{ENV:WasHTM} ^yes$
    RewriteRule ^(.*)$ $1.htm

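The same idea can probably be written more compactly by testing for the file before redirecting. This is a sketch of an alternative, not the ruleset from the guide, and it assumes the .htaccess file sits in the document root so that %{DOCUMENT_ROOT}/$1.html is the right path to test.

```apache
RewriteEngine on
RewriteBase /

# Only redirect foo.htm when foo.html really exists on disk.
# $1 is the part captured by the RewriteRule pattern below.
RewriteCond %{DOCUMENT_ROOT}/$1.html -f
RewriteRule ^(.*)\.htm$ $1.html [R,L]
```

If the condition fails, the rule does not apply and the request for foo.htm is handled as usual.
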
Redirecting/rewriting for a specific file

If you have moved a file from one server to another, this is my favorite method for redirecting. I put this within the virtual host section of the website in question.

    Redirect permanent /cats/ http://www.freebsddiary.org/cats/

You should also read "Redirecting URL requests with Apache" for more information on redirects.

If you have renamed a file, and wish to redirect incoming requests, you can do this:

    RewriteEngine on
    RewriteBase /
    RewriteRule ^about\.htm$ about.html [R=permanent]

The above will result in requests for about.htm being redirected to about.html. The "^" anchors the start of the pattern. The "\" is an escape which makes the "." match a literal period rather than any character. The "$" anchors the end of the pattern.

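To see why the anchors and the escape matter, compare the about.htm rule with looser versions of itself. This is an illustrative sketch only: the two looser rules are deliberately wrong and left commented out, and the example paths in the comments are made up.

```apache
RewriteEngine on
RewriteBase /

# Anchored and escaped: matches exactly "about.htm".
RewriteRule ^about\.htm$ about.html [R=permanent]

# Unescaped dot (do not use): "." matches any character,
# so this would also match a path such as "aboutxhtm".
#RewriteRule ^about.htm$ about.html [R=permanent]

# No anchors (do not use): matches anywhere in the path,
# so this would also match "old/about.html".
#RewriteRule about\.htm about.html [R=permanent]
```
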
Redirects vs rewrites

When should you use a redirect? When should you use a rewrite? If the file is on the same website, you should use a rewrite. If the file is on another server, you should use a redirect. Why? A simple answer is bandwidth. A redirect sends the new URL back to the client and the client must reissue the request, which creates more traffic. With an internal rewrite (a RewriteRule without the R flag), the server satisfies the original request with the new file and the client does not have to reissue anything. Note that a RewriteRule with the R flag, as used in the examples in this article, also sends the new URL back to the client.

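The difference between the two comes down to whether the R flag is present. Here is a minimal sketch of both forms side by side, assuming an .htaccess file at the top of the site; contact.htm and contact.html are hypothetical file names.

```apache
RewriteEngine on
RewriteBase /

# Internal rewrite: no R flag, so the server quietly serves about.html
# for a request to about.htm and the URL in the browser does not change.
RewriteRule ^about\.htm$ about.html [L]

# External redirect: the R flag sends the new URL back to the client,
# which then issues a second request. Hypothetical file names.
RewriteRule ^contact\.htm$ contact.html [R=permanent,L]
```
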
What I'm using now

NOTE: Since writing this article, I have removed these rewrites from my webserver.

When I rearranged the Diary, I moved everything from /freebsd/ into /. I wanted the old URLs to still work. The following is the contents of /freebsd/.htaccess:

    RewriteEngine on
    RewriteRule ^$ / [R=permanent]
    RewriteBase /
    RewriteRule ^(.*)\.htm$ $1.html [R=permanent]

The following describes each of the above lines:

Line 2 allows for http://www.freebsddiary.org/freebsd/ to take you to the home page.
Lines 3/4 allow http://www.freebsddiary.org/freebsd/ed1.htm to still work.

Coming soon to a log file near you!

Here's what I get in my log files if someone browses to http://www.freebsddiary.org/freebsd/search.htm:

    "GET /search.htm HTTP/1.0" 302 335 "-" "Mozilla/3.01Gold"
    "GET /search.html HTTP/1.0" 200 2304 "-" "Mozilla/3.01Gold"

As you can see, the first request for search.htm is shown. The status code 302 is the HTTP code for a temporary redirect. Then you can see the real page being requested, search.html.

You can also log the rewrites by putting the following within your virtual host definition:

    RewriteLog /var/log/apache/racingsystem.com-rewrite.log
    RewriteLogLevel 1

Rewrite logging should only be used for debugging, as high levels of logging can dramatically affect performance. See http://www.apache.org./docs/mod/mod_rewrite.html for detail.

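RewriteLog and RewriteLogLevel belong to the older Apache releases this article was written for. Below is a sketch of the rough equivalent on Apache 2.4, where rewrite tracing goes to the ordinary error log instead; the log file path is an assumption modeled on the one used in the RewriteLog example.

```apache
# Place within the virtual host definition.
# Assumption: error log location chosen for illustration.
ErrorLog /var/log/apache/racingsystem.com-error.log

# Keep other modules quiet, but trace mod_rewrite decisions.
# trace1 is the least verbose trace level and trace8 the most.
LogLevel alert rewrite:trace1
```

As with RewriteLogLevel, the trace levels are best kept for debugging sessions only.
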