Setup And Configure Mod_Rewrite on Ubuntu 16.04

Date/Time: July 4, 2019 @ 5:39 pm, by Dusko Savic

In this tutorial, we will study Mod_Rewrite, one of the most important modules of the Apache server. It can change the incoming address from the browser into something else, on the fly. The most common usage is to improve SEO rankings. Google, Bing and other search engines are in the business of relevancy – the site they show in the rankings should be relevant to the keywords the searcher has typed in. One of the most important relevancy factors is to have the address of the page on the site mimic or show the keywords searched as closely as possible.

If the keyword was

how to install apache on ubuntu

search engines will tend to rank higher a page with an address such as

https://example.com/how-to-install-apache-on-ubuntu

In reality, such a page may not even exist in a fixed form. The site may just use a call to a database of articles with an address such as:

https://example.com/show.php?keyword1=install&keyword2=apache&keyword3=ubuntu

This is what popular site frameworks such as WordPress, Joomla, Drupal and many others do behind the scenes all day long. In this tutorial, you’ll see how you can achieve the same effect on your own site.

What We Are Going To Cover

  • Installing Apache
  • Enabling Mod_Rewrite in Apache
  • Permitting .htaccess files to override Apache settings
  • Explaining the structure of a Mod_Rewrite command
  • Using Mod_Rewrite to change the file served
  • Making an incoming parameter look like a directory
  • Using RewriteCond to always serve the site address without www.

Prerequisites

  • We use Ubuntu 16.04, but the procedure is practically the same on other Debian-based systems. On other distributions, package names, helper commands such as a2enmod, and file paths may differ.
  • A user with root (sudo) privileges

Step 1: Install Apache

First, update your package manager’s cache:

sudo apt update -y

Install the Apache web server:

sudo apt install apache2 -y

Next, enable its service to make it run on every system boot:

sudo systemctl enable apache2

Finally, start it:

sudo systemctl start apache2

To verify that Apache was installed successfully, open http://SERVER_IP/ in your local browser, where SERVER_IP is the IP address of your server. Port 80 is the default for HTTP, so spelling it out gives the same result:

http://SERVER_IP:80

You should see a welcome page for Apache, which means that you now have Apache running.

Step 2: Enable Mod Rewrite

Here is a command to enable Mod_Rewrite within Apache:

sudo a2enmod rewrite

Activate it by running:

sudo systemctl restart apache2
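
To confirm that the module is really loaded after the restart, you can list the loaded Apache modules and look for rewrite_module. Here is a minimal sketch. The helper name has_rewrite is made up for this example, and it assumes that apache2ctl -M prints one module per line, as it does on Ubuntu.

```shell
# has_rewrite: hypothetical helper; reads a module list on stdin and
# reports whether rewrite_module appears in it.
has_rewrite() {
    if grep -q 'rewrite_module'; then
        echo "mod_rewrite is loaded"
    else
        echo "mod_rewrite is NOT loaded"
        return 1
    fi
}

# On the server, feed it the real module list:
#   sudo apache2ctl -M | has_rewrite
```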

Step 3: Enabling Access to .htaccess File

In theory, you can put all rewrite rules into the Apache configuration files, but in real life that is often not possible. Your site may be on shared hosting where you do not have access to the Apache configuration, and the solution is to use files called .htaccess. (The name starts with a dot, so the file is hidden from a plain ls listing; use ls -a to see it.)

Open the Apache global configuration file for editing:

sudo nano /etc/apache2/apache2.conf

Under the <Directory /var/www/> block, you’ll find the following line:

AllowOverride None

Change it to

AllowOverride All

When you are done, save the file.

Restart Apache so that it takes the new configuration into account:

sudo systemctl restart apache2
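
If you prefer not to edit the file by hand, the same change can probably be scripted with sed. The sketch below works on a small sample file created just for this example, not on the real /etc/apache2/apache2.conf, and it assumes the block opens with exactly <Directory /var/www/> as in the text above.

```shell
# Hypothetical sample standing in for /etc/apache2/apache2.conf, so the
# sketch can be tried without touching the real configuration.
conf=$(mktemp)
cat > "$conf" <<'EOF'
<Directory />
    AllowOverride None
</Directory>
<Directory /var/www/>
    Options Indexes FollowSymLinks
    AllowOverride None
    Require all granted
</Directory>
EOF

# Within the <Directory /var/www/> block only, switch None to All.
sed '\#<Directory /var/www/>#,\#</Directory>#s/AllowOverride None/AllowOverride All/' \
    "$conf" > "$conf.new"

# Show the result; only the AllowOverride line in the second block
# should now read "All".
grep -n 'AllowOverride' "$conf.new"
```

On a real server you would run the sed command with sudo against a backup copy of the configuration file first, and sudo apache2ctl configtest is a reasonable check before restarting Apache.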

We are now ready to use Mod_Rewrite.

Mod_Rewrite Example 1: Redirecting To Another Site

A .htaccess file can change Apache settings for the folder it is in as well as for all the folders beneath it. If you want a rule to hold for the entire site, create a .htaccess file in the document root folder. On Ubuntu, that is /var/www/html by default. Let’s create it:

sudo nano /var/www/html/.htaccess

If you run several sites, each with its own document root, the file will probably be here instead:

sudo nano /var/www/example.com/html/.htaccess

Put the following text into the .htaccess file:

Options +FollowSymLinks
RewriteEngine On
RewriteRule ^empty.html$ http://www.google.com/ [R=301]

The first line enables the FollowSymLinks option, which Mod_Rewrite needs in order to work from a .htaccess file; keep it unless the option is already enabled in the server configuration. The second line switches the rewrite engine on; without it, the rules that follow are ignored. The third line is the rewrite rule. Let’s explain what it does:

  • RewriteRule is the command for rewriting. The rest of the line says what to match, where to send it, and how the redirection should be executed.
  • ^empty.html$ is an example of a regular expression. Symbol ^ denotes the start and symbol $ denotes the end of the expression, so the rule is obeyed only if the path after the domain name is exactly empty.html, not if it merely contains it. Strictly speaking, an unescaped dot matches any single character; write \. to match only a literal dot.
  • http://www.google.com/ is the address to go to when the incoming path matches. In this case, we end up on Google’s main site, but we could have pointed it to any other valid address.
  • [R=301] The flags in square brackets change the way the transition is executed. Here R=301 means that a redirection of type 301 should take place. 301 is the HTTP status code for a permanent redirect; the server returns it to the browser together with the new address.
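
The effect of the anchors and of the dot can be tried out on the command line before touching the server. This is only a sketch: the helper name matches is made up, and grep -E stands in for the regular expression engine Apache uses, which behaves the same way for simple patterns like these.

```shell
# matches PATTERN PATH: hypothetical helper that prints whether PATH
# matches PATTERN. grep -E (POSIX extended regex) stands in for Apache's
# regex engine; the two agree for the simple patterns used here.
matches() {
    if printf '%s\n' "$2" | grep -Eq -- "$1"; then
        echo "match"
    else
        echo "no match"
    fi
}

matches '^empty.html$'  'empty.html'      # match
matches '^empty.html$'  'emptyxhtml'      # match: an unescaped dot is a wildcard
matches '^empty\.html$' 'emptyxhtml'      # no match: \. is a literal dot
matches '^empty.html$'  'old/empty.html'  # no match: ^ anchors the start
```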

WARNING Browsers cache 301 redirections, so if you later redirect empty.html to another destination, your browser may still go to Google’s site until its cache is cleared.

Now go to your browser and type in the following address:

http://SERVER_IP/empty.html

You will end up on google.com.
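
Since the browser may remember a 301 redirection, it can be handy to check the rule from the command line as well. The sketch below assumes curl is installed; the helper name redirect_info is made up for this example. It only prints the status code and the Location header from the response.

```shell
# redirect_info: hypothetical helper; reads raw HTTP response headers on
# stdin and prints the status code and the Location target, if any.
redirect_info() {
    tr -d '\r' | awk '
        NR == 1                     { print "status: " $2 }
        tolower($1) == "location:"  { print "location: " $2 }
    '
}

# On a machine that can reach the server (SERVER_IP as in the text):
#   curl -sI http://SERVER_IP/empty.html | redirect_info
```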

Mod_Rewrite Example 2: Redirecting A URL To the Same Site

Now create a new HTML page, write1.html, in the document root folder:

cd /var/www/html
sudo nano write1.html

and write the following to it:

<html>
  <head>
    <title>Welcome</title>
  </head>
  <body>
    <h1>Welcome to my page write1.html!</h1>
  </body>
</html>

The page is now available in the browser at its normal address:

http://SERVER_IP/write1.html

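The next rule redirects users from old.html to write1.html. Add it to the .htaccess file, below the RewriteEngine On line:

RewriteRule ^old\.html$ /write1.html [R=301]

The leading slash makes the target a path relative to the site root. With a bare write1.html in a .htaccess file, Apache may build the redirect address from the file system path of the folder instead.

Type this address into the browser:

http://SERVER_IP/old.html

Because of the R=301 flag, the browser is redirected: the contents of page write1.html are shown and the address bar changes to http://SERVER_IP/write1.html.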

Mod_Rewrite Example 3: Rewrite All html Files To php Files, Internally

We have redirected one file to another file on the same site, and now we will rewrite all requests for files ending in .html to files ending in .php. Put this into the .htaccess file:

Options +FollowSymlinks
RewriteEngine on
RewriteRule ^(.*)\.html$ $1.php [NC]

Here is an explanation, character by character:

  • ^ anchors the start of the pattern, while
  • $ anchors its end.
  • .* in the parentheses accepts whatever text there is, and the parentheses capture it
  • \. after ) represents the dot itself (where \ serves as an escape character)
  • html is the ending
  • $1 in the second argument of the RewriteRule command takes everything captured by the parentheses in the first argument and joins it with the .php ending
  • [NC] is a flag meaning “No Case”: the pattern matches both upper-case and lower-case characters.
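
To see the capture and substitution at work without Apache, the same pattern can be run through sed. This is a rough stand-in, not what Apache does internally: the helper name html_to_php is made up, sed writes the captured text as \1 where Mod_Rewrite writes $1, and the [NC] flag is imitated with bracket expressions.

```shell
# html_to_php: hypothetical helper that applies the same pattern and
# substitution to a path with sed -E. The bracket expressions imitate
# the [NC] flag; sed's \1 plays the role of mod_rewrite's $1.
html_to_php() {
    printf '%s\n' "$1" | sed -E 's/^(.*)\.[Hh][Tt][Mm][Ll]$/\1.php/'
}

html_to_php 'index.html'        # index.php
html_to_php 'docs/About.HTML'   # docs/About.php
html_to_php 'style.css'         # style.css (no match, path left unchanged)
```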

The above rule will buy you some time to make a transition from a static, HTML-based site to a dynamic one based on PHP. Because the rewrite is internal, both visitors and the search engines still see and use the .html addresses; they could use the .php versions of the pages if they knew they existed.

Mod_Rewrite Example 4: Rewrite All html Files To php Files, Externally

The following code will redirect to the same site, this time via proper HTTP requests:

Options +FollowSymlinks
RewriteEngine on
RewriteRule ^(.+)\.html$ http://example.com/$1.php [R,NC]

These addresses will be seen in the browser so both the visitors and the search engines could bookmark them for further use.

Flag R in [R,NC], given without a status code, stands for temporary redirection, which has the HTTP status code 302.

Mod_Rewrite Example 5: Make an Incoming Parameter Look Like Directory

Please note that the contents of the parentheses in the previous example, (.+) were taken verbatim by Mod_Rewrite and put into the new address via parameter $1. If there were a second pair of parentheses, it would go to $2, the third pair of parentheses would go to $3 and so on. Instead of a link such as

http://example.com/download.php?section=games&file=mygame

we can produce a flat link like this:

http://example.com/files/games/mygame.zip

It is shorter, easier to remember and can be typed directly into the browser. Here is the RewriteRule:

Options +FollowSymlinks
RewriteEngine on
RewriteRule ^files/([^/]+)/([^/]+)\.zip$ /download.php?section=$1&file=$2 [NC]
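
The mapping from the flat link to the query string can be checked with sed before the rule goes live. The sketch below is an approximation under a few assumptions: the helper name files_to_download is made up, the pattern ends in \.zip$ (an escaped dot and an end anchor, slightly stricter than a bare .zip), and the [NC] flag is left out. Note that & must be escaped in a sed replacement, which is not needed in a RewriteRule.

```shell
# files_to_download: hypothetical helper that turns a flat path into the
# query-string form. \1 and \2 in sed correspond to $1 and $2 in the
# RewriteRule; \& is a literal ampersand in a sed replacement.
files_to_download() {
    printf '%s\n' "$1" |
        sed -E 's#^files/([^/]+)/([^/]+)\.zip$#/download.php?section=\1\&file=\2#'
}

files_to_download 'files/games/mygame.zip'   # /download.php?section=games&file=mygame
files_to_download 'files/games/a/b.zip'      # unchanged: [^/]+ cannot span a slash
```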

Mod_Rewrite Example 6: Three Input Parameters With Regular Expressions

The two most frequently used types of HTTP requests are GET and POST. The parameters of a GET request travel in the query string of the address, and a request with several parameters will look like this:

http://example.com/edit.php?category=newspaper&section=sports&detail=tennis

We want to transform it into this:

http://example.com/newspaper/sports/tennis

and here is the RewriteRule to do that:


RewriteRule ^(newspaper|magazine|site)/([^/.]+)/([^/.]+)$ edit.php?category=$1&section=$2&detail=$3

  • The first parentheses, (newspaper|magazine|site), give only three choices for category
  • The other two parentheses, ([^/.]+), are translated as follows: the brackets, [^/.], accept any single character except / and ., while + means one or more such characters. Each of these segments can therefore be of any length but may not contain a slash or a dot.
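
A quick way to get a feel for which addresses this pattern accepts is to test a few candidates with grep -E. This is a sketch only: the helper name check and the sample paths are made up for the example.

```shell
# The pattern from the RewriteRule above, tried against sample paths.
pattern='^(newspaper|magazine|site)/([^/.]+)/([^/.]+)$'

# check PATH: hypothetical helper that reports whether PATH matches.
check() {
    if printf '%s\n' "$1" | grep -Eq -- "$pattern"; then
        echo "$1: match"
    else
        echo "$1: no match"
    fi
}

check 'newspaper/sports/tennis'       # match
check 'blog/sports/tennis'            # no match: blog is not an allowed category
check 'newspaper/sports/tennis.html'  # no match: a dot is excluded by [^/.]
check 'newspaper/sports'              # no match: the third segment is missing
```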

Eliminating www. from the Address – Using RewriteCond

It is possible to apply rewrite rules only if certain conditions are met. It is a concept from programming languages: “if this is true, execute the rewrite rule that follows; if not, skip it”. This requires a new keyword, RewriteCond. A full explanation of RewriteCond is out of the scope of this article, but we will illustrate how it works.

Sites can use the www.example.com address or only example.com. The former stems from the early days of the Internet, but for SEO purposes, these two addresses appear as two different sites. Here is the code to eliminate www. from the address in the browser:

Options +FollowSymlinks
RewriteEngine on
RewriteCond %{HTTP_HOST} ^www\.(.*) [NC]
RewriteRule ^(.*)$ http://%1/$1 [R=301,NC,L]

RewriteCond tests the host name of the request, %{HTTP_HOST}. If it starts with www., the condition is true and the RewriteRule on the next line is applied; otherwise the rule is skipped. In the new address, %1 is whatever the parentheses in RewriteCond captured (the host name without www.) and $1 is whatever the parentheses in RewriteRule captured (the requested page). Pay attention to the flags here. R=301 means the redirection is permanent, which is good for search engines to know, and L states that this must be the last rewrite rule evaluated.
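
The interplay of the condition and the rule can be imitated in a few lines of shell. This is a simplified sketch, not what Apache runs: the helper name strip_www is made up, the host and path are passed in by hand, and the bracket expressions imitate the [NC] flag.

```shell
# strip_www HOST PATH: hypothetical helper that mimics the condition and
# the rule. sed's \1 stands in for %1 (the host without www.), and PATH
# is appended the way $1 is in the RewriteRule.
strip_www() {
    host=$1
    path=$2
    bare=$(printf '%s\n' "$host" | sed -E 's/^[Ww][Ww][Ww]\.(.*)$/\1/')
    if [ "$bare" != "$host" ]; then
        echo "301 -> http://$bare/$path"    # condition true, rule applied
    else
        echo "no redirect"                  # condition false, rule skipped
    fi
}

strip_www 'www.example.com' 'about.html'   # 301 -> http://example.com/about.html
strip_www 'example.com'     'about.html'   # no redirect
```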

What Else You Can Do Now

There are many other uses of Mod_Rewrite, for instance:

  • Protect your site from hotlinking and leeching images
  • Redirect your site to its /blog directory
  • Redirect nonexisting pages to your main site address
  • Add www. to every page of your site
  • Block specific IP addresses from your site
  • Use multiple domains in one root
  • Automatic translation of pages
  • Set up cookies and use them later for authentication purposes
  • Prevent user agents (bots) from eating bandwidth on your site

Dusko Savic is a technical writer and programmer.

duskosavic.com
