Trying to detect your application URI using DOCUMENT_ROOT? Read this first!

The application URI is the part of the URL that follows the host name (domain name) and leads to the home of your application. For example, for this blog, the URI is /blog/. If you are writing a PHP application to be distributed and installed in many environments whose configuration you know nothing about, you need to make it as generic as possible in how it handles environment parameters. One of the things you should not assume is the path where the application is going to be installed and, accordingly, the application URI.

I used to do the following in my applications to detect the application URI, and I thought that was generic enough to work on any environment:

1. function app_uri() {
2.     $doc_root = $_SERVER['DOCUMENT_ROOT'];
3.     $helper_path = dirname(__FILE__);
4.
5.     $lib_uri = str_replace($doc_root, '', $helper_path);
6.     // returning the app uri after removing 'lib'
7.     return substr($lib_uri, 0, -3);
8. }

I’d place the above function into a helper file and include it in all application pages. Calling app_uri() should then return the application URI, and it should, presumably, work in all environments, right? Wrong! Sorry!

First of all, let’s see how the above function works, and then see where it would fail. Line 2 above uses DOCUMENT_ROOT to retrieve the path to the main directory from which our web server serves pages. On most Debian systems, for example, this would return /var/www/html.

Line 3 would return the path to the directory containing the file that defines the app_uri() function. This might be something like /var/www/html/blog/lib, assuming the helper file is stored inside a subdirectory named ‘lib’. Now, retrieving the application URI is a simple matter of removing the DOCUMENT_ROOT from that path and removing the ‘lib’ subdirectory at the end, which is what lines 5 and 7 do, returning the precious /blog/ URI.
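To make the walkthrough concrete, here is a minimal sketch of the same string manipulation with the two paths passed in as arguments instead of being read from the server. The name app_uri_from() and its parameters are hypothetical and used only for illustration; the paths are the example values from above.

```php
<?php
// Hypothetical, testable variant of app_uri(): the paths are passed in
// rather than read from $_SERVER['DOCUMENT_ROOT'] and __FILE__.
function app_uri_from(string $docRoot, string $helperPath): string
{
    // Remove the document root from the helper path, leaving e.g. /blog/lib
    $libUri = str_replace($docRoot, '', $helperPath);
    // Drop the trailing 'lib' (3 characters), leaving e.g. /blog/
    return substr($libUri, 0, -3);
}

echo app_uri_from('/var/www/html', '/var/www/html/blog/lib'), "\n"; // prints /blog/
```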

This used to work beautifully, until it didn’t! As I mentioned above, we have no idea whatsoever about the environments in which our app is going to be deployed. Assuming that the above works everywhere was, as I painfully found out, an arrogant claim!

Some environments are configured so that the document root is a symbolic link. Maybe, to make things easier for users of a server, the server administrator decided to define a symbolic link named public_html in each user’s home directory that links to a directory in their name under /var/www. For example, let’s say our user is johndoe, and his home directory on the server is /home/johndoe.

When johndoe lists his home directory contents, he sees public_html and so he installs our app into that directory. So, is our app installed into /home/johndoe/public_html/blog? Well, not really. Since public_html is actually a symbolic link, the real path to our app is /var/www/johndoe/blog.

The problem with DOCUMENT_ROOT is that it doesn’t resolve real paths, so it would return /home/johndoe/public_html. On the other hand, __FILE__ does resolve the real path, so dirname(__FILE__) would return /var/www/johndoe/blog/lib. The document root no longer appears anywhere inside the helper path, so there is nothing for str_replace() to remove. Suddenly, our beautiful little function above that used to work elegantly no longer works!
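The failure can be reproduced without a real server by feeding the two example paths from the johndoe scenario through the same two string operations. This is only a sketch with hard-coded values standing in for what the server would report:

```php
<?php
// The example paths from the symlink scenario, hard-coded for illustration.
$docRoot    = '/home/johndoe/public_html'; // what DOCUMENT_ROOT reports
$helperPath = '/var/www/johndoe/blog/lib'; // what dirname(__FILE__) reports

// str_replace() finds no occurrence of $docRoot, so nothing is removed.
$libUri = str_replace($docRoot, '', $helperPath);
// Only the trailing 'lib' is stripped.
$appUri = substr($libUri, 0, -3);

echo $appUri, "\n"; // prints /var/www/johndoe/blog/ instead of /blog/
```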

I tried fixing this by applying realpath() to DOCUMENT_ROOT. Unfortunately, that failed on some server configurations as well.
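For illustration, a realpath()-based attempt might look roughly like the sketch below. The function name, its parameters, and the decision to return null when the paths do not line up are assumptions made for this example, not the original code. It resolves symlinks on both paths, assumes the helper sits in a direct subdirectory of the application home (such as ‘lib’), and refuses to guess if the helper directory is still not inside the document root.

```php
<?php
// Hypothetical variant of app_uri() that resolves symlinks on both sides.
// Returns null instead of a wrong URI when detection is not possible.
function app_uri_resolved(string $docRoot, string $helperDir): ?string
{
    $realRoot   = realpath($docRoot);
    $realHelper = realpath($helperDir);
    if ($realRoot === false || $realHelper === false) {
        return null; // one of the paths does not exist or is not accessible
    }
    $realRoot = rtrim($realRoot, DIRECTORY_SEPARATOR);

    // The helper directory must sit under the resolved document root.
    // This can still fail, for instance (an assumption, not from the post)
    // when the web server maps the URL to a directory outside the root.
    if (strpos($realHelper, $realRoot . DIRECTORY_SEPARATOR) !== 0) {
        return null;
    }

    // Path relative to the document root, e.g. /blog/lib
    $relative = substr($realHelper, strlen($realRoot));
    // Drop the helper's own directory and normalize to URL-style slashes.
    $appDir = str_replace('\\', '/', dirname($relative));

    return rtrim($appDir, '/') . '/';
}
```

Even with the null guard, this only reports that detection failed. It does not recover the URI, which is why it is not a real fix.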

So, after researching the above for hours, I came to the conclusion that the ONLY accurate way of detecting the application URI is not to try! Simply avoid DOCUMENT_ROOT and __FILE__ altogether, and write the URI into a config file:

// db parameters
...
// app uri
define('APPLICATION_URI', '/blog/');

The above line would be added to the config file by the setup script that runs the first time the app is installed on the server, and its value can be obtained from dirname($_SERVER['REQUEST_URI']), with a trailing slash appended. It seems this is indeed how popular open source PHP apps/frameworks retrieve the application URI: they don’t try to detect it from DOCUMENT_ROOT, they just save it as a fixed predefined config value. I didn’t find this explicitly mentioned anywhere I searched. I guess this is because people usually write about how to do things rather than how not to do them! So, I thought I’d write a post about this.
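As a rough sketch of that setup step, the helper below derives the URI from a request URI string and builds the line to write into the config file. The function names and the setup.php file name are hypothetical, and the sketch assumes the setup script lives in the application’s home directory. It also strips the query string first, since REQUEST_URI includes it.

```php
<?php
// Hypothetical setup-time helper: derive the application URI from a request
// URI, assuming the setup script lives in the application's home directory.
function detect_app_uri(string $requestUri): string
{
    // Keep only the path, dropping any query string (?step=1 and so on).
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path) || $path === '') {
        return '/';
    }
    // A path ending in a slash is already a directory;
    // otherwise strip the script name.
    $dir = substr($path, -1) === '/' ? $path : dirname($path);
    // Normalize separators (dirname() may return "\" on Windows) and
    // make sure the result ends with exactly one slash.
    return rtrim(str_replace('\\', '/', $dir), '/') . '/';
}

// Build the line the setup script would write into the config file.
function config_line(string $appUri): string
{
    return "define('APPLICATION_URI', " . var_export($appUri, true) . ");";
}

echo config_line(detect_app_uri('/blog/setup.php?step=1')), "\n";
// prints define('APPLICATION_URI', '/blog/');
```

In a real setup script the argument would be $_SERVER['REQUEST_URI'], and the resulting line would be written to the config file once.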

To sum up, don’t try to detect the application URI from DOCUMENT_ROOT. You’re doing it wrong this way, even if it seems to work! Save yourself the trouble: detect it once during setup using REQUEST_URI, save that to a config file, and read it from there from now on!

Published by Genedy


Join the Conversation

2 Comments

  1. Thanks first of all for your great work. Maybe this can also assist:
    Working with Include Paths
    Another issue when working with large applications is tracking the locations of files to include. Say
    you’ve organized all your common library code, including animal_functions.php, into a centrally
    located /home/joe/lib folder. You have a Web site in /home/joe/htdocs that contains the previously
    shown mouse.php file in the document root. How does mouse.php reference animal_functions.php?
    It could use an absolute path:

    include_once( "/home/joe/lib/animal_functions.php" );

    Alternatively, it could use a relative path:

    include_once( "../lib/animal_functions.php" );

    Either approach would work, but the resulting code isn’t very portable. For example, say the Web site
    needs to be deployed on a different Web server, where the library code is stored in a /usr/lib/joe
    folder. Every line of code that included animal_functions.php (or any other library files, for that
    matter) would need to be updated with the new path:

    include_once( "/usr/lib/joe/animal_functions.php" );

    To get around this problem, you can use PHP’s include_path directive. This is a string containing a list
    of paths to search for files to include. Each path is separated by a colon (on Linux servers) or a semicolon
    (on Windows servers).

    The PATH_SEPARATOR constant contains the separator in use on the current system.
    The paths are searched in the order they appear in the string. You can display the current include_path
    value (which is usually pulled from the php.ini configuration file) by calling the get_include_path() function:

    echo get_include_path(); // Displays e.g. ".:/usr/local/lib/php"

    In this example, the include_path directive contains two paths: the current directory (.) and
    /usr/local/lib/php.
    To set a new include_path value, use, you guessed it, set_include_path():

    set_include_path( ".:/home/joe/lib" );

    It’s a good idea to precede your include path with the current directory (.), as just shown. This
    means that PHP always searches the current directory for the file in question first, which is usually
    what you want.
    You’d usually set your include path just once, right at the start of your application. Once you’ve set the
    include path, you can include any file that’s in the path simply by specifying its name:

    1. Thanks for sharing this, Kim. I tend to use dirname(__FILE__) for includes to address include issues and relative paths. Your suggestion is insightful as well.
