Updating a Legacy PHP Web App Part 2: Understanding Your App

In part 1 of this series, I discussed the decisions behind not throwing the code away entirely. In this part, I will jump right into the technical steps to establish a stable working environment that you can consistently dig into every day:

  1. Just Get the Damn Thing Running

  2. Start Taking Notes

  3. Understand the Structure

Step 1: Just Get the Damn Thing Running

The sooner you can easily relate the screens you are seeing on your web browser to the files that correspond to these screens, the sooner you’ll be able to fix broken stuff! Don’t overthink your development environment’s requirements and your operating system and things like Docker. Just try to get to the login page on your localhost.

A good hint that you need to activate Apache mod_rewrite is the presence of an .htaccess file in the docroot.


This particular web app was running on a shared hosting service, on Apache talking to a MySQL database. Hey, I’ve set up Oracle installations on Windows 2000 since back when Oracle 8i was distributed on CD-ROMs. I can handle a MySQL install with a basic Apache server. Au contraire, mon ami. It’s not that this application was terribly complicated; it’s that without proper documentation, the only way to figure out how to get the login page to show up was to take it step by step. At first, I could not hit any URLs at all and was getting all sorts of redirect errors, until I figured out that I needed the old-school Apache mod_rewrite module enabled. Now I could hit URLs, and I was hoping for a simple login page or an error telling me it could not find a MySQL connection, but all I got was a white screen of death.
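
If you want a quick way to confirm the hint before touching the Apache config, a few lines of PHP can scan the .htaccess text for rewrite directives. This is only a sketch: the function name htaccess_uses_rewrite() and the sample rules are made up for illustration and are not taken from the app.

```php
<?php
// Hypothetical helper: reports whether .htaccess text contains mod_rewrite directives.
function htaccess_uses_rewrite(string $htaccess): bool
{
    foreach (preg_split('/\R/', $htaccess) as $line) {
        $line = trim($line);
        // Skip blank lines and comments.
        if ($line === '' || $line[0] === '#') {
            continue;
        }
        if (preg_match('/^(RewriteEngine|RewriteRule|RewriteCond|RewriteBase)\b/i', $line)) {
            return true;
        }
    }
    return false;
}

// Sample input invented for this example; it is not the app's real .htaccess.
$sample = "# front controller\nRewriteEngine On\nRewriteRule ^events$ events.php [L]\n";
var_dump(htaccess_uses_rewrite($sample));
```

In practice you would pass in file_get_contents() of the .htaccess in your docroot. A positive result only says the app expects mod_rewrite; enabling the module is still up to your Apache setup.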

I had to look at the code to see that this web site was looking for two very specific domain names: the custom localhost the previous developer put in his hosts file and the production domain. Luckily, I could configure both on my machine so that I could get some static content to load.

Localhost logic found in: docroot/inc/template.php

if(strpos(getcwd(),"/Users/Adam/")!==false) define('LOCALHOST',true);

Well, this isn’t going to work. My name is not Adam and, well, I’m not developing on Windows. Also, an example of misplaced logic: system bootstrapping in the template engine.
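
One possible replacement, sketched under my own assumptions, is to decide "am I local?" from an environment variable first and the request host second, so that nobody's home directory ends up in the code. The APP_ENV variable, the is_localhost() function, and the host list are hypothetical; only local.mywebsite.com comes from my own setup.

```php
<?php
// Sketch only: APP_ENV and the host list are assumptions, not part of the original app.
function is_localhost(?string $appEnv, string $host): bool
{
    // An explicit environment setting wins over any guessing.
    if ($appEnv !== null && $appEnv !== '') {
        return strtolower($appEnv) === 'local';
    }
    // Fall back to the host name, ignoring any port suffix.
    $hostname = strtolower(explode(':', $host)[0]);
    return in_array($hostname, ['localhost', '127.0.0.1', 'local.mywebsite.com'], true);
}

$env = getenv('APP_ENV');
if (!defined('LOCALHOST')) {
    define('LOCALHOST', is_localhost($env === false ? null : $env, $_SERVER['HTTP_HOST'] ?? ''));
}
```

The point of the design is that the check no longer depends on who is developing or where the code is checked out, and it can live in a bootstrap file instead of the template engine.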

My next step was to export the Production database to my local computer. Despite some weird issues with all of the HTML content stored in the cms table, and the need to avoid wasting time exporting tens of thousands of rows of user data, I was able to create a good enough copy to get started. The one catch was that I could not access the database without executing this command:

SET GLOBAL sql_mode=(SELECT REPLACE(@@sql_mode,'ONLY_FULL_GROUP_BY',''))

Forgive me for not remembering the exact error that occurred as I have not experienced it in more than one year. I wrote in the technical wiki that this was a required step in the database configuration. This brings me to my next step:
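
For context, ONLY_FULL_GROUP_BY has been part of the default sql_mode since MySQL 5.7.5, so older queries that select non-aggregated columns outside the GROUP BY clause tend to get rejected on a newer local install. I can't confirm that was my exact error, but it fits. If you would rather not change the server globally, here is a sketch of doing the same thing per connection from PHP. The function names are mine, the code assumes PHP 7.4+ and the mysqli extension, and it is not what the app actually did.

```php
<?php
// Returns the sql_mode string with one flag removed and no stray commas left behind.
function strip_sql_mode(string $sqlMode, string $flag = 'ONLY_FULL_GROUP_BY'): string
{
    $modes = array_filter(
        array_map('trim', explode(',', $sqlMode)),
        fn (string $m): bool => $m !== '' && strcasecmp($m, $flag) !== 0
    );
    return implode(',', $modes);
}

// Applies the relaxed mode to the current connection only (SESSION, not GLOBAL).
function relax_sql_mode(mysqli $db): void
{
    $row = $db->query('SELECT @@SESSION.sql_mode AS mode')->fetch_assoc();
    $mode = $db->real_escape_string(strip_sql_mode($row['mode']));
    $db->query("SET SESSION sql_mode = '{$mode}'");
}
```

Calling relax_sql_mode() right after connecting keeps the workaround inside the app, which is one less manual step to document for the next developer. The longer-term fix is to repair the offending queries.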

Step 2: Start Taking Notes

README

It can be very frustrating trying to get the damn thing running on your computer. Taking notes is usually the last thing we care about. As we inch towards progress, we often forget the steps we took to get there! Do yourself a favor and create a simple text file where you can jot down quick and reasonably legible notes. Eventually some of these notes might become part of your formal technical specs. (Yes, you should write those too.)

Deployment directions

You should also consider writing down how you would like to formalize deploying code for Test releases and Production releases. It might be as simple as pulling from Git into a temp directory on your remote server.

If you are planning on expanding the team or even taking a break from the project for a while, it’s these sorts of fundamental steps that differ between projects that are hard to pick up when you need to do something quick like a small bug fix.

Create a guide for a new developer

You do not need to memorize all of the details about the oddities of this web app. By writing it down in a wiki, you’ve given yourself a searchable tool and the gift of sharing. Even mundane things like the ssh command to the Production server or the URLs of the various environments will prove useful in the future. The less about this app you have to memorize, the better for everyone. See the screenshots and associated captions of documentation examples below:

Step 3: Understand the Structure

File structure, config, get this into source control

You’ve already dabbled around the app configuration to connect your database. I’m sure there are many other configuration variables. Is the config centralized? In my case, there was config for:

  • cookie domain.

  • MySQL connection.

  • localhost detection.

  • CMS images location (while CMS HTML is stored in the database and references relative image URLs, the path of the images ended up being plugins/Xinha-0.96.1/plugins/ExtendedFileManager/demo_images. If you see humor in this as well, we can be friends).

  • TEST_MODE detection (no, this is not for testing the app. We discovered this was for Adam’s experimental features).
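
If the answer to "is the config centralized?" is no, one place to aim for is a single function that builds every setting from the environment. The sketch below is purely illustrative: load_config(), the key names, the environment variable names, and the defaults are all my assumptions and do not reflect how the app was actually configured.

```php
<?php
// Hypothetical central config: every key name and default below is illustrative.
function load_config(array $env): array
{
    return [
        'cookie_domain' => $env['COOKIE_DOMAIN'] ?? 'localhost',
        'db' => [
            'host' => $env['DB_HOST'] ?? '127.0.0.1',
            'name' => $env['DB_NAME'] ?? 'app',
            'user' => $env['DB_USER'] ?? 'root',
            'pass' => $env['DB_PASS'] ?? '',
        ],
        'is_localhost' => ($env['APP_ENV'] ?? 'production') === 'local',
        'cms_image_path' => $env['CMS_IMAGE_PATH'] ?? 'uploads/cms',
        'experimental' => ($env['TEST_MODE'] ?? '0') === '1',
    ];
}

// getenv() with no arguments returns all environment variables as an array.
$config = load_config(getenv());
```

Passing the environment in as an array, instead of reading it inside the function, makes the config easy to exercise with fake values while you are still learning what each setting does.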

At this point, you want to document the directories where user content (pics, thumbnails) lives so you do not end up checking it into source control. You don’t want to check 1.6 GB of data into GitHub. That’s not fun. It’s time to move to GitHub.

Check it into GitHub and bam!

If you are still updating a text file, it’s time to move your documentation into the wiki, set standards for source control, branching and deployments. You might not be on this project as long as you think and it’s good to leave projects in better condition than you received them.

Example of directory structure

|- css
  | events.css
  | profiles.css
|- inc
  | events.php
  | profiles.php
  | template.php
|- templates
  | events.html
  | profiles.html
.htaccess
events.php
profile.php

Application Architecture

Once I found out how the configuration was coded, I naturally wanted to get deeper into the application code. I discovered that what seemed like a complete mess of 200 PHP files did actually have some form of MVC architecture behind it. In the root directory of the project folder, files with names like “events.php” and “profile.php” existed. Inside a folder called “templates/”, I found corresponding “events.html” and “profile.html” files. In the “inc/” folder, I found another set of “events.php” and “profile.php” files. Similarly, in the “css/” folder, I found files with the same names. Knowing that all hits to the web site touched files in the root directory of the project, and that most of the other project folders were restricted from serving content by the .htaccess file in the project root, I was able to determine that the root directory files served as endpoints, controllers, and view generators.

In Java, we would have called these classes things like EventsController or put them in a package of controllers, but don’t get me started with my Java nostalgia.

Hitting local.mywebsite.com:8001/events touched /events.php, whose source code pulled in ./inc/events.php (through the PATH constant) and /css/events.css. In the same file, the Template engine merged the output from the database with the HTML template file using PHP’s str_replace(). So I did find a semblance of the View of the MVC paradigm. /inc/events.php ended up being a combination of the Data Access Layer and the Service Layer for all things “events”-related. The original developer did have the forethought to separate the SQL from the HTML rendering, although this ended up being quite inconsistent.

Below is a slimmed-down example of how a list of events is populated. After spending quite a bit of time understanding how this app was built, I gained an appreciation for its elegance, given the lack of an official MVC framework.

<?php
define('PATH', '');

require_once PATH . "inc/template.php";
require_once PATH . "inc/util.php";
require_once PATH . "inc/events.php";
    
// Retrieve the template
$html_body = file_get_contents('templates/events.html');

// Pull out the template mask
$tpl_event_item = Util::template_segment($html_body, '<!-- EVENT_ITEM_START -->', '<!-- EVENT_ITEM_END -->');
    
// Data Access Layer
$result = Events::getUpcoming();
    
$event_list = "";
while ($row = mysqli_fetch_assoc($result))
{
   $tmp = $tpl_event_item;
   // Fill in the placeholder with this event's title
   $tmp = str_replace('{{EVENT_TITLE}}', $row['title'], $tmp);
   $event_list .= $tmp;
}

// Fill in the template: swap the single template item for the rendered list
$html_body = str_replace($tpl_event_item, $event_list, $html_body);
    
// Override styles
Template::style("
   .new__style
    {
      border-radius: 3px;
    }
");
    
// Include stylesheet
Template::css("/css/events.css");
Template::js("/include/a/js/file/here.js");
Template::script("console.log('output any JS you want here');");
    
// Send the HTML back to the browser, we're done here
Template::mkT();
    
?>

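The events example leans on Util::template_segment() without showing it. Here is my guess at what such a helper could look like, together with a row renderer that escapes values before they reach the HTML. Both functions are hypothetical stand-ins written for this post, not the app's real code, and the sample rows are invented.

```php
<?php
// Hypothetical stand-in for Util::template_segment(); the real implementation was not shown.
function template_segment(string $html, string $start, string $end): string
{
    $from = strpos($html, $start);
    if ($from === false) {
        return '';
    }
    $from += strlen($start);
    $to = strpos($html, $end, $from);
    return $to === false ? '' : substr($html, $from, $to - $from);
}

// Repeats the segment once per row, escaping values before they reach the HTML.
function render_rows(string $segment, array $rows): string
{
    $out = '';
    foreach ($rows as $row) {
        $title = htmlspecialchars($row['title'], ENT_QUOTES, 'UTF-8');
        $out .= str_replace('{{EVENT_TITLE}}', $title, $segment);
    }
    return $out;
}

$html = '<ul><!-- EVENT_ITEM_START --><li>{{EVENT_TITLE}}</li><!-- EVENT_ITEM_END --></ul>';
$item = template_segment($html, '<!-- EVENT_ITEM_START -->', '<!-- EVENT_ITEM_END -->');
$list = render_rows($item, [['title' => 'Meetup'], ['title' => 'Q&A']]);
echo str_replace($item, $list, $html), "\n";
```

The escaping step is my addition. The slimmed-down example drops $row['title'] straight into the markup, which is worth a follow-up note in the wiki if the real code does the same.
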
While there were plenty of other discoveries I made that baffled me (like how some of the template HTML files for core features were stored in the database), I was able to find enough good patterns in the code so that I could find a way forward and, more importantly, create follow-ups for code improvements (put it in the wiki!). Let’s start with “all templates should be HTML files checked into GitHub”.

Conclusion

By this stage of the code discovery process, I started feeling more confident that any mysteries I would encounter would be solvable. The next phase of the project setup was to create a stable and reusable local development environment. By choosing Docker, I set myself up for a possible future of using Docker in Production as well. See part 3 for how I tackled these next steps.