WordPress: globalise all the things

Debugging WordPress' post publishing process and the pitfalls of global variables

Date
25th May 2021

I will get this out of the way now. I don't like WordPress. It's old and obsolete, a pain to develop with, and should be replaced by more modern, purpose-built CMS solutions like Statamic and Craft CMS.

There are many demonstrable reasons why I hate it so much, and here is one of them.


I have recently built a WordPress plugin to create content on an external platform when a post is created (exact details are classified Top Secret until it is released). I did this by using the transition_post_status hook to capture when a post is published, and among other things, I load the excerpt of the post. I also used another hook (the_content) to add a link to the external resource to the post body, which would only be intended for use on the front end article pages.

Here is some pseudo-code for the two hooks:

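add_action('transition_post_status', function ($new_status, $old_status, $post) {

    // Only act when the post is moving to the "publish" status.
    if ($new_status !== 'publish') {
        return;
    }

    $excerpt = get_the_excerpt($post->ID);

    // No excerpt means there is nothing to send to the external platform.
    if (!$excerpt) {
        return;
    }

    create_external_content($post, $excerpt);

}, 10, 3); // Priority 10; accept all three arguments the hook passes.

add_filter('the_content', function ($content) {

    global $post;

    if (!$post) {
        return;
    }

    $link = generate_link_to_content($post);

    return $content . $link;

});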

You may have already spotted an issue here, but bear with me.

So, I'm basically capturing when the post is published, checking if there is an excerpt, and creating the external content if there is. Additionally, I'm creating a link to the external content which, importantly, requires me to have access to the $post inside the_content.


I had this code working well for several months until yesterday, when, on a new installation with a database backup imported, the status transition hook did not create the external content. The same code still worked on my original development installation.

I established that the reason it wasn't working on the newer installation was because get_the_excerpt was returning an empty string which meant I returned out of the function - but why was it empty? WordPress lets you set an excerpt for a post, but if you don't, it will fall back to the main content body, and the post I was creating did have content, so it should have been generating an excerpt from that.
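To make the fallback I was expecting concrete, here is a rough sketch in plain PHP. It is not WordPress's actual implementation (the real code also runs filters and strips shortcodes along the way), and sketch_excerpt() is a made-up name. The default of 55 words mirrors WordPress's default excerpt length; everything else is simplified.

```php
<?php
// Simplified stand-in for the excerpt fallback. Hypothetical helper, not a
// WordPress function: it only shows "manual excerpt first, else trim the body".
function sketch_excerpt(string $manual_excerpt, string $content, int $max_words = 55): string
{
    // A hand-written excerpt always wins.
    if ($manual_excerpt !== '') {
        return $manual_excerpt;
    }

    // Otherwise derive one from the body: drop tags, then split into words.
    $words = preg_split('/\s+/', trim(strip_tags($content)), -1, PREG_SPLIT_NO_EMPTY);

    // Short enough already: return it whole.
    if (count($words) <= $max_words) {
        return implode(' ', $words);
    }

    // Too long: keep the first $max_words words and mark the cut.
    return implode(' ', array_slice($words, 0, $max_words)) . '...';
}

$manual  = sketch_excerpt('Custom summary', '<p>Body text</p>');
$derived = sketch_excerpt('', '<p>One two three four</p>', 3);
$empty   = sketch_excerpt('', '');

echo $manual, "\n";  // Custom summary
echo $derived, "\n"; // One two three...
var_dump($empty);    // string(0) ""
```

In this sketch, the only way to get an empty excerpt is for both the manual excerpt and the content to be empty, which is why an empty result for a post with a body made no sense to me.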

I traced through the code to see how WordPress generated the excerpt, and found this line in /wp-includes/formatting.php:

$text = apply_filters( 'the_content', $text );

After this ran, $text was an empty string. This seemed to be the problem.

This code ran the content through my the_content hook function (as well as other functions registered on the same hook). I dumped out whether $post was set inside my hook callback - it turned out that while this was a full WP_Post object on my original installation, on the new one, it was null.

Confused as to why this would be, I started to debug the XHR requests WordPress was making when publishing the post, and noticed something odd. While both installations were sending a request to /wp-json/wp/v2/posts/<postId> when publishing the post, my original installation was then sending additional XHR requests to /wp-admin/post.php afterwards. When I dumped out the value of $excerpt, it was an empty string when being called via the first XHR request, but was the value I was expecting when being called via the second XHR request.

I now had several questions. Why was one installation sending these requests to post.php and the other wasn't? Why is $post globally available in one request and not the other? And if I can't reliably access $post inside the the_content hook, which I needed to be able to do, how would I generate the link to the external content?

I came to the obvious conclusion that each request did something different, and that one globalised $post while the other didn't. However, I still needed to establish why WordPress didn't always send both requests, and whether I could rely on the second request being made or should assume it wouldn't be. This would determine how I went about fixing the issue. I then realised that, ultimately, my code in the_content needed to change, as I couldn't guarantee that WordPress would send this second XHR request every time.

The problem I had was that on the front end I needed $post to be available (which it would be), but if it wasn't always going to be there during the publishing of a post, how was I supposed to get around this? This is the code again:

add_filter('the_content', function ($content) {
    
    global $post;

    if (!$post) {
        return;
    }

    $link = generate_link_to_content($post);

    return $content . $link;

});

I was returning early if $post wasn't set so it didn't try to generate the link... and that's when it dawned on me. The reason the code in /wp-includes/formatting.php returned an empty string was that I was returning nothing from the the_content filter.

All I had to change was this:

if (!$post) {
    return;
}

to this:

if (!$post) {
    return $content;
}

That was it. Problem solved. And it seems obvious in hindsight (as it always does) - the filter always needs to return content, and I was returning nothing, so all I had to do was return the content that was initially passed through if I couldn't do anything with it first.
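To see why a bare return is so destructive, here is a minimal sketch of a filter chain in plain PHP. It doesn't use WordPress at all: run_filters() is a hypothetical stand-in for apply_filters(), written on the assumption that each callback's return value simply becomes the input to the next one. The ' [link]' suffix and the hard-coded null $post are also just for illustration.

```php
<?php
// Hypothetical stand-in for apply_filters(): each callback receives the
// previous callback's return value. This is not WordPress code.
function run_filters(array $callbacks, $value)
{
    foreach ($callbacks as $callback) {
        $value = $callback($value);
    }

    return $value;
}

// Buggy callback: bails out with a bare return, which evaluates to null.
$buggy = function ($content) {
    $post = null; // pretend the global $post was not set on this request

    if (!$post) {
        return;
    }

    return $content . ' [link]';
};

// Fixed callback: hands the content back untouched when it can't add a link.
$fixed = function ($content) {
    $post = null; // same situation as above

    if (!$post) {
        return $content;
    }

    return $content . ' [link]';
};

$buggy_result = run_filters([$buggy], 'Hello world');
$fixed_result = run_filters([$fixed], 'Hello world');

var_dump($buggy_result); // NULL
var_dump($fixed_result); // string(11) "Hello world"
```

The buggy version throws away everything that came before it in the chain, so anything downstream that expects a string gets nothing to work with.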

It turns out that the additional XHR requests to /wp-admin/post.php were due to custom meta boxes on the post compose page. The initial XHR request uses the WP API to save the post, and the secondary calls are used to process the meta boxes. I hadn't realised that only the original installation had a custom meta box displaying, hence why it only ran these requests on that installation.


So, as a summary of the problem:

  • I was processing a post being published using the transition_post_status hook

  • Inside the hook callback I called get_the_excerpt()

  • This internally runs the the_content filter

  • I had a callback set up on this filter too, which checked if $post was globally defined

  • If it wasn't, I returned out early, instead of returning the $content that was passed through

  • Due to a meta box being present on the post compose page, an additional XHR request was made

  • Code in this request globalised $post before running the transition_post_status hook again

  • This meant the the_content callback didn't return early, and returned content as expected, masking the problem

  • When the meta box was later hidden, the second XHR request was never run

  • This meant $post was not available inside the the_content callback, and the early return bug was exposed, leading to an empty excerpt being returned from get_the_excerpt()

Over 5 hours of debugging later and here we are.

It was essentially a perfect storm of a meta box triggering another request that ran some code that just so happened to define a variable that my code was checking before returning from a function incorrectly, which it then didn't actually do because the variable was defined so the bug was never spotted. Stupid, right? But this sort of "pure dumb luck" of something just happening to be defined is one of the reasons I find WordPress so difficult to work with compared to modern frameworks.

Both requests update the post, which triggers the transition_post_status hook. With the WP API call, the status goes from draft to publish, and with the secondary calls, it goes from publish to publish. This in itself makes sense, as the post is being saved but the status is not changing.

The problem is that these two requests do completely different things before the hook is called, and the application is in a completely different state on each request. The most important line is from inside /wp-admin/post.php itself:

global $post_type, $post_type_object, $post;

This is why $post was only available in the second XHR request. Nothing equivalent happens during the WP API request.
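The effect is easy to reproduce outside WordPress. The following is a plain-PHP sketch with entirely hypothetical names: handle_rest_request() and handle_admin_request() stand in for the two XHR requests, and describe_post() stands in for any code that quietly depends on the global.

```php
<?php
// Plain-PHP sketch, no WordPress involved. All names are hypothetical.

// A function that, like my the_content callback, depends on a global.
function describe_post()
{
    global $post;

    return $post ? 'post ' . $post['id'] : 'no post';
}

// Stands in for the WP API request, which never sets the global.
function handle_rest_request()
{
    return describe_post();
}

// Stands in for wp-admin/post.php, which declares and populates the
// global before the shared code runs.
function handle_admin_request()
{
    global $post;
    $post = ['id' => 42];

    return describe_post();
}

// The alternative: take the dependency as a parameter, so the caller
// has to decide what to pass and nothing depends on the entry point.
function describe_post_explicit(?array $post)
{
    return $post ? 'post ' . $post['id'] : 'no post';
}

$rest     = handle_rest_request();
$admin    = handle_admin_request();
$explicit = describe_post_explicit(['id' => 42]);

echo $rest, "\n";     // no post
echo $admin, "\n";    // post 42
echo $explicit, "\n"; // post 42
```

The same describe_post() call gives two different answers purely because of which entry point ran first, whereas the explicit version behaves the same wherever it is called from.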

This is precisely why globally scoped variables are an absolute fucking nightmare. You never know when something will be there and when it won't. It means that for any code called within the transition_post_status hook (or any hook really), whether you call it directly or not (it could be 20 function calls deep), if anything checks whether $post is available globally, that code will behave completely differently between the two XHR requests, and in some circumstances that would be almost impossible to debug.

Don't get me wrong, this problem was ultimately caused by a bug in my code, but when you've got global variables floating around that are defined in one request and not another, I'm surprised this doesn't happen more often.