Debugging WordPress' post publishing process and the pitfalls of global variables
I will get this out of the way now. I don't like WordPress. It's old and obsolete, a pain to develop with, and should be replaced by more modern, purpose-built CMS solutions like Statamic and Craft CMS.
There are many demonstrable reasons why I hate it so much, and here is one of them.
I have recently built a WordPress plugin to create content on an external platform when a post is created (exact details are classified Top Secret until it is released). I did this by using the transition_post_status hook to capture when a post is published, and among other things, I load the excerpt of the post. I also used another hook (the_content) to add a link to the external resource to the post body, which would only be intended for use on the front end article pages.
Here is some pseudo-code for the two hooks:
add_action('transition_post_status', function ($new_status, $old_status, $post) {
    if ($new_status != 'publish') {
        return;
    }
    $excerpt = get_the_excerpt($post->ID);
    if (!$excerpt) {
        return;
    }
    create_external_content($post, $excerpt);
}, 10, 3); // priority 10, accept all 3 hook arguments
add_filter('the_content', function ($content) {
    global $post;
    if (!$post) {
        return;
    }
    $link = generate_link_to_content($post);
    return $content . $link;
});
You may have already spotted an issue here, but bear with me.
So, I'm basically capturing when the post is published, checking if there is an excerpt, and creating the external content if there is. Additionally, I'm creating a link to the external content which, importantly, requires me to have access to the $post inside the_content.
I had this code working well for several months, until yesterday, when on a new installation with a database backup imported, the status transition hook was not creating the external content. However, the same code still worked on my original development installation.
I established that the reason it wasn't working on the newer installation was that get_the_excerpt was returning an empty string, which meant I returned early from the function. But why was it empty? WordPress lets you set an excerpt for a post, but if you don't, it will fall back to the main content body, and the post I was creating did have content, so it should have been generating an excerpt from that.
I traced through the code to see how WordPress generated the excerpt, and found this line in /wp-includes/formatting.php:
$text = apply_filters( 'the_content', $text );
After this ran, $text was an empty string. This seemed to be the problem.
This code ran the content through my the_content hook function (as well as other functions registered on the same hook). I dumped out whether $post was set inside my hook callback - it turned out that while this was a full WP_Post object on my original installation, on the new one, it was null.
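To make the mechanics concrete, here is a minimal sketch of how a filter chain hands each callback's return value on to the next step. The run_filters function is a made-up stand-in for WordPress' real filter runner, not its actual implementation, and the callback mirrors the pseudo-code above. With no global $post, the bare return hands back null, which behaves like an empty string from there on.

```php
<?php
// Hypothetical stand-in for WordPress' filter runner: every callback
// receives whatever the previous callback returned.
function run_filters(array $callbacks, $value)
{
    foreach ($callbacks as $callback) {
        $value = $callback($value);
    }
    return $value;
}

// Simulate a request in which nothing has populated the global $post.
$GLOBALS['post'] = null;

$append_link = function ($content) {
    global $post;
    if (!$post) {
        return; // a bare return yields null, so $content is thrown away
    }
    return $content . ' [link]';
};

$text = run_filters([$append_link], 'Body of the post');
var_dump($text); // NULL
```

Whatever went into the chain is gone once a single callback forgets to return it.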
Confused as to why this would be, I started to debug the XHR requests WordPress was making when publishing the post, and noticed something odd. While both installations were sending a request to /wp-json/wp/v2/posts/<postId> when publishing the post, my original installation was then sending additional XHR requests to /wp-admin/post.php afterwards. When I dumped out the value of $excerpt, it was an empty string when being called via the first XHR request, but was the value I was expecting when being called via the second XHR request.
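For this kind of comparison it helps to log what state a hook callback is actually running in. The sketch below is one way to do that, not what I used at the time: describe_request_context and its output format are invented for illustration, while REST_REQUEST is the constant WordPress defines when it is serving a REST API request.

```php
<?php
// Summarise the state a hook callback is running in.
// The function name and output format are hypothetical.
function describe_request_context(): string
{
    // REST_REQUEST is only defined while WordPress serves a REST API request.
    $is_rest = defined('REST_REQUEST') && REST_REQUEST;

    // isset() is false both when the global is missing and when it is null.
    $has_post = isset($GLOBALS['post']);

    return sprintf(
        'rest=%s global_post=%s',
        $is_rest ? 'yes' : 'no',
        $has_post ? 'set' : 'missing'
    );
}
```

Calling error_log(describe_request_context()) at the top of each hook callback would show at a glance which request a given run belongs to and whether $post exists.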
I now had several questions. Why was one installation sending these requests to post.php and the other wasn't? Why is $post globally available in one request and not the other? And if I can't reliably access $post inside the the_content hook, which I needed to be able to do, how would I generate the link to the external content?
I came to the obvious conclusion that each request did something different and one globalised $post while the other didn't. However, I needed to establish why it didn't always send both requests, and whether I should be able to rely on the second request being made, or whether I should assume it won't be. This would determine how I went about fixing the issue. I then realised that ultimately, my code in the_content needed to be changed as I couldn't guarantee that WordPress would send this second XHR request all the time.
The problem I had was that on the front end I needed $post to be available (which it would be), but if it wasn't always going to be there during the publishing of a post, how was I supposed to get around this? This is the code again:
add_filter('the_content', function ($content) {
    global $post;
    if (!$post) {
        return;
    }
    $link = generate_link_to_content($post);
    return $content . $link;
});
I was returning early if $post wasn't set so it didn't try to generate the link... and that's when it dawned on me. The reason why the code in /wp-includes/formatting.php returned an empty string was because I was returning nothing from the the_content filter.
All I had to change was this:
if (!$post) { return; }
to this:
if (!$post) { return $content; }
That was it. Problem solved. And it seems obvious in hindsight (as it always does) - the filter needs to always return content, and I was returning nothing, so all I had to do was return the content that was initially passed through if I couldn't do anything with it first.
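Put together, the corrected callback could look something like the sketch below. It is written as a named function so it can be called directly, and generate_link_to_content is stubbed out with a made-up URL purely so the example runs on its own; the real plugin's link builder is not shown here.

```php
<?php
// Hypothetical stub standing in for the plugin's real link builder.
function generate_link_to_content($post): string
{
    return ' <a href="https://example.com/external/' . $post->ID . '">View externally</a>';
}

// The filter callback, with the early return fixed.
function append_external_link($content)
{
    global $post;
    if (!$post) {
        // Nothing to add, but never drop what we were given.
        return $content;
    }
    return $content . generate_link_to_content($post);
}

// In the plugin this would be registered with:
// add_filter('the_content', 'append_external_link');
```

With $post missing, the content now passes through untouched instead of vanishing, so anything further down the chain (such as the excerpt generation) still has something to work with.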
It turns out that the additional XHR requests to /wp-admin/post.php were due to custom meta boxes on the post compose page. The initial XHR request uses the WP API to save the post, and the secondary calls are used to process the meta boxes. I hadn't realised that only the original installation had a custom meta box displaying, hence why it only ran these requests on that installation.
So, as a summary of the problem:
I was processing a post being published using the transition_post_status hook
Inside the hook callback I called get_the_excerpt()
This internally runs the the_content filter
I had a callback set up on this filter too, which checked if $post was globally defined
If it wasn't, I returned out early, instead of returning the $content that was passed through
Due to a meta box being present on the post compose page, an additional XHR request was made
Code in this request globalised $post before running the transition_post_status hook again
This meant the the_content callback didn't return early, and returned content as expected, masking the problem
When the meta box was later hidden, the second XHR request was never run
This meant $post was not available inside the the_content callback, and the early return bug was exposed, leading to an empty excerpt being returned from get_the_excerpt()
Over 5 hours of debugging later and here we are.
It was essentially a perfect storm of a meta box triggering another request that ran some code that just so happened to define a variable that my code was checking before returning from a function incorrectly, which it then didn't actually do because the variable was defined so the bug was never spotted. Stupid, right? But this sort of "pure dumb luck" of something just happening to be defined is one of the reasons I find WordPress so difficult to work with compared to modern frameworks.
Both requests update the post, which calls the transition_post_status hook. With the WP API call, the status goes from draft to publish, and with the secondary calls, it is going from publish to publish. This in itself makes sense as the post is being saved but the status is not changing.
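Since the hook fires for publish to publish saves as well, a callback that only checks the new status will run on every one of them. If the intent is to react only the first time a post goes live (an assumption on my part for this sketch, and is_first_publish is a made-up helper name), comparing both statuses is one way to tell the two cases apart:

```php
<?php
// True only when a post moves into "publish" from some other status.
function is_first_publish(string $new_status, string $old_status): bool
{
    return $new_status === 'publish' && $old_status !== 'publish';
}

var_dump(is_first_publish('publish', 'draft'));   // bool(true)
var_dump(is_first_publish('publish', 'publish')); // bool(false)
```

Inside a transition_post_status callback this would replace the plain check on $new_status.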
The problem is that these two requests do completely different things before the hook is called, and the application is in a completely different state on each request. The most important line is from inside /wp-admin/post.php itself:
global $post_type, $post_type_object, $post;
This is why $post was only available in the second XHR request; nothing equivalent happens during the WP API request.
This is precisely why globally scoped variables are an absolute fucking nightmare. You never know when something will be there and when it won't. It means that in any code called within the transition_post_status hook (or any hook really), whether you call it directly or not (it could be 20 function calls deep), if anything checks whether $post is available globally, that code will behave completely differently between the two XHR requests, and in some circumstances that would be almost impossible to debug.
Don't get me wrong, this problem was ultimately caused by a bug in my code, but when you've got global variables floating around that are defined in one request and not another, I'm surprised this doesn't happen more often.