How to EASILY and SIGNIFICANTLY reduce WordPress comment SPAM

“I’m a friend of Sarah Connor. I was told she was here. Could I see her please?”

“no”

Let’s face it. Sometimes comment spam can be a real issue. We’ve all logged into a client’s site (or even our own) only to see hundreds of pending and unapproved comments waiting in the queue. Now you have to go through and delete them all, because the truth is, they are almost certainly all spam.

If only there was a way to keep comments open so that people can still leave replies on your blog posts, but stop those darned bots! Well, I’m here to spread the good news.

Disable comments on attachments

The first thing you can do is something that I can safely recommend to just about everyone. In fact, it is my belief that this should be the default behaviour on WordPress. Disable attachment comments.

I’m sure that you’ve noticed that the majority of your spam comments aren’t even on your blog posts, but are instead people commenting on… your attachments somehow. Like, how is this even possible? Your attachments aren’t pages? How are spammers even finding these attachment pages to leave comments?

What are attachment pages?

Well, attachments aren’t pages. Not directly, at least. If you go to your Media Library (./wp-admin/upload.php) and select an attachment, you will see in the lightbox a link to “View attachment page”. That’s right, WordPress creates an attachment page for each and every upload on your site and associates that attachment with a unique ID, just like how every post, page, comment, etc. gets an ID.

The main purpose for this is so that each attachment can have its own template and be a unique page. Using this, you could create an art site, or upload designs or PDFs so that teammates can review and leave replies – that kind of thing. There are also SEO benefits since these pages can be crawled and indexed by search engines, making it easier for them to find attachments and images on your site.

Why they are a problem

But here is the thing. I’ve NEVER needed or wanted this functionality. Most of you reading this right now might not even be aware that attachment pages even exist. So not only do these pages attract spam like the plague, but now you might be scared.

Wait, that super secret file I uploaded to my site might be found by random people or even indexed by Google?

If your site is set up to announce attachment pages in any way, then yes! That secret PDF you uploaded might not be so secret. One of the biggest culprits to look out for is SEO plugins that handle creating sitemaps. Make sure that attachments are EXCLUDED from your sitemaps. This should stop Google from indexing them, and should also make them harder for both bots and people to find. But that’s not enough, so let’s talk about how we can significantly reduce spam AND increase security.

How to stop WordPress attachment pages

There are two parts to this solution. The first is to stop access to the actual attachment pages themselves. The second is to automatically stop comments on attachment pages (Discussion: Allow comments, comments open).

Before we begin, you will need to back up your theme’s functions.php, since this is the file we will be adding our code to. You MUST do this, because if you make a mistake adding the following code, your site will crash and you’ll need to restore the backup file to try again.

Deny access to attachment pages

// Remove attachment URLs
// This will remove "View attachment page" on Media Library pages
function hdplugins_remove_attachment_link(){
	return "";
}
add_filter("attachment_link", "hdplugins_remove_attachment_link", 10, 2);

// Serve a standard 404 page
// instead of the attachment page
function hdplugins_stop_attachment_pages()
{
	if (is_attachment()) {
		global $wp_query;
		$wp_query->set_404();
		status_header(404);
	}
}
// template_redirect is an action, so register it with add_action
add_action("template_redirect", "hdplugins_stop_attachment_pages");

The first function, hdplugins_remove_attachment_link, only removes the “View attachment page” links and does not stop people from accessing the attachment pages themselves.

The second function, hdplugins_stop_attachment_pages, is the function that checks if we are on an attachment page, and returns a standard 404 instead.

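The above will stop anyone and anything from accessing the WordPress attachment pages, including search engines. Keep in mind that this only blocks the attachment page; the uploaded file itself can still be reached by anyone who knows its direct URL.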

Close comments on WordPress attachment pages

Just because we have denied access to the pages themselves doesn’t mean that we have completely eliminated comment spam for attachments. This is because most bots don’t actually visit the page and instead send POST requests directly to your site. So if a bot already somehow knows the ID of an attachment (for example, because it is still listed in your sitemap file), then it can still post a comment to your site by passing that ID as the page, never needing to even go to the attachment page.

So let’s put a stop to that.

// stop comments on attachments
function hdplugins_filter_media_comment_status($open, $post_id)
{
	$post = get_post($post_id);
	// get_post() can return null, so check before reading post_type
	if ($post && $post->post_type === 'attachment') {
		return false;
	}
	return $open;
}
add_filter('comments_open', 'hdplugins_filter_media_comment_status', 10, 2);

The above function runs whenever WordPress checks whether comments are open for a post, including when a comment is submitted but before that comment is accepted. It is meant to make sure that the current page/post/attachment/whatever is allowed to have comments.

What we are doing is checking to see if the ID is an attachment page, and if it is, we return false to indicate that “nope, you cannot post comments here, buddy”.

How to protect regular comments

The above code to disable attachment pages and comments on attachments will make a HUGE difference for most people, and is something I recommend adding to every WordPress site – but how can you stop spam comments on regular posts?

Well, there are lots of things you can do, but one thing I avoid whenever possible is adding a CAPTCHA. They are great at stopping automated spam, but aren’t the easiest things to implement on your own, and my own stance is to avoid doing things that piss your users off. I know; controversial.

Adding a honeypot

What do bots and spies have in common? Neither can resist a good Honeypot.

What is a honeypot?

A honeypot is a hidden field that humans can’t see, but bots can. The idea is that since only bots can see this hidden field, only bots will fill it out. Then, on the server-side, we can check the comment data to see if this field has any data. If it does, we can reject the comment as a no good delinquent bot.

How to add a honeypot to your comments

The first thing we need to do is add the hidden field to WordPress’ comment form. We can do this by using the comment_form_default_fields filter. Keep in mind that WordPress only shows these extra fields to visitors who are not logged in.

Feel free to use whatever conventions you want, but some general advice is that you want the new field to be something generic that a bot is more likely to fill out.

function hdplugins_honeypot($fields)
{
	$fields["City"] = '<p class="comment-form-city"><label for="city">City <span class="required">*</span></label> <input id="city" name="city" type="text" value size="30" maxlength="245" required="required" autocomplete="off" tabindex="-1" /></p>';
	return $fields;
}
add_filter('comment_form_default_fields', 'hdplugins_honeypot');

The above code will add a new field called “City” to the comment form. The field is marked as required to make it far more likely that a bot will fill it out. Also note that I set tabindex to -1. This is because we don’t want users accidentally tabbing onto our secret field.

You also need to beware of autocomplete (especially on dumb Safari). Since I named this example field City, it’s possible that if a user autofills your comment form, City will be autofilled too. To avoid this, you will need to give the field and label a more unique name. Just note that doing this will also reduce the chances that a bot fills it out. I have autocomplete="off" in the field, but Safari ignores that because Apple gotta be Apple.

So now let’s detect when a comment has been submitted, and check if the City field has been filled out.

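function hdplugins_preprocess_new_comment($comment)
{
	// The extra fields are only shown to logged-out visitors,
	// so skip the check for logged-in users
	if (is_user_logged_in()) {
		return $comment;
	}
	// $_POST keys are case-sensitive and must match name="city" on the input
	if (!isset($_POST['city'])) {
		// Comment was sent directly instead of through the form
		die('Go away spam');
	}
	if ($_POST['city'] !== "") {
		// The honeypot was filled out
		die('Go away spam');
	}
	return $comment;
}
// preprocess_comment is a filter, so register it with add_filter
add_filter('preprocess_comment', 'hdplugins_preprocess_new_comment');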

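Here we are hooking into the WordPress filter that runs when a new comment is about to be posted, and checking to see if the City field has any data. If it does, we reject the comment. We also reject the comment if the field is missing entirely, since that means it was not sent through our form.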

We’re almost done! We have added a new field, and are checking to see if that field was filled out when a comment is posted. But we still have two problems. The first is that the new City field is visible to humans, and the second is that the field is required meaning that humans have to fill it out in order to submit the form!

Let’s start with hiding the field from humans first.

A simple method would be to add some CSS to the .comment-form-city class to hide it with display:none;. This will hide it from humans perfectly! The issue is that some bots are too smart for that and will know that the field is hidden.

So let’s get creative.

.comment-form-city {
	opacity: 0;
	pointer-events: none;
	position: relative;
	z-index: -100;
	text-indent: 100%;
	white-space: nowrap;
	overflow: hidden;
}

The above CSS goes hard. We set pointer-events to none to stop click events. Depending on the design of your form, there may still be ugly “empty” space where the field is supposed to be. If this is the case for you, you can also add height: 1px, just note that that may increase the chances of a bot recognizing this as a trap.

Now we need to deal with the whole required thing. For this, we will use JavaScript.

There are actually several methods we could use here. We could:

  • Detect when a user submits the form, and before submitting, set the City field to required = false
  • Detect when a user types in one of the other fields, and use that to set required to false on the City field
  • Use a time-based enable. Bots usually submit forms fast. For this example, I am going to use 15 seconds as the limit.

I’m going to use a combination here.

function hdplugins_honey_js(interval = 1000) {
	setTimeout(() => {
		const HP = document.getElementById("city");
		const comment = document.getElementById("comment");
		// Stop if this page has no comment form or no honeypot field
		if (!HP || !comment) {
			return;
		}
		// Do we have comment data yet?
		if (comment.value.length > 5) {
			// good. we have data.
			HP.required = false;
		} else {
			// recheck for data every second
			hdplugins_honey_js();
		}
	}, interval);
}
// wait 15 seconds before the first check
hdplugins_honey_js(15000);

Note that the above JavaScript is just an example to get the idea across. It would be better to add an input or change event listener to the comment field and use that to set a variable, instead of rechecking every second like I am doing in the above example. You could also ignore the whole comment.value.length part if you want and just set HP.required to false after the 15 seconds.

What about a reverse honeypot?

A regular honeypot works by trying to get bots to add content where they shouldn’t. A reverse honeypot is the opposite in that this time we expect there to be data.

For example, in the honeypot above, we add a hidden field, and on comment submit we check to see if this field was filled out. If it was, we know it was a bad bot.

However, what if we add a hidden field, and pre-fill it with our own data (such as giving the City field a value of “HDPlugins”)? Then on form submission we A) make sure that the data exists, and B) that the data wasn’t modified. This is actually my preferred method, as it still catches almost all bots, while being the least intrusive for users.

The idea is that we can stop spam by checking in preprocess_comment if A) our custom field has data, and B) if the data matches what we want it to be.

If the field has no data, then we know that the comment did not originate from our form. And if the data was changed (i.e. it does not equal “HDPlugins”), then we know that a bot must have changed it. Easy peasy!
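
To make the idea concrete, here is a minimal sketch of how a reverse honeypot could look. It reuses the City field and the “HDPlugins” value mentioned above, but the function and constant names are made up for this example, and it assumes that logged-in users can be trusted (WordPress does not show them the extra field anyway). Treat it as a starting point rather than a finished solution, and use it instead of the regular honeypot code, not alongside it, since both use the same field.

```php
<?php
// Leave out the opening tag above when pasting into functions.php.

// Sketch of a reverse honeypot. The field name and value come from the
// example in the text; the constant and function names are hypothetical.
const HDPLUGINS_RH_FIELD = 'city';
const HDPLUGINS_RH_VALUE = 'HDPlugins';

// True only when the field A) exists and B) still holds our pre-filled value.
function hdplugins_reverse_honeypot_is_valid(array $post_data): bool
{
	return isset($post_data[HDPLUGINS_RH_FIELD])
		&& $post_data[HDPLUGINS_RH_FIELD] === HDPLUGINS_RH_VALUE;
}

// Add the pre-filled field. It is not marked as required,
// so no JavaScript is needed to let humans submit the form.
function hdplugins_reverse_honeypot_field($fields)
{
	$fields[HDPLUGINS_RH_FIELD] = '<p class="comment-form-city"><label for="city">City</label> '
		. '<input id="city" name="' . HDPLUGINS_RH_FIELD . '" type="text" value="' . HDPLUGINS_RH_VALUE
		. '" size="30" autocomplete="off" tabindex="-1" /></p>';
	return $fields;
}

// Reject comments where the field is missing or was modified.
function hdplugins_reverse_honeypot_check($comment)
{
	// Assumption: logged-in users are trusted and never see the field.
	if (is_user_logged_in()) {
		return $comment;
	}
	if (!hdplugins_reverse_honeypot_is_valid($_POST)) {
		wp_die('Go away spam', 'Comment rejected', array('response' => 403));
	}
	return $comment;
}

// Register the hooks only when WordPress is loaded.
if (function_exists('add_filter')) {
	add_filter('comment_form_default_fields', 'hdplugins_reverse_honeypot_field');
	add_filter('preprocess_comment', 'hdplugins_reverse_honeypot_check');
}
```

Because the field keeps the comment-form-city class, the CSS from earlier will still hide it from humans. One thing to watch out for: pingbacks and trackbacks also go through preprocess_comment and never include the field, so this sketch rejects them as well. If you rely on those, you would need to let them through.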
