Combat Comment Spam
From Textbook
WARNING! Content may need revised to the current release -- use with caution!
Starting in December 2005 some spammers have caught up with textpattern's built-in spam prevention methods for the first time. In reponse the 4.0.3 release has brought significant improvements with respect to combatting spam, which are detailed in the rest of this page.
Contents |
Built-in Methods
Forced Preview
Built-in methods won't be discussed in detail here, but a few you should know about as a user. Forced Preview is a feature that some people try to get rid of, without realizing the benefits. It is not only the extra step of "clicking the preview (a bot wouldn't care much about that anyhow), but things going on under the hood as well, like creating a valid nonce and checking it after submit. Via the nonce, which in a way also acts similar to a session-id, it is also possible to "connect" different pageviews to the same user, which is also necessary/helpful for anti-spam-plugins. It also keeps anti-spam-resource-usage limited to fewer pages, like only preview and submit, instead of every pageview to an individual article with a comment-form (i.e. processing-time that go into generating captchas, nonces and possibly other measures as well).
Pre-Moderation
Another tool which is sometimes overlooked is (pre-)moderation (which admittedly has a lot more effect on user-experience). Submitted comments will not be immediately visible and have to be manually approved. Bots that beat the forced preview, already use a "dialogue" with your site to submit the comment, they will often check whether their spam-message became visible right away and use that information for their future actions.
Finite Life Span for Comments
Also automatically turning off comments after a period of x Weeks has benefits. You set the week duration (a maximum of 6) in the Preferences Subtab panel. Spammer-tools will often look for certain keywords or heavily linked pages via search-engines to find targets. If you don't have a Comment form on the page, such tools will immediately turn and move on.
Other Comment Security Options
Further options are banning ips, using blacklist-servers for ips and bulk-banning.
Plugins
- asy_captcha - CAPTCHA plugin, Image verification for comments. (optimized for wide compatibility, does not require GD or TTF support)
- mrw_spamkeywords_urlcount - The number of urls in the comment-text is checked against a soft limit (-> moderation queue) and a hard limit (-> marked as spam). Also allows to define blacklisted words.
- asy_stopdude - totally transparent to users. Specifically targetted against the current spam-bots/tools.
- EBL_TXPAkismet - uses Akismet to filter comments.
Writing/Porting an Anti-Spam Plugin
If you are interested in writing an Anti-spam plugin for Textpattern, you are encourages to subscribe to the mailing lists (txp-plugin or txp-dev). An example plugin was posted to those lists and is still available in the archives, including some discussions about it.
For reference here are the commands that will be interesting to you:
// This will call your function before textpattern saves a comment
// So you can analyse the submitted data
register_callback('abc_myfunction','comment.save');
// This will call your function and add the output (use return $mystuff) to the
// comment-form, right after the msg-textarea. This is optional and basically a
// way to immediately have the plugin working. You should offer tags for the user
// to add to his form, and then check for them, before returning anything from this
// callback
register_callback('abc_myotherfunction2','comment.form');
// This will return you the form-variables in an associative array.
// Use this rather than accessing $_POST/$_REQUEST directly.
$form_array = getComment();
// With the following two lines you can tell textpattern if you your
// plugin found out something noteworthy about comment
$evaluator =& get_comment_evaluator();
$evaluator -> add_estimate(SPAM, 0.6);
// add_estimate() takes two values, the first must be one of the following constants:
// SPAM, MODERATE, VISIBLE, RELOAD
// the second value is a number between 0 and 1 which expresses the
// probability/certainty of your estimate. Further examples:
$evaluator->add_estimate(SPAM, 0.3);
$evaluator->add_estimate(VISIBLE, 1);
// If you want to use key-pairs, hidden values etc. you can use the nonce-value and the
// corresponding secret value. The nonce is passed in the form, the secret is only in
// the DB. When checking a submitted form, use the values returned from getComment()
$nonce = getNextNonce();
$secret = getNextSecret();
Multiple plugins can be active at the same time and each will add their estimates. Textpattern will then decide on the following actions:
- mark as visible => comment will be instantly visible
- mark as unmoderated => comment will be pending moderation
- mark as spam => comment will be saved, but marked as spam
- reload the preview page => give the user another try
The administrator is later free to correct the comment-status from the comment-tab. Note, that the commenter will see a notification, that his comment is pending moderation for both unmoderated and spam. (The reason is not to leak information to spammers about what's happening.)
In some cases you may be absolutely sure that the comment is undeniably spam or otherwise unwanted. In that case, you can "kill" off the page-request right away and display an error message like this:
txp_die('Unable to record the comment. Please contact the publisher of this site, if this problem persists.'
,'412 Precondition failed');
Execution of the whole script will stop immediately after showing the message. (The error-page can be customized by the user by creating a 412 error-page).




