chipple

Blog - Movable Type NoHarvester plugin


A popular way for comment spammers to get their job done is to automatically harvest comment forms and forward the data to zombie computers that do the actual spamming. If you're getting a lot of spam comments on the same blog entries, you're probably a victim of this method.

This Movable Type plugin makes it nearly impossible for zombie nodes to post comments to entries harvested by another computer. Extra hidden values are added to all comment forms. One of those values is a server-computed key that is different for each entry and each user, and that cannot be faked or reused by a different computer.
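
As a rough illustration of the idea (not the plugin's actual code, which is written in Perl), a per-entry, per-visitor key could be derived by hashing a secret salt together with the entry ID and the visitor's IP address. The function names, the field order and the choice of HMAC-SHA256 below are all assumptions made for this sketch.

```python
import hashlib
import hmac


def make_auth_check(salt: str, entry_id: int, remote_ip: str) -> str:
    """Return a hex key tied to one entry and one client IP (hypothetical scheme)."""
    message = f"{entry_id}:{remote_ip}".encode("utf-8")
    return hmac.new(salt.encode("utf-8"), message, hashlib.sha256).hexdigest()


def is_valid(salt: str, entry_id: int, remote_ip: str, submitted_key: str) -> bool:
    """Recompute the key on the server and compare it with the submitted one."""
    expected = make_auth_check(salt, entry_id, remote_ip)
    return hmac.compare_digest(expected, submitted_key)


# The page is rendered for one visitor (here, the harvester)...
key = make_auth_check("my-secret-salt", 42, "203.0.113.10")

# ...so the key is accepted from that address, but not from a zombie elsewhere.
print(is_valid("my-secret-salt", 42, "203.0.113.10", key))
print(is_valid("my-secret-salt", 42, "198.51.100.7", key))
```

Because the key has to be computed when the page is served, the form must live on a page that runs server-side code for each request.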


For the active spam fighter, the plugin also adds "whois" and "report" links to the junk comment editing form in Movable Type's back-end, allowing easy reporting of non-harvester spam. Many providers are more than willing to shut down abusive users. Also, a message like "All spam will be reported" near your comment forms may discourage some human spammers with half a brain.

According to Six Apart's ProNet Plugin Survey, 100% of NoHarvester users think that this plugin should be included in the core of Movable Type! That's how good it is!

A description in Japanese is available here. (Thank you, oscar.)

Features

Requirements


Download

NoHarvester plugin version 1.2.1
Tested on Movable Type 3.3, 3.4, 4.0, 4.3 (should work with 3.2, and maybe with some earlier versions)

NoHarvester_1_2_1.zip (7K)

Movable Type 3.2 (or earlier) users must download the plugin BigPAPI version 1.04 and put it in the plugins or NoHarvester folder.
BigPAPI_1_04.zip (3K)

Installation and setup

Warning: Please follow installation steps carefully, otherwise legit comments can be blocked by this plugin.

  1. Place the NoHarvester folder inside of your Movable Type installation's plugins folder (typically /cgi-bin/mt/plugins/)
  2. Go to the plugins settings in System Overview (there are no blog-level settings)
  3. Follow instructions to add required fields to your CGI and PHP (static) templates
    (This is not necessary for CGI templates that use <MTCommentFields> to output the form.)
  4. Optionally, for increased security, specify an original "salt" and save changes
  5. Rebuild all individual archives or others that contain comment forms, and wait a little while (in case users might be in the process of posting comments)
  6. Again in the plugin settings, check the "Enable blocking" box, and save changes

NoHarvester is now operational. Try posting a comment to your own blog to make sure that everything works correctly. If there's a problem, like the "Spam isn't welcome here" message (which shouldn't occur if instructions above were followed), try rebuilding.

User agreement

This software's author will not be held responsible if data is lost or damaged as a result of its use.

Version history

Suggestions and comments

Suggestions to improve this plugin are welcome.

Update notifications

To be notified of the latest updates to this plugin and other great plugins, please install MT Plugin Network.


Logo created from "No Rockets, No Ray Guns, No Robots!" photo/artwork by The Rocketeer

Posted on September 26, 2006 at 12:00




Comments

I am suffering from exactly the problem you're describing. So this plugin will be a huge help! Thanks!

Posted by Brad on September 27, 2006 at 10:20


Great! Let me know how it works out for you!
You will have to make your individual archives PHP first though, in order to use it.

Posted by Patrick on September 27, 2006 at 10:24


Ooohh, you're right. Hmmm...OK, maybe this is more than I can get into at work. Maybe I'll wait until I get home tonight...

Posted by Brad on September 27, 2006 at 11:03


OK, just so I'm clear on this, we have to be using MT's generated CGI. Can we not manually add tags to our raw MT generated HTML files?

Posted by Jake on September 28, 2006 at 02:08


Unfortunately the concept can't work with static HTML files, but it will work on PHP files (created from MT's static templates).

The reason this plugin can make it impossible for spammers to use zombie computers across the planet is that the key is created dynamically from the client's IP address. That's why it can't work with static HTML.

So, as long as your server supports PHP, you could change the extension of your files to "php" (Settings > Publishing > File Extension for Archive Files) and rebuild.

To then prevent your URLs from breaking, you could add something like this to a .htaccess file. (You must replace "mt/archives/" with the actual path to your MT archives, and be careful that this rule doesn't match non-archive files.)

RewriteEngine On
RewriteRule ^mt/archives/(.*)\.html$ /mt/archives/$1.php [R,L]
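
Before deploying a rule like the one above, it can help to check which paths its pattern would catch. The following is only a sketch that mirrors the pattern with Python's `re` module; the helper name and the sample paths are made up for illustration, and it assumes the rule lives in a .htaccess file at the site root (where the path is matched without a leading slash).

```python
import re

# Same pattern as the RewriteRule above; "mt/archives/" is the example path.
RULE = re.compile(r"^mt/archives/(.*)\.html$")


def redirect_target(path: str):
    """Return the .php URL the rule would redirect to, or None if it does not match."""
    match = RULE.match(path)
    if match is None:
        return None
    return f"/mt/archives/{match.group(1)}.php"


# Hypothetical sample paths: two inside the archive directory, two that must be left alone.
for path in (
    "mt/archives/2006/09/noharvester.html",
    "mt/archives/index.html",
    "about.html",
    "mt/archives/style.css",
):
    print(path, "->", redirect_target(path))
```

Note that every .html file under the archive path matches, including index pages, so any file there that stays .html would need a narrower pattern.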

Beware that PHP files must be made executable. More info here:
http://www.sixapart.jp/movabletype/manual/3.2/mtmanual_troubleshooting.html#my%20php%20output%20files%20need%20to%20be%20executable

Posted by Patrick on September 28, 2006 at 02:33


My files are already *.php files -- sorry, I wasn't clear on that -- they're just not dynamic, they're static, with a few PHP includes and variables. So how would I implement this then?

Posted by Jake on September 28, 2006 at 02:38


Never mind. I can't read directions; I see you already put the tags in the plugin setup :-)

Posted by Jake on September 28, 2006 at 02:40


OK, this thing appears to be blocking legitimate comments, as I'm getting the "Spam isn't wanted here" message, but the activity logs have this:

Blocked comment: Bad auth_check. entry_id = 1, remote_ip = '66.39.163.6', REMOTE_ADDR = 66.39.163.6

If I'm not mistaken, that's a match, right? Why's it getting blocked?

Note that this is happening on older entries, where moderation might be automatically kicking in, but it appears to work fine on newer entries.

Posted by Jake on September 28, 2006 at 03:26


Think I found the problem. I had accidentally copied two of the "auth_check" fields (both types) into the PHP-generated pages, and it freaked out. Works fine now (at least in my bit of testing).

Posted by Jake on September 28, 2006 at 04:48


Sorry for the trouble, and thanks for giving the plugin a try! I'm glad you could figure it out!

For clarity, the only reason for a "Spam isn't welcome here" is that the auth_check value doesn't match. Possible causes are:
1. auth_check field hasn't been added correctly to the templates (like in your case) or isn't being computed (static HTML page?)
2. pages haven't been rebuilt after adding the fields or changing the "salt"
Though when following instructions closely, the two cases above shouldn't happen. Then the only cause is:
3. a spammer is trying to tamper with it

Posted by Patrick on September 28, 2006 at 08:21


I have been recently actively attacked by spam comments which either have a '_' instead of '.' in a URL link in the comment content, or have the URL in BBCode format and Spam Lookup and Keyword filters kept letting them through.

I got hold of a regex to counteract this specific comment content.

I also read (and implemented) a tip from 'Learning Movable Type'. It involves deleting some code from the Individual Archive and Comment Listing Archive templates, forcing all commenters to |Preview| their comment before they can |Post|. This is said to counteract the bots.

Would this counteract the 'harvester' bots also???

Posted by RJ on October 17, 2006 at 22:13


The "must preview" trick might prevent some zombie bots (who received data from harvester bots) from posting comments, but still it would be easy for spammers to hack around that tactic. I don't know whether they do or not.

The NoHarvester plugin gets rid of all comments submitted by zombie bots who receive data from harvesters, no matter what.

Posted by Patrick on October 18, 2006 at 00:35


I'm getting a lot of these types of errors in my logs after just installing this plugin on a friend's site:

&Digest::MD5::md5_hex function probably called as class method at /home/[username]/public_html/cgi-bin/mt/plugins/NoHarvester/NoHarvester.pl line 162.

Any ideas why? It appears I see them on my error logs as well. The plugin works fine, just want to make sure things are OK.

(The reason I'm even looking there is that I'm having odd-ball comment errors crop up, occasionally seeing the default PHP MT error template with no error being reported, and this is the only thing I recently installed that would affect comments.)

Posted by Jake on October 24, 2006 at 08:54


Thanks for letting me know about that warning. It's harmless, but I fixed the problem in the new version 1.0.2, so you can upgrade if you want to get rid of the warning.

I don't know about the odd-ball comment errors. What does the error page say?
You could try temporarily disabling the plugin and see whether the errors persist or not.

Posted by Patrick on October 24, 2006 at 10:40


I am getting the following error coming up in my cron logs with MT 3.3:

Use of uninitialized value in concatenation (.) or string at ~/cgi-bin/mt/plugins/NoHarvester/NoHarvester.pl line 163.

Any ideas on what is causing this?

Posted by Justin on January 20, 2007 at 00:42


Thanks for trying the plugin Justin!

I can't see why that warning would happen, unless the server variable $ENV{'SCRIPT_FILENAME'} isn't set for some reason on your web server.

If this doesn't give you a hint as to the cause, you can try replacing line 163 in NoHarvester.pl with the following (but be sure to enter something for the "salt" value in the plugin's settings).

  return '';

Don't hesitate to let me know if there's anything else.

Posted by Patrick on January 20, 2007 at 01:12


What does this error mean?

Subroutine MT::App::Comments::post redefined at /home/mgknet/public_html/cgi-bin/mt/plugins/NoHarvester/NoHarvester.pl line 55

Posted by Mike on February 21, 2007 at 18:26


Thanks for trying out NoHarvester!

That warning (not an error), although harmless, has just been fixed in the new version 1.0.3.
Please upgrade if you wish for those warnings to disappear, but they can also be safely ignored.

Posted by Patrick on February 22, 2007 at 10:18


Wow! What fast response and great support! You rock!

Keep up the good work. ;)

Posted by Mike on February 23, 2007 at 04:58


No problem, and thanks again! :)

Posted by Patrick on February 23, 2007 at 10:18


I'm testing it and I keep getting blocked by a "cached page?" error. I set myself as a trusted commenter though, so shouldn't it ignore anything coming from trusted commenters?

Posted by darkmoon on March 3, 2007 at 07:57


Thanks for trying NoHarvester!

You must rebuild your pages before NoHarvester works. Please follow the installation instructions carefully and it should work.

Don't hesitate to let me know if you run into other trouble.

Posted by Patrick on March 3, 2007 at 08:39


I'm afraid I'm not clear on the addition to the CGI templates.

1) Those would be "Comment Preview Template" and "Comment Error Template" under "System Templates"?

2) If we're using <MTCommentFields preview="1" static="1"> in those templates, where do we put the required additions?

Sorry to be confused.

(The plugin is currently disabled on my page until I get this problem sorted.)

Posted by Kate Nepveu on March 3, 2007 at 12:09


Either I forgot the second part of my comment or it was stripped as HTML.

Do we have to stop using the "MTCommentFields" template tag if we're using it in "Comment Preview Template" and "Comment Error Template"?

Posted by Kate Nepveu on March 3, 2007 at 12:26


hrm. I rebuilt the whole system and it's still giving me the cached page error. added the links too.

Posted by darkmoon on March 3, 2007 at 13:43


When you add the links, does it matter where you stick those input lines? I don't think it does, but wasn't sure.

Posted by darkmoon on March 3, 2007 at 16:22


Kate >
Thanks for trying the plugin!
1) Yes these are the ones.
2) Indeed, currently this doesn't support <MTCommentFields> (I didn't even know this existed). I might do something in the next version to support it, but in the meantime, if you wish to use the plugin, you would have to use a full form instead of <MTCommentFields>.
Otherwise please stay tuned for a new version, as I am definitely interested in supporting all cases.

darkmoon >
Yes it matters. You must add them right after the <form> line (or elsewhere between the <form> and </form> tags).

Posted by Patrick on March 3, 2007 at 20:13


Patrick: Thanks for the quick response. The Spam Firewall plugin has currently knocked my spam traffic way, way down, so I will wait for a new version (or another spike in traffic) rather than mess around with the comment forms, which I traditionally have bad luck with.

Thanks again, and I'll keep an eye on the plugins news.

Posted by Kate Nepveu on March 3, 2007 at 23:06


Okay... errors-wise:
My Comment Errors page doesn't have a form tag, so there's nowhere to place the tag.

When I turn on blocking, after putting the noharvester tags in between the form tags, I still get blocked.

Now the errors are "cached page, jumping ips?"

Turned off blocking for now. Any ideas?

Posted by darkmoon on March 4, 2007 at 01:08


Then you must be having the same problem as Kate (with a <MTCommentFields> tag in these templates instead of a <form>).

You will have to wait for the next version, which I will try to come up with as soon as possible. Please check back in a few days.

Posted by Patrick on March 4, 2007 at 09:58


I have now added support for <MTCommentFields>. Kate and darkmoon, please download latest version 1.1 and see if it works for you.

For templates using <MTCommentFields>, no change to the template is needed, as it's handled automatically.

Posted by Patrick on March 5, 2007 at 12:01


Can you check the zip? I've downloaded 1.1 a couple times and it says it's damaged and won't expand.

Posted by darkmoon on March 5, 2007 at 14:47


Ouch, you're right. It's fixed now!

Posted by Patrick on March 5, 2007 at 14:54


Couple of things.

1) nowhere to put it in my comments-error template still. For the most part, I can only tell that it's just an error message area.

2) Still getting the cached page? jumping ips? error.

3) In activity logs, it shows this:
Blocked comment: Bad remote_ip '', Bad auth_check. entry_id = 4337, REMOTE_ADDR = X.X.X.X

I blanked out the IP, but not sure why the php isn't running. Perhaps it's because I typekey validated? Not sure.

Posted by darkmoon on March 5, 2007 at 15:10


Turning off blocking for now. :)

Posted by darkmoon on March 5, 2007 at 15:11


Ok, there seems to be an additional problem in your case. Perhaps you overlooked the following in the plugin requirements:

"All comment forms must be located on PHP pages generated from static templates and/or MT's CGI templates (no raw HTML pages or dynamic templates)"

Currently, your individual entry pages are raw .HTML, so the plugin cannot work. You'd have to first make your individual pages .PHP in order to use this plugin.

Posted by Patrick on March 5, 2007 at 15:17


you can actually generate php in extensions of html. Done it plenty of times. I'll also point out that when I test it, I get the above with the php lines, but I actually have activity logs of actual spambots getting blocked also.

So it does work. Just blocking everything that I can tell.

I'm using pure static templates except for the spamfw stuff, which doesn't affect the current setup.

Posted by darkmoon on March 5, 2007 at 15:50


I'm just saying that right now, the PHP code appears as is in your pages, so it's not being executed. So in this current state, it won't correctly block spam only, but rather block every comment.

Posted by Patrick on March 5, 2007 at 16:00


I'll test it with php extension in the morning.

Posted by darkmoon on March 5, 2007 at 16:41


looks like it works. I still don't have a place under comment errors template (probably because the template creator never had a form in the errors template to begin with). Otherwise, I'm getting comments through now. Not sure if everything else works, but at least that. :)

Thanks for the fix and help with debug.

Posted by darkmoon on March 5, 2007 at 16:49


Hi! Looks like you're almost there. At least you've got it nailed in the individual entry pages.
However you seem to have put the wrong code in the Comment Preview template. There you should use the correct code for CGI templates from the plugin's settings.

Posted by Patrick on March 5, 2007 at 16:57


Hey! Thanks for that catch. It's fixed now.

In your instructions, you might want to add to put the links between the form tags. I don't remember seeing it in the plugin or instructions above.

Maybe I'm just dense, but I surely didn't figure it out until you told me. :)

Posted by darkmoon on March 5, 2007 at 23:59


I'm glad it finally worked out! :)
You're right about the instructions, I'll try to make that clearer!

Posted by Patrick on March 6, 2007 at 00:01


I am completely lost.

I saved "No Harvester" to my computer.

Now what do I do?

My MT 3.2 installation files are not on my computer. It is through yahoo and I just log on and do what I gotta do.

How does the No Harvestor get uploaded to my MT plugins folder?

Totally confused, please help...

Thanks, Joe

Posted by Joe D. on March 6, 2007 at 07:32


Please refer to the "Installing Plugins" documentation from the Movable Type help.
http://www.sixapart.com/movabletype/docs/3.2/11_advanced_topics/installing_plugins.html

You need to know how to (and be allowed to, by your hosting provider) upload files to your Movable Type installation. If you need help with this you will have to contact your provider.

Good luck.

Posted by Patrick on March 6, 2007 at 08:33


Version 1.1 works a treat! I tested posting, previewing, and posting from a Google cache.

Thanks very much--I look forward to seeing blocked comments pile up in the activity log. =>

Posted by Kate Nepveu on March 6, 2007 at 08:48


Oh, minor nit: it still shows as version 1.0.3 in the Plugins Settings page.

Posted by Kate Nepveu on March 6, 2007 at 08:55


Glad it worked!

I've now fixed the version number. Good catch. :)

Posted by Patrick on March 6, 2007 at 10:42


I spent an hour and a half on the phone with Yahoo Web Hosting, and they were not able to tell me how to upload NoHarvest into my plugins for Movable Type 3.2.

If anyone can help me I would really appreciate it.

Thanks

Posted by Joe D. on March 7, 2007 at 03:29


Dumb question: by design, will this block comments coming from behind a proxy? I'm behind a firewall at work, and wasn't able to post a comment from there ("cached page? jumping URLs?"). The activity log has:

Blocked comment: Bad remote_ip '12.***.***.**', Good auth_check. entry_id = 3392, REMOTE_ADDR = 172.**.***.**

I'm not very familiar with network workings.

Posted by Kate Nepveu on March 7, 2007 at 08:19


Joe, usually it would be a folder called /cgi-bin/mt/plugins/ (as indicated in the instructions above), but otherwise I'm afraid I can't help any further.

Kate, there shouldn't be a problem with most proxies.
The only proxy issue I can think of is when a proxy uses several IPs when accessing the web even for the same user. That's what the "Jumping IPs?" in the error message refers to.
NoHarvester does allow some flexibility there, and if both IPs are of the same C class (this means having the same first three numbers, e.g. 172.11.22.1 and 172.11.22.2) the comment is allowed through. This should therefore be flexible enough for most proxies, but there may be stranger network setups that I'm not aware of...
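
The "same first three numbers" tolerance described above can be sketched as follows. This is an illustration only, using Python's standard `ipaddress` module; the function name is made up, and the plugin's own comparison (in Perl) may be implemented differently.

```python
import ipaddress


def same_class_c(ip_a: str, ip_b: str) -> bool:
    """True when two IPv4 addresses share their first three numbers (same /24 block)."""
    # strict=False lets us build the /24 network from a host address.
    block = ipaddress.ip_network(f"{ip_a}/24", strict=False)
    return ipaddress.ip_address(ip_b) in block


# The two example addresses from the comment above are treated as the same visitor.
print(same_class_c("172.11.22.1", "172.11.22.2"))

# A proxy that hops to a different block would still be blocked.
print(same_class_c("172.11.22.1", "172.11.23.2"))
```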

Posted by Patrick on March 7, 2007 at 10:04


Thanks--I'll re-enable the plugin tomorrow and try it again from work, in case it was some weird passing problem.

Posted by Kate Nepveu on March 7, 2007 at 10:37


After merely installing this plugin, posting comments from the comment forms ends in failure: an MT error page with no content (i.e. no error message). This prevents the plugin from being used on individual blogs. In other words, the plugin has to be enabled and implemented on all blogs within that install. Can you please fix this bug? Many thanks!

Posted by Ian Fenn on March 29, 2007 at 13:02


Thanks a lot for trying NoHarvester, Ian!

So far this plugin was intended for use on a whole system, not specific blogs. I understand that it may be useful to implement it only on certain blogs of a system, however I haven't yet had time to do that.
That's on my "to do" list, but meanwhile NoHarvester needs to be implemented on all your blogs in order to function properly.

As for the empty error page, I cannot see what may be the cause, as NoHarvester always provides an error message. So perhaps this error page could be caused by another plugin (an incompatibility with NoHarvester could be possible).

Posted by Patrick on March 29, 2007 at 13:12


Hi Patrick, thanks for the reply. It would be great if you could look at sorting this soon. It's an excellent plugin but I can't use it as one of the blogs in the install doesn't parse for php and a switch is out of the question currently. I'm afraid the blank error page is definitely caused by NoHarvester - I've checked this on two installs with all other plugins disabled. This is with MT3.34.

Posted by Ian Fenn on March 30, 2007 at 14:00


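Hello Patrick.
NoHarvester has been very useful to me.
I have one request: comments posted from mobile phones such as DoCoMo are sometimes blocked by NoHarvester.
I suspect the cause is that the IP changes, for example when the commenter is on the move while posting from a phone.
If there were a setting such as a whitelist, I think NoHarvester would become even more useful.
If it is possible to add such a feature, please consider it.
I look forward to your continued success.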

Posted by Kazz on May 14, 2007 at 12:22


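Kazz, thank you for your comment.
Sorry for the late reply. This is the first I've heard of a problem with posting from DoCoMo, but after looking into the IPs that DoCoMo uses, it does indeed seem like a plausible problem.

First, I'd like to ask how the blog is being accessed from the mobile phone.
- Have you made mobile templates in MT and published them at a separate URL?
- Is it accessed through a mobile tool/plugin such as mt4i [1]?
- Is the blog viewed normally with the phone's full browser?

A whitelist feature may turn out to be necessary, but if an unusual access method is involved, other approaches might also be possible.

[1] http://www.hazama.nu/pukiwiki/index.php?MT4i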

Posted by Patrick on May 18, 2007 at 09:48


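Hello Patrick.
Sorry, it seems my wording was misleading.
This happens not when posting a new blog entry, but when visitors to the site try to write a "comment" using a mobile phone such as DoCoMo.
# They tried to post a comment while on the move, e.g. on a train
# They displayed the page using the phone's "back" function or similar, then tried to post a comment
In cases like these, the "Cached page? Jumping IPs?" comment rejection error seems to come up often (of course, I understand that this is correct behavior on NoHarvester's part).
For now, I ask users who comment from a mobile phone to avoid doing so while on the move as far as possible, and to always "reload" the page before writing a comment, but I thought it would be nice if this could be handled on the site side (the NoHarvester side).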

Posted by Kazz on May 19, 2007 at 02:32


Hi
I was wondering if I could do something to prevent your plugin from stopping real commenters. Currently I have been getting emails from bloggers who visit my site daily, and they are saying they are not being allowed to comment... How can I fix this and still use your plugin?
any help will be great.

Posted by amintorres on November 6, 2007 at 11:55


First of all, disable NoHarvester from the system overview's plugin settings, to make it stop blocking comments.

Next, if you still want to use NoHarvester, follow instructions above ("Installation and setup") carefully to add it to your templates before enabling it, as it probably hasn't been installed properly since it's blocking real commenters.

Very important note: All comment forms must be located on PHP pages generated from static templates and/or MT's CGI templates (no raw HTML pages or dynamic templates)

Posted by Patrick on November 6, 2007 at 12:01


Hey, I'm on MT 4, and I've followed the instructions exactly for installation (as far as I know). But I'm getting a "Cached page? Jumping IPs?" message every time I try to post, like other users above. I've tried everything you suggested to them, and still nothing works.

Do I need to do anything differently for mt4? I'm pretty much using their templates out of the box, and I'm adding the code after the tag in the "comment form" module template.

Something isn't right. Thanks for any help. I look forward to using this.

Posted by JT on November 10, 2007 at 23:23


JT, thanks a lot for trying out NoHarvester!

However, I'm afraid you missed this important requirement:
"All comment forms must be located on PHP pages [...]"

Currently, your forms appear on static HTML pages, and this plugin cannot do its job.

Assuming that your server allows PHP, you would have to change your archives extension to "php" (this can be done in MT, in the blog's settings) and then rebuild your blog.

Posted by Patrick on November 10, 2007 at 23:44


Thanks Patrick. I'm gonna give that a shot. Are there any general drawbacks to publishing as php pages instead of html?

Posted by JT on November 11, 2007 at 02:13


You'll have to excuse me Patrick. I'm learning MT as I go, and the official documentation for mt4 still officially sucks. I tried changing the extensions and re-publishing my blog, but then all the links become broken.

Did you mean republish when you said "rebuild your blog"? or is rebuilding something different.

Thanks again.

Posted by JT on November 11, 2007 at 02:25


Yes, rebuild is just the old term for republish.

This is getting quite beyond the scope of support for this plugin, but I'll try to answer your questions.

Naturally, links to your old pages will become broken unless you use mod_rewrite to redirect old URLs to new ones. I've explained how to do so earlier in the comments here. The two lines below must be put in a text file named .htaccess in your site's root directory.

RewriteEngine On
RewriteRule ^([0-9]{4}/[0-9]{2}/.*)\.html$ /$1.php [R,L]

I can't help much further than this.

Posted by Patrick on November 11, 2007 at 13:32


Understood Patrick. Thanks for the quick responses. I'll see if I can work it out.

Posted by JT on November 11, 2007 at 13:46


