MojeekBot is the web crawler for the Mojeek search engine. Although every attempt has been made to be considerate of webmasters, site owners, and hosts, mistakes and errors are inevitable. If you notice our bot misbehaving in any way, such as crawling a page or directory it shouldn't, or you have a general enquiry, please contact us. We would appreciate your feedback.
MojeekBot should not request more than one page from your site within any two-second period, whether or not the request succeeds. MojeekBot does not currently support the nonstandard robots.txt crawl-delay directive.
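The two-second rule above can be pictured as a per-host politeness limiter. The following is a minimal sketch only, not MojeekBot's actual implementation; the class name `HostRateLimiter`, its methods, and the injectable `clock` parameter are hypothetical names chosen for illustration.

```python
import time


class HostRateLimiter:
    """Allow at most one request per host in any `min_interval`-second window."""

    def __init__(self, min_interval=2.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock           # injectable so the logic can be tested
        self.last_request = {}       # host -> time of the most recent request

    def wait_time(self, host):
        """Return how many seconds to wait before `host` may be requested again."""
        last = self.last_request.get(host)
        if last is None:
            return 0.0
        return max(0.0, self.min_interval - (self.clock() - last))

    def record(self, host):
        """Note that a request to `host` was just sent, successful or not."""
        self.last_request[host] = self.clock()
```

A crawler loop built on this sketch would sleep for `wait_time(host)` seconds, send the request, and then call `record(host)` regardless of the response status.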
MojeekBot obeys the Robots Exclusion Standard. It obeys the first record whose User-Agent contains "MojeekBot". If there is no such record, it obeys the first record with a User-Agent of "*".
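The record selection described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Mojeek's parser: the function names `parse_records` and `select_record` are hypothetical, the match on "MojeekBot" is assumed to be case-insensitive, and only Disallow rules are collected.

```python
def parse_records(robots_txt):
    """Split robots.txt text into (user_agents, disallow_rules) records, in file order."""
    records = []
    agents, rules = [], []
    in_rules = False  # True once the current record has seen a non-User-agent line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new record
                records.append((agents, rules))
                agents, rules, in_rules = [], [], False
            agents.append(value)
        elif agents:
            in_rules = True
            if field == "disallow":
                rules.append(value)
    if agents:
        records.append((agents, rules))
    return records


def select_record(records, bot_name="MojeekBot"):
    """Return the rules of the first record naming the bot, else the first "*" record."""
    for agents, rules in records:
        # Assumption: the name comparison ignores case.
        if any(bot_name.lower() in agent.lower() for agent in agents):
            return rules
    for agents, rules in records:
        if "*" in agents:
            return rules
    return []  # no applicable record: nothing is disallowed
```

Note that the search for a "MojeekBot" record covers the whole file before the "*" record is considered, so the order of the two records in robots.txt does not matter in this sketch.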
MojeekBot will not retrieve any document whose URL contains a disallowed string, e.g.:
User-agent: MojeekBot
Disallow: /private
This causes every URL containing the string "/private" to be disallowed, so none of those URLs would be retrieved.
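The substring rule described above can be sketched as a short check. This is an assumption-laden illustration: the function name `is_disallowed` is hypothetical, the example.com URLs in the usage lines are made up, and matching against the path and query (rather than the full URL including scheme and host) is an assumption not stated in the text.

```python
from urllib.parse import urlsplit


def is_disallowed(url, disallowed_strings):
    """Return True if the URL's path (plus query) contains any disallowed string."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # An empty Disallow value disallows nothing, so empty strings are skipped.
    return any(rule and rule in target for rule in disallowed_strings)


rules = ["/private"]
print(is_disallowed("https://example.com/private/index.html", rules))
print(is_disallowed("https://example.com/docs/private-notes.html", rules))
print(is_disallowed("https://example.com/public/index.html", rules))
```

Because the check is a substring match rather than a prefix match, the second hypothetical URL is disallowed even though "/private" does not appear at the start of its path.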
Mojeek's engine obeys the noindex, nocache and nofollow meta tags. If you place the following in the head of your page:
<META NAME="robots" CONTENT="noindex">
MojeekBot will still retrieve the page, but it will neither index the document nor add it to the search database.
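To check which robots directives a page actually exposes, a small script along these lines may help. It is a sketch using Python's standard `html.parser` module; the names `RobotsMetaParser` and `robots_directives` are hypothetical, and it says nothing about how Mojeek itself parses pages.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META NAME=...> matches.
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").strip().lower() == "robots":
            for token in attributes.get("content", "").split(","):
                token = token.strip().lower()
                if token:
                    self.directives.add(token)


def robots_directives(html):
    """Return the set of lowercase robots directives declared in the HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives
```

If `"noindex" in robots_directives(page_source)` is true for a page, that page carries the tag shown above.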
To verify that a visitor is a genuine MojeekBot, perform two steps. First, do a reverse DNS lookup on the visiting IP address:
> host 5.102.173.71
71.173.102.5.in-addr.arpa domain name pointer crawl-5-102-173-71.mojeek.com.
This should resolve to a name within the mojeek.com domain. Next, check that the reverse DNS record is not forged by performing a forward DNS lookup on the returned name:
> host crawl-5-102-173-71.mojeek.com
crawl-5-102-173-71.mojeek.com has address 5.102.173.71
This should return the original visiting IP address. If it does not, the visitor is not a genuine MojeekBot.
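The two-step check above can be automated. The following is a minimal sketch using Python's standard `socket` module, assuming IPv4 addresses; the function name `is_genuine_mojeekbot` and its optional `reverse_lookup` and `forward_lookup` parameters are hypothetical and exist only so the logic can be exercised without live DNS.

```python
import socket


def is_genuine_mojeekbot(ip, reverse_lookup=None, forward_lookup=None):
    """Verify a visitor by reverse DNS, then confirm with a forward DNS lookup."""
    # Default to real DNS lookups; callers may inject replacements for testing.
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda name: socket.gethostbyname_ex(name)[2])

    # Step 1: the IP address must resolve to a name within mojeek.com.
    try:
        hostname = reverse_lookup(ip).rstrip(".").lower()
    except OSError:
        return False
    if not hostname.endswith(".mojeek.com"):
        return False

    # Step 2: that name must resolve back to the original IP address.
    try:
        addresses = forward_lookup(hostname)
    except OSError:
        return False
    return ip in addresses
```

Checking the suffix ".mojeek.com" with its leading dot rejects look-alike names such as one that merely contains "mojeek.com" inside another domain, and the forward lookup guards against a reverse record forged by whoever controls the visiting address range.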
If you have any further questions or comments regarding our bot, please do not hesitate to contact us.