########################################################################### ## Start New Site Disallows ## ########################################################################### ## Added MJ12bot 07-02-2013 CB ## Added Yandex 5 second delay 07-10-2013 CB ## Added YYSPider Disallow 07-10-2013 CB ## Added sogou 30 second delay 7-10-13 HE ## Added baiduspider 10 second delay 7-15-13 HE ## Added alexa blocked to other pages, allowed home page with time-out 7-15-13 HE # Blocked proximic 7-15-13 HE # Allowed shopwiki 7-18-13 HE # File is case sensitive - User-agent:, Disallow, Allow - no whitespaces # Blocked psbot 7-19-13 HE # Blocked Mail.Ru 7-21-13 HE # Blocked EasouSpider 7-24-13 HE # Added chinese spiders 08-19-2013 CB # Removed Disallow on /notifyme.asp because the page has been deleted 01-09-2014 AB User-agent: * Disallow: /cd/ Disallow: /checkout/ Disallow: /dvd/ Disallow: /demo/ Disallow: /giftcert/ Disallow: /login.asp Disallow: /images/default_coverart.gif Disallow: /javascript/tagcreator.js Disallow: /iframe.asp Disallow: /wishlist.asp # http://ahrefs.com/robot/ User-agent: AhrefBot Disallow: / User-agent: AhrefsBot Disallow: / User-agent: admedia Disallow: / User-agent: ia_archiver # Alexa Spider Allow: /default.asp Disallow: / Crawl-delay: 10 # specifies a 10 second timeout. # The WayBack Machine (does not honor crawl delay) # AB 11-25-13: Commented Out below two lines # User-agent: archive.org_bot # Allow: /default.asp User-agent: fatbot Crawl-delay: 20 # http://shoulu.jike.com/spider.html They didnt honor a 30 second crawl delay, so now they are blocked 08-21-2013 CB User-agent: JikeSpider Disallow: / User-agent: linkdexbot Disallow: / # http://www.majestic12.co.uk/bot.php?+ User-agent: majestic Crawl-delay: 30 # specifies a 30 second timeout. User-agent: MJ12bot Crawl-delay: 30 # specifies a 30 second timeout. # psbot/0.X (+http://www.picsearch.com/bot.html) User-agent: psbot Disallow: / # Mozilla/5.0 (compatible; proximic; +http://www.proximic.com/info/spider.php) User-agent: proximic Disallow: / # Mozilla/5.0 (compatible; EasouSpider; +http://www.easou.com/search/spider.html) User-agent: EasouSpider Disallow: / # ShopWiki/1.0 ( +http://www.shopwiki.com/wiki/Help:Bot ) User-agent: ShopWiki Crawl-Delay: 5 User-agent: Slurp Crawl-delay: 0.5 # BiggerBetter exalead.com Commercial Data Collection User-agent: exabot Disallow: / User-agent: grapeshot Disallow: / User-agent: ADmantX Disallow: / #################################################### ## Chinese Search Engines in Order of Marketshare ## #################################################### User-agent: baiduspider Crawl-delay: 10 User-agent: 360Spider Crawl-delay: 30 User-agent: sogou Crawl-delay: 30 # specifies a 30 second timeout. User-agent: Sosospider Crawl-delay: 30 # soso does not appear to honor this delay. We're getting a page request a second ## End of Chinese Search Engines ############################ ## Russian Search Engines ## ############################ User-agent: Yandex Crawl-delay: 30 User-agent: YandexBot Crawl-delay: 30 User-agent: Nigma Disallow: / User-agent: Mail.Ru Disallow: / User-agent: YYSpider Disallow: /