# robots.pm
# AWSTATS ROBOTS DATABASE
#-------------------------------------------------------
# If you want to add robots to extend AWStats database detection capabilities,
# you must add an entry in RobotsSearchIDOrder_listx and RobotsHashIDLib.
#-------------------------------------------------------
# $Revision: 1.44 $ - $Author: eldy $ - $Date: 2006/07/17 23:50:54 $
# 2005-08-19 Sean Carlos http://www.antezeta.com/awstats.html
#            added dipsie (not tested with real data).
#            added DomainsDB.net http://domainsdb.net/
#            added ia_archiver-web.archive.org (was inadvertently grouped with Alexa traffic)
#            added Nutch (used by looksmart (furl?))
#            added rssImagesBot
#            added Sqworm
#            added t\-h\-u\-n\-d\-e\-r\-s\-t\-o\-n\-e
#            added w3c css-validator
#            added documentation link to bot home pages for above and selected major bots.
#              In the case of international bots, choose .com page.
#              Included tool tip (html "title").
#            To do: parameterize to match both AWStats language and tooltips settings.
#            To do: add html links for all bots based on current documentation in source
#              files referenced below.
#            changed '\wbot[\/\-]', to '\wbot[\/\-]' (removed comma)
#            made minor grammar corrections to notes below
# 2005-08-24 added YahooSeeker-Testing
#            added w3c-checklink
#            updated url for ask.com
# 2005-08-24 added Girafabot http://www.girafa.com/
# 2005-08-30 added PluckFeedCrawler http://www.pluck.com/
#            added Gaisbot/3.0 (robot05@gais.cs.ccu.edu.tw; )
#            added geniebot (wgao@genieknows.com)
#            added BecomeBot link http://www.become.com/site_owners.html
#            added topicblogs http://www.topicblogs.com/
#            added Powermarks; seen used by referrer spam
#            added YahooSeeker
#            added NG/2. http://www.exabot.com/
# 2005-09-15 added link for Walhello appie
#            added bender focused_crawler
#            updated YahooSeeker description (blog crawler)
# 2005-09-16 added link for http://linkchecker.sourceforge.net
#            added ConveraCrawler/0.9d ( http://www.authoritativeweb.com/crawl)
#            added Blogslive info@blogslive.com intelliseek.com
#            added BlogPulse (ISSpider-3.0) intelliseek.com
# 2005-09-26 added Feedfetcher-Google (http://www.google.com/feedfetcher.html)
#            added EverbeeCrawler
#            added Yahoo-Blogs http://help.yahoo.com/help/us/ysearch/crawling/crawling-02.html
#            added link for Bloglines http://www.bloglines.com
# 2005-10-19 fixed Feedfetcher-Google (http://www.google.com/feedfetcher.html)
#            added Blogshares Spiders (Synchronized V1.5.1)
#            added yacy
# 2005-11-21 added Argus www.simpy.com
#            added BlogsSay :: RSS Search Crawler (http://www.blogssay.com/)
#            added MJ12bot http://majestic12.co.uk/bot.php
#            added OpenTaggerBot (http://www.opentagger.com/opentaggerbot.htm)
#            added OutfoxBot/0.3 (For internet experiments; outfox.agent@gmail.com)
#            added RufusBot Rufus Web Miner http://64.124.122.252.webaroo.com/feedback.html
#            added Seekbot (http://www.seekbot.net/bot.html)
#            added Yahoo-MMCrawler/3.x (mms-mmcrawler-support@yahoo-inc.com)
#            added link for BaiDuSpider
#            added link for Blogshares Spider
#            added link for StackRambler http://www.rambler.ru/doc/faq.shtml
#            added link for WISENutbot
#            added link for ZyBorg/1.0 (wn-14.zyborg@looksmart.net; http://www.WISEnutbot.com). Moved location to above wisenut to avoid classification as wisenut
# 2005-12-15
#            added FAST Enteprise Crawler/6 (www dot fastsearch dot com). Note spelling Enteprise not Enterprise.
#            added findlinks http://wortschatz.uni-leipzig.de/findlinks/
#            added IBM Almaden Research Center WebFountain? http://www.almaden.ibm.com/cs/crawler [hc3]
#            added INFOMINE/8.0 VLCrawler (http://infomine.ucr.edu/useragents)
#            added lmspider (lmspider@scansoft.com) http://www.nuance.com/
#            added noxtrumbot http://www.noxtrum.com/
#            added SandCrawler (Microsoft)
#            added SBIder http://www.sitesell.com/sbider.html
#            added SeznamBot http://fulltext.seznam.cz/
#            added sohu-search http://corp.sohu.com/ (looked for //robots.txt not /robots.txt)
#            added the ruffle SemanticWeb crawler v0.5 - http://www.unreach.net
#            added WebVulnCrawl/1.0 libwww-perl/5.803 (looked for //robots.txt not /robots.txt)
#            added Yahoo! Japan keyoshid http://www.yahoo.co.jp/
#            added Y!J http://help.yahoo.co.jp/help/jp/search/indexing/indexing-15.html
#            added link for GigaBot
#            added link for MagpieRSS
#            added link for MSIECrawler
# 2005-12-21
#            added aipbot http://www.aipbot.com aipbot@aipbot.com [matthys70 users.sourceforge.net]
#            added Everest-Vulcan Inc./0.1 (R&D project; http://everest.vulcan.com/crawlerhelp)
#            added Fast-Search-Engine http://www.fast-search-engine.com/ [matthys70 users.sourceforge.net]
#            added g2Crawler (nobody@airmail.net) http://crawler.instantnetworks.net/
#            added Jakarta commons-httpclient http://jakarta.apache.org/commons/httpclient/ (hit robots.txt).
#              May be used as robot or browser - a site may want to remove this entry.
#            added OmniExplorer_Bot http://www.omni-explorer.com/ [matthys70 users.sourceforge.net]
#            added USTC-Semantic-Group ai.ustc.edu.cn/mas/en/research/index.php ?
# 2005-12-22
#            added EARTHCOM.info www.earthcom.info
#            added HTTrack off-line browser 'httrack','HTTrack', http://www.httrack.com/ [Moizes Gabor]
#            added KummHttp http://www.psychedelix.com/cgi-bin/csv2html.pl?data=allagents.csv&template=detail.html&match=\bid_g_l_301105_2\b [Moizes Gabor]
# 2006-01-01
#            added Dulance http://www.dulance.com/bot.jsp
#            added MojeekBot http://www.mojeek.com/bot.html
#            added nicebot http://www.egghelp.org/setup.htm ?
#            added Snappy http://www.urltrends.com/faq.php
#            added sohu agent
#            added TencentTraveler
#            added VORTEX http://marty.anstey.ca/robots/vortex/ [matthys70 users.sourceforge.net]
#            added zspider http://feedback.redkolibri.com/
# 2006-01-13
#            added boitho.com-dc http://www.boitho.com/dcbot.html
#            added IRLbot http://irl.cs.tamu.edu/crawler
#            added virus_detector virus_harvester@securecomputing.com
#            added Wavefire http://www.wavefire.com; info@wavefire.com
#            added WebFilter Robot
# 2006-01-24
#            added Shim-Crawler http://www.logos.ic.i.u-tokyo.ac.jp/crawler/; crawl@logos.ic.i.u-tokyo.ac.jp
#            added Exabot exabot.com
#            added LetsCrawl.com http://letscrawl.com
#            added ichiro http://help.goo.ne.jp/door/crawlerE.html
# 2006-01-27 additional 22 robots from a list provided by Moizes Gabor
#            added ALeadSoftbot http://www.aleadsoft.com/bot.htm
#            added CipinetBot http://www.cipinet.com/bot.html
#            added Cuasarbot http://www.cuasar.com/
#            added Dumbot http://www.dumbfind.com/
#            added Extreme_Picture_Finder http://www.exisoftware.com/
#            added Fooky.com/ScorpionBot/ScoutOut http://www.fooky.com/scorpionbots
#            added IlTrovatore-Setaccio http://www.iltrovatore.it/aiuto/motore_di_ricerca.html bot@iltrovatore.it
#            added InsurancoBot http://www.fastspywareremoval.com/
#            added InternetArchive http://lucene.apache.org/nutch/bot.html nutch-agent@lucene.apache.org
#            added KazoomBot http://www.kazoom.ca/bot.html kazoombot@kazoom.ca
#            added Kurzor http://www.easymail.hu/ cursor@easymail.hu
#            added NutchCVS http://lucene.apache.org/nutch/bot.html nutch-agent@lucene.apache.org
#            added NutchOSU-VLIB http://lucene.apache.org/nutch/bot.html nutch-agent@lucene.apache.org
#            added Orbiter http://www.dailyorbit.com/bot.htm
#            added PHP_version_tracker http://www.nexen.net/phpversion/bot.php
#            added SuperBot http://www.sparkleware.com/superbot/
#            added SynooBot http://www.synoo.de/bot.html webmaster@synoo.com
#            added TestBot http://www.agbrain.com/
#            added TutorGigBot http://www.tutorgig.info/
#            added UP.Browser http://developer.openwave.com/dvl/support/faqs/faq_mag_browser.htm
#            added WebIndexer mailto://webindexerv1@yahoo.com
#            added WebMiner http://64.124.122.252/feedback.html
# 2006-02-01
#            added heritrix https://sourceforge.net/forum/message.php?msg_id=3550202
#            added Zeus Webster Pro https://sourceforge.net/forum/message.php?msg_id=3141164
#            additional robots from a list provided by Moizes Gabor [ mojzi -a-t- free mail hu ]
#            added Candlelight_Favorites_Inspector
#            added DomainChecker
#            added EasyDL
#            added FavOrg
#            added Favorites_Sweeper
#            added Html_Link_Validator
#            added Internet_Ninja
#            added JRTwine_Software_Check_Favorites_Utility
#            fixed Microsoft_URL_Control
#            added miniRank
#            added Missigua_Locator
#            added NPBot
#            added Ocelli
#            added Onet.pl_SA
#            added proodleBot
#            added SearchGuild_DMOZ_Experiment
#            added Susie
#            added Website_Monitoring_Bot
#            added Xenu_Link_Sleuth
# 2006-05-15
#            added ASPseek http://www.aspseek.org/
#            added AdamM Bot http://home.blic.net/adamm/
#            added archive.org_bot http://crawls.archive.org/collections/bncf/crawl.html
#            added arianna.libero.it (Italian Portal/search engine)
#            added Biz360 spider http://www.biz360.com
#            added BlogBridge Service http://www.blogbridge.com/
#            added BlogSearch http://www.icerocket.com/
#            added libcrawl
#            added edgeio-relanshanbottriever http://www.edgeio.com
#            added FeedFlow http://feedflow.com/about
#            added Biblioteca Nazionale Centrale di Firenze (Italian National Archive) http://www.bncf.firenze.sbn.it/raccolta.txt
#            added Java catchall - used by many spam bots
#            added lanshanbot http://www.psychedelix.com/cgi-bin/csv2html.pl?data=allagents.csv&template=detail.html&match=%5Cbid_g_l_140406_1%5Cb
#            added msnbot-media http://search.msn.com/msnbot.htm
#            added MT::Telegraph::Agent
#            added Netluchs http://www.netluchs.de/ (German SE bot)
#            added oBot http://www.webmasterworld.com/forum11/1616.htm
#            added Onfolio http://www.onfolio.com/ (IE Toolbar plugin) - hit rss feeds.
#            added ping.blo.gs http://blo.gs/ping.php blog bot
#            added sogou spider http://corp.sohu.com/20051130/n240842344.shtml
#            added sogou test http://corp.sohu.com/20051130/n240842344.shtml
#            added Sphere Scout http://www.sphere.com/
#            added sproose crawler http://www.sproose.com/bot.html
#            added SyndicAPI http://syndicapi.com/bot.html
#            added Yahoo! Mindset http://mindset.research.yahoo.com/
#            added msrabot
#            added Vagabondo & Vagabondo-WAP http://www.wise-guys.nl/Contact/index.php?botselected=webagents&lang=uk
#            fixed Missigua Locator detection (Missigua_Locator -> Missigua Locator)
#            changed echo to echo! to avoid conflict with the bonecho (Firefox 2.0) browser.
#              This requires you to reprocess historic logs if you want EchO! to be recognized for older reports.
# 2006-05-17
#            added Alpha Search Agent # 62.152.125.60 Eurologon Srl
#            added Krugle http://www.krugle.com/crawler/info.html the search engine for developers
#            added Octora Beta Bot http://www.octora.com/ # Blog and Rss Search Engine
#            added UbiCrawler http://law.dsi.unimi.it/ubicrawler/
#            added Yahoo! Slurp China http://misc.yahoo.com.cn/help.html
#              You must reprocess old logs for the Yahoo! Slurp China bot to be detected in old reports
# 2006-05-20
#            added 1-More Scanner http://www.myzips.com/software/1-More-Scanner.phtml
#            added Accoona-AI-Agent http://www.accoona.com/
#            added ActiveBookmark http://www.libmaster.com/active_bookmark.php
#            added BIGLOTRON http://www.biglotron.com/robot.html
#            added Bookmark-Manager http://bkm.sourceforge.net/
#            added cbn00glebot
#            added Cerberian Drtrs http://www.pgts.com.au/cgi-bin/psql?robot_info=25240
#            added CFNetwork http://www.cocoadev.com/index.pl?CFNetwork
#            added CheckWeb link validator http://p.duby.free.fr/chkweb.htm
#            added Computer and Automation Research Institute Crawler http://www.ilab.sztaki.hu/~stamas/publications/p184-benczur.html
#            added ConveraCrawler http://www.authoritativeweb.com/crawl/
#            added ConveraMultiMediaCrawler http://www.authoritativeweb.com/crawl/
#            added CSE HTML Validator Lite Online http://online.htmlvalidator.com/php/onlinevallite.php
#            added Cursor http://adcenter.hu/docs/en/bot.html
#            added Custo http://www.netwu.com/custo/
#            added DataFountains/DMOZ Downloader http://infomine.ucr.edu/
#            added Deepindex http://www.deepindex.net/faq.php
#            added DNSGroup http://www.dnsgroup.com/
#            added DoCoMo http://www.nttdocomo.co.jp/
#            added dumm.de-Bot http://www.dumm.de/
#            added ETS v http://www.freetranslation.com/help/
#            added eventax http://www.eventax.de/
#            added FAST Enterprise Crawler * crawleradmin.t-info@telekom.de http://www.telekom.de/
#            added FAST Enterprise Crawler http://www.fast.no/
#            added FAST Enterprise Crawler * T-Info_BI_cluster crawleradmin.t-info@telekom.de http://www.telekom.de/
#            added FeedValidator http://feedvalidator.org/
#            added FilmkameraBot http://www.filmkamera.at/bot.html
#            added Findexa Crawler http://www.findexa.no/gulesider/article26548.ece
#            added Global Fetch http://www.wesonet.com/
#            added GOFORITBOT http://www.goforit.com/about/
#            added GoForIt.com http://www.goforit.com/about/
#            added GPU p2p crawler http://gpu.sourceforge.net/search_engine.php
#            added HooWWWer http://cosco.hiit.fi/search/hoowwwer/
#            added HPPrint
#            added HTMLParser http://htmlparser.sourceforge.net/
#            added Hundesuche.com-Bot http://www.hundesuche.com/
#            added InfoBot http://www.infobot.org/
#            added InfociousBot http://corp.infocious.com/tech_crawler.php
#            added InternetSupervision http://internetsupervision.com/
#            added isearch2006 http://www.yahoo.com.cn/
#            added IUPUI_Research_Bot http://spamhuntress.com/2005/04/25/a-mail-harvester-visits/
#            added KalamBot http://64.124.122.251/feedback.html
#            added kamano.de NewsFeedVerzeichnis http://www.kamano.de/
#            added Kevin http://dznet.com/kevin/
#            added KnowItAll http://www.cs.washington.edu/research/knowitall/
#            added Knowledge.com http://www.knowledge.com/
#            added Kouaa Krawler http://www.kouaa.com/
#            added ksibot http://ego.ms.mff.cuni.cz/
#            added Link Valet Online http://www.htmlhelp.com/tools/valet/
#            added lwp-request http://search.cpan.org/~gaas/libwww-perl-5.69/bin/lwp-request
#            added lwp-trivial http://search.cpan.org/src/GAAS/libwww-perl-5.805/lib/LWP/Simple.pm
#            added MapoftheInternet.com http://MapoftheInternet.com/
#            added Matrix S.p.A. - FAST Enterprise Crawler http://tin.virgilio.it/
#            added Megite http://www.megite.com/
#            added Metaspinner http://index.meta-spinner.de/
#            added Mini-reptile
#            added Misterbot http://www.misterbot.fr/
#            added Miva http://www.miva.com/
#            added Mizzu Labs http://www.psychedelix.com/cgi-bin/csv2html.pl?data=allagents.csv&template=detail.html&match=\bid_m_141105_2\b
#            added MSRBOT http://research.microsoft.com/research/sv/msrbot/
#            added MS SharePoint Portal Server - MS Search 4.0 Robot http://support.microsoft.com/default.aspx?scid=kb;en-us;284022
#            added Mydoyouhike http://www.doyouhike.net/my
#            added NASA Search http://www.psychedelix.com/cgi-bin/csv2html.pl?data=allagents.csv&template=detail.html&match=\bid_n_s_140506_2\b
#            added NetSprint http://www.netsprint.pl/serwis/
#            added NimbleCrawler http://www.healthline.com/
#            added OpenWebSpider http://www.openwebspider.org/
#            added Oracle Ultra Search http://www.oracle.com/technology/products/ultrasearch/index.html
#            added OSSProxy http://www.marketscore.com/FAQ.Aspx
#            added passwordmaker.org http://passwordmaker.org/
#            added PEAR HTTP Request class http://pear.php.net/
#            added PEERbot http://www.peerbot.com/
#            added PHP version tracker http://www.nexen.net/phpversion/bot.php
#            added PictureOfInternet http://malfunction.org/poi/
#            added plinki http://www.plinki.com/
#            added Port Huron Labs http://www.psychedelix.com/cgi-bin/csv2html.pl?data=allagents.csv&template=detail.html&match=\bid_n_s_1133\b
#            added PostFavorites http://www.psychedelix.com/cgi-bin/csv2html.pl?data=allagents.csv&template=detail.html&match=\bid_n_s_1135\b
#            added ProjectWF-java-test-crawler
#            added PyQuery http://sourceforge.net/projects/pyquery/
#            added Schizozilla http://spamhuntress.com/2005/03/18/gizmo/
#            added Scumbot
#            added Sensis Web Crawler http://www.sensis.com.au/
#            added snap.com beta crawler http://www.snap.com/
#            added Steeler http://www.tkl.iis.u-tokyo.ac.jp/~crawler/
#            added STEROID Download http://faqs.org.ru/progr/pascal/delphi_internet2.htm
#            added Suchfin-Bot http://www.suchfin.de/
#            added Sunrise http://www.sunrisexp.com/
#            added Tagyu Agent http://www.tagyu.com/
#            added Tcl http client package http://www.tcl.tk/man/tcl8.4/TclCmd/http.htm
#            added TeragramCrawlerSURF http://www.teragram.com/
#            added Test Crawler http://netp.ath.cx/
#            added UnChaos Bot Hybrid Web Search Engine http://www.unchaos.com/
#            added unido-bot http://www.unchina.org/unido/unido/our_projects/3_3.html
#            added UniversalFeedParser http://feedparser.org/ (seen from md301000.inktomisearch.com)
#            added updated http://www.updated.com/
#            added Vermut http://vermut.aol.com
#            added versus crawler from eda.baykan@epfl.ch http://www.epfl.ch/Eindex.html
#            added Vespa Crawler (Yahoo Norway?) http://www.psychedelix.com/cgi-bin/csv2html.pl?data=allagents.csv&template=detail.html&match=%5Cbid_t_z_030406_1%5Cb
#            added VSE http://www.vivisimo.com/
#            added webcrawl.net http://www.webcrawl.net/
#            added Web Downloader http://www.krasu.ru/soft/chuchelo/