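1. Crawl pages on a specific, fixed topic;
2. Produce a plain-text log file in the format: timestamp, URL;
3. Open at most 2 connections when fetching any one URL (note: the number of local threads used for page parsing is unlimited);
4. Follow polite-spider rules: check the robots.txt file and meta tags for restrictions; each thread must sleep 2 seconds after fetching a page;
5. Parse HTML pages and extract link URLs; detect whether an extracted URL has already been processed, and do not re-parse pages that have already been crawled;
6. Allow basic spider/crawler parameters to be configured, including crawl depth and seed URLs;
7. Use a User-agent to identify itself to the server;
8. Produce crawl statistics, including crawl speed, time needed to complete the crawl, and total number of pages crawled; comment important variables and all classes and methods;
9. Follow coding conventions, such as naming conventions for classes, methods, and files;
10. Optional: a GUI or web interface for managing the spider/crawler, including start/stop and adding/removing URLs.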
Tags: log
Uploaded: 2013-12-22
Uploader: wang5829
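The crawler requirements above (robots.txt check, timestamp/URL log, duplicate detection, depth limit, 2-second pause) can be sketched roughly as follows. This is a minimal illustration, not the uploaded program: the names `USER_AGENT`, `LinkExtractor`, and `crawl` are hypothetical, page fetching is passed in as a function so the sketch makes no network calls itself, and the meta-tag check, the 2-connection limit, and the speed statistics are left out.

```python
import time
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.robotparser import RobotFileParser

USER_AGENT = "ExampleSpider/0.1"  # hypothetical identity string (requirement 7)


class LinkExtractor(HTMLParser):
    """Collects absolute, fragment-free URLs from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                self.links.append(absolute)


def crawl(seed_urls, max_depth, fetch, robots, log_lines,
          delay=2.0, sleep=time.sleep):
    """Breadth-first crawl; returns the number of pages fetched.

    fetch(url) -> HTML string is supplied by the caller (an assumption of
    this sketch); robots is a RobotFileParser that has already been loaded.
    """
    seen = set(seed_urls)                       # URLs already queued or done
    queue = deque((url, 0) for url in seed_urls)
    pages = 0
    while queue:
        url, depth = queue.popleft()
        if not robots.can_fetch(USER_AGENT, url):
            continue                            # disallowed by robots.txt
        html = fetch(url)
        log_lines.append(f"{int(time.time())}\t{url}")  # timestamp, URL
        pages += 1
        if depth < max_depth:
            parser = LinkExtractor(url)
            parser.feed(html)
            for link in parser.links:
                if link not in seen:            # skip URLs seen before
                    seen.add(link)
                    queue.append((link, depth + 1))
        sleep(delay)                            # pause after each page
    return pages
```

In real use, `fetch` would wrap something like `urllib.request.urlopen` with a `User-Agent` header set to `USER_AGENT`, and `log_lines` would be written to the log file. Passing `fetch` and `sleep` in as parameters also makes it possible to exercise the crawl logic against an in-memory set of pages.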
This is GBA (Game Boy Advance) animation sample code. It displays 45 BMPs on the GBA screen continuously, forward and in reverse, so that they look like an animation.
Tags: GBA animation continue Advance
Uploaded: 2013-12-12
Uploader: change0329
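One plausible reading of the GBA description is "ping-pong" playback: the frames are shown first to last, then back down to the first, and so on. The sketch below only illustrates that frame ordering. It is not the uploaded GBA code, the function name `ping_pong_frames` is hypothetical, and the only detail taken from the description is the frame count of 45.

```python
from itertools import islice

FRAME_COUNT = 45  # number of BMP frames, from the description


def ping_pong_frames(count):
    """Yield frame indices 0..count-1, then count-2..1, repeating forever."""
    if count < 1:
        raise ValueError("count must be at least 1")
    while True:
        for index in range(count):             # forward pass
            yield index
        for index in range(count - 2, 0, -1):  # reverse pass, ends not repeated
            yield index


# First few indices for a 4-frame animation: 0, 1, 2, 3, 2, 1, 0, 1
preview = list(islice(ping_pong_frames(4), 8))
```

With 45 frames, one full forward-and-back cycle under this scheme is 88 frames long, because the first and last frames are not shown twice in a row.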
How well do you really know Java? Are you a code sleuth? Have you ever spent days chasing a bug caused by a trap or pitfall in Java or its libraries? Do you like brainteasers? Then this is the book for you!
Uploaded: 2013-11-25
Uploader: 王慶才
A student status management system is a typical management information system (MIS). Its development mainly covers two aspects: building and maintaining the back-end database, and developing the front-end application. The former requires a database with strong data consistency and integrity and good data security; the latter requires an application that is functionally complete and easy to use.
Tags: management development information Student
Uploaded: 2015-11-01
Uploader: 1101055045
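<%@ LANGUAGE="VBSCRIPT" %>
<!--#include file="conn.asp" -->
<%
ProductClass_2 = request("ProductClass_2")
set rs = server.createobject("adodb.recordset")
sqltext = "select * from Product"
' Filter by product name; an empty name matches every row.
' LIKE patterns must be quoted, and single quotes in the input are doubled.
if request("Product_Name") <> "" then
  sqltext = sqltext & " where Product_Name like '%" & Replace(request("Product_Name"), "'", "''") & "%' "
else
  sqltext = sqltext & " where Product_Name like '%%' "
end if
' Optionally narrow the result by product class.
if request("Product_Class") <> "" then
  sqltext = sqltext & " and Class_1 like '%" & Replace(request("Product_Class"), "'", "''") & "%' "
end if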
Tags: ProductClass lt LANGUAGE VBSCRIPT
Uploaded: 2013-11-25
Uploader: wl9454
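The product search in the ASP snippet builds its SQL by string concatenation. The same filter logic can be sketched with bound parameters, which avoids quoting problems entirely. This is an illustration in Python with `sqlite3`, not the original ASP/ADO code: the function name `build_product_query` is hypothetical, and only the table and column names (`Product`, `Product_Name`, `Class_1`) come from the snippet.

```python
import sqlite3


def build_product_query(product_name="", product_class=""):
    """Return (sql, params) for a name/class search on the Product table.

    Mirrors the snippet's logic: the name filter is always present (an empty
    name becomes the pattern '%%', which matches every non-NULL name), and
    the class filter is added only when a class is given.
    """
    sql = "SELECT * FROM Product WHERE Product_Name LIKE ?"
    params = [f"%{product_name}%"]
    if product_class:
        sql += " AND Class_1 LIKE ?"
        params.append(f"%{product_class}%")
    # Values are bound as parameters and never spliced into the SQL text.
    # '%' and '_' typed by the user still act as LIKE wildcards.
    return sql, params


def search_products(conn, product_name="", product_class=""):
    """Run the search on an open sqlite3 connection and return all rows."""
    sql, params = build_product_query(product_name, product_class)
    return conn.execute(sql, params).fetchall()
```

A call such as `search_products(conn, product_name="Lamp")` returns every row whose name contains "Lamp". Input containing a single quote is treated as ordinary text to match, not as SQL.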
A library management system. It implements many functions; I hope you will like it.
Tags: system appriciate management implement
Uploaded: 2015-11-06
Uploader: 努力努力再努力
This is a USB core stack for the built-in USB device of LPC214x microcontrollers. It handles the hardware interface and USB enumeration/configuration. Also included are application examples such as a USB joystick HID and USB serial port emulation.
Tags: microcontrollers USB the built-in
Uploaded: 2015-11-14
Uploader: talenthn
Windows open-source code. Microsoft Windows is a complex operating system. It offers so many features and does so much that it's impossible for any one person to fully understand the entire system. This complexity also makes it difficult for someone to decide where to start concentrating the learning effort. Well, I always like to start at the lowest level by gaining a solid
Tags: Microsoft operating features windows
Uploaded: 2015-11-24
Uploader: zhuyibin
This is an implementation of structural SVM for training complex alignment models for protein sequence alignment, especially for homology modeling. The structural SVM algorithm can incorporate many relevant features, such as secondary structure, relative exposed surface area, profiles, and their various interactions, into the alignment model. It was developed under Linux, compiles with gcc, and is built upon the svm^light software by Thorsten Joachims.
Tags: implementation structural for alignment
Uploaded: 2014-01-11
Uploader: chenbhdt
Abstract: By using gateway systems on large 32-bit platforms, networks of small, 8- and 16-bit microcontrollers can be monitored and controlled over the Internet. With embedded Linux, these gateways are easily moved from full-blown host PCs to embedded platforms like the PC104. In this class you will learn about hardware platforms that support embedded Linux, Linux kernel configuration, feature selection, installation, booting and tuning.
Tags: bit platforms Abstract networks
Uploaded: 2014-01-05
Uploader: kytqcool