-
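Never again worry about your crawler's IP getting banned, or about not having the money to buy proxy IPs.
When you write a crawler in Python, you will run into sites that use anti-crawling measures; for example, fetching the same page repeatedly from one IP will very likely get that IP banned. How can this be solved effectively? We can use proxy IPs and set up a proxy IP pool.
Here is a way to obtain a large number of free, working, fast proxy IPs: we visit the Xici (西刺) free proxy IP site.
It lists many proxy IPs, but after trying them you will find that not every one works. So what we need to do is filter out the working, fast, stable IPs from the ones it provides.
The method for building a free proxy IP pool described below:
Advantages: free, plentiful, working, fast
Disadvantage: the pool needs to be re-filtered periodically
Main idea:
Scrape the IP addresses from the site and store them
Verify that each IP is usable (request a URL through it and check the response status code)
Format the IP addresses
The code is as follows: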
1. Import the packages
import requests
from lxml import etree
import time
2. Get the proxy IPs from the Xici free proxy site
def get_all_proxy():
    url = 'http://www.xicidaili.com/nn/1'
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36',
    }
    response = requests.get(url, headers=headers, timeout=10)
    html_ele = etree.HTML(response.text)
    # Column 2 of the ip_list table holds the IP, column 3 the port.
    ip_eles = html_ele.xpath('//table[@id="ip_list"]/tr/td[2]/text()')
    port_ele = html_ele.xpath('//table[@id="ip_list"]/tr/td[3]/text()')
    proxy_list = []
    # zip stops at the shorter list, so a missing cell cannot raise IndexError.
    for ip, port in zip(ip_eles, port_ele):
        proxy_str = 'http://' + ip + ':' + port
        proxy_list.append(proxy_str)
    return proxy_list
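The "format the IP addresses" step can also be made a bit more defensive. Below is a minimal sketch, assuming the scraped cells are plain strings such as '1.2.3.4' and '8080'; the helper names format_proxy and build_proxy_list are made up for illustration and are not part of the original post. It only uses the standard library.

```python
import ipaddress


def format_proxy(ip, port, scheme="http"):
    """Return 'scheme://ip:port', or None if the ip or port is malformed."""
    ip = ip.strip()
    port = port.strip()
    try:
        ipaddress.IPv4Address(ip)   # raises ValueError for a bad address
        port_num = int(port)        # raises ValueError for a non-numeric port
    except ValueError:
        return None
    if not 1 <= port_num <= 65535:
        return None
    return f"{scheme}://{ip}:{port_num}"


def build_proxy_list(ips, ports):
    """Pair up the scraped columns, dropping duplicates and malformed rows."""
    seen = set()
    result = []
    for ip, port in zip(ips, ports):
        proxy = format_proxy(ip, port)
        if proxy is not None and proxy not in seen:
            seen.add(proxy)
            result.append(proxy)
    return result
```

Passing the two xpath result lists to build_proxy_list would give the same 'http://ip:port' strings as the loop above, minus any broken or repeated rows.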
3. Validate the scraped IPs
def check_all_proxy(proxy_list):
    valid_proxy_list = []
    for proxy in proxy_list:
        url = 'http://www.baidu.com/'
        proxy_dict = {
            'http': proxy
        }
        try:
            start_time = time.time()
            response = requests.get(url, proxies=proxy_dict, timeout=5)
            if response.status_code == 200:
                end_time = time.time()
                print('Proxy usable: ' + proxy)
                print('Elapsed: ' + str(end_time - start_time))
                valid_proxy_list.append(proxy)
            else:
                print('Proxy returned status ' + str(response.status_code))
        except requests.RequestException:
            # Timeouts and connection errors both end up here.
            print('Proxy unusable--------------->' + proxy)
    return valid_proxy_list
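Checking the proxies one at a time can be slow, because every dead proxy costs a full timeout. The sketch below shows one possible variation, not the original author's method: it runs the checks in a thread pool and returns the working proxies sorted fastest first. The names timed_probe and check_concurrently are hypothetical, and probe is assumed to be any function that takes a proxy string and returns True when the proxy works.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_probe(proxy, probe):
    """Return (proxy, seconds) if probe(proxy) succeeds, else (proxy, None)."""
    start = time.monotonic()
    try:
        ok = probe(proxy)
    except Exception:
        ok = False  # any error counts as a dead proxy
    elapsed = time.monotonic() - start
    return proxy, (elapsed if ok else None)


def check_concurrently(proxy_list, probe, max_workers=20):
    """Probe all proxies in parallel; return the working ones, fastest first."""
    if not proxy_list:
        return []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda p: timed_probe(p, probe), proxy_list))
    working = [(p, t) for p, t in results if t is not None]
    working.sort(key=lambda item: item[1])
    return [p for p, _ in working]


# Example probe (hypothetical helper, needs the requests package):
# def http_probe(proxy):
#     import requests
#     r = requests.get('http://www.baidu.com/', proxies={'http': proxy}, timeout=5)
#     return r.status_code == 200
```

Keeping the probe as a parameter means the same function can be tried out with a fake probe first, without touching the network.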
4. Print the resulting IP pool
if __name__ == '__main__':
    proxy_list = get_all_proxy()
    valid_proxy_list = check_all_proxy(proxy_list)
    print('--'*30)
    print(valid_proxy_list)
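Once the pool is printed, the next question is how to use it in a crawler. Here is a minimal sketch, assuming the pool is a plain list of proxy strings like the one produced above. The name fetch_with_pool and the fetch(url, proxy) callback are hypothetical; with requests, the callback could wrap requests.get(url, proxies={'http': proxy}, timeout=5).

```python
import random


def fetch_with_pool(url, pool, fetch, max_tries=3):
    """Try up to max_tries randomly chosen proxies; drop the ones that fail.

    fetch(url, proxy) is a caller-supplied function that returns a response
    or raises on failure. pool is a list and is modified in place.
    """
    for _ in range(max_tries):
        if not pool:
            break
        proxy = random.choice(pool)
        try:
            return fetch(url, proxy)
        except Exception:
            pool.remove(proxy)  # dead proxy: do not pick it again
    raise RuntimeError("no working proxy left for " + url)
```

Because failed proxies are removed as they are found, the pool shrinks over time, which is one reason it has to be rebuilt periodically.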
My technical skills are limited, so feedback is welcome. I promise to stay positive and keep learning.
————————————————
Copyright notice: this is an original article by CSDN blogger 「彬小二」, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.
Original link: https://blog.csdn.net/qq_39884947/article/details/86609930
Tags:
python
ip
proxy
prevent
Upload time:
2019-11-15
Uploaded by: fygwz1982
-
In the hit CBS crime show Person of Interest, which debuted in 2011,
the two heroes—one a former Central Intelligence Agency agent and
the other a billionaire technology genius—work together using the
ubiquitous surveillance system in New York City to try to stop violent
crime. It’s referred to by some as a science fiction cop show. But the
use of advanced technology for crime analysis in almost every major
police department in the United States may surpass what’s depicted
on TV crime dramas such as Person of Interest. Real-time crime centers
(RTCCs) are a vital aspect of intelligent policing. Crime analysis
is no longer the stuff of science fiction. It’s real.
Tags:
Intelligence
Analysis
Crime
Upload time:
2020-05-25
Uploaded by: shancjb
-
This book attempts to provide an extensive overview of Long-Term Evolution
(LTE) networks. Understanding LTE and its Performance is purposely written to
appeal to a broad audience and to be of value to anyone who is interested in 3GPP
LTE or wireless broadband networks more generally. The aim of this book is to
offer comprehensive coverage of current state-of-the-art theoretical and technological
aspects of broadband mobile and wireless networks focusing on LTE. The
presentation starts from basic principles and proceeds smoothly to the most advanced
topics. The schemes provided are developed and oriented in the context of very current
closed standards, namely 3GPP LTE.
Tags:
Performance
LTE
and
its
Upload time:
2020-05-27
Uploaded by: shancjb
-
The wide deployment of wireless networks and mobile technologies, along with the
significant increase in the number of mobile device users, has created a very strong
demand for various wireless-based, mobile-based software application systems and
enabling technologies. This not only provides many new business opportunities and
challenges to wireless and networking service providers, mobile technology vendors,
and software industry and solution integrators, but also changes and enhances
people’s lives in many areas, including communications, information sharing and
exchange, commerce, home environment, education, and entertainment. Business
organizations and government agencies face new pressure for technology updates to
upgrade their networking infrastructures with wireless connectivity to enhance
enterprise-oriented systems and solutions.
Tags:
Wireless-Based
Software
Systems
Upload time:
2020-06-01
Uploaded by: shancjb
-
The main aim of this book is to present a unified, systematic description of
basic and advanced problems, methods and algorithms of the modern control
theory considered as a foundation for the design of computer control
and management systems. The scope of the book differs considerably from
the topics of classical traditional control theory mainly oriented to the
needs of automatic control of technical devices and technological processes.
Taking into account a variety of new applications, the book presents
a compact and uniform description containing traditional analysis and optimization
problems for control systems as well as control problems with
non-probabilistic models of uncertainty, problems of learning, intelligent,
knowledge-based and operation systems – important for applications in the
control of manufacturing processes, in the project management and in the
control of computer systems.
Tags:
Modern_Control_Theory
Upload time:
2020-06-10
Uploaded by: shancjb
-
It all started rather innocuously. I walked into Dr GT Murthy’s office one fine day, and changed my life. “Doc” was then the General Manager, Central R&D, of a very large electrical company headquartered in Bombay. In his new state-of-the-art electronics center, he had hand-picked some of India’s best engineers (over a hundred already) ever assembled under one roof. Luckily, he too was originally a Physicist, and that certainly helped me gain some empathy. Nowadays he is in retirement, but I will always remember him as a thoroughly fair, honest and facts-oriented person, who led by example. There were several things I absorbed from him that are very much part of my basic engineering persona today. You can certainly look upon this book as an extension of what Doc started many years ago in India … because that’s what it really is! I certainly wouldn’t be here today if I hadn’t met Doc. And in fact, several of the brash, high-flying managers I’ve met in recent years, desperately need some sort of crash course in technology and human values from Doc!
Tags:
Switching power supply
Upload time:
2021-11-23
Uploaded by:
-
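The PADS Layout user interface is very easy to use and efficient. While meeting the needs of professional users, PADS Layout also takes into account users who are new to PCB software. This section of the tutorial covers the following: · Working interactively with PADS Layout · Using the workspace · Setting grids (Grids) · Using pan (Pan) and zoom (Zoom) · The object-oriented (Object Oriented) selection method
Tags:
pads
Upload time:
2021-11-28
Uploaded by: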
-
HD e-book: C++ Primer Plus, 6th Edition (English edition), 1438 pages
Learning C++ is an adventure of discovery, particularly because the language accommodates several programming paradigms, including object-oriented programming,
generic programming, and the traditional procedural programming. The fifth edition of
this book described the language as set forth in the ISO C++ standards, informally
known as C++99 and C++03, or, sometimes as C++99/03. (The 2003 version was
largely a technical correction to the 1999 standard and didn’t add any new features.)
Since then, C++ continues to evolve. As this book is written, the international C++
Standards Committee has just approved a new version of the standard. This standard had
the informal name of C++0x while in development, and now it will be known as
C++11. Most contemporary compilers support C++99/03 quite well, and most of the
examples in this book comply with that standard. But many features of the new standard
already have appeared in some implementations, and this edition of C++ Primer Plus
explores these new features.
C++ Primer Plus discusses the basic C language and presents C++ features, making
this book self-contained. It presents C++ fundamentals and illustrates them with short,
to-the-point programs that are easy to copy and experiment with. You learn about
input/output (I/O), how to make programs perform repetitive tasks and make choices,
the many ways to handle data, and how to use functions. You learn about the many
features C++ has added to C, including the followi
Tags:
C++
Upload time:
2022-02-19
Uploaded by: trh505
-
Head First Java is a complete learning guide to object-oriented (OO) programming and Java. Designed around learning theory, it takes you from the basics of the language through to topics such as threads, networking, and distributed programs. Most importantly, you learn to think like an object-oriented developer instead of just memorizing from the book. Along the way you play games, do puzzles, solve riddles, and interact with Java in unexpected ways. In these activities you also write a pile of real Java programs, such as a battleship game and a network chat program. The illustrated Head First style helps you absorb the material quickly, so open your mind and get ready to learn these key topics: ★ The Java language ★ Object-oriented development ★ Swing graphical interfaces ★ Using the Java API library ★ Writing, testing, and deploying applications ★ Exception handling; multithreading ★ Network programming ★ Collections and generics
Tags:
java
Upload time:
2022-06-12
Uploaded by:
-
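The design of RFID (Radio Frequency Identification) middleware involves several layers of the system, such as data collection from RFID tags, tag data management, and RFID system security. Different designs and implementations are adopted for different layers, depending on the specific application. Middleware designed this way, however, lacks consistency and flexibility, and designers cannot design RFID middleware within a single unified framework. The service-oriented RFID middleware architecture, SOA (Service-Oriented Architecture), is a framework for software development across RFID application domains; it is a service-centric approach to, and environment for, building distributed software systems, covering both the runtime environment and the programming architectural style. Developing RFID middleware with SOA can considerably improve the integrity, flexibility, and uniformity of the software design. With SOA as the foundation of RFID middleware design, this paper addresses several problems in RFID middleware design, such as automatic parsing of EPC codes, connecting RFID readers, exchanging or sharing RFID tag data, and RFID system security, and proposes a service-oriented RFID middleware platform architecture. The paper uses SOA design principles to build the software architecture of the RFID middleware, and then, through the system-integration service pattern of querying, invoking, and providing services, clearly defines an RFID reader management service, a tag information service, an RFID security service, and so on. This makes the middleware suitable for different RFID applications. It also implements automatic parsing of EPC codes according to the EPCglobal standard, which not only helps exchange and integrate RFID tag data across platforms but also lowers the difficulty of building RFID systems for different applications.
Tags:
rfid
Upload time:
2022-06-25
Uploaded by: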