Python: fetching web pages about 3x faster with multi-coroutine asynchronous I/O

2022-12-01

from gevent import monkey
# Patch the standard library's blocking I/O so that gevent can switch to
# another coroutine while one is waiting. Without this call the "async"
# part below has no effect. Patch before importing urllib.
monkey.patch_all()

import time
from urllib import request

import gevent


def f(url):
    print('GET:%s' % url)
    resp = request.urlopen(url)  # fetch the page
    data = resp.read()  # read the response body
    print('%d bytes received from %s' % (len(data), url))  # print its length


urls = ['https://www.yahoo.com/', 'https://www.python.org/',
        'https://github.com/']

start = time.time()
for i in urls:
    f(i)  # fetch the pages in the list one after another
print('Serial execution time:', time.time() - start)

async_time = time.time()
# Spawn one coroutine per URL and wait for all of them to finish.
gevent.joinall([gevent.spawn(f, u) for u in urls])
print('Async execution time:', time.time() - async_time)
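The effect of monkey.patch_all() can also be checked without network access. The following is only a sketch under stated assumptions: fake_fetch is a hypothetical stand-in for a slow request and is not part of the original script, and the 0.2 second delay is arbitrary. It relies on the fact that patch_all() replaces time.sleep with gevent's cooperative sleep, so a sleeping coroutine lets the others run.

```python
from gevent import monkey
monkey.patch_all()  # after this, time.sleep yields to other greenlets

import time

import gevent


def fake_fetch(name, delay):
    # Hypothetical stand-in for a slow network request.
    time.sleep(delay)
    return name


names = ['a', 'b', 'c']

# Serial: each simulated request waits for the previous one.
start = time.time()
for name in names:
    fake_fetch(name, 0.2)
serial_elapsed = time.time() - start

# Concurrent: the three sleeps overlap.
start = time.time()
jobs = [gevent.spawn(fake_fetch, name, 0.2) for name in names]
gevent.joinall(jobs)
concurrent_elapsed = time.time() - start

results = [job.value for job in jobs]
print('serial: %.2f s, concurrent: %.2f s' % (serial_elapsed, concurrent_elapsed))
```

If the monkey.patch_all() line is removed, time.sleep blocks the whole process again and the two timings should come out roughly equal.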
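The output of a run of the page-fetching script is shown below. The advantage of multiple coroutines is obvious: the serial loop takes about 4.71 seconds, while the coroutine version takes about 1.65 seconds, roughly 2.85 times faster. If monkey.patch_all() is not called, the coroutines block on each request in turn and the asynchronous part runs serially as well.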
C:\Users\hushuning\Anaconda3\python.exe C:/Users/hushuning/PycharmProjects/untitled/njx/把当前程序的所有的io操作单独标记,进行异步操作.py
GET:https://www.yahoo.com/
510125 bytes received from https://www.yahoo.com/
GET:https://www.python.org/
48857 bytes received from https://www.python.org/
GET:https://github.com/
51373 bytes received from https://github.com/
Serial execution time: 4.710935354232788
GET:https://www.yahoo.com/
GET:https://www.python.org/
GET:https://github.com/
48857 bytes received from https://www.python.org/
512422 bytes received from https://www.yahoo.com/
51373 bytes received from https://github.com/
Async execution time: 1.6521050930023193

Process finished with exit code 0
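
The speedup implied by the two timings in this log can be worked out directly. This is a small sketch; the variable names are chosen here for illustration, and the numbers are copied from the log of a single run, so they will differ from run to run and network to network.

```python
# Timings copied from the run log above, in seconds.
serial_time = 4.710935354232788
async_time = 1.6521050930023193

speedup = serial_time / async_time      # how many times faster
time_saved = serial_time - async_time   # absolute saving in seconds

print('speedup: %.2fx' % speedup)
print('time saved: %.2f s' % time_saved)
```

For this run the ratio is a little under 3, which is where the "about 3x" in the title comes from.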

 
