Scraping Journey to the West from Baidu Novels with Python coroutines

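Preface

There is more than one way to do this; if you see it differently, feel free to discuss.

"""
Use coroutines to scrape the complete novel Journey to the West (西游记) from Baidu Novels.
"""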

import asyncio

import aiofiles
import aiohttp
import requests
from lxml import etree

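async def async_download(title, url):
    """
    Download one chapter with a coroutine and save it as a text file.

    :param title: chapter title, used as the output file name
    :param url: URL of the chapter page
    :return: None
    """
    async with aiohttp.ClientSession() as session:
        # The 西游记/ directory must already exist, otherwise opening the file fails
        file_name = "西游记/%s.txt" % title
        async with session.get(url) as resp:
            tree = etree.HTML(await resp.text())
            # The chapter body is the text nodes of <dd id="contents">
            contents = tree.xpath("//dd[@id='contents']/text()")
            # Join the fragments, skipping nodes that are only a line break
            temp = ''.join(content for content in contents if content != '\r\n')
            async with aiofiles.open(file_name, mode='w', encoding='utf-8') as f:
                await f.write(temp)
    print("%s ...... download complete!" % title)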

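async def main(td_as):
    """
    Wrap each chapter download in a task and run them all concurrently.

    :param td_as: <a> elements from the table of contents, one per chapter
    :return: None
    """
    tasks = []
    for td in td_as:
        # The href attribute holds the chapter URL, the link text holds the title
        url_c = td.xpath("./@href")[0]
        title = td.xpath("./text()")[0]
        tasks.append(asyncio.create_task(async_download(title, url_c)))
    # Wait until every download task has finished
    await asyncio.wait(tasks)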

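if __name__ == '__main__':
    # Program entry point: fetch the table of contents synchronously,
    # then download every chapter concurrently.
    url = 'http://www.wibaidu.com/modules/article/reader.php?aid=24537'
    resp = requests.get(url)
    # Let requests guess the page encoding from the response body
    resp.encoding = resp.apparent_encoding
    tree = etree.HTML(resp.text)
    # Each chapter link is an <a> inside a <td class="L"> cell
    td_as = tree.xpath("//td[@class='L']/a")
    # asyncio.run creates the event loop, runs main to completion, then closes the loop
    asyncio.run(main(td_as))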
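
The script starts one task per chapter all at once, so a long table of contents means that many requests in flight at the same moment. One possible way to cap this is an asyncio.Semaphore. What follows is only a sketch under assumptions: fetch_chapter, bounded, download_all, and the limit parameter are hypothetical names that do not appear in the original script, and fetch_chapter merely sleeps in place of async_download so the example runs without network access.

```python
import asyncio


async def fetch_chapter(title, delay=0.01):
    """Hypothetical stand-in for async_download: sleeps instead of doing network I/O."""
    await asyncio.sleep(delay)
    return title


async def bounded(semaphore, coro_func, *args):
    """Run coro_func(*args) while holding one slot of the semaphore."""
    async with semaphore:
        return await coro_func(*args)


async def download_all(titles, limit=5):
    """Run at most `limit` downloads at a time; results keep the input order."""
    semaphore = asyncio.Semaphore(limit)
    jobs = [bounded(semaphore, fetch_chapter, title) for title in titles]
    # gather returns results in the same order as the awaitables passed in
    return await asyncio.gather(*jobs)


if __name__ == "__main__":
    chapters = ["chapter-%d" % i for i in range(20)]
    done = asyncio.run(download_all(chapters, limit=5))
    print(len(done), "chapters finished")
```

In the real script, the same pattern would wrap async_download(title, url_c) instead of the stand-in; a suitable value for limit depends on what the target site tolerates.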
