Multithreaded Image Scraping from ZCOOL (zcool.com.cn) with Python

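Quickly scrape and download all photos, illustrations, and other images uploaded by a designer or user on ZCOOL (https://www.zcool.com.cn).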

Project repository: https://github.com/lonsty/scraper

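Features:

Fast downloads: images are downloaded asynchronously by multiple threads, and the number of threads is configurable
Retry on failure: failed downloads are retried a configurable number of times (3 by default), so transient errors rarely leave an image missing (^o^)/
Incremental downloads: when a designer or user uploads new work, just run the program again; existing files are not overridden by default o(∩_∩)o
Proxy support: requests can be routed through a configured proxy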
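
The first three features can be illustrated with a minimal sketch. This is not the code in crawler.py; it only assumes that the crawler combines a thread pool, a bounded retry loop, and a check for existing files. The names `fetch_with_retries`, `download_all`, and `make_flaky_fetch` are hypothetical, and the HTTP request is replaced by an injected `fetch` callable so the example runs without network access.

```python
import tempfile
import threading
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path


def fetch_with_retries(fetch, url, retries=3, delay=0.0):
    """Call fetch(url); on any exception, retry up to `retries` more times."""
    for attempt in range(retries + 1):
        try:
            return fetch(url)
        except Exception:
            if attempt == retries:
                raise  # out of attempts: let the caller record the failure
            time.sleep(delay)


def download_all(urls, fetch, directory, max_workers=20, retries=3, override=False):
    """Download urls concurrently; return sorted (saved, skipped, failed) lists."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    saved, skipped, failed = [], [], []

    def job(url):
        # Hypothetical naming rule: use the last path segment as the file name.
        target = directory / url.rsplit("/", 1)[-1]
        if target.exists() and not override:
            return "skipped"  # incremental run: keep what is already on disk
        target.write_bytes(fetch_with_retries(fetch, url, retries))
        return "saved"

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(job, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                result = future.result()
            except Exception:
                failed.append(url)
            else:
                (saved if result == "saved" else skipped).append(url)
    return sorted(saved), sorted(skipped), sorted(failed)


def make_flaky_fetch():
    """Stand-in for a real HTTP GET: fails the first time each URL is requested."""
    seen, lock = set(), threading.Lock()

    def fetch(url):
        with lock:
            first_time = url not in seen
            seen.add(url)
        if first_time:
            raise ConnectionError("simulated transient failure")
        return url.encode()

    return fetch


if __name__ == "__main__":
    demo_urls = ["https://img.example.com/a.jpg", "https://img.example.com/b.jpg"]
    with tempfile.TemporaryDirectory() as tmp:
        print(download_all(demo_urls, make_flaky_fetch(), tmp))  # first run saves
        print(download_all(demo_urls, make_flaky_fetch(), tmp))  # second run skips
```

Injecting `fetch` keeps the concurrency and retry logic separate from the HTTP client, so a real client call could be swapped in without touching the pool. The parameters `max_workers`, `retries`, and `override` mirror the `--max-workers`, `--retries`, and `--override` options listed in the help output.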

Requirements:

Python 3.6 or later

1. Quick Start

1) Clone the project locally

git clone https://github.com/lonsty/scraper

2) Install the dependencies

cd scraper
pip install -r requirements.txt

3) Run the crawler

Download all images uploaded by the user <username> to the directory <path>:

python crawler.py -u <username> -d <path>

2. Usage Help

List all options:

python crawler.py --help

Usage: crawler.py [OPTIONS]

  Use multi-threaded to download images from https://www.zcool.com.cn in
  bulk by username or ID.

Options:
  -i, --id TEXT              User ID.
  -u, --username TEXT        User name.
  -d, --directory TEXT       Directory to save images.
  -p, --max-pages INTEGER    Maximum pages to parse.
  -t, --max-topics INTEGER   Maximum topics per page to parse.
  -w, --max-workers INTEGER  Maximum thread workers.  [default: 20]
  -R, --retries INTEGER      Repeat download for failed images.  [default: 3]
  -r, --redownload TEXT      Redownload images from failed records.
  -o, --override             Override existing files.  [default: False]
  --proxies TEXT             Use proxies to access websites.
                             Example:
                             '{"http": "user:passwd@www.example.com:port",
                             "https": "user:passwd@www.example.com:port"}'
  --help                     Show this message and exit.
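
The --proxies option takes a JSON object as a string. A minimal sketch of how such a string could be turned into a proxy mapping is shown below. The function name `parse_proxies` is hypothetical, and whether crawler.py parses the option this way is an assumption; the resulting dict has the shape accepted by the `proxies` argument of `requests.get`.

```python
import json


def parse_proxies(text):
    """Turn the --proxies JSON string into a {scheme: proxy address} dict."""
    proxies = json.loads(text)  # raises ValueError on malformed JSON
    if not isinstance(proxies, dict):
        raise ValueError("proxies must be a JSON object")
    unknown = set(proxies) - {"http", "https"}
    if unknown:
        raise ValueError("unsupported proxy schemes: %s" % sorted(unknown))
    return {scheme: str(address) for scheme, address in proxies.items()}


if __name__ == "__main__":
    # 8080 is a placeholder port chosen for this example.
    option = '{"http": "user:passwd@www.example.com:8080", "https": "user:passwd@www.example.com:8080"}'
    proxies = parse_proxies(option)
    print(proxies)
    # With the requests library installed, the mapping could then be used as:
    #     requests.get(url, proxies=proxies)
```

Validating the keys up front surfaces a mistyped scheme immediately, before any request is sent.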

3. Changelog

version 0.1.0 (2019.09.09)

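Main features:
- Fast downloads: images are downloaded asynchronously by multiple threads, and the number of threads is configurable
- Retry on failure: failed downloads are retried a configurable number of times (3 by default), so transient errors rarely leave an image missing (^o^)/
- Incremental downloads: when a designer or user uploads new work, just run the program again o(∩_∩)o
- Proxy support: requests can be routed through a configured proxy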
