掌柜
Installing pyspider on CentOS 7
01/07
This article was last updated on July 7, 2023.
Installation
## Install
[root@ac4:~]$ pip3 install pyspider
## Run it; it fails with an error
[root@ac4:~]$ pyspider
ValueError: Invalid configuration:
- Deprecated option 'domaincontroller': use 'http_authenticator.domain_controller' instead.
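## Fix: downgrade wsgidav to 2.4.1 (newer wsgidav rejects the 'domaincontroller' option that pyspider still passes)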
[root@ac4:~]$ pip3 install wsgidav==2.4.1
## Run it again; a different error appears
[root@ac4:~]$ pyspider
from werkzeug.wsgi import DispatcherMiddleware
ImportError: cannot import name 'DispatcherMiddleware'
## Fix: downgrade flask first, then werkzeug (Werkzeug 1.0 moved DispatcherMiddleware to werkzeug.middleware.dispatcher)
[root@ac4:~]$ pip3 install flask==1.0
[root@ac4:~]$ pip3 install werkzeug==0.16.1
## Run it again; this time it starts successfully.
[root@ac4:~]$ pyspider
[W 220107 11:35:30 run:413] phantomjs not found, continue running without it.
[I 220107 11:35:32 result_worker:49] result_worker starting...
[I 220107 11:35:33 tornado_fetcher:638] fetcher starting...
[I 220107 11:35:33 processor:211] processor starting...
[I 220107 11:35:33 scheduler:647] scheduler starting...
[I 220107 11:35:33 scheduler:782] scheduler.xmlrpc listening on 127.0.0.1:23333
[I 220107 11:35:33 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
[I 220107 11:35:33 app:76] webui running on 0.0.0.0:5000
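The three downgrades above can be collected into one pinned requirements file, so a fresh machine does not have to rediscover them one error at a time. This is only a minimal sketch: the file name requirements-pyspider.txt is my own choice (pyspider does not look for it), and the pins are simply the versions that worked in the session above.

```shell
# Write a requirements file with the pins from the session above.
# The file name is hypothetical; any name works with `pip3 install -r`.
cat > requirements-pyspider.txt <<'EOF'
pyspider
wsgidav==2.4.1
flask==1.0
werkzeug==0.16.1
EOF

# Show what was written so the pins can be reviewed before installing.
cat requirements-pyspider.txt
```

Running `pip3 install -r requirements-pyspider.txt` afterwards should install pyspider together with the pinned versions in a single step. Other dependencies may still need pins of their own on newer Python versions.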
Configuring an access password
## Prepare the configuration file
[root@ac4:/data/app/pyspider]$ cat config.json
{
    "webui": {
        "port": "5000",
        "username": "abc",
        "password": "123456",
        "need-auth": true
    }
}
## Run with the configuration file
[root@ac4:/data/app/pyspider]$ pyspider --config config.json
[W 220107 11:42:07 run:413] phantomjs not found, continue running without it.
[I 220107 11:42:09 result_worker:49] result_worker starting...
[I 220107 11:42:10 tornado_fetcher:638] fetcher starting...
[I 220107 11:42:10 processor:211] processor starting...
[I 220107 11:42:10 scheduler:647] scheduler starting...
[I 220107 11:42:10 scheduler:782] scheduler.xmlrpc listening on 127.0.0.1:23333
[I 220107 11:42:10 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
[I 220107 11:42:10 app:76] webui running on 0.0.0.0:5000
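The startup log looks the same with and without the password, so it is worth checking two things separately: that config.json is valid JSON, and that the web UI really asks for credentials. The helpers below are a rough sketch under my own assumptions. The function names validate_config and webui_status are made up for illustration, and the check relies on python3 and curl being installed.

```shell
# Return success only if the given file parses as JSON.
# `python3 -m json.tool` exits non-zero on a syntax error.
validate_config() {
    python3 -m json.tool "$1" > /dev/null
}

# Print the HTTP status code returned by a URL.
# Usage: webui_status URL [USER:PASSWORD]
webui_status() {
    if [ -n "$2" ]; then
        # -u sends HTTP basic-auth credentials
        curl -s -o /dev/null -w '%{http_code}\n' -u "$2" "$1"
    else
        curl -s -o /dev/null -w '%{http_code}\n' "$1"
    fi
}
```

With pyspider running, `validate_config config.json && echo ok` checks the file. `webui_status http://127.0.0.1:5000/` should then report a 401 if need-auth took effect, and `webui_status http://127.0.0.1:5000/ abc:123456` should report a 200. I have not verified these status codes against every pyspider version.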