首页 技术 正文
技术 2022年11月18日
0 收藏 713 点赞 2,804 浏览 2913 个字

使用scrapy的telnet功能远程管理scrapy运行用法

telnet <IP_ADDR> <PORT>

官方文档

https://doc.scrapy.org/en/latest/topics/telnetconsole.html

简单使用

crawler            the Scrapy Crawler (scrapy.crawler.Crawler object)
engine Crawler.engine attribute
spider the active spider
slot the engine slot
extensions the Extension Manager (Crawler.extensions attribute)
stats the Stats Collector (Crawler.stats attribute)
settings the Scrapy settings object (Crawler.settings attribute)
est print a report of the engine status
prefs for memory debugging (see Debugging memory leaks)
p a shortcut to the pprint.pprint function
hpy for memory debugging (see Debugging memory leaks)

参数设置

TELNETCONSOLE_PORT      Default: [6023, 6073]TELNETCONSOLE_HOST      Default: '127.0.0.1'

telnet源码

"""
Scrapy Telnet Console extensionSee documentation in docs/topics/telnetconsole.rst
"""import pprint
import loggingfrom twisted.internet import protocol
try:
from twisted.conch import manhole, telnet
from twisted.conch.insults import insults
TWISTED_CONCH_AVAILABLE = True
except ImportError:
TWISTED_CONCH_AVAILABLE = Falsefrom scrapy.exceptions import NotConfigured
from scrapy import signals
from scrapy.utils.trackref import print_live_refs
from scrapy.utils.engine import print_engine_status
from scrapy.utils.reactor import listen_tcptry:
import guppy
hpy = guppy.hpy()
except ImportError:
hpy = Nonelogger = logging.getLogger(__name__)# signal to update telnet variables
# args: telnet_vars
update_telnet_vars = object()class TelnetConsole(protocol.ServerFactory): def __init__(self, crawler):
if not crawler.settings.getbool('TELNETCONSOLE_ENABLED'):
raise NotConfigured
if not TWISTED_CONCH_AVAILABLE:
raise NotConfigured
self.crawler = crawler
self.noisy = False
self.portrange = [int(x) for x in crawler.settings.getlist('TELNETCONSOLE_PORT')]
self.host = crawler.settings['TELNETCONSOLE_HOST']
self.crawler.signals.connect(self.start_listening, signals.engine_started)
self.crawler.signals.connect(self.stop_listening, signals.engine_stopped) @classmethod
def from_crawler(cls, crawler):
return cls(crawler) def start_listening(self):
self.port = listen_tcp(self.portrange, self.host, self)
h = self.port.getHost()
logger.debug("Telnet console listening on %(host)s:%(port)d",
{'host': h.host, 'port': h.port},
extra={'crawler': self.crawler}) def stop_listening(self):
self.port.stopListening() def protocol(self):
telnet_vars = self._get_telnet_vars()
return telnet.TelnetTransport(telnet.TelnetBootstrapProtocol,
insults.ServerProtocol, manhole.Manhole, telnet_vars) def _get_telnet_vars(self):
# Note: if you add entries here also update topics/telnetconsole.rst
telnet_vars = {
'engine': self.crawler.engine,
'spider': self.crawler.engine.spider,
'slot': self.crawler.engine.slot,
'crawler': self.crawler,
'extensions': self.crawler.extensions,
'stats': self.crawler.stats,
'settings': self.crawler.settings,
'est': lambda: print_engine_status(self.crawler.engine),
'p': pprint.pprint,
'prefs': print_live_refs,
'hpy': hpy,
'help': "This is Scrapy telnet console. For more info see: " \
"https://doc.scrapy.org/en/latest/topics/telnetconsole.html",
}
self.crawler.signals.send_catch_log(update_telnet_vars, telnet_vars=telnet_vars)
return telnet_vars
相关推荐
python开发_常用的python模块及安装方法
adodb:我们领导推荐的数据库连接组件bsddb3:BerkeleyDB的连接组件Cheetah-1.0:我比较喜欢这个版本的cheeta…
日期:2022-11-24 点赞:878 阅读:8,949
Educational Codeforces Round 11 C. Hard Process 二分
C. Hard Process题目连接:http://www.codeforces.com/contest/660/problem/CDes…
日期:2022-11-24 点赞:807 阅读:5,475
下载Ubuntn 17.04 内核源代码
zengkefu@server1:/usr/src$ uname -aLinux server1 4.10.0-19-generic #21…
日期:2022-11-24 点赞:569 阅读:6,288
可用Active Desktop Calendar V7.86 注册码序列号
可用Active Desktop Calendar V7.86 注册码序列号Name: www.greendown.cn Code: &nb…
日期:2022-11-24 点赞:733 阅读:6,103
Android调用系统相机、自定义相机、处理大图片
Android调用系统相机和自定义相机实例本博文主要是介绍了android上使用相机进行拍照并显示的两种方式,并且由于涉及到要把拍到的照片显…
日期:2022-11-24 点赞:512 阅读:7,735
Struts的使用
一、Struts2的获取  Struts的官方网站为:http://struts.apache.org/  下载完Struts2的jar包,…
日期:2022-11-24 点赞:671 阅读:4,770