Are channels and Scrapy incompatible in Django?



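This is the strangest thing that has happened to me. My Django project uses both Scrapy and channels. After I installed channels 3.0.5, my chat room works, but Scrapy does not. Scrapy hangs at this line:

2023-04-04 10:10:42 [scrapy.core.engine] DEBUG: Crawled (200) <POST http://pfsc.agri.cn/api/priceQuotationController/pageList?key=&order=> (referer: None)

While debugging I found that execution never enters parse. I captured the traffic with Wireshark, and the request does get a response. When I upgrade channels to 4.0.0, the chat room can no longer connect to the WebSocket server, but Scrapy runs normally. Thank you for your time.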

Scrapy log:

['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2023-04-04 10:10:42 [scrapy.middleware] INFO: Enabled item pipelines:
['spider.pipelines.SpiderPipeline_PFSC']
2023-04-04 10:10:42 [scrapy.core.engine] INFO: Spider opened
2023-04-04 10:10:42 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2023-04-04 10:10:42 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2023-04-04 10:10:42 [scrapy.core.engine] DEBUG: Crawled (200) <POST http://pfsc.agri.cn/api/priceQuotationController/pageList?key=&order=> (referer: None)
2023-04-04 10:11:42 [scrapy.extensions.logstats] INFO: Crawled 1 pages (at 1 pages/min), scraped 0 items (at 0 items/min)

Scrapy spider:

def __init__(self, **kwargs):
    super().__init__(**kwargs)
    self.total_prices = 1
    self.url = 'http://pfsc.agri.cn/api/priceQuotationController/pageList?key=&order='
    self.date = datetime.today().strftime('%Y-%m-%d')
    print('today is ' + self.date)
    date_list = PFSC_Price.objects.filter(reportTime=self.date)
    self.date_list = list(date_list.values())
    for data in self.date_list:
        del data['id']
    self.max_length = 900
    self.i = 0

def start_requests(self):
    yield Request(
        url=self.url,
        method='POST',
        body='{"pageNum":1,"pageSize":' + f'{self.total_prices}' + ',"marketId":"","provinceCode":"","pid":"","varietyId":""}',
        callback=self.parse,
    )

def parse(self, response, **kwargs):
    print('in parse')
    json_loads = json.loads(response.text)
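
As a side note on the spider, the POST body in start_requests is assembled by string concatenation, which is easy to get wrong when more fields become variable. A minimal sketch of building the same payload with json.dumps follows; build_body is a hypothetical helper name, not something from the original project, and the field names are copied from the request body shown in the spider.

```python
import json


def build_body(page_size, page_num=1):
    """Return the JSON request body as a compact string.

    build_body is a hypothetical helper; the keys mirror the body
    string used in start_requests.
    """
    payload = {
        "pageNum": page_num,
        "pageSize": page_size,
        "marketId": "",
        "provinceCode": "",
        "pid": "",
        "varietyId": "",
    }
    # Compact separators reproduce the hand-written string byte for byte.
    return json.dumps(payload, separators=(",", ":"))


print(build_body(1))
```

In the spider this would be used as body=build_body(self.total_prices). It does not change what is sent, so it is not expected to affect the hang described in the question.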

amqp==5.1.1
asgiref==3.6.0
async-timeout==4.0.2
attrs==22.2.0
autobahn==23.1.2
Automat==22.10.0
beautifulsoup4==4.11.1
billiard==3.6.4.0
celery==5.2.7
certifi==2022.9.24
cffi==1.15.1
channels==4.0.0
channels-redis==4.0.0
charset-normalizer==2.1.1
click==8.1.3
click-didyoumean==0.3.0
click-plugins==1.1.1
click-repl==0.2.0
constantly==15.1.0
cryptography==39.0.0
cssselect==1.2.0
daphne==3.0.2
Django==4.1.5
django-extensions==3.2.1
et-xmlfile==1.1.0
filelock==3.9.0
hyperlink==21.0.0
idna==3.4
incremental==22.10.0
itemadapter==0.7.0
itemloaders==1.0.6
jmespath==1.0.1
kombu==5.2.4
lxml==4.9.2
msgpack==1.0.5
mysql-connector-python==8.0.31
mysqlclient==2.1.1
numpy==1.24.1
openpyxl==3.0.10
packaging==22.0
pandas==1.5.2
parsel==1.7.0
Pillow==9.4.0
prompt-toolkit==3.0.38
Protego==0.2.1
protobuf==3.20.1
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser==2.21
PyDispatcher==2.0.6
PyMySQL==1.0.2
pyOpenSSL==23.0.0
python-dateutil==2.8.2
pytz==2022.6
queuelib==1.6.2
redis==4.5.1
requests==2.28.1
requests-file==1.5.1
Scrapy==2.7.1
scrapy-djangoitem==1.1.1
sentry-sdk==1.12.1
service-identity==21.1.0
six==1.16.0
soupsieve==2.3.2.post1
sqlparse==0.4.3
tldextract==3.4.0
Twisted==22.10.0
txaio==23.1.1
typing_extensions==4.4.0
tzdata==2022.7
urllib3==1.26.13
vine==5.0.0
w3lib==2.1.1
wcwidth==0.2.6
zope.interface==5.5.2
With channels 4.0.0:

Frontend:

chatRoom.js:46 WebSocket connection to 'ws://127.0.0.1:8000/ws/chat/1111/' failed:

Backend:

Not Found: /ws/chat/1111/
[04/Apr/2023 10:44:49] "GET /ws/chat/1111/ HTTP/1.1" 404 5746
[04/Apr/2023 10:44:49,600] - Broken pipe from ('127.0.0.1', 44486)

asgi.py

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'everything.settings')
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'core.settings')
application = ProtocolTypeRouter({
    'http': get_asgi_application(),
    'websocket': SessionMiddlewareStack(
        URLRouter(
            website.routing.websocket_urlpatterns
        )
    ),
})

routing.py

websocket_urlpatterns = [
    re_path(r'ws/chat/(?P<room_name>\w+)/$', consumers.ChatConsumer.as_asgi()),
]
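
A routing regex like this can be checked outside Django with the standard re module. The sketch below assumes the intended character class is \w+ (as in the Channels tutorial) and compares it with a literal w+, which only matches the letter w. The path value is taken from the backend log with the leading slash removed, and room_name is a hypothetical helper.

```python
import re

# Pattern with the \w character class (assumed to be the intended one).
intended = re.compile(r'ws/chat/(?P<room_name>\w+)/$')
# Same pattern without the backslash: matches only runs of the letter "w".
literal_w = re.compile(r'ws/chat/(?P<room_name>w+)/$')

# Request path from the backend log, without the leading slash.
path = 'ws/chat/1111/'


def room_name(pattern, path):
    """Return the captured room name, or None if the path does not match."""
    match = pattern.search(path)
    return match.group('room_name') if match else None


print(room_name(intended, path))
print(room_name(literal_w, path))
```

If the first call returns the room name and the route still gives a 404, the pattern itself is probably not the cause and the server setup is the next place to look.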

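I found an answer explaining why the WebSocket route is not found with channels 4.0.0, and added 'daphne' as the first entry of INSTALLED_APPS in the Django settings. Fortunately the chat room now works; unfortunately Scrapy no longer runs.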
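
For reference, a minimal sketch of the relevant settings.py fragment under channels 4.0, where the ASGI runserver command is provided by the daphne app rather than by channels itself. The app list is abbreviated; 'website' is inferred from website.routing in asgi.py, and the ASGI_APPLICATION path is an assumption because asgi.py mentions both everything.settings and core.settings.

```python
# settings.py fragment (abbreviated sketch, not the full project settings)
INSTALLED_APPS = [
    "daphne",  # listed first so its runserver command takes over
    "django.contrib.staticfiles",
    "channels",
    "website",  # app name inferred from website.routing in asgi.py
]

# Assumption: the project package is "core"; adjust to the real package name.
ASGI_APPLICATION = "core.asgi.application"
```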

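I found a workaround, although I still don't understand why there is a conflict. Since I use PyCharm, I gave Scrapy its own separate virtual environment. It doesn't solve the underlying problem, but everything runs, so I count it as a small win.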

I use channels 4.0 and Scrapy and have the same problem. I still don't know how to fix Scrapy failing to run when daphne is used as the ASGI server with channels 4.0.

My solution: remove 'daphne' from INSTALLED_APPS and, in my development environment, run Django under uvicorn instead of using the runserver command. After that everything works.
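
A minimal sketch of launching the project under uvicorn from Python, for anyone who wants to script this workaround. The application path core.asgi:application is an assumption (asgi.py mentions both everything.settings and core.settings), and uvicorn_command and serve are hypothetical helper names. uvicorn also needs a WebSocket library such as websockets, which the uvicorn[standard] extra installs.

```python
import subprocess
import sys


def uvicorn_command(app_path, host="127.0.0.1", port=8000, reload=True):
    """Build the argument list for running uvicorn as a module."""
    cmd = [sys.executable, "-m", "uvicorn", app_path,
           "--host", host, "--port", str(port)]
    if reload:
        # Auto-restart on code changes, similar to runserver in development.
        cmd.append("--reload")
    return cmd


def serve(app_path):
    """Start uvicorn and block until it exits (requires uvicorn installed)."""
    return subprocess.run(uvicorn_command(app_path), check=False)


# Example (not executed here): serve("core.asgi:application")
print(uvicorn_command("core.asgi:application")[1:])
```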
