I am trying to run a PySpark job in my local environment.
After setting up pipenv and successfully installing the module (numpy), the code still cannot see that module.
Installing the library with pip instead of pipenv works. What am I missing here?
The terminal output is shown below.
PS C:\Users\user\Desktop\spark\test> pipenv shell
Shell for C:\Users\user\.virtualenvs\test-sCQB0P3C already activated.
No action taken to avoid nested environments.
PS C:\Users\user\Desktop\spark\test> pipenv graph
numpy==1.20.3
pipenv==2020.11.15
- certifi [required: Any, installed: 2020.12.5]
- pip [required: >=18.0, installed: 21.1.1]
- setuptools [required: >=36.2.1, installed: 56.0.0]
- virtualenv [required: Any, installed: 20.4.6]
- appdirs [required: >=1.4.3,<2, installed: 1.4.4]
- distlib [required: >=0.3.1,<1, installed: 0.3.1]
- filelock [required: >=3.0.0,<4, installed: 3.0.12]
- six [required: >=1.9.0,<2, installed: 1.16.0]
- virtualenv-clone [required: >=0.2.5, installed: 0.5.4]
pyspark==2.4.0
- py4j [required: ==0.10.7, installed: 0.10.7]
PS C:\Users\user\Desktop\spark\test> spark-submit --master local[*] --files configs\etl_config.json jobs\etl_job.py
Traceback (most recent call last):
File "C:/Users/user/Desktop/spark/test/jobs/etl_job.py", line 40, in <module>
from dependencies.class import XLoader
File "C:\Users\user\Desktop\spark\test\dependencies\X.py", line 2, in <module>
import numpy as np
ModuleNotFoundError: No module named 'numpy'
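A likely cause of this symptom is that spark-submit launches a different Python interpreter than the one pipenv installed numpy into. As a diagnostic (this snippet is not from the original post, just a sketch), a few lines at the top of the job script can show which interpreter is actually running and whether it can see numpy:

```python
import importlib.util
import sys


def can_import(module_name):
    """Return True if this interpreter can import module_name."""
    return importlib.util.find_spec(module_name) is not None


# Which python.exe is actually running this job? If this path is not
# inside the pipenv virtualenv (e.g. .virtualenvs\test-sCQB0P3C),
# spark-submit is bypassing the environment entirely.
print(sys.executable)
print("numpy visible:", can_import("numpy"))
```

If the printed interpreter path points at a system-wide Python, that explains why a pip install into that system Python works while the pipenv install does not.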
-

Make sure you are in the same directory as your Pipfile, then run -

pipenv shell

followed by -

pipenv install
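Even with the environment installed and activated, spark-submit itself must use the virtualenv's interpreter rather than the system Python. A sketch of two common approaches, assuming `pipenv run`, `pipenv --py`, and the PYSPARK_PYTHON environment variable behave as documented (paths match the session above):

```shell
# Option 1: launch spark-submit through pipenv so the virtualenv is active
pipenv run spark-submit --master local[*] --files configs\etl_config.json jobs\etl_job.py

# Option 2 (PowerShell): point PySpark at the virtualenv interpreter explicitly;
# pipenv --py prints the path of the virtualenv's python executable
$env:PYSPARK_PYTHON = (pipenv --py)
spark-submit --master local[*] --files configs\etl_config.json jobs\etl_job.py
```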