
No module named 'pyspark'


```
root@gu-virtual-machine:/usr/local/spark/mycode/remdup# python3 remdup.py
Traceback (most recent call last):
  File "/usr/local/spark/mycode/remdup/remdup.py", line 1, in <module>
    from pyspark import SparkContext
ModuleNotFoundError: No module named 'pyspark'
```
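The error just means Python cannot find the `pyspark` package anywhere on its module search path. Before touching any config, it helps to see what the interpreter actually searches; here is a minimal diagnostic sketch (not from the original post):

```python
import importlib.util
import sys

# The directories this interpreter searches for imports, in order.
for p in sys.path:
    print(p)

# find_spec returns None when the module is not importable,
# which is exactly the condition that raises ModuleNotFoundError.
spec = importlib.util.find_spec("pyspark")
print(spec.origin if spec else "pyspark not found on sys.path")
```

If `/usr/local/spark/python` is missing from that list, the fix below (extending `PYTHONPATH`) is what adds it.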

1. Find where the .bashrc file lives

```
/home/hadoop
```

2. Edit the environment variables

```
export PYSPARK_HOME=/usr/local/spark
export PYTHONPATH=$PYSPARK_HOME/python:$PYTHONPATH
export PYTHONPATH=$PYSPARK_HOME/python/lib/py4j-0.10.9.5-src.zip:$PYTHONPATH
```

The `py4j-0.10.9.5-src.zip` part is version-specific: look inside /usr/local/spark/python/lib/ to see which version your own Spark installation ships.
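You can also let Python find the exact filename for you by globbing that directory; a small sketch, assuming the same /usr/local/spark layout as above:

```python
import glob

# The py4j zip bundled with Spark carries its version in the filename,
# e.g. py4j-0.10.9.5-src.zip; adjust the prefix if Spark lives elsewhere.
matches = glob.glob("/usr/local/spark/python/lib/py4j-*-src.zip")
print(matches or "no py4j source zip found under /usr/local/spark")
```

Whatever filename this prints is the one to paste into the third `export` line.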

Then reload the file:

`source .bashrc`
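After sourcing .bashrc, any newly started `python3` inherits the updated `PYTHONPATH`, and its entries are prepended to `sys.path` at interpreter startup. A quick way to confirm the exports took effect (a sketch, not from the original post):

```python
import os
import sys

# PYTHONPATH exactly as the shell exported it
# (prints the fallback if .bashrc has not been sourced in this shell).
print(os.environ.get("PYTHONPATH", "(PYTHONPATH not set)"))

# sys.path entries that came from the Spark exports, if any.
print([p for p in sys.path if "spark" in p.lower()])
```

If the second line prints an empty list, the current shell never picked up the new .bashrc; open a fresh terminal or re-run `source .bashrc`.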

ChatGPT prompt used:

How do I add the PySpark installation directory to the PYTHONPATH environment variable?


https://spark.apache.org/docs/latest/api/python/getting_started/install.html
