1. Install and configure Spark, then confirm that running bin/pyspark from the Spark directory starts the PySpark shell successfully.
2. Install Anaconda2:
https://www.anaconda.com/download/#linux
bash Anaconda2-5.0.1-Linux-x86_64.sh
3. Install the pyspark package: sudo pip install pyspark
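4. Open Jupyter Notebook and write a short program to test the setup.
As a rule of thumb, if the line that creates the SparkContext runs without an error, Spark can be started from the notebook.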
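The test program itself is not reproduced in this post, so the following is only a minimal sketch of what such a notebook cell could look like. The function names, the local[2] master, and the application name are assumptions chosen for illustration; they are not taken from the original setup.

```python
def sum_first_n(sc, n=100):
    """Distribute 0..n-1 as an RDD and add the numbers up through Spark."""
    return sc.parallelize(range(n)).sum()


def run_smoke_test(master="local[2]", app_name="notebook-smoke-test"):
    """Start a SparkContext, run one tiny job, and always stop the context."""
    # Imported here so that an ImportError points clearly at the pyspark install.
    from pyspark import SparkConf, SparkContext

    # master and app_name are illustrative defaults, not values from the post.
    conf = SparkConf().setMaster(master).setAppName(app_name)
    sc = SparkContext(conf=conf)  # this is the line that fails if Spark cannot start
    try:
        return sum_first_n(sc)
    finally:
        sc.stop()
```

Calling run_smoke_test() in a notebook cell should return 4950 (the sum of 0 through 99) when everything is wired up. Stopping the context in the finally clause keeps a failed run from leaving a stale context behind, since only one SparkContext can be active per process.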
For reference, here are the environment variables:
# /etc/profile: system-wide .profile file for the Bourne shell (sh(1))
# and Bourne compatible shells (bash(1), ksh(1), ash(1), ...).

export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_HOME/bin:$HIVE_HOME/bin:$PATH

export HADOOP_HOME=/home/chenjie/hadoop-2.6.5
#export HADOOP_HOME=/home/chenjie/hadoop-2.6.5-net
export CLASSPATH=.:$HADOOP_HOME/lib:$CLASSPATH
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_ROOT_LOGGER=INFO,console
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"

#scala
export SCALA_HOME=/home/chenjie/scala-2.10.4
export PATH=${SCALA_HOME}/bin:$PATH

#spark
export SPARK_HOME=/home/chenjie/spark-1.6.0-bin-hadoop2.6
#export SPARK_HOME=/home/chenjie/spark-1.6.0-bin-hadoop2.6-net
export PATH=${SPARK_HOME}/bin:${SPARK_HOME}/sbin:$PATH

#Flume
export FLUME_HOME=/home/chenjie/apache-flume-1.5.0-bin
export FLUME_CONF_DIR=$FLUME_HOME/conf
export PATH=.:$PATH::$FLUME_HOME/bin

#hive
export HIVE_HOME=/home/chenjie/apache-hive-2.3.0-bin
export PATH=$PATH:$HIVE_HOME/bin

#sqoop
export SQOOP_HOME=/home/chenjie/sqoop-1.4.6.bin__hadoop-2.0.4-alpha
export PATH=$PATH:$SQOOP_HOME/bin
export SQOOP_SERVER_EXTRA_LIB=$SQOOP_HOME/extra

#maven
export PATH=$PATH:/home/chenjie/apache-maven-3.5.0/bin

#export PYTHONPATH=/home/chenjie/spark-1.6.0-bin-hadoop2.6/python
export PYTHONPATH=$SPARK_HOME/python/:$PYTHONPATH
export PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.9.0-src.zip:$PYTHONPATH
PYSPARK_DRIVER_PYTHON=ipython
PYSPARK_DRIVER_PYTHON_OPTS='notebook'

if [ "$PS1" ]; then
  if [ "$BASH" ] && [ "$BASH" != "/bin/sh" ]; then
    # The file bash.bashrc already sets the default PS1.
    # PS1='\h:\w\$ '
    if [ -f /etc/bash.bashrc ]; then
      . /etc/bash.bashrc
    fi
  else
    if [ "`id -u`" -eq 0 ]; then
      PS1='# '
    else
      PS1='$ '
    fi
  fi
fi

# The default umask is now handled by pam_umask.
# See pam_umask(8) and /etc/login.defs.

if [ -d /etc/profile.d ]; then
  for i in /etc/profile.d/*.sh; do
    if [ -r $i ]; then
      . $i
    fi
  done
  unset i
fi
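
The two PYTHONPATH lines in the profile hard-code the py4j version (0.9.0), which changes between Spark releases. As an alternative for notebooks that were started without those variables, the sketch below derives the same two entries from SPARK_HOME at run time. Both function names are hypothetical helpers written for this example, and the directory layout is assumed to match the one the profile points at (python/ and python/lib/py4j-*-src.zip under SPARK_HOME).

```python
import glob
import os
import sys


def spark_python_paths(spark_home):
    """Return the entries PYTHONPATH needs for the Spark install at spark_home."""
    python_dir = os.path.join(spark_home, "python")
    # The py4j version differs between Spark releases, so match it by pattern.
    py4j_zips = sorted(glob.glob(os.path.join(python_dir, "lib", "py4j-*-src.zip")))
    return [python_dir] + py4j_zips


def add_spark_to_sys_path(spark_home=None):
    """Prepend Spark's Python sources to sys.path and return the entries added."""
    # Falls back to the SPARK_HOME environment variable, as set in the profile.
    spark_home = spark_home or os.environ["SPARK_HOME"]
    added = [p for p in spark_python_paths(spark_home) if p not in sys.path]
    sys.path[:0] = added
    return added
```

Running add_spark_to_sys_path() in the first notebook cell, before importing pyspark, has roughly the same effect as the exported PYTHONPATH lines, but only for the current Python process.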