Access Spark locally or externally via R

Problem description

I have installed Spark on Windows and can run spark-shell and execute some Scala code in that shell (see also here). How can I now access this Spark environment from outside, for example via sparklyr or from Python?

I ran:

spark-class org.apache.spark.deploy.master.Master

and I can now access:

http://localhost:8080/
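Note that port 8080 only serves the standalone master's monitoring web UI over HTTP; clients such as spark-shell and sparklyr connect to the master's RPC address, which by default has the form spark://<host>:7077 and is displayed at the top of that page. As a rough sketch, it can also be read programmatically (the /json endpoint and its url field are assumptions about the standalone master's status page; verify them against your Spark version):

library(jsonlite)

# Fetch the master's status page as JSON; the "url" field is assumed to hold
# the spark:// RPC address that spark-shell or spark_connect() should be given.
status <- fromJSON("http://localhost:8080/json")
status$url    # e.g. "spark://10.0.20.67:7077"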

However, if I run:

library(sparklyr)
sc <- spark_connect(master = "http://localhost:8080/")

I get:

To run Spark on Windows you need a copy of Hadoop winutils.exe:

1. Download Hadoop winutils.exe from:

   https://github.com/steveloughran/winutils/raw/master/hadoop-2.6.0/bin/

2. Copy winutils.exe to C:\spark-2.4.2-bin-hadoop2.7\tmp\hadoop\bin

Alternatively, if you are using RStudio you can install the RStudio Preview Release,
which includes an embedded copy of Hadoop winutils.exe:

  https://www.rstudio.com/products/rstudio/download/preview/


Traceback:

1. spark_connect(master = "http://localhost:8080/")
2. shell_connection(master = master, spark_home = spark_home, app_name = app_name, 
 .     version = version, hadoop_version = hadoop_version, shell_args = shell_args, 
 .     config = config, service = spark_config_value(config, "sparklyr.gateway.service", 
 .         FALSE), remote = spark_config_value(config, "sparklyr.gateway.remote", 
 .         spark_master_is_yarn_cluster(master)), extensions = extensions)
3. prepare_windows_environment(spark_home, environment)
4. stop_with_winutils_error(hadoopBinPath)
5. stop("\n\n", "To run Spark on Windows you need a copy of Hadoop winutils.exe:", 
 .     "\n\n", "1. Download Hadoop winutils.exe from:", "\n\n", 
 .     paste("  ", winutilsDownload), "\n\n", paste("2. Copy winutils.exe to", 
 .         hadoopBinPath), "\n\n", "Alternatively, if you are using RStudio you can install the RStudio Preview Release,\n", 
 .     "which includes an embedded copy of Hadoop winutils.exe:\n\n", 
 .     "  https://www.rstudio.com/products/rstudio/download/preview/", 
 .     "\n\n", call. = FALSE)

I installed winutils.exe as part of my Spark installation. How can I run a basic hello-world example (for example in R) against http://localhost:8080/?
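For the hello-world part, here is a minimal sparklyr sketch, assuming the master started above is reachable at the default RPC port 7077, that spark_home points at the local C:\spark-2.4.2-bin-hadoop2.7 install, and that the winutils.exe requirement from the error above is satisfied; the http://localhost:8080/ address itself is only the web UI and is not a valid value for master:

library(sparklyr)
library(dplyr)

# Connect to the standalone master via its spark:// RPC address, not the
# http:// web-UI address; spark_home is assumed to be the local install dir.
sc <- spark_connect(master = "spark://localhost:7077",
                    spark_home = "C:/spark-2.4.2-bin-hadoop2.7")

# Hello world: copy a small local data frame to Spark and count its rows.
mtcars_tbl <- copy_to(sc, mtcars, overwrite = TRUE)
mtcars_tbl %>% summarise(n = n())

spark_disconnect(sc)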

PS:

I also tried running:

spark-class org.apache.spark.deploy.master.Master
spark-class org.apache.spark.deploy.worker.Worker spark://10.0.20.67:7077
spark-shell --master spark://10.0.20.67:7077

one after the other.
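With the master and worker started explicitly like this, the analogous sparklyr connection would use the same spark:// URL that spark-shell was given; this is only a sketch, with the IP address and spark_home taken from this particular setup rather than being anything general:

library(sparklyr)

# Connect to the explicitly started standalone cluster; the IP and port come
# from the spark-class commands above, and spark_home is assumed to be the
# same local installation directory.
sc <- spark_connect(master = "spark://10.0.20.67:7077",
                    spark_home = "C:/spark-2.4.2-bin-hadoop2.7")

spark_web(sc)    # opens the cluster web UI (the page served on port 8080)

sdf_len(sc, 10)  # a trivial 10-row Spark DataFrame, as a smoke test

spark_disconnect(sc)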

Tags: apache-spark, sparklyr

Solution

