java - 如何根据 Spark 中的日期时间值过滤数据集
问题描述
我正在尝试根据日期时间字段过滤我的数据。我的数据样本:
303,0.00001747,4351040,75.9054,"2019-03-08 19:29:18"
这就是我初始化火花的方式:
SparkConf conf = new SparkConf().setAppName("app name").setMaster("spark://192.168.1.124:7077");
JavaSparkContext sc = JavaSparkContext.fromSparkContext(SparkContext.getOrCreate(conf));
首先,我将上面的数据读入我的自定义对象,如下所示:
// Read data from file into custom object
JavaRDD<CurrencyPair> rdd = sc.textFile(System.getProperty("user.dir") + "/data/data.csv", 2).map(
new Function<String, CurrencyPair>() {
public CurrencyPair call(String line) throws Exception {
String[] fields = line.split(","); // Split line from commas
// read each data into custom object
CurrencyPair cp = new CurrencyPair();
cp.setId(Integer.parseInt(fields[0].trim()));
cp.setValue(Double.parseDouble(fields[1].trim()));
cp.setBaseVolume(Double.parseDouble(fields[2].trim()));
cp.setQuoteVolume(Double.parseDouble(fields[3].trim()));
cp.setTimeStamp(new Date(fields[4].trim()));
System.out.println("Date:" + fields[4].trim()); // To see if it will print or not
return cp;
}
}
);
为了获取时间戳大于某个时间的数据,我编写了这个过滤器:
Calendar calendar = Calendar.getInstance();
calendar.add(Calendar.DAY_OF_MONTH, -10); // This is for test issue
// My filter to get the data for a certain time range
Function<CurrencyPair, Boolean> filter = new Function<CurrencyPair, Boolean>() {
@Override
public Boolean call(CurrencyPair currencyPair) throws Exception {
if(calendar.getTime().compareTo(currencyPair.getTimeStamp()) > 0){
return false;
}else{
return true;
}
}
};
这是我的自定义对象的样子:
public class CurrencyPair implements java.io.Serializable {
private int id;
private double value;
private double baseVolume;
private double quoteVolume;
private Date timeStamp;
// all getters and setters are here, but no constructor
}
为了检查我的过滤器的结果,我尝试查看其中的一些(这里的前 100 个):
Iterator<CurrencyPair> result = rdd.repartition(100).filter(filter).toLocalIterator();
int counter = 0;
while (counter < 100 && result.hasNext()){
System.out.println("Here: " + result.next());
counter++;
}
但问题是,当我运行我的代码时,在我写前 100 个结果的那一行出现了以下异常(这里:System.out.println("Here:" + result.next());)
错误:
19/05/12 00:05:47 INFO TaskSetManager: Lost task 0.3 in stage 0.0 (TID 6) on 192.168.1.124, executor 0: java.lang.ClassCastException (cannot assign instance of java.lang.invoke.SerializedLambda to field org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in instance of org.apache.spark.rdd.MapPartitionsRDD
在我的过滤器中,我还使用 System.out.println 将日期时间字符串写入控制台,但我也无法在控制台中看到它的结果。我究竟做错了什么?我怎样才能做到这一点?
编辑:我注意到我实际上下载了 spark 2.3.0 版本,但在我的 maven 文件中我使用的是 2.4.2。所以我将我的 maven 文件更改为 2.3.0 版本。
这次我收到以下错误:
19/05/14 00:35:35 INFO BlockManager: BlockManager stopped
19/05/14 00:35:35 INFO BlockManagerMaster: BlockManagerMaster stopped
19/05/14 00:35:35 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
19/05/14 00:35:36 ERROR SparkContext: Error initializing SparkContext.
java.lang.IllegalArgumentException: requirement failed: Can only call getServletHandlers on a running MetricsSystem
at scala.Predef$.require(Predef.scala:281)
at org.apache.spark.metrics.MetricsSystem.getServletHandlers(MetricsSystem.scala:91)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:516)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2520)
at org.apache.spark.SparkContext.getOrCreate(SparkContext.scala)
at spark.BasicSpark.readDataFile(BasicSpark.java:107)
at spark.BasicSpark.getWholeData(BasicSpark.java:39)
at controller.TableScreenController$2.handle(TableScreenController.java:66)
at controller.TableScreenController$2.handle(TableScreenController.java:62)
at com.sun.javafx.event.CompositeEventHandler.dispatchBubblingEvent(CompositeEventHandler.java:86)
at com.sun.javafx.event.EventHandlerManager.dispatchBubblingEvent(EventHandlerManager.java:238)
at com.sun.javafx.event.EventHandlerManager.dispatchBubblingEvent(EventHandlerManager.java:191)
at com.sun.javafx.event.CompositeEventDispatcher.dispatchBubblingEvent(CompositeEventDispatcher.java:59)
at com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:58)
at com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
at com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:56)
at com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
at com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:56)
at com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
at com.sun.javafx.event.EventUtil.fireEventImpl(EventUtil.java:74)
at com.sun.javafx.event.EventUtil.fireEvent(EventUtil.java:49)
at javafx.event.Event.fireEvent(Event.java:198)
at javafx.scene.Node.fireEvent(Node.java:8411)
at javafx.scene.control.Button.fire(Button.java:185)
at com.sun.javafx.scene.control.behavior.ButtonBehavior.mouseReleased(ButtonBehavior.java:182)
at com.sun.javafx.scene.control.skin.BehaviorSkinBase$1.handle(BehaviorSkinBase.java:96)
at com.sun.javafx.scene.control.skin.BehaviorSkinBase$1.handle(BehaviorSkinBase.java:89)
at com.sun.javafx.event.CompositeEventHandler$NormalEventHandlerRecord.handleBubblingEvent(CompositeEventHandler.java:218)
at com.sun.javafx.event.CompositeEventHandler.dispatchBubblingEvent(CompositeEventHandler.java:80)
at com.sun.javafx.event.EventHandlerManager.dispatchBubblingEvent(EventHandlerManager.java:238)
at com.sun.javafx.event.EventHandlerManager.dispatchBubblingEvent(EventHandlerManager.java:191)
at com.sun.javafx.event.CompositeEventDispatcher.dispatchBubblingEvent(CompositeEventDispatcher.java:59)
at com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:58)
at com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
at com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:56)
at com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
at com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:56)
at com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
at com.sun.javafx.event.EventUtil.fireEventImpl(EventUtil.java:74)
at com.sun.javafx.event.EventUtil.fireEvent(EventUtil.java:54)
at javafx.event.Event.fireEvent(Event.java:198)
at javafx.scene.Scene$MouseHandler.process(Scene.java:3757)
at javafx.scene.Scene$MouseHandler.access$1500(Scene.java:3485)
at javafx.scene.Scene.impl_processMouseEvent(Scene.java:1762)
at javafx.scene.Scene$ScenePeerListener.mouseEvent(Scene.java:2494)
at com.sun.javafx.tk.quantum.GlassViewEventHandler$MouseEventNotification.run(GlassViewEventHandler.java:394)
at com.sun.javafx.tk.quantum.GlassViewEventHandler$MouseEventNotification.run(GlassViewEventHandler.java:295)
at java.security.AccessController.doPrivileged(Native Method)
at com.sun.javafx.tk.quantum.GlassViewEventHandler.lambda$handleMouseEvent$350(GlassViewEventHandler.java:432)
at com.sun.javafx.tk.quantum.QuantumToolkit.runWithoutRenderLock(QuantumToolkit.java:389)
at com.sun.javafx.tk.quantum.GlassViewEventHandler.handleMouseEvent(GlassViewEventHandler.java:431)
at com.sun.glass.ui.View.handleMouseEvent(View.java:555)
at com.sun.glass.ui.View.notifyMouse(View.java:937)
at com.sun.glass.ui.gtk.GtkApplication._runLoop(Native Method)
at com.sun.glass.ui.gtk.GtkApplication.lambda$null$208(GtkApplication.java:245)
at java.lang.Thread.run(Thread.java:748)
19/05/14 00:35:36 INFO SparkContext: SparkContext already stopped.
19/05/14 00:35:36 INFO SparkContext: Successfully stopped SparkContext
java.lang.IllegalArgumentException: requirement failed: Can only call getServletHandlers on a running MetricsSystem
at scala.Predef$.require(Predef.scala:281)
at org.apache.spark.metrics.MetricsSystem.getServletHandlers(MetricsSystem.scala:91)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:516)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2520)
at org.apache.spark.SparkContext.getOrCreate(SparkContext.scala)
at spark.BasicSpark.readDataFile(BasicSpark.java:107)
at spark.BasicSpark.getWholeData(BasicSpark.java:39)
at controller.TableScreenController$2.handle(TableScreenController.java:66)
at controller.TableScreenController$2.handle(TableScreenController.java:62)
at com.sun.javafx.event.CompositeEventHandler.dispatchBubblingEvent(CompositeEventHandler.java:86)
at com.sun.javafx.event.EventHandlerManager.dispatchBubblingEvent(EventHandlerManager.java:238)
at com.sun.javafx.event.EventHandlerManager.dispatchBubblingEvent(EventHandlerManager.java:191)
at com.sun.javafx.event.CompositeEventDispatcher.dispatchBubblingEvent(CompositeEventDispatcher.java:59)
at com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:58)
at com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
at com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:56)
at com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
at com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:56)
at com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
at com.sun.javafx.event.EventUtil.fireEventImpl(EventUtil.java:74)
at com.sun.javafx.event.EventUtil.fireEvent(EventUtil.java:49)
at javafx.event.Event.fireEvent(Event.java:198)
at javafx.scene.Node.fireEvent(Node.java:8411)
at javafx.scene.control.Button.fire(Button.java:185)
at com.sun.javafx.scene.control.behavior.ButtonBehavior.mouseReleased(ButtonBehavior.java:182)
at com.sun.javafx.scene.control.skin.BehaviorSkinBase$1.handle(BehaviorSkinBase.java:96)
at com.sun.javafx.scene.control.skin.BehaviorSkinBase$1.handle(BehaviorSkinBase.java:89)
at com.sun.javafx.event.CompositeEventHandler$NormalEventHandlerRecord.handleBubblingEvent(CompositeEventHandler.java:218)
at com.sun.javafx.event.CompositeEventHandler.dispatchBubblingEvent(CompositeEventHandler.java:80)
at com.sun.javafx.event.EventHandlerManager.dispatchBubblingEvent(EventHandlerManager.java:238)
at com.sun.javafx.event.EventHandlerManager.dispatchBubblingEvent(EventHandlerManager.java:191)
at com.sun.javafx.event.CompositeEventDispatcher.dispatchBubblingEvent(CompositeEventDispatcher.java:59)
at com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:58)
at com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
at com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:56)
at com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
at com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:56)
at com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
at com.sun.javafx.event.EventUtil.fireEventImpl(EventUtil.java:74)
at com.sun.javafx.event.EventUtil.fireEvent(EventUtil.java:54)
at javafx.event.Event.fireEvent(Event.java:198)
at javafx.scene.Scene$MouseHandler.process(Scene.java:3757)
at javafx.scene.Scene$MouseHandler.access$1500(Scene.java:3485)
at javafx.scene.Scene.impl_processMouseEvent(Scene.java:1762)
at javafx.scene.Scene$ScenePeerListener.mouseEvent(Scene.java:2494)
at com.sun.javafx.tk.quantum.GlassViewEventHandler$MouseEventNotification.run(GlassViewEventHandler.java:394)
at com.sun.javafx.tk.quantum.GlassViewEventHandler$MouseEventNotification.run(GlassViewEventHandler.java:295)
at java.security.AccessController.doPrivileged(Native Method)
at com.sun.javafx.tk.quantum.GlassViewEventHandler.lambda$handleMouseEvent$350(GlassViewEventHandler.java:432)
at com.sun.javafx.tk.quantum.QuantumToolkit.runWithoutRenderLock(QuantumToolkit.java:389)
at com.sun.javafx.tk.quantum.GlassViewEventHandler.handleMouseEvent(GlassViewEventHandler.java:431)
at com.sun.glass.ui.View.handleMouseEvent(View.java:555)
at com.sun.glass.ui.View.notifyMouse(View.java:937)
at com.sun.glass.ui.gtk.GtkApplication._runLoop(Native Method)
at com.sun.glass.ui.gtk.GtkApplication.lambda$null$208(GtkApplication.java:245)
at java.lang.Thread.run(Thread.java:748)
我在下面初始化火花上下文的代码行中收到此错误:
JavaSparkContext sc = JavaSparkContext.fromSparkContext(SparkContext.getOrCreate(conf));
编辑2:当我写本地而不是我自己的火花主IP时效果很好。但我需要在我自己的 IP 上运行它。那么我的主节点有什么问题呢?
编辑 3:我将整个错误堆栈上传到位于第一次编辑下的代码片段。
解决方案
当定义 lambda 表达式的类在运行时不可用时,就会发生这种情况。如果您尝试从本地 IDE 在远程集群上运行作业,则需要添加 setJars
SparkConf conf = new SparkConf()
.setAppName("app name")
.setJars(new String[]{"/fatJarPath/jar.path"})
.setMaster("spark://Remote_spark_ip:port");
或者,您可以构建 fat jar 并使用 spark-submit 提交您的工作。
推荐阅读
- django - Django 数据库问题
- compiler-construction - 第一种语言是如何编译的
- c# - c# combobox DisplayMember 显示数据表中的空行
- android - 一次打开两个应用程序只需单击一下
- node.js - 使用节点将数据写入文本文件
- alexa - Alexa:如何知道大响应在哪里被打断?
- highcharts - 如何沿着森伯斯特图表中的各个部分弯曲每个数据标签?
- javascript - 尝试对 GraphQL 数据进行 .map() 时,React Native 'undefined is not an object'
- laravel - 另一个“不能保存多个复选框”和“必须是数组类型,给定字符串”
- java - 使用java流,将两个具有相同键但不同值的映射合并到一个元组?