Reduce by key using a different map

Problem description

So I have a map

Key: timestamp, Value: (IP, seconds)
(1421927423,(59.166.0.9,0.011))
(1421927423,(59.166.0.3,0.011))
(1421927423,(59.45.0.2,27.203556))
(1421927423,(59.166.0.8,0.018))
(1421927423,(59.166.0.8,1.256667))
(1421927423,(175.45.176.2,27.203556))
(1421927424,(59.166.0.8,0.018))
(1421927426,(59.166.0.8,0.018))

and then another map that finds the max of x._2:

(1421927423,(175.45.176.2,27.203556))
(1421927426,(59.166.0.8,0.018))

I then want to reduce the first map based on the second one: if a record's key and seconds match the key and max seconds in the second map, it should be added to a new map.

Tags: scala, dictionary

Solution


So I took a different route...

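// Load the CSV (sc is the SparkContext, e.g. the one provided by spark-shell)
val file = sc.textFile("UNSW-NB15_1.csv")

// Split each line into its comma-separated columns
val splitfile = file.map(x => x.split(","))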

// Key = start time, Value = (IP, best arrival time), where the best arrival time is the larger of columns 30 and 31
val keyval = splitfile.map(x => (x(28), (x(0), math.max(x(30).toDouble, x(31).toDouble))))


// Key = start time, Value = best arrival time (used to find the maximum)
val newResult = splitfile.map(x => (x(28), math.max(x(30).toDouble, x(31).toDouble)))
// Find the maximum time for each key: Key = start time, Value = max time
val findMax = newResult.reduceByKey((a, b) => math.max(a, b))



// Join the two pair RDDs: Key = start time, Value = ((IP, arrival time), max time)
val data = keyval.join(findMax)
// Keep only the records whose arrival time equals the max time for their key
val findIPs = data.filter { case (_, ((_, arrival), max)) => arrival == max }
findIPs.collect.foreach(println)

It's clumsy, but it got me there...
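
For anyone who wants to check the logic without a Spark cluster, here is a rough sketch of the same two steps (per-key maximum, then keep the matching records) using plain Scala collections on the sample rows from the question. The names MaxPerKey, Record, keepMax and sample are made up for illustration and do not come from the original code.

```scala
// Plain-Scala sketch (no Spark needed). MaxPerKey, Record, keepMax and sample
// are hypothetical names introduced for illustration only.
object MaxPerKey {
  // (timestamp, (IP, seconds)), the same shape as the question's first map
  type Record = (String, (String, Double))

  // Keep every record whose seconds equal the maximum for its timestamp.
  def keepMax(records: Seq[Record]): Seq[Record] = {
    // Step 1: maximum seconds per timestamp (the role reduceByKey plays in Spark).
    val maxByKey: Map[String, Double] =
      records.groupBy(_._1).map { case (ts, rs) => ts -> rs.map(_._2._2).max }
    // Step 2: look up each record's maximum and keep exact matches
    // (the role join followed by filter plays in Spark).
    records.filter { case (ts, (_, secs)) => secs == maxByKey(ts) }
  }

  // Sample rows copied from the question.
  val sample: Seq[Record] = Seq(
    ("1421927423", ("59.166.0.9", 0.011)),
    ("1421927423", ("59.166.0.3", 0.011)),
    ("1421927423", ("59.45.0.2", 27.203556)),
    ("1421927423", ("59.166.0.8", 0.018)),
    ("1421927423", ("59.166.0.8", 1.256667)),
    ("1421927423", ("175.45.176.2", 27.203556)),
    ("1421927424", ("59.166.0.8", 0.018)),
    ("1421927426", ("59.166.0.8", 0.018))
  )
}
```

Calling MaxPerKey.keepMax(MaxPerKey.sample) keeps every record that ties for the maximum, so both 59.45.0.2 and 175.45.176.2 survive for timestamp 1421927423. The comparison is an exact Double equality, which is safe here only because the maximum is one of the original values and is never recomputed.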

