Comparing 2 Scala 2D arrays: getting error: value sameElements is not a member of (String, String)

Problem description

Hi, I'm trying the following operation in Scala:

I have 2 dataframes. I want to compare their column names and then their column types. I started by extracting the column names and types, then sorted the arrays, and finally printed them:

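import scala.util.Sorting

// dtypes returns an Array[(String, String)] of (column name, type name) pairs
val df1colArr = df1.dtypes
val df2colArr = df2.dtypes

// Sort in place so both arrays are ordered by column name
Sorting.quickSort(df1colArr)
Sorting.quickSort(df2colArr)

println(df1colArr.deep.mkString("\n"))
println(df2colArr.deep.mkString("\n"))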

The output looks like this:

(age,IntegerType)
(color,StringType)
(dealer_id,StringType)
(first_name,StringType)
(id,IntegerType)
(last_name,StringType)
(loyalty_score,StringType)
(model,StringType)
(purchase_date,TimestampType)
(purchase_price,StringType)
(rank_dr,IntegerType)
(service_date,TimestampType)
(vin_num,StringType)

(age,IntegerType)
(color,StringType)
(dealer_id,StringType)
(first_name,StringType)
(id,IntegerType)
(last_name,StringType)
(loyalty_score,IntegerType)
(model,StringType)
(purchase_date,TimestampType)
(purchase_price,StringType)
(rank_dr,IntegerType)
(repeat_likely,IntegerType)
(service_date,TimestampType)
(vin_num,StringType)

Next, I have a simple utility to compare the two arrays above based on their value at index 0:

val col_similar: ( Array[(String,String)], Array[(String, String)] )=> String 
= (x,y) => {if (x(0).sameElements(y(0))) "similar" else "different"}

When I run the above code, I get the following error:

Error:(59, 105) value sameElements is not a member of (String, String)
val col_similar: ( Array[(String,String)], Array[(String, String)] ) => String 
= (x,y) => {if (x(0).sameElements(y(0))) "similar" else "different"}

Please help me understand why this code won't work. Thanks so much.

Tags: scala, apache-spark

Solution


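x(0) is a pair of strings, that is, a (String, String) tuple, and a tuple has no sameElements method. If you want to compare the arrays of pairs x and y, do this: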

if (x sameElements y) ... else ...
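As a minimal sketch of that fix, the utility from the question could be rewritten as below. The object name ColumnCompare and the sample arrays a, b and c are hypothetical stand-ins for the sorted df1.dtypes and df2.dtypes; only sameElements and == come from the standard library. Note that == on two arrays compares references, which is why sameElements is needed for the arrays, while a single tuple such as x(0) can be compared with plain ==.

```scala
object ColumnCompare {
  // Hypothetical sample data standing in for the sorted dtypes arrays.
  val a: Array[(String, String)] = Array(("age", "IntegerType"), ("color", "StringType"))
  val b: Array[(String, String)] = Array(("age", "IntegerType"), ("color", "StringType"))
  val c: Array[(String, String)] = Array(("age", "IntegerType"), ("color", "IntegerType"))

  // Compare the whole arrays position by position; each pair is compared with ==.
  val colSimilar: (Array[(String, String)], Array[(String, String)]) => String =
    (x, y) => if (x.sameElements(y)) "similar" else "different"

  // Comparing only the first pair of each array needs ==, not sameElements.
  def firstPairEqual(x: Array[(String, String)], y: Array[(String, String)]): Boolean =
    x(0) == y(0)
}
```

With this data, colSimilar(a, b) yields "similar" and colSimilar(a, c) yields "different", because sameElements requires every position to match.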

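By the way, I doubt this approach will scale to real datasets: collecting an entire dataframe to the master node is usually a bad idea. Maybe you can find some better ideas here.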
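Since the stated goal is to compare column names and types, it may also help to see which pairs differ rather than only whether the arrays match. The following is a rough sketch using plain Scala sets; the object SchemaDiff, the helper onlyIn and the sample arrays are hypothetical, and the data is a small subset of the output shown in the question.

```scala
object SchemaDiff {
  type Col = (String, String) // (column name, type name)

  // Pairs present in `left` but missing from `right`, sorted by name and then type.
  def onlyIn(left: Array[Col], right: Array[Col]): Seq[Col] =
    left.toSet.diff(right.toSet).toSeq.sorted

  // Hypothetical subset of the two dtypes arrays from the question.
  val df1Cols: Array[Col] = Array(
    ("id", "IntegerType"),
    ("loyalty_score", "StringType"),
    ("model", "StringType")
  )
  val df2Cols: Array[Col] = Array(
    ("id", "IntegerType"),
    ("loyalty_score", "IntegerType"),
    ("model", "StringType"),
    ("repeat_likely", "IntegerType")
  )
}
```

Here onlyIn(df1Cols, df2Cols) returns the loyalty_score pair with StringType, and onlyIn(df2Cols, df1Cols) returns loyalty_score with IntegerType plus the extra repeat_likely column. A column whose type changed therefore shows up on both sides.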