How do I filter a list of strings by a condition and return a String value in Scala?

Problem description

I have a case class defined as

case class Contribuyente(
  age: Int,
  workclass: String,
  education: String,
  educationNum: Int,
  maritalStatus: String,
  occupation: String,
  relationship: String,
  race: String,
  sex: String,
  capitalGain: Int,
  capitalLoss: Int,
  hoursPerWeek: Int,
  nativeCountry: String,
  income: String
)

This is part of an analysis of a Kaggle dataset. The link is here:

https://www.kaggle.com/johnolafenwa/us-census-data

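I want to find the workclass in which most people have an income above "50K", that is, where the fraction of rows with income ">50K" is greater than 0.5.

My goal is the following: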

def trabajoMejorRemunerado(c: Seq[Contribuyente]): String = ???

I want to return a String, but all I get is Unit.

The attempts I have made so far are:

Attempt 1:

def trabajoMejorRemunerado3(c: Seq[Contribuyente]): Unit = {
    var porcentaje: Double = 0
    for( w <- Vector("State-gov", "Self-emp-not-inc", "Private", "Federal-gov", "Local-gov", "Self-emp-inc", "Without-pay", "Never-worked")) {
      porcentaje = c.filter(_.income == ">50K").filter(_.workclass == w).length.toDouble / c.filter(_.workclass == w).length
      if (porcentaje > 0.5){
        w
      }
    }
  }

Attempt 2:

def trabajoMejorRemunerado2(c: Seq[Contribuyente]): Unit = 
{
    var porcentaje: Double = 0
    for(w <- c.map(_.workclass)) {
      porcentaje = c.filter(_.income == ">50K").filter(_.workclass == w).length.toDouble / c.filter(_.workclass == w).length
      println(porcentaje)
      if (porcentaje > 0.5){
        println(w)
      }
    }
  }

Attempt 3:

def trabajoMejorRemunerado4(c: Seq[Contribuyente]): Unit = c.map(_.workclass).foreach{w => if (c.filter(_.income == ">50K").filter(_.workclass == w).length.toDouble / c.filter(_.workclass == w).length > 0.5) println(w)}

Attempt 4:

def trabajoMejorRemunerado5(c: Seq[Contribuyente]): Unit = c.map(_.workclass).collectFirst{w => if (c.filter(_.income == ">50K").filter(_.workclass == w).length.toDouble / c.filter(_.workclass == w).length > 0.5) w.mkString("")}

Can anyone help me? Thanks in advance.

Tags: scala

Solution
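
One possible approach, offered as a sketch rather than a definitive answer: group the rows by workclass, compute for each group the fraction of rows whose income is ">50K", and keep only the groups above 0.5. The code below makes several assumptions that are not in the question: Contribuyente is cut down to the two fields actually used, the object name WorkclassDemo and the helpers highIncomeShare and trabajoMejorRemuneradoOpt are hypothetical, ties and multiple matches are resolved by taking the highest share, and an empty string is returned when no workclass qualifies.

```scala
object WorkclassDemo {
  // Assumption: trimmed-down stand-in for the question's case class,
  // keeping only the two fields this computation reads.
  final case class Contribuyente(workclass: String, income: String)

  // For each workclass, the fraction of its rows with income ">50K".
  def highIncomeShare(c: Seq[Contribuyente]): Map[String, Double] =
    c.groupBy(_.workclass).map { case (w, rows) =>
      w -> (rows.count(_.income == ">50K").toDouble / rows.size)
    }

  // Workclass with the highest share among those above 0.5, if any.
  def trabajoMejorRemuneradoOpt(c: Seq[Contribuyente]): Option[String] =
    highIncomeShare(c)
      .filter { case (_, share) => share > 0.5 }
      .reduceOption((a, b) => if (a._2 >= b._2) a else b)
      .map(_._1)

  // Same signature as in the question; "" is an assumed fallback
  // for the case where no workclass exceeds 0.5.
  def trabajoMejorRemunerado(c: Seq[Contribuyente]): String =
    trabajoMejorRemuneradoOpt(c).getOrElse("")
}
```

With the full case class from the question the bodies stay the same, since only workclass and income are read. Returning Option[String] makes the "no workclass qualifies" case explicit; the String version is there only to match the requested signature.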

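As for why the attempts in the question evaluate to Unit: a for loop without yield and a call to foreach are both executed only for their side effects, so a bare w inside the if is computed and then discarded. Attempt 4 is closer, but collectFirst expects a partial function, which is normally written as a case with a guard. A minimal sketch of that idea follows, again with a cut-down Contribuyente; the names CollectFirstDemo, share and primerTrabajoMayoritario are hypothetical.

```scala
object CollectFirstDemo {
  // Assumption: trimmed-down stand-in for the question's case class.
  final case class Contribuyente(workclass: String, income: String)

  // Fraction of rows of workclass w with income ">50K".
  // w is expected to occur in c, so rows is non-empty.
  def share(c: Seq[Contribuyente], w: String): Double = {
    val rows = c.filter(_.workclass == w)
    rows.count(_.income == ">50K").toDouble / rows.length
  }

  // First workclass, in order of appearance, whose share exceeds 0.5.
  // `case w if ... => w` is a partial function: it is defined only
  // where the guard holds, which is what collectFirst needs.
  def primerTrabajoMayoritario(c: Seq[Contribuyente]): Option[String] =
    c.map(_.workclass).distinct.collectFirst {
      case w if share(c, w) > 0.5 => w
    }
}
```

Calling distinct before collectFirst avoids recomputing the share for every repeated workclass. The result is an Option[String]; getOrElse can turn it into a plain String if a default value makes sense for the analysis.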
