Matching Option in Spark SQL Row.unapply
I wonder whether I can match optional (nullable) fields using Row's unapply method in the Spark 1.6 Scala API. Consider the following example (for simplicity, I use only one field):
case class MyRow(id: Option[Int])

val data = Seq(
  MyRow(Some(1)),
  MyRow(Some(2)),
  MyRow(None))
val df = sc.parallelize(data).toDF()
df.show()
+----+
| id|
+----+
| 1|
| 2|
|null|
+----+
I can do:
val myUDF1 = udf((r: Row) =>
  r match {
    case r: Row if !r.isNullAt(0) => r.getInt(0)
    case r: Row if r.isNullAt(0) => 999
  }
)
df.withColumn("udf_result", myUDF1(struct($"id")))
But this is kind of ugly. I've found that I can also do:
val myUDF = udf((r: Row) =>
  r match {
    case Row(i: Int) => i
    case _ => 999
  }
)
This gives the same result. However, both of the above approaches become ugly once the Row consists of multiple nullable fields, as sketched below.
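For example, with two nullable Int fields the pattern-matching version needs one case per null/non-null combination, so the number of cases doubles with every field. A minimal sketch, assuming a hypothetical second Option[Int] column named score:

val myUDF2 = udf((r: Row) =>
  r match {
    // one case per null/non-null combination: 2^n cases for n fields
    case Row(i: Int, s: Int) => i + s
    case Row(i: Int, _) => i + 999
    case Row(_, s: Int) => 999 + s
    case _ => 999 + 999
  }
)
df.withColumn("udf_result", myUDF2(struct($"id", $"score")))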
What I would like to have is this:
val myUDF = udf((r: Row) =>
  r match {
    case Row(i: Option[Int]) => i.getOrElse(999)
  }
)
Unfortunately, this does not work; I get a scala.MatchError (presumably because the values inside the Row are plain Ints or nulls, never Options).
Is there a way to do this?