Scala: compare two delimited strings and generate a third delimited string



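I have two strings:

str1 = A#2021-04-02,B#2021-04-01,C#2021-04-02
str2 = A#2021-04-02#60.0,B#2021-04-02#80.0,C#2021-04-01#60.0

The first part of each entry is the group and the second part is the date; str2 has an additional percentage field. I want to generate a third string by comparing the two: when the group matches, check whether the date in str2 is greater than the date in str1 and whether the percentage in str2 is >= 75. If both hold, take the date from str2; otherwise keep the date from str1.

The output should be str3 = A#2021-04-02,B#2021-04-02,C#2021-04-02, because for group B the date in str2 is greater than the date in str1 and the percentage is >= 75.

If instead str1 = A#2021-04-02,B#2021-04-01,C#2021-04-02 and str2 = A#2021-04-02#60.0,B#2021-04-02#60.0,C#2021-04-01#60.0, then str3 would be A#2021-04-02,B#2021-04-01,C#2021-04-02, because the percentage is not >= 75.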

def parseString(s: String) = s.split(',').map(_.split('#'))
val str1: String = ???
val str2: String = ???

// Notes:
// 1. `collect` silently drops entries that do not match the expected shape.
// 2. Dates are not parsed; they are compared as strings for simplicity, which
//    works as long as every date is a valid yyyy-MM-dd string.
// 3. `p.toDouble` throws if p is not a valid double.
// 4. `sc` is the SparkContext (for example, the one provided by spark-shell).
val rdd1 = sc.parallelize(parseString(str1)).collect { case Array(g, d, _*) => g -> d }
val rdd2 = sc.parallelize(parseString(str2)).collect { case Array(g, d, p, _*) => g -> ((d, p.toDouble)) }
// 5. A left outer join is assumed here, based on your requirement to fall back
//    to the left (str1) date when the condition fails.
// 6. The join shuffles the data, so the order of groups in str3 is not guaranteed.
val str3 = rdd1.leftOuterJoin(rdd2).map {
  case (g, (d1, Some((d2, p)))) if d2 > d1 && p >= 75 => s"$g#$d2"
  case (g, (d1, _)) => s"$g#$d1"
}.collect.mkString(",")
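
If the strings are small enough to handle on the driver, the same rule can be sketched with plain Scala collections, without Spark. This is only a minimal sketch under the assumptions stated in the question: `MergeDelimited` and `merge` are hypothetical names introduced here, dates are compared as yyyy-MM-dd strings, and entries in str2 whose percentage does not parse are simply ignored. Unlike the RDD join, it keeps the groups in the order they appear in str1.

```scala
import scala.util.Try

// Hypothetical helper object, not part of the original answer.
object MergeDelimited {
  // Split "A#2021-04-02,B#..." into one array of fields per entry.
  private def parse(s: String): List[Array[String]] =
    s.split(',').toList.map(_.split('#'))

  def merge(str1: String, str2: String): String = {
    // group -> (date, percentage); entries with a missing or
    // non-numeric percentage are dropped instead of throwing.
    val right: Map[String, (String, Double)] =
      parse(str2)
        .collect { case Array(g, d, p, _*) => (g, d, Try(p.toDouble).toOption) }
        .collect { case (g, d, Some(pct)) => (g, (d, pct)) }
        .toMap

    parse(str1).collect { case Array(g, d1, _*) =>
      right.get(g) match {
        // Take the str2 date only if it is later and the percentage is >= 75.
        case Some((d2, pct)) if d2 > d1 && pct >= 75 => s"$g#$d2"
        // Otherwise fall back to the str1 date (same as the left outer join).
        case _ => s"$g#$d1"
      }
    }.mkString(",")
  }
}
```

For example, `MergeDelimited.merge(str1, str2)` with the first pair of strings from the question yields A#2021-04-02,B#2021-04-02,C#2021-04-02, and with the second pair it yields A#2021-04-02,B#2021-04-01,C#2021-04-02.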