am面临戏剧表演问题,这是一种需要解决的经典问题。
我有一个字符串作为输入,比如:
输入:
val test: String = "1,1,12,1,1,12,13"
val test2: String = "1,1,12,1,1,12,13,1,1,12"
所需输出:
test --> 2{1,1,12},13
test2 --> 2{1,1,12},13,2{1},12
我采用了以下方法,通过以递归方式比较给定字符串的后缀和前缀
def shorten_prefix(pfx:String, sfx: String): (Integer, String, String) = {
var count = 1
var _sfx = sfx
while (_sfx.startsWith(pfx)) {
count = count + 1
_sfx = _sfx.splitAt(pfx.length)._2
}
val _tmp = s"$count{${pfx}}"
(count, _tmp, _sfx)
}
def find_shortest_repr(input: String): String= {
var possible_variants: mutable.ListBuffer[String] = new mutable.ListBuffer[String]()
if (input.isEmpty) {
return ""
}
val size = input.length
for (i <- 1 until size + 1){
val (prefixes, suffixes) = input.splitAt(i)
val (counter, shortened, new_ending) = shorten_prefix(prefixes, suffixes)
val shortest_ending: String = find_shortest_repr(new_ending)
val tmp = shortened ++ shortest_ending
possible_variants += (tmp)
}
possible_variants.minBy(f => f.length)
}
def compress(x: String)= {
val _tmp = find_shortest_repr(x)
val regex = "(,\})"
val subst = "},"
_tmp.replaceAll(regex, subst).dropRight(1)
}
println(compress(test))
println(compress(test2))
}
听起来字符串越长,计算所有可能的排列就越长,以找到最佳匹配。
有什么窍门或建议吗?
谢谢
这里有一个fp解决方案,您可以通过dp方法将其修改为O(n^3(时间和O(n^2(空间的复杂性之一:
注:此问题的递归关系为compression(ss,start,end(=最小化{compression(s,start,i(+compression(ss,i,end(for i in 1,2,..,len(ss(-1}
所以有n^n个程序状态
def compress(text: String): String = {
def size: ((Seq[String], Int)) => Int = {
case (ss, i) => if(i == 1) ss.map(_.size).sum + ss.size - 1
else ss.map(_.size).sum + ss.size + i.toString.size + 1
}
def length(res: Seq[(Seq[String], Int)]): Int = res.map(size).sum + res.size - 1
def cmpOne(items: Seq[String]): (Seq[String], Int) =
(1 to items.size).collectFirst {
case i if items.size % i == 0 && items.grouped(i).toSet.size == 1 => (items.take(i), items.size / i)
}.get
def cmpAll(items: Seq[String]): Seq[(Seq[String], Int)] = {
val totalRes = Seq(cmpOne(items))
if (items.size >= 2) {
val totalLen = length(totalRes)
val splitRes = (1 until items.size).map(i => cmpAll(items.take(i)) ++ cmpAll(items.drop(i))).minBy(length)
val splitLen = length(splitRes)
if (splitLen < totalLen) splitRes else totalRes
}
else totalRes
}
cmpAll(text.split(",")).map{ case(ss, i) => if(i == 1) ss.mkString(",") else i.toString + "{" + ss.mkString(",") + "}" }.mkString(",")
}
下面是一些代码。它是函数式和尾部递归的,但不知道性能,也不知道是否有更好的算法!
// Count number of times first element repeats at start of list
// Returns repeat count
def prefixCount[T](s: List[T]): Int = {
def loop(first: T, rem: List[T], matchCount: Int): Int =
rem match {
case hd :: tail if hd == first =>
loop(first, tail, matchCount + 1)
case _ =>
matchCount
}
s match {
case hd :: tail => loop(hd, tail, 1)
case _ => s.size
}
}
// Find the largest repeating group at start of sequence
// Returns (groupSize, groupCount)
def prefixGroup[T](seq: Seq[T]): (Int, Int) = {
def loop(len: Int): (Int, Int) =
if (len == 0) {
(1, 1)
} else {
val g = seq.grouped(len).toList
val c = prefixCount(g)
if (c > 1) {
(len, c)
} else {
loop(len-1)
}
}
seq.size match {
case 0 => (0,1)
case n => loop(n/2)
}
}
// Find all prefix sequences
def prefixAll[T](v: Seq[T]): List[(Seq[T], Int)] = {
def loop(rem: Seq[T], res: List[(Seq[T], Int)]): List[(Seq[T], Int)] =
rem match {
case Nil =>
res.reverse
case hd :: Nil =>
((rem, 1) +: res).reverse
case _ =>
val (groupSize, groupCount) = prefixGroup(rem)
loop(rem.drop(groupSize*groupCount), (rem.take(groupSize), groupCount) +: res)
}
loop(v, Nil)
}
def prefixToString[T](p: List[(Seq[T], Int)]): String =
p.map {
case (s, 1) => s.mkString(",")
case (s, l) => l.toString + "{" + s.mkString(",") + "}"
}.mkString(",")
def test(s: String) = {
val v = s.split(",").toVector
val a = prefixAll(v)
print(s"$s => ")
println(prefixToString(a))
}
val tests = List(
"",
"1",
"1,2,3,4,5,6",
"1,1,12,1,1,12,13",
"1,1,12,1,1,12,13,1,1,12",
"2,1,2,2,1,2,1,2,2",
)
tests.foreach(test)
输出为:
=>
1 => 1
1,2,3,4,5,6 => 1,2,3,4,5,6
1,1,12,1,1,12,13 => 2{1,1,12},13
1,1,12,1,1,12,13,1,1,12 => 2{1,1,12},13,2{1},12
2,1,2,2,1,2,1,2,2 => 2{2,1,2},1,2{2}
一些想法(其中一些也可以在Tim的解决方案中找到(:
- 在启动实际算法之前拆分输入字符串(假设输入始终以逗号分隔(
- 不需要搜索比剩余字符串长的前缀的进一步出现(例如"1,2,3,3"中的"1,3,2"(
- 您可能希望从尽可能长的前缀(大小为
input.length / 2
(开始,然后递减,而不是从尽可能短的前缀(长度为1(开始搜索,然后继续使用较长的前缀。这样,你就可以";快进";只要你找到匹配的