为什么一种回溯方法比另一种更快?



所以我在《破词II》的Leetcode中工作,并提出了两个类似但在记忆方面有所不同的回溯实现。然而,一个会通过规格,而另一个不会,因为超过了时间限制。有人能解释一下为什么方法1比方法2快吗?

对于上下文,基本上问题给了我一个字符串和一个字典。如果字符串中的单词也在字典中,则用空格分隔单词,并将结果字符串(由单词组成)放入结果数组中。字典里的单词可以被使用不止一次!

例如:

s = "pineapplepenapple"
wordDict = ["apple", "pen", "applepen", "pine", "pineapple"]
Output:
[
"pine apple pen apple",
"pineapple pen apple",
"pine applepen apple"
]
Explanation: Note that you are allowed to reuse a dictionary word.

方法1(有效!):

require 'set'
def word_break(s, word_dict)
word_dict = Set.new(word_dict)
bt(s, word_dict, 0, {})
end
def bt(s, word_dict, i, mem)
return mem[i] if mem[i]
return [''] if i == s.length
res = []
j = i
while j < s.length
word = s[i..j]
if word_dict.include?(word)
word_breaks = bt(s, word_dict, j + 1, mem)

word_breaks.each do |words|
new_combined_words = word
new_combined_words += ' ' + words if words.length > 0
res << new_combined_words
end
end
j += 1
end
# Memoizing here makes it fast!
mem[i] = res
end

方法2(不够快):

require 'set'
def word_break(s, word_dict)
word_dict = Set.new(word_dict)
bt(s, word_dict, 0, {})
end
def bt(s, word_dict, i, mem)
return mem[i] if mem[i]
return [''] if i== s.length
res = []
j = i
while j < s.length
word = s[i..j]
if word_dict.include?(word)
word_breaks = bt(s, word_dict, j + 1, mem)

word_breaks.each do |words|
new_combined_words = word
new_combined_words += ' ' + words if words.length > 0

# Memoizing here but it's too slow
(mem[i] ||= []) << new_combined_words

res << new_combined_words
end
end
j += 1
end

res
end

在方法1中,我最后用mem[i] = res进行记忆,而在方法2中,我在生成新的单词组合时动态地进行记忆。

任何帮助都将非常感激,谢谢!

mem[i]碰巧是空数组时,您永远不会在方法2中设置它。空值不被记忆

Cary在他的评论中建议应该解决这个问题。代码仍然会比方法1慢一点,但它可能会通过leetcode的测试。

UPD:即使有了建议的编辑,当bt(s, word_dict, j + 1, mem)返回一个空数组时,我们永远不会记住mem[i],这使得代码渐近指数。要解决此问题,请尝试以下操作:

mem[i] ||= []

if word_dict.include?(word)
word_breaks = bt(s, word_dict, j + 1, mem)

word_breaks.each do |words|
new_combined_words = word
new_combined_words += ' ' + words if words.length > 0
mem[i] << new_combined_words

res << new_combined_words
end
end

这不是一个答案,而是一个扩展的评论,旨在阐明问题。我对这两种方法进行了基准测试,无法重现#1比#2快的结果。word_break1(调用bt1)是#1;word_break2(调用bt2)是#2。

s = "nowisthetimetohavesomefun"
word_dict = %w|now is the time to have some fun no wist he so mefun ist he ti me|
require 'benchmark'
Benchmark.bm do |x|
x.report("#1") { word_break1(s, word_dict) }
x.report("#2") { word_break2(s, word_dict) }
end

从几次运行中得到以下结果:

user     system      total        real
#1  0.000342   0.000083   0.000425 (  0.000312)
#2  0.000264   0.000066   0.000330 (  0.000242)*
#1  0.000315   0.000075   0.000390 (  0.000288)
#2  0.000230   0.000066   0.000296 (  0.000208)*
#1  0.000292   0.000079   0.000371 (  0.000268)
#2  0.000255   0.000065   0.000320 (  0.000253)*
#1  0.000292   0.000090   0.000382 (  0.000261)*
#2  0.000337   0.000121   0.000458 (  0.000349)
#1  0.000301   0.000063   0.000364 (  0.000291)*
#2  0.000413   0.000134   0.000547 (  0.000385)
#1  0.000306   0.000079   0.000385 (  0.000280)
#2  0.000281   0.000082   0.000363 (  0.000255)*

两个方法都返回以下数组。

["no wist he ti me to have so me fun", "no wist he ti me to have so mefun",
"no wist he ti me to have some fun", "no wist he time to have so me fun",
"no wist he time to have so mefun", "no wist he time to have some fun",
"now is the ti me to have so me fun", "now is the ti me to have so mefun",
"now is the ti me to have some fun", "now is the time to have so me fun",
"now is the time to have so mefun", "now is the time to have some fun",
"now ist he ti me to have so me fun", "now ist he ti me to have so mefun",
"now ist he ti me to have some fun", "now ist he time to have so me fun",
"now ist he time to have so mefun", "now ist he time to have some fun"]

最新更新