I'm trying to create an object from each repeating set in the text below (an .srt subtitle file):
1
00:02:12,446 --> 00:02:14,406
The Hovitos are near.

2
00:02:15,740 --> 00:02:18,076
The poison is still fresh,
three days.

3
00:02:18,076 --> 00:02:19,744
They're following us.
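For example, I could take the three or four lines of each set and assign them to attributes of a new object. So for the first set, I could have Sentence.create(number: 1, time_marker: '00:02:12', content: "The Hovitos are near.")
Starting from script.each_line, what other general structure might get me on the right track? I'm having a tough time with this, and any help would be fantastic!
Edit
Some messy, unfinished code that I have so far is below. It does work (I think). Would you go a completely different route? I don't have any experience with this.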
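number = nil
time_marker = nil
content = []

# Append a blank line so the last entry is flushed too.
script = script.strip + "\n\n"
script.each_line do |line|
  line = line.strip
  if line.empty?
    # A blank line ends the current entry.
    next if content.empty?
    Sentence.create(movie: @movie, number: number,
                    time_marker: time_marker, content: content.join("\n"))
    content = []
    time_marker = nil
  elsif time_marker.nil? && line =~ /^\d+$/
    # Index line: digits only, seen before the timing line.
    number = line.to_i
  elsif time_marker.nil? && line =~ /-->/
    # Timing line: keep only the leading "HH:MM:SS".
    time_marker = line[0..7]
  else
    content << line
  end
end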
Here's one way to do it:
File.read('subtitles.srt').split(/^\s*$/).each do |entry| # Read in the entire text and split on empty lines
  sentence = entry.strip.split("\n")
  number = sentence[0] # First element after empty line is 'number'
  time_marker = sentence[1][0..7] # Second element is 'time_marker'
  content = sentence[2..-1].join("\n") # Everything after that is 'content'
end
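The loop above only computes the three fields for each entry. As a rough sketch of how they might be collected into objects, the same split can be wrapped in a method that works on a string. `SrtEntry` and `parse_srt` are hypothetical names used here for illustration (a plain Struct standing in for the question's `Sentence` model), and the guard against short blocks is an added assumption, not part of the original snippet.

```ruby
# Hypothetical stand-in for the question's Sentence model.
SrtEntry = Struct.new(:number, :time_marker, :content)

# Split the text on blank lines and build one SrtEntry per block.
def parse_srt(text)
  entries = text.split(/^\s*$/).map do |block|
    lines = block.strip.split("\n").map(&:strip)
    next if lines.size < 3 # skip empty or incomplete blocks
    SrtEntry.new(lines[0].to_i,              # index line
                 lines[1][0..7],             # "HH:MM:SS" from the timing line
                 lines[2..-1].join("\n"))    # remaining lines are the text
  end
  entries.compact
end

sample = "1\n00:02:12,446 --> 00:02:14,406\nThe Hovitos are near.\n\n" \
         "2\n00:02:15,740 --> 00:02:18,076\nThe poison is still fresh,\nthree days.\n"

parse_srt(sample).each { |entry| p entry.to_a }
# [1, "00:02:12", "The Hovitos are near."]
# [2, "00:02:15", "The poison is still fresh,\nthree days."]
```

With a real file, something like `parse_srt(File.read('subtitles.srt'))` should behave the same way, provided the entries are separated by blank lines.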
Assuming the subtitles are in the following variable:
subtitles = %q{1
00:02:12,446 --> 00:02:14,406
The Hovitos are near.

2
00:02:15,740 --> 00:02:18,076
The poison is still fresh,
three days.

3
00:02:18,076 --> 00:02:19,744
They're following us.}
Then you can do this:
def split_subs(subtitles)
  grouped, splitted = [], []
  # Push a trailing blank line so the last group is flushed.
  subtitles.split("\n").push("\n").each do |sub|
    if sub.strip.empty?
      next if grouped.empty? # ignore consecutive blank lines
      splitted.push({
        number: grouped[0],
        time_marker: grouped[1].split(",").first,
        content: grouped[2..-1].join(" ")
      })
      grouped = []
    else
      grouped.push sub.strip
    end
  end
  splitted
end

puts split_subs(subtitles)
# output:
# $ ruby 23025546.rb
# {:number=>"1", :time_marker=>"00:02:12", :content=>"The Hovitos are near."}
# {:number=>"2", :time_marker=>"00:02:15", :content=>"The poison is still fresh, three days."}
# {:number=>"3", :time_marker=>"00:02:18", :content=>"They're following us."}
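Since the keys of each hash match the attribute names from the question, each hash could probably be handed straight to a constructor. Below is a minimal sketch of that step. `SentenceRecord` is a hypothetical keyword-initialized Struct standing in for the question's `Sentence` model, and the `parsed` array is sample data in the shape `split_subs` returns; converting `number` to an integer is an extra assumption about what the model expects.

```ruby
# Hypothetical stand-in for the question's Sentence model (needs Ruby 2.5+).
SentenceRecord = Struct.new(:number, :time_marker, :content, keyword_init: true)

# Sample hashes in the shape split_subs returns.
parsed = [
  { number: "1", time_marker: "00:02:12", content: "The Hovitos are near." },
  { number: "2", time_marker: "00:02:15", content: "The poison is still fresh, three days." }
]

# Build one object per hash, converting the index to an Integer on the way.
sentences = parsed.map do |attrs|
  SentenceRecord.new(**attrs.merge(number: attrs[:number].to_i))
end

sentences.each { |s| puts "#{s.number} @ #{s.time_marker}: #{s.content}" }
```

If `Sentence` is an ActiveRecord model, as the `Sentence.create` call in the question suggests, passing the same hash to `Sentence.create` would be the analogous step.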