Ruby: merge hashes in an array by one key, summing the values of another key



I have an array of hashes from a Dynamo table that I need to group by one key while summing the values of another key. My array looks like this:

data = [
  { 'state' => 'Florida', 'minutes_of_sun' => 10, 'timestamp' => 1497531600, 'region' => 'Southeast' },
  { 'state' => 'Florida', 'minutes_of_sun' => 7, 'timestamp' => 1497531600, 'region' => 'Southeast' },
  { 'state' => 'Florida', 'minutes_of_sun' => 2, 'timestamp' => 1497531600, 'region' => 'Southeast' },
  { 'state' => 'Georgia', 'minutes_of_sun' => 15, 'timestamp' => 1497531600, 'region' => 'Southeast' },
  { 'state' => 'Georgia', 'minutes_of_sun' => 5, 'timestamp' => 1497531600, 'region' => 'Southeast' }
]

The end result I'm looking for is:

data = [
  { 'state' => 'Florida', 'minutes_of_sun' => 19, 'region' => 'Southeast' },
  { 'state' => 'Georgia', 'minutes_of_sun' => 20, 'region' => 'Southeast' }
]

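I've been able to do this with the method I wrote below, but it is slow and clunky. Is there a faster way to do this, or one that takes fewer lines of code?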

def combine_data(data)
  combined_data = []
  data.each do |row|
    existing_data = combined_data.find { |key| key['state'] == row['state'] }
    if existing_data.present?
      existing_data['minutes_of_sun'] += row['minutes_of_sun']
    else
      combined_data << row
    end
  end
  combined_data
end

Try this:

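data.group_by { |item| item['state'] }.values.map do |arr|
  h = arr.first
  h.delete('timestamp') # note: this also removes the key from the original row in data
  # Start from 0 so the accumulator is an Integer rather than the first hash
  h.merge('minutes_of_sun' => arr.inject(0) { |acc, item| acc + item['minutes_of_sun'] })
end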
 => [{"state"=>"Florida", "minutes_of_sun"=>19, "region"=>"Southeast"}, {"state"=>"Georgia", "minutes_of_sun"=>20, "region"=>"Southeast"}]

As of Ruby 2.4.0, you can use sum instead:

data.group_by { |item| item['state'] }.values.map do |arr| 
  h = arr.first
  h.delete('timestamp')
  h.merge('minutes_of_sun' => arr.sum { |item| item['minutes_of_sun'] }) 
end
 => [{"state"=>"Florida", "minutes_of_sun"=>19, "region"=>"Southeast"}, {"state"=>"Georgia", "minutes_of_sun"=>20, "region"=>"Southeast"}]
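Both snippets above call delete on the first hash of each group, so they modify the rows of the original data array. If that matters, a minimal sketch that leaves the input untouched could look like the following. The method name combine_by_state and its keyword arguments are hypothetical, chosen only for illustration, and sum assumes Ruby 2.4 or later.

```ruby
# Hypothetical helper; the name and keyword arguments are illustrative only.
def combine_by_state(rows, group_key: 'state', sum_key: 'minutes_of_sun', drop: ['timestamp'])
  rows.group_by { |row| row[group_key] }.values.map do |group|
    # reject returns a new hash, so the caller's rows are not modified
    base = group.first.reject { |key, _| drop.include?(key) }
    # sum needs Ruby 2.4+; on older versions inject(0) { ... } does the same job
    base.merge(sum_key => group.sum { |row| row[sum_key] })
  end
end

data = [
  { 'state' => 'Florida', 'minutes_of_sun' => 10, 'timestamp' => 1497531600, 'region' => 'Southeast' },
  { 'state' => 'Florida', 'minutes_of_sun' => 7, 'timestamp' => 1497531600, 'region' => 'Southeast' },
  { 'state' => 'Florida', 'minutes_of_sun' => 2, 'timestamp' => 1497531600, 'region' => 'Southeast' },
  { 'state' => 'Georgia', 'minutes_of_sun' => 15, 'timestamp' => 1497531600, 'region' => 'Southeast' },
  { 'state' => 'Georgia', 'minutes_of_sun' => 5, 'timestamp' => 1497531600, 'region' => 'Southeast' }
]

result = combine_by_state(data)
```

Here result holds one hash per state with the summed minutes, while every row in data still has its timestamp and its original minutes_of_sun.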

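You can use the form of Hash#update (a.k.a. merge!) that takes a block to determine the values of keys that are present in both hashes being merged. See the docs for an explanation of that block's three block variables.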

data = [
  { 'state'=>'Florida', 'sun_min'=>10, 'stamp'=>149, 'region'=>'SE' },
  { 'state'=>'Georgia', 'sun_min'=>15, 'stamp'=>149, 'region'=>'SE' },
  { 'state'=>'Georgia', 'sun_min'=> 5, 'stamp'=>149, 'region'=>'SE' }
]
data.each_with_object({}) do |g,h|
  h.update(g['state']=>g.reject { |k,_| k=='stamp' }) do |_,o,n|
    o.merge('sun_min'=>o['sun_min']+n['sun_min'])
  end
end.values
  #=> [{"state"=>"Florida", "sun_min"=>10, "region"=>"SE"},
  #    {"state"=>"Georgia", "sun_min"=>20, "region"=>"SE"}]

Note that without .values this would return

#=> {"Florida"=>{"state"=>"Florida", "sun_min"=>10, "region"=>"SE"},
#    "Georgia"=>{"state"=>"Georgia", "sun_min"=>20, "region"=>"SE"}}
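To make the three block variables concrete, here is a small standalone sketch of the block form of Hash#update. The hashes totals and incoming are made-up sample data, not part of the question.

```ruby
# Made-up sample data for illustration
totals = { 'Florida' => 10, 'Georgia' => 15 }
incoming = { 'Georgia' => 5, 'Texas' => 8 }

# The block runs only for keys present in both hashes ('Georgia' here).
# It receives the key, the value already in the receiver, and the incoming value;
# its return value becomes the stored value for that key.
totals.update(incoming) { |_key, old_value, new_value| old_value + new_value }

# totals is now { 'Florida' => 10, 'Georgia' => 20, 'Texas' => 8 }
```

Keys that appear in only one of the two hashes ('Florida' and 'Texas') are copied over without the block being called, which is why the first row for each state in the answer above passes through unchanged.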
