ruby中array# uniq方法的时间复杂度是多少?



谁能告诉我哪个算法是内部使用ruby使用Array#uniq方法从ruby数组中删除重复?

From the docs:

static VALUE
rb_ary_uniq(VALUE ary)
{
    VALUE hash, uniq, v;
    long i;
    if (RARRAY_LEN(ary) <= 1)
        return rb_ary_dup(ary);
    if (rb_block_given_p()) {
        hash = ary_make_hash_by(ary);
        uniq = ary_new(rb_obj_class(ary), RHASH_SIZE(hash));
        st_foreach(RHASH_TBL(hash), push_value, uniq);
    }
    else {
        hash = ary_make_hash(ary);
        uniq = ary_new(rb_obj_class(ary), RHASH_SIZE(hash));
        for (i=0; i<RARRAY_LEN(ary); i++) {
            st_data_t vv = (st_data_t)(v = rb_ary_elt(ary, i));
            if (st_delete(RHASH_TBL(hash), &vv, 0)) {
                rb_ary_push(uniq, v);
            }
        }
    }
    ary_recycle_hash(hash);
    return uniq;

具有O(N)复杂度

在内部使用Hash时平摊O(n)

这取决于您所谈论的"内部"。目前有7种可用于生产的Ruby实现,Ruby语言规范并没有规定任何特定的算法。所以,这真的取决于实现。

。,这是Rubinius使用的实现:

Rubinius.check_frozen
if block_given?
  im = Rubinius::IdentityMap.from(self, &block)
else
  im = Rubinius::IdentityMap.from(self)
end
return if im.size == size
array = im.to_array
@tuple = array.tuple
@start = array.start
@total = array.total
self

这个来自JRuby:

RubyHash hash = makeHash();
if (realLength == hash.size()) return makeShared();
RubyArray result = new RubyArray(context.runtime, getMetaClass(), hash.size()); 
int j = 0;
try {
    for (int i = 0; i < realLength; i++) {
        IRubyObject v = elt(i);
        if (hash.fastDelete(v)) result.values[j++] = v;
    }
} catch (ArrayIndexOutOfBoundsException aioob) {
    concurrentModification();
}
result.realLength = j;
return result;

It compares elements using their hash (provided by the Object#hash method) then compares hashes with Object#eql?.

来源:https://github.com/ruby/ruby/blob/trunk/array.c L3976

时间复杂度为线性时间,即O(n),因为它使用哈希来内部实现算法。

最新更新