What is the significance of RawComparator, and in which scenarios do we use it?



What is RawComparator, and why does it matter?

Is it mandatory to use a RawComparator in every MapReduce program?

A RawComparator operates directly on the serialized byte representation of objects.

It is not mandatory to use one in every MapReduce program.

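MapReduce is fundamentally a batch processing system and is not suited to interactive analysis. You cannot run a query and get results back in a few seconds or less. Queries typically take minutes or longer, so it is best suited to offline use, where no one is sitting in the processing loop waiting for results.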

If you still want to reduce the time a MapReduce job takes, a RawComparator is one of the optimizations worth applying.

Using a RawComparator:

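Intermediate key-value pairs are passed from the Mapper to the Reducer. Before they reach the Reducer, the shuffle and sort steps are performed.

Sorting gets faster because a RawComparator compares keys directly on their serialized bytes. Without a RawComparator, every intermediate key has to be fully deserialized into an object before it can be compared.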
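To make the difference concrete, here is a minimal, Hadoop-free sketch of the two ways to compare a serialized key. The class name `RawCompareSketch`, the two-int key layout, and all method names are assumptions made for illustration; none of them are Hadoop APIs.

```java
import java.nio.ByteBuffer;

// Illustrative only: a hypothetical key made of two ints, stored as 8 big-endian bytes.
class RawCompareSketch {
    // Serialize the pair (i, j) as two consecutive 4-byte ints.
    static byte[] serialize(int i, int j) {
        return ByteBuffer.allocate(8).putInt(i).putInt(j).array();
    }

    // Decode a big-endian int straight out of the buffer at the given offset.
    static int readInt(byte[] b, int start) {
        return ((b[start] & 0xff) << 24)
             | ((b[start + 1] & 0xff) << 16)
             | ((b[start + 2] & 0xff) << 8)
             | (b[start + 3] & 0xff);
    }

    // Raw comparison: read only the fields needed to decide the order.
    // If the first ints differ, the second ints are never touched.
    static int compareRaw(byte[] b1, int s1, byte[] b2, int s2) {
        int cmp = Integer.compare(readInt(b1, s1), readInt(b2, s2));
        if (cmp != 0) {
            return cmp;
        }
        return Integer.compare(readInt(b1, s1 + 4), readInt(b2, s2 + 4));
    }

    // Deserializing comparison: rebuild both keys in full, then compare them.
    static int compareDeserialized(byte[] b1, byte[] b2) {
        ByteBuffer k1 = ByteBuffer.wrap(b1);
        ByteBuffer k2 = ByteBuffer.wrap(b2);
        int i1 = k1.getInt(), j1 = k1.getInt();
        int i2 = k2.getInt(), j2 = k2.getInt();
        int cmp = Integer.compare(i1, i2);
        return cmp != 0 ? cmp : Integer.compare(j1, j2);
    }
}
```

Both methods produce the same ordering. The raw version skips rebuilding the key and can stop at the first field that differs, which is where the saving comes from when the sort performs many comparisons.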

Example:

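import org.apache.hadoop.io.WritableComparator;

// Raw comparator for IndexPair keys, assumed to be serialized as two
// consecutive 4-byte ints (i followed by j).
public class IndexPairComparator extends WritableComparator {
    protected IndexPairComparator() {
        super(IndexPair.class);
    }

    @Override
    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        // Compare the first int of each key, read directly from the byte buffers
        int i1 = readInt(b1, s1);
        int i2 = readInt(b2, s2);
        int comp = Integer.compare(i1, i2);
        if (comp != 0) {
            return comp;
        }
        // First ints are equal: compare the second int, which starts 4 bytes later
        int j1 = readInt(b1, s1 + 4);
        int j2 = readInt(b2, s2 + 4);
        return Integer.compare(j1, j2);
    }
}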

In the example above we did not implement RawComparator directly. Instead, we extended WritableComparator, which itself implements RawComparator.

For more details, see this RawComparator article.

I know I am answering an old question.

Here is another example of writing a RawComparator for a WritableComparable object:

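import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;
import org.apache.hadoop.io.WritableUtils;

public class CompositeWritable2 implements WritableComparable<CompositeWritable2> {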
  private Text textData1;
  private LongWritable longData;
  private Text textData2;
  static {
    WritableComparator.define(CompositeWritable2.class, new Comparator());
  }
  /**
   * Empty constructor
   */
  public CompositeWritable2() {
    textData1 = new Text();
    longData = new LongWritable();
    textData2 = new Text();
  }
  /**
   * Comparator
   * 
   * @author CuriousCat
   */
  public static class Comparator extends WritableComparator {
    private static final Text.Comparator TEXT_COMPARATOR = new Text.Comparator();
    private static final LongWritable.Comparator LONG_COMPARATOR = new LongWritable.Comparator();
    public Comparator() {
      super(CompositeWritable2.class);
    }
    /*
     * (non-Javadoc)
     * 
     * @see org.apache.hadoop.io.WritableComparator#compare(byte[], int, int, byte[], int, int)
     */
    @Override
    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
      int cmp;
      try {
        // Serialized size of the first text property: vint length prefix plus payload
        int textData11Len = WritableUtils.decodeVIntSize(b1[s1]) + readVInt(b1, s1);
        int textData12Len = WritableUtils.decodeVIntSize(b2[s2]) + readVInt(b2, s2);
        // Compare the first text data as bytes
        cmp = TEXT_COMPARATOR.compare(b1, s1, textData11Len, b2, s2, textData12Len);
        if (cmp != 0) {
          return cmp;
        }
        // The long property starts right after the first text property.
        // Offsets must be relative to s1/s2, because the key may not start at index 0.
        int offset1 = s1 + textData11Len;
        int offset2 = s2 + textData12Len;
        // 8 is hard-coded because a serialized LongWritable always occupies 8 bytes.
        cmp = LONG_COMPARATOR.compare(b1, offset1, 8, b2, offset2, 8);
        if (cmp != 0) {
          return cmp;
        }
        // Move the offsets to the end of the long property
        offset1 += 8;
        offset2 += 8;
        // Serialized size of the second text property
        int textData21Len = WritableUtils.decodeVIntSize(b1[offset1]) + readVInt(b1, offset1);
        int textData22Len = WritableUtils.decodeVIntSize(b2[offset2]) + readVInt(b2, offset2);
        // Compare the second text data as bytes
        return TEXT_COMPARATOR.compare(b1, offset1, textData21Len, b2, offset2, textData22Len);
      } catch (IOException ex) {
        throw new IllegalArgumentException("Failed in CompositeWritable's RawComparator!", ex);
      }
    }
  }
  /**
   * @return the textData1
   */
  public Text getTextData1() {
    return textData1;
  }
  /**
   * @return the longData
   */
  public LongWritable getLongData() {
    return longData;
  }
  /**
   * @return the textData2
   */
  public Text getTextData2() {
    return textData2;
  }
  /**
   * Setter method
   */
  public void set(Text textData1, LongWritable longData, Text textData2) {
    this.textData1 = textData1;
    this.longData = longData;
    this.textData2 = textData2;
  }
  /*
   * (non-Javadoc)
   * 
   * @see org.apache.hadoop.io.Writable#write(java.io.DataOutput)
   */
  @Override
  public void write(DataOutput out) throws IOException {
    textData1.write(out);
    longData.write(out);
    textData2.write(out);
  }
  /*
   * (non-Javadoc)
   * 
   * @see org.apache.hadoop.io.Writable#readFields(java.io.DataInput)
   */
  @Override
  public void readFields(DataInput in) throws IOException {
    textData1.readFields(in);
    longData.readFields(in);
    textData2.readFields(in);
  }
  /*
   * (non-Javadoc)
   * 
   * @see java.lang.Comparable#compareTo(java.lang.Object)
   */
  @Override
  public int compareTo(CompositeWritable2 o) {    
    int cmp = textData1.compareTo(o.getTextData1());
    if (cmp != 0) {
      return cmp;
    }
    cmp = longData.compareTo(o.getLongData());
    if (cmp != 0) {
      return cmp;
    }
    return textData2.compareTo(o.getTextData2());
  }
}
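A raw comparator for a composite key has to walk the fields using offsets relative to the start positions `s1` and `s2`, because the serialized key often sits somewhere in the middle of a larger buffer. The following Hadoop-free sketch illustrates that offset arithmetic. It assumes a simplified layout with a 2-byte length prefix in front of each text field, as a stand-in for the variable-length prefix that `Text` writes; `CompositeOffsetSketch` and its methods are hypothetical names, not Hadoop APIs.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Assumed layout: [2-byte len][text1][8-byte long][2-byte len][text2]
class CompositeOffsetSketch {
    static byte[] serialize(String text1, long value, String text2) {
        byte[] t1 = text1.getBytes(StandardCharsets.UTF_8);
        byte[] t2 = text2.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(2 + t1.length + 8 + 2 + t2.length);
        buf.putShort((short) t1.length).put(t1);
        buf.putLong(value);
        buf.putShort((short) t2.length).put(t2);
        return buf.array();
    }

    // Unsigned lexicographic comparison of two byte ranges.
    static int compareBytes(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(b1[s1 + i] & 0xff, b2[s2 + i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(l1, l2);
    }

    // Read the 2-byte length prefix stored at the given absolute offset.
    static int textLength(byte[] b, int offset) {
        return ByteBuffer.wrap(b).getShort(offset) & 0xffff;
    }

    // Compare two serialized keys that start at s1 and s2 inside their buffers.
    static int compare(byte[] b1, int s1, byte[] b2, int s2) {
        int len1 = textLength(b1, s1);
        int len2 = textLength(b2, s2);
        int cmp = compareBytes(b1, s1 + 2, len1, b2, s2 + 2, len2);
        if (cmp != 0) {
            return cmp;
        }
        // The long field follows the first text field, relative to s1/s2.
        int o1 = s1 + 2 + len1;
        int o2 = s2 + 2 + len2;
        cmp = Long.compare(ByteBuffer.wrap(b1).getLong(o1), ByteBuffer.wrap(b2).getLong(o2));
        if (cmp != 0) {
            return cmp;
        }
        // Skip the 8-byte long to reach the second text field.
        o1 += 8;
        o2 += 8;
        return compareBytes(b1, o1 + 2, textLength(b1, o1), b2, o2 + 2, textLength(b2, o2));
    }
}
```

Copying two serialized keys into one shared array at non-zero positions and calling `compare` with those positions is a quick way to check that no offset silently assumes the key starts at index 0.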
