Hadoop - Implementing a customized RawComparator
I need to improve my MR jobs, and one thing I am thinking of is implementing a customized RawComparator. My key class has several String fields besides the int fields, and I am not sure how to parse the String fields out of the byte[].
My key class:
public class GeneralKey {
    private int day;
    private int hour;
    private String type;
    private String name;
    ..
}
My customized RawComparator:
public class GeneralKeyComparator extends WritableComparator {
    private static final Text.Comparator TEXT_COMPARATOR = new Text.Comparator();

    protected GeneralKeyComparator() {
        super(GeneralKey.class);
    }

    @Override
    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        // day is the first int written, so it starts at offset s
        int day1 = readInt(b1, s1);
        int day2 = readInt(b2, s2);
        int comp = (day1 < day2) ? -1 : (day1 == day2) ? 0 : 1;
        if (0 != comp) {
            return comp;
        }
        // hour follows the 4-byte day
        int hr1 = readInt(b1, s1 + 4);
        int hr2 = readInt(b2, s2 + 4);
        comp = (hr1 < hr2) ? -1 : (hr1 == hr2) ? 0 : 1;
        ....
        // How do I compare the String fields here???
        return comp;
    }
}
Googling around, I found that people have tried:
try {
    int firstL1 = WritableUtils.decodeVIntSize(b1[s1]) + readVInt(b1, s1 + 8);
    int firstL2 = WritableUtils.decodeVIntSize(b2[s2]) + readVInt(b2, s2 + 8);
    comp = TEXT_COMPARATOR.compare(b1, s1, firstL1, b2, s2, firstL2);
} catch (IOException e) {
    throw new IllegalArgumentException(e);
}
But I don't understand how this works, and I don't think it works in my case. Can anyone help? Thanks.
I have added the readFields() and write() methods here:
public void readFields(DataInput input) throws IOException {
    day = input.readInt();
    hour = input.readInt();
    type = input.readUTF();
    name = input.readUTF();
    ...
}

@Override
public void write(DataOutput output) throws IOException {
    output.writeInt(day);
    output.writeInt(hour);
    output.writeUTF(type);
    output.writeUTF(name);
    ...
}
You are right. The example you found will not work for you. The data fields of the key in that example are WritableComparables. You have fundamental types (int, String) instead.
As you are using fundamental types, I assume you have implemented the serialization / deserialization methods for your custom key type.
Since the third and fourth data fields are Java Strings, you should be able to use the compareTo method of the String class once the Strings have been read back from the bytes.
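As a rough illustration of that idea, here is a minimal sketch in plain Java (no Hadoop dependency, so it can be run on its own). It assumes the key is serialized exactly as in the write() method from the question: two ints followed by two writeUTF strings. The class name GeneralKeyRawCompareSketch and the serialize helper are hypothetical and exist only for this example. In a real job the compare body would presumably live in the WritableComparator subclass.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class GeneralKeyRawCompareSketch {

    // Hypothetical helper: writes the fields in the same order as write() in the question.
    static byte[] serialize(int day, int hour, String type, String name) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(day);
            out.writeInt(hour);
            out.writeUTF(type);
            out.writeUTF(name);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Same parameter list as RawComparator.compare: byte array, start offset, length.
    static int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        try {
            DataInputStream in1 = new DataInputStream(new ByteArrayInputStream(b1, s1, l1));
            DataInputStream in2 = new DataInputStream(new ByteArrayInputStream(b2, s2, l2));

            // day, then hour: stop as soon as one field differs
            int comp = Integer.compare(in1.readInt(), in2.readInt());
            if (comp != 0) {
                return comp;
            }
            comp = Integer.compare(in1.readInt(), in2.readInt());
            if (comp != 0) {
                return comp;
            }
            // type, then name: decode with readUTF and fall back on String.compareTo
            comp = in1.readUTF().compareTo(in2.readUTF());
            if (comp != 0) {
                return comp;
            }
            return in1.readUTF().compareTo(in2.readUTF());
        } catch (IOException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        byte[] a = serialize(1, 2, "click", "alpha");
        byte[] b = serialize(1, 2, "click", "beta");
        System.out.println(compare(a, 0, a.length, b, 0, b.length));
    }
}
```

Note that this still allocates String objects for every comparison, so it probably saves less than a comparator that works purely on bytes. It mainly avoids building full key objects.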
The other option is to use WritableComparables instead of fundamental types, and then use the same technique you found in the Google example.
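To show what that technique boils down to, here is a hedged sketch in plain Java. The Google example works because a serialized Text is a length prefix followed by the string bytes, so the comparator can skip the prefix and compare the bytes directly. Strings written with writeUTF also carry a length prefix (a 2-byte unsigned value), so the same idea can be sketched without Hadoop. The class and method names below are hypothetical. Byte-wise ordering should match String.compareTo for strings that contain no U+0000 character, but treat that as an assumption to verify against your own data.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class UtfFieldBytesCompareSketch {

    // Hypothetical helper: encodes one string the way DataOutput.writeUTF does.
    static byte[] encode(String s) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            new DataOutputStream(bytes).writeUTF(s);
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the 2-byte big-endian unsigned length that writeUTF puts before the string bytes.
    static int readUnsignedShort(byte[] b, int off) {
        return ((b[off] & 0xff) << 8) | (b[off + 1] & 0xff);
    }

    // Lexicographic comparison of two byte ranges, treating each byte as unsigned.
    static int compareBytes(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            int x = b1[s1 + i] & 0xff;
            int y = b2[s2 + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return l1 - l2;
    }

    // Compares the writeUTF-encoded fields that start at off1 and off2, without building Strings.
    static int compareUtfField(byte[] b1, int off1, byte[] b2, int off2) {
        int len1 = readUnsignedShort(b1, off1);
        int len2 = readUnsignedShort(b2, off2);
        return compareBytes(b1, off1 + 2, len1, b2, off2 + 2, len2);
    }

    // Total size of the field (length prefix plus payload), used to find the next field.
    static int utfFieldSize(byte[] b, int off) {
        return 2 + readUnsignedShort(b, off);
    }
}
```

With the layout from the question, the type field would start at offset s + 8 (after the two ints), and the name field would start at s + 8 + utfFieldSize(b, s + 8).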