cascading.tuple
Interface StreamComparator<T extends InputStream>

Type Parameters:
T - the InputStream subtype whose instances are compared

public interface StreamComparator<T extends InputStream>

The StreamComparator interface allows two InputStream instances to be compared, typically bit-wise. This is most useful when custom types stored in Tuple instances must be grouped on and/or secondary sorted (via GroupBy and/or CoGroup), and the underlying serialization implementation supports a meaningful bit-wise comparison without deserializing the custom type into memory.

Typically this interface is used to mark a Comparator as also supporting the comparison of raw streams, in addition to comparing Object instances. Concrete implementations should therefore implement both this interface and the Comparator interface when used as a "grouping" or "sorting" field Comparator.
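The pairing of object and stream comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the StreamComparator interface is redeclared locally so the example compiles without Cascading, LongFieldComparator is a hypothetical name, and the field is assumed to be serialized as 8 big-endian bytes (the format DataInputStream.readLong() expects). The actual byte layout depends on the serialization in use.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.Comparator;

// Local stand-in mirroring the documented interface, so the sketch compiles without Cascading.
interface StreamComparator<T extends InputStream> {
  int compare(T lhsStream, T rhsStream);
}

// Hypothetical comparator: one class orders both Long objects and their serialized form.
class LongFieldComparator implements StreamComparator<InputStream>, Comparator<Long>, Serializable {
  private static final long serialVersionUID = 1L;

  // Object comparison, used when values are already in memory.
  @Override
  public int compare(Long lhs, Long rhs) {
    return Long.compare(lhs, rhs);
  }

  // Stream comparison: consumes exactly 8 bytes from each stream (assumed layout).
  @Override
  public int compare(InputStream lhsStream, InputStream rhsStream) {
    try {
      long lhs = new DataInputStream(lhsStream).readLong();
      long rhs = new DataInputStream(rhsStream).readLong();
      return Long.compare(lhs, rhs);
    } catch (IOException exception) {
      throw new UncheckedIOException(exception);
    }
  }
}

class TandemComparatorDemo {
  public static void main(String[] args) {
    LongFieldComparator comparator = new LongFieldComparator();
    byte[] five = ByteBuffer.allocate(8).putLong(5L).array();
    byte[] nine = ByteBuffer.allocate(8).putLong(9L).array();

    // Both calls should agree on the ordering of 5 and 9.
    System.out.println(comparator.compare(Long.valueOf(5L), Long.valueOf(9L)));
    System.out.println(comparator.compare(new ByteArrayInputStream(five), new ByteArrayInputStream(nine)));
  }
}
```

The two compare methods are overloads distinguished by parameter type, which is why a single class can satisfy both interfaces.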

When used with Hadoop, a cascading.tuple.hadoop.io.BufferedInputStream is passed into the compare(java.io.InputStream, java.io.InputStream) method. This class gives access to the underlying byte[] array, so each individual byte need not be read via InputStream.read(). It is therefore useful to declare an implementation as public class YourCustomComparator implements StreamComparator<BufferedInputStream>, Comparator<YourCustomType>, Serializable

Note that cascading.tuple.hadoop.io.BufferedInputStream#skip(long) must be called with the number of bytes read from the underlying byte buffer before the compare() method returns, so that each stream advances past the bytes consumed directly from the buffer.
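A minimal sketch of this buffer-based pattern follows. It does not use the Cascading classes: RawBufferStream is a hypothetical stand-in for a stream that exposes its backing array and read position, and its accessor names getBuffer() and getPosition() are assumptions that should be checked against the real BufferedInputStream. IdComparator is likewise hypothetical and assumes non-negative int keys serialized as 4 big-endian bytes, for which unsigned byte order matches numeric order.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.Serializable;
import java.util.Comparator;

// Local stand-in mirroring the documented interface, so the sketch compiles without Cascading.
interface StreamComparator<T extends InputStream> {
  int compare(T lhsStream, T rhsStream);
}

// Hypothetical stand-in for a stream exposing its backing array (accessor names are assumptions).
class RawBufferStream extends ByteArrayInputStream {
  RawBufferStream(byte[] bytes) {
    super(bytes);
  }

  byte[] getBuffer() {
    return buf; // protected field inherited from ByteArrayInputStream
  }

  int getPosition() {
    return pos; // index of the next byte to be read
  }
}

// Hypothetical comparator for non-negative int keys stored as 4 big-endian bytes.
class IdComparator implements StreamComparator<RawBufferStream>, Comparator<Integer>, Serializable {
  private static final long serialVersionUID = 1L;
  private static final int WIDTH = 4;

  @Override
  public int compare(Integer lhs, Integer rhs) {
    return Integer.compare(lhs, rhs);
  }

  @Override
  public int compare(RawBufferStream lhsStream, RawBufferStream rhsStream) {
    byte[] lhs = lhsStream.getBuffer();
    byte[] rhs = rhsStream.getBuffer();
    int lhsOffset = lhsStream.getPosition();
    int rhsOffset = rhsStream.getPosition();

    // Compare the raw bytes in place as unsigned values, stopping at the first difference.
    int result = 0;
    for (int i = 0; i < WIDTH && result == 0; i++) {
      result = Integer.compare(lhs[lhsOffset + i] & 0xFF, rhs[rhsOffset + i] & 0xFF);
    }

    // The bytes were read from the arrays, not the streams, so advance both streams
    // by the full key width before returning, even if the loop exited early.
    lhsStream.skip(WIDTH);
    rhsStream.skip(WIDTH);
    return result;
  }
}
```

Skipping the full key width regardless of where the comparison stopped keeps both streams positioned at the start of whatever follows the key.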


Method Summary
 int compare(T lhsStream, T rhsStream)
           
 

Method Detail

compare

int compare(T lhsStream,
            T rhsStream)


Copyright © 2007-2014 Concurrent, Inc. All Rights Reserved.