/*
 * Copyright (c) 2016-2017 Chris K Wensel <chris@wensel.net>. All Rights Reserved.
 * Copyright (c) 2007-2017 Xplenty, Inc. All Rights Reserved.
 *
 * Project and contact information: http://www.cascading.org/
 *
 * This file is part of the Cascading project.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package cascading.tuple.collect;

import java.util.Collection;
import java.util.Map;

import cascading.provider.CascadingFactory;
import cascading.tuple.Tuple;

/**
 * Interface TupleMapFactory allows developers to plug in alternative implementations of a "tuple map"
 * used to back in-memory "join" and "co-group" operations. Typically these implementations are
 * "spillable": to avoid exhausting JVM memory, values are persisted to disk once some threshold is met
 * or an event is triggered.
 * <p>
 * The {@link Map} classes returned must take a {@link cascading.tuple.Tuple} as a key, and a {@link Collection} of Tuples as
 * a value. Further, {@link Map#get(Object)} must never return {@code null}; on the first call to get() for a given key,
 * an empty Collection must be created and stored.
 * <p>
 * That is, {@link Map#put(Object, Object)} is never called on the map instance internally,
 * only {@code map.get(groupTuple).add(valuesTuple)}.
 * <p>
 * Using the {@link TupleCollectionFactory} to create the underlying Tuple Collections would allow that aspect
 * to be pluggable as well.
 * <p>
 * If the Map implementation implements the {@link Spillable} interface, it will receive a {@link Spillable.SpillListener}
 * instance that calls back to the appropriate logging mechanism for the platform. This instance should be passed
 * down to any child Spillable types, namely an implementation of {@link SpillableTupleList}.
 * <p>
 * The default implementation for the Hadoop platform is the {@link cascading.tuple.hadoop.collect.HadoopTupleMapFactory},
 * which creates a {@link cascading.tuple.hadoop.collect.HadoopSpillableTupleMap} instance.
 * <p>
 * The class {@link SpillableTupleMap} may be used as a base class.
 *
 * @see SpillableTupleMap
 * @see cascading.tuple.hadoop.collect.HadoopTupleMapFactory
 * @see TupleCollectionFactory
 * @see cascading.tuple.hadoop.collect.HadoopTupleCollectionFactory
 */
public interface TupleMapFactory<Config> extends CascadingFactory<Config, Map<Tuple, Collection<Tuple>>>
  {
  String TUPLE_MAP_FACTORY = "cascading.factory.tuple.map.classname";
  }
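
The Javadoc for TupleMapFactory requires that get() never return null and that callers only ever use map.get(groupTuple).add(valuesTuple). A minimal sketch of a map that honors that contract might look like the following. This is an illustration under assumptions, not Cascading's implementation: the class name LazyCollectionMap is hypothetical, the type parameter T stands in for cascading.tuple.Tuple so the example compiles without Cascading on the classpath, and nothing here spills to disk.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;

// Hypothetical illustration only; not part of Cascading and not spillable.
// T stands in for cascading.tuple.Tuple.
class LazyCollectionMap<T> extends HashMap<T, Collection<T>>
  {
  @Override
  @SuppressWarnings("unchecked")
  public Collection<T> get( Object key )
    {
    Collection<T> values = super.get( key );

    // First access for this key: create and store an empty collection,
    // so callers never see null and never need to call put() themselves.
    if( values == null )
      {
      values = new ArrayList<>();
      super.put( (T) key, values );
      }

    return values;
    }

  public static void main( String[] args )
    {
    LazyCollectionMap<String> map = new LazyCollectionMap<>();

    // The only access pattern the contract relies on: get(...).add(...)
    map.get( "group" ).add( "first value" );
    map.get( "group" ).add( "second value" );

    System.out.println( map.get( "group" ).size() );
    }
  }
```

A real implementation would be returned from a TupleMapFactory and would typically delegate collection creation to a TupleCollectionFactory rather than instantiating ArrayList directly, as the Javadoc suggests.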