
org.apache.spark

ShuffleDependency

class ShuffleDependency[K, V, C] extends Dependency[Product2[K, V]]

Developer API

Represents a dependency on the output of a shuffle stage. Note that in the case of shuffle, the RDD is transient since we don't need it on the executor side.

Annotations
@DeveloperApi()
Source
Dependency.scala
Linear Supertypes
Dependency[Product2[K, V]], Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new ShuffleDependency(_rdd: RDD[_ <: Product2[K, V]], partitioner: Partitioner, serializer: Serializer = SparkEnv.get.serializer, keyOrdering: Option[Ordering[K]] = None, aggregator: Option[Aggregator[K, V, C]] = None, mapSideCombine: Boolean = false, shuffleWriterProcessor: ShuffleWriteProcessor = new ShuffleWriteProcessor)(implicit arg0: ClassTag[K], arg1: ClassTag[V], arg2: ClassTag[C])

    _rdd

    the parent RDD

    partitioner

    partitioner used to partition the shuffle output

    serializer

    serializer to use. If not set explicitly, the default serializer, as specified by the spark.serializer config option, will be used.

    keyOrdering

    key ordering for the RDD's shuffle

    aggregator

    map/reduce-side aggregator for the RDD's shuffle

    mapSideCombine

    whether to perform partial aggregation (also known as map-side combine)

    shuffleWriterProcessor

    the processor to control the write behavior in ShuffleMapTask
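In practice a ShuffleDependency is rarely constructed by hand; Spark creates one internally whenever a shuffle transformation such as reduceByKey is planned. A minimal sketch of observing this, assuming a local Spark runtime is available:

```scala
import org.apache.spark.{ShuffleDependency, SparkConf, SparkContext}

object ShuffleDepExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[2]").setAppName("shuffle-dep"))

    val pairs   = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))
    val reduced = pairs.reduceByKey(_ + _) // shuffle transformation

    // The shuffled RDD's dependency on its parent is a ShuffleDependency.
    reduced.dependencies.head match {
      case dep: ShuffleDependency[_, _, _] =>
        // reduceByKey performs partial aggregation, so mapSideCombine is set.
        println(s"shuffleId=${dep.shuffleId}, mapSideCombine=${dep.mapSideCombine}")
      case other =>
        println(s"unexpected dependency: $other")
    }
    sc.stop()
  }
}
```

Here reduceByKey is one example; any transformation that repartitions data by key (groupByKey, sortByKey, join, and so on) introduces a ShuffleDependency in the same way.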

Value Members

  1. val aggregator: Option[Aggregator[K, V, C]]
  2. val keyOrdering: Option[Ordering[K]]
  3. val mapSideCombine: Boolean
  4. val partitioner: Partitioner
  5. def rdd: RDD[Product2[K, V]]
    Definition Classes
    ShuffleDependency → Dependency
  6. val serializer: Serializer
  7. val shuffleHandle: ShuffleHandle
  8. val shuffleId: Int
  9. val shuffleWriterProcessor: ShuffleWriteProcessor
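Most of the value members simply echo the constructor arguments; shuffleId and shuffleHandle are assigned when the dependency registers with the shuffle system. A hedged sketch of direct construction with the default arguments (serializer, no key ordering, no aggregator, mapSideCombine = false), again assuming a local Spark runtime:

```scala
import org.apache.spark.{HashPartitioner, ShuffleDependency, SparkConf, SparkContext}

object DirectShuffleDep {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[2]").setAppName("direct-dep"))

    // RDD[(String, Int)] satisfies RDD[_ <: Product2[String, Int]].
    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2)))

    // ClassTags for K, V, C are supplied implicitly for String/Int.
    val dep = new ShuffleDependency[String, Int, Int](
      pairs, new HashPartitioner(4))

    println(dep.shuffleId)                  // unique id, assigned by SparkContext
    println(dep.partitioner.numPartitions)  // 4, from the HashPartitioner above
    println(dep.mapSideCombine)             // false by default
    sc.stop()
  }
}
```

Note that if mapSideCombine is set to true, an aggregator must also be supplied, since partial aggregation needs a combine function to apply on the map side.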