
A

AccumulatorCheckpointingSparkListener() - Constructor for class org.apache.beam.runners.spark.aggregators.AggregatorsAccumulator.AccumulatorCheckpointingSparkListener
 
AccumulatorCheckpointingSparkListener() - Constructor for class org.apache.beam.runners.spark.metrics.MetricsAccumulator.AccumulatorCheckpointingSparkListener
 
action() - Method in class org.apache.beam.runners.spark.translation.BoundedDataset
 
action() - Method in interface org.apache.beam.runners.spark.translation.Dataset
 
action() - Method in class org.apache.beam.runners.spark.translation.streaming.UnboundedDataset
 
add(int, GlobalWatermarkHolder.SparkWatermarks) - Static method in class org.apache.beam.runners.spark.util.GlobalWatermarkHolder
 
addAccumulator(NamedAggregators, NamedAggregators) - Method in class org.apache.beam.runners.spark.aggregators.AggAccumParam
 
addAll(Map<Integer, Queue<GlobalWatermarkHolder.SparkWatermarks>>) - Static method in class org.apache.beam.runners.spark.util.GlobalWatermarkHolder
 
addInPlace(NamedAggregators, NamedAggregators) - Method in class org.apache.beam.runners.spark.aggregators.AggAccumParam
 
advance() - Method in class org.apache.beam.runners.spark.io.MicrobatchSource.Reader
 
advance(JavaSparkContext) - Static method in class org.apache.beam.runners.spark.util.GlobalWatermarkHolder
Advances the watermarks to the next-in-line watermarks.
advanceNextBatchWatermarkToInfinity() - Method in class org.apache.beam.runners.spark.io.CreateStream
Advances the watermark in the next batch to the end-of-time.
advanceWatermark() - Method in class org.apache.beam.runners.spark.stateful.SparkTimerInternals
Advances the watermark.
advanceWatermarkForNextBatch(Instant) - Method in class org.apache.beam.runners.spark.io.CreateStream
Advances the watermark in the next batch.
AggAccumParam - Class in org.apache.beam.runners.spark.aggregators
Aggregator accumulator param.
AggAccumParam() - Constructor for class org.apache.beam.runners.spark.aggregators.AggAccumParam
 
AggregatorMetric - Class in org.apache.beam.runners.spark.metrics
An adapter between the NamedAggregators and Codahale's Metric interface.
AggregatorMetricSource - Class in org.apache.beam.runners.spark.metrics
A Spark Source that is tailored to expose an AggregatorMetric, wrapping an underlying NamedAggregators instance.
AggregatorMetricSource(String, NamedAggregators) - Constructor for class org.apache.beam.runners.spark.metrics.AggregatorMetricSource
 
AggregatorsAccumulator - Class in org.apache.beam.runners.spark.aggregators
For resilience, Accumulators are required to be wrapped in a Singleton.
AggregatorsAccumulator() - Constructor for class org.apache.beam.runners.spark.aggregators.AggregatorsAccumulator
 
AggregatorsAccumulator.AccumulatorCheckpointingSparkListener - Class in org.apache.beam.runners.spark.aggregators
Spark Listener which checkpoints NamedAggregators values for fault-tolerance.
apply(KV<String, Long>) - Method in class org.apache.beam.runners.spark.examples.WordCount.FormatAsTextFn
 
apply(WindowedValue<KV<K, Iterable<InputT>>>) - Method in class org.apache.beam.runners.spark.translation.SparkKeyedCombineFn
Applies the combine function directly to a key's grouped values, post-grouping.
awaitTermination(Duration) - Method in class org.apache.beam.runners.spark.SparkPipelineResult
 
awaitTermination(Duration) - Method in class org.apache.beam.runners.spark.SparkRunnerDebugger.DebugSparkPipelineResult
 

B

BeamSparkRunnerRegistrator - Class in org.apache.beam.runners.spark.coders
Custom KryoRegistrators for the needs of Beam's Spark runner.
BeamSparkRunnerRegistrator() - Constructor for class org.apache.beam.runners.spark.coders.BeamSparkRunnerRegistrator
 
borrowDataset(PTransform<? extends PValue, ?>) - Method in class org.apache.beam.runners.spark.translation.EvaluationContext
 
borrowDataset(PValue) - Method in class org.apache.beam.runners.spark.translation.EvaluationContext
 
Bounded(SparkContext, BoundedSource<T>, SparkRuntimeContext, String) - Constructor for class org.apache.beam.runners.spark.io.SourceRDD.Bounded
 
BoundedDataset<T> - Class in org.apache.beam.runners.spark.translation
Holds an RDD or values for deferred conversion to an RDD if needed.
broadcast(JavaSparkContext) - Method in class org.apache.beam.runners.spark.util.SideInputBroadcast
 
ByteArray - Class in org.apache.beam.runners.spark.util
Serializable byte array.
ByteArray(byte[]) - Constructor for class org.apache.beam.runners.spark.util.ByteArray
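Because Java arrays compare by reference, a raw byte[] cannot serve as a map or shuffle key; a wrapper with value semantics is needed. A minimal sketch in the same spirit (the class and method bodies here are illustrative, not the Beam implementation):

```java
import java.io.Serializable;
import java.util.Arrays;

// Sketch of a serializable byte-array wrapper with value equality and a
// lexicographic ordering, analogous in spirit to
// org.apache.beam.runners.spark.util.ByteArray.
class ByteArraySketch implements Serializable, Comparable<ByteArraySketch> {
    private final byte[] value;

    ByteArraySketch(byte[] value) {
        this.value = value;
    }

    byte[] getValue() {
        return value;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ByteArraySketch
            && Arrays.equals(value, ((ByteArraySketch) o).value);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(value);
    }

    @Override
    public int compareTo(ByteArraySketch other) {
        // Unsigned byte-by-byte comparison; shorter prefix sorts first.
        int n = Math.min(value.length, other.value.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(value[i] & 0xff, other.value[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(value.length, other.value.length);
    }
}
```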
 

C

cache(String) - Method in class org.apache.beam.runners.spark.translation.BoundedDataset
 
cache(String) - Method in interface org.apache.beam.runners.spark.translation.Dataset
 
cache() - Method in class org.apache.beam.runners.spark.translation.streaming.UnboundedDataset
 
cache(String) - Method in class org.apache.beam.runners.spark.translation.streaming.UnboundedDataset
 
call(Iterator<WindowedValue<InputT>>) - Method in class org.apache.beam.runners.spark.translation.MultiDoFnFunction
 
call(WindowedValue<KV<K, V>>) - Method in class org.apache.beam.runners.spark.translation.ReifyTimestampsAndWindowsFunction
 
call(WindowedValue<T>) - Method in class org.apache.beam.runners.spark.translation.SparkAssignWindowFn
 
call(WindowedValue<KV<K, Iterable<WindowedValue<InputT>>>>) - Method in class org.apache.beam.runners.spark.translation.SparkGroupAlsoByWindowViaOutputBufferFn
 
call() - Method in class org.apache.beam.runners.spark.translation.streaming.SparkRunnerStreamingContextFactory
 
call(WindowedValue<KV<K, Iterable<InputT>>>) - Method in class org.apache.beam.runners.spark.translation.TranslationUtils.CombineGroupedValues
 
call(Tuple2<TupleTag<V>, WindowedValue<?>>) - Method in class org.apache.beam.runners.spark.translation.TranslationUtils.TupleTagFilter
 
cancel() - Method in class org.apache.beam.runners.spark.SparkPipelineResult
 
Checkpoint - Class in org.apache.beam.runners.spark.translation.streaming
Checkpoint data to make it available in future pipeline runs.
Checkpoint() - Constructor for class org.apache.beam.runners.spark.translation.streaming.Checkpoint
 
Checkpoint.CheckpointDir - Class in org.apache.beam.runners.spark.translation.streaming
Checkpoint dir tree.
CheckpointDir(String) - Constructor for class org.apache.beam.runners.spark.translation.streaming.Checkpoint.CheckpointDir
 
clear() - Static method in class org.apache.beam.runners.spark.aggregators.AggregatorsAccumulator
 
clear() - Static method in class org.apache.beam.runners.spark.metrics.MetricsAccumulator
 
clear() - Static method in class org.apache.beam.runners.spark.util.GlobalWatermarkHolder
 
clearCache() - Static method in class org.apache.beam.runners.spark.io.MicrobatchSource
 
close() - Method in class org.apache.beam.runners.spark.io.MicrobatchSource.Reader
 
CoderHelpers - Class in org.apache.beam.runners.spark.coders
Serialization utility class.
CombineFunctionState(Combine.CombineFn<InputT, InterT, OutputT>, Coder<InputT>, SparkRuntimeContext) - Constructor for class org.apache.beam.runners.spark.aggregators.NamedAggregators.CombineFunctionState
 
combineGlobally(JavaRDD<WindowedValue<InputT>>, SparkGlobalCombineFn<InputT, AccumT, ?>, Coder<InputT>, Coder<AccumT>, WindowingStrategy<?, ?>) - Static method in class org.apache.beam.runners.spark.translation.GroupCombineFunctions
Apply a composite Combine.Globally transformation.
CombineGroupedValues(SparkKeyedCombineFn<K, InputT, ?, OutputT>) - Constructor for class org.apache.beam.runners.spark.translation.TranslationUtils.CombineGroupedValues
 
combinePerKey(JavaRDD<WindowedValue<KV<K, InputT>>>, SparkKeyedCombineFn<K, InputT, AccumT, ?>, Coder<K>, Coder<InputT>, Coder<AccumT>, WindowingStrategy<?, ?>) - Static method in class org.apache.beam.runners.spark.translation.GroupCombineFunctions
Apply a composite Combine.PerKey transformation.
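Per-key combining reduces all values sharing a key with a combine function. Stripped of windowing, coders, and Spark's shuffle machinery, the core operation can be sketched as (names are illustrative):

```java
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;

class CombinePerKeySketch {
    // Reduce all values sharing a key with a single combine function --
    // the essence of a Combine.PerKey, minus windowing and coders.
    static <K, V> Map<K, V> combinePerKey(
            List<Map.Entry<K, V>> input, BinaryOperator<V> combine) {
        return input.stream()
            .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue, combine));
    }
}
```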
compareTo(ByteArray) - Method in class org.apache.beam.runners.spark.util.ByteArray
 
CompositeSource - Class in org.apache.beam.runners.spark.metrics
Composite source made up of several MetricRegistry instances.
CompositeSource(String, MetricRegistry...) - Constructor for class org.apache.beam.runners.spark.metrics.CompositeSource
 
compute(Partition, TaskContext) - Method in class org.apache.beam.runners.spark.io.SourceRDD.Bounded
 
compute(Partition, TaskContext) - Method in class org.apache.beam.runners.spark.io.SourceRDD.Unbounded
 
computeOutputs() - Method in class org.apache.beam.runners.spark.translation.EvaluationContext
Computes the outputs for all RDDs that are leaves in the DAG and do not have any actions (like saving to a file) registered on them.
ConsoleIO - Class in org.apache.beam.runners.spark.io
Prints to the console.
ConsoleIO.Write - Class in org.apache.beam.runners.spark.io
Writes to the console.
ConsoleIO.Write.Unbound&lt;T&gt; - Class in org.apache.beam.runners.spark.io
A PTransform that writes a PCollection to the console.
contains(PCollectionView<T>) - Method in class org.apache.beam.runners.spark.util.SparkSideInputReader
 
CountWords() - Constructor for class org.apache.beam.runners.spark.examples.WordCount.CountWords
 
create(PipelineOptions) - Method in class org.apache.beam.runners.spark.SparkContextOptions.EmptyListenersList
 
create(PipelineOptions) - Method in class org.apache.beam.runners.spark.SparkPipelineOptions.TmpCheckpointDirFactory
 
create() - Static method in class org.apache.beam.runners.spark.SparkRunner
Creates and returns a new SparkRunner with default options.
create(SparkPipelineOptions) - Static method in class org.apache.beam.runners.spark.SparkRunner
Creates and returns a new SparkRunner with specified options.
create(PipelineOptions) - Method in class org.apache.beam.runners.spark.TestSparkPipelineOptions.DefaultStopPipelineWatermarkFactory
 
create(byte[], Coder<T>) - Static method in class org.apache.beam.runners.spark.util.SideInputBroadcast
 
CreateStream<T> - Class in org.apache.beam.runners.spark.io
Creates an input stream from a Queue.
CsvSink - Class in org.apache.beam.runners.spark.metrics.sink
A Spark Sink that is tailored to report AggregatorMetric metrics to a CSV file.
CsvSink(Properties, MetricRegistry, SecurityManager) - Constructor for class org.apache.beam.runners.spark.metrics.sink.CsvSink
 
ctxt - Variable in class org.apache.beam.runners.spark.SparkRunner.Evaluator
 
ctxtForInput(WindowedValue<?>) - Method in class org.apache.beam.runners.spark.translation.SparkAbstractCombineFn
 
current() - Method in class org.apache.beam.runners.spark.aggregators.NamedAggregators.CombineFunctionState
 
current() - Method in interface org.apache.beam.runners.spark.aggregators.NamedAggregators.State
 
currentInputWatermarkTime() - Method in class org.apache.beam.runners.spark.stateful.SparkTimerInternals
 
currentOutputWatermarkTime() - Method in class org.apache.beam.runners.spark.stateful.SparkTimerInternals
 
currentProcessingTime() - Method in class org.apache.beam.runners.spark.stateful.SparkTimerInternals
 
currentSynchronizedProcessingTime() - Method in class org.apache.beam.runners.spark.stateful.SparkTimerInternals
 

D

Dataset - Interface in org.apache.beam.runners.spark.translation
Holder for Spark RDD/DStream.
DefaultStopPipelineWatermarkFactory() - Constructor for class org.apache.beam.runners.spark.TestSparkPipelineOptions.DefaultStopPipelineWatermarkFactory
 
deleteTimer(StateNamespace, String, TimeDomain) - Method in class org.apache.beam.runners.spark.stateful.SparkTimerInternals
 
deleteTimer(TimerInternals.TimerData) - Method in class org.apache.beam.runners.spark.stateful.SparkTimerInternals
 
deleteTimer(StateNamespace, String) - Method in class org.apache.beam.runners.spark.stateful.SparkTimerInternals
 
deserializeTimers(Collection<byte[]>, TimerInternals.TimerDataCoder) - Static method in class org.apache.beam.runners.spark.stateful.SparkTimerInternals
 
dStreamValues(JavaPairDStream<T1, T2>) - Static method in class org.apache.beam.runners.spark.translation.TranslationUtils
Transform a pair stream into a value stream.
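Dropping the keys of a pair collection to keep only its values is the same reshaping dStreamValues applies to a JavaPairDStream; a plain-Java sketch without Spark (names are illustrative):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PairValuesSketch {
    // Keep only the values of a pair collection, discarding the keys --
    // the pair-stream-to-value-stream reshaping, minus the DStream.
    static <K, V> List<V> values(List<Map.Entry<K, V>> pairs) {
        return pairs.stream().map(Map.Entry::getValue).collect(Collectors.toList());
    }
}
```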

E

emptyBatch() - Method in class org.apache.beam.runners.spark.io.CreateStream
Adds an empty batch.
EmptyCheckpointMark - Class in org.apache.beam.runners.spark.io
Passing null values to Spark's Java API may cause problems because of Guava preconditions.
EmptyListenersList() - Constructor for class org.apache.beam.runners.spark.SparkContextOptions.EmptyListenersList
 
emptyVoidFunction() - Static method in class org.apache.beam.runners.spark.translation.TranslationUtils
 
enterCompositeTransform(TransformHierarchy.Node) - Method in class org.apache.beam.runners.spark.SparkNativePipelineVisitor
 
enterCompositeTransform(TransformHierarchy.Node) - Method in class org.apache.beam.runners.spark.SparkRunner.Evaluator
 
equals(Object) - Method in class org.apache.beam.runners.spark.io.EmptyCheckpointMark
 
equals(Object) - Method in class org.apache.beam.runners.spark.io.MicrobatchSource
 
equals(Object) - Method in class org.apache.beam.runners.spark.util.ByteArray
 
evaluate(TransformT, EvaluationContext) - Method in interface org.apache.beam.runners.spark.translation.TransformEvaluator
 
EvaluationContext - Class in org.apache.beam.runners.spark.translation
The EvaluationContext allows us to define pipeline instructions and translate between PObject<T>s or PCollection<T>s and Ts or DStreams/RDDs of Ts.
EvaluationContext(JavaSparkContext, Pipeline, PipelineOptions) - Constructor for class org.apache.beam.runners.spark.translation.EvaluationContext
 
EvaluationContext(JavaSparkContext, Pipeline, PipelineOptions, JavaStreamingContext) - Constructor for class org.apache.beam.runners.spark.translation.EvaluationContext
 
Evaluator(SparkPipelineTranslator, EvaluationContext) - Constructor for class org.apache.beam.runners.spark.SparkRunner.Evaluator
 
expand(PCollection<String>) - Method in class org.apache.beam.runners.spark.examples.WordCount.CountWords
 
expand(PCollection<T>) - Method in class org.apache.beam.runners.spark.io.ConsoleIO.Write.Unbound
 
expand(PBegin) - Method in class org.apache.beam.runners.spark.io.CreateStream
 
expand(PCollection<?>) - Method in class org.apache.beam.runners.spark.translation.StorageLevelPTransform
 
expand(PInput) - Method in class org.apache.beam.runners.spark.util.SinglePrimitiveOutputPTransform
 
ExtractWordsFn() - Constructor for class org.apache.beam.runners.spark.examples.WordCount.ExtractWordsFn
 

F

finalizeCheckpoint() - Method in class org.apache.beam.runners.spark.io.EmptyCheckpointMark
 
FormatAsTextFn() - Constructor for class org.apache.beam.runners.spark.examples.WordCount.FormatAsTextFn
 
forRegistry(MetricRegistry) - Static method in class org.apache.beam.runners.spark.metrics.WithMetricsSupport
 
forStreamFromSources(List<Integer>, Broadcast<Map<Integer, GlobalWatermarkHolder.SparkWatermarks>>) - Static method in class org.apache.beam.runners.spark.stateful.SparkTimerInternals
Build the TimerInternals according to the feeding streams.
fromByteArray(byte[], Coder<T>) - Static method in class org.apache.beam.runners.spark.coders.CoderHelpers
Utility method for deserializing a byte array using the specified coder.
fromByteArrays(Collection<byte[]>, Coder<T>) - Static method in class org.apache.beam.runners.spark.coders.CoderHelpers
Utility method for deserializing an Iterable of byte arrays using the specified coder.
fromByteFunction(Coder<T>) - Static method in class org.apache.beam.runners.spark.coders.CoderHelpers
A function wrapper for converting a byte array to an object.
fromByteFunction(Coder<K>, Coder<V>) - Static method in class org.apache.beam.runners.spark.coders.CoderHelpers
A function wrapper for converting a byte array pair to a key-value pair.
fromByteFunctionIterable(Coder<K>, Coder<V>) - Static method in class org.apache.beam.runners.spark.coders.CoderHelpers
A function wrapper for converting a byte array pair to a key-value pair, where values are Iterable.
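These helpers round-trip values through byte arrays via the coder you pass in. A sketch of the same pattern, substituting plain Java serialization for a Beam Coder (the helper names mirror the entries above but the bodies are illustrative stand-ins):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class CoderHelpersSketch {
    // Serialize a value to bytes -- stand-in for encoding with a Beam Coder.
    static byte[] toByteArray(Serializable value) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    // Deserialize bytes back to a value -- stand-in for decoding with a coder.
    @SuppressWarnings("unchecked")
    static <T> T fromByteArray(byte[] bytes) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (T) ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```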
fromOptions(PipelineOptions) - Static method in class org.apache.beam.runners.spark.SparkRunner
Creates and returns a new SparkRunner with specified options.
fromOptions(PipelineOptions) - Static method in class org.apache.beam.runners.spark.SparkRunnerDebugger
 
fromOptions(PipelineOptions) - Static method in class org.apache.beam.runners.spark.TestSparkRunner
 
functionToFlatMapFunction(Function<InputT, OutputT>) - Static method in class org.apache.beam.runners.spark.translation.TranslationUtils
A utility method that adapts Function to a FlatMapFunction with an Iterator input.
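Lifting an element-wise Function to an iterator-to-iterator function mirrors what this utility does for Spark's FlatMapFunction; a Spark-free sketch (names are illustrative):

```java
import java.util.Iterator;
import java.util.function.Function;

class FlatMapAdapterSketch {
    // Lift a per-element function to an iterator-to-iterator function,
    // mirroring the adaptation of Function to a FlatMapFunction.
    static <I, O> Function<Iterator<I>, Iterator<O>> lift(Function<I, O> fn) {
        return input -> new Iterator<O>() {
            @Override public boolean hasNext() { return input.hasNext(); }
            @Override public O next() { return fn.apply(input.next()); }
        };
    }
}
```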

G

get() - Static method in class org.apache.beam.runners.spark.io.EmptyCheckpointMark
 
get(PValue) - Method in class org.apache.beam.runners.spark.translation.EvaluationContext
Retrieve an object of Type T associated with the PValue passed in.
get() - Static method in class org.apache.beam.runners.spark.util.GlobalWatermarkHolder
Returns the Broadcast containing the GlobalWatermarkHolder.SparkWatermarks mapped to their sources.
get(PCollectionView<T>, BoundedWindow) - Method in class org.apache.beam.runners.spark.util.SparkSideInputReader
 
getBatches() - Method in class org.apache.beam.runners.spark.io.CreateStream
Get the underlying queue representing the mock stream of micro-batches.
getBatchIntervalMillis() - Method in interface org.apache.beam.runners.spark.SparkPipelineOptions
 
getBeamCheckpointDir() - Method in class org.apache.beam.runners.spark.translation.streaming.Checkpoint.CheckpointDir
 
getCacheCandidates() - Method in class org.apache.beam.runners.spark.translation.EvaluationContext
Get the map of cache candidates held by the evaluation context.
getCheckpointDir() - Method in interface org.apache.beam.runners.spark.SparkPipelineOptions
 
getCheckpointDurationMillis() - Method in interface org.apache.beam.runners.spark.SparkPipelineOptions
 
getCheckpointMark() - Method in class org.apache.beam.runners.spark.io.MicrobatchSource.Reader
 
getCheckpointMarkCoder() - Method in class org.apache.beam.runners.spark.io.MicrobatchSource
 
getCoderRegistry() - Method in class org.apache.beam.runners.spark.translation.SparkRuntimeContext
 
getCombineFn() - Method in class org.apache.beam.runners.spark.aggregators.NamedAggregators.CombineFunctionState
 
getCombineFn() - Method in interface org.apache.beam.runners.spark.aggregators.NamedAggregators.State
 
getCounters(MetricFilter) - Method in class org.apache.beam.runners.spark.metrics.WithMetricsSupport
 
getCurrent() - Method in class org.apache.beam.runners.spark.io.MicrobatchSource.Reader
 
getCurrentSource() - Method in class org.apache.beam.runners.spark.io.MicrobatchSource.Reader
 
getCurrentTimestamp() - Method in class org.apache.beam.runners.spark.io.MicrobatchSource.Reader
 
getCurrentTransform() - Method in class org.apache.beam.runners.spark.translation.EvaluationContext
 
getDefaultOutputCoder() - Method in class org.apache.beam.runners.spark.io.CreateStream
 
getDefaultOutputCoder() - Method in class org.apache.beam.runners.spark.io.MicrobatchSource
 
getDefaultOutputCoder() - Method in class org.apache.beam.runners.spark.translation.StorageLevelPTransform
 
getEnableSparkMetricSinks() - Method in interface org.apache.beam.runners.spark.SparkPipelineOptions
 
getExpectedAssertions() - Method in interface org.apache.beam.runners.spark.TestSparkPipelineOptions
 
getGauges(MetricFilter) - Method in class org.apache.beam.runners.spark.metrics.WithMetricsSupport
 
getHighWatermark() - Method in class org.apache.beam.runners.spark.util.GlobalWatermarkHolder.SparkWatermarks
 
getHistograms(MetricFilter) - Method in class org.apache.beam.runners.spark.metrics.WithMetricsSupport
 
getId() - Method in class org.apache.beam.runners.spark.io.MicrobatchSource
 
getInput(PTransform<T, ?>) - Method in class org.apache.beam.runners.spark.translation.EvaluationContext
 
getInputFile() - Method in interface org.apache.beam.runners.spark.examples.WordCount.WordCountOptions
 
getInputs(PTransform<?, ?>) - Method in class org.apache.beam.runners.spark.translation.EvaluationContext
 
getInstance() - Static method in class org.apache.beam.runners.spark.aggregators.AggregatorsAccumulator
 
getInstance() - Static method in class org.apache.beam.runners.spark.metrics.MetricsAccumulator
 
getListeners() - Method in interface org.apache.beam.runners.spark.SparkContextOptions
 
getLowWatermark() - Method in class org.apache.beam.runners.spark.util.GlobalWatermarkHolder.SparkWatermarks
 
getMaxRecordsPerBatch() - Method in interface org.apache.beam.runners.spark.SparkPipelineOptions
 
getMeters(MetricFilter) - Method in class org.apache.beam.runners.spark.metrics.WithMetricsSupport
 
getMinReadTimeMillis() - Method in interface org.apache.beam.runners.spark.SparkPipelineOptions
 
getNum() - Method in class org.apache.beam.runners.spark.io.ConsoleIO.Write.Unbound
 
getOptions() - Method in class org.apache.beam.runners.spark.translation.EvaluationContext
 
getOrCreateReader(PipelineOptions, CheckpointMarkT) - Method in class org.apache.beam.runners.spark.io.MicrobatchSource
 
getOutput() - Method in interface org.apache.beam.runners.spark.examples.WordCount.WordCountOptions
 
getOutput(PTransform<?, T>) - Method in class org.apache.beam.runners.spark.translation.EvaluationContext
 
getOutputs(PTransform<?, ?>) - Method in class org.apache.beam.runners.spark.translation.EvaluationContext
 
getPartitions() - Method in class org.apache.beam.runners.spark.io.SourceRDD.Bounded
 
getPartitions() - Method in class org.apache.beam.runners.spark.io.SourceRDD.Unbounded
 
getPipeline() - Method in class org.apache.beam.runners.spark.translation.EvaluationContext
 
getPipelineOptions() - Method in class org.apache.beam.runners.spark.SparkRunnerRegistrar.Options
 
getPipelineOptions() - Method in class org.apache.beam.runners.spark.translation.SparkRuntimeContext
 
getPipelineRunners() - Method in class org.apache.beam.runners.spark.SparkRunnerRegistrar.Runner
 
getProvidedSparkContext() - Method in interface org.apache.beam.runners.spark.SparkContextOptions
 
getPViews() - Method in class org.apache.beam.runners.spark.translation.EvaluationContext
Returns the current views created in the pipeline.
getRDD() - Method in class org.apache.beam.runners.spark.translation.BoundedDataset
 
getReadDurationMillis() - Method in class org.apache.beam.runners.spark.io.SparkUnboundedSource.Metadata
 
getReadTimePercentage() - Method in interface org.apache.beam.runners.spark.SparkPipelineOptions
 
getRootCheckpointDir() - Method in class org.apache.beam.runners.spark.translation.streaming.Checkpoint.CheckpointDir
 
getRuntimeContext() - Method in class org.apache.beam.runners.spark.translation.EvaluationContext
 
getSideInputs(List<PCollectionView<?>>, JavaSparkContext, SparkPCollectionView) - Static method in class org.apache.beam.runners.spark.translation.TranslationUtils
Create SideInputs as Broadcast variables.
getSparkCheckpointDir() - Method in class org.apache.beam.runners.spark.translation.streaming.Checkpoint.CheckpointDir
 
getSparkContext() - Method in class org.apache.beam.runners.spark.translation.EvaluationContext
 
getSparkContext(SparkPipelineOptions) - Static method in class org.apache.beam.runners.spark.translation.SparkContextFactory
 
getSparkMaster() - Method in interface org.apache.beam.runners.spark.SparkPipelineOptions
 
getState() - Method in class org.apache.beam.runners.spark.SparkPipelineResult
 
getStopPipelineWatermark() - Method in interface org.apache.beam.runners.spark.TestSparkPipelineOptions
 
getStorageLevel() - Method in interface org.apache.beam.runners.spark.SparkPipelineOptions
 
getStreamingContext() - Method in class org.apache.beam.runners.spark.translation.EvaluationContext
 
getSynchronizedProcessingTime() - Method in class org.apache.beam.runners.spark.util.GlobalWatermarkHolder.SparkWatermarks
 
getTimers(MetricFilter) - Method in class org.apache.beam.runners.spark.metrics.WithMetricsSupport
 
getTimes() - Method in class org.apache.beam.runners.spark.io.CreateStream
Get times so they can be pushed into the GlobalWatermarkHolder.
getUsesProvidedSparkContext() - Method in interface org.apache.beam.runners.spark.SparkPipelineOptions
 
getValue(String, Class<T>) - Method in class org.apache.beam.runners.spark.aggregators.NamedAggregators
 
getValue() - Method in class org.apache.beam.runners.spark.util.ByteArray
 
getValue() - Method in class org.apache.beam.runners.spark.util.SideInputBroadcast
 
getWatermark() - Method in class org.apache.beam.runners.spark.io.MicrobatchSource.Reader
 
global(Broadcast<Map<Integer, GlobalWatermarkHolder.SparkWatermarks>>) - Static method in class org.apache.beam.runners.spark.stateful.SparkTimerInternals
Build a global TimerInternals for all feeding streams.
GlobalWatermarkHolder - Class in org.apache.beam.runners.spark.util
A Broadcast variable to hold the global watermarks for a micro-batch.
GlobalWatermarkHolder() - Constructor for class org.apache.beam.runners.spark.util.GlobalWatermarkHolder
 
GlobalWatermarkHolder.SparkWatermarks - Class in org.apache.beam.runners.spark.util
A GlobalWatermarkHolder.SparkWatermarks holds the watermarks and batch time relevant to a micro-batch input from a specific source.
GlobalWatermarkHolder.WatermarksListener - Class in org.apache.beam.runners.spark.util
Advances the watermarks on the onBatchCompleted event.
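The holder's add/advance pattern (queue the next watermark per source, then advance every source one step per batch) can be sketched without Spark's broadcast machinery; this simplified version uses java.time.Instant and illustrative names:

```java
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Simplified per-source watermark holder: watermarks are queued per source
// id and advanced one step per micro-batch, in the spirit of
// GlobalWatermarkHolder (no broadcast, no high/low split).
class WatermarkHolderSketch {
    private final Map<Integer, Queue<Instant>> pending = new HashMap<>();
    private final Map<Integer, Instant> current = new HashMap<>();

    // Queue the next watermark for a source.
    void add(int sourceId, Instant watermark) {
        pending.computeIfAbsent(sourceId, k -> new ArrayDeque<>()).add(watermark);
    }

    // Move every source to its next-in-line watermark, if one is queued.
    void advance() {
        for (Map.Entry<Integer, Queue<Instant>> e : pending.entrySet()) {
            Instant next = e.getValue().poll();
            if (next != null) {
                current.put(e.getKey(), next);
            }
        }
    }

    Instant get(int sourceId) {
        return current.get(sourceId);
    }
}
```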
GraphiteSink - Class in org.apache.beam.runners.spark.metrics.sink
A Spark Sink that is tailored to report AggregatorMetric metrics to Graphite.
GraphiteSink(Properties, MetricRegistry, SecurityManager) - Constructor for class org.apache.beam.runners.spark.metrics.sink.GraphiteSink
 
groupAlsoByWindow(JavaDStream<WindowedValue<KV<K, Iterable<WindowedValue<InputT>>>>>, Coder<K>, Coder<WindowedValue<InputT>>, WindowingStrategy<?, W>, SparkRuntimeContext, List<Integer>) - Static method in class org.apache.beam.runners.spark.stateful.SparkGroupAlsoByWindowViaWindowSet
 
groupByKeyOnly(JavaRDD<WindowedValue<KV<K, V>>>, Coder<K>, WindowedValue.WindowedValueCoder<V>) - Static method in class org.apache.beam.runners.spark.translation.GroupCombineFunctions
An implementation of GroupByKeyViaGroupByKeyOnly.GroupByKeyOnly for the Spark runner.
GroupCombineFunctions - Class in org.apache.beam.runners.spark.translation
A set of group/combine functions to apply to Spark RDDs.
GroupCombineFunctions() - Constructor for class org.apache.beam.runners.spark.translation.GroupCombineFunctions
 

H

hashCode() - Method in class org.apache.beam.runners.spark.io.EmptyCheckpointMark
 
hashCode() - Method in class org.apache.beam.runners.spark.io.MicrobatchSource
 
hashCode() - Method in class org.apache.beam.runners.spark.util.ByteArray
 
hasTranslation(Class<? extends PTransform<?, ?>>) - Method in interface org.apache.beam.runners.spark.translation.SparkPipelineTranslator
 
hasTranslation(Class<? extends PTransform<?, ?>>) - Method in class org.apache.beam.runners.spark.translation.streaming.StreamingTransformTranslator.Translator
 
hasTranslation(Class<? extends PTransform<?, ?>>) - Method in class org.apache.beam.runners.spark.translation.TransformTranslator.Translator
 

I

init(SparkPipelineOptions, JavaSparkContext) - Static method in class org.apache.beam.runners.spark.aggregators.AggregatorsAccumulator
Initializes the aggregators accumulator if it has not already been initialized.
init(SparkPipelineOptions, JavaSparkContext) - Static method in class org.apache.beam.runners.spark.metrics.MetricsAccumulator
Initializes the metrics accumulator if it has not already been initialized.
initAccumulators(SparkPipelineOptions, JavaSparkContext) - Static method in class org.apache.beam.runners.spark.SparkRunner
Init Metrics/Aggregators accumulators.
initialSystemTimeAt(Instant) - Method in class org.apache.beam.runners.spark.io.CreateStream
Set the initial synchronized processing time.
isBoundedCollection(Collection<PValue>) - Method in class org.apache.beam.runners.spark.SparkRunner.Evaluator
 
isEmpty() - Method in class org.apache.beam.runners.spark.util.SparkSideInputReader
 
isForceStreaming() - Method in interface org.apache.beam.runners.spark.TestSparkPipelineOptions
 
isIntersecting(IntervalWindow, IntervalWindow) - Static method in class org.apache.beam.runners.spark.translation.SparkAbstractCombineFn
 

J

javaSparkContext - Variable in class org.apache.beam.runners.spark.SparkPipelineResult
 

M

main(String[]) - Static method in class org.apache.beam.runners.spark.examples.WordCount
 
mapSourceFunction(SparkRuntimeContext, String) - Static method in class org.apache.beam.runners.spark.stateful.StateSpecFunctions
A StateSpec function to support reading from an UnboundedSource.
merge(NamedAggregators.State<InputT, InterT, OutputT>) - Method in class org.apache.beam.runners.spark.aggregators.NamedAggregators.CombineFunctionState
 
merge(NamedAggregators) - Method in class org.apache.beam.runners.spark.aggregators.NamedAggregators
Merges another NamedAggregators instance with this instance.
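Merging two sets of named aggregators amounts to merging maps keyed by aggregator name, combining states present on both sides. A simplified sketch with states reduced to plain values (the names and the value-based state are illustrative):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

class NamedMergeSketch {
    // Merge two maps of named aggregator states: states present on both
    // sides are combined; the rest are copied through unchanged.
    static <V> Map<String, V> merge(
            Map<String, V> a, Map<String, V> b, BinaryOperator<V> combine) {
        Map<String, V> out = new HashMap<>(a);
        b.forEach((name, state) -> out.merge(name, state, combine));
        return out;
    }
}
```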
merge(NamedAggregators.State<InputT, InterT, OutputT>) - Method in interface org.apache.beam.runners.spark.aggregators.NamedAggregators.State
 
merge(IntervalWindow, IntervalWindow) - Static method in class org.apache.beam.runners.spark.translation.SparkAbstractCombineFn
 
Metadata(long, Instant, Instant, long, MetricsContainerStepMap) - Constructor for class org.apache.beam.runners.spark.io.SparkUnboundedSource.Metadata
 
metricRegistry() - Method in class org.apache.beam.runners.spark.metrics.AggregatorMetricSource
 
metricRegistry() - Method in class org.apache.beam.runners.spark.metrics.CompositeSource
 
metricRegistry() - Method in class org.apache.beam.runners.spark.metrics.SparkBeamMetricSource
 
metrics() - Method in class org.apache.beam.runners.spark.SparkPipelineResult
 
MetricsAccumulator - Class in org.apache.beam.runners.spark.metrics
For resilience, Accumulators are required to be wrapped in a Singleton.
MetricsAccumulator() - Constructor for class org.apache.beam.runners.spark.metrics.MetricsAccumulator
 
MetricsAccumulator.AccumulatorCheckpointingSparkListener - Class in org.apache.beam.runners.spark.metrics
Spark Listener which checkpoints MetricsContainerStepMap values for fault-tolerance.
MicrobatchSource<T,CheckpointMarkT extends UnboundedSource.CheckpointMark> - Class in org.apache.beam.runners.spark.io
A Source that accommodates Spark's micro-batch oriented nature and wraps an UnboundedSource.
MicrobatchSource.Reader - Class in org.apache.beam.runners.spark.io
Mostly based on BoundedReadFromUnboundedSource's UnboundedToBoundedSourceAdapter, with some adjustments for Spark specifics.
MultiDoFnFunction<InputT,OutputT> - Class in org.apache.beam.runners.spark.translation
DoFunctions ignore outputs that are not the main output.
MultiDoFnFunction(Accumulator<NamedAggregators>, Accumulator<MetricsContainerStepMap>, String, DoFn<InputT, OutputT>, SparkRuntimeContext, TupleTag<OutputT>, List<TupleTag<?>>, Map<TupleTag<?>, KV<WindowingStrategy<?, ?>, SideInputBroadcast<?>>>, WindowingStrategy<?, ?>, boolean) - Constructor for class org.apache.beam.runners.spark.translation.MultiDoFnFunction
 

N

NamedAggregators - Class in org.apache.beam.runners.spark.aggregators
This class wraps a map of named aggregators.
NamedAggregators() - Constructor for class org.apache.beam.runners.spark.aggregators.NamedAggregators
Constructs a new NamedAggregators instance.
NamedAggregators(String, NamedAggregators.State<?, ?, ?>) - Constructor for class org.apache.beam.runners.spark.aggregators.NamedAggregators
Constructs a new named aggregators instance that contains a mapping from the specified name to the associated initial state.
NamedAggregators.CombineFunctionState<InputT,InterT,OutputT> - Class in org.apache.beam.runners.spark.aggregators
 
NamedAggregators.State<InputT,InterT,OutputT> - Interface in org.apache.beam.runners.spark.aggregators
 
nextBatch(TimestampedValue<T>...) - Method in class org.apache.beam.runners.spark.io.CreateStream
Enqueue next micro-batch elements.
nextBatch(T...) - Method in class org.apache.beam.runners.spark.io.CreateStream
For non-timestamped elements.

O

of(Coder<T>, Duration) - Static method in class org.apache.beam.runners.spark.io.CreateStream
Set the batch interval for the stream.
of(NamedAggregators) - Static method in class org.apache.beam.runners.spark.metrics.AggregatorMetric
 
onBatchCompleted(JavaStreamingListenerBatchCompleted) - Method in class org.apache.beam.runners.spark.aggregators.AggregatorsAccumulator.AccumulatorCheckpointingSparkListener
 
onBatchCompleted(JavaStreamingListenerBatchCompleted) - Method in class org.apache.beam.runners.spark.metrics.MetricsAccumulator.AccumulatorCheckpointingSparkListener
 
onBatchCompleted(JavaStreamingListenerBatchCompleted) - Method in class org.apache.beam.runners.spark.util.GlobalWatermarkHolder.WatermarksListener
 
Options() - Constructor for class org.apache.beam.runners.spark.SparkRunnerRegistrar.Options
 
org.apache.beam.runners.spark - package org.apache.beam.runners.spark
Internal implementation of the Beam runner for Apache Spark.
org.apache.beam.runners.spark.aggregators - package org.apache.beam.runners.spark.aggregators
Provides internal utilities for implementing Beam aggregators using Spark accumulators.
org.apache.beam.runners.spark.aggregators.metrics - package org.apache.beam.runners.spark.aggregators.metrics
Defines classes for integrating with Spark's metrics mechanism (Sinks, Sources, etc.).
org.apache.beam.runners.spark.coders - package org.apache.beam.runners.spark.coders
Beam coders and coder-related utilities for running on Apache Spark.
org.apache.beam.runners.spark.examples - package org.apache.beam.runners.spark.examples
 
org.apache.beam.runners.spark.io - package org.apache.beam.runners.spark.io
Spark-specific transforms for I/O.
org.apache.beam.runners.spark.metrics - package org.apache.beam.runners.spark.metrics
Provides internal utilities for implementing Beam metrics using Spark accumulators.
org.apache.beam.runners.spark.metrics.sink - package org.apache.beam.runners.spark.metrics.sink
Spark sinks that support Beam metrics and aggregators.
org.apache.beam.runners.spark.stateful - package org.apache.beam.runners.spark.stateful
Spark-specific stateful operators.
org.apache.beam.runners.spark.translation - package org.apache.beam.runners.spark.translation
Internal translators for running Beam pipelines on Spark.
org.apache.beam.runners.spark.translation.streaming - package org.apache.beam.runners.spark.translation.streaming
Internal utilities to translate Beam pipelines to Spark streaming.
org.apache.beam.runners.spark.util - package org.apache.beam.runners.spark.util
Internal utilities to translate Beam pipelines to Spark.
out() - Static method in class org.apache.beam.runners.spark.io.ConsoleIO.Write
 
out(int) - Static method in class org.apache.beam.runners.spark.io.ConsoleIO.Write
 

P

pairFunctionToPairFlatMapFunction(PairFunction<T, K, V>) - Static method in class org.apache.beam.runners.spark.translation.TranslationUtils
A utility method that adapts PairFunction to a PairFlatMapFunction with an Iterator input.
partitioner() - Method in class org.apache.beam.runners.spark.io.SourceRDD.Unbounded
 
pipelineExecution - Variable in class org.apache.beam.runners.spark.SparkPipelineResult
 
processElement(DoFn<String, String>.ProcessContext) - Method in class org.apache.beam.runners.spark.examples.WordCount.ExtractWordsFn
 
putDataset(PTransform<?, ? extends PValue>, Dataset) - Method in class org.apache.beam.runners.spark.translation.EvaluationContext
 
putDataset(PValue, Dataset) - Method in class org.apache.beam.runners.spark.translation.EvaluationContext
 
putPView(PCollectionView<?>, Iterable<WindowedValue<?>>, Coder<Iterable<WindowedValue<?>>>) - Method in class org.apache.beam.runners.spark.translation.EvaluationContext
Adds/replaces a view to the current views created in the pipeline.

R

read(JavaStreamingContext, SparkRuntimeContext, UnboundedSource<T, CheckpointMarkT>, String) - Static method in class org.apache.beam.runners.spark.io.SparkUnboundedSource
 
read(FileSystem, Path) - Static method in class org.apache.beam.runners.spark.translation.streaming.Checkpoint
 
readObject(FileSystem, Path) - Static method in class org.apache.beam.runners.spark.translation.streaming.Checkpoint
 
registerClasses(Kryo) - Method in class org.apache.beam.runners.spark.coders.BeamSparkRunnerRegistrator
 
ReifyTimestampsAndWindowsFunction<K,V> - Class in org.apache.beam.runners.spark.translation
A simple Function that brings the windowing information into the value, out of the PCollection's implicit background representation.
ReifyTimestampsAndWindowsFunction() - Constructor for class org.apache.beam.runners.spark.translation.ReifyTimestampsAndWindowsFunction
 
rejectSplittable(DoFn<?, ?>) - Static method in class org.apache.beam.runners.spark.translation.TranslationUtils
 
rejectStateAndTimers(DoFn<?, ?>) - Static method in class org.apache.beam.runners.spark.translation.TranslationUtils
Reject state and timers DoFn.
render() - Method in class org.apache.beam.runners.spark.aggregators.NamedAggregators.CombineFunctionState
 
render() - Method in interface org.apache.beam.runners.spark.aggregators.NamedAggregators.State
 
renderAll() - Method in class org.apache.beam.runners.spark.aggregators.NamedAggregators
 
reshuffle(JavaRDD<WindowedValue<KV<K, V>>>, Coder<K>, WindowedValue.WindowedValueCoder<V>) - Static method in class org.apache.beam.runners.spark.translation.GroupCombineFunctions
An implementation of Reshuffle for the Spark runner.
run(Pipeline) - Method in class org.apache.beam.runners.spark.SparkRunner
 
run(Pipeline) - Method in class org.apache.beam.runners.spark.SparkRunnerDebugger
 
run(Pipeline) - Method in class org.apache.beam.runners.spark.TestSparkRunner
 
Runner() - Constructor for class org.apache.beam.runners.spark.SparkRunnerRegistrar.Runner
 
runtimeContext - Variable in class org.apache.beam.runners.spark.translation.SparkAbstractCombineFn
 

S

serializeTimers(Collection<TimerInternals.TimerData>, TimerInternals.TimerDataCoder) - Static method in class org.apache.beam.runners.spark.stateful.SparkTimerInternals
 
setBatchIntervalMillis(Long) - Method in interface org.apache.beam.runners.spark.SparkPipelineOptions
 
setCheckpointDir(String) - Method in interface org.apache.beam.runners.spark.SparkPipelineOptions
 
setCheckpointDurationMillis(Long) - Method in interface org.apache.beam.runners.spark.SparkPipelineOptions
 
setCurrentTransform(AppliedPTransform<?, ?, ?>) - Method in class org.apache.beam.runners.spark.translation.EvaluationContext
 
setEnableSparkMetricSinks(Boolean) - Method in interface org.apache.beam.runners.spark.SparkPipelineOptions
 
setExpectedAssertions(Integer) - Method in interface org.apache.beam.runners.spark.TestSparkPipelineOptions
 
setForceStreaming(boolean) - Method in interface org.apache.beam.runners.spark.TestSparkPipelineOptions
 
setInputFile(String) - Method in interface org.apache.beam.runners.spark.examples.WordCount.WordCountOptions
 
setListeners(List<JavaStreamingListener>) - Method in interface org.apache.beam.runners.spark.SparkContextOptions
 
setMaxRecordsPerBatch(Long) - Method in interface org.apache.beam.runners.spark.SparkPipelineOptions
 
setMinReadTimeMillis(Long) - Method in interface org.apache.beam.runners.spark.SparkPipelineOptions
 
setName(String) - Method in class org.apache.beam.runners.spark.translation.BoundedDataset
 
setName(String) - Method in interface org.apache.beam.runners.spark.translation.Dataset
 
setName(String) - Method in class org.apache.beam.runners.spark.translation.streaming.UnboundedDataset
 
setOutput(String) - Method in interface org.apache.beam.runners.spark.examples.WordCount.WordCountOptions
 
setProvidedSparkContext(JavaSparkContext) - Method in interface org.apache.beam.runners.spark.SparkContextOptions
 
setReadTimePercentage(Double) - Method in interface org.apache.beam.runners.spark.SparkPipelineOptions
 
setSparkMaster(String) - Method in interface org.apache.beam.runners.spark.SparkPipelineOptions
 
setStopPipelineWatermark(Long) - Method in interface org.apache.beam.runners.spark.TestSparkPipelineOptions
 
setStorageLevel(String) - Method in interface org.apache.beam.runners.spark.SparkPipelineOptions
 
setTimer(TimerInternals.TimerData) - Method in class org.apache.beam.runners.spark.stateful.SparkTimerInternals
 
setTimer(StateNamespace, String, Instant, TimeDomain) - Method in class org.apache.beam.runners.spark.stateful.SparkTimerInternals
 
setUsesProvidedSparkContext(boolean) - Method in interface org.apache.beam.runners.spark.SparkPipelineOptions
 
shouldDefer(TransformHierarchy.Node) - Method in class org.apache.beam.runners.spark.SparkRunner.Evaluator
 
SideInputBroadcast<T> - Class in org.apache.beam.runners.spark.util
Broadcast helper for side inputs.
sideInputs - Variable in class org.apache.beam.runners.spark.translation.SparkAbstractCombineFn
 
SinglePrimitiveOutputPTransform<T> - Class in org.apache.beam.runners.spark.util
A PTransform wrapping another transform.
SinglePrimitiveOutputPTransform(PTransform<PInput, PCollection<T>>) - Constructor for class org.apache.beam.runners.spark.util.SinglePrimitiveOutputPTransform
 
skipAssignWindows(Window.Assign<T>, EvaluationContext) - Static method in class org.apache.beam.runners.spark.translation.TranslationUtils
Checks if the window transformation should be applied or skipped.
sortByWindows(Iterable<WindowedValue<T>>) - Static method in class org.apache.beam.runners.spark.translation.SparkAbstractCombineFn
 
sourceName() - Method in class org.apache.beam.runners.spark.metrics.AggregatorMetricSource
 
sourceName() - Method in class org.apache.beam.runners.spark.metrics.CompositeSource
 
sourceName() - Method in class org.apache.beam.runners.spark.metrics.SparkBeamMetricSource
 
SourceRDD - Class in org.apache.beam.runners.spark.io
Classes implementing Beam Source RDDs.
SourceRDD() - Constructor for class org.apache.beam.runners.spark.io.SourceRDD
 
SourceRDD.Bounded<T> - Class in org.apache.beam.runners.spark.io
A SourceRDD.Bounded reads input from a BoundedSource and creates a Spark RDD.
SourceRDD.Unbounded<T,CheckpointMarkT extends UnboundedSource.CheckpointMark> - Class in org.apache.beam.runners.spark.io
A SourceRDD.Unbounded is the implementation of a micro-batch in a SourceDStream.
SparkAbstractCombineFn - Class in org.apache.beam.runners.spark.translation
An abstract base class for the SparkRunner implementation of Combine.CombineFn.
SparkAbstractCombineFn(SparkRuntimeContext, Map<TupleTag<?>, KV<WindowingStrategy<?, ?>, SideInputBroadcast<?>>>, WindowingStrategy<?, ?>) - Constructor for class org.apache.beam.runners.spark.translation.SparkAbstractCombineFn
 
SparkAssignWindowFn<T,W extends BoundedWindow> - Class in org.apache.beam.runners.spark.translation
An implementation of Window.Assign for the Spark runner.
SparkAssignWindowFn(WindowFn<? super T, W>) - Constructor for class org.apache.beam.runners.spark.translation.SparkAssignWindowFn
 
SparkBeamMetricSource - Class in org.apache.beam.runners.spark.metrics
A Spark Source that is tailored to expose a SparkBeamMetric, wrapping an underlying MetricResults instance.
SparkBeamMetricSource(String) - Constructor for class org.apache.beam.runners.spark.metrics.SparkBeamMetricSource
 
SparkContextFactory - Class in org.apache.beam.runners.spark.translation
The Spark context factory.
SparkContextOptions - Interface in org.apache.beam.runners.spark
A custom PipelineOptions to work with properties related to JavaSparkContext.
SparkContextOptions.EmptyListenersList - Class in org.apache.beam.runners.spark
Returns an empty list, to avoid handling null.
SparkGlobalCombineFn<InputT,AccumT,OutputT> - Class in org.apache.beam.runners.spark.translation
A CombineFnBase.GlobalCombineFn with a CombineWithContext.Context for the SparkRunner.
SparkGlobalCombineFn(CombineWithContext.CombineFnWithContext<InputT, AccumT, OutputT>, SparkRuntimeContext, Map<TupleTag<?>, KV<WindowingStrategy<?, ?>, SideInputBroadcast<?>>>, WindowingStrategy<?, ?>) - Constructor for class org.apache.beam.runners.spark.translation.SparkGlobalCombineFn
 
SparkGroupAlsoByWindowViaOutputBufferFn<K,InputT,W extends BoundedWindow> - Class in org.apache.beam.runners.spark.translation
An implementation of GroupByKeyViaGroupByKeyOnly.GroupAlsoByWindow for the Spark runner.
SparkGroupAlsoByWindowViaOutputBufferFn(WindowingStrategy<?, W>, StateInternalsFactory<K>, SystemReduceFn<K, InputT, Iterable<InputT>, Iterable<InputT>, W>, SparkRuntimeContext, Accumulator<NamedAggregators>) - Constructor for class org.apache.beam.runners.spark.translation.SparkGroupAlsoByWindowViaOutputBufferFn
 
SparkGroupAlsoByWindowViaWindowSet - Class in org.apache.beam.runners.spark.stateful
An implementation of GroupByKeyViaGroupByKeyOnly.GroupAlsoByWindow logic for grouping by windows and controlling trigger firings and pane accumulation.
SparkGroupAlsoByWindowViaWindowSet() - Constructor for class org.apache.beam.runners.spark.stateful.SparkGroupAlsoByWindowViaWindowSet
 
SparkKeyedCombineFn<K,InputT,AccumT,OutputT> - Class in org.apache.beam.runners.spark.translation
SparkKeyedCombineFn(CombineWithContext.CombineFnWithContext<InputT, AccumT, OutputT>, SparkRuntimeContext, Map<TupleTag<?>, KV<WindowingStrategy<?, ?>, SideInputBroadcast<?>>>, WindowingStrategy<?, ?>) - Constructor for class org.apache.beam.runners.spark.translation.SparkKeyedCombineFn
 
SparkNativePipelineVisitor - Class in org.apache.beam.runners.spark
Pipeline visitor for translating a Beam pipeline into equivalent Spark operations.
SparkPCollectionView - Class in org.apache.beam.runners.spark.translation
SparkPCollectionView is used to pass serialized views to lambdas.
SparkPCollectionView() - Constructor for class org.apache.beam.runners.spark.translation.SparkPCollectionView
 
SparkPipelineOptions - Interface in org.apache.beam.runners.spark
Spark runner PipelineOptions handles Spark execution-related configurations, such as the master address, batch-interval, and other user-related knobs.
SparkPipelineOptions.TmpCheckpointDirFactory - Class in org.apache.beam.runners.spark
Returns the default checkpoint directory of /tmp/${job.name}.
SparkPipelineResult - Class in org.apache.beam.runners.spark
Represents a Spark pipeline execution result.
SparkPipelineTranslator - Interface in org.apache.beam.runners.spark.translation
Translator to support translation between Beam transformations and Spark transformations.
SparkRunner - Class in org.apache.beam.runners.spark
The SparkRunner translates operations defined on a pipeline to a representation executable by Spark, and then submits the job to Spark for execution.
SparkRunner.Evaluator - Class in org.apache.beam.runners.spark
Evaluator on the pipeline.
SparkRunnerDebugger - Class in org.apache.beam.runners.spark
Pipeline runner which translates a Beam pipeline into equivalent Spark operations, without running them.
SparkRunnerDebugger.DebugSparkPipelineResult - Class in org.apache.beam.runners.spark
PipelineResult of running a Pipeline using SparkRunnerDebugger. Use SparkRunnerDebugger.DebugSparkPipelineResult.getDebugString() to get a String representation of the Pipeline translated into Spark native operations.
SparkRunnerRegistrar - Class in org.apache.beam.runners.spark
SparkRunnerRegistrar.Options - Class in org.apache.beam.runners.spark
Registers the SparkPipelineOptions.
SparkRunnerRegistrar.Runner - Class in org.apache.beam.runners.spark
Registers the SparkRunner.
SparkRunnerStreamingContextFactory - Class in org.apache.beam.runners.spark.translation.streaming
A JavaStreamingContext factory for resilience.
SparkRunnerStreamingContextFactory(Pipeline, SparkPipelineOptions, Checkpoint.CheckpointDir) - Constructor for class org.apache.beam.runners.spark.translation.streaming.SparkRunnerStreamingContextFactory
 
SparkRuntimeContext - Class in org.apache.beam.runners.spark.translation
The SparkRuntimeContext allows us to define useful features on the client side before our data flow program is launched.
SparkSideInputReader - Class in org.apache.beam.runners.spark.util
A SideInputReader for the SparkRunner.
SparkSideInputReader(Map<TupleTag<?>, KV<WindowingStrategy<?, ?>, SideInputBroadcast<?>>>) - Constructor for class org.apache.beam.runners.spark.util.SparkSideInputReader
 
SparkTimerInternals - Class in org.apache.beam.runners.spark.stateful
An implementation of TimerInternals for the SparkRunner.
SparkUnboundedSource - Class in org.apache.beam.runners.spark.io
A "composite" InputDStream implementation for UnboundedSources.
SparkUnboundedSource() - Constructor for class org.apache.beam.runners.spark.io.SparkUnboundedSource
 
SparkUnboundedSource.Metadata - Class in org.apache.beam.runners.spark.io
A metadata holder for an input stream partition.
SparkWatermarks(Instant, Instant, Instant) - Constructor for class org.apache.beam.runners.spark.util.GlobalWatermarkHolder.SparkWatermarks
 
start() - Method in class org.apache.beam.runners.spark.io.MicrobatchSource.Reader
 
state - Variable in class org.apache.beam.runners.spark.SparkPipelineResult
 
StateSpecFunctions - Class in org.apache.beam.runners.spark.stateful
A class containing StateSpec mappingFunctions.
StateSpecFunctions() - Constructor for class org.apache.beam.runners.spark.stateful.StateSpecFunctions
 
stop() - Method in class org.apache.beam.runners.spark.SparkPipelineResult
 
stop() - Method in class org.apache.beam.runners.spark.SparkRunnerDebugger.DebugSparkPipelineResult
 
stopSparkContext(JavaSparkContext) - Static method in class org.apache.beam.runners.spark.translation.SparkContextFactory
 
StorageLevelPTransform - Class in org.apache.beam.runners.spark.translation
Gets the RDD storage level for the input PCollection (mostly used for testing purposes).
StorageLevelPTransform() - Constructor for class org.apache.beam.runners.spark.translation.StorageLevelPTransform
 
StreamingTransformTranslator - Class in org.apache.beam.runners.spark.translation.streaming
Supports translation between a Beam transform and Spark's operations on DStreams.
StreamingTransformTranslator.Translator - Class in org.apache.beam.runners.spark.translation.streaming
Translator matches Beam transformation with the appropriate evaluator.

T

TEST_REUSE_SPARK_CONTEXT - Static variable in class org.apache.beam.runners.spark.translation.SparkContextFactory
If the property beam.spark.test.reuseSparkContext is set to true, the Spark context will be reused for Beam pipelines.
TestSparkPipelineOptions - Interface in org.apache.beam.runners.spark
TestSparkPipelineOptions.DefaultStopPipelineWatermarkFactory - Class in org.apache.beam.runners.spark
A factory to provide the default watermark to stop a pipeline that reads from an unbounded source.
TestSparkRunner - Class in org.apache.beam.runners.spark
The SparkRunner translates operations defined on a pipeline to a representation executable by Spark, and then submits the job to Spark for execution.
TmpCheckpointDirFactory() - Constructor for class org.apache.beam.runners.spark.SparkPipelineOptions.TmpCheckpointDirFactory
 
toByteArray(T, Coder<T>) - Static method in class org.apache.beam.runners.spark.coders.CoderHelpers
Utility method for serializing an object using the specified coder.
toByteArrays(Iterable<T>, Coder<T>) - Static method in class org.apache.beam.runners.spark.coders.CoderHelpers
Utility method for serializing an Iterable of values using the specified coder.
toByteFunction(Coder<T>) - Static method in class org.apache.beam.runners.spark.coders.CoderHelpers
A function wrapper for converting an object to a bytearray.
toByteFunction(Coder<K>, Coder<V>) - Static method in class org.apache.beam.runners.spark.coders.CoderHelpers
A function wrapper for converting a key-value pair to a byte array pair.
toNativeString() - Method in interface org.apache.beam.runners.spark.translation.TransformEvaluator
 
toPairByKeyInWindowedValue() - Static method in class org.apache.beam.runners.spark.translation.TranslationUtils
Extract key from a WindowedValue KV into a pair.
toPairFlatMapFunction() - Static method in class org.apache.beam.runners.spark.translation.TranslationUtils
KV to pair flatmap function.
toPairFunction() - Static method in class org.apache.beam.runners.spark.translation.TranslationUtils
KV to pair function.
toString() - Method in class org.apache.beam.runners.spark.aggregators.NamedAggregators
 
toString() - Method in class org.apache.beam.runners.spark.util.GlobalWatermarkHolder.SparkWatermarks
 
TransformEvaluator<TransformT extends PTransform<?,?>> - Interface in org.apache.beam.runners.spark.translation
Describe a PTransform evaluator.
TransformTranslator - Class in org.apache.beam.runners.spark.translation
Supports translation between a Beam transform and Spark's operations on RDDs.
TransformTranslator.Translator - Class in org.apache.beam.runners.spark.translation
Translator matches Beam transformation with the appropriate evaluator.
translate(TransformHierarchy.Node, TransformT, Class<TransformT>) - Method in class org.apache.beam.runners.spark.SparkRunner.Evaluator
Determine if this Node belongs to a Bounded branch of the pipeline, or Unbounded, and translate with the proper translator.
translateBounded(Class<TransformT>) - Method in interface org.apache.beam.runners.spark.translation.SparkPipelineTranslator
 
translateBounded(Class<TransformT>) - Method in class org.apache.beam.runners.spark.translation.streaming.StreamingTransformTranslator.Translator
 
translateBounded(Class<TransformT>) - Method in class org.apache.beam.runners.spark.translation.TransformTranslator.Translator
 
translateUnbounded(Class<TransformT>) - Method in interface org.apache.beam.runners.spark.translation.SparkPipelineTranslator
 
translateUnbounded(Class<TransformT>) - Method in class org.apache.beam.runners.spark.translation.streaming.StreamingTransformTranslator.Translator
 
translateUnbounded(Class<TransformT>) - Method in class org.apache.beam.runners.spark.translation.TransformTranslator.Translator
 
TranslationUtils - Class in org.apache.beam.runners.spark.translation
A set of utilities to help translating Beam transformations into Spark transformations.
TranslationUtils.CombineGroupedValues<K,InputT,OutputT> - Class in org.apache.beam.runners.spark.translation
A SparkKeyedCombineFn function applied to grouped KVs.
TranslationUtils.TupleTagFilter<V> - Class in org.apache.beam.runners.spark.translation
A utility class to filter TupleTags.
translator - Variable in class org.apache.beam.runners.spark.SparkRunner.Evaluator
 
Translator(SparkPipelineTranslator) - Constructor for class org.apache.beam.runners.spark.translation.streaming.StreamingTransformTranslator.Translator
 
Translator() - Constructor for class org.apache.beam.runners.spark.translation.TransformTranslator.Translator
 
TupleTagFilter(TupleTag<V>) - Constructor for class org.apache.beam.runners.spark.translation.TranslationUtils.TupleTagFilter
 

U

Unbounded(SparkContext, SparkRuntimeContext, MicrobatchSource<T, CheckpointMarkT>, int) - Constructor for class org.apache.beam.runners.spark.io.SourceRDD.Unbounded
 
UnboundedDataset<T> - Class in org.apache.beam.runners.spark.translation.streaming
DStream holder; can also create a DStream from a supplied queue of values, but mainly for testing.
UnboundedDataset(JavaDStream<WindowedValue<T>>, List<Integer>) - Constructor for class org.apache.beam.runners.spark.translation.streaming.UnboundedDataset
 
unpersist() - Method in class org.apache.beam.runners.spark.util.SideInputBroadcast
 
unwindowFunction() - Static method in class org.apache.beam.runners.spark.translation.WindowingHelpers
A Spark function for extracting the value from a WindowedValue.
unwindowValueFunction() - Static method in class org.apache.beam.runners.spark.translation.WindowingHelpers
Same as unwindowFunction but for non-RDD values - not an RDD transformation!
update(InputT) - Method in class org.apache.beam.runners.spark.aggregators.NamedAggregators.CombineFunctionState
 
update(InputT) - Method in interface org.apache.beam.runners.spark.aggregators.NamedAggregators.State
 
updateCacheCandidates(Pipeline, SparkPipelineTranslator, EvaluationContext) - Static method in class org.apache.beam.runners.spark.SparkRunner
Evaluator that updates/populates the cache candidates.

V

validate() - Method in class org.apache.beam.runners.spark.io.MicrobatchSource
 
visitPrimitiveTransform(TransformHierarchy.Node) - Method in class org.apache.beam.runners.spark.SparkRunner.Evaluator
 

W

waitUntilFinish() - Method in class org.apache.beam.runners.spark.SparkPipelineResult
 
waitUntilFinish(Duration) - Method in class org.apache.beam.runners.spark.SparkPipelineResult
 
WatermarksListener(JavaStreamingContext) - Constructor for class org.apache.beam.runners.spark.util.GlobalWatermarkHolder.WatermarksListener
 
windowFunction() - Static method in class org.apache.beam.runners.spark.translation.WindowingHelpers
A Spark function for converting a value to a WindowedValue.
WindowingHelpers - Class in org.apache.beam.runners.spark.translation
Helper functions for working with windows.
windowingStrategy - Variable in class org.apache.beam.runners.spark.translation.SparkAbstractCombineFn
 
windowValueFunction() - Static method in class org.apache.beam.runners.spark.translation.WindowingHelpers
Same as windowFunction but for non-RDD values - not an RDD transformation!
WithMetricsSupport - Class in org.apache.beam.runners.spark.metrics
A decorator-like MetricRegistry that supports AggregatorMetric and SparkBeamMetric as Gauges.
WordCount - Class in org.apache.beam.runners.spark.examples
Duplicated from beam-examples-java to avoid dependency.
WordCount() - Constructor for class org.apache.beam.runners.spark.examples.WordCount
 
WordCount.CountWords - Class in org.apache.beam.runners.spark.examples
A PTransform that converts a PCollection containing lines of text into a PCollection of formatted word counts.
WordCount.ExtractWordsFn - Class in org.apache.beam.runners.spark.examples
Concept #2: You can make your pipeline code less verbose by defining your DoFns statically out-of-line.
WordCount.FormatAsTextFn - Class in org.apache.beam.runners.spark.examples
A SimpleFunction that converts a Word and Count into a printable string.
WordCount.WordCountOptions - Interface in org.apache.beam.runners.spark.examples
Options supported by WordCount.
write(FileSystem, Path, byte[]) - Static method in class org.apache.beam.runners.spark.translation.streaming.Checkpoint
 
writeObject(FileSystem, Path, Object) - Static method in class org.apache.beam.runners.spark.translation.streaming.Checkpoint
 

Z

zero(NamedAggregators) - Method in class org.apache.beam.runners.spark.aggregators.AggAccumParam
 

Copyright © 2016–2017 The Apache Software Foundation. All rights reserved.