pyspark.Broadcast

class pyspark.Broadcast(sc=None, value=None, pickle_registry=None, path=None, sock_file=None)

A broadcast variable created with SparkContext.broadcast(). Access its value through the value attribute.

Examples:

>>> from pyspark.context import SparkContext
>>> sc = SparkContext('local', 'test')
>>> b = sc.broadcast([1, 2, 3, 4, 5])
>>> b.value
[1, 2, 3, 4, 5]
>>> sc.parallelize([0, 0]).flatMap(lambda x: b.value).collect()
[1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
>>> b.unpersist()
>>> large_broadcast = sc.broadcast(range(10000))

__init__(sc=None, value=None, pickle_registry=None, path=None, sock_file=None)

Should not be called directly by users – use SparkContext.broadcast() instead.
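
As a minimal sketch of the recommended path, the snippet below creates the broadcast through SparkContext.broadcast() and reads it inside tasks through value. The helper names label_codes and label_codes_with_spark, the "unknown" fallback, and the sample data are assumptions made for illustration and are not part of the PySpark API. The Spark helper expects an already running SparkContext passed in as sc.

```python
def label_codes(codes, lookup):
    """Plain-Python reference: map each code to its name, 'unknown' if absent."""
    return [lookup.get(code, "unknown") for code in codes]


def label_codes_with_spark(sc, codes, lookup):
    """Same mapping on an RDD, with the dict shipped as a broadcast variable.

    `sc` is an existing SparkContext (hypothetical helper, for illustration).
    """
    # Create the broadcast via the SparkContext, not via Broadcast.__init__.
    b = sc.broadcast(lookup)
    try:
        # Tasks read the dict through b.value instead of capturing `lookup`.
        return (
            sc.parallelize(codes)
            .map(lambda code: b.value.get(code, "unknown"))
            .collect()
        )
    finally:
        # Drop the cached copies of the broadcast on the executors.
        b.unpersist()
```

With sc created as in the examples above, label_codes_with_spark(sc, ["us", "xx"], {"us": "United States"}) should return the same list as the plain-Python label_codes call with the same arguments.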

Methods

Attributes