pyspark.sql.DataFrame.select
DataFrame.select(*cols)
Projects a set of expressions and returns a new DataFrame.

Parameters
    cols – list of column names (string) or expressions (Column). If one of the column names is '*', that column is expanded to include all columns in the current DataFrame.
>>> df.select('*').collect()
[Row(age=2, name='Alice'), Row(age=5, name='Bob')]
>>> df.select('name', 'age').collect()
[Row(name='Alice', age=2), Row(name='Bob', age=5)]
>>> df.select(df.name, (df.age + 10).alias('age')).collect()
[Row(name='Alice', age=12), Row(name='Bob', age=15)]
New in version 1.3.
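
The '*' expansion rule described under Parameters can be illustrated without a Spark session. The following is a plain-Python sketch, not PySpark's implementation: `select_rows` is a hypothetical helper, rows are modelled as ordinary dicts rather than `Row` objects, and only string column names are handled (no `Column` expressions).

```python
def select_rows(rows, *cols):
    """Project each row (a dict) onto the requested column names.

    Hypothetical helper for illustration only; not part of PySpark.
    """
    projected = []
    for row in rows:
        names = []
        for col in cols:
            if col == '*':
                # '*' is replaced by every column of the current row
                names.extend(row.keys())
            else:
                names.append(col)
        # Build the new row in the order the columns were requested
        projected.append({name: row[name] for name in names})
    return projected


# Same data as the df used in the examples above
rows = [{'age': 2, 'name': 'Alice'}, {'age': 5, 'name': 'Bob'}]

all_columns = select_rows(rows, '*')            # every column, original order
reordered = select_rows(rows, 'name', 'age')    # explicit column order

print(all_columns)
print(reordered)
```

As with `select`, the input is left untouched and a new collection is returned; the column order of the result follows the order of the arguments.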