Manipulation/selection

DataFrame.__getitem__(key)

Get part of the DataFrame as a new DataFrame, Series, or scalar.

DataFrame.bottom_k(k, *, by[, reverse])

Return the k smallest rows.

DataFrame.cast(dtypes, *[, strict])

Cast DataFrame column(s) to the specified dtype(s).

DataFrame.clear([n])

Create an empty (n=0) or n-row null-filled (n>0) copy of the DataFrame.

DataFrame.clone()

Create a copy of this DataFrame.

DataFrame.drop(*columns[, strict])

Remove columns from the dataframe.

DataFrame.drop_in_place(name)

Drop a single column in-place and return the dropped column.

DataFrame.drop_nans([subset])

Drop all rows that contain one or more NaN values.

DataFrame.drop_nulls([subset])

Drop all rows that contain null values.

DataFrame.explode(columns, *more_columns)

Explode the dataframe to long format by exploding the given columns.

DataFrame.extend(other)

Extend the memory backed by this DataFrame with the values from other.

DataFrame.fill_nan(value)

Fill floating-point NaN values with the given value or the result of an expression.

DataFrame.fill_null([value, strategy, ...])

Fill null values using the specified value or strategy.

DataFrame.filter(*predicates, **constraints)

Filter the rows in the DataFrame based on one or more predicate expressions.

DataFrame.gather_every(n[, offset])

Take every nth row in the DataFrame and return as a new DataFrame.

DataFrame.get_column(name, *[, default])

Get a single column by name.

DataFrame.get_column_index(name)

Find the index of a column by name.

DataFrame.get_columns()

Get the DataFrame as a list of Series.

DataFrame.group_by(*by[, maintain_order])

Start a group by operation.

DataFrame.group_by_dynamic(index_column, *, ...)

Group based on a time value (or index value of type Int32, Int64).

DataFrame.head([n])

Get the first n rows.

DataFrame.hstack(columns, *[, in_place])

Return a new DataFrame grown horizontally by stacking multiple Series to it.

DataFrame.insert_column(index, column)

Insert a Series at a certain column index.

DataFrame.interpolate()

Interpolate intermediate values.

DataFrame.item([row, column])

Return the DataFrame as a scalar, or return the element at the given row/column.

DataFrame.iter_columns()

Returns an iterator over the columns of this DataFrame.

DataFrame.iter_rows(*[, named, buffer_size])

Returns an iterator over the rows of the DataFrame, as Python-native values.

DataFrame.iter_slices([n_rows])

Returns a non-copying iterator of slices over the underlying DataFrame.

DataFrame.join(other[, on, how, left_on, ...])

Join in SQL-like fashion.

DataFrame.join_asof(other, *[, left_on, ...])

Perform an asof join.

DataFrame.join_where(other, *predicates[, ...])

Perform a join based on one or multiple (in)equality predicates.

DataFrame.limit([n])

Get the first n rows (alias for DataFrame.head).

DataFrame.melt([id_vars, value_vars, ...])

Unpivot a DataFrame from wide to long format. Deprecated in favor of DataFrame.unpivot.

DataFrame.merge_sorted(other, key)

Take two sorted DataFrames and merge them by the sorted key.

DataFrame.partition_by(by, *more_by[, ...])

Group by the given columns and return the groups as separate dataframes.

DataFrame.pipe(function, *args, **kwargs)

Offers a structured way to apply a sequence of user-defined functions (UDFs).

DataFrame.pivot(on, *[, index, values, ...])

Create a spreadsheet-style pivot table as a DataFrame.

DataFrame.rechunk()

Rechunk the data in this DataFrame to a contiguous allocation.

DataFrame.rename(mapping, *[, strict])

Rename column names.

DataFrame.replace_column(index, column)

Replace a column at an index location.

DataFrame.reverse()

Reverse the DataFrame.

DataFrame.rolling(index_column, *, period[, ...])

Create rolling groups based on a temporal or integer column.

DataFrame.row([index, by_predicate, named])

Get the values of a single row, either by index or by predicate.

DataFrame.rows(*[, named])

Returns all data in the DataFrame as a list of rows of Python-native values.

DataFrame.rows_by_key(key, *[, named, ...])

Returns all data as a dictionary of Python-native values keyed by some column.

DataFrame.sample([n, fraction, ...])

Sample from this DataFrame.

DataFrame.select(*exprs, **named_exprs)

Select columns from this DataFrame.

DataFrame.select_seq(*exprs, **named_exprs)

Select columns from this DataFrame, evaluating the expressions sequentially rather than in parallel.

DataFrame.set_sorted(column, *[, descending])

Indicate that one or multiple columns are sorted.

DataFrame.shift([n, fill_value])

Shift values by the given number of indices.

DataFrame.shrink_to_fit(*[, in_place])

Shrink DataFrame memory usage.

DataFrame.slice(offset[, length])

Get a slice of this DataFrame.

DataFrame.sort(by, *more_by[, descending, ...])

Sort the dataframe by the given columns.

DataFrame.sql(query, *[, table_name])

Execute a SQL query against the DataFrame.

DataFrame.tail([n])

Get the last n rows.

DataFrame.to_dummies([columns, separator, ...])

Convert categorical variables into dummy/indicator variables.

DataFrame.to_series([index])

Select column as Series at index location.

DataFrame.top_k(k, *, by[, reverse])

Return the k largest rows.

DataFrame.transpose(*[, include_header, ...])

Transpose a DataFrame over the diagonal.

DataFrame.unique([subset, keep, maintain_order])

Drop duplicate rows from this dataframe.

DataFrame.unnest(columns, *more_columns)

Decompose struct columns into separate columns for each of their fields.

DataFrame.unpivot([on, index, ...])

Unpivot a DataFrame from wide to long format.

DataFrame.unstack(step[, how, columns, ...])

Unstack a long table to a wide form without doing an aggregation.

DataFrame.update(other[, on, how, left_on, ...])

Update the values in this DataFrame with the values in other.

DataFrame.upsample(time_column, *, every[, ...])

Upsample a DataFrame at a regular frequency.

DataFrame.vstack(other, *[, in_place])

Grow this DataFrame vertically by stacking a DataFrame to it.

DataFrame.with_columns(*exprs, **named_exprs)

Add columns to this DataFrame.

DataFrame.with_columns_seq(*exprs, **named_exprs)

Add columns to this DataFrame, evaluating the expressions sequentially rather than in parallel.

DataFrame.with_row_count([name, offset])

Add a column at index 0 that counts the rows. Deprecated in favor of DataFrame.with_row_index.

DataFrame.with_row_index([name, offset])

Add a row index as the first column in the DataFrame.