Struct DataFrame

pub struct DataFrame { /* private fields */ }

A contiguous growable collection of Series that have the same length.

§Use declarations

All the common tools can be found in crate::prelude (or in polars::prelude).

use polars_core::prelude::*; // if the crate polars-core is used directly
// use polars::prelude::*;      if the crate polars is used

§Initialization

§Default

A DataFrame can be initialized empty:

let df = DataFrame::default();
assert!(df.is_empty());

§Wrapping a `Vec<Column>`

A DataFrame is built upon a Vec<Column>, where all columns have the same length.

let s1 = Column::new("Fruit".into(), ["Apple", "Apple", "Pear"]);
let s2 = Column::new("Color".into(), ["Red", "Yellow", "Green"]);

let df: PolarsResult<DataFrame> = DataFrame::new(vec![s1, s2]);

§Using a macro

The df! macro offers a convenient shorthand:

let df: PolarsResult<DataFrame> = df!("Fruit" => ["Apple", "Apple", "Pear"],
                                      "Color" => ["Red", "Yellow", "Green"]);

§Using a CSV file

See polars_io::csv::CsvReader.

§Indexing

§By a number

Index<usize> is implemented for DataFrame.

let df = df!("Fruit" => ["Apple", "Apple", "Pear"],
             "Color" => ["Red", "Yellow", "Green"])?;

assert_eq!(df[0], Column::new("Fruit".into(), &["Apple", "Apple", "Pear"]));
assert_eq!(df[1], Column::new("Color".into(), &["Red", "Yellow", "Green"]));

§By a `Series` name

let df = df!("Fruit" => ["Apple", "Apple", "Pear"],
             "Color" => ["Red", "Yellow", "Green"])?;

assert_eq!(df["Fruit"], Column::new("Fruit".into(), &["Apple", "Apple", "Pear"]));
assert_eq!(df["Color"], Column::new("Color".into(), &["Red", "Yellow", "Green"]));

Implementations§

impl DataFrame

pub fn sample_n( &self, n: &Series, with_replacement: bool, shuffle: bool, seed: Option<u64>, ) -> PolarsResult<Self>

Available on crate feature random only.

Sample n datapoints from this DataFrame.

pub fn sample_n_literal( &self, n: usize, with_replacement: bool, shuffle: bool, seed: Option<u64>, ) -> PolarsResult<Self>

Available on crate feature random only.

pub fn sample_frac( &self, frac: &Series, with_replacement: bool, shuffle: bool, seed: Option<u64>, ) -> PolarsResult<Self>

Available on crate feature random only.

Sample a fraction (between 0.0 and 1.0) of the rows of this DataFrame.
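To make the with_replacement and seed parameters concrete, the following plain-Rust sketch samples row indices rather than rows. It does not use polars: next_u64, frac_to_n, and sample_indices are hypothetical stand-ins. The truncating conversion from a fraction to a row count and the error when n exceeds the height without replacement are assumptions, not documented behavior.

```rust
/// Minimal deterministic generator (SplitMix64 steps); a stand-in for a real RNG.
fn next_u64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49B5_BB13_3111);
    z ^ (z >> 31)
}

/// Hypothetical helper: turn a fraction into a row count (truncation is an assumption).
fn frac_to_n(height: usize, frac: f64) -> usize {
    (height as f64 * frac) as usize
}

/// Hypothetical helper: pick `n` row indices out of `0..height`.
fn sample_indices(
    height: usize,
    n: usize,
    with_replacement: bool,
    seed: u64,
) -> Result<Vec<usize>, String> {
    let mut state = seed;
    if with_replacement {
        if height == 0 && n > 0 {
            return Err("cannot sample from an empty frame".to_string());
        }
        // Each draw is independent, so an index may appear more than once.
        return Ok((0..n)
            .map(|_| (next_u64(&mut state) % height as u64) as usize)
            .collect());
    }
    if n > height {
        return Err("sample is larger than the population".to_string());
    }
    // Partial Fisher-Yates shuffle: the first `n` slots end up holding distinct indices.
    let mut pool: Vec<usize> = (0..height).collect();
    for i in 0..n {
        let j = i + (next_u64(&mut state) % (height - i) as u64) as usize;
        pool.swap(i, j);
    }
    pool.truncate(n);
    Ok(pool)
}
```

Calling sample_indices twice with the same seed yields the same indices, which is the reproducibility a fixed seed is meant to provide.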

impl DataFrame

pub fn group_by_with_series( &self, by: Vec<Column>, multithreaded: bool, sorted: bool, ) -> PolarsResult<GroupBy<'_>>

Available on crate feature algorithm_group_by only.

pub fn group_by<I, S>(&self, by: I) -> PolarsResult<GroupBy<'_>>
where I: IntoIterator<Item = S>, S: Into<PlSmallStr>,

Available on crate feature algorithm_group_by only.

Group DataFrame using a Series column.

§Example

use polars_core::prelude::*;
fn group_by_sum(df: &DataFrame) -> PolarsResult<DataFrame> {
    df.group_by(["column_name"])?
        .select(["agg_column_name"])
        .sum()
}

pub fn group_by_stable<I, S>(&self, by: I) -> PolarsResult<GroupBy<'_>>
where I: IntoIterator<Item = S>, S: Into<PlSmallStr>,

Available on crate feature algorithm_group_by only.

Group DataFrame using a Series column. The groups are ordered by their smallest row index.
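The ordering guarantee can be pictured with a plain-Rust sketch that groups row indices by key and keeps the groups in order of first appearance. It does not use polars; stable_groups is a hypothetical helper, and nothing here is meant to describe how polars computes its groups.

```rust
use std::collections::HashMap;

/// Hypothetical helper: group row indices by key, keeping groups in order of
/// first appearance (that is, ordered by their smallest row index).
fn stable_groups(keys: &[&str]) -> Vec<(String, Vec<usize>)> {
    // Maps a key to the slot of its group in `groups`.
    let mut position: HashMap<&str, usize> = HashMap::new();
    let mut groups: Vec<(String, Vec<usize>)> = Vec::new();
    for (row, key) in keys.iter().copied().enumerate() {
        let slot = position.get(key).copied();
        match slot {
            Some(slot) => groups[slot].1.push(row),
            None => {
                position.insert(key, groups.len());
                groups.push((key.to_string(), vec![row]));
            }
        }
    }
    groups
}
```

For the keys ["b", "a", "b", "c", "a"], the groups come out as b, a, c, because those keys first occur at rows 0, 1, and 3.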

impl DataFrame

pub fn get_row(&self, idx: usize) -> PolarsResult<Row<'_>>

Available on crate features rows or object only.

Get a row from a DataFrame. Use of this is discouraged as it will likely be slow.

pub fn get_row_amortized<'a>( &'a self, idx: usize, row: &mut Row<'a>, ) -> PolarsResult<()>

Available on crate features rows or object only.

Amortize allocations by reusing a row. The caller is responsible for ensuring that the row has at least enough capacity for the number of columns in the DataFrame.
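The idea of amortizing allocations can be sketched in plain Rust with a column-major table of integers. This does not use polars: fill_row and row_sums are hypothetical helpers, and the sketch requires the buffer length to equal the number of columns, which is stricter than the "at least" requirement stated above.

```rust
/// Hypothetical helper: copy row `idx` of a column-major table into `row`,
/// reusing the caller's buffer instead of allocating a new one.
fn fill_row(columns: &[Vec<i64>], idx: usize, row: &mut [i64]) -> Result<(), String> {
    if row.len() != columns.len() {
        return Err(format!("row has {} slots, expected {}", row.len(), columns.len()));
    }
    for (cell, column) in row.iter_mut().zip(columns) {
        *cell = *column
            .get(idx)
            .ok_or_else(|| format!("row index {idx} is out of bounds"))?;
    }
    Ok(())
}

/// Sum every row while allocating the row buffer only once.
fn row_sums(columns: &[Vec<i64>]) -> Result<Vec<i64>, String> {
    let height = columns.first().map_or(0, |c| c.len());
    let mut row = vec![0_i64; columns.len()]; // allocated once, reused below
    let mut sums: Vec<i64> = Vec::with_capacity(height);
    for idx in 0..height {
        fill_row(columns, idx, &mut row)?;
        sums.push(row.iter().sum());
    }
    Ok(sums)
}
```

The loop in row_sums performs one allocation for the row buffer regardless of the height, whereas fetching a fresh row per iteration would allocate once per row.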

pub unsafe fn get_row_amortized_unchecked<'a>( &'a self, idx: usize, row: &mut Row<'a>, )

Available on crate features rows or object only.

Amortize allocations by reusing a row. The caller is responsible for ensuring that the row has at least enough capacity for the number of columns in the DataFrame.

§Safety

Does not do any bounds checking.

pub fn from_rows_and_schema( rows: &[Row<'_>], schema: &Schema, ) -> PolarsResult<Self>

Available on crate features rows or object only.

Create a new DataFrame from rows.

This should only be used when you have row-wise data, as this is a lot slower than creating the Series in a columnar fashion.

pub fn from_rows_iter_and_schema<'a, I>( rows: I, schema: &Schema, ) -> PolarsResult<Self>
where I: Iterator<Item = &'a Row<'a>>,

Available on crate features rows or object only.

Create a new DataFrame from an iterator over rows.

This should only be used when you have row-wise data, as this is a lot slower than creating the Series in a columnar fashion.

pub fn try_from_rows_iter_and_schema<'a, I>( rows: I, schema: &Schema, ) -> PolarsResult<Self>
where I: Iterator<Item = PolarsResult<&'a Row<'a>>>,

Available on crate features rows or object only.

Create a new DataFrame from an iterator over rows. This should only be used when you have row-wise data, as this is a lot slower than creating the Series in a columnar fashion.

pub fn from_rows(rows: &[Row<'_>]) -> PolarsResult<Self>

Available on crate features rows or object only.

Create a new DataFrame from rows. This should only be used when you have row-wise data, as this is a lot slower than creating the Series in a columnar fashion.
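A plain-Rust sketch gives some intuition for the extra work in row-wise construction. It does not use polars; rows_to_columns is a hypothetical helper that scatters integer rows into one buffer per column, and it is only an illustration of the access pattern, not of the actual implementation.

```rust
/// Hypothetical helper: scatter row-wise records into one buffer per column.
fn rows_to_columns(rows: &[Vec<i64>], width: usize) -> Result<Vec<Vec<i64>>, String> {
    let mut columns: Vec<Vec<i64>> = (0..width)
        .map(|_| Vec::with_capacity(rows.len()))
        .collect();
    for (i, row) in rows.iter().enumerate() {
        if row.len() != width {
            return Err(format!("row {i} has {} values, expected {width}", row.len()));
        }
        // Every row touches every column buffer, which is the extra work
        // that building the columns directly avoids.
        for (column, value) in columns.iter_mut().zip(row) {
            column.push(*value);
        }
    }
    Ok(columns)
}
```

When the data is already available per column, each buffer can be handed over as is and this scatter step disappears.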

impl DataFrame

pub fn transpose( &mut self, keep_names_as: Option<&str>, new_col_names: Option<Either<String, Vec<String>>>, ) -> PolarsResult<DataFrame>

Available on crate features rows or object only.

pub fn transpose_impl( &mut self, keep_names_as: Option<&str>, new_col_names: Option<Either<PlSmallStr, Vec<PlSmallStr>>>, ) -> PolarsResult<DataFrame>

Available on crate features rows or object only.

Transpose a DataFrame. This is a very expensive operation.

impl DataFrame

pub fn clear_schema(&mut self)

pub fn materialized_column_iter(&self) -> impl ExactSizeIterator<Item = &Series>

pub fn par_materialized_column_iter( &self, ) -> impl ParallelIterator<Item = &Series>

pub fn estimated_size(&self) -> usize

Returns an estimation of the total (heap) allocated size of the DataFrame in bytes.

§Implementation

This estimation is the sum of the size of its buffers, validity, including nested arrays. Multiple arrays may share buffers and bitmaps. Therefore, the size of 2 arrays is not the sum of the sizes computed from this function. In particular, StructArray’s size is an upper bound.

When an array is sliced, its allocated size remains constant because the buffer is unchanged. However, this function will yield a smaller number. This is because this function returns the visible size of the buffer, not its total capacity.

FFI buffers are included in this estimation.
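The two caveats above (slices report their visible size, and shared buffers are counted once per array) can be illustrated with a plain-Rust model. It does not use polars or arrow: ArrayView is a hypothetical stand-in for an array that views a reference-counted buffer.

```rust
use std::mem::size_of;
use std::rc::Rc;

/// Hypothetical stand-in for an array: a window into a shared buffer.
struct ArrayView {
    buffer: Rc<Vec<i64>>,
    offset: usize,
    len: usize,
}

impl ArrayView {
    fn new(buffer: Rc<Vec<i64>>) -> Self {
        let len = buffer.len();
        ArrayView { buffer, offset: 0, len }
    }

    /// Slicing narrows the window; the buffer itself is neither copied nor shrunk.
    fn slice(&self, offset: usize, len: usize) -> Self {
        assert!(offset + len <= self.len, "slice out of bounds");
        ArrayView {
            buffer: Rc::clone(&self.buffer),
            offset: self.offset + offset,
            len,
        }
    }

    /// The values visible through this window.
    fn values(&self) -> &[i64] {
        &self.buffer[self.offset..self.offset + self.len]
    }

    /// Visible size in bytes: depends on the window, not on the allocation.
    fn estimated_size(&self) -> usize {
        self.values().len() * size_of::<i64>()
    }

    /// Bytes held by the underlying allocation, shared by all views of it.
    fn allocated_size(&self) -> usize {
        self.buffer.capacity() * size_of::<i64>()
    }
}
```

A 10-element slice of a 1000-element view reports 80 bytes while still keeping the whole buffer alive, and adding the estimates of the view and its slice counts the shared buffer more than once.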

pub fn _apply_columns(&self, func: &dyn Fn(&Column) -> Column) -> Vec<Column>

pub fn _apply_columns_par( &self, func: &(dyn Fn(&Column) -> Column + Send + Sync), ) -> Vec<Column>

pub fn new(columns: Vec<Column>) -> PolarsResult<Self>

Create a DataFrame from a vector of columns.

§Example

let s0 = Column::new("days".into(), [0, 1, 2].as_ref());
let s1 = Column::new("temp".into(), [22.1, 19.9, 7.].as_ref());

let df = DataFrame::new(vec![s0, s1])?;

pub fn new_with_broadcast(columns: Vec<Column>) -> PolarsResult<Self>

Converts a sequence of columns into a DataFrame, broadcasting length-1 columns to match the other columns.
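Broadcasting can be sketched in plain Rust on integer columns. This does not use polars: broadcast_columns is a hypothetical helper, and its rules for edge cases (all columns of length 1, or mismatched longer columns) are assumptions made for the illustration.

```rust
/// Hypothetical helper: stretch length-1 columns to the common length of the rest.
fn broadcast_columns(columns: Vec<Vec<i64>>) -> Result<Vec<Vec<i64>>, String> {
    // The target length is taken from the columns that are not length 1.
    let mut target: Option<usize> = None;
    for column in &columns {
        if column.len() == 1 {
            continue;
        }
        match target {
            None => target = Some(column.len()),
            Some(len) if len != column.len() => {
                return Err(format!("column lengths differ: {len} vs {}", column.len()));
            }
            Some(_) => {}
        }
    }
    // If every column has length 1 (or there are none), nothing is stretched.
    let len = match target {
        Some(len) => len,
        None => return Ok(columns),
    };
    Ok(columns
        .into_iter()
        .map(|column| {
            if column.len() == 1 {
                vec![column[0]; len] // repeat the single value
            } else {
                column
            }
        })
        .collect())
}
```

Given the columns [1, 2, 3] and [7], the second column is repeated to [7, 7, 7] so that both have length 3.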

pub fn new_with_broadcast_len( columns: Vec<Column>, broadcast_len: usize, ) -> PolarsResult<Self>

Converts a sequence of columns into a DataFrame, broadcasting length-1 columns to broadcast_len.

pub unsafe fn new_with_broadcast_no_namecheck( columns: Vec<Column>, broadcast_len: usize, ) -> PolarsResult<Self>

Converts a sequence of columns into a DataFrame, broadcasting length-1 columns to match the other columns.

§Safety

Does not check that the column names are unique (which they must be).

pub const fn empty() -> Self

Creates an empty DataFrame usable in a compile time context (such as static initializers).

§Example

use polars_core::prelude::DataFrame;
static EMPTY: DataFrame = DataFrame::empty();

pub fn empty_with_schema(schema: &Schema) -> Self

Create an empty DataFrame with empty columns as per the schema.

pub fn empty_with_arrow_schema(schema: &ArrowSchema) -> Self

Create an empty DataFrame with empty columns as per the schema.

pub fn full_null(schema: &Schema, height: usize) -> Self

Create a new DataFrame with the given schema, only containing nulls.

pub fn pop(&mut self) -> Option<Column>

Removes the last Series from the DataFrame and returns it, or None if it is empty.

§Example

let s1 = Column::new("Ocean".into(), ["Atlantic", "Indian"]);
let s2 = Column::new("Area (km²)".into(), [106_460_000, 70_560_000]);
let mut df = DataFrame::new(vec![s1.clone(), s2.clone()])?;

assert_eq!(df.pop(), Some(s2));
assert_eq!(df.pop(), Some(s1));
assert_eq!(df.pop(), None);
assert!(df.is_empty());

pub fn with_row_index( &self, name: PlSmallStr, offset: Option<IdxSize>, ) -> PolarsResult<Self>

Add a new column at index 0 that counts the rows.

§Example

let df1: DataFrame = df!("Name" => ["James", "Mary", "John", "Patricia"])?;
assert_eq!(df1.shape(), (4, 1));

let df2: DataFrame = df1.with_row_index("Id".into(), None)?;
assert_eq!(df2.shape(), (4, 2));
println!("{}", df2);

Output:

 shape: (4, 2)
 +-----+----------+
 | Id  | Name     |
 | --- | ---      |
 | u32 | str      |
 +=====+==========+
 | 0   | James    |
 +-----+----------+
 | 1   | Mary     |
 +-----+----------+
 | 2   | John     |
 +-----+----------+
 | 3   | Patricia |
 +-----+----------+

pub fn with_row_index_mut( &mut self, name: PlSmallStr, offset: Option<IdxSize>, ) -> &mut Self

Add a row index column in place.

pub unsafe fn new_no_checks_height_from_first(columns: Vec<Column>) -> DataFrame

Create a new DataFrame without checking the length or duplicate occurrence of the Series.

Calculates the height from the first column or 0 if no columns are given.

§Safety

It is the caller's responsibility to uphold the contract of all Series having an equal length and a unique name; if not, this may panic down the line.

pub unsafe fn new_no_checks(height: usize, columns: Vec<Column>) -> DataFrame

Create a new DataFrame without checking the length or duplicate occurrence of the Series.

It is advised to use DataFrame::new in favor of this method.

§Safety

It is the caller's responsibility to uphold the contract of all Series having an equal length and a unique name; if not, this may panic down the line.

pub const unsafe fn _new_no_checks_impl( height: usize, columns: Vec<Column>, ) -> DataFrame

This will not panic even in debug mode - there are some (rare) use cases where a DataFrame is temporarily constructed containing duplicates for dispatching to functions. A DataFrame constructed with this method is generally highly unsafe and should not be long-lived.

pub unsafe fn new_no_length_checks( columns: Vec<Column>, ) -> PolarsResult<DataFrame>

Create a new DataFrame without checking the length of the Series; only duplicates are checked.

It is advised to use DataFrame::new in favor of this method.

§Safety

It is the caller's responsibility to uphold the contract of all Series having an equal length; if not, this may panic down the line.

pub fn shrink_to_fit(&mut self)

Shrink the capacity of this DataFrame to fit its length.

pub fn as_single_chunk(&mut self) -> &mut Self

Aggregate all the chunks in the DataFrame to a single chunk.

pub fn as_single_chunk_par(&mut self) -> &mut Self

Aggregate all the chunks in the DataFrame to a single chunk in parallel. This may lead to higher peak memory consumption.

pub fn rechunk_mut(&mut self)

Rechunks all columns to only have a single chunk.

pub fn rechunk_to_record_batch( self, compat_level: CompatLevel, ) -> RecordBatchT<Box<dyn Array>>

Rechunks all columns to only have a single chunk and turns it into a RecordBatchT.

pub fn should_rechunk(&self) -> bool

Returns true if the chunks of the columns do not align and re-chunking should be done.
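One way to read "the chunks do not align" is that the columns do not share the same chunk boundaries. The plain-Rust sketch below encodes that reading. It does not use polars; chunks_misaligned is a hypothetical helper, and treating alignment as "identical sequences of chunk lengths" is an assumption.

```rust
/// Hypothetical helper: `chunk_lens[c]` lists the chunk lengths of column `c`.
/// Returns true when the columns do not share the same chunk boundaries.
fn chunks_misaligned(chunk_lens: &[Vec<usize>]) -> bool {
    match chunk_lens.split_first() {
        // Compare every other column against the first one.
        Some((first, rest)) => rest.iter().any(|lens| lens != first),
        // No columns: nothing can be misaligned.
        None => false,
    }
}
```

Two columns chunked as [3, 2] and [3, 2] are aligned, while [3, 2] and [5] hold the same number of rows but are not.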

pub fn align_chunks_par(&mut self) -> &mut Self

Ensure all the chunks in the DataFrame are aligned.

pub fn align_chunks(&mut self) -> &mut Self

pub fn schema(&self) -> &SchemaRef

Get the DataFrame schema.

§Example

let df: DataFrame = df!("Thing" => ["Observable universe", "Human stupidity"],
                        "Diameter (m)" => [8.8e26, f64::INFINITY])?;

let f1: Field = Field::new("Thing".into(), DataType::String);
let f2: Field = Field::new("Diameter (m)".into(), DataType::Float64);
let sc: Schema = Schema::from_iter(vec![f1, f2]);

assert_eq!(&**df.schema(), &sc);

pub fn get_columns(&self) -> &[Column]

Get a reference to the DataFrame columns.

§Example

let df: DataFrame = df!("Name" => ["Adenine", "Cytosine", "Guanine", "Thymine"],
                        "Symbol" => ["A", "C", "G", "T"])?;
let columns: &[Column] = df.get_columns();

assert_eq!(columns[0].name(), "Name");
assert_eq!(columns[1].name(), "Symbol");

pub unsafe fn get_columns_mut(&mut self) -> &mut Vec<Column>

Get mutable access to the underlying columns.

§Safety

The caller must ensure the length of all Series remains equal to height or DataFrame::set_height is called afterwards with the appropriate height. The caller must ensure that the cached schema is cleared if it modifies the schema by calling DataFrame::clear_schema.

pub fn clear_columns(&mut self)

Remove all the columns in the DataFrame but keep the height.

pub unsafe fn column_extend_unchecked( &mut self, iter: impl IntoIterator<Item = Column>, )

Extend the columns without checking for name collisions or height.

§Safety

The caller needs to ensure that:

Column names are unique within the resulting DataFrame.
The length of each appended column matches the height of the DataFrame. For DataFrames with no columns (ZCDFs), it is important that the height is set afterwards with DataFrame::set_height.

pub fn take_columns(self) -> Vec<Column>

Take ownership of the underlying columns vec.

pub fn iter(&self) -> impl ExactSizeIterator<Item = &Series>

Iterator over the columns as Series.

§Example

let s1 = Column::new("Name".into(), ["Pythagoras' theorem", "Shannon entropy"]);
let s2 = Column::new("Formula".into(), ["a²+b²=c²", "H=-Σ[P(x)log|P(x)|]"]);
let df: DataFrame = DataFrame::new(vec![s1.clone(), s2.clone()])?;

let mut iterator = df.iter();

assert_eq!(iterator.next(), Some(s1.as_materialized_series()));
assert_eq!(iterator.next(), Some(s2.as_materialized_series()));
assert_eq!(iterator.next(), None);

pub fn get_column_names(&self) -> Vec<&PlSmallStr>

§Example

let df: DataFrame = df!("Language" => ["Rust", "Python"],
                        "Designer" => ["Graydon Hoare", "Guido van Rossum"])?;

assert_eq!(df.get_column_names(), &["Language", "Designer"]);

pub fn get_column_names_owned(&self) -> Vec<PlSmallStr>

Get the Vec<PlSmallStr> representing the column names.

pub fn get_column_names_str(&self) -> Vec<&str>

pub fn set_column_names<I, S>(&mut self, names: I) -> PolarsResult<()>
where I: IntoIterator<Item = S>, S: Into<PlSmallStr>,

Set the column names.

§Example

let mut df: DataFrame = df!("Mathematical set" => ["ℕ", "ℤ", "𝔻", "ℚ", "ℝ", "ℂ"])?;
df.set_column_names(["Set"])?;

assert_eq!(df.get_column_names(), &["Set"]);

pub fn dtypes(&self) -> Vec<DataType>

Get the data types of the columns in the DataFrame.

§Example

let venus_air: DataFrame = df!("Element" => ["Carbon dioxide", "Nitrogen"],
                               "Fraction" => [0.965, 0.035])?;

assert_eq!(venus_air.dtypes(), &[DataType::String, DataType::Float64]);

pub fn first_col_n_chunks(&self) -> usize

The number of chunks for the first column.

pub fn max_n_chunks(&self) -> usize

The highest number of chunks for any column.

pub fn fields(&self) -> Vec<Field>

Get the schema fields of the DataFrame.

§Example

let earth: DataFrame = df!("Surface type" => ["Water", "Land"],
                           "Fraction" => [0.708, 0.292])?;

let f1: Field = Field::new("Surface type".into(), DataType::String);
let f2: Field = Field::new("Fraction".into(), DataType::Float64);

assert_eq!(earth.fields(), &[f1, f2]);

pub fn shape(&self) -> (usize, usize)

Get (height, width) of the DataFrame.

§Example

let df0: DataFrame = DataFrame::default();
let df1: DataFrame = df!("1" => [1, 2, 3, 4, 5])?;
let df2: DataFrame = df!("1" => [1, 2, 3, 4, 5],
                         "2" => [1, 2, 3, 4, 5])?;

assert_eq!(df0.shape(), (0, 0));
assert_eq!(df1.shape(), (5, 1));
assert_eq!(df2.shape(), (5, 2));

pub fn width(&self) -> usize

Get the width of the DataFrame which is the number of columns.

§Example

let df0: DataFrame = DataFrame::default();
let df1: DataFrame = df!("Series 1" => [0; 0])?;
let df2: DataFrame = df!("Series 1" => [0; 0],
                         "Series 2" => [0; 0])?;

assert_eq!(df0.width(), 0);
assert_eq!(df1.width(), 1);
assert_eq!(df2.width(), 2);

pub fn height(&self) -> usize

Get the height of the DataFrame which is the number of rows.

§Example

let df0: DataFrame = DataFrame::default();
let df1: DataFrame = df!("Currency" => ["€", "$"])?;
let df2: DataFrame = df!("Currency" => ["€", "$", "¥", "£", "₿"])?;

assert_eq!(df0.height(), 0);
assert_eq!(df1.height(), 2);
assert_eq!(df2.height(), 5);

pub fn size(&self) -> usize

Returns the size as the number of rows multiplied by the number of columns.

pub fn is_empty(&self) -> bool

Returns true if the DataFrame contains no rows.

§Example

let df1: DataFrame = DataFrame::default();
assert!(df1.is_empty());

let df2: DataFrame = df!("First name" => ["Forever"],
                         "Last name" => ["Alone"])?;
assert!(!df2.is_empty());

pub unsafe fn set_height(&mut self, height: usize)

Set the height (i.e. number of rows) of this DataFrame.

§Safety

This needs to be equal to the length of all the columns.

pub fn hstack(&self, columns: &[Column]) -> PolarsResult<Self>

Add multiple Series to a DataFrame. The added Series are required to have the same length.

§Example

let df1: DataFrame = df!("Element" => ["Copper", "Silver", "Gold"])?;
let s1 = Column::new("Proton".into(), [29, 47, 79]);
let s2 = Column::new("Electron".into(), [29, 47, 79]);

let df2: DataFrame = df1.hstack(&[s1, s2])?;
assert_eq!(df2.shape(), (3, 3));
println!("{}", df2);

Output:

shape: (3, 3)
+---------+--------+----------+
| Element | Proton | Electron |
| ---     | ---    | ---      |
| str     | i32    | i32      |
+=========+========+==========+
| Copper  | 29     | 29       |
+---------+--------+----------+
| Silver  | 47     | 47       |
+---------+--------+----------+
| Gold    | 79     | 79       |
+---------+--------+----------+

pub fn vstack(&self, other: &DataFrame) -> PolarsResult<Self>

Concatenate a DataFrame to this DataFrame and return the result as a newly allocated DataFrame.

If many vstack operations are done, it is recommended to call DataFrame::align_chunks_par.

§Example

let df1: DataFrame = df!("Element" => ["Copper", "Silver", "Gold"],
                         "Melting Point (K)" => [1357.77, 1234.93, 1337.33])?;
let df2: DataFrame = df!("Element" => ["Platinum", "Palladium"],
                         "Melting Point (K)" => [2041.4, 1828.05])?;

let df3: DataFrame = df1.vstack(&df2)?;

assert_eq!(df3.shape(), (5, 2));
println!("{}", df3);

Output:

shape: (5, 2)
+-----------+-------------------+
| Element   | Melting Point (K) |
| ---       | ---               |
| str       | f64               |
+===========+===================+
| Copper    | 1357.77           |
+-----------+-------------------+
| Silver    | 1234.93           |
+-----------+-------------------+
| Gold      | 1337.33           |
+-----------+-------------------+
| Platinum  | 2041.4            |
+-----------+-------------------+
| Palladium | 1828.05           |
+-----------+-------------------+

pub fn vstack_mut(&mut self, other: &DataFrame) -> PolarsResult<&mut Self>

Concatenate a DataFrame to this DataFrame in place.

If many vstack operations are done, it is recommended to call DataFrame::align_chunks_par.

§Example

let mut df1: DataFrame = df!("Element" => ["Copper", "Silver", "Gold"],
                             "Melting Point (K)" => [1357.77, 1234.93, 1337.33])?;
let df2: DataFrame = df!("Element" => ["Platinum", "Palladium"],
                         "Melting Point (K)" => [2041.4, 1828.05])?;

df1.vstack_mut(&df2)?;

assert_eq!(df1.shape(), (5, 2));
println!("{}", df1);

Output:

shape: (5, 2)
+-----------+-------------------+
| Element   | Melting Point (K) |
| ---       | ---               |
| str       | f64               |
+===========+===================+
| Copper    | 1357.77           |
+-----------+-------------------+
| Silver    | 1234.93           |
+-----------+-------------------+
| Gold      | 1337.33           |
+-----------+-------------------+
| Platinum  | 2041.4            |
+-----------+-------------------+
| Palladium | 1828.05           |
+-----------+-------------------+

pub fn vstack_mut_unchecked(&mut self, other: &DataFrame)

Concatenate a DataFrame to this DataFrame in place.

If many vstack operations are done, it is recommended to call DataFrame::align_chunks_par.

§Panics

Panics if the schemas don’t match.

pub fn vstack_mut_owned_unchecked(&mut self, other: DataFrame)

Concatenate a DataFrame to this DataFrame in place, taking ownership of other.

If many vstack operations are done, it is recommended to call DataFrame::align_chunks_par.

§Panics

Panics if the schemas don’t match.

pub fn extend(&mut self, other: &DataFrame) -> PolarsResult<()>

Extend the memory backed by this DataFrame with the values from other.

Unlike vstack, which adds the chunks from other to the chunks of this DataFrame, extend appends the data from other to the underlying memory locations and thus may cause a reallocation.

If this does not cause a reallocation, the resulting data structure will not have any extra chunks and thus will yield faster queries.

Prefer extend over vstack when you want to do a query after a single append, for instance during online operations where you add n rows and rerun a query.

Prefer vstack over extend when you want to append many times before doing a query, for instance when you read in multiple files and want to store them in a single DataFrame. In the latter case, finish the sequence of append operations with a rechunk.
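The trade-off between the two append styles can be modeled in plain Rust with a column stored as a list of chunks. This is a toy model that does not use polars: ChunkedColumn and its methods are hypothetical, and the model clones chunks where a real implementation might share them.

```rust
/// Hypothetical stand-in for a chunked column: a list of separately allocated chunks.
struct ChunkedColumn {
    chunks: Vec<Vec<i64>>,
}

impl ChunkedColumn {
    /// vstack-style append: add the other column's chunks to the chunk list.
    fn vstack(&mut self, other: &ChunkedColumn) {
        self.chunks.extend(other.chunks.iter().cloned());
    }

    /// extend-style append: copy the values into the last chunk, which may reallocate.
    fn extend(&mut self, other: &ChunkedColumn) {
        if self.chunks.is_empty() {
            self.chunks.push(Vec::new());
        }
        let last = self.chunks.last_mut().expect("at least one chunk");
        for chunk in &other.chunks {
            last.extend_from_slice(chunk);
        }
    }

    /// Merge all chunks into one, as a final rechunk would.
    fn rechunk(&mut self) {
        let merged: Vec<i64> = self.chunks.concat();
        self.chunks = vec![merged];
    }

    fn n_chunks(&self) -> usize {
        self.chunks.len()
    }

    /// All values in order, regardless of how they are chunked.
    fn values(&self) -> Vec<i64> {
        self.chunks.concat()
    }
}
```

After one append, the vstack-style column has two chunks and the extend-style column has one, while both hold the same values. Many vstack-style appends followed by a single rechunk copy the data once at the end, instead of risking a reallocation on every append.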

pub fn drop_in_place(&mut self, name: &str) -> PolarsResult<Column>

Remove a column by name and return the column removed.

§Example

let mut df: DataFrame = df!("Animal" => ["Tiger", "Lion", "Great auk"],
                            "IUCN" => ["Endangered", "Vulnerable", "Extinct"])?;

let s1: PolarsResult<Column> = df.drop_in_place("Average weight");
assert!(s1.is_err());

let s2: Column = df.drop_in_place("Animal")?;
assert_eq!(s2, Column::new("Animal".into(), &["Tiger", "Lion", "Great auk"]));

pub fn drop_nulls<S>(&self, subset: Option<&[S]>) -> PolarsResult<Self>
where for<'a> &'a S: Into<PlSmallStr>,

Return a new DataFrame where all rows that contain a null value are dropped. If subset is given, only those columns are checked for nulls.

§Example

let df1: DataFrame = df!("Country" => ["Malta", "Liechtenstein", "North Korea"],
                         "Tax revenue (% GDP)" => [Some(32.7), None, None])?;
assert_eq!(df1.shape(), (3, 2));

let df2: DataFrame = df1.drop_nulls::<String>(None)?;
assert_eq!(df2.shape(), (1, 2));
println!("{}", df2);

Output:

shape: (1, 2)
+---------+---------------------+
| Country | Tax revenue (% GDP) |
| ---     | ---                 |
| str     | f64                 |
+=========+=====================+
| Malta   | 32.7                |
+---------+---------------------+
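The role of the subset argument can be sketched in plain Rust on columns of optional floats. This does not use polars: drop_null_rows is a hypothetical helper that identifies the subset by column position rather than by name, and it assumes all columns have the same length.

```rust
/// Hypothetical helper: keep the rows that have no null (`None`) in the checked columns.
/// `subset` holds column positions; `None` means "check every column".
fn drop_null_rows(
    columns: &[Vec<Option<f64>>],
    subset: Option<&[usize]>,
) -> Vec<Vec<Option<f64>>> {
    let height = columns.first().map_or(0, |c| c.len());
    let checked: Vec<usize> = match subset {
        Some(indices) => indices.to_vec(),
        None => (0..columns.len()).collect(),
    };
    // One keep/drop flag per row.
    let keep: Vec<bool> = (0..height)
        .map(|row| checked.iter().all(|&c| columns[c][row].is_some()))
        .collect();
    // Apply the same flags to every column so the rows stay aligned.
    columns
        .iter()
        .map(|column| {
            column
                .iter()
                .zip(&keep)
                .filter_map(|(value, &kept)| if kept { Some(*value) } else { None })
                .collect::<Vec<Option<f64>>>()
        })
        .collect()
}
```

With no subset, a row is dropped if any column is null in that row. With a subset, nulls in the other columns are left in place.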

pub fn drop(&self, name: &str) -> PolarsResult<Self>

Drop a column by name. This is a pure method and will return a new DataFrame instead of modifying the current one in place.

§Example

let df1: DataFrame = df!("Ray type" => ["α", "β", "X", "γ"])?;
let df2: DataFrame = df1.drop("Ray type")?;

assert!(df2.is_empty());

pub fn drop_many<I, S>(&self, names: I) -> Self
where I: IntoIterator<Item = S>, S: Into<PlSmallStr>,

Drop columns that are in names.

pub fn drop_many_amortized(&self, names: &PlHashSet<PlSmallStr>) -> DataFrame

Drop columns that are in names without allocating a HashSet.

pub fn insert_column<S: IntoColumn>( &mut self, index: usize, column: S, ) -> PolarsResult<&mut Self>

Insert a new column at a given index.

pub fn with_column<C: IntoColumn>( &mut self, column: C, ) -> PolarsResult<&mut Self>

Add a new column to this DataFrame or replace an existing one.

pub unsafe fn with_column_unchecked(&mut self, column: Column) -> &mut Self

Adds a column to the DataFrame without doing any checks on length or duplicates.

§Safety

The caller must ensure self.width() == 0 || column.len() == self.height().

pub fn _add_series( &mut self, series: Vec<Series>, schema: &Schema, ) -> PolarsResult<()>

pub fn _add_columns( &mut self, columns: Vec<Column>, schema: &Schema, ) -> PolarsResult<()>

pub fn with_column_and_schema<C: IntoColumn>( &mut self, column: C, schema: &Schema, ) -> PolarsResult<&mut Self>

Add a new column to this DataFrame or replace an existing one. Uses an existing schema to amortize lookups. If the schema is incorrect, this falls back to a linear search.

Note: the schema can be either the input schema or the output schema.
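The "trust the schema, then fall back" pattern described above can be sketched in plain Rust. It does not use polars: find_column is a hypothetical helper in which a HashMap from name to position plays the role of the schema.

```rust
use std::collections::HashMap;

/// Hypothetical helper: find the position of `name` among `names`, trusting the
/// `schema` index when it is right and falling back to a linear scan otherwise.
fn find_column(names: &[&str], schema: &HashMap<&str, usize>, name: &str) -> Option<usize> {
    if let Some(&idx) = schema.get(name) {
        // Verify the hint: the schema may describe a different column layout.
        if names.get(idx).copied() == Some(name) {
            return Some(idx);
        }
    }
    // The schema was missing or wrong for this name: linear search.
    names.iter().position(|candidate| *candidate == name)
}
```

With a correct schema the lookup costs one hash probe plus one comparison. With a stale schema the result is still correct, only slower.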

pub fn get(&self, idx: usize) -> Option<Vec<AnyValue<'_>>>

Get a row in the DataFrame. Beware this is slow.

§Example

fn example(df: &mut DataFrame, idx: usize) -> Option<Vec<AnyValue>> {
    df.get(idx)
}

pub fn select_at_idx(&self, idx: usize) -> Option<&Column>

Select a Series by index.

§Example

let df: DataFrame = df!("Star" => ["Sun", "Betelgeuse", "Sirius A", "Sirius B"],
                        "Absolute magnitude" => [4.83, -5.85, 1.42, 11.18])?;

let s1: Option<&Column> = df.select_at_idx(0);
let s2 = Column::new("Star".into(), ["Sun", "Betelgeuse", "Sirius A", "Sirius B"]);

assert_eq!(s1, Some(&s2));

pub fn select_by_range<R>(&self, range: R) -> PolarsResult<Self>
where R: RangeBounds<usize>,

Select column(s) from this DataFrame by range and return a new DataFrame.

§Examples

let df = df! {
    "0" => [0, 0, 0],
    "1" => [1, 1, 1],
    "2" => [2, 2, 2]
}?;

assert!(df.select(["0", "1"])?.equals(&df.select_by_range(0..=1)?));
assert!(df.equals(&df.select_by_range(..)?));

pub fn get_column_index(&self, name: &str) -> Option<usize>

Get column index of a Series by name.

§Example

let df: DataFrame = df!("Name" => ["Player 1", "Player 2", "Player 3"],
                        "Health" => [100, 200, 500],
                        "Mana" => [250, 100, 0],
                        "Strength" => [30, 150, 300])?;

assert_eq!(df.get_column_index("Name"), Some(0));
assert_eq!(df.get_column_index("Health"), Some(1));
assert_eq!(df.get_column_index("Mana"), Some(2));
assert_eq!(df.get_column_index("Strength"), Some(3));
assert_eq!(df.get_column_index("Haste"), None);

pub fn try_get_column_index(&self, name: &str) -> PolarsResult<usize>

Get column index of a Series by name.

pub fn column(&self, name: &str) -> PolarsResult<&Column>

Select a single column by name.

§Example

let s1 = Column::new("Password".into(), ["123456", "[]B$u$g$s$B#u#n#n#y[]{}"]);
let s2 = Column::new("Robustness".into(), ["Weak", "Strong"]);
let df: DataFrame = DataFrame::new(vec![s1.clone(), s2])?;

assert_eq!(df.column("Password")?, &s1);

pub fn columns<I, S>(&self, names: I) -> PolarsResult<Vec<&Column>>
where I: IntoIterator<Item = S>, S: AsRef<str>,

Select multiple columns by name.

§Example

let df: DataFrame = df!("Latin name" => ["Oncorhynchus kisutch", "Salmo salar"],
                        "Max weight (kg)" => [16.0, 35.89])?;
let sv: Vec<&Column> = df.columns(["Latin name", "Max weight (kg)"])?;

assert_eq!(&df[0], sv[0]);
assert_eq!(&df[1], sv[1]);

pub fn select<I, S>(&self, selection: I) -> PolarsResult<Self>
where I: IntoIterator<Item = S>, S: Into<PlSmallStr>,

Select column(s) from this DataFrame and return a new DataFrame.

§Examples

fn example(df: &DataFrame) -> PolarsResult<DataFrame> {
    df.select(["foo", "bar"])
}

pub fn _select_impl(&self, cols: &[PlSmallStr]) -> PolarsResult<Self>

pub fn _select_impl_unchecked(&self, cols: &[PlSmallStr]) -> PolarsResult<Self>

pub fn select_with_schema<I, S>( &self, selection: I, schema: &SchemaRef, ) -> PolarsResult<Self>
where I: IntoIterator<Item = S>, S: Into<PlSmallStr>,

Select with a known schema. The schema names must match the column names of this DataFrame.

pub fn select_with_schema_unchecked<I, S>( &self, selection: I, schema: &Schema, ) -> PolarsResult<Self>
where I: IntoIterator<Item = S>, S: Into<PlSmallStr>,

Select with a known schema without checking for duplicates in selection. The schema names must match the column names of this DataFrame.

pub fn _select_with_schema_impl( &self, cols: &[PlSmallStr], schema: &Schema, check_duplicates: bool, ) -> PolarsResult<Self>

The schema names must match the column names of this DataFrame.

pub fn select_physical<I, S>(&self, selection: I) -> PolarsResult<Self>
where I: IntoIterator<Item = S>, S: Into<PlSmallStr>,

pub fn select_columns( &self, selection: impl IntoVec<PlSmallStr>, ) -> PolarsResult<Vec<Column>>

Select column(s) from this DataFrame and return them in a Vec.

§Example

let df: DataFrame = df!("Name" => ["Methane", "Ethane", "Propane"],
                        "Carbon" => [1, 2, 3],
                        "Hydrogen" => [4, 6, 8])?;
let sv: Vec<Column> = df.select_columns(["Carbon", "Hydrogen"])?;

assert_eq!(df["Carbon"], sv[0]);
assert_eq!(df["Hydrogen"], sv[1]);

pub fn filter(&self, mask: &BooleanChunked) -> PolarsResult<Self>

Take the DataFrame rows by a boolean mask.

§Example

fn example(df: &DataFrame) -> PolarsResult<DataFrame> {
    let mask = df.column("sepal_width")?.is_not_null();
    df.filter(&mask)
}

pub fn _filter_seq(&self, mask: &BooleanChunked) -> PolarsResult<Self>

Same as filter but does not parallelize.

pub fn take(&self, indices: &IdxCa) -> PolarsResult<Self>

Take DataFrame rows by index values.

§Example

fn example(df: &DataFrame) -> PolarsResult<DataFrame> {
    let idx = IdxCa::new("idx".into(), [0, 1, 9]);
    df.take(&idx)
}

pub unsafe fn take_unchecked(&self, idx: &IdxCa) -> Self

§Safety

The indices must be in-bounds.

pub unsafe fn take_unchecked_impl( &self, idx: &IdxCa, allow_threads: bool, ) -> Self

§Safety

The indices must be in-bounds.

pub unsafe fn take_slice_unchecked(&self, idx: &[IdxSize]) -> Self

§Safety

The indices must be in-bounds.

pub unsafe fn take_slice_unchecked_impl( &self, idx: &[IdxSize], allow_threads: bool, ) -> Self

§Safety

The indices must be in-bounds.

pub fn rename( &mut self, column: &str, name: PlSmallStr, ) -> PolarsResult<&mut Self>

Rename a column in the DataFrame.

§Example

fn example(df: &mut DataFrame) -> PolarsResult<&mut DataFrame> {
    let original_name = "foo";
    let new_name = "bar";
    df.rename(original_name, new_name.into())
}

pub fn sort_in_place( &mut self, by: impl IntoVec<PlSmallStr>, sort_options: SortMultipleOptions, ) -> PolarsResult<&mut Self>

Sort DataFrame in place.

See DataFrame::sort for more details.

pub fn _to_metadata(&self) -> DataFrame

Create a DataFrame that has fields for all the known runtime metadata for each column.

This dataframe does not necessarily have a specified schema and may be changed at any point. It is primarily used for debugging.

Source

pub fn sort( &self, by: impl IntoVec<PlSmallStr>, sort_options: SortMultipleOptions, ) -> PolarsResult<Self>

Return a sorted clone of this DataFrame.

In many cases the output chunks will be contiguous in memory, but this is not guaranteed.

§Example

Sort by a single column with default options:

fn sort_by_sepal_width(df: &DataFrame) -> PolarsResult<DataFrame> {
    df.sort(["sepal_width"], Default::default())
}

Sort by a single column with specific order:

fn sort_with_specific_order(df: &DataFrame, descending: bool) -> PolarsResult<DataFrame> {
    df.sort(
        ["sepal_width"],
        SortMultipleOptions::new()
            .with_order_descending(descending)
    )
}

Sort by multiple columns, specifying the order for each column:

fn sort_by_multiple_columns_with_specific_order(df: &DataFrame) -> PolarsResult<DataFrame> {
    df.sort(
        ["sepal_width", "sepal_length"],
        SortMultipleOptions::new()
            .with_order_descending_multi([false, true])
    )
}

See SortMultipleOptions for more options.

Also see DataFrame::sort_in_place.
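
How a per-column descending flag interacts with a multi-column sort can be modeled in plain Rust. The sketch below does not use Polars; sort_rows is a hypothetical helper that represents each row as a Vec of integer keys and compares key by key, flipping the comparison where the matching flag is true.

```rust
use std::cmp::Ordering;

/// Hypothetical model: sort rows by several integer key columns, where
/// `descending[k]` flips the order of the k-th key.
fn sort_rows(rows: &mut [Vec<i64>], descending: &[bool]) {
    rows.sort_by(|a, b| {
        for (k, &desc) in descending.iter().enumerate() {
            let ord = a[k].cmp(&b[k]);
            let ord = if desc { ord.reverse() } else { ord };
            // The first key that differs decides the order.
            if ord != Ordering::Equal {
                return ord;
            }
        }
        Ordering::Equal
    });
}
```

With flags [false, true], rows are ordered ascending by the first key and, among ties, descending by the second.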

Source

pub fn replace<S: IntoSeries>( &mut self, column: &str, new_col: S, ) -> PolarsResult<&mut Self>

Replace a column with a Series.

§Example

let mut df: DataFrame = df!("Country" => ["United States", "China"],
                        "Area (km²)" => [9_833_520, 9_596_961])?;
let s: Series = Series::new("Country".into(), ["USA", "PRC"]);

assert!(df.replace("Nation", s.clone()).is_err());
assert!(df.replace("Country", s).is_ok());

Source

pub fn replace_or_add<S: IntoSeries>( &mut self, column: PlSmallStr, new_col: S, ) -> PolarsResult<&mut Self>

Replace or update a column. The difference between this method and DataFrame::with_column is that here the column argument determines the name of the column, not the name of the Series passed to this method.

Source

pub fn replace_column<C: IntoColumn>( &mut self, index: usize, new_column: C, ) -> PolarsResult<&mut Self>

Replace the column at position index with new_column.

§Example

use polars_core::prelude::*;

let s0 = Column::new("foo".into(), ["ham", "spam", "egg"]);
let s1 = Column::new("ascii".into(), [70, 79, 79]);
let mut df = DataFrame::new(vec![s0, s1])?;

// Add 32 to get lowercase ascii values
let lowercase = df.select_at_idx(1).unwrap() + 32;
df.replace_column(1, lowercase)?;
Ok::<(), PolarsError>(())

Source

pub fn apply<F, C>(&mut self, name: &str, f: F) -> PolarsResult<&mut Self>
where F: FnOnce(&Column) -> C, C: IntoColumn,

Apply a closure to a column. This is the recommended way to do in place modification.

§Example

let s0 = Column::new("foo".into(), ["ham", "spam", "egg"]);
let s1 = Column::new("names".into(), ["Jean", "Claude", "van"]);
let mut df = DataFrame::new(vec![s0, s1])?;

fn str_to_len(str_val: &Column) -> Column {
    str_val.str()
        .unwrap()
        .into_iter()
        .map(|opt_name: Option<&str>| {
            opt_name.map(|name: &str| name.len() as u32)
        })
        .collect::<UInt32Chunked>()
        .into_column()
}

// Replace the names column by the length of the names.
df.apply("names", str_to_len)?;

Results in:

+--------+-------+
| foo    | names |
| ---    | ---   |
| str    | u32   |
+========+=======+
| "ham"  | 4     |
+--------+-------+
| "spam" | 6     |
+--------+-------+
| "egg"  | 3     |
+--------+-------+

Source

pub fn apply_at_idx<F, C>( &mut self, idx: usize, f: F, ) -> PolarsResult<&mut Self>
where F: FnOnce(&Column) -> C, C: IntoColumn,

Apply a closure to a column at index idx. This is the recommended way to do in place modification.

§Example

let s0 = Column::new("foo".into(), ["ham", "spam", "egg"]);
let s1 = Column::new("ascii".into(), [70, 79, 79]);
let mut df = DataFrame::new(vec![s0, s1])?;

// Add 32 to get lowercase ascii values
df.apply_at_idx(1, |s| s + 32);

Results in:

+--------+-------+
| foo    | ascii |
| ---    | ---   |
| str    | i32   |
+========+=======+
| "ham"  | 102   |
+--------+-------+
| "spam" | 111   |
+--------+-------+
| "egg"  | 111   |
+--------+-------+

Source

pub fn try_apply_at_idx<F, C>( &mut self, idx: usize, f: F, ) -> PolarsResult<&mut Self>
where F: FnOnce(&Column) -> PolarsResult<C>, C: IntoColumn,

Apply a closure that may fail to a column at index idx. This is the recommended way to do in place modification.

§Example

This is the idiomatic way to replace some values in a column of a DataFrame, given a set of row indexes.

let s0 = Column::new("foo".into(), ["ham", "spam", "egg", "bacon", "quack"]);
let s1 = Column::new("values".into(), [1, 2, 3, 4, 5]);
let mut df = DataFrame::new(vec![s0, s1])?;

let idx = vec![0, 1, 4];

// "foo" is the column at index 0.
df.try_apply_at_idx(0, |c| {
    c.str()?
    .scatter_with(idx, |opt_val| opt_val.map(|string| format!("{}-is-modified", string)))
})?;

Results in:

+---------------------+--------+
| foo                 | values |
| ---                 | ---    |
| str                 | i32    |
+=====================+========+
| "ham-is-modified"   | 1      |
+---------------------+--------+
| "spam-is-modified"  | 2      |
+---------------------+--------+
| "egg"               | 3      |
+---------------------+--------+
| "bacon"             | 4      |
+---------------------+--------+
| "quack-is-modified" | 5      |
+---------------------+--------+

Source

pub fn try_apply<F, C>(&mut self, column: &str, f: F) -> PolarsResult<&mut Self>
where F: FnOnce(&Series) -> PolarsResult<C>, C: IntoColumn,

Apply a closure that may fail to a column. This is the recommended way to do in place modification.

§Example

This is the idiomatic way to replace some values in a column of a DataFrame, given a boolean mask.

let s0 = Column::new("foo".into(), ["ham", "spam", "egg", "bacon", "quack"]);
let s1 = Column::new("values".into(), [1, 2, 3, 4, 5]);
let mut df = DataFrame::new(vec![s0, s1])?;

// create a mask
let values = df.column("values")?.as_materialized_series();
let mask = values.lt_eq(1)? | values.gt_eq(5_i32)?;

df.try_apply("foo", |c| {
    c.str()?
    .set(&mask, Some("not_within_bounds"))
});

Results in:

+---------------------+--------+
| foo                 | values |
| ---                 | ---    |
| str                 | i32    |
+=====================+========+
| "not_within_bounds" | 1      |
+---------------------+--------+
| "spam"              | 2      |
+---------------------+--------+
| "egg"               | 3      |
+---------------------+--------+
| "bacon"             | 4      |
+---------------------+--------+
| "not_within_bounds" | 5      |
+---------------------+--------+

Source

pub fn slice(&self, offset: i64, length: usize) -> Self

Slice the DataFrame along the rows.

§Example

let df: DataFrame = df!("Fruit" => ["Apple", "Grape", "Grape", "Fig", "Fig"],
                        "Color" => ["Green", "Red", "White", "White", "Red"])?;
let sl: DataFrame = df.slice(2, 3);

assert_eq!(sl.shape(), (3, 2));
println!("{}", sl);

Output:

shape: (3, 2)
+-------+-------+
| Fruit | Color |
| ---   | ---   |
| str   | str   |
+=======+=======+
| Grape | White |
+-------+-------+
| Fig   | White |
+-------+-------+
| Fig   | Red   |
+-------+-------+
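
The offset and length arithmetic of a row slice can be modeled in plain Rust. slice_rows is a hypothetical helper, not the Polars implementation. It assumes that a negative offset counts from the end and that ranges reaching past the end are clamped; both behaviors are assumptions of this sketch rather than statements from the text above.

```rust
/// Hypothetical model of slicing rows by `(offset, length)`.
/// Assumptions of this sketch: a negative `offset` counts from the end,
/// and a range that runs past the end is clamped to the available rows.
fn slice_rows<T: Clone>(rows: &[T], offset: i64, length: usize) -> Vec<T> {
    let len = rows.len();
    let start = if offset < 0 {
        len.saturating_sub(offset.unsigned_abs() as usize)
    } else {
        (offset as usize).min(len)
    };
    let end = start.saturating_add(length).min(len);
    rows[start..end].to_vec()
}
```

Calling slice_rows on the Fruit column with offset 2 and length 3 selects the same three rows as the example above.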

Source

pub fn split_at(&self, offset: i64) -> (Self, Self)

Split DataFrame at the given offset.

Source

pub fn head(&self, length: Option<usize>) -> Self

Get the head of the DataFrame.

§Example

let countries: DataFrame =
    df!("Rank by GDP (2021)" => [1, 2, 3, 4, 5],
        "Continent" => ["North America", "Asia", "Asia", "Europe", "Europe"],
        "Country" => ["United States", "China", "Japan", "Germany", "United Kingdom"],
        "Capital" => ["Washington", "Beijing", "Tokyo", "Berlin", "London"])?;
assert_eq!(countries.shape(), (5, 4));

println!("{}", countries.head(Some(3)));

Output:

shape: (3, 4)
+--------------------+---------------+---------------+------------+
| Rank by GDP (2021) | Continent     | Country       | Capital    |
| ---                | ---           | ---           | ---        |
| i32                | str           | str           | str        |
+====================+===============+===============+============+
| 1                  | North America | United States | Washington |
+--------------------+---------------+---------------+------------+
| 2                  | Asia          | China         | Beijing    |
+--------------------+---------------+---------------+------------+
| 3                  | Asia          | Japan         | Tokyo      |
+--------------------+---------------+---------------+------------+

Source

pub fn tail(&self, length: Option<usize>) -> Self

Get the tail of the DataFrame.

§Example

let countries: DataFrame =
    df!("Rank (2021)" => [105, 106, 107, 108, 109],
        "Apple Price (€/kg)" => [0.75, 0.70, 0.70, 0.65, 0.52],
        "Country" => ["Kosovo", "Moldova", "North Macedonia", "Syria", "Turkey"])?;
assert_eq!(countries.shape(), (5, 3));

println!("{}", countries.tail(Some(2)));

Output:

shape: (2, 3)
+-------------+--------------------+---------+
| Rank (2021) | Apple Price (€/kg) | Country |
| ---         | ---                | ---     |
| i32         | f64                | str     |
+=============+====================+=========+
| 108         | 0.65               | Syria   |
+-------------+--------------------+---------+
| 109         | 0.52               | Turkey  |
+-------------+--------------------+---------+

Source

pub fn iter_chunks( &self, compat_level: CompatLevel, parallel: bool, ) -> RecordBatchIter<'_> ⓘ

Iterator over the rows in this DataFrame as Arrow RecordBatches.

§Panics

Panics if the DataFrame that is passed is not rechunked.

This responsibility is left to the caller as we don’t want to take mutable references here, but we also don’t want to rechunk here, as this operation is costly and would benefit the caller as well.

Source

pub fn iter_chunks_physical(&self) -> PhysRecordBatchIter<'_> ⓘ

Iterator over the rows in this DataFrame as Arrow RecordBatches as physical values.

§Panics

Panics if the DataFrame that is passed is not rechunked.

This responsibility is left to the caller as we don’t want to take mutable references here, but we also don’t want to rechunk here, as this operation is costly and would benefit the caller as well.

Source

pub fn reverse(&self) -> Self

Get a DataFrame with the rows of every column in reversed order.

Source

pub fn shift(&self, periods: i64) -> Self

Shift the values by a given period and fill the parts that will be empty due to this operation with Nones.

See the method on Series for more info on the shift operation.
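
The effect of a shift on one column can be modeled in plain Rust, with Option standing in for nullable values. The helper shift below is hypothetical and is not the Polars implementation; it assumes that a positive period moves values toward higher row indices and a negative period toward lower ones.

```rust
/// Hypothetical model of shifting one nullable column by `periods`.
/// Vacated positions are filled with `None`.
fn shift<T: Clone>(values: &[Option<T>], periods: i64) -> Vec<Option<T>> {
    let len = values.len();
    // A shift larger than the column simply empties it.
    let n = periods.unsigned_abs().min(len as u64) as usize;
    let mut out = vec![None; len];
    if periods >= 0 {
        // Values move toward the end; the first `n` slots stay `None`.
        out[n..].clone_from_slice(&values[..len - n]);
    } else {
        // Values move toward the start; the last `n` slots stay `None`.
        out[..len - n].clone_from_slice(&values[n..]);
    }
    out
}
```

The length of the column never changes; values pushed past either end are dropped.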

Source

pub fn fill_null(&self, strategy: FillNullStrategy) -> PolarsResult<Self>

Replace None values with one of the following strategies:

Forward fill (replace None with the previous value)
Backward fill (replace None with the next value)
Mean fill (replace None with the mean of the whole array)
Min fill (replace None with the minimum of the whole array)
Max fill (replace None with the maximum of the whole array)

See the method on Series for more info on the fill_null operation.
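
Three of the strategies listed above can be sketched on a single nullable column in plain Rust. The helpers forward_fill, backward_fill, and min_fill are hypothetical names for this illustration and are not Polars functions; a column is modeled as a slice of Option<i32>.

```rust
/// Forward fill: each `None` takes the most recent preceding value, if any.
fn forward_fill(values: &[Option<i32>]) -> Vec<Option<i32>> {
    let mut last = None;
    values
        .iter()
        .map(|&v| {
            if v.is_some() {
                last = v;
            }
            last
        })
        .collect()
}

/// Backward fill: each `None` takes the next following value, if any.
/// Implemented as a forward fill over the reversed column.
fn backward_fill(values: &[Option<i32>]) -> Vec<Option<i32>> {
    let reversed: Vec<Option<i32>> = values.iter().rev().copied().collect();
    let mut filled = forward_fill(&reversed);
    filled.reverse();
    filled
}

/// Min fill: each `None` takes the minimum of the non-null values.
fn min_fill(values: &[Option<i32>]) -> Vec<Option<i32>> {
    let min = values.iter().flatten().min().copied();
    values.iter().map(|&v| v.or(min)).collect()
}
```

In this model a leading None survives a forward fill and a trailing None survives a backward fill, because there is no neighboring value to copy from.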

Source

pub fn pipe<F, B>(self, f: F) -> PolarsResult<B>
where F: Fn(DataFrame) -> PolarsResult<B>,

Pipe different functions/closure operations that work on a DataFrame together.
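
The piping pattern itself is small enough to sketch in plain Rust. In the block below, Frame is a hypothetical stand-in for a DataFrame, String stands in for the error type, and the steps non_empty and total are invented for illustration; only the shape of pipe follows the signature above.

```rust
/// Hypothetical stand-in for a DataFrame: a single integer column.
#[derive(Debug, PartialEq)]
struct Frame(Vec<i64>);

impl Frame {
    /// Same shape as `pipe`: hand `self` to `f` and return its result.
    fn pipe<F, B>(self, f: F) -> Result<B, String>
    where
        F: Fn(Frame) -> Result<B, String>,
    {
        f(self)
    }
}

/// A step that can fail: rejects empty frames.
fn non_empty(frame: Frame) -> Result<Frame, String> {
    if frame.0.is_empty() {
        Err("frame is empty".to_string())
    } else {
        Ok(frame)
    }
}

/// A step that changes the payload type: reduces the frame to a sum.
fn total(frame: Frame) -> Result<i64, String> {
    Ok(frame.0.iter().sum())
}

/// Chain the steps; `?` stops at the first failing step.
fn run(frame: Frame) -> Result<i64, String> {
    frame.pipe(non_empty)?.pipe(total)
}
```

Because each step returns a Result, a chain reads left to right and short-circuits on the first error.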

Source

pub fn pipe_mut<F, B>(&mut self, f: F) -> PolarsResult<B>
where F: Fn(&mut DataFrame) -> PolarsResult<B>,

Pipe different functions/closure operations that work on a DataFrame together.

Source

pub fn pipe_with_args<F, B, Args>(self, f: F, args: Args) -> PolarsResult<B>
where F: Fn(DataFrame, Args) -> PolarsResult<B>,

Pipe different functions/closure operations that work on a DataFrame together.

Source

pub fn unique_stable( &self, subset: Option<&[String]>, keep: UniqueKeepStrategy, slice: Option<(i64, usize)>, ) -> PolarsResult<DataFrame>

Available on crate feature algorithm_group_by only.

Drop duplicate rows from a DataFrame. This fails when there is a column of type List in the DataFrame.

Stable means that the order is maintained. This has a higher cost than an unstable distinct.

§Example

let df = df! {
              "flt" => [1., 1., 2., 2., 3., 3.],
              "int" => [1, 1, 2, 2, 3, 3, ],
              "str" => ["a", "a", "b", "b", "c", "c"]
          }?;

println!("{}", df.unique_stable(None, UniqueKeepStrategy::First, None)?);

Returns

+-----+-----+-----+
| flt | int | str |
| --- | --- | --- |
| f64 | i32 | str |
+=====+=====+=====+
| 1   | 1   | "a" |
+-----+-----+-----+
| 2   | 2   | "b" |
+-----+-----+-----+
| 3   | 3   | "c" |
+-----+-----+-----+
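
What "stable" means here can be shown with a plain-Rust model of keeping the first occurrence of each row. unique_stable_first is a hypothetical helper, not the Polars implementation; rows are modeled as hashable tuples, so the float column from the example is left out.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Hypothetical model of a stable distinct that keeps the first occurrence:
/// a row is emitted the first time it is seen, in input order.
fn unique_stable_first<T: Clone + Eq + Hash>(rows: &[T]) -> Vec<T> {
    let mut seen = HashSet::new();
    rows.iter()
        // `insert` returns false when the row was already present.
        .filter(|row| seen.insert((*row).clone()))
        .cloned()
        .collect()
}
```

The extra bookkeeping needed to preserve input order is a plausible source of the higher cost mentioned above, though this sketch makes no claim about how Polars implements it.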

Source

pub fn unique<I, S>( &self, subset: Option<&[String]>, keep: UniqueKeepStrategy, slice: Option<(i64, usize)>, ) -> PolarsResult<DataFrame>

Available on crate feature algorithm_group_by only.

Unstable distinct. See DataFrame::unique_stable.

Source

pub fn unique_impl( &self, maintain_order: bool, subset: Option<Vec<PlSmallStr>>, keep: UniqueKeepStrategy, slice: Option<(i64, usize)>, ) -> PolarsResult<Self>

Available on crate feature algorithm_group_by only.

Source

pub fn is_unique(&self) -> PolarsResult<BooleanChunked>

Available on crate feature algorithm_group_by only.

Get a mask of all the unique rows in the DataFrame.

§Example

let df: DataFrame = df!("Company" => ["Apple", "Microsoft"],
                        "ISIN" => ["US0378331005", "US5949181045"])?;
let ca: ChunkedArray<BooleanType> = df.is_unique()?;

assert!(ca.all());

Source

pub fn is_duplicated(&self) -> PolarsResult<BooleanChunked>

Available on crate feature algorithm_group_by only.

Get a mask of all the duplicated rows in the DataFrame.

§Example

let df: DataFrame = df!("Company" => ["Alphabet", "Alphabet"],
                        "ISIN" => ["US02079K3059", "US02079K1079"])?;
let ca: ChunkedArray<BooleanType> = df.is_duplicated()?;

assert!(!ca.all());
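
The relationship between the two masks can be modeled in plain Rust by counting how often each row occurs. The helpers below are hypothetical and do not use Polars; they assume that a row is unique when it occurs exactly once and duplicated when it occurs more than once, so the two masks are complements.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Count how often each row occurs.
fn row_counts<T: Eq + Hash>(rows: &[T]) -> HashMap<&T, usize> {
    let mut counts = HashMap::new();
    for row in rows {
        *counts.entry(row).or_insert(0) += 1;
    }
    counts
}

/// `true` where the row occurs exactly once.
fn is_unique_mask<T: Eq + Hash>(rows: &[T]) -> Vec<bool> {
    let counts = row_counts(rows);
    rows.iter().map(|row| counts[row] == 1).collect()
}

/// `true` where the row occurs more than once (every occurrence is marked).
fn is_duplicated_mask<T: Eq + Hash>(rows: &[T]) -> Vec<bool> {
    let counts = row_counts(rows);
    rows.iter().map(|row| counts[row] > 1).collect()
}
```

Two rows that share a company name but differ in ISIN are distinct rows, which is why the example above yields no duplicates.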

Source

pub fn null_count(&self) -> Self

Create a new DataFrame that shows the null counts per column.

Source

pub fn hash_rows( &mut self, hasher_builder: Option<PlRandomState>, ) -> PolarsResult<UInt64Chunked>

Available on crate feature row_hash only.

Hash and combine the row values

Source

pub fn get_supertype(&self) -> Option<PolarsResult<DataType>>

Get the supertype of the columns in this DataFrame

Source

pub fn partition_by<I, S>( &self, cols: I, include_key: bool, ) -> PolarsResult<Vec<DataFrame>>
where I: IntoIterator<Item = S>, S: Into<PlSmallStr>,

Available on crate feature partition_by only.

Split into multiple DataFrames partitioned by groups.

Source

pub fn partition_by_stable<I, S>( &self, cols: I, include_key: bool, ) -> PolarsResult<Vec<DataFrame>>
where I: IntoIterator<Item = S>, S: Into<PlSmallStr>,

Available on crate feature partition_by only.

Split into multiple DataFrames partitioned by groups. The order of the groups is maintained.
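
An order-preserving partition can be modeled in plain Rust. partition_stable is a hypothetical helper, not the Polars implementation; it assumes that "order is maintained" means groups appear in order of first occurrence and rows keep their original order within each group.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical model: split `rows` into groups by `key`, keeping groups in
/// order of first appearance and rows in their original order.
fn partition_stable<T, K, F>(rows: &[T], key: F) -> Vec<Vec<T>>
where
    T: Clone,
    K: Eq + Hash,
    F: Fn(&T) -> K,
{
    // Maps each key to the position of its group in `groups`.
    let mut slot_of: HashMap<K, usize> = HashMap::new();
    let mut groups: Vec<Vec<T>> = Vec::new();
    for row in rows {
        let slot = *slot_of.entry(key(row)).or_insert_with(|| {
            groups.push(Vec::new());
            groups.len() - 1
        });
        groups[slot].push(row.clone());
    }
    groups
}
```

An unstable partition could return the same groups in any order, which is the only difference this model captures.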

Source

pub fn append_record_batch( &mut self, rb: RecordBatchT<ArrayRef>, ) -> PolarsResult<()>

Source §

impl DataFrame

Source

pub fn serialize_into_writer( &mut self, writer: &mut dyn Write, ) -> PolarsResult<()>

Available on crate feature serde only.

Source

pub fn serialize_to_bytes(&mut self) -> PolarsResult<Vec<u8>>

Available on crate feature serde only.

Source

pub fn deserialize_from_reader(reader: &mut dyn Read) -> PolarsResult<Self>

Available on crate feature serde only.

Source §

impl DataFrame

Source

pub fn schema_equal(&self, other: &DataFrame) -> PolarsResult<()>

Check if the DataFrames’ schemas are equal.

Source

pub fn equals(&self, other: &DataFrame) -> bool

Check if DataFrames are equal. Note that None == None evaluates to false.

§Example

let df1: DataFrame = df!("Atomic number" => &[1, 51, 300],
                        "Element" => &[Some("Hydrogen"), Some("Antimony"), None])?;
let df2: DataFrame = df!("Atomic number" => &[1, 51, 300],
                        "Element" => &[Some("Hydrogen"), Some("Antimony"), None])?;

assert!(!df1.equals(&df2));

Source

pub fn equals_missing(&self, other: &DataFrame) -> bool

Check if all values in DataFrames are equal where None == None evaluates to true.

§Example

let df1: DataFrame = df!("Atomic number" => &[1, 51, 300],
                        "Element" => &[Some("Hydrogen"), Some("Antimony"), None])?;
let df2: DataFrame = df!("Atomic number" => &[1, 51, 300],
                        "Element" => &[Some("Hydrogen"), Some("Antimony"), None])?;

assert!(df1.equals_missing(&df2));
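
The two null-handling rules can be contrasted in a plain-Rust model of a single nullable column. The functions below share names with the methods only for readability; they are hypothetical helpers over slices of Option, not the Polars implementation.

```rust
/// Model of `equals`: a null never equals anything, not even another null.
fn equals<T: PartialEq>(a: &[Option<T>], b: &[Option<T>]) -> bool {
    a.len() == b.len()
        && a.iter().zip(b).all(|pair| match pair {
            (Some(x), Some(y)) => x == y,
            // Any comparison involving a null counts as "not equal".
            _ => false,
        })
}

/// Model of `equals_missing`: two nulls in the same position are equal.
/// `Option`'s own `PartialEq` already treats `None == None` as true.
fn equals_missing<T: PartialEq>(a: &[Option<T>], b: &[Option<T>]) -> bool {
    a == b
}
```

Under the first rule a column containing a null is not even equal to itself, which matches the assertions in the two examples above.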

Trait Implementations§

Source §

impl Add<&DataFrame> for &DataFrame

Available on crate feature dataframe_arithmetic only.

Source §

type Output = Result<DataFrame, PolarsError>

The resulting type after applying the + operator.

Source §

fn add(self, rhs: &DataFrame) -> Self::Output

Performs the + operation. Read more

Source §

impl Add<&Series> for &DataFrame

Available on crate feature dataframe_arithmetic only.

Source §

type Output = Result<DataFrame, PolarsError>

The resulting type after applying the + operator.

Source §

fn add(self, rhs: &Series) -> Self::Output

Performs the + operation. Read more

Source §

impl Add<&Series> for DataFrame

Available on crate feature dataframe_arithmetic only.

Source §

type Output = Result<DataFrame, PolarsError>

The resulting type after applying the + operator.

Source §

fn add(self, rhs: &Series) -> Self::Output

Performs the + operation. Read more

Source §

impl Clone for DataFrame

Source §

fn clone(&self) -> DataFrame

Returns a copy of the value. Read more

1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more

Source §

impl Container for DataFrame

Source §

fn slice(&self, offset: i64, len: usize) -> Self

Source §

fn split_at(&self, offset: i64) -> (Self, Self)

Source §

fn len(&self) -> usize

Source §

fn iter_chunks(&self) -> impl Iterator<Item = Self>

Source §

fn n_chunks(&self) -> usize

Source §

fn chunk_lengths(&self) -> impl Iterator<Item = usize>

Source §

impl Debug for DataFrame

Source §

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more

Source §

impl Default for DataFrame

Source §

fn default() -> Self

Returns the “default value” for a type. Read more

Source §

impl<'de> Deserialize<'de> for DataFrame

Available on crate feature serde only.

Source §

fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer. Read more

Source §

impl Display for DataFrame

Source §

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more

Source §

impl Div<&DataFrame> for &DataFrame

Available on crate feature dataframe_arithmetic only.

Source §

type Output = Result<DataFrame, PolarsError>

The resulting type after applying the / operator.

Source §

fn div(self, rhs: &DataFrame) -> Self::Output

Performs the / operation. Read more

Source §

impl Div<&Series> for &DataFrame

Available on crate feature dataframe_arithmetic only.

Source §

type Output = Result<DataFrame, PolarsError>

The resulting type after applying the / operator.

Source §

fn div(self, rhs: &Series) -> Self::Output

Performs the / operation. Read more

Source §

impl Div<&Series> for DataFrame

Available on crate feature dataframe_arithmetic only.

Source §

type Output = Result<DataFrame, PolarsError>

The resulting type after applying the / operator.

Source §

fn div(self, rhs: &Series) -> Self::Output

Performs the / operation. Read more

Source §

impl From<DataFrame> for Vec<Column>

Source §

fn from(df: DataFrame) -> Self

Converts to this type from the input type.

Source §

impl FromIterator<Column> for DataFrame

Source §

fn from_iter<T: IntoIterator<Item = Column>>(iter: T) -> Self

§Panics

Panics if the Columns have different lengths.

Source §

impl FromIterator<Series> for DataFrame

Source §

fn from_iter<T: IntoIterator<Item = Series>>(iter: T) -> Self

§Panics

Panics if the Series have different lengths.

Source §

impl Index<&str> for DataFrame

Source §

type Output = Column

The returned type after indexing.

Source §

fn index(&self, index: &str) -> &Self::Output

Performs the indexing (container[index]) operation. Read more

Source §

impl Index<Range<usize>> for DataFrame

Source §

type Output = [Column]

The returned type after indexing.

Source §

fn index(&self, index: Range<usize>) -> &Self::Output

Performs the indexing (container[index]) operation. Read more

Source §

impl Index<RangeFrom<usize>> for DataFrame

Source §

type Output = [Column]

The returned type after indexing.

Source §

fn index(&self, index: RangeFrom<usize>) -> &Self::Output

Performs the indexing (container[index]) operation. Read more

Source §

impl Index<RangeFull> for DataFrame

Source §

type Output = [Column]

The returned type after indexing.

Source §

fn index(&self, index: RangeFull) -> &Self::Output

Performs the indexing (container[index]) operation. Read more

Source §

impl Index<RangeInclusive<usize>> for DataFrame

Source §

type Output = [Column]

The returned type after indexing.

Source §

fn index(&self, index: RangeInclusive<usize>) -> &Self::Output

Performs the indexing (container[index]) operation. Read more

Source §

impl Index<RangeTo<usize>> for DataFrame

Source §

type Output = [Column]

The returned type after indexing.

Source §

fn index(&self, index: RangeTo<usize>) -> &Self::Output

Performs the indexing (container[index]) operation. Read more

Source §

impl Index<RangeToInclusive<usize>> for DataFrame

Source §

type Output = [Column]

The returned type after indexing.

Source §

fn index(&self, index: RangeToInclusive<usize>) -> &Self::Output

Performs the indexing (container[index]) operation. Read more

Source §

impl Index<usize> for DataFrame

Source §

type Output = Column

The returned type after indexing.

Source §

fn index(&self, index: usize) -> &Self::Output

Performs the indexing (container[index]) operation. Read more

Source §

impl Mul<&DataFrame> for &DataFrame

Available on crate feature dataframe_arithmetic only.

Source §

type Output = Result<DataFrame, PolarsError>

The resulting type after applying the * operator.

Source §

fn mul(self, rhs: &DataFrame) -> Self::Output

Performs the * operation. Read more

Source §

impl Mul<&Series> for &DataFrame

Available on crate feature dataframe_arithmetic only.

Source §

type Output = Result<DataFrame, PolarsError>

The resulting type after applying the * operator.

Source §

fn mul(self, rhs: &Series) -> Self::Output

Performs the * operation. Read more

Source §

impl Mul<&Series> for DataFrame

Available on crate feature dataframe_arithmetic only.

Source §

type Output = Result<DataFrame, PolarsError>

The resulting type after applying the * operator.

Source §

fn mul(self, rhs: &Series) -> Self::Output

Performs the * operation. Read more

Source §

impl PartialEq for DataFrame

Source §

fn eq(&self, other: &Self) -> bool

Tests for self and other values to be equal, and is used by ==.

1.0.0 · Source§

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.

Source §

impl Rem<&DataFrame> for &DataFrame

Available on crate feature dataframe_arithmetic only.

Source §

type Output = Result<DataFrame, PolarsError>

The resulting type after applying the % operator.

Source §

fn rem(self, rhs: &DataFrame) -> Self::Output

Performs the % operation. Read more

Source §

impl Rem<&Series> for &DataFrame

Available on crate feature dataframe_arithmetic only.

Source §

type Output = Result<DataFrame, PolarsError>

The resulting type after applying the % operator.

Source §

fn rem(self, rhs: &Series) -> Self::Output

Performs the % operation. Read more

Source §

impl Rem<&Series> for DataFrame

Available on crate feature dataframe_arithmetic only.

Source §

type Output = Result<DataFrame, PolarsError>

The resulting type after applying the % operator.

Source §

fn rem(self, rhs: &Series) -> Self::Output

Performs the % operation. Read more

Source §

impl Serialize for DataFrame

Available on crate feature serde only.

Source §

fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where S: Serializer,

Serialize this value into the given Serde serializer. Read more

Source §

impl Sub<&DataFrame> for &DataFrame

Available on crate feature dataframe_arithmetic only.

Source §

type Output = Result<DataFrame, PolarsError>

The resulting type after applying the - operator.

Source §

fn sub(self, rhs: &DataFrame) -> Self::Output

Performs the - operation. Read more

Source §

impl Sub<&Series> for &DataFrame

Available on crate feature dataframe_arithmetic only.

Source §

type Output = Result<DataFrame, PolarsError>

The resulting type after applying the - operator.

Source §

fn sub(self, rhs: &Series) -> Self::Output

Performs the - operation. Read more

Source §

impl Sub<&Series> for DataFrame

Available on crate feature dataframe_arithmetic only.

Source §

type Output = Result<DataFrame, PolarsError>

The resulting type after applying the - operator.

Source §

fn sub(self, rhs: &Series) -> Self::Output

Performs the - operation. Read more

Source §

impl TryExtend<RecordBatchT<Box<dyn Array>>> for DataFrame

Source §

fn try_extend<I: IntoIterator<Item = RecordBatchT<Box<dyn Array>>>>( &mut self, iter: I, ) -> PolarsResult<()>

Fallible version of Extend::extend.

Source §

impl TryExtend<Result<RecordBatchT<Box<dyn Array>>, PolarsError>> for DataFrame

Source §

fn try_extend<I: IntoIterator<Item = PolarsResult<RecordBatchT<Box<dyn Array>>>>>( &mut self, iter: I, ) -> PolarsResult<()>

Fallible version of Extend::extend.

Source §

impl TryFrom<(RecordBatchT<Box<dyn Array>>, &Schema<Field>)> for DataFrame

Source §

type Error = PolarsError

The type returned in the event of a conversion error.

Source §

fn try_from(arg: (RecordBatch, &ArrowSchema)) -> PolarsResult<DataFrame>

Performs the conversion.

Source §

impl TryFrom<StructArray> for DataFrame

Source §

type Error = PolarsError

The type returned in the event of a conversion error.

Source §

fn try_from(arr: StructArray) -> PolarsResult<Self>

Performs the conversion.

Auto Trait Implementations§

§

impl !Freeze for DataFrame

§

impl !RefUnwindSafe for DataFrame

§

impl Send for DataFrame

§

impl Sync for DataFrame

§

impl Unpin for DataFrame

§

impl !UnwindSafe for DataFrame

Blanket Implementations§

Source §

impl<T> Any for T
where T: 'static + ?Sized,

Source §

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more

Source §

impl<T> Borrow<T> for T
where T: ?Sized,

Source §

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more

Source §

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source §

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more

Source §

impl<T> CloneToUninit for T
where T: Clone,

Source §

unsafe fn clone_to_uninit(&self, dst: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)

Performs copy-assignment from self to dst. Read more

Source §

impl<T> DynClone for T
where T: Clone,

Source §

fn __clone_box(&self, _: Private) -> *mut ()

Source §

impl<T> From<T> for T

Source §

fn from(t: T) -> T

Returns the argument unchanged.

Source §

impl<T, U> Into<U> for T
where U: From<T>,

Source §

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source §

impl<T> IntoEither for T

Source §

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more

Source §

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more

§

impl<T> Pointable for T

§

const ALIGN: usize

The alignment of pointer.

§

type Init = T

The type for initializers.

§

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer. Read more

§

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer. Read more

§

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer. Read more

§

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer. Read more

§

impl<T> ToCompactString for T
where T: Display,

§

fn try_to_compact_string(&self) -> Result<CompactString, ToCompactStringError>

Fallible version of ToCompactString::to_compact_string(). Read more

§

fn to_compact_string(&self) -> CompactString

Converts the given value to a CompactString. Read more

Source §

impl<T> ToOwned for T
where T: Clone,

Source §

type Owned = T

The resulting type after obtaining ownership.

Source §

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more

Source §

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more

Source §

impl<T> ToString for T
where T: Display + ?Sized,

Source §

fn to_string(&self) -> String

Converts the given value to a String. Read more

Source §

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source §

type Error = Infallible

The type returned in the event of a conversion error.

Source §

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

Source §

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source §

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

Source §

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

§

fn vzip(self) -> V

Source §

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,

Source §

impl<T, Rhs, Output> NumOps<Rhs, Output> for T
where T: Sub<Rhs, Output = Output> + Mul<Rhs, Output = Output> + Div<Rhs, Output = Output> + Add<Rhs, Output = Output> + Rem<Rhs, Output = Output>,

Implementations§

impl DataFrame

pub fn to_ndarray<N>( &self, ordering: IndexOrder, ) -> PolarsResult<Array2<N::Native>>where N: PolarsNumericType,

impl DataFrame

pub fn sample_n( &self, n: &Series, with_replacement: bool, shuffle: bool, seed: Option<u64>, ) -> PolarsResult<Self>

pub fn sample_n_literal( &self, n: usize, with_replacement: bool, shuffle: bool, seed: Option<u64>, ) -> PolarsResult<Self>

pub fn sample_frac( &self, frac: &Series, with_replacement: bool, shuffle: bool, seed: Option<u64>, ) -> PolarsResult<Self>

impl DataFrame

pub fn split_chunks(&mut self) -> impl Iterator<Item = DataFrame> + '_

pub fn split_chunks_by_n(self, n: usize, parallel: bool) -> Vec<DataFrame>

impl DataFrame

pub fn explode_impl(&self, columns: Vec<Column>) -> PolarsResult<DataFrame>

pub fn explode<I, S>(&self, columns: I) -> PolarsResult<DataFrame>where I: IntoIterator<Item = S>, S: Into<PlSmallStr>,

§Example

impl DataFrame

pub fn group_by_with_series( &self, by: Vec<Column>, multithreaded: bool, sorted: bool, ) -> PolarsResult<GroupBy<'_>>

pub fn group_by<I, S>(&self, by: I) -> PolarsResult<GroupBy<'_>>where I: IntoIterator<Item = S>, S: Into<PlSmallStr>,

§Example

pub fn group_by_stable<I, S>(&self, by: I) -> PolarsResult<GroupBy<'_>>where I: IntoIterator<Item = S>, S: Into<PlSmallStr>,

impl DataFrame

pub unsafe fn hstack_mut_unchecked(&mut self, columns: &[Column]) -> &mut Self

§Safety

pub fn hstack_mut(&mut self, columns: &[Column]) -> PolarsResult<&mut Self>

§Example

impl DataFrame

pub fn get_row(&self, idx: usize) -> PolarsResult<Row<'_>>

pub fn get_row_amortized<'a>( &'a self, idx: usize, row: &mut Row<'a>, ) -> PolarsResult<()>

pub unsafe fn get_row_amortized_unchecked<'a>( &'a self, idx: usize, row: &mut Row<'a>, )

§Safety

pub fn from_rows_and_schema( rows: &[Row<'_>], schema: &Schema, ) -> PolarsResult<Self>

pub fn from_rows_iter_and_schema<'a, I>( rows: I, schema: &Schema, ) -> PolarsResult<Self>
where I: Iterator<Item = &'a Row<'a>>,

pub fn try_from_rows_iter_and_schema<'a, I>( rows: I, schema: &Schema, ) -> PolarsResult<Self>
where I: Iterator<Item = PolarsResult<&'a Row<'a>>>,

pub fn from_rows(rows: &[Row<'_>]) -> PolarsResult<Self>

impl DataFrame

pub fn transpose( &mut self, keep_names_as: Option<&str>, new_col_names: Option<Either<String, Vec<String>>>, ) -> PolarsResult<DataFrame>

pub fn transpose_impl( &mut self, keep_names_as: Option<&str>, new_col_names: Option<Either<PlSmallStr, Vec<PlSmallStr>>>, ) -> PolarsResult<DataFrame>

impl DataFrame

pub fn clear_schema(&mut self)

pub fn materialized_column_iter(&self) -> impl ExactSizeIterator<Item = &Series>

pub fn par_materialized_column_iter( &self, ) -> impl ParallelIterator<Item = &Series>

pub fn estimated_size(&self) -> usize

§Implementation

pub fn _apply_columns(&self, func: &dyn Fn(&Column) -> Column) -> Vec<Column>

pub fn _apply_columns_par( &self, func: &(dyn Fn(&Column) -> Column + Send + Sync), ) -> Vec<Column>

pub fn new(columns: Vec<Column>) -> PolarsResult<Self>

§Example

pub fn new_with_broadcast(columns: Vec<Column>) -> PolarsResult<Self>

pub fn new_with_broadcast_len( columns: Vec<Column>, broadcast_len: usize, ) -> PolarsResult<Self>

pub unsafe fn new_with_broadcast_no_namecheck( columns: Vec<Column>, broadcast_len: usize, ) -> PolarsResult<Self>

§Safety

pub const fn empty() -> Self

§Example

pub fn empty_with_schema(schema: &Schema) -> Self

pub fn empty_with_arrow_schema(schema: &ArrowSchema) -> Self

pub fn full_null(schema: &Schema, height: usize) -> Self

pub fn pop(&mut self) -> Option<Column>

§Example

pub fn with_row_index( &self, name: PlSmallStr, offset: Option<IdxSize>, ) -> PolarsResult<Self>

§Example

pub fn with_row_index_mut( &mut self, name: PlSmallStr, offset: Option<IdxSize>, ) -> &mut Self

pub unsafe fn new_no_checks_height_from_first(columns: Vec<Column>) -> DataFrame

§Safety

pub unsafe fn new_no_checks(height: usize, columns: Vec<Column>) -> DataFrame

§Safety

pub const unsafe fn _new_no_checks_impl( height: usize, columns: Vec<Column>, ) -> DataFrame

pub unsafe fn new_no_length_checks( columns: Vec<Column>, ) -> PolarsResult<DataFrame>

§Safety

pub fn shrink_to_fit(&mut self)

pub fn as_single_chunk(&mut self) -> &mut Self

pub fn as_single_chunk_par(&mut self) -> &mut Self

pub fn set_column_names<I, S>(&mut self, names: I) -> PolarsResult<()>
where I: IntoIterator<Item = S>, S: Into<PlSmallStr>,

pub fn drop_nulls<S>(&self, subset: Option<&[S]>) -> PolarsResult<Self>
where for<'a> &'a S: Into<PlSmallStr>,

pub fn drop_many<I, S>(&self, names: I) -> Self
where I: IntoIterator<Item = S>, S: Into<PlSmallStr>,