Schema

class polars.Schema(
schema: Mapping[str, SchemaInitDataType] | Iterable[tuple[str, SchemaInitDataType] | ArrowSchemaExportable] | ArrowSchemaExportable | None = None,
*,
check_dtypes: bool = True,
)

Ordered mapping of column names to their data type.

Parameters:
schema

The schema definition, given as column names and their associated Polars data types. Accepts a mapping, an iterable of (name, dtype) tuples, or any object implementing the __arrow_c_schema__ PyCapsule interface (e.g. a pyarrow schema).

Examples

Define a schema by passing instantiated data types.

>>> schema = pl.Schema(
...     {
...         "foo": pl.String(),
...         "bar": pl.Duration("us"),
...         "baz": pl.Array(pl.Int8, 4),
...     }
... )
>>> schema
Schema({'foo': String, 'bar': Duration(time_unit='us'), 'baz': Array(Int8, shape=(4,))})

Access the data type associated with a specific column name.

>>> schema["baz"]
Array(Int8, shape=(4,))

Access various schema properties using the names, dtypes, and len methods.

>>> schema.names()
['foo', 'bar', 'baz']
>>> schema.dtypes()
[String, Duration(time_unit='us'), Array(Int8, shape=(4,))]
>>> schema.len()
3

Import a pyarrow schema.

>>> import pyarrow as pa
>>> pl.Schema(pa.schema([pa.field("x", pa.int32())]))
Schema({'x': Int32})

Export a schema to pyarrow.

>>> pa.schema(pl.Schema({"x": pl.Int32}))
x: int32

Methods:

contains_dtype

Check if the schema contains the given data type.

dtypes

Get the data types of the schema.

len

Get the number of schema entries.

names

Get the column names of the schema.

to_arrow

Convert the schema to a pyarrow schema.

to_frame

Create an empty DataFrame (or LazyFrame) from this Schema.

to_python

Return a dictionary of column names and Python types.

contains_dtype(dtype: DataType, *, recursive: bool) -> bool

Check if the schema contains the given data type.

Parameters:
dtype

The data type to search for.

recursive

If False, only check top-level column dtypes. If True, also search within nested types (List, Array, Struct).

Examples

>>> s = pl.Schema({"x": pl.Int64(), "y": pl.List(pl.Float64)})
>>> s.contains_dtype(pl.Int64, recursive=False)
True
>>> s.contains_dtype(pl.Float64, recursive=False)
False
>>> s.contains_dtype(pl.Float64, recursive=True)
True

dtypes() -> list[DataType]

Get the data types of the schema.

Examples

>>> s = pl.Schema({"x": pl.UInt8(), "y": pl.List(pl.UInt8)})
>>> s.dtypes()
[UInt8, List(UInt8)]

len() -> int

Get the number of schema entries.

Examples

>>> s = pl.Schema({"x": pl.Int32(), "y": pl.List(pl.String)})
>>> s.len()
2
>>> len(s)
2

names() -> list[str]

Get the column names of the schema.

Examples

>>> s = pl.Schema({"x": pl.Float64(), "y": pl.Datetime(time_zone="UTC")})
>>> s.names()
['x', 'y']

to_arrow(
*,
compat_level: CompatLevel | None = None,
) -> pyarrow.Schema

Convert the schema to a pyarrow schema.

Parameters:
compat_level

Use a specific compatibility level when exporting Polars’ internal data types.

Examples

>>> pl.Schema({"x": pl.String}).to_arrow()
x: string_view

to_frame(*, eager: bool = True) -> DataFrame | LazyFrame

Create an empty DataFrame (or LazyFrame) from this Schema.

Parameters:
eager

If True, create a DataFrame; otherwise, create a LazyFrame.

Examples

>>> s = pl.Schema({"x": pl.Int32(), "y": pl.String()})
>>> s.to_frame()
shape: (0, 2)
┌─────┬─────┐
│ x   ┆ y   │
│ --- ┆ --- │
│ i32 ┆ str │
╞═════╪═════╡
└─────┴─────┘
>>> s.to_frame(eager=False)
<LazyFrame at 0x11BC0AD80>

to_python() -> dict[str, type]

Return a dictionary of column names and Python types.

Examples

>>> s = pl.Schema(
...     {
...         "x": pl.Int8(),
...         "y": pl.String(),
...         "z": pl.Duration("us"),
...     }
... )
>>> s.to_python()
{'x': <class 'int'>, 'y': <class 'str'>, 'z': <class 'datetime.timedelta'>}