Schema

class polars.Schema(schema: Mapping[str, SchemaInitDataType] | Iterable[tuple[str, SchemaInitDataType]] | None = None, *, check_dtypes: bool = True)

Ordered mapping of column names to their data type.

Parameters:

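schema: The schema definition given by column names and their associated Polars data type. Accepts a mapping or an iterable of tuples.

check_dtypes: Whether to validate the given data types when the schema is constructed. Defaults to True.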

Examples

Define a schema by passing instantiated data types.

>>> schema = pl.Schema(
...     {
...         "foo": pl.String(),
...         "bar": pl.Duration("us"),
...         "baz": pl.Array(pl.Int8, 4),
...     }
... )
>>> schema
Schema({'foo': String, 'bar': Duration(time_unit='us'), 'baz': Array(Int8, shape=(4,))})

Access the data type associated with a specific column name.

>>> schema["baz"]
Array(Int8, shape=(4,))

Access various schema properties using the names, dtypes, and len methods.

>>> schema.names()
['foo', 'bar', 'baz']
>>> schema.dtypes()
[String, Duration(time_unit='us'), Array(Int8, shape=(4,))]
>>> schema.len()
3

Methods:

`dtypes`	Get the data types of the schema.
`len`	Get the number of schema entries.
`names`	Get the column names of the schema.
`to_frame`	Create an empty DataFrame (or LazyFrame) from this Schema.
`to_python`	Return a dictionary of column names and Python types.

dtypes() → list[DataType]

Get the data types of the schema.

Examples

>>> s = pl.Schema({"x": pl.UInt8(), "y": pl.List(pl.UInt8)})
>>> s.dtypes()
[UInt8, List(UInt8)]

len() → int

Get the number of schema entries.

Examples

>>> s = pl.Schema({"x": pl.Int32(), "y": pl.List(pl.String)})
>>> s.len()
2
>>> len(s)
2

names() → list[str]

Get the column names of the schema.

Examples

>>> s = pl.Schema({"x": pl.Float64(), "y": pl.Datetime(time_zone="UTC")})
>>> s.names()
['x', 'y']

to_frame(*, eager: bool = True) → DataFrame | LazyFrame

Create an empty DataFrame (or LazyFrame) from this Schema.

Parameters:

eager: If True, create a DataFrame; otherwise, create a LazyFrame.

Examples

>>> s = pl.Schema({"x": pl.Int32(), "y": pl.String()})
>>> s.to_frame()
shape: (0, 2)
┌─────┬─────┐
│ x   ┆ y   │
│ --- ┆ --- │
│ i32 ┆ str │
╞═════╪═════╡
└─────┴─────┘
>>> s.to_frame(eager=False)
<LazyFrame at 0x11BC0AD80>

to_python() → dict[str, type]

Return a dictionary of column names and Python types.

Examples

>>> s = pl.Schema(
...     {
...         "x": pl.Int8(),
...         "y": pl.String(),
...         "z": pl.Duration("us"),
...     }
... )
>>> s.to_python()
{'x': <class 'int'>, 'y': <class 'str'>, 'z': <class 'datetime.timedelta'>}