polars.json_normalize

polars.json_normalize(
data: dict[Any, Any] | Sequence[dict[Any, Any] | Any],
*,
separator: str = '.',
max_level: int | None = None,
schema: Schema | None = None,
strict: bool = True,
infer_schema_length: int | None = 100,
) -> DataFrame

Normalize semi-structured deserialized JSON data into a flat table.

Dictionary objects that are not unnested/normalized are encoded as JSON strings. Unlike its pandas counterpart, this function will not encode dictionaries as objects at any level.

Warning

This functionality is considered unstable. It may be changed at any point without it being considered a breaking change.

Parameters:
data

Deserialized JSON objects.

separator

Nested records generate column names joined by separator, e.g., for separator=".", {"foo": {"bar": 0}} -> foo.bar.

max_level

Maximum number of levels (depth of dict) to normalize. If None, normalizes all levels.

schema

Overwrite the Schema when the normalized data is passed to the DataFrame constructor.

strict

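Whether Polars should be strict when constructing the DataFrame, i.e. raise an error if a value does not exactly match the given or inferred data type of its column. If False, mismatched values are cast to that data type, or set to null if the cast is not possible.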

infer_schema_length

Number of rows to take into consideration to determine the schema.
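
The naming and depth rules described for separator and max_level can be sketched in plain Python. This is only an illustration of the documented behaviour, not Polars's actual implementation: flatten_record is a hypothetical helper, and the use of json.dumps for the leftover dictionaries is an assumption based on the statement that un-normalized dictionaries are encoded as JSON strings.

```python
from __future__ import annotations

import json
from typing import Any


def flatten_record(
    record: dict[str, Any],
    separator: str = ".",
    max_level: int | None = None,
    _prefix: str = "",
    _level: int = 0,
) -> dict[str, Any]:
    """Flatten one nested dict (hypothetical helper, for illustration only)."""
    flat: dict[str, Any] = {}
    for key, value in record.items():
        # Column name: parent path and key joined by the separator.
        name = f"{_prefix}{separator}{key}" if _prefix else str(key)
        if isinstance(value, dict):
            if max_level is None or _level < max_level:
                # Still within the allowed depth: unnest one more level.
                flat.update(
                    flatten_record(value, separator, max_level, name, _level + 1)
                )
            else:
                # Depth limit reached: keep the dict as a JSON string.
                flat[name] = json.dumps(value)
        else:
            flat[name] = value
    return flat


record = {"id": 1, "name": "Cole Volk", "fitness": {"height": 130, "weight": 60}}

fully_flat = flatten_record(record)
not_flattened = flatten_record(record, max_level=0)
one_level = flatten_record({"a": {"b": {"c": 1}}}, separator="_", max_level=1)

print(fully_flat)     # keys: id, name, fitness.height, fitness.weight
print(not_flattened)  # "fitness" stays a single JSON-string value
print(one_level)      # {'a_b': '{"c": 1}'}
```

With max_level=0 nothing is unnested, so each nested dictionary ends up in one string column; with max_level=None the recursion continues until no dictionaries remain.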

Examples

>>> data = [
...     {
...         "id": 1,
...         "name": "Cole Volk",
...         "fitness": {"height": 130, "weight": 60},
...     },
...     {"name": "Mark Reg", "fitness": {"height": 130, "weight": 60}},
...     {
...         "id": 2,
...         "name": "Faye Raker",
...         "fitness": {"height": 130, "weight": 60},
...     },
... ]
>>> pl.json_normalize(data, max_level=1)
shape: (3, 4)
┌──────┬────────────┬────────────────┬────────────────┐
│ id   ┆ name       ┆ fitness.height ┆ fitness.weight │
│ ---  ┆ ---        ┆ ---            ┆ ---            │
│ i64  ┆ str        ┆ i64            ┆ i64            │
╞══════╪════════════╪════════════════╪════════════════╡
│ 1    ┆ Cole Volk  ┆ 130            ┆ 60             │
│ null ┆ Mark Reg   ┆ 130            ┆ 60             │
│ 2    ┆ Faye Raker ┆ 130            ┆ 60             │
└──────┴────────────┴────────────────┴────────────────┘
>>> pl.json_normalize(data, max_level=0)
shape: (3, 3)
┌──────┬────────────┬───────────────────────────────┐
│ id   ┆ name       ┆ fitness                       │
│ ---  ┆ ---        ┆ ---                           │
│ i64  ┆ str        ┆ str                           │
╞══════╪════════════╪═══════════════════════════════╡
│ 1    ┆ Cole Volk  ┆ {"height": 130, "weight": 60} │
│ null ┆ Mark Reg   ┆ {"height": 130, "weight": 60} │
│ 2    ┆ Faye Raker ┆ {"height": 130, "weight": 60} │
└──────┴────────────┴───────────────────────────────┘