polars::datatypes

Enum ArrowDataType

pub enum ArrowDataType {
    Null,
    Boolean,
    Int8,
    Int16,
    Int32,
    Int64,
    UInt8,
    UInt16,
    UInt32,
    UInt64,
    Float16,
    Float32,
    Float64,
    Timestamp(TimeUnit, Option<PlSmallStr>),
    Date32,
    Date64,
    Time32(TimeUnit),
    Time64(TimeUnit),
    Duration(TimeUnit),
    Interval(IntervalUnit),
    Binary,
    FixedSizeBinary(usize),
    LargeBinary,
    Utf8,
    LargeUtf8,
    List(Box<Field>),
    FixedSizeList(Box<Field>, usize),
    LargeList(Box<Field>),
    Struct(Vec<Field>),
    Union(Vec<Field>, Option<Vec<i32>>, UnionMode),
    Map(Box<Field>, bool),
    Dictionary(IntegerType, Box<ArrowDataType>, bool),
    Decimal(usize, usize),
    Decimal256(usize, usize),
    Extension(PlSmallStr, Box<ArrowDataType>, Option<PlSmallStr>),
    BinaryView,
    Utf8View,
    Unknown,
}

The set of supported logical types in this crate.

Each variant uniquely identifies a logical type, which defines specific semantics for the data (e.g. how it should be represented). Each variant has a corresponding PhysicalType, obtained via ArrowDataType::to_physical_type, which declares the in-memory representation of the data. ArrowDataType::Extension is special in that it augments an ArrowDataType with metadata to support custom types. Use to_logical_type to desugar such a type and return its corresponding logical type.
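
The split between logical and physical types can be illustrated with a small stand-in model. The sketch below is not the crate's API: `Logical`, `Physical`, and `to_physical` are hypothetical names, and only a handful of variants are modelled, using the integer widths stated in the variant descriptions on this page. It shows that several logical types may share one in-memory representation.

```rust
// Toy model only: these enums are hypothetical stand-ins, not the
// crate's ArrowDataType or PhysicalType.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Logical {
    Int32,
    Date32,
    Time32,
    Int64,
    Date64,
    Timestamp,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Physical {
    Int32,
    Int64,
}

// Maps each logical type to the integer width that backs it in memory.
fn to_physical(logical: Logical) -> Physical {
    match logical {
        // Days since the epoch and 32-bit times are both stored as i32.
        Logical::Int32 | Logical::Date32 | Logical::Time32 => Physical::Int32,
        // Millisecond dates and timestamps are both stored as i64.
        Logical::Int64 | Logical::Date64 | Logical::Timestamp => Physical::Int64,
    }
}
```

In this model, `Date32` and `Int32` differ only in how their values are interpreted, not in how they are stored.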

Variants§

§

Null

Null type

§

Boolean

true and false.

§

Int8

An i8

§

Int16

An i16

§

Int32

An i32

§

Int64

An i64

§

UInt8

A u8

§

UInt16

A u16

§

UInt32

A u32

§

UInt64

A u64

§

Float16

A 16-bit float

§

Float32

An f32

§

Float64

An f64

§

Timestamp(TimeUnit, Option<PlSmallStr>)

An i64 representing a timestamp measured in TimeUnit with an optional timezone.

Time is measured from the Unix epoch (00:00:00.000 on 1 January 1970), excluding leap seconds, as a 64-bit signed integer counting elapsed units of the given TimeUnit.

The time zone is a string indicating the name of a time zone, one of:

  • As used in the Olson time zone database (the “tz database” or “tzdata”), such as “America/New_York”
  • An absolute time zone offset of the form +XX:XX or -XX:XX, such as +07:30

When the timezone is not specified, the timestamp is considered to have no timezone and is represented as is.
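
The two accepted time zone forms can be told apart by shape alone. The following is a minimal sketch under that assumption; `is_fixed_offset` and `offset_seconds` are hypothetical helpers, not functions of this crate, and they do not check that hours and minutes are in a sensible range. Any string that is not of the form +XX:XX or -XX:XX would have to be looked up in the tz database instead.

```rust
/// Returns true when `tz` has the fixed-offset shape `+XX:XX` or `-XX:XX`.
fn is_fixed_offset(tz: &str) -> bool {
    let b = tz.as_bytes();
    b.len() == 6
        && (b[0] == b'+' || b[0] == b'-')
        && b[1].is_ascii_digit()
        && b[2].is_ascii_digit()
        && b[3] == b':'
        && b[4].is_ascii_digit()
        && b[5].is_ascii_digit()
}

/// Parses a fixed offset into signed seconds east of UTC.
/// Returns None for anything else, such as a tz database name.
fn offset_seconds(tz: &str) -> Option<i32> {
    if !is_fixed_offset(tz) {
        return None;
    }
    let sign = if tz.starts_with('-') { -1 } else { 1 };
    // Slicing by byte index is safe here: the shape check guarantees ASCII.
    let hours: i32 = tz[1..3].parse().ok()?;
    let minutes: i32 = tz[4..6].parse().ok()?;
    Some(sign * (hours * 3600 + minutes * 60))
}
```

With these helpers, `offset_seconds("+07:30")` yields `Some(27000)`, while `offset_seconds("America/New_York")` yields `None`.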

§

Date32

An i32 representing the elapsed time since UNIX epoch (1970-01-01) in days.

§

Date64

An i64 representing the elapsed time since UNIX epoch (1970-01-01) in milliseconds. Values are evenly divisible by 86400000.
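
The relationship between Date32 and Date64 values follows directly from the two descriptions: one day is 86400000 milliseconds. The sketch below uses hypothetical helper names (`date32_to_date64`, `is_valid_date64`, `date64_to_date32`) that are not part of this crate.

```rust
use std::convert::TryFrom;

const MS_PER_DAY: i64 = 86_400_000;

/// Converts days since the UNIX epoch into milliseconds since the UNIX epoch.
fn date32_to_date64(days: i32) -> i64 {
    days as i64 * MS_PER_DAY
}

/// A Date64 value is expected to fall exactly on a day boundary.
fn is_valid_date64(ms: i64) -> bool {
    ms % MS_PER_DAY == 0
}

/// Converts back to days; None if `ms` is not a whole number of days
/// or the day count does not fit in an i32.
fn date64_to_date32(ms: i64) -> Option<i32> {
    if !is_valid_date64(ms) {
        return None;
    }
    i32::try_from(ms / MS_PER_DAY).ok()
}
```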

§

Time32(TimeUnit)

A 32-bit time representing the elapsed time since midnight in the unit of TimeUnit. Only TimeUnit::Second and TimeUnit::Millisecond are supported on this variant.

§

Time64(TimeUnit)

A 64-bit time representing the elapsed time since midnight in the unit of TimeUnit. Only TimeUnit::Microsecond and TimeUnit::Nanosecond are supported on this variant.
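
The unit restrictions on Time32 and Time64 can be expressed as simple predicates. This is a sketch with a hypothetical `Unit` enum standing in for TimeUnit; none of the function names below come from this crate. As a side observation (an arithmetic fact, not a statement of the crate's rationale), the number of microseconds in a day does not fit in an i32, whereas the number of milliseconds does.

```rust
// Hypothetical stand-in for the crate's TimeUnit.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Unit {
    Second,
    Millisecond,
    Microsecond,
    Nanosecond,
}

/// Units that the Time32 description lists as supported.
fn valid_for_time32(unit: Unit) -> bool {
    matches!(unit, Unit::Second | Unit::Millisecond)
}

/// Units that the Time64 description lists as supported.
fn valid_for_time64(unit: Unit) -> bool {
    matches!(unit, Unit::Microsecond | Unit::Nanosecond)
}

/// Number of ticks of `unit` in one 24-hour day.
fn ticks_per_day(unit: Unit) -> i64 {
    let per_second: i64 = match unit {
        Unit::Second => 1,
        Unit::Millisecond => 1_000,
        Unit::Microsecond => 1_000_000,
        Unit::Nanosecond => 1_000_000_000,
    };
    86_400 * per_second
}
```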

§

Duration(TimeUnit)

Measure of elapsed time. This elapsed time is a physical duration (i.e. 1s as defined in S.I.).

§

Interval(IntervalUnit)

A “calendar” interval modeling elapsed time that takes into account calendar shifts. For example an interval of 1 day may represent more than 24 hours.

§

Binary

Opaque binary data of variable length whose offsets are represented as i32.

§

FixedSizeBinary(usize)

Opaque binary data of fixed size. Enum parameter specifies the number of bytes per value.

§

LargeBinary

Opaque binary data of variable length whose offsets are represented as i64.

§

Utf8

A variable-length UTF-8 encoded string whose offsets are represented as i32.

§

LargeUtf8

A variable-length UTF-8 encoded string whose offsets are represented as i64.
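
The phrase "whose offsets are represented as i32" refers to an offsets buffer that delimits each value inside one contiguous byte buffer. The sketch below models that layout with a hypothetical `Utf8Column` struct; it is not the crate's array type, and it leaves out the validity bitmap that real arrays use for nulls. The i64 variants would differ only in the offset width.

```rust
// Toy model of a variable-length string column; not the crate's Utf8Array.
struct Utf8Column {
    // offsets[i]..offsets[i + 1] is the byte range of value i.
    offsets: Vec<i32>,
    // All string bytes, stored back to back.
    values: Vec<u8>,
}

impl Utf8Column {
    fn from_strs(items: &[&str]) -> Self {
        let mut offsets = vec![0i32];
        let mut values = Vec::new();
        for s in items {
            values.extend_from_slice(s.as_bytes());
            // A real implementation must reject totals that overflow i32;
            // this cast silently wraps and is only acceptable in a sketch.
            offsets.push(values.len() as i32);
        }
        Utf8Column { offsets, values }
    }

    fn len(&self) -> usize {
        self.offsets.len() - 1
    }

    fn get(&self, i: usize) -> Option<&str> {
        let start = *self.offsets.get(i)? as usize;
        let end = *self.offsets.get(i + 1)? as usize;
        std::str::from_utf8(&self.values[start..end]).ok()
    }
}
```

Because the offsets are i32, a single column of this kind can address at most about 2 GiB of string bytes, which is the practical reason the i64 variants exist.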

§

List(Box<Field>)

A list of some logical data type whose offsets are represented as i32.

§

FixedSizeList(Box<Field>, usize)

A list of some logical data type with a fixed number of elements.
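
With a fixed number of elements per list, no offsets buffer is needed: the position of each list in the child values can be computed. The helper below is a hypothetical sketch (`fixed_size_list_get` is not a function of this crate) over a plain slice of child values.

```rust
/// Returns list `i` of a fixed-size-list column whose child values are
/// stored back to back, `size` elements per list.
fn fixed_size_list_get<T>(child: &[T], size: usize, i: usize) -> Option<&[T]> {
    // Checked arithmetic avoids overflow on absurd indices.
    let start = i.checked_mul(size)?;
    let end = start.checked_add(size)?;
    // `get` returns None when the range runs past the child values.
    child.get(start..end)
}
```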

§

LargeList(Box<Field>)

A list of some logical data type whose offsets are represented as i64.

§

Struct(Vec<Field>)

A nested ArrowDataType with a given number of Fields.

§

Union(Vec<Field>, Option<Vec<i32>>, UnionMode)

A nested datatype that can represent slots of differing types. The third argument is the UnionMode.

§

Map(Box<Field>, bool)

A nested type that is represented as

List<entries: Struct<key: K, value: V>>

In this layout, the keys and values are each respectively contiguous. We do not constrain the key and value types, so the application is responsible for ensuring that the keys are hashable and unique. Whether the keys are sorted may be set in the metadata for this field.

In a field with Map type, the field has a child Struct field, which then has two children: the first is the key type and the second is the value type. The names of the child fields may be respectively “entries”, “key”, and “value”, but this is not enforced.

Map

  - child[0] entries: Struct
    - child[0] key: K
    - child[1] value: V

Neither the “entries” field nor the “key” field may be nullable.

The metadata is structured so that Arrow systems without special handling for Map can make Map an alias for List. The “layout” attribute for the Map field must have the same contents as a List.

  • Field
  • ordered
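
The List<entries: Struct<key: K, value: V>> layout described above can be modelled with one offsets buffer over two contiguous child buffers. The sketch below fixes K to String and V to i64 for brevity; `MapColumn` and `entries` are hypothetical names, not this crate's types, and no uniqueness or ordering of keys is enforced, which matches the note that the application is responsible for that.

```rust
// Toy model of a Map column: a list of (key, value) entries per row.
struct MapColumn {
    // offsets[row]..offsets[row + 1] is the entry range of that row.
    offsets: Vec<i32>,
    // Keys of all rows, contiguous.
    keys: Vec<String>,
    // Values of all rows, contiguous and aligned with `keys`.
    values: Vec<i64>,
}

impl MapColumn {
    /// Returns the (key, value) pairs of one row, or None if out of range.
    fn entries(&self, row: usize) -> Option<Vec<(&str, i64)>> {
        let start = *self.offsets.get(row)? as usize;
        let end = *self.offsets.get(row + 1)? as usize;
        Some(
            self.keys[start..end]
                .iter()
                .map(|k| k.as_str())
                .zip(self.values[start..end].iter().copied())
                .collect(),
        )
    }
}
```

Dropping the key/value interpretation and reading the same offsets over a struct child is what allows a system without Map support to treat the column as a plain List.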
§

Dictionary(IntegerType, Box<ArrowDataType>, bool)

A dictionary encoded array (key_type, value_type), where each array element is an index of key_type into an associated dictionary of value_type.

Dictionary arrays are used to store columns of value_type that contain many repeated values using less memory, but with a higher CPU overhead for some operations.

This type is mostly used to represent low cardinality string arrays or a limited set of primitive types as integers.

The bool value indicates the Dictionary is sorted if set to true.
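
Dictionary encoding can be sketched in a few lines: each distinct value is stored once, and the column itself holds integer keys into that store. The functions below are hypothetical illustrations using std collections, not this crate's dictionary arrays; the key type is fixed to u32 and the values to strings.

```rust
use std::collections::HashMap;

/// Encodes a column as (keys, dictionary values).
/// Dictionary values appear in order of first occurrence.
fn dictionary_encode(column: &[&str]) -> (Vec<u32>, Vec<String>) {
    let mut lookup: HashMap<&str, u32> = HashMap::new();
    let mut values: Vec<String> = Vec::new();
    let mut keys = Vec::with_capacity(column.len());
    for &item in column {
        // Reuse the existing key, or append the value and assign a new key.
        let key = *lookup.entry(item).or_insert_with(|| {
            values.push(item.to_string());
            (values.len() - 1) as u32
        });
        keys.push(key);
    }
    (keys, values)
}

/// Decodes keys back into the values they refer to.
fn dictionary_decode<'a>(keys: &[u32], values: &'a [String]) -> Vec<&'a str> {
    keys.iter().map(|&k| values[k as usize].as_str()).collect()
}
```

The memory saving comes from storing a small integer per row instead of a full value; the CPU overhead mentioned above is the extra lookup on decode.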

§

Decimal(usize, usize)

Decimal value with precision and scale. Precision is the number of digits in the number and scale is the number of decimal places. For example, the number 999.99 has a precision of 5 and a scale of 2.
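
Precision and scale are easiest to see on an unscaled integer. The sketch below assumes the decimal is held as an i128 whose last `scale` digits are the fractional part; `format_decimal` and `fits_precision` are hypothetical helpers, not functions of this crate.

```rust
/// Renders an unscaled integer as a decimal string with `scale` decimal places.
fn format_decimal(unscaled: i128, scale: usize) -> String {
    let sign = if unscaled < 0 { "-" } else { "" };
    let digits = unscaled.unsigned_abs().to_string();
    // Left-pad with zeros so there is at least one digit before the point.
    let padded = format!("{:0>width$}", digits, width = scale + 1);
    let split = padded.len() - scale;
    if scale == 0 {
        format!("{}{}", sign, padded)
    } else {
        format!("{}{}.{}", sign, &padded[..split], &padded[split..])
    }
}

/// Checks that the value has at most `precision` digits in total.
fn fits_precision(unscaled: i128, precision: usize) -> bool {
    unscaled.unsigned_abs().to_string().len() <= precision
}
```

Under this model, 999.99 is the unscaled integer 99999 with a scale of 2, and it uses all 5 digits of a precision-5 decimal.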

§

Decimal256(usize, usize)

Decimal backed by 256 bits

§

Extension(PlSmallStr, Box<ArrowDataType>, Option<PlSmallStr>)

Extension type.

  • name
  • physical type
  • metadata
§

BinaryView

A binary type that inlines small values and can intern bytes.

§

Utf8View

A string type that inlines small values and can intern strings.
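
What "inlines small values" means can be sketched as a per-value view that either carries the bytes itself or points into a shared data buffer. The model below follows the view layout described in the Arrow columnar format; the 12-byte inline threshold, the field names, and the `View` and `make_view` names are assumptions for illustration and are not stated on this page or taken from this crate's API.

```rust
// Assumed threshold: values up to this many bytes are stored in the view.
const INLINE_LIMIT: usize = 12;

// Toy model of a single view; not the crate's view type.
#[derive(Debug, PartialEq)]
enum View {
    // Short value: the bytes live inside the view, zero padded.
    Inline { len: u32, data: [u8; 12] },
    // Long value: a 4-byte prefix plus a location in a shared buffer.
    Reference {
        len: u32,
        prefix: [u8; 4],
        buffer_index: u32,
        offset: u32,
    },
}

/// Builds the view for `value`, assuming that long values have already
/// been written to buffer `buffer_index` at byte position `offset`.
fn make_view(value: &[u8], buffer_index: u32, offset: u32) -> View {
    let len = value.len() as u32;
    if value.len() <= INLINE_LIMIT {
        let mut data = [0u8; 12];
        data[..value.len()].copy_from_slice(value);
        View::Inline { len, data }
    } else {
        let mut prefix = [0u8; 4];
        prefix.copy_from_slice(&value[..4]);
        View::Reference {
            len,
            prefix,
            buffer_index,
            offset,
        }
    }
}
```

Since several views may point at the same buffer range, identical long values can share storage, which is one reading of "can intern" in the descriptions above.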

§

Unknown

A type unknown to Arrow.

Implementations§

§

impl ArrowDataType

pub fn to_physical_type(&self) -> PhysicalType

Returns the PhysicalType of this ArrowDataType.

pub fn underlying_physical_type(&self) -> ArrowDataType

pub fn to_logical_type(&self) -> &ArrowDataType

Returns &self for all but ArrowDataType::Extension. For ArrowDataType::Extension, (recursively) returns the inner ArrowDataType. Never returns the variant ArrowDataType::Extension.
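
The recursive unwrapping described for to_logical_type can be sketched on a reduced enum. `MiniType` below is a hypothetical stand-in with only three variants, and its `Extension` uses String where the real variant uses PlSmallStr; the method mirrors the documented behaviour but is not the crate's implementation.

```rust
// Reduced stand-in for ArrowDataType; not the crate's definition.
#[derive(Debug, Clone, PartialEq)]
enum MiniType {
    Int32,
    Utf8,
    // (name, inner type, optional metadata)
    Extension(String, Box<MiniType>, Option<String>),
}

impl MiniType {
    /// Strips every Extension wrapper and returns the first
    /// non-extension type, so the result is never an Extension.
    fn to_logical_type(&self) -> &MiniType {
        let mut current = self;
        while let MiniType::Extension(_, inner, _) = current {
            current = &**inner;
        }
        current
    }
}
```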

pub fn inner_dtype(&self) -> Option<&ArrowDataType>

pub fn is_nested(&self) -> bool

pub fn is_view(&self) -> bool

pub fn is_numeric(&self) -> bool

pub fn to_fixed_size_list(self, size: usize, is_nullable: bool) -> ArrowDataType

pub fn contains_dictionary(&self) -> bool

Check (recursively) whether datatype contains an ArrowDataType::Dictionary type.
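
A recursive check of this kind walks the nested children of a type until it finds a dictionary. The sketch below uses a hypothetical `ToyType` enum with a few nested variants; it illustrates the traversal only and does not reproduce which variants the crate's method actually descends into.

```rust
// Reduced stand-in for ArrowDataType; not the crate's definition.
enum ToyType {
    Int32,
    Utf8,
    List(Box<ToyType>),
    Struct(Vec<ToyType>),
    Dictionary(Box<ToyType>),
}

impl ToyType {
    /// Returns true if this type or any nested child is a Dictionary.
    fn contains_dictionary(&self) -> bool {
        match self {
            ToyType::Dictionary(_) => true,
            ToyType::List(inner) => inner.contains_dictionary(),
            ToyType::Struct(fields) => fields.iter().any(|f| f.contains_dictionary()),
            ToyType::Int32 | ToyType::Utf8 => false,
        }
    }
}
```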

Trait Implementations§

§

impl Clone for ArrowDataType

§

fn clone(&self) -> ArrowDataType

Returns a copy of the value.
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.
§

impl Debug for ArrowDataType

§

fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error>

Formats the value using the given formatter.
§

impl Default for ArrowDataType

§

fn default() -> ArrowDataType

Returns the “default value” for a type.
§

impl<'de> Deserialize<'de> for ArrowDataType

§

fn deserialize<__D>( __deserializer: __D, ) -> Result<ArrowDataType, <__D as Deserializer<'de>>::Error>
where __D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer.
Source§

impl From<&ArrowDataType> for DataType

Source§

fn from(dt: &ArrowDataType) -> DataType

Converts to this type from the input type.
§

impl<T> From<ArrowDataType> for MutablePrimitiveArray<T>
where T: NativeType,

§

fn from(dtype: ArrowDataType) -> MutablePrimitiveArray<T>

Converts to this type from the input type.
§

impl From<IntegerType> for ArrowDataType

§

fn from(item: IntegerType) -> ArrowDataType

Converts to this type from the input type.
§

impl From<PrimitiveType> for ArrowDataType

§

fn from(item: PrimitiveType) -> ArrowDataType

Converts to this type from the input type.
§

impl Hash for ArrowDataType

§

fn hash<__H>(&self, state: &mut __H)
where __H: Hasher,

Feeds this value into the given Hasher.
1.3.0 · Source§

fn hash_slice<H>(data: &[Self], state: &mut H)
where H: Hasher, Self: Sized,

Feeds a slice of this type into the given Hasher.
Source§

impl PartialEq<ArrowDataType> for DataType

Source§

fn eq(&self, other: &ArrowDataType) -> bool

Tests for self and other values to be equal, and is used by ==.
1.0.0 · Source§

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.
§

impl PartialEq for ArrowDataType

§

fn eq(&self, other: &ArrowDataType) -> bool

Tests for self and other values to be equal, and is used by ==.
1.0.0 · Source§

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.
§

impl Serialize for ArrowDataType

§

fn serialize<__S>( &self, __serializer: __S, ) -> Result<<__S as Serializer>::Ok, <__S as Serializer>::Error>
where __S: Serializer,

Serialize this value into the given Serde serializer.
§

impl Eq for ArrowDataType

§

impl StructuralPartialEq for ArrowDataType

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dst: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dst.
Source§

impl<T> DynClone for T
where T: Clone,

Source§

fn __clone_box(&self, _: Private) -> *mut ()

§

impl<Q, K> Equivalent<K> for Q
where Q: Eq + ?Sized, K: Borrow<Q> + ?Sized,

§

fn equivalent(&self, key: &K) -> bool

Compare self to key and return true if they are equal.
§

impl<Q, K> Equivalent<K> for Q
where Q: Eq + ?Sized, K: Borrow<Q> + ?Sized,

§

fn equivalent(&self, key: &K) -> bool

Checks if this value is equivalent to the given key.
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

§

impl<T> Instrument for T

§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.
§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.
Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> IntoEither for T

Source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
Source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
§

impl<T> Pointable for T

§

const ALIGN: usize = _

The alignment of pointer.
§

type Init = T

The type for initializers.
§

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a value with the given initializer.
§

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.
§

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.
§

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.
Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

§

fn vzip(self) -> V

§

impl<T> WithSubscriber for T

§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.
§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.
Source§

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,

§

impl<T> ErasedDestructor for T
where T: 'static,

§

impl<T> MaybeSendSync for T