Struct FileMetadata
pub struct FileMetadata {
    pub version: i32,
    pub num_rows: usize,
    pub max_row_group_height: usize,
    pub created_by: Option<String>,
    pub row_groups: Vec<RowGroupMetadata>,
    pub key_value_metadata: Option<Vec<KeyValue>>,
    pub schema_descr: SchemaDescriptor,
    pub column_orders: Option<Vec<ColumnOrder>>,
    pub footer_buf: Buffer<u8>,
}
Available on crate feature polars-io only.
Metadata for a Parquet file.
Fields
version: i32
Version of this file.
num_rows: usize
Number of rows in the file.
max_row_group_height: usize
Maximum row group height, useful for sharing column materializations.
created_by: Option<String>
String message for the application that wrote this file.
This should have the following format:
<application> version <application version> (build <application build hash>).
Example: parquet-mr version 1.8.0 (build 0fda28af84b9746396014ad6a415b90592a98b3b)
row_groups: Vec<RowGroupMetadata>
The row groups of this file.
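The created_by format described above can be split into its parts with plain string operations. The following is a minimal sketch, not part of this crate: `parse_created_by` is a hypothetical helper, and treating the build suffix as optional is an assumption of the sketch.

```rust
/// Hypothetical helper (not part of the crate): splits a `created_by` string of
/// the form `<application> version <application version> (build <hash>)`.
/// Returns `(application, version, build_hash)`; the build part is treated as
/// optional, which is an assumption of this sketch.
pub fn parse_created_by(s: &str) -> Option<(&str, &str, Option<&str>)> {
    // Everything before the first " version " is the application name.
    let (app, rest) = s.split_once(" version ")?;
    let rest = rest.trim();
    match rest.split_once(" (build ") {
        Some((version, build)) => {
            // The build hash must be closed by a trailing ')'.
            let build = build.strip_suffix(')')?;
            Some((app, version, Some(build)))
        }
        // No build section: the remainder is the version.
        None => Some((app, rest, None)),
    }
}
```

Strings that do not contain the " version " marker yield None, so callers can treat created_by as free-form text when it does not follow the recommended format.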
key_value_metadata: Option<Vec<KeyValue>>
Key-value metadata of this file.
schema_descr: SchemaDescriptor
Schema descriptor.
column_orders: Option<Vec<ColumnOrder>>
Column (sort) order used for min and max values of each column in this file.
Each column order corresponds to one column, determined by its position in the list, matching the position of the column in the schema.
When None is returned, there are no column orders available, and each column
should be assumed to have undefined (legacy) column order.
footer_buf: Buffer<u8>
Footer bytes that back this file’s column-chunk statistics. Stats
min_value / max_value are stored as (offset, len) ranges into
this buffer; pass &self.footer_buf to
[super::ColumnChunkMetadata::statistics] to materialise them.
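To illustrate the (offset, len) layout described for footer_buf, here is a minimal, self-contained sketch. `ByteRange` and `resolve` are hypothetical names invented for this example; the crate's actual types and the internals of ColumnChunkMetadata::statistics may differ.

```rust
/// Hypothetical stand-in for a statistics value stored as a range into the
/// footer bytes (not the crate's actual type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ByteRange {
    pub offset: usize,
    pub len: usize,
}

/// Resolves a range against the footer bytes. Returns `None` when the range
/// overflows or falls outside the buffer instead of panicking.
pub fn resolve<'a>(footer: &'a [u8], range: ByteRange) -> Option<&'a [u8]> {
    let end = range.offset.checked_add(range.len)?;
    footer.get(range.offset..end)
}
```

Storing ranges rather than owned byte vectors means the values borrow from one shared buffer, which is why the buffer has to be passed in when the statistics are materialised.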
Implementations
impl FileMetadata
pub fn schema(&self) -> &SchemaDescriptor
Returns the [SchemaDescriptor] that describes the schema of this file.
pub fn key_value_metadata(&self) -> &Option<Vec<KeyValue>>
Returns the file-level key-value metadata, if present.
pub fn column_order(&self, i: usize) -> ColumnOrder
Returns the column order for the i-th column in this file.
If column orders are not available, returns undefined (legacy) column order.
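The fallback described above can be sketched with stand-in types. `Order` is a simplified, hypothetical replacement for the crate's ColumnOrder, and returning the undefined order for an out-of-range index is a choice made by this sketch, not documented behaviour of column_order.

```rust
/// Simplified, hypothetical stand-in for the crate's `ColumnOrder`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Order {
    TypeDefined,
    Undefined,
}

/// Returns the order of column `i`, or `Order::Undefined` when no column
/// orders are available. Out-of-range indices also map to `Undefined`
/// (an assumption of this sketch).
pub fn column_order(column_orders: Option<&[Order]>, i: usize) -> Order {
    column_orders
        .and_then(|orders| orders.get(i).copied())
        .unwrap_or(Order::Undefined)
}
```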
pub fn pruned(
    &self,
    keep_top_level_names: &[PlSmallStr],
    predicate_top_level_names: &[PlSmallStr],
) -> Result<FileMetadata, ParquetError>
Prune to projected columns, keeping statistics only for predicate columns.
Returns a new FileMetadata containing only:
- top-level schema fields whose name is in keep_top_level_names,
- row-group chunks corresponding to those fields’ leaves,
- statistics on chunks whose column is in predicate_top_level_names.
predicate_top_level_names is treated as a subset of
keep_top_level_names; pass &[] to drop all stats. created_by,
key_value_metadata, and column_orders are also dropped (not
needed by the read hot path).
Returns Err only when [RowGroupMetadata::from_compact] rejects
the rebuilt row group (chunks-vs-leaves desync). Callers can fall
back to unpruned metadata; the unpruned form is always valid.
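The keep/predicate rules above can be illustrated on a flat list of chunks. This is a minimal sketch under stated assumptions: `Chunk` and `prune_chunks` are hypothetical names, the (min, max) pair is a placeholder for real statistics, and the actual method also rebuilds the schema and row groups, which is not modelled here.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for one column chunk's metadata.
#[derive(Debug, Clone, PartialEq)]
pub struct Chunk {
    pub top_level_name: String,
    /// Placeholder (min, max) statistics.
    pub stats: Option<(i64, i64)>,
}

/// Keeps only chunks whose top-level name is in `keep`, and keeps statistics
/// only on chunks whose name is also in `predicate`.
pub fn prune_chunks(chunks: &[Chunk], keep: &[&str], predicate: &[&str]) -> Vec<Chunk> {
    let keep: HashSet<&str> = keep.iter().copied().collect();
    let predicate: HashSet<&str> = predicate.iter().copied().collect();
    chunks
        .iter()
        .filter(|c| keep.contains(c.top_level_name.as_str()))
        .map(|c| Chunk {
            top_level_name: c.top_level_name.clone(),
            // Drop statistics for columns that no predicate refers to.
            stats: if predicate.contains(c.top_level_name.as_str()) {
                c.stats
            } else {
                None
            },
        })
        .collect()
}
```

Because FileMetadata implements Clone and pruned returns a Result, a caller that wants the documented fallback could write something along the lines of `meta.pruned(keep, preds).unwrap_or_else(|_| meta.clone())`, where `meta`, `keep` and `preds` are placeholder variable names.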
TODO: a planner-side pass could pre-evaluate static predicates against stats and drop fully-skipped row groups, removing stats from the wire for those cases.
Trait Implementations
impl Clone for FileMetadata
fn clone(&self) -> FileMetadata
fn clone_from(&mut self, source: &Self)
impl Debug for FileMetadata
impl<'de> Deserialize<'de> for FileMetadata
fn deserialize<D>(d: D) -> Result<FileMetadata, <D as Deserializer<'de>>::Error>
where
    D: Deserializer<'de>,
impl Serialize for FileMetadata
fn serialize<S>(
    &self,
    s: S,
) -> Result<<S as Serializer>::Ok, <S as Serializer>::Error>
where
    S: Serializer,
Auto Trait Implementations
impl Freeze for FileMetadata
impl !RefUnwindSafe for FileMetadata
impl Send for FileMetadata
impl Sync for FileMetadata
impl Unpin for FileMetadata
impl UnsafeUnpin for FileMetadata
impl !UnwindSafe for FileMetadata
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true.
Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self> otherwise.