Class ArrowWriterPropertiesBuilder

Namespace
ParquetSharp.Arrow
Assembly
ParquetSharp.dll

Builder for ArrowWriterProperties.

public sealed class ArrowWriterPropertiesBuilder : IDisposable
Inheritance
object
ArrowWriterPropertiesBuilder
Implements
IDisposable

Constructors

ArrowWriterPropertiesBuilder()

Create a new ArrowWriterPropertiesBuilder with default options.

public ArrowWriterPropertiesBuilder()

Methods

AllowTruncatedTimestamps()

Allow loss of data when truncating timestamps.

This is disallowed by default, in which case a write that would lose data fails with an error.

public ArrowWriterPropertiesBuilder AllowTruncatedTimestamps()

Returns

ArrowWriterPropertiesBuilder

Build()

Create a new ArrowWriterProperties from the options configured on this builder.

public ArrowWriterProperties Build()

Returns

ArrowWriterProperties
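As an illustration of the builder pattern described above, the following minimal sketch configures one option and then calls Build(). It uses only members listed on this page and assumes, beyond what this page states, that the returned ArrowWriterProperties is itself disposable.

```csharp
using ParquetSharp.Arrow;

// The builder implements IDisposable, so a using declaration releases it
// when it goes out of scope.
using var builder = new ArrowWriterPropertiesBuilder();

// Each configuration method returns the builder, so calls can be chained.
// Assumption: ArrowWriterProperties is also IDisposable.
using var properties = builder
    .StoreSchema()
    .Build();
```

Options that are not set explicitly keep their defaults, such as compliant nested types being enabled.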

CoerceTimestamps(TimeUnit)

Coerce all timestamps to the specified time unit.

For Parquet versions 1.0 and 2.4, nanoseconds are cast to microseconds.

public ArrowWriterPropertiesBuilder CoerceTimestamps(TimeUnit unit)

Parameters

unit TimeUnit

The time unit to coerce to

Returns

ArrowWriterPropertiesBuilder
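Coercing to a coarser unit can discard precision, so it is often combined with AllowTruncatedTimestamps(). The sketch below shows one possible combination. It assumes the TimeUnit parameter is the Apache.Arrow.Types.TimeUnit enum from the Apache.Arrow package, which this page does not state explicitly, and that the built properties object is disposable.

```csharp
using ParquetSharp.Arrow;

using var builder = new ArrowWriterPropertiesBuilder();

// Assumption: TimeUnit refers to Apache.Arrow.Types.TimeUnit.
// It is fully qualified here to avoid ambiguity with other TimeUnit types.
using var properties = builder
    .CoerceTimestamps(Apache.Arrow.Types.TimeUnit.Millisecond)
    // Without this call, truncation that loses data is disallowed by default.
    .AllowTruncatedTimestamps()
    .Build();
```

Leaving out AllowTruncatedTimestamps() keeps the default behaviour, which is the safer choice when sub-millisecond precision matters.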

DisableCompliantNestedTypes()

Preserve Arrow list field names instead of replacing them with the Parquet-compliant name "element".

public ArrowWriterPropertiesBuilder DisableCompliantNestedTypes()

Returns

ArrowWriterPropertiesBuilder

DisallowTruncatedTimestamps()

Disallow loss of data when truncating timestamps (default).

public ArrowWriterPropertiesBuilder DisallowTruncatedTimestamps()

Returns

ArrowWriterPropertiesBuilder

Dispose()

Performs application-defined tasks associated with freeing, releasing, or resetting unmanaged resources.

public void Dispose()

EnableCompliantNestedTypes()

Do not preserve Arrow field names for list types.

Instead of the field name Arrow uses for the values array of a list type (default "item"), the writer uses "element", as specified in the Parquet spec.

This is enabled by default.

public ArrowWriterPropertiesBuilder EnableCompliantNestedTypes()

Returns

ArrowWriterPropertiesBuilder

EngineVersion(WriterEngineVersion)

Set the version of the Parquet writer engine.

public ArrowWriterPropertiesBuilder EngineVersion(ArrowWriterProperties.WriterEngineVersion version)

Parameters

version ArrowWriterProperties.WriterEngineVersion

The engine version to use

Returns

ArrowWriterPropertiesBuilder

StoreSchema()

EXPERIMENTAL: Write binary serialized Arrow schema to the file, to enable certain read options (like "read_dictionary") to be set automatically. This also controls whether the metadata from the Arrow schema will be written to Parquet key-value metadata.

public ArrowWriterPropertiesBuilder StoreSchema()

Returns

ArrowWriterPropertiesBuilder

UseThreads(bool)

Set whether to use multiple threads to write columns in parallel in the buffered row group mode.

WARNING: If writing multiple files in parallel, deadlock may occur if UseThreads is enabled. Disable it in this case.

Default is false.

public ArrowWriterPropertiesBuilder UseThreads(bool useThreads)

Parameters

useThreads bool

Whether to use threads

Returns

ArrowWriterPropertiesBuilder
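The warning above suggests tying this setting to how the application writes files. A minimal sketch, in which writingFilesInParallel is a hypothetical application flag rather than part of ParquetSharp, and the built properties object is assumed to be disposable:

```csharp
using ParquetSharp.Arrow;

// Hypothetical flag for this sketch: true when the application
// writes several Parquet files concurrently.
bool writingFilesInParallel = true;

using var builder = new ArrowWriterPropertiesBuilder();

// Only enable multi-threaded column writing when files are written one at a time,
// since combining both forms of parallelism may deadlock.
using var properties = builder
    .UseThreads(!writingFilesInParallel)
    .Build();
```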