Class ArrowWriterPropertiesBuilder
- Namespace: ParquetSharp.Arrow
- Assembly: ParquetSharp.dll
Builder for ArrowWriterProperties.
public sealed class ArrowWriterPropertiesBuilder : IDisposable
- Inheritance: object → ArrowWriterPropertiesBuilder
- Implements: IDisposable
Constructors
ArrowWriterPropertiesBuilder()
Create a new ArrowWriterPropertiesBuilder with default options.
public ArrowWriterPropertiesBuilder()
Methods
AllowTruncatedTimestamps()
Allow loss of data when truncating timestamps.
By default this is disallowed, and a write that would lose data fails with an error.
public ArrowWriterPropertiesBuilder AllowTruncatedTimestamps()
Returns
ArrowWriterPropertiesBuilder
Build()
Create a new ArrowWriterProperties instance from the options configured on this builder.
public ArrowWriterProperties Build()
Returns
ArrowWriterProperties
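The builder pattern described above can be sketched as follows. This is a minimal illustration rather than a definitive usage guide: it uses only methods listed on this page, and it assumes that ArrowWriterProperties is disposable like the builder, which this page does not confirm.

```csharp
using System;
using ParquetSharp.Arrow;

// Create a builder with default options; it is IDisposable, so scope it with `using`.
using var builder = new ArrowWriterPropertiesBuilder();

// Each configuration method returns a builder, so calls can be chained before Build().
// Assumption: ArrowWriterProperties also implements IDisposable.
using var properties = builder
    .StoreSchema()
    .EnableCompliantNestedTypes()
    .Build();

Console.WriteLine("Built ArrowWriterProperties from the configured builder.");
```

The resulting properties object would then be passed to whichever writer consumes it.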
CoerceTimestamps(TimeUnit)
Coerce all timestamps to the specified time unit.
For Parquet versions 1.0 and 2.4, nanoseconds are cast to microseconds.
public ArrowWriterPropertiesBuilder CoerceTimestamps(TimeUnit unit)
Parameters
unit
TimeUnit
The time unit to coerce to.
Returns
ArrowWriterPropertiesBuilder
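As a hedged sketch of how timestamp coercion and truncation might be combined: the example below coerces timestamps to milliseconds and explicitly allows the resulting loss of precision. It assumes the TimeUnit parameter is the Apache.Arrow.Types.TimeUnit enum and that ArrowWriterProperties is disposable; neither is stated on this page.

```csharp
using System;
using ParquetSharp.Arrow;

using var builder = new ArrowWriterPropertiesBuilder();

// Assumption: TimeUnit here refers to Apache.Arrow.Types.TimeUnit.
// It is fully qualified to avoid ambiguity with any other TimeUnit type in scope.
using var properties = builder
    .CoerceTimestamps(Apache.Arrow.Types.TimeUnit.Millisecond)
    // Without this call, coercion that would lose data is rejected with an error.
    .AllowTruncatedTimestamps()
    .Build();

Console.WriteLine("Timestamps will be coerced to milliseconds; truncation is allowed.");
```

Omitting AllowTruncatedTimestamps() keeps the default behaviour, which is the safer choice when sub-millisecond precision matters.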
DisableCompliantNestedTypes()
Preserve Arrow list field names instead of replacing them with the Parquet-compliant name "element".
public ArrowWriterPropertiesBuilder DisableCompliantNestedTypes()
Returns
ArrowWriterPropertiesBuilder
DisallowTruncatedTimestamps()
Disallow loss of data when truncating timestamps (default).
public ArrowWriterPropertiesBuilder DisallowTruncatedTimestamps()
Returns
ArrowWriterPropertiesBuilder
Dispose()
Performs application-defined tasks associated with freeing, releasing, or resetting unmanaged resources.
public void Dispose()
EnableCompliantNestedTypes()
When enabled, Arrow field names for list types are not preserved.
Instead of the field name Arrow uses for the values array of a list type (default "item"), the writer uses "element", as specified in the Parquet spec.
This is enabled by default.
public ArrowWriterPropertiesBuilder EnableCompliantNestedTypes()
Returns
ArrowWriterPropertiesBuilder
EngineVersion(WriterEngineVersion)
Set the version of the Parquet writer engine.
public ArrowWriterPropertiesBuilder EngineVersion(ArrowWriterProperties.WriterEngineVersion version)
Parameters
version
ArrowWriterProperties.WriterEngineVersion
The engine version to use.
Returns
ArrowWriterPropertiesBuilder
StoreSchema()
EXPERIMENTAL: Write binary serialized Arrow schema to the file, to enable certain read options (like "read_dictionary") to be set automatically. This also controls whether the metadata from the Arrow schema will be written to Parquet key-value metadata.
public ArrowWriterPropertiesBuilder StoreSchema()
Returns
ArrowWriterPropertiesBuilder
UseThreads(bool)
Set whether to use multiple threads to write columns in parallel in the buffered row group mode.
WARNING: Writing multiple files in parallel while useThreads is true may cause a deadlock. Disable it in that case.
Default is false.
public ArrowWriterPropertiesBuilder UseThreads(bool useThreads)
Parameters
useThreads
bool
Whether to use threads.
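To illustrate the deadlock warning, here is a minimal sketch of preparing properties for several files written in parallel, with per-file column threading explicitly disabled. The file paths are hypothetical, the writer itself is omitted, and ArrowWriterProperties is assumed to be disposable.

```csharp
using System;
using System.Threading.Tasks;
using ParquetSharp.Arrow;

// Hypothetical output paths, used only for illustration.
string[] paths = { "part-0.parquet", "part-1.parquet", "part-2.parquet" };

// Parallelism happens across files, so column-level threading stays off for each file.
Parallel.ForEach(paths, path =>
{
    using var builder = new ArrowWriterPropertiesBuilder();
    // false is already the default; setting it explicitly documents the intent.
    using var properties = builder.UseThreads(false).Build();

    // A writer for `path` would be created with `properties` here (omitted).
    Console.WriteLine($"Prepared single-threaded column writing for {path}");
});
```

Keeping the parallelism at the file level and leaving UseThreads off follows the recommendation in the warning above.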