
Class ParquetFileWriter

Namespace
ParquetSharp
Assembly
ParquetSharp.dll

Opens and writes Parquet files.

public sealed class ParquetFileWriter : IDisposable
Inheritance
ParquetFileWriter
Implements
IDisposable

Constructors

ParquetFileWriter(OutputStream, Column[], Compression, IReadOnlyDictionary<string, string>?)

Open a new ParquetFileWriter that writes to an OutputStream, using one compression codec for all columns.

public ParquetFileWriter(OutputStream outputStream, Column[] columns, Compression compression = Compression.Snappy, IReadOnlyDictionary<string, string>? keyValueMetadata = null)

Parameters

outputStream OutputStream

Stream to write to

columns Column[]

Definitions of columns to be written

compression Compression

Compression to use for all columns

keyValueMetadata IReadOnlyDictionary<string, string>

Optional dictionary of key-value metadata. This isn't read until the file is closed, to allow metadata to be modified after data is written.

ParquetFileWriter(OutputStream, Column[], LogicalTypeFactory?, Compression, IReadOnlyDictionary<string, string>?)

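Open a new ParquetFileWriter that writes to an OutputStream, using an optional custom LogicalTypeFactory and one compression codec for all columns.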

public ParquetFileWriter(OutputStream outputStream, Column[] columns, LogicalTypeFactory? logicalTypeFactory, Compression compression = Compression.Snappy, IReadOnlyDictionary<string, string>? keyValueMetadata = null)

Parameters

outputStream OutputStream

Stream to write to

columns Column[]

Definitions of columns to be written

logicalTypeFactory LogicalTypeFactory

Custom type factory used to map from .NET types to Parquet types

compression Compression

Compression to use for all columns

keyValueMetadata IReadOnlyDictionary<string, string>

Optional dictionary of key-value metadata. This isn't read until the file is closed, to allow metadata to be modified after data is written.

ParquetFileWriter(OutputStream, Column[], LogicalTypeFactory?, WriterProperties, IReadOnlyDictionary<string, string>?)

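Open a new ParquetFileWriter that writes to an OutputStream, using an optional custom LogicalTypeFactory and explicit WriterProperties.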

public ParquetFileWriter(OutputStream outputStream, Column[] columns, LogicalTypeFactory? logicalTypeFactory, WriterProperties writerProperties, IReadOnlyDictionary<string, string>? keyValueMetadata = null)

Parameters

outputStream OutputStream

Stream to write to

columns Column[]

Definitions of columns to be written

logicalTypeFactory LogicalTypeFactory

Custom type factory used to map from .NET types to Parquet types

writerProperties WriterProperties

Writer properties to use

keyValueMetadata IReadOnlyDictionary<string, string>

Optional dictionary of key-value metadata. This isn't read until the file is closed, to allow metadata to be modified after data is written.

ParquetFileWriter(OutputStream, Column[], WriterProperties, IReadOnlyDictionary<string, string>?)

Open a new ParquetFileWriter that writes to an OutputStream, configured by explicit WriterProperties.

public ParquetFileWriter(OutputStream outputStream, Column[] columns, WriterProperties writerProperties, IReadOnlyDictionary<string, string>? keyValueMetadata = null)

Parameters

outputStream OutputStream

Stream to write to

columns Column[]

Definitions of columns to be written

writerProperties WriterProperties

Writer properties to use

keyValueMetadata IReadOnlyDictionary<string, string>

Optional dictionary of key-value metadata. This isn't read until the file is closed, to allow metadata to be modified after data is written.

ParquetFileWriter(OutputStream, GroupNode, WriterProperties, IReadOnlyDictionary<string, string>?)

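Open a new ParquetFileWriter that writes to an OutputStream, with the file structure defined by a schema node instead of Column definitions.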

public ParquetFileWriter(OutputStream outputStream, GroupNode schema, WriterProperties writerProperties, IReadOnlyDictionary<string, string>? keyValueMetadata = null)

Parameters

outputStream OutputStream

Stream to write to

schema GroupNode

Root schema node defining the structure of the file

writerProperties WriterProperties

Writer properties to use

keyValueMetadata IReadOnlyDictionary<string, string>

Optional dictionary of key-value metadata. This isn't read until the file is closed, to allow metadata to be modified after data is written.

ParquetFileWriter(Stream, Column[], LogicalTypeFactory?, Compression, IReadOnlyDictionary<string, string>?, bool)

Open a new ParquetFileWriter for writing to a .NET stream, using an optional custom LogicalTypeFactory and one compression codec for all columns.

public ParquetFileWriter(Stream stream, Column[] columns, LogicalTypeFactory? logicalTypeFactory = null, Compression compression = Compression.Snappy, IReadOnlyDictionary<string, string>? keyValueMetadata = null, bool leaveOpen = false)

Parameters

stream Stream

Stream to write to

columns Column[]

Definitions of columns to be written

logicalTypeFactory LogicalTypeFactory

Custom type factory used to map from .NET types to Parquet types

compression Compression

Compression to use for all columns

keyValueMetadata IReadOnlyDictionary<string, string>

Optional dictionary of key-value metadata. This isn't read until the file is closed, to allow metadata to be modified after data is written.

leaveOpen bool

Whether to keep the stream open after closing the writer
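
The leaveOpen parameter matters when the caller still needs the stream after the writer is finished, for example to read back or upload the bytes. The following is a minimal sketch under that assumption; the column name and data are hypothetical, and the row group and column writer calls follow the usual ParquetSharp writing pattern.

```csharp
using System.IO;
using ParquetSharp;

// Hypothetical single-column layout, chosen for illustration.
var columns = new Column[] { new Column<long>("Id") };

using var stream = new MemoryStream();

// leaveOpen: true asks the writer not to close the stream when it is closed.
using (var file = new ParquetFileWriter(stream, columns, leaveOpen: true))
{
    using (var rowGroup = file.AppendRowGroup())
    using (var idWriter = rowGroup.NextColumn().LogicalWriter<long>())
    {
        idWriter.WriteBatch(new long[] { 10, 20, 30 });
    }

    file.Close();
}

// The stream is still open here, so it can be rewound and consumed.
stream.Position = 0;
```

With the default leaveOpen value of false, the stream would be closed along with the writer and the final line would fail.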

ParquetFileWriter(Stream, Column[], LogicalTypeFactory?, WriterProperties, IReadOnlyDictionary<string, string>?, bool)

Open a new ParquetFileWriter for writing to a .NET stream, using an optional custom LogicalTypeFactory and explicit WriterProperties.

public ParquetFileWriter(Stream stream, Column[] columns, LogicalTypeFactory? logicalTypeFactory, WriterProperties writerProperties, IReadOnlyDictionary<string, string>? keyValueMetadata = null, bool leaveOpen = false)

Parameters

stream Stream

Stream to write to

columns Column[]

Definitions of columns to be written

logicalTypeFactory LogicalTypeFactory

Custom type factory used to map from .NET types to Parquet types

writerProperties WriterProperties

Writer properties to use

keyValueMetadata IReadOnlyDictionary<string, string>

Optional dictionary of key-value metadata. This isn't read until the file is closed, to allow metadata to be modified after data is written.

leaveOpen bool

Whether to keep the stream open after closing the writer

ParquetFileWriter(Stream, GroupNode, WriterProperties, IReadOnlyDictionary<string, string>?, bool)

Open a new ParquetFileWriter for writing to a .NET stream, with the file structure defined by a schema node instead of Column definitions.

public ParquetFileWriter(Stream stream, GroupNode schema, WriterProperties writerProperties, IReadOnlyDictionary<string, string>? keyValueMetadata = null, bool leaveOpen = false)

Parameters

stream Stream

Stream to write to

schema GroupNode

Root schema node defining the structure of the file

writerProperties WriterProperties

Writer properties to use

keyValueMetadata IReadOnlyDictionary<string, string>

Optional dictionary of key-value metadata. This isn't read until the file is closed, to allow metadata to be modified after data is written.

leaveOpen bool

Whether to keep the stream open after closing the writer

ParquetFileWriter(string, Column[], Compression, IReadOnlyDictionary<string, string>?)

Open a new ParquetFileWriter that writes to a file path, using one compression codec for all columns.

public ParquetFileWriter(string path, Column[] columns, Compression compression = Compression.Snappy, IReadOnlyDictionary<string, string>? keyValueMetadata = null)

Parameters

path string

Location to write to

columns Column[]

Definitions of columns to be written

compression Compression

Compression to use for all columns

keyValueMetadata IReadOnlyDictionary<string, string>

Optional dictionary of key-value metadata. This isn't read until the file is closed, to allow metadata to be modified after data is written.
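
Because the key-value metadata is not read until the file is closed, entries can be added after the data has been written. The sketch below illustrates this with the path-based constructor; the file name, column names, and metadata keys are hypothetical.

```csharp
using System.Collections.Generic;
using ParquetSharp;

// Hypothetical column layout, chosen for illustration.
var columns = new Column[]
{
    new Column<int>("Id"),
    new Column<double>("Value"),
};

// Dictionary<string, string> implements IReadOnlyDictionary<string, string>.
var metadata = new Dictionary<string, string> { ["creator"] = "example" };

using var file = new ParquetFileWriter("example.parquet", columns, Compression.Snappy, metadata);

using (var rowGroup = file.AppendRowGroup())
{
    using (var idWriter = rowGroup.NextColumn().LogicalWriter<int>())
    {
        idWriter.WriteBatch(new[] { 1, 2, 3 });
    }

    using (var valueWriter = rowGroup.NextColumn().LogicalWriter<double>())
    {
        valueWriter.WriteBatch(new[] { 0.5, 1.5, 2.5 });
    }
}

// Added after the data was written, but before Close(), so it is still picked up.
metadata["rowCount"] = "3";

file.Close();
```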

ParquetFileWriter(string, Column[], LogicalTypeFactory?, Compression, IReadOnlyDictionary<string, string>?)

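Open a new ParquetFileWriter that writes to a file path, using an optional custom LogicalTypeFactory and one compression codec for all columns.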

public ParquetFileWriter(string path, Column[] columns, LogicalTypeFactory? logicalTypeFactory, Compression compression = Compression.Snappy, IReadOnlyDictionary<string, string>? keyValueMetadata = null)

Parameters

path string

Location to write to

columns Column[]

Definitions of columns to be written

logicalTypeFactory LogicalTypeFactory

Custom type factory used to map from .NET types to Parquet types

compression Compression

Compression to use for all columns

keyValueMetadata IReadOnlyDictionary<string, string>

Optional dictionary of key-value metadata. This isn't read until the file is closed, to allow metadata to be modified after data is written.

ParquetFileWriter(string, Column[], LogicalTypeFactory?, WriterProperties, IReadOnlyDictionary<string, string>?)

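Open a new ParquetFileWriter that writes to a file path, using an optional custom LogicalTypeFactory and explicit WriterProperties.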

public ParquetFileWriter(string path, Column[] columns, LogicalTypeFactory? logicalTypeFactory, WriterProperties writerProperties, IReadOnlyDictionary<string, string>? keyValueMetadata = null)

Parameters

path string

Location to write to

columns Column[]

Definitions of columns to be written

logicalTypeFactory LogicalTypeFactory

Custom type factory used to map from .NET types to Parquet types

writerProperties WriterProperties

Writer properties to use

keyValueMetadata IReadOnlyDictionary<string, string>

Optional dictionary of key-value metadata. This isn't read until the file is closed, to allow metadata to be modified after data is written.

ParquetFileWriter(string, Column[], WriterProperties, IReadOnlyDictionary<string, string>?)

Open a new ParquetFileWriter that writes to a file path, configured by explicit WriterProperties.

public ParquetFileWriter(string path, Column[] columns, WriterProperties writerProperties, IReadOnlyDictionary<string, string>? keyValueMetadata = null)

Parameters

path string

Location to write to

columns Column[]

Definitions of columns to be written

writerProperties WriterProperties

Writer properties to use

keyValueMetadata IReadOnlyDictionary<string, string>

Optional dictionary of key-value metadata. This isn't read until the file is closed, to allow metadata to be modified after data is written.
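
The WriterProperties overloads are useful when a single Compression argument is not enough. The sketch below assumes the properties are built with ParquetSharp's WriterPropertiesBuilder; the file name, column name, and the choice of Zstd are illustrative only.

```csharp
using ParquetSharp;

// Hypothetical single-column layout, chosen for illustration.
var columns = new Column[] { new Column<float>("Value") };

// Build the writer properties; both the builder and the result are disposable.
using var propertiesBuilder = new WriterPropertiesBuilder();
propertiesBuilder.Compression(Compression.Zstd);
using var writerProperties = propertiesBuilder.Build();

using var file = new ParquetFileWriter("zstd_example.parquet", columns, writerProperties);

using (var rowGroup = file.AppendRowGroup())
using (var valueWriter = rowGroup.NextColumn().LogicalWriter<float>())
{
    valueWriter.WriteBatch(new[] { 1.0f, 2.0f, 3.0f });
}

file.Close();
```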

ParquetFileWriter(string, GroupNode, WriterProperties, IReadOnlyDictionary<string, string>?)

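Open a new ParquetFileWriter that writes to a file path, with the file structure defined by a schema node instead of Column definitions.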

public ParquetFileWriter(string path, GroupNode schema, WriterProperties writerProperties, IReadOnlyDictionary<string, string>? keyValueMetadata = null)

Parameters

path string

Location to write to

schema GroupNode

Root schema node defining the structure of the file

writerProperties WriterProperties

Writer properties to use

keyValueMetadata IReadOnlyDictionary<string, string>

Optional dictionary of key-value metadata. This isn't read until the file is closed, to allow metadata to be modified after data is written.
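
The GroupNode overloads take an explicit schema instead of Column definitions. The following is a rough sketch of how such a schema might be assembled, assuming the PrimitiveNode and GroupNode constructors from the ParquetSharp.Schema namespace; the node names, file name, and data are hypothetical.

```csharp
using ParquetSharp;
using ParquetSharp.Schema;

// Leaf nodes: one required 32-bit integer and one optional double (illustrative names).
using var noneType = LogicalType.None();
using var idNode = new PrimitiveNode("id", Repetition.Required, noneType, PhysicalType.Int32);
using var valueNode = new PrimitiveNode("value", Repetition.Optional, noneType, PhysicalType.Double);

// Root schema node that groups the leaf nodes.
using var schema = new GroupNode("schema", Repetition.Required, new Node[] { idNode, valueNode });

// Default writer properties.
using var propertiesBuilder = new WriterPropertiesBuilder();
using var writerProperties = propertiesBuilder.Build();

using var file = new ParquetFileWriter("schema_example.parquet", schema, writerProperties);

using (var rowGroup = file.AppendRowGroup())
{
    using (var idWriter = rowGroup.NextColumn().LogicalWriter<int>())
    {
        idWriter.WriteBatch(new[] { 1, 2, 3 });
    }

    // The optional column is written through a nullable element type.
    using (var valueWriter = rowGroup.NextColumn().LogicalWriter<double?>())
    {
        valueWriter.WriteBatch(new double?[] { 0.5, null, 2.5 });
    }
}

file.Close();
```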

Properties

FileMetaData

Returns the file metadata for the file.

public FileMetaData? FileMetaData { get; }

Property Value

FileMetaData

KeyValueMetadata

Returns a read-only copy of the current key-value metadata to be written.

public IReadOnlyDictionary<string, string> KeyValueMetadata { get; }

Property Value

IReadOnlyDictionary<string, string>

LogicalTypeFactory

The LogicalTypeFactory for handling custom types.

public LogicalTypeFactory LogicalTypeFactory { get; set; }

Property Value

LogicalTypeFactory

LogicalWriteConverterFactory

The LogicalWriteConverterFactory for writing custom types.

public LogicalWriteConverterFactory LogicalWriteConverterFactory { get; set; }

Property Value

LogicalWriteConverterFactory

Schema

The schema of the file.

public SchemaDescriptor Schema { get; }

Property Value

SchemaDescriptor

WriterProperties

The WriterProperties used to configure the writer.

public WriterProperties WriterProperties { get; }

Property Value

WriterProperties

Methods

AppendBufferedRowGroup()

Creates and returns a new RowGroupWriter for writing a buffered row group. Using a buffered writer allows writing to columns in any order, and writes to different columns may be interleaved, but requires more memory.

public RowGroupWriter AppendBufferedRowGroup()

Returns

RowGroupWriter

A new RowGroupWriter instance.
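
The interleaved writes described above can be sketched as follows, assuming that RowGroupWriter.Column(int) is used to select a column by index in a buffered row group. The file name, column names, and data are hypothetical.

```csharp
using ParquetSharp;

// Hypothetical column layout, chosen for illustration.
var columns = new Column[]
{
    new Column<int>("Id"),
    new Column<double>("Value"),
};

using var file = new ParquetFileWriter("buffered_example.parquet", columns);

using (var rowGroup = file.AppendBufferedRowGroup())
{
    // First batch: write to column 0, then column 1.
    using (var idWriter = rowGroup.Column(0).LogicalWriter<int>())
    {
        idWriter.WriteBatch(new[] { 1, 2 });
    }
    using (var valueWriter = rowGroup.Column(1).LogicalWriter<double>())
    {
        valueWriter.WriteBatch(new[] { 0.1, 0.2 });
    }

    // Second batch: return to column 0 before the row group is finished.
    using (var idWriter = rowGroup.Column(0).LogicalWriter<int>())
    {
        idWriter.WriteBatch(new[] { 3 });
    }
    using (var valueWriter = rowGroup.Column(1).LogicalWriter<double>())
    {
        valueWriter.WriteBatch(new[] { 0.3 });
    }
}

file.Close();
```

This suits data that arrives row by row or in small chunks, at the cost of holding the whole row group in memory until it is closed.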

AppendRowGroup()

Creates and returns a new RowGroupWriter for writing a row group to the file.

public RowGroupWriter AppendRowGroup()

Returns

RowGroupWriter

A new RowGroupWriter instance.
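
As an illustration, the sketch below writes two row groups to one file, calling AppendRowGroup() once per chunk of data and writing the columns in schema order with NextColumn(). The file name, column names, and chunking are hypothetical.

```csharp
using ParquetSharp;

// Hypothetical column layout, chosen for illustration.
var columns = new Column[]
{
    new Column<int>("Id"),
    new Column<float>("Value"),
};

// Hypothetical data, split into two chunks with one row group per chunk.
var idChunks = new[] { new[] { 1, 2 }, new[] { 3, 4 } };
var valueChunks = new[] { new[] { 0.1f, 0.2f }, new[] { 0.3f, 0.4f } };

using var file = new ParquetFileWriter("row_groups_example.parquet", columns);

for (var chunk = 0; chunk < idChunks.Length; chunk++)
{
    using var rowGroup = file.AppendRowGroup();

    // Columns are written one after another, in the order they were defined.
    using (var idWriter = rowGroup.NextColumn().LogicalWriter<int>())
    {
        idWriter.WriteBatch(idChunks[chunk]);
    }

    using (var valueWriter = rowGroup.NextColumn().LogicalWriter<float>())
    {
        valueWriter.WriteBatch(valueChunks[chunk]);
    }
}

file.Close();
```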

Close()

Close the file writer, as well as any column or row group writers that are still open. This is the recommended way of closing Parquet files, rather than relying on the Dispose() method, because Dispose() swallows exceptions.

public void Close()
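
One way to follow this recommendation is to keep the using declaration for cleanup but call Close() explicitly as the last step, so that a failure while finalizing the file surfaces as an exception. This is a minimal sketch; the file name, column name, and error handling are illustrative.

```csharp
using System;
using ParquetSharp;

// Hypothetical single-column layout, chosen for illustration.
var columns = new Column[] { new Column<int>("Id") };

try
{
    // Dispose() still runs at the end of the scope and releases resources on any path.
    using var file = new ParquetFileWriter("close_example.parquet", columns);

    using (var rowGroup = file.AppendRowGroup())
    using (var idWriter = rowGroup.NextColumn().LogicalWriter<int>())
    {
        idWriter.WriteBatch(new[] { 1, 2, 3 });
    }

    // Explicit Close(): errors raised while finishing the file propagate to the caller.
    file.Close();
}
catch (Exception e)
{
    Console.Error.WriteLine($"Failed to write Parquet file: {e.Message}");
    throw;
}
```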

ColumnDescriptor(int)

Get the ColumnDescriptor for the specified column index.

public ColumnDescriptor ColumnDescriptor(int i)

Parameters

i int

The column index.

Returns

ColumnDescriptor

A ColumnDescriptor instance for the specified column.
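
For example, the descriptors of all columns can be listed by combining this method with the Schema property. The sketch assumes SchemaDescriptor.NumColumns for the column count and the Name and PhysicalType properties of ColumnDescriptor; the file name and column names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using ParquetSharp;

// Hypothetical column layout, chosen for illustration.
var columns = new Column[]
{
    new Column<int>("Id"),
    new Column<double>("Value"),
};

using var file = new ParquetFileWriter("descriptor_example.parquet", columns);

// Collect the column names while printing each descriptor.
var names = new List<string>();
for (var i = 0; i < file.Schema.NumColumns; i++)
{
    var descriptor = file.ColumnDescriptor(i);
    names.Add(descriptor.Name);
    Console.WriteLine($"{i}: {descriptor.Name} ({descriptor.PhysicalType})");
}

file.Close();
```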

Dispose()

Releases the resources held by the writer. Exceptions raised while closing are swallowed, so call Close() first if write errors need to be surfaced.

public void Dispose()