Class XmlMultiStreamExtractor<TRecord>

Namespace
Wolfgang.Etl.Xml
Assembly
Wolfgang.Etl.Xml.dll

Extracts items of type TRecord from multiple streams, reading one XML document per stream.

public sealed class XmlMultiStreamExtractor<TRecord> : ExtractorBase<TRecord, XmlReport>, IExtractWithProgressAndCancellationAsync<TRecord, XmlReport>, IExtractWithCancellationAsync<TRecord>, IExtractWithProgressAsync<TRecord, XmlReport>, IExtractAsync<TRecord> where TRecord : notnull, new()

Type Parameters

TRecord

The type of items to extract. Must be a non-nullable type with a public parameterless constructor.

Inheritance
ExtractorBase<TRecord, XmlReport>
XmlMultiStreamExtractor<TRecord>
Implements
IExtractWithProgressAndCancellationAsync<TRecord, XmlReport>
IExtractWithCancellationAsync<TRecord>
IExtractWithProgressAsync<TRecord, XmlReport>
IExtractAsync<TRecord>
Inherited Members
ExtractorBase<TRecord, XmlReport>.ExtractAsync()
ExtractorBase<TRecord, XmlReport>.ReportingInterval
ExtractorBase<TRecord, XmlReport>.CurrentItemCount
ExtractorBase<TRecord, XmlReport>.CurrentSkippedItemCount
ExtractorBase<TRecord, XmlReport>.MaximumItemCount
ExtractorBase<TRecord, XmlReport>.SkipItemCount

Examples

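// Lazily open every XML file in the folder; a file is only opened
// when the extractor enumerates to it.
var streams = Directory.GetFiles("data/", "*.xml").Select(File.OpenRead);

// logger and cancellationToken are supplied by the calling code.
var extractor = new XmlMultiStreamExtractor<Person>
(
    streams,
    new XmlReaderSettings { DtdProcessing = DtdProcessing.Prohibit },
    logger
);

// Each stream yields exactly one Person.
await foreach (var person in extractor.ExtractAsync(cancellationToken))
{
    Console.WriteLine(person.Name);
}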

Remarks

Iterates over an IEnumerable<Stream>, deserializing a single TRecord from each stream. Each stream is disposed after its item is read. Extraction stops when the enumerable is exhausted or when MaximumItemCount (inherited from ExtractorBase<TRecord, XmlReport>) is reached.
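The behaviour described above can be sketched without the library. The following is an illustration only, not the extractor's actual implementation: it assumes deserialization is done with XmlSerializer, and the names Person, MultiStreamSketch, ReadOnePerStream, and Demo are hypothetical. The maximumItemCount parameter stands in for the MaximumItemCount property.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical record type used only for this illustration.
public class Person
{
    public string Name { get; set; } = "";
}

public static class MultiStreamSketch
{
    // Reads one TRecord per stream and disposes each stream once its record
    // has been read. Assumption: XmlSerializer performs the deserialization.
    public static IEnumerable<TRecord> ReadOnePerStream<TRecord>(
        IEnumerable<Stream> streams,
        XmlReaderSettings settings,
        int maximumItemCount = int.MaxValue)
        where TRecord : new()
    {
        if (maximumItemCount <= 0)
        {
            yield break;
        }

        var serializer = new XmlSerializer(typeof(TRecord));
        var count = 0;

        foreach (var stream in streams)
        {
            TRecord record;
            using (stream)
            using (var reader = XmlReader.Create(stream, settings))
            {
                record = (TRecord)serializer.Deserialize(reader);
            }

            yield return record;

            // Stop before the next stream is requested once the limit is hit.
            if (++count >= maximumItemCount)
            {
                yield break;
            }
        }
    }

    // Builds two in-memory documents and returns the names read from them.
    public static List<string> Demo()
    {
        var documents = new[]
        {
            "<Person><Name>Ada</Name></Person>",
            "<Person><Name>Grace</Name></Person>"
        };
        var streams = documents.Select(
            xml => (Stream)new MemoryStream(Encoding.UTF8.GetBytes(xml)));
        var settings = new XmlReaderSettings { DtdProcessing = DtdProcessing.Prohibit };

        return ReadOnePerStream<Person>(streams, settings).Select(p => p.Name).ToList();
    }
}
```

Checking the limit after each yielded item, rather than at the top of the loop, means a lazily created stream is never opened just to be abandoned.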

Constructors

XmlMultiStreamExtractor(IEnumerable<Stream>)

Initializes a new instance of the XmlMultiStreamExtractor<TRecord> class.

public XmlMultiStreamExtractor(IEnumerable<Stream> streams)

Parameters

streams IEnumerable<Stream>

An enumerable of streams, each containing a single XML document.

Exceptions

ArgumentNullException

Thrown when streams is null.

XmlMultiStreamExtractor(IEnumerable<Stream>, ILogger<XmlMultiStreamExtractor<TRecord>>)

Initializes a new instance of the XmlMultiStreamExtractor<TRecord> class with a logger.

public XmlMultiStreamExtractor(IEnumerable<Stream> streams, ILogger<XmlMultiStreamExtractor<TRecord>> logger)

Parameters

streams IEnumerable<Stream>

An enumerable of streams, each containing a single XML document.

logger ILogger<XmlMultiStreamExtractor<TRecord>>

The logger instance for diagnostic output.

Exceptions

ArgumentNullException

Thrown when streams or logger is null.

XmlMultiStreamExtractor(IEnumerable<Stream>, XmlReaderSettings, ILogger<XmlMultiStreamExtractor<TRecord>>)

Initializes a new instance of the XmlMultiStreamExtractor<TRecord> class with custom reader settings and a logger.

public XmlMultiStreamExtractor(IEnumerable<Stream> streams, XmlReaderSettings readerSettings, ILogger<XmlMultiStreamExtractor<TRecord>> logger)

Parameters

streams IEnumerable<Stream>

An enumerable of streams, each containing a single XML document.

readerSettings XmlReaderSettings

The XML reader settings to use for deserialization.

logger ILogger<XmlMultiStreamExtractor<TRecord>>

The logger instance for diagnostic output.

Exceptions

ArgumentNullException

Thrown when streams, readerSettings, or logger is null.
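As a rough sketch of how the three-argument constructor might be called when no real logger is available, the example below passes NullLogger<T>.Instance from Microsoft.Extensions.Logging.Abstractions and feeds the extractor in-memory streams. Person, ConstructorSketch, ToStreams, and ReadNamesAsync are hypothetical names, and the XML element names assume the record maps to XML the way XmlSerializer would map it; neither is confirmed by this page.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Xml;
using Microsoft.Extensions.Logging.Abstractions;
using Wolfgang.Etl.Xml;

// Hypothetical record type; not part of the library.
public class Person
{
    public string Name { get; set; } = "";
}

public static class ConstructorSketch
{
    // Lazily wraps each XML string in its own MemoryStream, so a stream
    // only exists once the extractor asks for it.
    public static IEnumerable<Stream> ToStreams(IEnumerable<string> documents) =>
        documents.Select(xml => (Stream)new MemoryStream(Encoding.UTF8.GetBytes(xml)));

    public static async Task<List<string>> ReadNamesAsync(IEnumerable<string> documents)
    {
        var extractor = new XmlMultiStreamExtractor<Person>(
            ToStreams(documents),
            new XmlReaderSettings { DtdProcessing = DtdProcessing.Prohibit },
            NullLogger<XmlMultiStreamExtractor<Person>>.Instance); // no-op logger

        var names = new List<string>();
        await foreach (var person in extractor.ExtractAsync())
        {
            names.Add(person.Name);
        }
        return names;
    }
}
```

Passing a lazy enumerable rather than a pre-built list keeps at most one stream open at a time, which fits the documented behaviour of disposing each stream after its item is read.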

Methods

CreateProgressReport()

Creates a progress report of type XmlReport (the TProgress of ExtractorBase<TRecord, XmlReport>). This override lets the extractor supply a progress report that is specific to the XML extraction process.

protected override XmlReport CreateProgressReport()

Returns

XmlReport

The current progress, as an XmlReport.

CreateProgressTimer(IProgress<XmlReport>)

Creates the Wolfgang.Etl.Abstractions.IProgressTimer used to drive progress callbacks. The base class exposes this method so that a custom timer can be injected (for example, an implementation that allows manual control in unit tests).

protected override IProgressTimer CreateProgressTimer(IProgress<XmlReport> progress)

Parameters

progress IProgress<XmlReport>

The progress sink that will receive callbacks.

Returns

IProgressTimer

A started Wolfgang.Etl.Abstractions.IProgressTimer instance.

ExtractWorkerAsync(CancellationToken)

Core implementation of the extraction logic. This override reads one TRecord from each stream in turn; because XmlMultiStreamExtractor<TRecord> is sealed, it cannot be overridden further.

protected override IAsyncEnumerable<TRecord> ExtractWorkerAsync(CancellationToken token)

Parameters

token CancellationToken

A CancellationToken to observe while the sequence is being enumerated.

Returns

IAsyncEnumerable<TRecord>

An asynchronous sequence of TRecord items. The sequence may be empty if no data is available or if the extraction fails.
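To illustrate how a worker of this shape can observe its token, here is a library-independent sketch of an async iterator that checks for cancellation before producing each item. It is not the extractor's code: CancellationSketch, WorkerAsync, and ConsumeAsync are hypothetical names, and whether the real worker throws OperationCanceledException on cancellation is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationSketch
{
    // Hypothetical worker: yields 1..count, checking the token before each item.
    public static async IAsyncEnumerable<int> WorkerAsync(
        int count,
        [EnumeratorCancellation] CancellationToken token = default)
    {
        for (var i = 1; i <= count; i++)
        {
            token.ThrowIfCancellationRequested();
            await Task.Yield();
            yield return i;
        }
    }

    // Consumes the worker, cancels after `cancelAfter` items,
    // and returns how many items were actually seen.
    public static async Task<int> ConsumeAsync(int count, int cancelAfter)
    {
        using var cts = new CancellationTokenSource();
        var seen = 0;
        try
        {
            await foreach (var item in WorkerAsync(count, cts.Token))
            {
                seen++;
                if (seen == cancelAfter)
                {
                    cts.Cancel();
                }
            }
        }
        catch (OperationCanceledException)
        {
            // Enumeration stopped because the token was cancelled.
        }
        return seen;
    }
}
```

In this sketch the consumer never sees an item produced after cancellation, because the check runs before each item is yielded.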