
# Multi-Sink Parallel Processing (Fan-out)

Sometimes you need to perform multiple actions for the same record. For example, you might want to archive data as JSON to S3 while simultaneously inserting it into a SQL database.

## The Strategy

Use the MultiWriter. It takes multiple writers and broadcasts every record to all of them in parallel.

## Implementation

```typescript
import {
  CSVReader,
  JsonWriter,
  SQLWriter,
  MultiWriter,
  S3Sink,
  Job
} from '@pujansrt/data-genie';
import { S3Client } from '@aws-sdk/client-s3';

async function run() {
  const reader = new CSVReader('input.csv');

  // Writer 1: archive records as JSON to S3
  const s3Writer = new JsonWriter(new S3Sink(new S3Client({}), 'my-bucket', 'archives/today.json'));

  // Writer 2: insert records into a SQL database
  // (dbClient is your configured database client; the query and row mapper are elided here)
  const dbWriter = new SQLWriter(dbClient, 'INSERT INTO analytics ...', (r) => [...]);

  // Combine them: every record is broadcast to both writers
  const combinedWriter = new MultiWriter(s3Writer, dbWriter);

  // The job runs once, but data goes to both places!
  await Job.run(reader, combinedWriter);
}

run().catch(console.error);
```

## Performance Note

MultiWriter waits for all underlying writers to complete before pulling the next record from the reader. If one writer is significantly slower than the others, it will determine the overall throughput of the pipeline.
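This blocking behavior can be sketched as a minimal fan-out writer. Note that the `Writer` interface and the `SimpleMultiWriter` class below are illustrative assumptions, not data-genie's actual API:

```typescript
interface Writer<T> {
  write(record: T): Promise<void>;
}

// Minimal fan-out sketch: broadcast each record to every writer and
// resolve only when all of them finish, so the slowest writer paces
// the whole pipeline.
class SimpleMultiWriter<T> implements Writer<T> {
  private writers: Writer<T>[];

  constructor(...writers: Writer<T>[]) {
    this.writers = writers;
  }

  async write(record: T): Promise<void> {
    await Promise.all(this.writers.map((w) => w.write(record)));
  }
}
```

Because `write` awaits `Promise.all`, the reader cannot supply the next record until every sink has acknowledged the current one.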

## High-Throughput Fan-out (ParallelWriter)

If you have a particularly "heavy" sink (e.g., generating complex Parquet files), you can wrap it in a ParallelWriter so it doesn't block the main thread and other faster writers:

```typescript
// Fast writers stay on the main thread
const consoleWriter = new ConsoleWriter();

// The heavy writer is offloaded to background worker threads
const heavyWriter = new ParallelWriter({
  workerPath: './parquet-worker.js',
  concurrency: 4
});

const multi = new MultiWriter(consoleWriter, heavyWriter);
await Job.run(reader, multi);
```
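To illustrate the benefit of bounded concurrency, here is a rough sketch of a writer that accepts new records while up to `concurrency` sink writes are still in flight. This models only the concurrency limit, not worker threads, and the `BufferedWriter` name and `sink` callback are hypothetical, not data-genie's implementation:

```typescript
// Hypothetical sketch of a concurrency-limited writer. The real
// ParallelWriter offloads work to worker threads; this only models
// keeping several slow writes in flight at once.
class BufferedWriter<T> {
  private pending = new Set<Promise<void>>();

  constructor(
    private sink: (record: T) => Promise<void>,
    private concurrency = 4
  ) {}

  async write(record: T): Promise<void> {
    // At the limit? Wait for any one in-flight write to settle first.
    while (this.pending.size >= this.concurrency) {
      await Promise.race(this.pending);
    }
    let p: Promise<void>;
    p = this.sink(record).finally(() => this.pending.delete(p));
    this.pending.add(p);
  }

  // Drain all in-flight writes before closing the pipeline.
  async flush(): Promise<void> {
    await Promise.all(this.pending);
  }
}
```

With this shape, a slow sink stalls the pipeline only once all `concurrency` slots are occupied, rather than on every record.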

Released under the MIT License.