What is Data-Genie?

Data-Genie is a high-performance, streaming-first ETL (Extract, Transform, Load) engine for Node.js and TypeScript.

It is designed for modern developers who need to process massive datasets (gigabytes or terabytes) without worrying about "Out of Memory" (OOM) errors. By using Node.js streams and async iterators, Data-Genie ensures your application maintains a constant memory footprint, regardless of whether you are processing 100 rows or 100 million rows.
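To illustrate the idea behind streams and async iterators, here is a minimal sketch that uses only Node.js built-ins. It is not Data-Genie's API: the names readLines, parseRows, countRows, and demo are hypothetical, and the comma split is a naive stand-in for a real CSV parser. Each stage pulls one line at a time, so memory use depends on the longest line rather than on the size of the file.

```typescript
import { createReadStream, mkdtempSync, writeFileSync } from "node:fs";
import { createInterface } from "node:readline";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Yield the file one line at a time. Only the current chunk and
// the current line are held in memory, never the whole file.
async function* readLines(path: string): AsyncGenerator<string> {
  const rl = createInterface({
    input: createReadStream(path, { encoding: "utf8" }),
    crlfDelay: Infinity, // treat \r\n as a single line break
  });
  for await (const line of rl) {
    yield line;
  }
}

// Hypothetical transform stage: skip blank lines and split on commas.
// (Naive: does not handle quoted fields.)
async function* parseRows(
  lines: AsyncIterable<string>,
): AsyncGenerator<string[]> {
  for await (const line of lines) {
    if (line.trim() !== "") {
      yield line.split(",");
    }
  }
}

// Consume the pipeline without ever collecting the rows into an array.
async function countRows(path: string): Promise<number> {
  let count = 0;
  for await (const _row of parseRows(readLines(path))) {
    count += 1;
  }
  return count;
}

// Small demo: write a three-line CSV to a temp directory and count it.
async function demo(): Promise<number> {
  const dir = mkdtempSync(join(tmpdir(), "stream-sketch-"));
  const file = join(dir, "sample.csv");
  writeFileSync(file, "id,name\n1,a\n2,b\n");
  return countRows(file);
}
```

The same pipeline shape works whether the file has three lines or many millions, because no stage buffers more than it is currently processing.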

While written in TypeScript for maximum type safety and developer experience, it is fully compatible with plain JavaScript projects.

Why Data-Genie?

Traditional JavaScript approaches, such as reading a file with fs.readFileSync and then transforming it with array.map, hold the entire dataset in memory. This works for small files but fails once the data outgrows available RAM.

Feature | Data-Genie | Traditional Approach
Memory Usage | Constant (e.g., ~15 MB) | Linear (proportional to file size)
Latency | Starts writing immediately | Waits for full read/parse
Scalability | Unlimited | Limited by available RAM

Key Features

  • Streaming First: O(1) memory complexity for all operations.
  • Multi-Format: CSV, TSV, JSON, NDJSON, Parquet, Excel, SQL.
  • Transport Agnostic: Local FS, S3, HTTP, Memory.
  • Type Safe: Built with TypeScript and supports Zod for schema validation.
  • Fault Tolerant: Built-in retries and Dead Letter Queue (DLQ) support.
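The retry and Dead Letter Queue pattern named in the last bullet can be sketched in plain TypeScript. This is a generic illustration under stated assumptions, not Data-Genie's implementation or API: withRetry, DeadLetter, and the onDeadLetter callback are hypothetical names. Each record is retried up to a fixed number of attempts; a record that still fails is handed to the dead-letter callback and the stream continues instead of aborting.

```typescript
// Hypothetical shape of a dead-letter entry.
type DeadLetter<T> = {
  record: T;
  error: string;
  attempts: number;
};

// Apply `handler` to each record as it arrives. Successful results are
// yielded one by one; records that fail `maxAttempts` times are passed
// to `onDeadLetter` and skipped, so one bad record does not stop the run.
async function* withRetry<T, R>(
  source: Iterable<T> | AsyncIterable<T>,
  handler: (record: T) => Promise<R>,
  onDeadLetter: (entry: DeadLetter<T>) => void,
  maxAttempts = 3,
): AsyncGenerator<R> {
  for await (const record of source) {
    let lastError: unknown;
    let succeeded = false;

    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      let result: R;
      try {
        result = await handler(record);
      } catch (err) {
        lastError = err; // remember the failure and try again
        continue;
      }
      succeeded = true;
      yield result;
      break;
    }

    if (!succeeded) {
      onDeadLetter({
        record,
        error: lastError instanceof Error ? lastError.message : String(lastError),
        attempts: maxAttempts,
      });
    }
  }
}
```

Because withRetry is itself an async generator, it keeps the streaming property: results are produced as they succeed rather than collected at the end. A production version would typically add a backoff delay between attempts and write dead letters to durable storage.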

Released under the MIT License.