ยท hands on

Save memory with TypeScript generators!

Memory usage is a crucial metric when developing applications in TypeScript. It's frequently ignored until the "JavaScript heap out of memory" error appears. This error commonly occurs when loading large datasets in an application. In this tutorial, we will learn how to load big datasets and iterate over them while minimizing our memory usage.
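As a quick refresher before we start: a generator produces values lazily, one at a time, instead of materializing an entire sequence in memory. A minimal sketch (the function name `countTo` is just illustrative):

```typescript
// A generator yields one value on demand rather than building
// the whole sequence up front, keeping memory usage flat.
function* countTo(limit: number): Generator<number> {
  for (let i = 1; i <= limit; i++) {
    yield i; // only this single value exists per iteration
  }
}

// Even for a huge limit, only one number lives in memory at a time.
for (const n of countTo(3)) {
  console.log(n); // prints 1, 2, 3
}
```

The same lazy-iteration idea is what makes streamed file processing cheap, as we will see below.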

Testing Scenario

I prepared a 184 MB CSV data file to compare the memory consumption of the iterable ReadableStream API with traditional synchronous file loading. The goal is to read the first line of the data and measure how much memory that operation requires. With process.memoryUsage(), we can log the heap usage in bytes. Here's the initial implementation:
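process.memoryUsage() reports all sizes in bytes; heapUsed is the field tracked throughout this article. A small, hedged helper (the name `memInMb` is my own) makes the readings easier to interpret:

```typescript
// process.memoryUsage().heapUsed is in bytes; convert to megabytes
// for readability. 1 MB = 1024 * 1024 bytes here.
function memInMb(): string {
  const bytes = process.memoryUsage().heapUsed;
  return `${(bytes / 1024 / 1024).toFixed(2)} MB`;
}

console.log(memInMb()); // prints something like "4.35 MB", varies per run
```

The listings below log the raw byte values instead, so the numbers in the comments can be compared exactly.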

sync-file.ts
import fs from 'node:fs';
import os from 'node:os';
import path from 'node:path';
import url from 'node:url';
 
function mem() {
  console.log(process.memoryUsage().heapUsed);
}
 
// ESM globals
const __filename = url.fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
 
const filePath = path.join(__dirname, '184mb.csv');
mem(); // 4559808
const file = fs.readFileSync(filePath, 'utf8');
mem(); // 197654832
const data = file.split(os.EOL);
mem(); // 338513256
 
for (const chunk of data) {
  console.log(chunk);
  mem(); // 338699088
  break;
}

Synchronous Execution

After the first call of fs.readFileSync, the memory consumption quickly reached around 197 MB, indicating that the entire CSV file had been loaded into memory. Splitting the data nearly doubles the memory usage as the split result is also stored in memory. Iterating over the first chunk and exiting using the break keyword has minimal impact on memory consumption. With roughly 338.70 MB being used, the memory consumption is quite high for such a small amount of business logic. Increasing the test data from 184 MB to 843 MB actually crashed the application with the following error message:

FATAL ERROR: v8::ToLocalChecked Empty MaybeLocal

Iterable Streams

To reduce memory consumption and avoid crashes, I restructured the business logic using the iterable ReadableStream API:

stream-file.ts
import fs from 'node:fs';
import path from 'node:path';
import url from 'node:url';
import split2 from 'split2';
 
function mem() {
  console.log(process.memoryUsage().heapUsed);
}
 
// ESM globals
const __filename = url.fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
 
const filePath = path.join(__dirname, '184mb.csv');
mem(); // 4588544
const file = fs.createReadStream(filePath, 'utf8');
mem(); // 4603376
const data = file.pipe(split2());
mem(); // 4611728
 
for await (const chunk of data) {
  console.log(chunk);
  mem(); // 4641592
  break;
}
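The for await loop above works because the stream is an async iterable. To connect this back to the generators in the title: you can wrap any async iterable in an async generator to add lazy, per-line logic without buffering. A sketch using an in-memory stand-in for the CSV stream (the names `toUpperLines` and `fakeStream` are illustrative, not from the article's code):

```typescript
import { Readable } from 'node:stream';

// An async generator layers a transformation on top of any async
// iterable (such as the split2 stream above), one line at a time.
async function* toUpperLines(
  lines: AsyncIterable<string> | Iterable<string>
): AsyncGenerator<string> {
  for await (const line of lines) {
    yield line.toUpperCase(); // only this line is held in memory
  }
}

// Usage with an in-memory stand-in for the real file stream:
const fakeStream = Readable.from(['id,name', '1,alice']);
for await (const line of toUpperLines(fakeStream)) {
  console.log(line); // ID,NAME then 1,ALICE
}
```

Because each stage pulls one chunk at a time, chaining generators like this keeps the flat memory profile measured above.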

The numbers above are impressive because memory consumption remains remarkably stable. It begins at just 4.5 MB and peaks at only about 4.6 MB. Even when handling the 843 MB sample data, memory usage barely increased. Compared to the 338.70 MB used in the previous scenario, this iterable code consumes only about 1.36% of the memory, saving roughly 98.64% of the RAM (about 334 MB).
