# Node.js: Memory Usage Explained

## Introduction  <a href="#introduction" id="introduction"></a>

Code coverage tools like Istanbul (used by `nyc`) and others are essential for measuring test coverage in JavaScript applications. However, they inevitably increase memory usage, both during the instrumentation process and at runtime. This document highlights the differences between **static** and **dynamic instrumentation**, explains why memory usage increases, and addresses common concerns raised by customers.

***

## Key Concepts: Static vs. Dynamic Instrumentation  <a href="#key-concepts-static-vs.-dynamic-instrumentation" id="key-concepts-static-vs.-dynamic-instrumentation"></a>

### Static Instrumentation  <a href="#static-instrumentation" id="static-instrumentation"></a>

* **Definition**: Modifies source code or bytecode before execution to insert instrumentation logic (e.g., for tracking coverage).
* **When It Happens**: During the build phase or pre-execution.
* **Example Tools**: Istanbul (e.g., via `nyc`), Babel plugins for code coverage
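
To make the concept concrete, here is a hand-written sketch of the kind of counters a static instrumenter inserts at build time. The counter object and names are illustrative, not Istanbul's actual generated code:

```javascript
// Counters the build step would inject; real tools generate a
// per-file coverage map keyed by statement/branch/function IDs.
const cov = { fn: 0, branchTrue: 0, branchFalse: 0 };

// Original source: function abs(x) { return x < 0 ? -x : x; }
function abs(x) {
  cov.fn++;                         // function hit counter
  if (x < 0) {
    cov.branchTrue++;               // branch counter (true arm)
    return -x;
  }
  cov.branchFalse++;                // branch counter (false arm)
  return x;
}

abs(-3);
abs(5);
console.log(cov); // { fn: 2, branchTrue: 1, branchFalse: 1 }
```

Because the counters ship inside the rewritten source, every loaded file carries this extra bookkeeping, which is where the instrumentation memory cost originates.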

### Dynamic Instrumentation  <a href="#dynamic-instrumentation" id="dynamic-instrumentation"></a>

* **Definition**: Modifies code on-the-fly during runtime to insert instrumentation logic.
* **When It Happens**: While the application is running.
* **Example Tools**: Sealights Node Agent (leveraging Istanbul for dynamic instrumentation).

***

## Memory usage <a href="#memory-usage" id="memory-usage"></a>

The general trend is that memory usage grows with the size of the instrumented source. Each row below pairs a source size with several observed memory readings:

| Source size | Observed memory usage |        |        |
| ----------- | --------------------- | ------ | ------ |
| 0.25MB      | 260MB                 | 300MB  | 280MB  |
| 2.5MB       | 1300MB                | 1350MB | 2200MB |
| 10.5MB      | 2500MB                | 3500MB | 4100MB |

## Memory Usage Analysis  <a href="#memory-usage-analysis" id="memory-usage-analysis"></a>

### Static Instrumentation (used when scanning and instrumenting Browser applications with Sealights)  <a href="#static-instrumentation-used-when-scanning-and-instrumenting-browser-applications-with-sealights" id="static-instrumentation-used-when-scanning-and-instrumenting-browser-applications-with-sealights"></a>

1. **Instrumentation Phase**:
   * The process of statically instrumenting code can cause significant memory spikes.
   * For large projects, customers have reported memory usage reaching 100% of available RAM during the build/instrument phase with the Sealights agent, especially when baseline usage was already high (for example, an 80% peak without the Sealights scan).
   * This is because tools like `nyc` parse, transform, and write back large amounts of source code, often holding intermediate representations in memory.
2. **Runtime Phase:**
   * Once instrumented, the code runs with minimal additional memory overhead since the instrumentation is already part of the source.
   * However, runtime performance may still be affected due to added tracking logic.
3. **Challenges:**
   * High memory consumption during the build phase can make static instrumentation infeasible for large-scale projects or resource-constrained environments.
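
The build-phase spike comes from holding transformed sources in memory. A per-file pipeline that writes each result out immediately bounds the peak to roughly the largest single file; `instrumentOne` below is a hypothetical stand-in for a real transformer such as Istanbul's:

```javascript
// Placeholder transform: a real instrumenter would parse the source,
// inject counters, and regenerate code (a much larger in-memory AST).
function instrumentOne(source) {
  return `/* counters */\n${source}`;
}

function instrumentAll(files, writeOut) {
  for (const [name, source] of Object.entries(files)) {
    // Only one transformed copy is alive at a time; writing it out
    // immediately keeps peak memory close to the largest single file,
    // instead of the sum of all transformed files.
    writeOut(name, instrumentOne(source));
  }
}

const out = {};
instrumentAll({ 'a.js': 'fn a', 'b.js': 'fn b' }, (n, s) => (out[n] = s));
console.log(Object.keys(out).length); // 2
```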

### Dynamic Instrumentation (used for everything besides Browser applications)  <a href="#dynamic-instrumentation-used-for-everything-else-beside-browser-applications" id="dynamic-instrumentation-used-for-everything-else-beside-browser-applications"></a>

1. **Instrumentation Phase**:
   * Dynamic instrumentation occurs gradually at runtime, avoiding the upfront memory spike seen in static approaches.
   * Memory usage is distributed over time as only executed code paths are instrumented.
2. **Runtime Phase:**
   * Higher memory overhead compared to statically instrumented code due to:
     * Maintaining runtime metadata and tracking structures.
     * Storing coverage data in memory until it is processed or written to disk.
   * Memory usage grows as more code paths are executed, which can lead to significant consumption in long-running applications.
3. **Challenges:**
   * Long-running applications or those with extensive execution paths may experience higher cumulative memory usage.
   * Requires careful management of runtime data structures to prevent excessive growth.
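
The growth described above can be sketched as a map of executed locations that only gains entries when new code paths run; the location IDs are made up for illustration:

```javascript
// Runtime coverage metadata: one entry per executed location.
// Memory grows with the number of distinct paths actually exercised.
const hits = new Map();
function track(id) { hits.set(id, (hits.get(id) || 0) + 1); }

function handle(req) {
  track('handle:entry');
  if (req.admin) { track('handle:admin'); return 'admin'; }
  track('handle:user');
  return 'user';
}

handle({ admin: false });
handle({ admin: false });
console.log(hits.size); // 2 — the admin branch has never executed
handle({ admin: true });
console.log(hits.size); // 3 — metadata grows when a new path runs
```

In a long-running service, this structure keeps growing until coverage data is flushed, which is why periodic processing or writing to disk matters.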

***

## Additional Information: Variability in Memory Usage  <a href="#additional-information-variability-in-memory-usage" id="additional-information-variability-in-memory-usage"></a>

It is important to note that predicting the exact memory overhead of instrumentation (static or dynamic) is inherently challenging due to several factors:

1. **Codebase Characteristics:**
   * The size of individual files (e.g., large files with thousands of lines of code will require more memory during parsing and transformation).
   * The total number of files in the project, as each file contributes to the overall memory footprint.
2. **Instrumentation Complexity:**
   * The complexity of the code being instrumented (e.g., deeply nested structures or complex logic may require more metadata and tracking structures).
3. **Execution Path Coverage:**
   * For dynamic instrumentation, the more code paths executed during runtime, the higher the memory usage for maintaining runtime metadata and coverage data.
4. **Environment Constraints:**
   * The available RAM and CPU resources on the system performing instrumentation can influence how efficiently the process executes.
   * Resource-constrained environments (e.g., CI/CD pipelines) may exacerbate memory spikes.
5. **Tool-Specific Behavior:**
   * Different tools handle instrumentation and coverage tracking differently, leading to variations in memory consumption. For example:
     * Static tools like `nyc` hold intermediate representations in memory during transformation.
     * Dynamic tools like Sealights' Node agent allocate memory incrementally at runtime.
6. **Project-Specific Factors:**
   * Frameworks or libraries used (e.g., Angular, React, or Node.js applications) may introduce additional overhead depending on their structure or build processes.
   * Specific configurations, such as excluding certain files from instrumentation, can significantly impact memory usage.

***

## Why Does Memory Usage Increase? <a href="#why-does-memory-usage-increase" id="why-does-memory-usage-increase"></a>

1. **Instrumentation Overhead:**
   * Static tools hold intermediate representations of files in memory while transforming them.
   * Dynamic tools maintain runtime metadata for each executed path.
2. **Tracking Execution Paths:**
   * Coverage tools must record which parts of the code were executed, requiring additional data structures in memory.
3. **Report Generation:**
   * Generating detailed coverage reports involves aggregating and processing large amounts of data, further increasing memory usage.
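
As a rough illustration of the report-generation step, aggregating per-file counters into a summary is pure in-memory work whose footprint scales with the number of files; the file names and shape below are made up:

```javascript
// Per-file coverage counters, as a tool might hold them before
// writing a report (illustrative structure, not Istanbul's schema).
const coverage = {
  'src/a.js': { statements: { covered: 40, total: 50 } },
  'src/b.js': { statements: { covered: 10, total: 50 } },
};

// Aggregate into a project-wide summary.
let covered = 0;
let total = 0;
for (const file of Object.values(coverage)) {
  covered += file.statements.covered;
  total += file.statements.total;
}
console.log(`${covered}/${total} = ${((covered / total) * 100).toFixed(1)}%`);
// 50/100 = 50.0%
```

With thousands of files, both the per-file maps and the aggregation pass contribute to the memory increase described above.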

***

## General Guidelines for Managing Memory Usage  <a href="#general-guidelines-for-managing-memory-usage" id="general-guidelines-for-managing-memory-usage"></a>

1. **Static Instrumentation:**
   * Try to exclude non-critical files from instrumentation to reduce the workload; for example, we provide such suggestions for Angular projects at the following [link](https://sealights.atlassian.net/wiki/spaces/SUP/pages/1080754179/Javascript%2B-%2BAngular%2B8%2Bcode%2Breports%2Bdouble%2Bmethods%2Bto%2BSealights#Update-the-.slignore-file-to-ignore-all-the-files-you-don%27t-want).
   * Run instrumentation on machines with sufficient RAM for large projects.
2. **General Recommendations:**
   * Monitor resource usage during both build and test phases to identify bottlenecks.
   * Follow bundlers' general guidelines to cap individual file sizes (for example, Webpack suggests keeping files under 5MB).
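
A minimal sketch of the monitoring recommendation: bracket a memory-heavy phase with `process.memoryUsage()` snapshots so the overhead shows up in CI logs. The simulated workload below stands in for an instrumentation step:

```javascript
const toMB = (bytes) => (bytes / 1024 / 1024).toFixed(1);

// Log current heap and resident-set size under a label.
function snapshot(label) {
  const { heapUsed, rss } = process.memoryUsage();
  console.log(`${label}: heapUsed=${toMB(heapUsed)}MB rss=${toMB(rss)}MB`);
  return heapUsed;
}

const before = snapshot('before instrumentation');
// Simulated memory-heavy phase: roughly 40MB of live heap data.
const retained = Array.from({ length: 50 }, () => new Array(100000).fill(0));
const after = snapshot('after instrumentation');
console.log(`chunks held: ${retained.length}, delta=${toMB(after - before)}MB`);
```

Running a snapshot before and after each build or test phase makes it easy to spot which step produces the spike.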

***

## Conclusion  <a href="#conclusion" id="conclusion"></a>

Memory consumption is an inherent challenge when using code coverage tools due to their need to track execution paths and generate reports. Both static and dynamic instrumentation have trade-offs:

* Static instrumentation causes significant memory spikes during the build phase but has lower runtime overhead.
* Dynamic instrumentation distributes its impact over time but may result in higher cumulative memory usage during long-running processes.
