# Generating Test Recommendations

SeaLights Test Optimization (TO) is an intelligent engine designed to balance two competing needs in software delivery: **speed** and **safety**.

The process works in two major phases:

1. **Recommendation:** SeaLights identifies every test that is relevant to a new code change based on a sophisticated mapping of code-to-tests.
2. **Prioritization & Selection:** The **Most at Risk Ordering** layer sorts these recommended tests so that the most "dangerous" tests (those most likely to find a bug) are always at the front of the line, and an **Optimization Strategy** (Conservative, Moderate, or Aggressive) determines the final subset to be executed.

By combining these phases, SeaLights ensures that even when you reduce your test volume, you are always running the most effective and necessary tests first.

### Phase 1: Test Recommendation Principles

Before any ordering or advanced optimization occurs, SeaLights compiles a list of "Essential Tests". This list is built by identifying all code changes in a new build and applying the following six core principles:

1. **Impacted Tests:** Any test that exercises a method that was modified in the current build.
2. **Recent Failures:** Any test that failed in the last run of this specific test stage, ensuring comprehensive re-validation.
3. **Pinned Tests:** Tests explicitly marked by the team as "must-run," which always bypass optimization logic.
4. **New/Unmapped Tests:** Any new tests that have not yet been mapped to code sections are included to ensure they are profiled.
5. **Dependent Tests:** Tests that must run alongside other recommended tests due to functional or logic dependencies.
6. **Previously Blocked Tests:** Tests that were recommended in a previous run but weren't executed, often due to pipeline failures or blockers.

**Tests that do not meet at least one of these principles are excluded from the recommendation list and automatically skipped.**
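The six principles act as inclusion predicates: a test is recommended if any one of them matches, and dependent tests are then pulled in transitively. A minimal sketch of that logic, using hypothetical field names (not the actual SeaLights schema):

```python
from dataclasses import dataclass, field

@dataclass
class Test:
    # All fields are illustrative placeholders, not SeaLights' data model.
    name: str
    covered_methods: set = field(default_factory=set)
    failed_last_run: bool = False
    pinned: bool = False
    mapped: bool = True
    depends_on: set = field(default_factory=set)
    previously_blocked: bool = False

def recommend(tests, changed_methods):
    """Return the tests matching at least one of the six principles."""
    selected = {
        t.name for t in tests
        if t.covered_methods & changed_methods   # 1. impacted by the change
        or t.failed_last_run                     # 2. recent failure
        or t.pinned                              # 3. pinned "must-run"
        or not t.mapped                          # 4. new / unmapped
        or t.previously_blocked                  # 6. previously blocked
    }
    # 5. dependent tests: keep adding tests that must run alongside
    # already-selected ones until the set stops growing.
    grew = True
    while grew:
        grew = False
        for t in tests:
            if t.name not in selected and t.depends_on & selected:
                selected.add(t.name)
                grew = True
    return [t for t in tests if t.name in selected]
```

Anything not captured by `recommend` is what the engine automatically skips.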

### Phase 2: Optimization Strategies

Once the "Essential" list is compiled, teams can choose a strategy to further refine the execution volume based on their specific delivery goals. All three strategies utilize the **Most at Risk Ordering** layer to determine the selection order.

**Cross-App** test stages have three strategy options, while **App-Level** test stages can use only the Conservative strategy.

#### 1. Conservative

* **Target:** 100% of recommended tests.
* **Logic:** Executes every test identified in Phase 1.
* **Best fit for:** Teams with Continuous Delivery (CD), hourly deployments, or pipelines requiring guaranteed full validation daily. Full validation occurs within 24 hours.

#### 2. Moderate

* **Target:** \~80% of recommended tests.
* **The "Double-Floor" Logic:**
  1. **Code Coverage Requirement:** The strategy **must** cover 100% of the changed methods. If reaching 80% volume doesn't achieve this, the engine adds more tests until all changes are exercised.
  2. **Volume Requirement:** If 100% change coverage is reached early (e.g., at 50%), the engine continues adding the next highest-risk tests until the 80% volume target is met.
* **Best fit for:** Teams with standard weekly/bi-weekly releases that rely on automation as the primary quality gate. Full validation occurs within the week.
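The "Double-Floor" rule can be sketched as a single greedy pass over the risk-ordered list: keep taking the highest-risk tests until both the coverage floor and the volume floor are satisfied. A sketch under assumed inputs (name and covered-changed-methods pairs, already sorted by risk):

```python
def select_double_floor(ordered_tests, changed_methods, volume_target):
    """ordered_tests: (name, covered_changed_methods) pairs, highest risk first.
    volume_target: 0.8 for Moderate, 0.6 for Aggressive."""
    selected = []
    uncovered = set(changed_methods)
    min_count = round(volume_target * len(ordered_tests))
    for name, covers in ordered_tests:
        # Stop only once BOTH floors hold:
        # Floor 1 -- every changed method is exercised at least once.
        # Floor 2 -- the volume target is met.
        if not uncovered and len(selected) >= min_count:
            break
        selected.append(name)
        uncovered -= covers
    return selected
```

Note the two overshoot directions this captures: if coverage completes early, selection continues to the volume floor; if the volume floor is reached with changed methods still uncovered, selection continues past it, which is why an Aggressive (60%) run can end up executing more than 60% of tests.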

#### 3. Aggressive

* **Target:** \~60% of recommended tests.
* **Logic:** Follows the same "Double-Floor" logic as the Moderate strategy but with a lower volume target (60%).
* **Best fit for:** Teams with Manual QA/Staging environments or high-volume code pushes. Full validation occurs before release or bi-weekly.

### How it Works: The Ordering Layer

The strategies above decide **how many** tests to pick, but the **Most at Risk Ordering** layer decides **which** tests to pick first. The engine calculates a Risk Score for each test to ensure that the chosen subset contains the highest-value tests.

#### Risk Scoring Signals

The engine evaluates risk based on three normalized statistical inputs:

* **Historical Failure Probability:** Analyzes the ratio of failures to total historical runs. Tests that have a higher historical failure rate are prioritized as they are statistically more likely to catch regressions.
* **Change-Coverage Proportion:** Measures how much of the current code change set is covered by a specific test. Tests that exercise a higher proportion of the modified logic are considered higher risk and moved to the top.
* **Impacted-but-Skipped Count:** Tracks how many times a test was relevant to a change but was skipped in previous runs. This prevents "test starvation": as a test is skipped more often, its risk priority naturally increases, ensuring latent regressions are eventually caught.

#### Customizable Weighting

Every organization has different risk profiles. SeaLights allows for the customization of how much each signal contributes to the final risk score. While we provide optimized defaults, these weights can be tuned per customer to align with specific quality goals:

* **Failure Probability Weight:** (Default: 50%) Focuses on historical stability.
* **Change-Coverage Weight:** (Default: 35%) Focuses on immediate relevance to current changes.
* **Impacted-but-Skipped Weight:** (Default: 15%) Focuses on long-term coverage health.
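With the default weights, the final score is a weighted sum of the three signals. A sketch, assuming each signal has already been normalized to the [0, 1] range:

```python
# Default weights from the documentation; tunable per customer.
DEFAULT_WEIGHTS = {
    "failure_probability": 0.50,
    "change_coverage": 0.35,
    "impacted_but_skipped": 0.15,
}

def risk_score(signals, weights=DEFAULT_WEIGHTS):
    """signals: the three normalized inputs, each in [0, 1]."""
    return sum(weights[k] * signals[k] for k in weights)
```

For example, a test that rarely fails (0.2) but covers most of the current change (0.8) and has been skipped repeatedly (1.0) scores 0.50 × 0.2 + 0.35 × 0.8 + 0.15 × 1.0 = 0.53.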

{% hint style="info" %}
**Pinned Tests:** Manually selected "must-run" tests always appear at the top, regardless of their risk score.
{% endhint %}

***

### Detailed Workflow Summary

1. **Identify Impacted Tests:** Test Impact Analysis (TIA) identifies all tests linked to the code change based on the **Six Recommendation Principles**.
2. **Score & Sort (Most at Risk Ordering):** Every recommended test is analyzed against the risk signals and sorted from highest to lowest risk.
3. **Apply Strategy Floor:** Select the top-ordered tests until **100% of modified methods** are exercised by at least one test.
4. **Apply Strategy Volume:** If the selection is still below the strategy target (60% or 80%), continue adding tests from the sorted list until the target is met.
5. **Output:** A prioritized execution list. Any remaining tests move to an **Appendix** (increasing their Impacted-but-Skipped count for the next run).
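The five steps above can be sketched end to end. This assumes each test already carries its computed risk score and the set of changed methods it covers; all field names are hypothetical:

```python
def plan_run(tests, changed_methods, volume_target):
    """tests: dicts with 'name', 'risk', 'pinned', and 'covers' keys.
    Returns (execution_list, appendix)."""
    # Step 2: sort by risk, with pinned tests always first.
    ordered = sorted(tests, key=lambda t: (not t["pinned"], -t["risk"]))
    uncovered = set(changed_methods)
    min_count = round(volume_target * len(ordered))
    execution = []
    for t in ordered:
        # Steps 3-4: stop once the coverage floor and volume floor both hold.
        if not uncovered and len(execution) >= min_count:
            break
        execution.append(t["name"])
        uncovered -= t["covers"]
    # Step 5: everything else moves to the Appendix.
    appendix = [t["name"] for t in ordered if t["name"] not in execution]
    return execution, appendix
```

Tests landing in the appendix are not lost: their Impacted-but-Skipped count rises, which raises their risk score in subsequent runs.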

### FAQ

<details>

<summary><strong>If I choose the Aggressive strategy (60%), is it possible I'll run 70% of my tests?</strong></summary>

Yes. If 60% of your tests are not enough to cover 100% of the methods you changed, SeaLights will prioritize safety and add tests until every code change is exercised at least once.

</details>

<details>

<summary><strong>Why run 80% of tests if 100% coverage is reached at 50%?</strong></summary>

Redundancy is often valuable. Running additional "high-risk" tests that exercise the same code from different angles or state configurations increases the statistical likelihood of catching a defect that a single test might miss.

</details>

<details>

<summary><strong>Does the "Conservative" strategy use the ordering?</strong></summary>

Yes. While 100% of tests are run, they are still ordered by risk. This provides "fail-fast" behavior, where the tests most likely to fail are executed first, saving developers time.

</details>
