Statistical modeling

As parallel testing becomes more common as a way to shorten testing cycles, SeaLights uses machine learning and AI, in the form of statistical modeling, to link each code method with its corresponding tests.

This process involves analyzing every build during testing to pinpoint which tests activate specific code areas, drilling down to the method level. The analysis primarily centers on the timing of tests, determining the code that was triggered during specific time intervals.
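The timing-based idea can be illustrated with a minimal sketch. This is not SeaLights' implementation; the `Execution` and `CoverageEvent` types, the method names, and the timestamps are hypothetical and exist only to show how test time windows can be compared against the moments at which code was triggered.

```python
from collections import defaultdict
from typing import NamedTuple


class Execution(NamedTuple):
    """Hypothetical record of one test run: its name and time window."""
    name: str
    start: float
    end: float


class CoverageEvent(NamedTuple):
    """Hypothetical record of a method being triggered at a point in time."""
    method: str
    timestamp: float


def candidate_tests(executions, events):
    """Map each method to every test that was running when it was triggered."""
    mapping = defaultdict(set)
    for event in events:
        for execution in executions:
            if execution.start <= event.timestamp <= execution.end:
                mapping[event.method].add(execution.name)
    return dict(mapping)


executions = [
    Execution("login_flow", 0.0, 5.0),
    Execution("checkout_flow", 3.0, 9.0),
]
events = [
    CoverageEvent("Auth.verify", 1.0),      # only login_flow is running
    CoverageEvent("Cart.total", 4.0),       # both tests overlap here
    CoverageEvent("Payment.charge", 8.0),   # only checkout_flow is running
]
mapping = candidate_tests(executions, events)
```

In this sketch, `Cart.total` is attributed to both tests because their windows overlap at the moment it was triggered. That ambiguity is what later builds help to resolve.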

Test Impact Analysis (TIA) starts reducing test execution time and cost as soon as the second build has been analyzed, and its mapping accuracy improves with each subsequent build. Initially, each test is associated with broad code coverage, which then narrows based on statistical insights into which code areas are actually triggered. The effectiveness of this mapping depends on several factors, including how test execution is orchestrated and how the testing environments and labs are set up. More accurate mapping leads to greater time and cost savings through TIA.
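One way to picture this narrowing is as a co-occurrence score that is recomputed after every build. The sketch below is an assumption-laden simplification, not the actual SeaLights model: the `association_scores` function, the sample data, and the 0.9 threshold are all hypothetical.

```python
from collections import Counter


def association_scores(builds):
    """Score each (test, method) pair by how often they co-occur.

    builds: list of {test name: set of methods triggered while it ran}.
    The score is the fraction of the test's runs in which the method was hit.
    """
    runs = Counter()
    hits = Counter()
    for observed in builds:
        for test, methods in observed.items():
            runs[test] += 1
            for method in methods:
                hits[(test, method)] += 1
    return {pair: count / runs[pair[0]] for pair, count in hits.items()}


# Hypothetical observations from three builds of the same test.
builds = [
    {"checkout_flow": {"Cart.total", "Payment.charge", "Auth.verify"}},
    {"checkout_flow": {"Cart.total", "Payment.charge", "Search.query"}},
    {"checkout_flow": {"Cart.total", "Payment.charge"}},
]
scores = association_scores(builds)

# Keep only pairs seen in (nearly) every run; 0.9 is an arbitrary cut-off.
likely = {pair for pair, score in scores.items() if score >= 0.9}
```

After three builds, only `Cart.total` and `Payment.charge` remain strongly associated with `checkout_flow`. The methods that appeared once were most likely triggered by other tests running at the same time.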


Efficient Code-to-Test Mapping with Parallel Testing

SeaLights uses machine learning and AI to establish precise connections between code methods and their corresponding tests via statistical modeling. The effectiveness of this modeling depends directly on how test execution is orchestrated. Clearly separating builds, testing environments, test types, and test labs makes statistical modeling more efficient, resulting in greater savings. Let's examine the impact of different configurations on TIA.

Single Lab (Test Environment)

When employing a single lab to run all tests concurrently, timing becomes a crucial factor. Since there can only be one test stage, all tests, regardless of type, are executed on the same stage without distinction. TIA relies solely on the start and end times of each test, comparing them to the code triggered during specific time intervals. Test order and timing variations enable the identification of which test triggers which code. Over time, these differences can enhance statistical modeling accuracy.

However, running the same tests in parallel with identical order and timing on every execution of the test stage hinders the statistical model's ability to learn the individual impact of each test. In that case, accuracy does not improve over time, and many tests remain linked to many pieces of code, which results in larger test recommendation lists.
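The effect of a fixed schedule versus a varied one can be shown with a toy simulation. It assumes a deliberately simplified learner that intersects the candidate methods seen for each test across builds; the real model is statistical and is not described here. `TRUE_COVERAGE`, the test names, and the batches are hypothetical.

```python
# Hypothetical ground truth: each test really triggers exactly one method.
TRUE_COVERAGE = {"A": {"m_a"}, "B": {"m_b"}, "C": {"m_c"}, "D": {"m_d"}}


def observe(batches):
    """For one build, attribute every method hit in a batch to all its tests."""
    seen = {}
    for batch in batches:
        hit = set().union(*(TRUE_COVERAGE[test] for test in batch))
        for test in batch:
            seen[test] = hit
    return seen


def learn(builds):
    """Intersect each test's candidate methods across all builds."""
    mapping = {}
    for batches in builds:
        for test, methods in observe(batches).items():
            mapping[test] = mapping.get(test, methods) & methods
    return mapping


# Same parallel pairs on every build: nothing new is ever learned.
fixed = learn([[("A", "B"), ("C", "D")]] * 3)

# Pairs change between builds: overlaps differ, so candidates narrow.
varied = learn([
    [("A", "B"), ("C", "D")],
    [("A", "C"), ("B", "D")],
    [("A", "D"), ("B", "C")],
])
```

With the fixed schedule, test `A` stays linked to both `m_a` and `m_b` no matter how many builds run. With the varied schedule, it is narrowed to `m_a` after the second build.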

Multiple Labs (Test Environments)

Employing multiple test labs in parallel testing can significantly elevate the effectiveness of statistical modeling, continuously refining the accuracy and efficiency of Test Impact Analysis (TIA). However, successful implementation necessitates efficient orchestration and monitoring of test interactions with the code. By minimizing test overlap and maximizing separation between test execution environments, statistical modeling can rapidly generate an accurate map of code and test connections. An ideal approach involves running multiple test groups/sets in parallel, where each group sequentially executes on a distinct lab.
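As a rough illustration of that layout, the sketch below assigns each test group to its own lab and runs the group's tests one after another. The `schedule` function, the lab naming scheme, the group names, and the durations are all hypothetical; real orchestration would be handled by the CI system.

```python
def schedule(groups, durations):
    """Plan one lab per test group, with the group's tests run sequentially.

    groups: {group name: ordered list of test names}
    durations: {test name: expected duration in minutes}
    Returns {lab name: [(test, start, end), ...]}.
    """
    plan = {}
    for index, tests in enumerate(groups.values()):
        lab = f"lab-{index + 1}"  # hypothetical lab naming scheme
        clock = 0.0
        slots = []
        for test in tests:
            slots.append((test, clock, clock + durations[test]))
            clock += durations[test]
        plan[lab] = slots
    return plan


groups = {
    "api": ["login", "checkout"],
    "ui": ["search", "profile"],
}
durations = {"login": 2.0, "checkout": 3.0, "search": 4.0, "profile": 1.0}
plan = schedule(groups, durations)
```

Within each lab, no two tests overlap in time, so code triggered in that lab during a given window can be attributed to a single test. The groups still run in parallel across labs, which keeps the overall cycle short.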

Container platforms such as PCF and Kubernetes have greatly simplified the creation, management, and decommissioning of test environments. They make it simpler and more cost-effective to tailor testing workflows to specific requirements, which gives organizations the flexibility to optimize their testing processes and get the most out of TIA.
