MLOps Blog Series Part 2: Testing the Robustness of Secure Machine Learning Systems…


Robustness is the ability of a closed-loop system to tolerate disturbances or anomalies while system parameters are varied over a wide range. Three kinds of tests are essential to ensure that a machine learning system is robust in production environments: unit testing, data and model testing, and integration testing.

Unit testing

Unit tests are performed on individual components, each of which has a single function within the larger system (such as a function that creates a new feature, a column in a DataFrame, or a function that adds two numbers). A recommended method for writing unit tests is the Arrange, Act, Assert (AAA) approach:

1. Arrange: Set up the schema, create object instances and create test data/inputs.
2. Act: Run code, invoke methods, set properties, and apply inputs to the components under test.
3. Assert: Check the results, validate them (confirm that the outputs received are as expected), and clean up any test-related residue.
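
The three AAA steps can be sketched as a small Python test. This is a minimal illustration, not a prescribed implementation: `add_total_column` is a hypothetical feature-creation function invented for the example, and the test is written as a plain function with `assert` statements so it runs with or without a test runner.

```python
def add_total_column(rows):
    """Hypothetical feature function: add a 'total' field to each record."""
    return [{**row, "total": row["price"] * row["quantity"]} for row in rows]


def test_add_total_column():
    # Arrange: create the test inputs.
    rows = [{"price": 2.0, "quantity": 3}, {"price": 1.5, "quantity": 4}]

    # Act: invoke the component under test.
    result = add_total_column(rows)

    # Assert: check the outputs and confirm the input was left untouched.
    assert [r["total"] for r in result] == [6.0, 6.0]
    assert "total" not in rows[0]


test_add_total_column()
```

Keeping the three phases visually separate makes it easy to see what a failing test was exercising.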

Data and model testing

It is important to test the integrity of the data and models in operation. Tests can be carried out in the MLOps pipeline to validate the integrity of data and the robustness of the model for training and inference. The following are some general tests that can be performed to validate the integrity of the data and the robustness of the models:

1. Data testing: The integrity of the data can be verified by checking the following five factors: accuracy, completeness, consistency, relevance, and timeliness. Important points to check when collecting or exporting data for model training and inference include the following:

• Rows and Columns: Check rows and columns to ensure that there are no missing values or incorrect patterns.

• Individual Values: Check that individual values fall within the expected range and are not missing, to ensure the correctness of the data.

• Aggregated Values: Review statistical aggregations for columns or groups within the data to understand data consistency, coherence, and accuracy.

2. Model testing: The model should be tested both during training and after training to ensure it is robust, scalable, and secure. The following are some aspects of the model test:

• Check the shape of the model input (for a serialized or non-serialized model).

• Check the shape and values of the model output.

• Behavioral tests (combinations of inputs and expected outputs).

• Load serialized or packaged model artifacts into memory and onto deployment targets. This confirms that the model deserializes correctly and is ready to be served in memory and on the deployment targets.

• Assess the accuracy or key metrics of the ML model.
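
The data and model checks listed above can be sketched together in plain Python. Everything named here is an assumption made for illustration: the schema (`x1`, `x2`, `label`), the value ranges, the `ThresholdModel` class standing in for a trained model, and the JSON format standing in for a serialized model artifact. A real pipeline would apply the same checks to its own data and model format.

```python
import json
from statistics import mean

EXPECTED_COLUMNS = {"x1", "x2", "label"}  # hypothetical schema


def check_data(rows):
    """Return a list of data problems; an empty list means all checks passed."""
    problems = []
    for i, row in enumerate(rows):
        # Rows and columns: expected schema, no missing values.
        if set(row) != EXPECTED_COLUMNS or any(v is None for v in row.values()):
            problems.append(f"row {i}: schema mismatch or missing value")
            continue
        # Individual values: features must fall within an assumed range.
        if not all(0.0 <= row[c] <= 1.0 for c in ("x1", "x2")):
            problems.append(f"row {i}: feature out of range")
    # Aggregated values: the label mean should not be degenerate.
    labels = [r["label"] for r in rows if r.get("label") is not None]
    if labels and not 0.1 <= mean(labels) <= 0.9:
        problems.append("label mean outside [0.1, 0.9]")
    return problems


class ThresholdModel:
    """Hypothetical model: predicts 1 when the feature sum exceeds a threshold."""

    n_features = 2

    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, batch):
        for row in batch:
            if len(row) != self.n_features:
                raise ValueError(f"expected {self.n_features} features")
        return [1 if sum(row) > self.threshold else 0 for row in batch]

    def to_json(self):
        return json.dumps({"threshold": self.threshold})

    @classmethod
    def from_json(cls, payload):
        return cls(**json.loads(payload))


def rejects_bad_shape(model):
    """Input-shape test: a row with the wrong feature count must be refused."""
    try:
        model.predict([[0.5]])
    except ValueError:
        return True
    return False


rows = [
    {"x1": 0.0, "x2": 0.0, "label": 0},
    {"x1": 1.0, "x2": 1.0, "label": 1},
    {"x1": 0.2, "x2": 0.3, "label": 0},
    {"x1": 0.9, "x2": 0.4, "label": 1},
]
assert check_data(rows) == []  # data tests pass on clean data

model = ThresholdModel(threshold=1.0)
batch = [[r["x1"], r["x2"]] for r in rows]
labels = [r["label"] for r in rows]
preds = model.predict(batch)

assert rejects_bad_shape(model)                             # input shape
assert len(preds) == len(batch)                             # output shape
assert model.predict([[0.0, 0.0], [1.0, 1.0]]) == [0, 1]    # behavioral test
restored = ThresholdModel.from_json(model.to_json())        # artifact round trip
assert restored.predict(batch) == preds
accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)
assert accuracy >= 0.75                                     # key metric threshold
```

Returning a list of problems from `check_data`, rather than raising on the first one, lets a pipeline log every data issue in a single run.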

Integration testing

Integration testing is the process of combining individual software components and testing them as a group (e.g., data processing, inference, or CI/CD).

Figure 1: Integration testing (two modules)

Let’s look at a simple hypothetical example of performing integration testing for two components of the MLOps workflow. In the Build module, the data ingestion and model training steps each have their own functionality, but when integrated, they train an ML model on the data ingested into the training step. By integrating Module 1 (data ingestion) and Module 2 (model training), we can perform data loading tests (to see whether the ingested data reaches the model training step), input and output tests (to confirm that each step receives and produces data in the expected format), and any other tests that are specific to the use case.
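
A minimal sketch of such a two-module integration test follows. Both modules are hypothetical stand-ins written for this example: `ingest_data` parses CSV text, and `train_model` fits a one-parameter line through the origin. The point is only to show the data loading and input/output checks across the module boundary.

```python
import csv
import io


def ingest_data(csv_text):
    """Module 1 (hypothetical): parse CSV text into (x, y) float pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(float(r["x"]), float(r["y"])) for r in reader]


def train_model(pairs):
    """Module 2 (hypothetical): least-squares fit of y = slope * x."""
    if not pairs:
        raise ValueError("no training data")
    slope = sum(x * y for x, y in pairs) / sum(x * x for x, _ in pairs)
    return {"slope": slope, "n_samples": len(pairs)}


def test_ingestion_feeds_training():
    csv_text = "x,y\n1,2\n2,4\n3,6\n"

    # Data loading test: the ingested rows arrive at the training step.
    pairs = ingest_data(csv_text)
    assert len(pairs) == 3

    # Input/output test: Module 1's output has the format Module 2 expects.
    assert all(isinstance(x, float) and isinstance(y, float) for x, y in pairs)

    # Combined behavior: training on the ingested data gives the expected fit.
    model = train_model(pairs)
    assert model["n_samples"] == 3
    assert abs(model["slope"] - 2.0) < 1e-9


test_ingestion_feeds_training()
```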

In general, integration testing can be performed in two ways:

1. Big Bang Testing: An approach where all components or modules are integrated at the same time and then tested as a unit.

2. Incremental Testing: During testing, two or more modules that are logically connected to each other are brought together and then the functionality of the application is tested. Incremental testing is performed in three ways:

• Top-down approach

• Bottom-up approach

• Sandwich approach: a combination of top-down and bottom-up

These approaches differ in the order in which modules are integrated and tested.

Figure 2: Integration testing (incremental testing)

The top-down testing approach is a way of performing integration testing from the top down in the control flow of a software system. Higher level modules are tested first, and then lower level modules are evaluated and merged to ensure the operation of the software. Stubs are used to test modules that are not ready yet. Benefits of a top-down strategy include the ability to get an early prototype, test key high-priority modules, and uncover and fix fatal bugs earlier. A disadvantage is that a large number of stubs are required and lower-level components may be undertested in some cases.
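
As a rough sketch of the top-down idea, the example below tests a top-level orchestration function while its lower-level steps are replaced by stubs. All names (`run_pipeline`, `stub_ingest`, `stub_train`) are hypothetical and chosen only for this illustration.

```python
def run_pipeline(ingest, train):
    """Top-level module: orchestrates the lower-level steps passed in."""
    data = ingest()
    if not data:
        raise ValueError("ingestion returned no data")
    return train(data)


def stub_ingest():
    """Stub for an ingestion module that is not ready yet: returns canned data."""
    return [1.0, 2.0, 3.0]


def stub_train(data):
    """Stub for a training module: returns a fixed-shape result, no training."""
    return {"n_samples": len(data), "status": "trained"}


# The top-level control flow is tested first, with stubs below it.
result = run_pipeline(stub_ingest, stub_train)
assert result == {"n_samples": 3, "status": "trained"}
```

As the real ingestion and training modules become available, each stub is swapped for the real implementation and the same test is rerun.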

The bottom-up testing approach tests the child modules first. The tested modules are then used to support testing of higher-level modules. When the child modules have been tested and integrated, the next level of modules is built on top of them, and the process continues until all top-level modules have been thoroughly evaluated. With the bottom-up technique, you don’t have to wait until all the modules are built. A disadvantage is that the essential modules (at the top level of the software architecture) that influence the program flow are tested last and are therefore more prone to defects.

The sandwich testing approach tests top-level modules alongside low-level modules while merging low-level components with top-level modules and evaluating them as a system. This is also called hybrid integration testing because it combines the top-down and bottom-up methods.
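
For contrast with top-down testing, the bottom-up order described above can be sketched as follows. The modules (`normalize`, `build_features`) are hypothetical, and the small calling functions play the role that is commonly called a test driver, a term not used in the text above: a driver calls a lower-level module directly before the module that will eventually call it has been integrated.

```python
def normalize(values):
    """Low-level module: scale values to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def build_features(raw):
    """Higher-level module that depends on normalize()."""
    return {"scaled": normalize(raw), "count": len(raw)}


def drive_normalize():
    # Step 1: a driver exercises the lowest-level module on its own.
    assert normalize([2.0, 4.0, 6.0]) == [0.0, 0.5, 1.0]
    assert normalize([5.0, 5.0]) == [0.0, 0.0]


def drive_build_features():
    # Step 2: the tested module now supports testing the level above it.
    features = build_features([10.0, 20.0])
    assert features == {"scaled": [0.0, 1.0], "count": 2}


drive_normalize()
drive_build_features()
```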

Learn more

For more details and practical implementation information, see the book Engineering MLOps, or learn how to build and deploy a model in Azure Machine Learning with MLOps in the “Get Time to Value with MLOps” best practices on-demand webinar. Also check out our recently announced blog post about the solution accelerator (MLOps v2), which simplifies your MLOps workstream in Azure Machine Learning.
