Performance Models support the AI-SPRINT design and runtime components in selecting an appropriate configuration to:
- avoid application performance violations,
- avoid under- or over-estimation of continuum resource utilisation, and
- predict the execution time of Deep Learning components on a target configuration.
The best regression model is built for each task and then used to predict the execution time of inference components/pipelines or training jobs, supporting the selection of the most appropriate system configuration: one that fulfils QoS requirements while minimising operational costs.
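The model-selection step above can be illustrated with a minimal sketch. This is not the a-MLLibrary API: the feature names (`num_cores`, `batch_size`), the synthetic profiling data, and the candidate model set are all assumptions for illustration. It fits several scikit-learn regressors on profiling samples, keeps the one with the lowest cross-validated error, and uses it to predict execution time on a target configuration.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Synthetic profiling samples: features [num_cores, batch_size] -> time (s).
X = rng.uniform([1, 8], [16, 256], size=(200, 2))
y = 100.0 / X[:, 0] + 0.05 * X[:, 1] + rng.normal(0, 0.5, 200)

# Candidate regression models; the real tool trains a configurable set.
candidates = {
    "ridge": Ridge(),
    "tree": DecisionTreeRegressor(random_state=0),
    "forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

def cv_mae(model):
    # Mean absolute error averaged over 5 cross-validation folds.
    return -cross_val_score(
        model, X, y, cv=5, scoring="neg_mean_absolute_error"
    ).mean()

# Keep the model with the lowest cross-validated error and refit on all data.
best_name, best_model = min(candidates.items(), key=lambda kv: cv_mae(kv[1]))
best_model.fit(X, y)

# Predict execution time for a hypothetical target deployment
# (8 cores, batch size 64).
pred = best_model.predict(np.array([[8.0, 64.0]]))[0]
print(best_name, round(pred, 2))
```

The same pattern extends naturally to richer feature sets (device type, memory, network bandwidth) and larger model families.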
The a-MLLibrary software is available at https://github.com/a-MLLibrary/a-MLLibrary and https://gitlab.polimi.it/ai-sprint/a-mllibrary, while the performance profiling tool a-GPUBench (for profiling TensorFlow training jobs) is available at https://gitlab.polimi.it/ai-sprint/a-gpubench. Both are licensed under the Apache License, Version 2.0.
The tool is a separate component. The current implementation is written in Python and relies on standard Python libraries. The tool is used by the SPACE4AI-D, SPACE4AI-R, AI models architecture search, and GPU scheduler components to estimate the execution time of inference or training tasks under different deployments.
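How a consumer such as a design-time scheduler might use these estimates can be sketched as follows. This is a hypothetical example, not the SPACE4AI-D interface: the deployment names, costs, and predicted times are invented, and the predicted times stand in for the performance model's output. Given a QoS deadline, the cheapest feasible deployment is selected.

```python
# Predicted execution times (from a performance model) per candidate
# deployment, with an hourly cost for each. All values are illustrative.
deployments = [
    {"name": "edge-2cores",  "cost_per_hour": 0.10, "pred_time_s": 9.5},
    {"name": "cloud-4cores", "cost_per_hour": 0.25, "pred_time_s": 4.8},
    {"name": "cloud-gpu",    "cost_per_hour": 0.90, "pred_time_s": 1.2},
]
deadline_s = 5.0  # QoS requirement on execution time

# Keep only deployments meeting the deadline, then minimise cost.
feasible = [d for d in deployments if d["pred_time_s"] <= deadline_s]
best = min(feasible, key=lambda d: d["cost_per_hour"])
print(best["name"])  # → cloud-4cores
```

This mirrors the stated goal: fulfil QoS requirements while minimising operational costs.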