Metrics

AI Model Monitor User Manual

Portfolio: IndustrialAI
Product: AI Model Monitor
Software version: 2.1.0
Language: en-US

Custom metrics created with the AI SDK are automatically collected from the topic

`/siemens/edge/aiinference/{model-name}/{model-version}/metrics/{component-name}/{metric-name}`.

If you wish to collect custom metrics, make sure that the topic is added to your configured Databus user and that the necessary credentials are also configured for the AI Model Monitor Agent.
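The topic template above can be filled in programmatically. The following is a minimal sketch; the model name, version, component, and metric names used here are illustrative placeholders, not values defined by the product:

```python
# Build the Databus topic for a custom metric, following the template
# /siemens/edge/aiinference/{model-name}/{model-version}/metrics/{component-name}/{metric-name}
# All concrete values passed in below are illustrative placeholders.
TOPIC_TEMPLATE = (
    "/siemens/edge/aiinference/"
    "{model_name}/{model_version}/metrics/{component_name}/{metric_name}"
)

def metrics_topic(model_name: str, model_version: str,
                  component_name: str, metric_name: str) -> str:
    """Return the fully qualified metrics topic for one custom metric."""
    return TOPIC_TEMPLATE.format(
        model_name=model_name,
        model_version=model_version,
        component_name=component_name,
        metric_name=metric_name,
    )

print(metrics_topic("quality-check", "1", "preprocessor", "frames_dropped"))
# /siemens/edge/aiinference/quality-check/1/metrics/preprocessor/frames_dropped
```

A topic built this way is what you would add to the configured Databus user before the agent can collect the metric.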

The metrics are extended with labels that identify the running pipeline from which they originate.

| Metric name | Metric type | Metric origin | Metric description |
| --- | --- | --- | --- |
| `host_box_availability` | gauge | Edge device | Device availability: 0 - not available, 1 - available. |
| `host_box_memory_total` | gauge | Edge device | Total available memory on the box, in bytes. |
| `host_box_memory_used` | gauge | Edge device | Used memory on the box, in bytes. |
| `host_box_cpu_percentage` | gauge | Edge device | Used CPU percentage. |
| `host_box_edge_uptime` | gauge | Edge device | Uptime of the box, in minutes. |
| `host_box_running_application_count` | gauge | Edge device | Number of running applications on the box. |
| `host_box_max_running_application_count` | gauge | Edge device | Maximum number of applications allowed to run. |
| `host_box_ApplicationName_status` | gauge | Edge device | Application availability: 0 - not available, 1 - available. |
| `pipeline_global_inputs` | gauge | AI Inference Server | Number of pipeline inputs generated since the active pipeline started running. |
| `pipeline_global_outputs` | gauge | AI Inference Server | Number of pipeline outputs generated since the active pipeline started running. |
| `pipeline_NodeName_exec_min` | gauge | AI Inference Server | Minimum node execution time. |
| `pipeline_NodeName_exec_max` | gauge | AI Inference Server | Maximum node execution time. |
| `pipeline_NodeName_exec_avg` | gauge | AI Inference Server | Average node execution time. |
| `pipeline_NodeName_inputs` | gauge | AI Inference Server | Number of node inputs. |
| `pipeline_NodeName_outputs` | gauge | AI Inference Server | Number of node outputs. |
| `pipeline_status` | gauge | AI Inference Server | Status of the AI Inference Server pipeline, stored as a label value. The metric itself always returns 0, irrespective of the pipeline status. |
| `pipeline_StepName_gpuruntime_inference_count` | gauge | AI Inference Server | Cumulative count of successful inference requests made for this model (DOES NOT include cache hits). |
| `pipeline_StepName_gpuruntime_execution_count` | gauge | AI Inference Server | Cumulative count of successful inference executions performed for the model (DOES NOT include cache hits). |
| `pipeline_StepName_gpuruntime_success_count` | gauge | AI Inference Server | Cumulative count of all successful inference requests made for this model (INCLUDING cache hits). |
| `pipeline_StepName_gpuruntime_success_duration` | gauge | AI Inference Server | Cumulative duration of all successful inference requests, in nanoseconds (INCLUDING cache hits). |
| `pipeline_StepName_gpuruntime_fail_count` | gauge | AI Inference Server | Cumulative count of all failed inference requests made for this model. |
| `pipeline_StepName_gpuruntime_fail_duration` | gauge | AI Inference Server | Cumulative duration of all failed inference requests, in nanoseconds. |
| `pipeline_StepName_gpuruntime_queue_count` | gauge | AI Inference Server | Cumulative count of inference requests that waited in scheduling or other queues (INCLUDING cache hits). |
| `pipeline_StepName_gpuruntime_queue_duration` | gauge | AI Inference Server | Cumulative duration that inference requests spent waiting in scheduling or other queues, in nanoseconds (INCLUDING cache hits). |
| `pipeline_StepName_gpuruntime_compute_input_count` | gauge | AI Inference Server | Cumulative count of input tensor data prepared as required by the model (DOES NOT include cache hits). |
| `pipeline_StepName_gpuruntime_compute_input_duration` | gauge | AI Inference Server | Cumulative duration to prepare input tensor data as required by the model, in nanoseconds (DOES NOT include cache hits). |
| `pipeline_StepName_gpuruntime_compute_output_count` | gauge | AI Inference Server | Cumulative count of output tensor data extracted from the model (DOES NOT include cache hits). |
| `pipeline_StepName_gpuruntime_compute_output_duration` | gauge | AI Inference Server | Cumulative duration to extract output tensor data produced by the model, in nanoseconds (DOES NOT include cache hits). |
| `pipeline_StepName_gpuruntime_cache_hit_count` | gauge | AI Inference Server | Count of response cache hits. |
| `pipeline_StepName_gpuruntime_cache_hit_duration` | gauge | AI Inference Server | Cumulative duration to look up and extract output tensor data from the response cache on a cache hit, in nanoseconds. |
| `pipeline_StepName_gpuruntime_cache_miss_count` | gauge | AI Inference Server | Count of response cache misses. |
| `pipeline_StepName_gpuruntime_cache_miss_duration` | gauge | AI Inference Server | Cumulative duration to look up and insert output tensor data into the response cache on a cache miss, in nanoseconds. |
| `pipeline_StepName_gpuruntime_batch_compute_input_duration` | gauge | AI Inference Server | Cumulative duration to prepare input tensor data as required by the model with the given batch size, in nanoseconds. |
| `pipeline_StepName_gpuruntime_batch_compute_input_count` | gauge | AI Inference Server | Cumulative count of input tensor data with the given batch size. |
| `pipeline_StepName_gpuruntime_batch_compute_output_duration` | gauge | AI Inference Server | Cumulative duration to extract output tensor data as required by the model with the given batch size, in nanoseconds. |
| `pipeline_StepName_gpuruntime_batch_compute_output_count` | gauge | AI Inference Server | Cumulative count of output tensor data with the given batch size. |
| `pipeline_StepName_gpuruntime_batch_compute_infer_duration` | gauge | AI Inference Server | Cumulative duration to execute the model with the given batch size, in nanoseconds. |
| `pipeline_StepName_gpuruntime_batch_compute_infer_count` | gauge | AI Inference Server | Cumulative count of model executions with the given batch size. |
| `is_drift` | gauge | Monitoring Node | Whether model drift is detected according to the collected buffer: -1 - buffer is being collected or an error occurred, 0 - no model drift detected, 1 - model drift detected. |
| `number_of_data_type_errors` | gauge | Monitoring Node | Number of data type errors. The default threshold is 0. |
| `number_of_missing_properties` | gauge | Monitoring Node | Number of missing properties. Ranges from 0 to the number of properties. |
| `ratio_of_categorical_out_of_domain_features` | gauge | Monitoring Node | Ratio of categorical out-of-domain features. Bounded between 0 and 1. |
| `ratio_of_numerical_out_of_domain_features` | gauge | Monitoring Node | Ratio of numerical out-of-domain features. Bounded between 0 and 1. |
| `ratio_of_numerical_outlier_features` | gauge | Monitoring Node | Ratio of numerical outlier features. |
| `<custom metric name>` | gauge | AI Inference Server | Custom metric defined in the AI SDK at model creation. |
| `error_metric_created` | gauge | AI Model Monitor Agent | Creation time of the error metric counter. |
| `error_metric_total` | counter | AI Model Monitor Agent | Number of errors that occurred during scraping. |
| `scrape_duration_seconds` | gauge | Prometheus | Duration of the scrape. |
| `scrape_samples_post_metric_relabeling` | gauge | Prometheus | Number of samples remaining after metric relabeling was applied. |
| `scrape_samples_scraped` | gauge | Prometheus | Number of samples the target exposed. |
| `scrape_series_added` | gauge | Prometheus | Approximate number of new series in this scrape. |
| `up` | gauge | Prometheus | Prometheus availability: 0 - not available, 1 - available. |
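Since the metrics above are scraped by Prometheus, individual samples follow the standard Prometheus text exposition format (`name{labels} value`). The following stdlib-only sketch reads one such sample line, assuming that format; the `pipeline` label name in the example is illustrative, not a label guaranteed by the product:

```python
import re

def parse_sample(line: str) -> tuple[str, dict, float]:
    """Parse one Prometheus text-exposition sample line into
    (metric name, labels, value). Assumes a simple well-formed line;
    a full parser would also handle escapes, timestamps, and comments."""
    m = re.match(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$',
                 line.strip())
    if m is None:
        raise ValueError(f"not a sample line: {line!r}")
    name, raw_labels, value = m.group(1), m.group(2), float(m.group(3))
    labels = dict(re.findall(r'(\w+)="([^"]*)"', raw_labels or ""))
    return name, labels, value

# is_drift encodes its status as documented in the table above.
DRIFT_STATUS = {-1: "buffer collecting or error", 0: "no drift", 1: "drift detected"}

name, labels, value = parse_sample('is_drift{pipeline="demo"} 1')
print(name, labels, DRIFT_STATUS[int(value)])
# is_drift {'pipeline': 'demo'} drift detected
```

The same helper works for availability metrics such as `up` or `host_box_availability`, whose 0/1 values map directly to not available/available.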