Limitations and known issues

AI Inference Server Release Notes

Product: AI Inference Server
Product Version: 2.1.0
Language: en-US

Limitations

  • Only one AI Inference Server application instance can be installed on an edge device. If you try to install more than one application instance with different MLFB numbers, you’ll get error messages, and you won’t be able to use AI Inference Server.

  • The AI Inference Server app relies on the ready-to-use connectors provided by the Siemens Industrial Edge platform. The performance limitations of the connectors are listed in the help of the respective app or connector.

  • The performance of the connectors and the resolution of the collected data are the responsibility of each connector provided by the Edge platform. Restrictions and limitations are documented in the respective user manual.

  • AI Inference Server can only browse tags available on the data bus of the Edge device on which it is installed, as the current version of the Edge platform does not allow access to the data bus of other Edge devices from one Edge device.

  • In this version, the browse tags capability covers only tags available on the data bus coming from the SIMATIC S7 connector, the SIMATIC S7+ connector, and the OPC UA connector.

  • Currently only static IP addresses are supported.

  • Free space management of local storage is not part of AI Inference Server. The user should periodically check that there is enough free space to work with a new configuration package file, which also contains the AI model.

  • AI Inference Server was tested with ML pipelines prepared and packaged using AI SDK and AI Model Deployer applications.

  • The state of a Python model is not persisted when the user restarts a pipeline: every run/start of a pipeline via the AI Inference Server UI is treated as a new call to Python's 'import' directive (one way to keep state across restarts is sketched after the key table at the end of this list).

  • The performance of parallel step execution strongly depends on the libraries used; if they run on the same CPU core, the overall performance of the pipeline may in some cases be only slightly better.

  • AI Inference Server truncates user-created Python log messages at 1500 characters for performance reasons.

  • Data access can appear successful even if a topic does not exist on the data bus (reason: it is possible to subscribe to a non-existing topic).

  • All steps can receive values from the data bus (via AI Inference Server) without considering the pipeline order when the data of the steps comes from different AI Runtime family apps (asynchronous mode), which might cause problems in some scenarios.

  • The following keys are used for internal purposes and cannot be used in Python scripts:

Name of key | Comment
"timestamp" | AI Inference Server inserts the "timestamp" variable on the data bus

Technical information

  • AI Inference Server converts the message data type from the data bus according to the following table (left column: the type on the data bus; right column: the converted type that AI Inference Server uses internally). When the data type of an input or output is 'Integer', the following data bus types are accepted: 'Int', 'Byte', 'DInt', 'Word', 'LInt', 'SInt', 'USInt', 'UInt', 'UDInt', 'ULInt', 'DWord', 'LWord', 'Char'. An illustrative handling sketch follows the table.

Databus type | Converted type (how AI Inference Server handles it)
'Bool' | 'Boolean'
'Int' | 'Integer'
'Byte' | 'Integer'
'DInt' | 'Integer'
'Real' | 'Double'
'String' | 'String'
'Word' | 'Integer'
'LInt' | 'Integer'
'SInt' | 'Integer'
'USInt' | 'Integer'
'UInt' | 'Integer'
'UDInt' | 'Integer'
'ULInt' | 'Integer'
'LReal' | 'Double'
'DWord' | 'Integer'
'LWord' | 'Integer'
'Char' | 'String'
'WChar' | 'String'
'WString' | 'String'
'Date' | 'String'
'DateTime' | 'String'
'Date_And_Time' | 'String'
'DT' | 'String'
'DTL' | 'String'
'LDT' | 'String'
'TOD' | 'String'
'Time_Of_Day' | 'String'
'LTOD' | 'String'
'LTime_Of_Day' | 'String'
'S5Time' | 'String'
'Time' | 'String'
'LTime' | 'String'
'BoolArray' | 'BooleanArray'
'ByteArray' | 'Int8Array' or 'IntegerArray'
'SIntArray' | 'Int8Array' or 'IntegerArray'
'IntArray' | 'Int16Array' or 'IntegerArray'
'DIntArray' | 'Int32Array' or 'IntegerArray'
'LIntArray' | 'Int64Array' or 'IntegerArray'
'USIntArray' | 'UInt8Array' or 'IntegerArray'
'UIntArray' | 'UInt16Array' or 'IntegerArray'
'UDIntArray' | 'UInt32Array' or 'IntegerArray'
'ULIntArray' | 'UInt64Array' or 'IntegerArray'
'WordArray' | 'UInt16Array' or 'IntegerArray'
'DWordArray' | 'UInt32Array' or 'IntegerArray'
'LWordArray' | 'UInt64Array' or 'IntegerArray'
'RealArray' | 'Float32Array' or 'DoubleArray'
'LRealArray' | 'Float64Array' or 'DoubleArray'
'StringArray' | 'StringArray'
'WStringArray' | 'StringArray'
'CharArray' | 'StringArray'
'WCharArray' | 'StringArray'
'DateArray' | 'StringArray'
'DateTimeArray' | 'StringArray'
'Date_And_TimeArray' | 'StringArray'
'DTArray' | 'StringArray'
'DTLArray' | 'StringArray'
'LDTArray' | 'StringArray'
'TODArray' | 'StringArray'
'Time_Of_DayArray' | 'StringArray'
'LTODArray' | 'StringArray'
'LTime_Of_DayArray' | 'StringArray'
'S5TimeArray' | 'StringArray'
'TimeArray' | 'StringArray'
'LTimeArray' | 'StringArray'
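
The sketch below illustrates how a Python step might defensively handle inputs according to the conversion table above: many PLC integer types arrive as a single 'Integer' type, 'Real'/'LReal' as 'Double', and date/time and character types as 'String'. The 'run' entry point, the dict-shaped 'data' argument, and the variable names are assumptions for illustration only, not a documented AI Inference Server contract.

def run(data):
    # Hypothetical 'Integer' input: may originate from Int, Byte, DInt, Word, ... on the data bus.
    counter = int(data["counter"])
    # Hypothetical 'Double' input: 'Real' and 'LReal' are both handled as 'Double'.
    temperature = float(data["temperature"])
    # Hypothetical 'String' input: date/time types are handled as strings; parse them if needed.
    recorded_at = str(data["recorded_at"])
    return {"count": counter, "temperature_f": temperature * 1.8 + 32.0, "recorded_at": recorded_at}
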
  • AI Inference Server supports configuration package execution based on the following table (a runtime version-check sketch follows the table):

Parameter | AI Inference Server (MLFB 6AV2170-0LA10-0AA0) | AI Inference Server - 3 pipelines (MLFB 6AV2170-0LA10-1AA0) | AI Inference Server GPU accelerated (MLFB 6AV2170-0LA11-0AA0)
Supported Python versions | 3.8.19 and 3.10.14 | 3.8.19 and 3.10.14 | 3.8.19 and 3.10.14
Supported Pip version | 24.0 | 24.0 | 24.0
Supported NumPy versions | 1.x | 1.x | 1.x
CPU configuration package | Yes | Yes | Yes
GPU configuration package | Not supported | Not supported | Yes
Max. size of configuration package | 2.2 GB | 2.2 GB | 2.2 GB
Reserved memory for Python runtime | 1.5 GB | 4 GB | 8 GB
Reserved memory for External (GPU) runtime | n.a. | n.a. | 5 GB
Max. number of pipelines executed simultaneously | 1 | 3 | 1
Max. number of variables in a pipeline | 1000 | 1000 | 1000
Max. length of pipeline description | 2000 | 2000 | 2000
MQTT keep alive | 60 secs | 60 secs | 60 secs
ONNX operations on GPU | n.a. | n.a. | up to 19, based on ONNX Runtime 1.14.1
Timeout of configuration package import | 5 mins | 5 mins | 5 mins
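
If a configuration package should fail fast when executed on an unsupported runtime, the supported versions from the table above can be checked at the start of a pipeline step. This is a minimal sketch under the assumption that such a check is done inside your own Python code; it is not a feature of AI Inference Server.

import sys
import numpy as np

def check_runtime_versions():
    # Supported interpreters per the table above: Python 3.8.19 and 3.10.14.
    if sys.version_info[:2] not in [(3, 8), (3, 10)]:
        raise RuntimeError("Unsupported Python version: %s" % sys.version)
    # Supported NumPy versions per the table above: 1.x.
    if not np.__version__.startswith("1."):
        raise RuntimeError("Unsupported NumPy version: %s" % np.__version__)
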
  • Maximum payload size with a frequency of 1 message per second and 200 MB of dedicated memory for the Orchestrator:

    • With a static payload size (payload sent as one large string), memory fragmentation does not occur:

      • using SIMATIC S7 Connector v1.6.0-4: 6 MB

      • using IE MQTT Connector v1.5.1: 10 MB

    • With a dynamic payload size, memory fragmentation may occur:

      • using SIMATIC S7 Connector v1.6.0-4: 6 MB

      • using IE MQTT Connector v1.5.1: 6 MB

Known issues

  • When you download the logs of an edge application, the file is stored with the extension ".tar" by default although the content is a zip archive. Rename the file from .tar to .zip if the archive reader you use cannot recognize the archive type from the file contents.

  • The application may crash if a pipeline contains more variables than supported (due to the memory limit). See the configuration package execution table in Technical information.

  • If any problem occurs (e.g., AI Inference Server or the Edge box is restarted), the logging configuration needs to be re-enabled (switch it off and on again).

  • If the memory limit is reached while a pipeline is running, the pipeline execution crashes without warning. AI Inference Server continues to work, an error pops up in its UI, and the process goes into the Error state; however, there is no visible indication that the memory limit was reached.

  • With inter-signal alignment, AI Inference Server may swallow a row of input. Inter-signal alignment assumes a continuous stream of input data and might not handle gaps correctly if the gap length significantly exceeds the defined alignment periodicity. The last alignment window might not be forwarded until a signal belonging to the next alignment window is detected. Workaround: test your pipeline and fine-tune the inter-signal alignment time interval as appropriate for your use case. The test must send one additional row at the end to make sure all preceding input rows are flushed from the buffer and forwarded to the pipeline (see the sketch below).
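
A sketch of the trailing-row idea from the workaround above: the test sends all real rows and then one extra row that falls into the next alignment window so the buffered window is flushed. The 'publish' and 'make_flush_row' callables stand for whatever mechanism your test uses to feed data towards the pipeline (connector, data bus client, ...); they are placeholders, not AI Inference Server APIs.

def send_rows_with_flush(rows, publish, make_flush_row):
    # Send the real test rows first.
    for row in rows:
        publish(row)
    # Then send one extra row belonging to the next alignment window (e.g., with a later
    # timestamp) so all preceding rows are flushed from the buffer and reach the pipeline.
    publish(make_flush_row())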

  • Parallel importing in multiple browser tabs is not supported. Workaround: use only one browser tab when you import a configuration package.

  • It is not possible to downgrade AI Inference Server to an earlier version via Edge Management. Workaround: uninstall the existing application from your edge device and install the earlier version. The previously imported/received pipelines and their mapping information will be lost; you need to re-import and re-map them.

  • The application description (displayed in Edge Management) is not refreshed after an upgrade due to an Edge Management issue.

  • LogModule can only write strings to the log. Any other type causes a fatal error inside the Runtime and the pipeline is killed, for example:

def run(data):
    log.info([1, 2, 3])  # not a string: fatal error inside the Runtime, the pipeline is killed
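
If non-string values need to be logged, converting them to a string first avoids the fatal error. A minimal sketch, assuming the same 'log' object and 'run(data)' entry point as in the snippet above:

def run(data):
    values = [1, 2, 3]
    # LogModule accepts strings only, so convert before logging.
    # Keep in mind that messages are truncated at 1500 characters (see Limitations).
    log.info(str(values))
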
  • A "503 Service unavailable" message may appear after an application upgrade (via Edge Management), and the upgraded version of the application does not work properly. Workaround: restart the application.

  • Parameter data received from the data bus may congest together with variable data if the Python script processes it too slowly (reason: parameters and variables are read from the same data connector). The internal queue can be fine-tuned if necessary; see the user manual.

  • Simultaneous pipeline execution may cause an application crash if the memory used by the internal Python Runtime reaches the limit.

  • A performance drop might occur when multiple pipelines are executed (more computation is required from the same CPU).

  • If inputs arrive faster than the Python script can handle them, out-of-memory issues may occur. Workaround: test your AI models before deploying them to production.

  • The application does not check how many CPU cores exist; it is possible to set a higher number for parallel step execution than the available CPU cores, which may cause performance degradation. A maximum of 8 cores is supported.

  • Data transfer between pipeline steps may be delayed if the hostname of the ZMQ connection is invalid and the "Object" type is defined for the pipeline output. Workaround: make sure the outbound ZMQ connection hostname is correct, present, and ready to accept incoming connections.

  • There are page rendering problems in the Mozilla Firefox browser: the selected tab is not underlined on the "Pipeline Visualization", "Settings", and "About & legal information" pages.

  • The Databus application can crash in case of high-frequency pipeline input (this is outside the scope of AI Inference Server). Reason: if the first step (usually pre-processing) is much slower than the frequency of the pipeline input, inputs may pile up in the network layer, causing an out-of-memory crash in the Databus application; the Databus is silently restarted and AI Inference Server reconnects automatically. Note: inter-signal alignment use cases are not impacted, because their inputs are handled immediately and therefore cannot pile up. Workaround: creating an additional pipeline step before the first step that simply forwards its input as output may help, because pipeline inputs then cannot pile up in the network layer; an internal queue/buffer operates between the pipeline steps instead (see the sketch below). This helps when there are only a few peaks (occasionally longer pre-processing times). Depending on the app-level data-loss configuration (in config.json), the pipeline is either stopped with an error message or warning log entries are created.
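
A minimal sketch of the forwarding step described in the workaround above; it assumes the same 'run(data)' entry-point convention as the logging example earlier and simply returns its input unchanged, so input peaks are absorbed by the internal queue between pipeline steps instead of the network layer.

def run(data):
    # Forward the input unchanged; the slow pre-processing runs in the next pipeline step.
    return data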

  • Open-Source Components Notes

Vendor | Open-Source Component name | Version | Risk description

Go | The GO programming language | 1.22.1
  1. An attacker may cause an HTTP/2 endpoint to read arbitrary amounts of header data by sending an excessive number of CONTINUATION frames. Maintaining HPACK state requires parsing and processing all HEADERS and CONTINUATION frames on a connection. When a request's headers exceed MaxHeaderBytes, no memory is allocated to store the excess headers, but they are still parsed. This permits an attacker to cause an HTTP/2 endpoint to read arbitrary amounts of header data, all associated with a request which is going to be rejected. These headers can include Huffman-encoded data which is significantly more expensive for the receiver to decode than for an attacker to send. The fix sets a limit on the amount of excess header frames we will process before closing a connection. (https://nvd.nist.gov/vuln/detail/CVE-2023-45288)
  2. On Darwin, building a Go module which contains CGO can trigger arbitrary code execution when using the Apple version of ld, due to usage of the -lto_library flag in a "#cgo LDFLAGS" directive. (https://nvd.nist.gov/vuln/detail/CVE-2024-24787)
  3. A malformed DNS message in response to a query can cause the Lookup functions to get stuck in an infinite loop. (https://nvd.nist.gov/vuln/detail/CVE-2024-24788)
  4. Mishandling of corrupt central directory record. The archive/zip package's handling of certain types of invalid zip files differed from the behavior of most zip implementations. This misalignment could be exploited to create a zip file with contents that vary depending on the implementation reading the file. The archive/zip package now rejects files containing these errors. (https://nvd.nist.gov/vuln/detail/CVE-2024-24789)
  5. Unexpected behavior from Is methods for IPv4-mapped IPv6 addresses. The various Is methods (IsPrivate, IsLoopback, etc.) did not work as expected for IPv4-mapped IPv6 addresses, returning false for addresses which would return true in their traditional IPv4 forms. (https://nvd.nist.gov/vuln/detail/CVE-2024-24790)

OpenSSL Project | OpenSSL | 3.1.5
  1. An attacker may exploit certain server configurations to trigger unbounded memory growth that would lead to a Denial of Service. If the non-default SSL_OP_NO_TICKET option is being used (but not if early_data support is also configured and the default anti-replay protection is in use), then under certain conditions the session cache can get into an incorrect state and it will fail to flush properly as it fills. The session cache will continue to grow in an unbounded manner. A malicious client could deliberately create the scenario for this failure to force a Denial of Service. (https://nvd.nist.gov/vuln/detail/CVE-2024-2511)
  2. A vulnerability in OpenSSL. Applications that use the functions EVP_PKEY_param_check() or EVP_PKEY_public_check() to check a DSA public key or DSA parameters may experience long delays when checking excessively long DSA keys or parameters. In applications that allow untrusted sources to provide the key or parameters that are checked, an attacker may be able to cause a denial of service. These functions are not called by OpenSSL on untrusted DSA keys; only applications that call these functions directly may be vulnerable to this problem. (https://nvd.nist.gov/vuln/detail/CVE-2024-4603)
  3. Use-after-free with SSL_free_buffers. (https://nvd.nist.gov/vuln/detail/CVE-2024-4741)
  4. Calling the OpenSSL API function SSL_select_next_proto with an empty supported client protocols buffer may cause a crash or memory contents to be sent to the peer. (https://nvd.nist.gov/vuln/detail/CVE-2024-5535)

OpenSSL Project | OpenSSL | 3.1.5
  OpenSSL version 3.1 will reach its end of life on March 14, 2025.

Python Software Foundation | Python | 3.8.19
  Python version 3.8.19 will reach its end of life on October 31, 2024.

Python Software Foundation | Python | 3.10.14
  Python version 3.10.14 will reach its end of life on October 31, 2026.

Websockets | Node.js Package: ws | 8.12.0
  A request with a number of headers exceeding the server.maxHeadersCount threshold could be used to crash a ws server. (https://nvd.nist.gov/vuln/detail/CVE-2024-37890)

Components Notes of AI Inference Server GPU accelerated:

Vendor | Open-Source Component name | Version | Risk description

NVIDIA | CUDA Toolkit | 11.8
  1. NVIDIA CUDA toolkit for Linux and Windows contains a vulnerability in the nvdisasm binary file, where an attacker may cause a NULL pointer dereference by providing a user with a malformed ELF file. A successful exploit of this vulnerability may lead to a partial denial of service. (https://nvd.nist.gov/vuln/detail/CVE-2023-25523)
  2. NVIDIA CUDA Toolkit SDK for Linux and Windows contains a NULL pointer dereference in cuobjdump, where a local user running the tool against a malformed binary may cause a limited denial of service. (https://nvd.nist.gov/vuln/detail/CVE-2023-25510)
  3. NVIDIA CUDA toolkit for Linux and Windows contains a vulnerability in cuobjdump, where an attacker may cause an out-of-bounds memory read by running cuobjdump on a malformed input file. A successful exploit of this vulnerability may lead to limited denial of service, code execution, and limited information disclosure. (https://nvd.nist.gov/vuln/detail/CVE-2023-25512)
  4. NVIDIA CUDA toolkit for Linux and Windows contains a vulnerability in cuobjdump, where an attacker may cause an out-of-bounds read by tricking a user into running cuobjdump on a malformed input file. A successful exploit of this vulnerability may lead to limited denial of service, code execution, and limited information disclosure. (https://nvd.nist.gov/vuln/detail/CVE-2023-25513)
  5. NVIDIA CUDA toolkit for Linux and Windows contains a vulnerability in cuobjdump, where an attacker may cause an out-of-bounds read by tricking a user into running cuobjdump on a malformed input file. A successful exploit of this vulnerability may lead to limited denial of service, code execution, and limited information disclosure. (https://nvd.nist.gov/vuln/detail/CVE-2023-25514)
  6. NVIDIA CUDA Toolkit for Linux and Windows contains a vulnerability in cuobjdump, where a division-by-zero error may enable a user to cause a crash, which may lead to a limited denial of service. (https://nvd.nist.gov/vuln/detail/CVE-2023-25511)
  7. NVIDIA CUDA Toolkit SDK contains a vulnerability in cuobjdump, where a local user running the tool against a malicious binary may cause an out-of-bounds read, which may result in a limited denial of service and limited information disclosure. (https://nvd.nist.gov/vuln/detail/CVE-2023-0193)
  8. NVIDIA CUDA Toolkit SDK contains a bug in cuobjdump, where a local user running the tool against an ill-formed binary may cause a null-pointer dereference, which may result in a limited denial of service. (https://nvd.nist.gov/vuln/detail/CVE-2023-0196)
  9. NVIDIA CUDA toolkit for all platforms contains a vulnerability in cuobjdump and nvdisasm where an attacker may cause a crash by tricking a user into reading a malformed ELF file. A successful exploit of this vulnerability may lead to a partial denial of service. (https://nvd.nist.gov/vuln/detail/CVE-2024-0072)
  10. NVIDIA CUDA toolkit for all platforms contains a vulnerability in cuobjdump and nvdisasm where an attacker may cause a crash by tricking a user into reading a malformed ELF file. A successful exploit of this vulnerability may lead to a partial denial of service. (https://nvd.nist.gov/vuln/detail/CVE-2024-0076)
  11. NVIDIA nvJPEG2000 Library for Windows and Linux contains a vulnerability where improper input validation might enable an attacker to use a specially crafted input file. A successful exploit of this vulnerability might lead to a partial denial of service. (https://nvd.nist.gov/vuln/detail/CVE-2023-31028)
  12. NVIDIA nvTIFF Library for Windows and Linux contains a vulnerability where improper input validation might enable an attacker to use a specially crafted input file. A successful exploit of this vulnerability might lead to a partial denial of service. (https://nvd.nist.gov/vuln/detail/CVE-2024-0080)