
Tech Talk - How does the Multi2ConvAI platform work?

Insights about the technologies behind the development, deployment and operation of our Multi2ConvAI platform

This blog post introduces the technologies behind our Multi2ConvAI platform. The platform aims to bring science and research in Conversational AI closer together. To this end, it enables the exchange of domain-specific datasets and thereby simplifies the development and operation of Conversational AI models.

Backend Schema

The illustration provides a behind-the-scenes look at the Multi2ConvAI platform: users interact with the platform via the web application or the DVC API. Their input is passed through the platform's components and triggers the intended actions. Developers can also make changes via GitLab and automated CI/CD pipelines. In the following, we present the individual components of the platform.

Google Cloud Platform (GCP) & Google Kubernetes Engine (GKE)

Our project uses several Google Cloud Platform (GCP) products as its cloud ecosystem. For example, Kubernetes in the form of GKE orchestrates the containerized applications. In addition, we use Google's Cloud Storage (GCS) and Container Registry (GCR).

Web Application

End users of our platform usually interact with it via our Web Application. This is where datasets and models can be managed and used. Like the other components in the cluster, the frontend is developed as a containerized application. The Web Application was built using JavaScript, Vue.js and inovex elements. Further insights can be found in the related blog post.

Inference Container

To provide models for inference, each model is deployed in its own container. These containers share a single container image that holds the entire executable code in the form of our multi2convai Python package. Starting a container requires a config file that specifies which model to start and where the required files (e.g. model.bin, labels.json) can be found. With this information, the model is loaded and prepared for inference. Each so-called inference container then has its own URL, which is unique in the cluster and can be used to perform inference tasks or query metadata.
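A minimal sketch of how such a config file might be read when a container starts. The JSON layout, the field names (model_name, model_path, labels_path) and the InferenceConfig class are assumptions made for illustration; the actual format used by the multi2convai package is not described in this post.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass(frozen=True)
class InferenceConfig:
    """Hypothetical config: which model to start and where its files live."""

    model_name: str
    model_path: Path   # e.g. the location of model.bin
    labels_path: Path  # e.g. the location of labels.json


def load_config(config_file: Path) -> InferenceConfig:
    """Parse the JSON config file and fail early if a required key is missing."""
    raw = json.loads(config_file.read_text(encoding="utf-8"))
    missing = {"model_name", "model_path", "labels_path"} - set(raw)
    if missing:
        raise ValueError(f"config is missing keys: {sorted(missing)}")
    return InferenceConfig(
        model_name=raw["model_name"],
        model_path=Path(raw["model_path"]),
        labels_path=Path(raw["labels_path"]),
    )


def missing_files(config: InferenceConfig) -> List[Path]:
    """Return the referenced files that cannot be found on disk."""
    return [p for p in (config.model_path, config.labels_path) if not p.is_file()]
```

Checking for missing files before loading the model lets a misconfigured container fail with a clear message instead of failing somewhere inside the model loading code.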

Backend Applications

To avoid having to start, stop and monitor the inference containers manually, we have written applications that perform these tasks for us. They create and send the required requests to the Kubernetes API and, at the same time, provide a simplified REST interface through which an end user can deploy a new model, monitor or stop running inference containers, and request a list of all models including their URLs.
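To illustrate what "creating the required requests" can look like, here is a sketch that builds a Kubernetes Deployment manifest for one inference container as a plain Python dictionary. The manifest structure follows the standard apps/v1 Deployment schema; the function name, the label keys, the "--config" argument and port 8000 are assumptions for this example, not details of our backend.

```python
import re

# Kubernetes object names must be lowercase alphanumerics or '-' (DNS-1123 label).
_DNS_LABEL = re.compile(r"^[a-z0-9]([-a-z0-9]*[a-z0-9])?$")


def build_inference_deployment(model_name: str, image: str, config_path: str) -> dict:
    """Build an apps/v1 Deployment manifest for a single inference container."""
    if not _DNS_LABEL.match(model_name):
        raise ValueError(f"not a valid Kubernetes name: {model_name!r}")
    labels = {"app": "inference", "model": model_name}  # hypothetical label scheme
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": f"inference-{model_name}", "labels": labels},
        "spec": {
            "replicas": 1,
            # The selector must match the pod template labels below.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": "inference",
                            "image": image,
                            # Assumed way of passing the config file to the container.
                            "args": ["--config", config_path],
                            "ports": [{"containerPort": 8000}],
                        }
                    ]
                },
            },
        },
    }
```

A dictionary like this could then be submitted to the cluster, for instance with the official Kubernetes Python client via AppsV1Api.create_namespaced_deployment(namespace=..., body=manifest).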

Data Version Control (DVC)

The Python library DVC is used for versioning and storing datasets and models. To give the inference containers access to the necessary model files, our Kubernetes cluster has a Persistent Volume (PV) that the containers can mount to read the data. All models and datasets therefore live on the same volume in the cluster, which is mounted by every container that needs the data. As a result, the required data is available very quickly, and starting a new inference container involves no download delay.
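The shared-volume idea can be sketched as a small helper that attaches the volume to a pod spec in read-only mode. In Kubernetes, pods reference a Persistent Volume through a PersistentVolumeClaim; the claim name, the volume name "shared-data" and the mount path used here are assumptions for illustration. For several pods to mount the same volume, its access mode has to permit this (for example ReadOnlyMany).

```python
import copy


def mount_shared_volume(pod_spec: dict, claim_name: str, mount_path: str) -> dict:
    """Return a copy of pod_spec with the shared volume mounted read-only
    into every container. The input spec is left unchanged."""
    spec = copy.deepcopy(pod_spec)
    spec.setdefault("volumes", []).append(
        {
            "name": "shared-data",  # hypothetical volume name
            "persistentVolumeClaim": {"claimName": claim_name, "readOnly": True},
        }
    )
    for container in spec["containers"]:
        container.setdefault("volumeMounts", []).append(
            {"name": "shared-data", "mountPath": mount_path, "readOnly": True}
        )
    return spec
```

Mounting read-only keeps inference containers from modifying the versioned model files by accident.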

Routing / Reverse Proxy (nginx)

Each inference container and every other deployment in the cluster gets its own cluster-internal URL through a Kubernetes Service object. Traffic from the Internet enters through a Kubernetes load balancer, which forwards it to an nginx web server. nginx acts as the reverse proxy for our project and passes every request arriving at the cluster on to the respective deployment, using the internal cluster Services. The connection from the Internet to the nginx instance is secured with a TLS certificate, so data is exchanged over HTTPS.
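As a rough sketch of this routing, the following Python helpers derive the cluster-internal URL of a Service and render an nginx location block that proxies a public path prefix to it. The service names, the namespace, the port and the path layout are assumptions for this example; "svc.cluster.local" is the default Kubernetes cluster domain and may differ in a given cluster.

```python
def internal_service_url(service: str, namespace: str = "default", port: int = 8000) -> str:
    """Cluster-internal URL of a Kubernetes Service (default cluster domain assumed)."""
    return f"http://{service}.{namespace}.svc.cluster.local:{port}"


def nginx_location(path_prefix: str, upstream_url: str) -> str:
    """Render an nginx location block that forwards path_prefix to upstream_url.

    The trailing slash after the upstream makes nginx replace the matched
    prefix with '/', so the backend sees paths relative to its own root.
    """
    prefix = "/" + path_prefix.strip("/") + "/"
    return (
        f"location {prefix} {{\n"
        f"    proxy_pass {upstream_url}/;\n"
        f"    proxy_set_header Host $host;\n"
        f"}}\n"
    )
```

For example, nginx_location("models/demo", internal_service_url("inference-demo")) yields a block that forwards requests under /models/demo/ to the hypothetical inference-demo Service.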

GitLab / CI/CD

GitLab is used for version control of all source code. We also use GitLab's DevOps features: with the integrated pipelines, code changes lead to an update of the applications deployed on the cloud platform. The infrastructure itself is managed with Terraform and can be adapted through code changes, following the infrastructure-as-code principle.
