Faculty Publications

Permanent URI for this community: https://idr.nitk.ac.in/handle/123456789/18736

Publications by NITK Faculty

Search Results

Now showing 1 - 4 of 4
  • Item
    Automating the Selection of Container Orchestrators for Service Deployment
    (Institute of Electrical and Electronics Engineers Inc., 2022) Chaurasia, P.; Nath, S.B.; Addya, S.K.; Ghosh, S.K.
    With the ubiquitous adoption of cloud computing, services are commonly deployed as virtual machines (VMs) on cloud servers. However, VM-based deployment often consumes substantial resources. To minimize the resource consumption of service deployment, container-based lightweight virtualization is used. Managing containers for deployment is a challenging problem, as container managers must consume few resources while still catering to the needs of clients. To choose the right container manager, we propose an architecture based on application and user needs. The proposed architecture includes a machine-learning-based decision engine to solve the problem. We consider Docker containers for experimentation. The experimental results show that the proposed system can select the appropriate container manager between a Docker Compose-based manager and Kubernetes. © 2022 IEEE.
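The abstract above describes a machine-learning decision engine that picks an orchestrator from application and user needs. A minimal sketch of that idea, assuming a 1-nearest-neighbour classifier over invented features (replica count, node count, autoscaling flag) and invented training labels, none of which come from the paper:

```python
import math

# Hypothetical labelled examples: (replicas, nodes, needs_autoscaling)
# mapped to an orchestrator choice. Invented for illustration.
TRAINING = [
    ((2, 1, 0), "docker-compose"),
    ((3, 1, 0), "docker-compose"),
    ((10, 4, 1), "kubernetes"),
    ((25, 8, 1), "kubernetes"),
]

def select_orchestrator(features):
    """Return the label of the closest training example (1-NN)."""
    _, label = min(TRAINING, key=lambda t: math.dist(t[0], features))
    return label

print(select_orchestrator((2, 1, 0)))   # small single-node service
print(select_orchestrator((20, 6, 1)))  # large multi-node service
```

Any classifier would fit here; the point is only that the deployment decision is learned from labelled examples rather than hard-coded rules.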
  • Item
    Multi Criteria Based Container Management in a Geo-Distributed Cluster
    (Institute of Electrical and Electronics Engineers Inc., 2024) Kumar, M.R.; Annappa, B.; Vishnu Teja, M.
    According to Gartner, 95% of workloads will shift to containers by 2025 owing to their lightweight nature. Docker is a commonly used container technology for packaging applications; the container orchestration system Kubernetes (K8s) manages resources seamlessly across Cloud, Fog, and Edge environments through containers. However, nodes in a cluster risk exceeding their capacity thresholds, leading to failures and potential application loss that degrade Quality of Service (QoS). In this regard, a Multi-Criteria Decision Making (MCDM) strategy for ranking the nodes in the cluster is proposed to drive migration decisions in a geo-distributed cluster for both stateful and stateless application servers using K8s. The proposed strategy achieves an average service restore time of 15.94 sec for the Nginx server and 48.99 sec for the Zookeeper server. A proactive deep learning model, Bi-LSTM, is proposed for predicting the cluster's resource utilization, achieving an MAE of 0.01928 for CPU utilization and 0.0206 for memory utilization. © 2024 IEEE.
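The core of an MCDM node-ranking step can be sketched with a simple weighted-sum score; the criteria, weights, and utilisation values below are invented for illustration and are not the method or data from the paper (which the abstract does not detail):

```python
# Illustrative multi-criteria ranking of cluster nodes: each node is
# scored by a weighted sum of normalised utilisation criteria, and
# the most-loaded node ranks first as a migration candidate.
NODES = {
    "node-a": {"cpu": 0.82, "mem": 0.75, "net": 0.40},
    "node-b": {"cpu": 0.35, "mem": 0.50, "net": 0.20},
    "node-c": {"cpu": 0.91, "mem": 0.88, "net": 0.70},
}
WEIGHTS = {"cpu": 0.5, "mem": 0.3, "net": 0.2}

def rank_nodes(nodes, weights):
    """Return node names ordered by descending weighted load."""
    scores = {
        name: sum(weights[c] * value for c, value in criteria.items())
        for name, criteria in nodes.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

print(rank_nodes(NODES, WEIGHTS))  # most loaded node first
```

A real MCDM method (e.g. TOPSIS or VIKOR) replaces the plain weighted sum with distance-to-ideal scoring, but the input shape — a node-by-criteria matrix plus weights — is the same.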
  • Item
    A Comprehensive Review on Scaling Machine Learning Workflows Using Cloud Technologies and DevOps
    (Institute of Electrical and Electronics Engineers Inc., 2025) Ramesh, G.; Vaikunta Pai, T.; Birǎu, R.; Poojary, K.K.; Abhay; Shingad, A.R.; Sowjanya, N.; Popescu, V.; Mitroi, A.T.; Nioata, R.M.; Kiran Raj, K.M.
    Scaling Machine Learning (ML) workflows in cloud environments presents critical challenges in ensuring reproducibility, low-latency inference, infrastructure reliability, and regulatory compliance. This review addresses the lack of a comprehensive synthesis of how integrated DevOps practices and cloud-native technologies enable scalable, production-grade ML systems. We analyze the convergence of MLOps with tools such as Kubernetes, Jenkins, and Terraform, detailing their role in automating CI/CD pipelines, infrastructure provisioning, and model lifecycle management. The review highlights strategies for optimizing resource utilization, minimizing inference latency, and managing data versioning across hybrid and multi-cloud architectures (AWS, Azure, GCP). We also examine serverless computing, container orchestration, and monitoring practices to enhance scalability and governance. By categorizing challenges chronologically and evaluating emerging practices such as federated learning and security-by-design, this work bridges a key gap in existing literature. It offers a unified perspective on building reliable, reproducible, and compliant ML workflows, thereby advancing the state of scalable AI system engineering. © 2025 IEEE.
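One concrete instance of the "model lifecycle management" the review discusses is a promotion gate in a CI/CD pipeline: a candidate model is deployed only if it clears quality and latency checks. This sketch is a hypothetical illustration; the metric names and thresholds are invented, not drawn from the review:

```python
# Hypothetical MLOps promotion gate: a candidate model is promoted
# only if it matches production accuracy and does not regress p95
# latency by more than 10%. All values are illustrative.
PRODUCTION = {"accuracy": 0.91, "p95_latency_ms": 120}

def should_promote(candidate, production=PRODUCTION,
                   max_latency_regression=1.10):
    """Gate a candidate model against current production metrics."""
    return (
        candidate["accuracy"] >= production["accuracy"]
        and candidate["p95_latency_ms"]
        <= production["p95_latency_ms"] * max_latency_regression
    )

print(should_promote({"accuracy": 0.93, "p95_latency_ms": 115}))  # True
print(should_promote({"accuracy": 0.93, "p95_latency_ms": 200}))  # False
```

In practice such a gate would run as a pipeline stage (e.g. in Jenkins) after evaluation, with the metrics read from an experiment-tracking store rather than hard-coded.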
  • Item
    Pod Scheduling and Proactive Resource Management in an Edge Cluster using MCDM and Federated Learning
    (Springer Science and Business Media B.V., 2025) Kumar, N.K.; B, A.; J, H.; Srinivasan, S.; Sand, S.S.
    Edge computing, which locates computational resources closer to data sources, has become crucial for meeting the demands of applications that need high bandwidth and low latency. To cater to edge computing scenarios, KubeEdge, an extension of Kubernetes (K8s), expands its capabilities to meet edge-specific requirements such as limited resources, irregular connectivity, and heterogeneous environments. Edge trace data cannot be shared between cloud providers because of privacy concerns, which makes generic distributed training ineffective. Moreover, despite edge computing's potential advantages, the built-in scheduling algorithms have several drawbacks. A significant problem is the lack of efficient resource management and allocation mechanisms at the edge, which causes edge nodes to be underutilized or overloaded; this degrades Quality of Service (QoS) and wastes resources, leading to Service Level Agreement (SLA) violations. In this regard, a VIKOR- and ELECTRE III-based pod scheduling strategy is proposed in this paper and evaluated using Wikipedia and NASA server workloads. The experimental results show a 50% reduction in standard deviation for ELECTRE III and a 40% reduction for VIKOR against the default Kubernetes scheduler. Average response times of 30.6593 ms and 31.8803 ms are achieved for ELECTRE III and VIKOR, respectively, on the Wikipedia dataset. A proactive resource management system is proposed for KubeEdge containerized services that incorporates a federated learning framework to predict future workloads using Bidirectional Long Short-Term Memory (Bi-LSTM) and Gated Recurrent Unit (GRU) models. The experimental comparison shows that federated learning achieves 99.65% and 98.64% reductions in MSE for CPU utilization, and 89.72% and 76.57% reductions in MSE for memory utilization, with respect to the GRU and Bi-LSTM models respectively, in contrast to centralized learning. The effectiveness of the proposed approach is evaluated through statistical techniques and found to be significant. © The Author(s), under exclusive licence to Springer Nature B.V. 2025.
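The VIKOR method named in the abstract above ranks alternatives by their compromise distance to an ideal solution. A minimal, self-contained sketch of the standard VIKOR ranking is given below; the candidate-node matrix, the benefit criteria (free CPU, free memory, bandwidth), and the weights are invented for illustration and are not the paper's evaluation data:

```python
# Minimal VIKOR ranking over benefit criteria (higher is better).
def vikor(matrix, weights, v=0.5):
    """Rank alternatives (rows of `matrix`) best-first by VIKOR Q.

    S_i = weighted sum of normalised gaps to the per-criterion best,
    R_i = worst single weighted gap, Q_i = compromise of the two.
    Lower Q means a better alternative."""
    n_crit = len(weights)
    best = [max(row[j] for row in matrix) for j in range(n_crit)]
    worst = [min(row[j] for row in matrix) for j in range(n_crit)]

    def gap(i, j):
        span = best[j] - worst[j]
        return 0.0 if span == 0 else (best[j] - matrix[i][j]) / span

    S = [sum(weights[j] * gap(i, j) for j in range(n_crit))
         for i in range(len(matrix))]
    R = [max(weights[j] * gap(i, j) for j in range(n_crit))
         for i in range(len(matrix))]
    s_best, s_worst = min(S), max(S)
    r_best, r_worst = min(R), max(R)
    Q = [
        v * (S[i] - s_best) / (s_worst - s_best or 1)
        + (1 - v) * (R[i] - r_best) / (r_worst - r_best or 1)
        for i in range(len(matrix))
    ]
    return sorted(range(len(matrix)), key=lambda i: Q[i])

# Three candidate nodes scored on (free CPU, free memory, bandwidth).
nodes = [[0.6, 0.7, 0.5],
         [0.9, 0.8, 0.7],
         [0.3, 0.4, 0.2]]
print(vikor(nodes, weights=[0.5, 0.3, 0.2]))  # indices, best first
```

A scheduler would then bind the pending pod to the top-ranked node; ELECTRE III follows the same matrix-plus-weights input shape but ranks via pairwise outranking relations instead of a compromise score.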