Leveraging Small Language Model as a Sidecar for Linux App Service

July 06, 2025 by Anuraj

dotnet azure ai

In this blog post, we'll learn how to use a small language model (SLM) as a sidecar for Linux App Service. A small language model is a generative AI technology that functions like a large language model (LLM), but with a much smaller footprint. SLMs are increasingly fine-tuned on domain-specific datasets, enabling them to excel in targeted applications like specialized chatbots, document summarization, and industry-specific information retrieval. Their compact size not only boosts efficiency in these focused tasks but also makes them ideal for deployment on devices with limited computational resources, such as mobile phones, IoT devices, or edge computing environments, bringing the benefits of generative AI to more accessible and resource-constrained settings.

In this example, I am using a very small model, smollm2:135m, but we can use other models like the Phi family as well. For interacting with the language model, I am using the OllamaSharp and Microsoft.Extensions.AI NuGet packages.

First I will be creating a Docker image for the SLM, using a Dockerfile and a small shell script. Here is the Dockerfile.

# Alpine-based Ollama image, which keeps the footprint small
FROM alpine/ollama:latest

# Listen on all interfaces so the sidecar is reachable from the main container
ENV OLLAMA_HOST=0.0.0.0
ENV OLLAMA_PORT=11434
# Keep the model loaded in memory for up to 24 hours between requests
ENV OLLAMA_KEEP_ALIVE=24h

EXPOSE 11434

COPY start.sh /start.sh
RUN chmod +x /start.sh

ENTRYPOINT ["/start.sh"]

In the Dockerfile, I am using an Ollama image based on Alpine, which reduces the size - the resulting image is only around 300 MB. Here is the start.sh file, which starts the Ollama server and pulls the model.

#!/bin/sh
# Start the Ollama server in the background
ollama serve &
# Give the server a few seconds to come up before pulling the model
sleep 10
ollama pull smollm2:135m
# Keep the container running as long as the server process lives
wait

Next we need to push the image to a container registry. For demo purposes I am using Docker Hub. We can use the docker build command to create the image and the docker push command to publish it to Docker Hub, as shown below.
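Here is a rough sketch of those commands, assuming the hypothetical repository name anuraj/slm-sidecar - substitute your own Docker Hub user name and repository.

# Build the sidecar image from the Dockerfile above
docker build -t anuraj/slm-sidecar:latest .

# Optionally smoke-test it locally - /api/tags should list smollm2 once the pull completes
docker run -d -p 11434:11434 anuraj/slm-sidecar:latest
curl http://localhost:11434/api/tags

# Log in and publish the image to Docker Hub
docker login
docker push anuraj/slm-sidecar:latest

Next we can create the .NET application which interacts with the SLM. In the app, I am using the two NuGet packages mentioned earlier. Here is the project file.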

<ItemGroup>
  <PackageReference Include="Microsoft.Extensions.AI" Version="9.6.0" />
  <PackageReference Include="OllamaSharp" Version="5.2.3" />
</ItemGroup>
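If you are creating the project from scratch, the packages can be added with the dotnet CLI.

dotnet add package Microsoft.Extensions.AI --version 9.6.0
dotnet add package OllamaSharp --version 5.2.3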

Next in the Program.cs file, I will be adding the following code.

using OllamaSharp;

// Point the chat client at the Ollama sidecar and register it with DI
var innerChatClient = new OllamaApiClient("http://localhost:11434/", "smollm2:135m");
builder.Services.AddChatClient(innerChatClient);
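The AddChatClient extension method registers the client as the IChatClient implementation in the dependency injection container, which is what lets the controller below receive it via [FromServices].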

And in the controller we can use the IChatClient interface, like this.

[HttpPost]
public async Task<IActionResult> SendMessage([FromServices] IChatClient chatClient,
    [FromBody] ChatMessage request)
{
    try
    {
        if (request == null || string.IsNullOrWhiteSpace(request.Message))
        {
            return BadRequest(new { role = "system", message = "Message cannot be empty." });
        }

        var response = await chatClient.GetResponseAsync(request.Message);
        return Json(new { role = "system", message = response.ToString() });
    }
    catch (Exception ex)
    {
        _logger.LogError(ex, "Error processing chat message: {Message}", request?.Message);
        return Json(new
        {
            role = "system",
            message = "Sorry, I encountered an error processing your message. Please try again."
        });
    }
}
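To try it out, the endpoint can be exercised with a request like the following - the /chat/sendmessage route and the port are assumptions here; use whatever route and port your controller actually maps to.

curl -X POST http://localhost:8080/chat/sendmessage \
    -H "Content-Type: application/json" \
    -d '{"role":"user","message":"Hello! What can you do?"}'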

And the ChatMessage class looks like this.

public class ChatMessage
{
    public string Role { get; set; } = string.Empty;
    public string Message { get; set; } = string.Empty;
}
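Note that this ChatMessage class is the application's own request model. It shares its name with the Microsoft.Extensions.AI.ChatMessage type, so if both namespaces are imported in the same file, the reference will need to be disambiguated, for example with a using alias.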

Now I need to build a Docker image for the application as well. I am using the dotnet publish command to create the Docker image and publish it to Docker Hub, like this.
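Here is a minimal sketch of that publish command, assuming the hypothetical repository name anuraj/slm-chat-app - the .NET SDK's built-in container support builds and pushes the image without needing a Dockerfile. Run docker login first so the push can authenticate.

dotnet publish --os linux --arch x64 /t:PublishContainer \
    -p:ContainerRegistry=docker.io \
    -p:ContainerRepository=anuraj/slm-chat-app \
    -p:ContainerImageTag=latest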

Next I will be creating an Azure App Service. Here is a screenshot of the first screen.

Create Azure App Service

We need to select Container as the Publish option, then choose Linux as the Operating System. Next, on the Container screen, select the Sidecar support option. For the Image source, select Other container registries. Under the Docker Hub options, select the Access Type - it can be Public or Private; for this demo I am using Public. For the Registry server URL, we can keep the default https://index.docker.io. For Image and tag, use the web app Docker image and tag. As the app targets .NET 9, whose container images listen on port 8080 by default, set the Port to 8080.

Container Options

We can keep the other settings as default, and click on the Review + Create button to create the App Service and deploy the application. Once the application is running, select the Deployment Center option of the Azure App Service. Then click on Add +, and select Custom Container.

Add Container

In this screen, similar to the web app container, we need to configure various settings - select the Image source as Other container registries, and the Image type as Public. For the Registry server URL, set the URL as https://index.docker.io. Set the Image and tag, and the Port - for Ollama it is 11434. Now we can run the application and try interacting with it.
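Sidecar containers on App Service can talk to the main container over localhost, which is why the OllamaApiClient in Program.cs points at http://localhost:11434/.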

Chat Application

In this blog post, we demonstrated how to deploy smollm2, a Small Language Model (SLM), as a sidecar on Linux App Service to seamlessly infuse AI capabilities into your web applications. We highlighted the advantages of SLMs—including their lightweight footprint, enhanced accessibility, and improved security—making them ideal for domain-specific customization and environmentally conscious development. To bring these concepts to life, we also walked through the setup of a simple .NET chat application that interacts with the SLM, offering a hands-on example of how to integrate SLMs into real-world projects.

Happy Programming
