The landscape of AI-powered web applications is evolving rapidly, and at the forefront of this revolution stands NLWeb — Microsoft’s groundbreaking open-source protocol that transforms traditional websites into intelligent, AI-driven knowledge hubs. When combined with the Kubernetes container orchestration platform and GitOps methodologies, NLWeb creates a production-ready ecosystem that’s both scalable and maintainable. This comprehensive guide explores how to deploy NLWeb using modern DevOps practices, leveraging the power of FluxCD for continuous deployment and Azure’s robust Kubernetes infrastructure.

- What Makes NLWeb Revolutionary in the AI Web Space?
- Understanding the GitOps Advantage for NLWeb Deployments
- Technical Architecture: NLWeb on Kubernetes
- FluxCD Integration: Continuous Deployment Made Simple
- Azure Integration: Cloud-Native AI Infrastructure
- Production-Ready Features and Best Practices
- Deployment Comparison: NLWeb vs Traditional Approaches
- Advanced Configuration Examples
- Security Considerations and Best Practices
- Conclusion
- Frequently Asked Questions
What Makes NLWeb Revolutionary in the AI Web Space?
NLWeb represents a paradigm shift in how we think about web applications. Unlike traditional static websites or even dynamic web applications, NLWeb enables AI-powered websites that can understand, process, and respond to user queries with unprecedented intelligence. The platform seamlessly integrates with vector databases, multiple LLM providers, and enterprise data sources to create truly interactive web experiences.
The protocol’s architecture is designed with modern cloud-native principles and CNCF best practices in mind. It supports multiple embedding providers including OpenAI, Azure OpenAI, Gemini, and Snowflake, while offering flexible LLM integration with providers ranging from Anthropic’s Claude AI assistant to Hugging Face models. This multi-provider approach ensures resilience and allows organizations to optimize costs while maintaining performance.
Key Points:
- Intelligent Interactions: Enables natural language understanding and contextual responses
- Multi-Provider Support: Integrates with various AI providers for flexibility and redundancy
- Not Yet Enterprise-Ready: Although designed with production deployments and ease of use in mind, the project is still at an early stage. We contribute bug fixes and enhancements to help it mature.
Understanding the GitOps Advantage for NLWeb Deployments
GitOps, a declarative infrastructure management methodology, has emerged as the gold standard for Kubernetes deployments, and NLWeb’s architecture aligns perfectly with this approach. By treating Git repositories as the single source of truth for infrastructure and application configurations, teams can achieve unprecedented levels of automation, auditability, and reliability.
The iunera helm charts repository provides production-ready Helm charts specifically designed for NLWeb deployments. These charts encapsulate years of operational experience and best practices, making it straightforward to deploy NLWeb in any Kubernetes environment while maintaining consistency across development, staging, and production environments. If you are interested in a general-purpose Helm chart for almost any kind of simple deployment, the Spring Boot chart is worth a look.
FluxCD serves as the GitOps operator, continuously monitoring the Git repository for changes and automatically applying them to the Kubernetes cluster. This approach eliminates configuration drift, reduces manual intervention, and provides a complete audit trail of all changes made to the system.
GitOps Benefits for NLWeb:
- Declarative Infrastructure: Everything defined as code in Git repositories
- Automated Deployments: Changes automatically applied when committed to Git
- Version Control: Complete history of all configuration changes
- Rollback Capability: Easy reversion to previous known-good states
- Consistency: Same deployment process across all environments
Now, let’s explore the technical architecture of NLWeb on Kubernetes to understand how these components work together.
A different use case we’ve implemented is a production-grade Apache Druid deployment using Druid Operators and FluxCD.
Technical Architecture: NLWeb on Kubernetes
Core Components and Configuration
NLWeb’s Kubernetes deployment consists of several key components that work together to deliver AI-powered web experiences:
Application Layer: The core NLWeb application runs as a Python-based service, typically deployed using the iunera/nlweb Docker image. The application serves on port 8000 and includes comprehensive health checks for both liveness and readiness probes.
Configuration Management: NLWeb uses a sophisticated configuration system with multiple YAML files:
- config_webserver.yaml: Handles server settings, CORS policies, SSL configuration, and static file serving
- config_llm.yaml: Manages LLM provider configurations and model selections
- config_embedding.yaml: Controls embedding provider settings and model preferences
- config_llm_performance.yaml: Optimizes performance through caching and response management
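To illustrate how such layered configuration files can be combined at startup, here is a minimal Python sketch of a recursive merge. This is a hypothetical helper for illustration only; NLWeb's actual configuration loader may work differently, and the sample values are made up.

```python
# Hypothetical sketch of layered config merging; not NLWeb's actual loader.
from copy import deepcopy

def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge override into base, returning a new dict."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

# Values that might come from config_webserver.yaml and config_llm.yaml
webserver_cfg = {"server": {"host": "0.0.0.0", "enable_cors": True}, "port": 8000}
llm_cfg = {"server": {"timeout": 30}, "preferred_endpoint": "azure_openai"}

config = deep_merge(webserver_cfg, llm_cfg)
```

With this shape, later files can override or extend earlier ones without clobbering whole sections.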
Security Context: The deployment implements Kubernetes pod security standards and best practices including:
- Non-root user execution (UID 999)
- Read-only root filesystem
- Dropped capabilities
- Security contexts for both pod and container levels
This architecture provides a secure, scalable foundation for deploying NLWeb in production environments.
Helm Chart Structure and Values
The NLWeb Helm chart provides extensive customization options through its values.yaml configuration:
replicaCount: 1
image:
  repository: iunera/nlweb
  pullPolicy: IfNotPresent
service:
  type: ClusterIP
  port: 8000
env:
  - name: PYTHONPATH
    value: "/app"
  - name: PORT
    value: "8000"
  - name: NLWEB_LOGGING_PROFILE
    value: production
The chart supports advanced features including:
- Autoscaling: Horizontal Pod Autoscaler configuration with CPU-based scaling
- Ingress: NGINX ingress controller integration with SSL/TLS termination
- Volumes: Persistent volume claims, ConfigMaps, and EmptyDir volumes
- ConfigMaps: Provide the NLWeb configuration files, such as LLM and vector endpoint settings
- Security: Pod security contexts and network policies
FluxCD Integration: Continuous Deployment Made Simple
FluxCD, a continuous delivery tool for Kubernetes, is a critical component in the GitOps deployment strategy for NLWeb, providing automated continuous delivery capabilities. It connects your Git repository to your Kubernetes cluster, ensuring that any changes to your deployment manifests are automatically applied.
HelmRelease Controller
The GitOps deployment of NLWeb leverages FluxCD’s HelmRelease custom resource to manage the application lifecycle. Here’s how the integration works:
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: nlweb
  namespace: nlweb
spec:
  releaseName: nlweb
  targetNamespace: nlweb
  chart:
    spec:
      chart: nlweb
      version: ">=1.1.0"
      sourceRef:
        kind: HelmRepository
        name: iunera-helm-charts
        namespace: helmrepos
  interval: 1m0s

This configuration ensures that FluxCD continuously monitors the Helm repository for updates and automatically applies them to the cluster. The interval: 1m0s setting means FluxCD checks for changes every minute, providing near real-time deployment capabilities.
Image Automation and Version Management
FluxCD’s image automation capabilities work seamlessly with NLWeb deployments. The system can automatically detect new container image versions and update the deployment manifests accordingly. This is particularly valuable for maintaining up-to-date deployments while ensuring proper testing and validation workflows.
Image Policy Configuration
NLWeb deployments leverage FluxCD’s image automation controllers to automatically update container images when new versions are published. This is configured through special annotations in the HelmRelease manifest:
image:
  repository: iunera/nlweb # {"$imagepolicy": "flux-system:nlweb:name"}
  tag: 1.2.4 # {"$imagepolicy": "flux-system:nlweb:tag"}

These annotations tell FluxCD to automatically update the image repository and tag values based on the image policy defined in the nlweb.imagerepo.yaml file. When a new image version is detected that matches the policy criteria, FluxCD automatically updates the manifest and commits the changes to the Git repository.
Image Repository and Policy Configuration
The image automation is configured through two key resources defined in the nlweb.imagerepo.yaml file:
# ImageRepository defines the Docker image repository to monitor
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageRepository
metadata:
  name: nlweb
  namespace: flux-system
spec:
  image: iunera/nlweb
  interval: 10m
  secretRef:
    name: iunera
---
# ImagePolicy defines which image versions to select
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImagePolicy
metadata:
  name: nlweb
  namespace: flux-system
spec:
  imageRepositoryRef:
    name: nlweb
  policy:
    semver:
      range: ">=1.0.0"

The ImageRepository resource specifies:
- The Docker image to monitor (iunera/nlweb)
- How often to check for new versions (interval: 10m)
- Authentication credentials for the Docker registry (secretRef: name: iunera)
The ImagePolicy resource defines the selection criteria for image versions using semantic versioning, in this case selecting any version greater than or equal to 1.0.0.
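The effect of such a policy can be sketched in plain Python: among the tags published to the registry, pick the highest version that satisfies the range, ignoring non-semver tags such as latest. This is illustrative logic only, not FluxCD's implementation.

```python
# Illustrative re-implementation of a ">=1.0.0" semver policy; not Flux's actual code.
import re

SEMVER = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")

def pick_latest(tags, minimum=(1, 0, 0)):
    """Return the highest semver tag that is >= minimum, or None."""
    candidates = []
    for tag in tags:
        m = SEMVER.match(tag)
        if m:  # non-semver tags such as "latest" are skipped
            version = tuple(int(part) for part in m.groups())
            if version >= minimum:
                candidates.append((version, tag))
    return max(candidates)[1] if candidates else None

print(pick_latest(["latest", "0.9.0", "1.2.3", "1.2.4"]))  # -> 1.2.4
```

Note that mutable tags like latest never match a semver policy, which is exactly why version-tagged images are required for this automation.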
Automation Workflow
The complete automation workflow is managed by the ImageUpdateAutomation resource:
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageUpdateAutomation
metadata:
  name: flux-system
  namespace: flux-system
spec:
  git:
    checkout:
      ref:
        branch: master
    commit:
      author:
        email: fluxcdbot@nodomain.local
        name: fluxcdbot
      messageTemplate: |
        Automated image update

        Automation name: {{ .AutomationObject }}

        Files:
        {{ range $filename, $_ := .Changed.FileChanges -}}
        - {{ $filename }}
        {{ end -}}

        Objects:
        {{ range $resource, $changes := .Changed.Objects -}}
        - {{ $resource.Kind }} {{ $resource.Name }}
          Changes:
        {{- range $_, $change := $changes }}
          - {{ $change.OldValue }} -> {{ $change.NewValue }}
        {{ end -}}
        {{ end -}}
    push:
      branch: master
  interval: 30m0s
  sourceRef:
    kind: GitRepository
    name: flux-system
  update:
    path: ./kubernetes/common
    strategy: Setters

This resource:
- Checks out the Git repository’s master branch
- Configures commit details with a template that includes what was changed
- Pushes changes back to the master branch
- Runs every 30 minutes
- Updates files in the ./kubernetes/common path using the “Setters” strategy (looking for image policy annotations)
With this configuration, the NLWeb deployment automatically stays up-to-date with the latest compatible container images without manual intervention, while maintaining a complete audit trail of all changes through Git history.
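Conceptually, the “Setters” strategy scans manifests for the $imagepolicy marker comments shown earlier and rewrites the value on that line. A simplified, hypothetical Python version of that rewrite might look like this (FluxCD's real implementation is more general and operates on parsed YAML):

```python
# Simplified illustration of a "Setters"-style update; not Flux's real implementation.
import re

# Matches:   tag: <value> # {"$imagepolicy": "<namespace>:<policy>:tag"}
MARKER = re.compile(r'^(\s*tag:\s*)(\S+)(\s*#\s*\{"\$imagepolicy":\s*"([^"]+):tag"\})')

def apply_setters(manifest: str, resolved: dict) -> str:
    """Rewrite 'tag:' lines whose marker comment names a known image policy."""
    out = []
    for line in manifest.splitlines():
        m = MARKER.match(line)
        if m and m.group(4) in resolved:
            line = f"{m.group(1)}{resolved[m.group(4)]}{m.group(3)}"
        out.append(line)
    return "\n".join(out)

manifest = 'image:\n  repository: iunera/nlweb\n  tag: 1.2.3 # {"$imagepolicy": "flux-system:nlweb:tag"}'
updated = apply_setters(manifest, {"flux-system:nlweb": "1.2.4"})
```

The marker comment survives the rewrite, so the same line remains eligible for future automated updates.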
Docker Build and CI/CD Pipeline
The NLWeb Docker image build and deployment process follows a comprehensive CI/CD pipeline that integrates with the FluxCD GitOps workflow:
Dockerfile Structure and Multi-Stage Build
The NLWeb Dockerfile uses a multi-stage Docker build to create an efficient and secure deployment package:
# Stage 1: Build stage
FROM python:3.13-slim AS builder
# Install build dependencies
RUN apt-get update && \
apt-get install -y --no-install-recommends gcc python3-dev && \
pip install --no-cache-dir --upgrade pip && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
WORKDIR /app
# Copy requirements file
COPY code/requirements.txt .
# Install Python packages
RUN pip install --no-cache-dir -r requirements.txt
# Copy requirements file
COPY docker_requirements.txt .
# Install Python packages
RUN pip install --no-cache-dir -r docker_requirements.txt
# Stage 2: Runtime stage
FROM python:3.13-slim
# Apply security updates
RUN apt-get update && \
apt-get install -y --no-install-recommends --only-upgrade \
$(apt-get --just-print upgrade | grep "^Inst" | grep -i securi | awk '{print $2}') && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
WORKDIR /app
# Create a non-root user and set permissions
RUN groupadd -r nlweb && \
useradd -r -g nlweb -d /app -s /bin/bash nlweb && \
chown -R nlweb:nlweb /app
USER nlweb
# Copy application code (--chown ensures the files are owned by the non-root user)
COPY --chown=nlweb:nlweb code/ /app/
COPY --chown=nlweb:nlweb static/ /app/static/
# Copy installed packages from builder stage
COPY --from=builder /usr/local/lib/python3.13/site-packages /usr/local/lib/python3.13/site-packages
COPY --from=builder /usr/local/bin /usr/local/bin
# Expose the port the app runs on
EXPOSE 8000
# Set environment variables
ENV NLWEB_OUTPUT_DIR=/app
ENV PYTHONPATH=/app
ENV PORT=8000
ENV VERSION=1.2.4
# Command to run the application
CMD ["python", "app-file.py"]

Key aspects of the Dockerfile:
- Stage 1 (Builder): Installs all dependencies and build tools
- Stage 2 (Runtime): Creates a minimal runtime environment
- Security Features: Non-root user, security updates, minimal dependencies
- Version Definition: ENV VERSION=1.2.4 defines the version that will be used for tagging
GitHub Actions Workflow
When changes are pushed to the iuneracustomizations branch and the Dockerfile is modified, the GitHub Actions CI/CD automation workflow in .github/workflows/prod-build.yml is triggered:
name: prod-build

on:
  push:
    branches:
      - iuneracustomizations
    paths:
      - Dockerfile

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Log in to Private Registry
        uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3
      - name: Extract Version from Dockerfile
        id: extract_version
        run: |
          # Extract the VERSION from Dockerfile
          VERSION=$(grep "ENV VERSION=" Dockerfile | cut -d= -f2)
          echo "VERSION=${VERSION}" >> $GITHUB_ENV
          echo "Using version from Dockerfile: ${VERSION}"
      - name: Build the Docker image
        run: |
          docker build -t iunera/nlweb:latest -t iunera/nlweb:${{ env.VERSION }} .
          docker push iunera/nlweb:latest
          docker push iunera/nlweb:${{ env.VERSION }}
          echo "Built and pushed Docker image with tags: latest, ${{ env.VERSION }}"
      - name: Inspect
        run: |
          docker image inspect iunera/nlweb:latest
      - name: Create and Push Git Tag
        run: |
          git config --global user.name "GitHub Actions"
          git config --global user.email "actions@github.com"
          git tag -a v${{ env.VERSION }} -m "Release version ${{ env.VERSION }}"
          git push origin v${{ env.VERSION }}

The workflow performs these steps:
- Checkout Repository: Clones the repository to the GitHub Actions runner
- Set up Docker Buildx: Configures Docker with multi-architecture build support
- Log in to Docker Hub: Authenticates with Docker Hub using repository secrets
- Set up QEMU: Enables building for multiple architectures (ARM64, AMD64)
- Extract Version: Parses the Dockerfile to extract the VERSION environment variable
- Build and Push: Builds the Docker image with two tags (latest and the version number) and pushes both to Docker Hub
- Inspect: Displays information about the built image for verification
- Create Git Tag: Creates a Git tag for the version and pushes it to the repository
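The version-extraction step can be reproduced in a few lines of Python, which is handy for testing the parsing logic locally without running the workflow. This sketch mirrors the grep/cut pipeline above against a made-up Dockerfile snippet:

```python
# Local re-implementation of the workflow's version extraction
# (grep "ENV VERSION=" Dockerfile | cut -d= -f2).
def extract_version(dockerfile_text: str):
    """Return the value after the first 'ENV VERSION=' occurrence, or None."""
    for line in dockerfile_text.splitlines():
        if "ENV VERSION=" in line:
            return line.split("=", 1)[1].strip()
    return None

dockerfile = 'FROM python:3.13-slim\nENV VERSION=1.2.4\nCMD ["python", "app-file.py"]'
print(extract_version(dockerfile))  # -> 1.2.4
```

Keeping the version in one well-known place (the Dockerfile) means the CI pipeline, image tags, and Git tags all derive from a single value.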
Complete CI/CD to Deployment Flow
The complete flow from Dockerfile to deployment involves:
- Development: A developer updates the Dockerfile, potentially changing the VERSION
- CI/CD: GitHub Actions builds and pushes the Docker image to Docker Hub
- Automation: FluxCD detects the new image version in Docker Hub
- GitOps: FluxCD updates the Kubernetes manifests with the new image version and commits the changes back to the Git repository
- Deployment: FluxCD applies the changes to the Kubernetes cluster, creating new pods with the updated image
This GitOps approach ensures that:
- The Git repository is the single source of truth
- All changes are tracked and auditable
- Deployments are automated and consistent
- Rollbacks are simple and reliable
Local Development Environment
While the GitHub Actions workflow handles production builds, local development uses Docker Compose:
services:
  nlweb:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: nlweb
    ports:
      - "8000:8000"
    env_file:
      - ./code/.env
    environment:
      - PYTHONPATH=/app
      - PORT=8000
    volumes:
      - ./data:/data
      - ./code/config:/app/config:ro
    healthcheck:
      test: ["CMD-SHELL", "python -c \"import urllib.request; urllib.request.urlopen('http://localhost:8000')\""]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s
    restart: unless-stopped
    user: nlweb

This setup:
- Uses the same Dockerfile as production
- Mounts local directories for data and configuration
- Loads environment variables from a local .env file
- Includes healthchecks for monitoring
- Runs as the non-root nlweb user
The combination of GitHub Actions for CI/CD and FluxCD for GitOps creates a robust and automated pipeline for building and deploying NLWeb, ensuring consistency between development and production environments.
Azure Integration: Cloud-Native AI Infrastructure
NLWeb’s integration with Azure services makes it an ideal choice for organizations already invested in Microsoft’s cloud ecosystem. The platform natively supports:
Azure Cognitive Search: For vector search capabilities, NLWeb integrates with Azure’s vector search service, providing scalable and performant similarity search across large datasets.
Azure OpenAI Service: Direct integration with Azure’s OpenAI offerings, including GPT-4 and embedding models, ensures enterprise-grade AI capabilities with proper governance and compliance.
Azure Container Registry: Seamless integration with ACR for container image management and security scanning.
The configuration for Azure services is handled through environment variables and ConfigMaps, making it easy to manage different environments and maintain security best practices:
env:
  - name: AZURE_VECTOR_SEARCH_ENDPOINT
    value: "https://your-vector-search-db.search.windows.net"
  - name: AZURE_OPENAI_ENDPOINT
    value: "https://your-openai-instance.openai.azure.com/"

Production-Ready Features and Best Practices
Multi-Provider LLM Support
One of NLWeb’s standout features is its support for multiple LLM providers, ensuring vendor independence and cost optimization. The platform supports:
- OpenAI: GPT-4.1 and GPT-4.1-mini models
- Anthropic: Claude-3-7-sonnet-latest and Claude-3-5-haiku-latest
- Azure OpenAI: Enterprise-grade OpenAI models with Azure’s security and compliance
- Google Gemini: Gemini models (such as gemini-1.5-pro and gemini-1.5-flash) for diverse AI capabilities
- Snowflake: Arctic embedding models and Claude integration
- Hugging Face: Open-source models including Qwen2.5 series
This multi-provider approach allows organizations to:
- Optimize costs by using different models for different use cases
- Ensure service availability through provider redundancy
- Experiment with cutting-edge models without vendor lock-in
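The provider redundancy described above can be sketched as a simple fallback loop: try the preferred provider first and move down an ordered list on failure. This is a hypothetical helper for illustration, not NLWeb's actual routing code, and the provider callables are stand-ins:

```python
# Hypothetical provider-fallback loop illustrating multi-provider redundancy.
def complete_with_fallback(prompt, providers):
    """Try each (name, callable) provider in order; return the first success."""
    errors = {}
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real client would catch specific error types
            errors[name] = exc
    raise RuntimeError(f"all providers failed: {list(errors)}")

def flaky_azure(prompt):           # stand-in for an Azure OpenAI client call
    raise TimeoutError("Azure OpenAI unavailable")

def working_anthropic(prompt):     # stand-in for an Anthropic client call
    return f"answer to: {prompt}"

name, answer = complete_with_fallback(
    "hi", [("azure_openai", flaky_azure), ("anthropic", working_anthropic)]
)
```

In a real deployment the ordered list would come from configuration (see the preferred_endpoint and fallback settings in the LLM ConfigMap examples later in this article).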
Performance Optimization and Caching
Iunera’s customizations of NLWeb implement sophisticated caching mechanisms to optimize performance and reduce API costs:
cache:
  enable: true
  max_size: 1000
  ttl: 0  # No expiration
  include_schema: true
  include_provider: true
  include_model: true
The caching system considers multiple factors including schema, provider, and model when generating cache keys, ensuring accurate cache hits while maintaining response quality.
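A cache key built this way can be sketched as a hash over the normalized query plus the schema, provider, and model fields. This is an illustration of the idea only, not the exact key format used by the implementation:

```python
# Illustrative cache-key construction over schema/provider/model, as described above.
import hashlib
import json

def cache_key(query, schema, provider, model):
    """Stable key: identical inputs hash identically; changing any field misses."""
    payload = json.dumps(
        {"q": query.strip().lower(), "schema": schema,
         "provider": provider, "model": model},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()

k1 = cache_key("What is NLWeb?", "schema.org", "azure_openai", "gpt-4o")
k2 = cache_key("what is nlweb?  ", "schema.org", "azure_openai", "gpt-4o")  # same key
k3 = cache_key("What is NLWeb?", "schema.org", "openai", "gpt-4o")          # different key
```

Including the provider and model in the key prevents a response generated by one model from being served as a cache hit for another.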
Enterprise Data Integration
Building on the foundation laid out in the comprehensive guide to exposing enterprise data with Java and Spring for AI indexing, NLWeb provides seamless integration with enterprise data sources. The platform supports:
- JSON-LD and Schema.org: Structured data integration for semantic web capabilities
- Vector Database Integration: Support for various vector databases including Azure Cognitive Search
- Real-time Data Processing: Stream processing capabilities for dynamic content updates
- Enterprise Security: Role-based access control and data governance features
Deployment Comparison: NLWeb vs Traditional Approaches
| Feature | NLWeb GitOps | Azure Web Apps | Traditional Linux Install |
|---|---|---|---|
| Scalability | Auto-scaling with HPA | Limited vertical scaling | Manual scaling required |
| Deployment Speed | Automated via GitOps | Manual deployment | Manual configuration |
| Configuration Management | Git-based versioning | Portal-based settings | File-based configuration |
| Multi-environment Support | Native Kubernetes namespaces | Separate app instances | Separate servers |
| Rollback Capabilities | Git-based rollbacks | Limited rollback options | Manual rollback process |
| Cost Optimization | Resource-based pricing | App Service Plan pricing | Infrastructure costs |
| Monitoring & Observability | Kubernetes-native tools | Azure Monitor integration | Custom monitoring setup |
| Security | Pod security contexts | Azure security features | Manual security hardening |
The iunera helm charts provide a significant advantage in this comparison, offering production-tested configurations that eliminate common deployment pitfalls.
Advanced Configuration Examples
This section provides practical, production-ready configuration examples for deploying NLWeb in various environments. These examples can be used as templates for your own deployments, with customization as needed for your specific requirements.
Note: The following examples are organized by use case to help you find the most relevant configurations for your needs.
Complete Helm Installation Manifest Examples
Basic Development Setup
For development environments, here’s a minimal helm installation manifest:
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: nlweb-dev
  namespace: nlweb-dev
spec:
  releaseName: nlweb-dev
  targetNamespace: nlweb-dev
  chart:
    spec:
      chart: nlweb
      version: ">=1.1.0"
      sourceRef:
        kind: HelmRepository
        name: iunera-helm-charts
        namespace: helmrepos
  interval: 5m0s
  install:
    createNamespace: true
  values:
    replicaCount: 1
    image:
      repository: iunera/nlweb
      tag: "latest"
      pullPolicy: Always
    env:
      - name: NLWEB_LOGGING_PROFILE
        value: development
      - name: OPENAI_API_KEY
        valueFrom:
          secretKeyRef:
            name: nlweb-secrets
            key: openai-api-key
    ingress:
      enabled: true
      annotations:
        kubernetes.io/ingress.class: nginx
      hosts:
        - host: nlweb-dev.local
          paths:
            - path: /
              pathType: ImplementationSpecific
    resources:
      requests:
        cpu: 100m
        memory: 512Mi
      limits:
        cpu: 500m
        memory: 1Gi

Production-Ready Setup with Multi-Provider LLM Support
For production environments with comprehensive AI provider integration:
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: nlweb-prod
  namespace: nlweb
spec:
  releaseName: nlweb
  targetNamespace: nlweb
  chart:
    spec:
      chart: nlweb
      version: ">=1.1.0"
      sourceRef:
        kind: HelmRepository
        name: iunera-helm-charts
        namespace: helmrepos
  interval: 1m0s
  install:
    createNamespace: false
  upgrade:
    remediation:
      retries: 3
  values:
    replicaCount: 3
    image:
      repository: iunera/nlweb
      tag: "1.2.4"
      pullPolicy: IfNotPresent
    env:
      - name: NLWEB_LOGGING_PROFILE
        value: production
      - name: AZURE_VECTOR_SEARCH_ENDPOINT
        value: "https://nlweb-prod.search.windows.net"
      - name: AZURE_VECTOR_SEARCH_API_KEY
        valueFrom:
          secretKeyRef:
            name: nlweb-azure-secrets
            key: vector-search-key
      - name: OPENAI_API_KEY
        valueFrom:
          secretKeyRef:
            name: nlweb-openai-secrets
            key: api-key
      - name: ANTHROPIC_API_KEY
        valueFrom:
          secretKeyRef:
            name: nlweb-anthropic-secrets
            key: api-key
      - name: AZURE_OPENAI_API_KEY
        valueFrom:
          secretKeyRef:
            name: nlweb-azure-openai-secrets
            key: api-key
    ingress:
      enabled: true
      annotations:
        kubernetes.io/ingress.class: nginx
        kubernetes.io/tls-acme: "true"
        cert-manager.io/cluster-issuer: letsencrypt-prod
        nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
        nginx.ingress.kubernetes.io/enable-modsecurity: "true"
        nginx.ingress.kubernetes.io/enable-owasp-core-rules: "true"
        nginx.ingress.kubernetes.io/rate-limit: "100"
        nginx.ingress.kubernetes.io/rate-limit-window: "1m"
      hosts:
        - host: nlweb.example.com
          paths:
            - path: /
              pathType: ImplementationSpecific
      tls:
        - secretName: nlweb-tls
          hosts:
            - nlweb.example.com
    resources:
      requests:
        cpu: 200m
        memory: 1Gi
      limits:
        cpu: 1000m
        memory: 2Gi
    autoscaling:
      enabled: true
      minReplicas: 3
      maxReplicas: 10
      targetCPUUtilizationPercentage: 70
      targetMemoryUtilizationPercentage: 80

Comprehensive ConfigMap Customization Examples
Web Server Configuration for Different Environments
Development Environment ConfigMap:
volumes:
  configMaps:
    - name: nlweb-dev-config
      mountPath: /app/config
      data:
        config_webserver.yaml: |-
          port: 8000
          static_directory: ../../
          mode: development
          server:
            host: 0.0.0.0
            enable_cors: true
            cors_trusted_origins: "*"  # Allow all origins in dev
            max_connections: 50
            timeout: 60
          logging:
            level: debug
            file: ./logs/webserver.log
            console: true
          static:
            enable_cache: false  # Disable caching in dev
            gzip_enabled: false

Production Environment ConfigMap:
volumes:
  configMaps:
    - name: nlweb-prod-config
      mountPath: /app/config
      data:
        config_webserver.yaml: |-
          port: 8000
          static_directory: ../../
          mode: production
          server:
            host: 0.0.0.0
            enable_cors: true
            cors_trusted_origins:
              - https://nlweb.example.com
              - https://api.example.com
              - https://admin.example.com
            max_connections: 200
            timeout: 30
            ssl:
              enabled: true
              cert_file_env: SSL_CERT_FILE
              key_file_env: SSL_KEY_FILE
          logging:
            level: info
            file: ./logs/webserver.log
            console: false
            rotation:
              max_size: 100MB
              max_files: 10
          static:
            enable_cache: true
            cache_max_age: 86400  # 24 hours
            gzip_enabled: true
            compression_level: 6

Multi-Provider LLM Configuration
Enterprise LLM Setup with Fallback Providers:
volumes:
  configMaps:
    - name: nlweb-llm-config
      mountPath: /app/config
      data:
        config_llm.yaml: |-
          preferred_endpoint: azure_openai
          fallback_strategy: round_robin
          endpoints:
            azure_openai:
              api_key_env: AZURE_OPENAI_API_KEY
              api_endpoint_env: AZURE_OPENAI_ENDPOINT
              api_version_env: "2024-12-01-preview"
              llm_type: azure_openai
              models:
                high: gpt-4o
                low: gpt-4o-mini
              rate_limits:
                requests_per_minute: 1000
                tokens_per_minute: 150000
              retry_config:
                max_retries: 3
                backoff_factor: 2
            openai:
              api_key_env: OPENAI_API_KEY
              api_endpoint_env: OPENAI_ENDPOINT
              llm_type: openai
              models:
                high: gpt-4-turbo
                low: gpt-3.5-turbo
              rate_limits:
                requests_per_minute: 500
                tokens_per_minute: 90000
            anthropic:
              api_key_env: ANTHROPIC_API_KEY
              llm_type: anthropic
              models:
                high: claude-3-opus-20240229
                low: claude-3-haiku-20240307
              rate_limits:
                requests_per_minute: 300
                tokens_per_minute: 60000
            gemini:
              api_key_env: GCP_PROJECT
              llm_type: gemini
              models:
                high: gemini-1.5-pro
                low: gemini-1.5-flash
              rate_limits:
                requests_per_minute: 200
                tokens_per_minute: 40000

Embedding Provider Configuration for Vector Search
Multi-Provider Embedding Setup:
volumes:
  configMaps:
    - name: nlweb-embedding-config
      mountPath: /app/config
      data:
        config_embedding.yaml: |-
          preferred_provider: azure_openai
          fallback_providers:
            - openai
            - snowflake
          providers:
            azure_openai:
              api_key_env: AZURE_OPENAI_API_KEY
              api_endpoint_env: AZURE_OPENAI_ENDPOINT
              api_version_env: "2024-10-21"
              model: text-embedding-3-large
              dimensions: 3072
              batch_size: 100
              rate_limits:
                requests_per_minute: 1000
            openai:
              api_key_env: OPENAI_API_KEY
              api_endpoint_env: OPENAI_ENDPOINT
              model: text-embedding-3-large
              dimensions: 3072
              batch_size: 100
              rate_limits:
                requests_per_minute: 500
            snowflake:
              api_key_env: SNOWFLAKE_PAT
              api_endpoint_env: SNOWFLAKE_ACCOUNT_URL
              api_version_env: "2024-10-01"
              model: snowflake-arctic-embed-l
              dimensions: 1024
              batch_size: 50
              rate_limits:
                requests_per_minute: 200
            huggingface:
              api_key_env: HF_TOKEN
              model: sentence-transformers/all-mpnet-base-v2
              dimensions: 768
              local_inference: true
              device: cpu

Performance Optimization Configuration
High-Performance Caching Setup:
volumes:
  configMaps:
    - name: nlweb-performance-config
      mountPath: /app/config
      data:
        config_llm_performance.yaml: |-
          # LLM Performance Settings
          representation:
            use_compact: true
            limit: 10
            include_metadata: true
          cache:
            enable: true
            max_size: 10000
            ttl: 3600  # 1 hour
            include_schema: true
            include_provider: true
            include_model: true
            include_user_context: false
            compression: gzip
          rate_limiting:
            enable: true
            requests_per_minute: 1000
            burst_size: 100
            per_user_limit: 50
          monitoring:
            enable_metrics: true
            metrics_port: 9090
            health_check_interval: 30
            performance_logging: true

Environment-Specific Volume Configurations
Development with Hot Reloading:
volumes:
  enabled: true
  emptyDirs:
    - name: data
      mountPath: /app/data
    - name: logs
      mountPath: /app/logs
    - name: tmp
      mountPath: /tmp
    - name: cache
      mountPath: /app/cache
  # Development: Use hostPath for easy file access
  hostPaths:
    - name: dev-config
      hostPath: /local/dev/nlweb/config
      mountPath: /app/config
      type: DirectoryOrCreate

Production with Persistent Storage:
volumes:
  enabled: true
  emptyDirs:
    - name: tmp
      mountPath: /tmp
      sizeLimit: 1Gi
  pvc:
    enabled: true
    storageClass: fast-ssd
    size: 50Gi
    accessMode: ReadWriteOnce
    mountPath: /app/data
  # Production: Use ConfigMaps for configuration
  configMaps:
    - name: nlweb-prod-config
      mountPath: /app/config
    - name: nlweb-llm-config
      mountPath: /app/config/llm
    - name: nlweb-embedding-config
      mountPath: /app/config/embedding
  # Production: Use Secrets for sensitive data
  existingSecrets:
    - name: nlweb-api-keys
      mountPath: /app/secrets
      defaultMode: 0400

Step-by-Step Helm Installation Guide
Prerequisites Setup
Before deploying NLWeb, ensure you have the following prerequisites:
1. Add the Iunera Helm Repository:
helm repo add iunera https://iunera.github.io/helm-charts/
helm repo update
2. Create Namespace and Secrets:
# Create namespace
kubectl create namespace nlweb

# Create secrets for API keys
kubectl create secret generic nlweb-openai-secrets \
  --from-literal=api-key="your-openai-api-key" \
  -n nlweb

kubectl create secret generic nlweb-azure-secrets \
  --from-literal=vector-search-key="your-azure-search-key" \
  --from-literal=openai-api-key="your-azure-openai-key" \
  -n nlweb
3. Install with Custom Values:
# Create custom values file
cat > nlweb-values.yaml << EOF
replicaCount: 2
image:
  repository: iunera/nlweb
  tag: "1.2.4"
env:
  - name: NLWEB_LOGGING_PROFILE
    value: production
  - name: OPENAI_API_KEY
    valueFrom:
      secretKeyRef:
        name: nlweb-openai-secrets
        key: api-key
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
    - host: nlweb.yourdomain.com
      paths:
        - path: /
          pathType: ImplementationSpecific
  tls:
    - secretName: nlweb-tls
      hosts:
        - nlweb.yourdomain.com
resources:
  requests:
    cpu: 200m
    memory: 1Gi
  limits:
    cpu: 1000m
    memory: 2Gi
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 8
  targetCPUUtilizationPercentage: 70
EOF

# Install NLWeb
helm install nlweb iunera/nlweb \
  --namespace nlweb \
  --values nlweb-values.yaml \
  --wait --timeout 10m

4. Verify Installation:
# Check pod status
kubectl get pods -n nlweb

# Check service status
kubectl get svc -n nlweb

# Check ingress
kubectl get ingress -n nlweb

# View logs
kubectl logs -f deployment/nlweb -n nlweb
Security Considerations and Best Practices
Security is a critical aspect of any production NLWeb deployment. This section outlines key security considerations and best practices to protect your NLWeb deployment and the sensitive data it processes.
API Key Management
NLWeb handles multiple API keys for various AI providers. Best practices include:
- Using Kubernetes Secrets for sensitive data
- Implementing secret rotation policies
- Leveraging Azure Key Vault integration
- Monitoring API key usage and costs
Network Security
networkPolicies:
  enabled: true
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: ingress-nginx
  egress:
    - to: []
      ports:
        - protocol: TCP
          port: 443  # HTTPS to AI providers

Pod Security Standards
The deployment implements Pod Security Standards at the restricted level:
securityContext:
  capabilities:
    drop:
      - ALL
  readOnlyRootFilesystem: true
  runAsNonRoot: true
  runAsUser: 999
  allowPrivilegeEscalation: false

Conclusion
NLWeb's deployment in Kubernetes using GitOps methodologies represents the convergence of several technological trends: AI-powered applications, cloud-native infrastructure, and modern DevOps practices. The combination provides organizations with a robust, scalable, and maintainable platform for building next-generation web applications.
The integration with FluxCD ensures that deployments remain consistent and auditable, while the comprehensive Helm charts eliminate much of the complexity traditionally associated with Kubernetes deployments. Azure's AI services provide enterprise-grade capabilities, and the multi-provider LLM support ensures flexibility and cost optimization.
As organizations continue to embrace AI-powered applications, the patterns and practices outlined in this guide will become increasingly valuable. The GitOps approach to NLWeb deployment not only simplifies operations but also provides the foundation for scaling AI applications across enterprise environments.
The future of web applications is undoubtedly AI-powered, and NLWeb's Kubernetes GitOps deployment model provides a clear path forward for organizations ready to embrace this transformation.
Frequently Asked Questions
What sets NLWeb Deployment in Kubernetes GitOps Style apart from traditional deployment methods?
NLWeb's GitOps deployment offers automated configuration management, version-controlled infrastructure, and seamless rollback capabilities. Unlike traditional deployments, it provides declarative configuration management through Git repositories, ensuring consistency across environments and eliminating configuration drift. The integration with FluxCD enables continuous deployment with minimal manual intervention.
How does NLWeb handle multiple LLM providers in a Kubernetes environment?
NLWeb's architecture supports multiple LLM providers through a unified configuration system. The platform can simultaneously connect to OpenAI, Anthropic, Azure OpenAI, Gemini, Snowflake, and Hugging Face models. This multi-provider approach is managed through environment variables and ConfigMaps, allowing for easy switching between providers based on cost, performance, or availability requirements.
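A sketch of how such provider selection might be wired through a ConfigMap is shown below. The key names are hypothetical, since NLWeb's actual configuration schema may differ; consult the project's configuration reference for the real variables:

```yaml
# Hypothetical ConfigMap for provider selection; key names are illustrative.
apiVersion: v1
kind: ConfigMap
metadata:
  name: nlweb-llm-config
  namespace: nlweb
data:
  LLM_PROVIDER: "azure_openai"
  LLM_FALLBACK_PROVIDER: "anthropic"
  EMBEDDING_PROVIDER: "openai"
```

Because the ConfigMap is declarative, switching providers becomes a one-line Git commit that FluxCD reconciles automatically, with full history of when and why the change was made.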
What are the resource requirements for running NLWeb in production?
A typical production NLWeb deployment requires a minimum of 200m CPU and 1Gi memory per pod, with recommended limits of 1000m CPU and 2Gi memory. For high-traffic scenarios, horizontal pod autoscaling can scale from 2 to 8 replicas based on CPU utilization, matching the autoscaling values shown earlier. Storage requirements vary based on caching configuration and data persistence needs, typically starting at 10Gi for persistent volumes.
How does the GitOps approach improve security for NLWeb deployments?
GitOps enhances security through immutable infrastructure, audit trails, and declarative configuration management. All changes are tracked in Git, providing complete visibility into who made what changes and when. The approach eliminates direct cluster access for deployments, reducing the attack surface. Additionally, secrets management is handled through Kubernetes native resources and can be integrated with external secret management systems.
Can NLWeb be deployed across multiple cloud providers?
Yes, NLWeb's cloud-agnostic design allows deployment across multiple cloud providers. While it has deep Azure integration, the Kubernetes-native architecture supports deployment on AWS EKS, Google GKE, or on-premises clusters. The Helm charts abstract cloud-specific configurations, making multi-cloud deployments straightforward.
What monitoring and observability tools work best with NLWeb?
NLWeb integrates well with the Kubernetes ecosystem's monitoring tools including Prometheus for metrics collection, Grafana for visualization, and Jaeger for distributed tracing. The application exposes health check endpoints and custom metrics for AI query performance, cache hit rates, and LLM provider response times. Integration with cloud-native monitoring solutions like Azure Monitor or AWS CloudWatch is also supported.
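On clusters running the Prometheus Operator, scraping can be declared with a ServiceMonitor like the sketch below. The label selector, service port name, and metrics path are assumptions about the chart's conventions, not documented NLWeb values:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: nlweb
  namespace: nlweb
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: nlweb  # assumed chart label
  endpoints:
    - port: http       # assumed Service port name
      path: /metrics   # assumed metrics path
      interval: 30s
```

Keeping this manifest in the same Git repository as the Helm values means monitoring is versioned and reconciled alongside the application itself.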
How does NLWeb handle data privacy and compliance requirements?
NLWeb implements several privacy and compliance features including data encryption in transit and at rest, configurable data retention policies, and audit logging. The platform supports role-based access control and can be configured to meet various compliance standards including GDPR, HIPAA, and SOC 2. Integration with enterprise identity providers ensures proper authentication and authorization.
What's the recommended approach for testing NLWeb deployments?
The recommended testing approach includes unit tests for individual components, integration tests for AI provider connectivity, and end-to-end tests for complete user workflows. The GitOps deployment model supports multiple environments (development, staging, production) with environment-specific configurations. Automated testing can be integrated into the CI/CD pipeline to validate deployments before they reach production.
How does NLWeb's caching system improve performance and reduce costs?
NLWeb implements intelligent caching that considers multiple factors including query schema, AI provider, and model type when generating cache keys. This approach significantly reduces API calls to expensive LLM providers while maintaining response accuracy. The cache can be configured with custom TTL values and size limits, and supports both in-memory and persistent storage options.
What are the backup and disaster recovery options for NLWeb?
NLWeb supports comprehensive backup strategies including persistent volume snapshots, configuration backup through Git repositories, and database backups for vector stores. The GitOps approach inherently provides configuration recovery through Git history. For disaster recovery, the platform supports cross-region deployments and can be quickly restored in different availability zones or cloud regions using the same Helm charts and configuration.
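For persistent-volume snapshots specifically, clusters with a CSI snapshot controller can use the standard VolumeSnapshot API. In the sketch below, the snapshot class and PVC name are assumptions about the chart's naming, shown for illustration only:

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: nlweb-data-snapshot
  namespace: nlweb
spec:
  volumeSnapshotClassName: csi-snapclass   # assumed snapshot class
  source:
    persistentVolumeClaimName: nlweb-data  # assumed PVC name
```

Combined with the Git history that GitOps already provides for configuration, volume snapshots cover the remaining stateful pieces, so a full recovery is a `helm install` from the repository plus a restore from the latest snapshot.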