The landscape of AI-powered web applications is evolving rapidly, and at the forefront of this revolution stands NLWeb — Microsoft’s groundbreaking open-source protocol that transforms traditional websites into intelligent, AI-driven knowledge hubs. When combined with the Kubernetes container orchestration platform and GitOps methodologies, NLWeb creates a production-ready ecosystem that’s both scalable and maintainable. This comprehensive guide explores how to deploy NLWeb using modern DevOps practices, leveraging the power of FluxCD for continuous deployment and Azure’s robust Kubernetes infrastructure.

- What Makes NLWeb Revolutionary in the AI Web Space?
- Understanding the GitOps Advantage for NLWeb Deployments
- Technical Architecture: NLWeb on Kubernetes
- FluxCD Integration: Continuous Deployment Made Simple
- Azure Integration: Cloud-Native AI Infrastructure
- Production-Ready Features and Best Practices
- Deployment Comparison: NLWeb vs Traditional Approaches
- Advanced Configuration Examples
- Security Considerations and Best Practices
- Conclusion
- Frequently Asked Questions
What Makes NLWeb Revolutionary in the AI Web Space?
NLWeb represents a paradigm shift in how we think about web applications. Unlike traditional static websites or even dynamic web applications, NLWeb enables AI-powered websites that can understand, process, and respond to user queries with unprecedented intelligence. The platform seamlessly integrates with vector databases, multiple LLM providers, and enterprise data sources to create truly interactive web experiences.
The protocol’s architecture is designed with modern cloud-native principles and CNCF best practices in mind. It supports multiple embedding providers including OpenAI, Azure OpenAI, Gemini, and Snowflake, while offering flexible LLM integration with providers ranging from Anthropic’s Claude AI assistant to Hugging Face models. This multi-provider approach ensures resilience and allows organizations to optimize costs while maintaining performance.
Key Points:
- Intelligent Interactions: Enables natural language understanding and contextual responses
- Multi-Provider Support: Integrates with various AI providers for flexibility and redundancy
- Not Yet Enterprise-Ready: Although designed with production deployments and ease of use in mind, the project is still at an early stage. We contribute bug fixes and enhancements to help it mature.
Understanding the GitOps Advantage for NLWeb Deployments
GitOps, a declarative infrastructure management methodology, has emerged as the gold standard for Kubernetes deployments, and NLWeb’s architecture aligns perfectly with this approach. By treating Git repositories as the single source of truth for infrastructure and application configurations, teams can achieve unprecedented levels of automation, auditability, and reliability.
The iunera helm charts repository provides production-ready Helm charts specifically designed for NLWeb deployments. These charts encapsulate years of operational experience and best practices, making it straightforward to deploy NLWeb in any Kubernetes environment while maintaining consistency across development, staging, and production environments. If you are interested in a general-purpose Helm chart for almost any kind of simple deployment, the Spring Boot chart is worth a look.
FluxCD serves as the GitOps operator, continuously monitoring the Git repository for changes and automatically applying them to the Kubernetes cluster. This approach eliminates configuration drift, reduces manual intervention, and provides a complete audit trail of all changes made to the system.
GitOps Benefits for NLWeb:
- Declarative Infrastructure: Everything defined as code in Git repositories
- Automated Deployments: Changes automatically applied when committed to Git
- Version Control: Complete history of all configuration changes
- Rollback Capability: Easy reversion to previous known-good states
- Consistency: Same deployment process across all environments
Now, let’s explore the technical architecture of NLWeb on Kubernetes to understand how these components work together.
A different use case we’ve implemented is a production-grade Apache Druid deployment using Druid Operators and FluxCD.
Technical Architecture: NLWeb on Kubernetes
Core Components and Configuration
NLWeb’s Kubernetes deployment consists of several key components that work together to deliver AI-powered web experiences:
Application Layer: The core NLWeb application runs as a Python-based service, typically deployed using the iunera/nlweb Docker image. The application serves on port 8000 and includes comprehensive health checks for both liveness and readiness probes.
Configuration Management: NLWeb uses a sophisticated configuration system with multiple YAML files:
- config_webserver.yaml: Handles server settings, CORS policies, SSL configuration, and static file serving
- config_llm.yaml: Manages LLM provider configurations and model selections
- config_embedding.yaml: Controls embedding provider settings and model preferences
- config_llm_performance.yaml: Optimizes performance through caching and response management
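To illustrate how such layered configuration files can be combined at startup, here is a minimal Python sketch of a recursive merge. This is a hypothetical helper for illustration only; NLWeb's actual configuration loader may work differently, and the sample values are made up.

```python
# Hypothetical sketch of layered config merging; not NLWeb's actual loader.
from copy import deepcopy

def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge override into base, returning a new dict."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

# Values that might come from config_webserver.yaml and config_llm.yaml
webserver_cfg = {"server": {"host": "0.0.0.0", "enable_cors": True}, "port": 8000}
llm_cfg = {"server": {"timeout": 30}, "preferred_endpoint": "azure_openai"}

config = deep_merge(webserver_cfg, llm_cfg)
```

With this shape, later files can override or extend earlier ones without clobbering whole sections.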
Security Context: The deployment implements Kubernetes pod security standards and best practices including:
- Non-root user execution (UID 999)
- Read-only root filesystem
- Dropped capabilities
- Security contexts for both pod and container levels
This architecture provides a secure, scalable foundation for deploying NLWeb in production environments.
Helm Chart Structure and Values
The NLWeb Helm chart provides extensive customization options through its values.yaml configuration:
replicaCount: 1
image:
  repository: iunera/nlweb
  pullPolicy: IfNotPresent
service:
  type: ClusterIP
  port: 8000
env:
  - name: PYTHONPATH
    value: "/app"
  - name: PORT
    value: "8000"
  - name: NLWEB_LOGGING_PROFILE
    value: production
The chart supports advanced features including:
- Autoscaling: Horizontal Pod Autoscaler configuration with CPU-based scaling
- Ingress: NGINX ingress controller integration with SSL/TLS termination
- Volumes: Persistent volume claims, ConfigMaps, and EmptyDir volumes
- ConfigMaps: Provide the NLWeb configuration files, such as LLM and vector endpoint settings
- Security: Pod security contexts and network policies
FluxCD Integration: Continuous Deployment Made Simple
FluxCD, a continuous delivery tool for Kubernetes, is a critical component in the GitOps deployment strategy for NLWeb, providing automated continuous delivery capabilities. It connects your Git repository to your Kubernetes cluster, ensuring that any changes to your deployment manifests are automatically applied.
HelmRelease Controller
The GitOps deployment of NLWeb leverages FluxCD’s HelmRelease custom resource to manage the application lifecycle. Here’s how the integration works:
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: nlweb
  namespace: nlweb
spec:
  releaseName: nlweb
  targetNamespace: nlweb
  chart:
    spec:
      chart: nlweb
      version: ">=1.1.0"
      sourceRef:
        kind: HelmRepository
        name: iunera-helm-charts
        namespace: helmrepos
  interval: 1m0s

This configuration ensures that FluxCD continuously monitors the Helm repository for updates and automatically applies them to the cluster. The interval: 1m0s setting means FluxCD checks for changes every minute, providing near real-time deployment capabilities.
Image Automation and Version Management
FluxCD’s image automation capabilities work seamlessly with NLWeb deployments. The system can automatically detect new container image versions and update the deployment manifests accordingly. This is particularly valuable for maintaining up-to-date deployments while ensuring proper testing and validation workflows.
Image Policy Configuration
NLWeb deployments leverage FluxCD’s image automation controllers to automatically update container images when new versions are published. This is configured through special annotations in the HelmRelease manifest:
image:
  repository: iunera/nlweb # {"$imagepolicy": "flux-system:nlweb:name"}
  tag: 1.2.4 # {"$imagepolicy": "flux-system:nlweb:tag"}

These annotations tell FluxCD to automatically update the image repository and tag values based on the image policy defined in the nlweb.imagerepo.yaml file. When a new image version is detected that matches the policy criteria, FluxCD automatically updates the manifest and commits the changes to the Git repository.
Image Repository and Policy Configuration
The image automation is configured through two key resources defined in the nlweb.imagerepo.yaml file:
# ImageRepository defines the Docker image repository to monitor
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageRepository
metadata:
  name: nlweb
  namespace: flux-system
spec:
  image: iunera/nlweb
  interval: 10m
  secretRef:
    name: iunera
---
# ImagePolicy defines which image versions to select
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImagePolicy
metadata:
  name: nlweb
  namespace: flux-system
spec:
  imageRepositoryRef:
    name: nlweb
  policy:
    semver:
      range: ">=1.0.0"

The ImageRepository resource specifies:
- The Docker image to monitor (iunera/nlweb)
- How often to check for new versions (interval: 10m)
- Authentication credentials for the Docker registry (secretRef: name: iunera)
The ImagePolicy resource defines the selection criteria for image versions using semantic versioning, in this case selecting any version greater than or equal to 1.0.0.
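The effect of such a policy can be sketched in plain Python: among the tags published to the registry, pick the highest version that satisfies the range, ignoring non-semver tags such as latest. This is illustrative logic only, not FluxCD's implementation.

```python
# Illustrative re-implementation of a ">=1.0.0" semver policy; not Flux's actual code.
import re

SEMVER = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")

def pick_latest(tags, minimum=(1, 0, 0)):
    """Return the highest semver tag that is >= minimum, or None."""
    candidates = []
    for tag in tags:
        m = SEMVER.match(tag)
        if m:  # non-semver tags such as "latest" are skipped
            version = tuple(int(part) for part in m.groups())
            if version >= minimum:
                candidates.append((version, tag))
    return max(candidates)[1] if candidates else None

print(pick_latest(["latest", "0.9.0", "1.2.3", "1.2.4"]))  # -> 1.2.4
```

Note that mutable tags like latest never match a semver policy, which is exactly why version-tagged images are required for this automation.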
Automation Workflow
The complete automation workflow is managed by the ImageUpdateAutomation resource:
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageUpdateAutomation
metadata:
  name: flux-system
  namespace: flux-system
spec:
  git:
    checkout:
      ref:
        branch: master
    commit:
      author:
        email: fluxcdbot@nodomain.local
        name: fluxcdbot
      messageTemplate: |
        Automated image update

        Automation name: {{ .AutomationObject }}

        Files:
        {{ range $filename, $_ := .Changed.FileChanges -}}
        - {{ $filename }}
        {{ end -}}

        Objects:
        {{ range $resource, $changes := .Changed.Objects -}}
        - {{ $resource.Kind }} {{ $resource.Name }}
          Changes:
        {{- range $_, $change := $changes }}
          - {{ $change.OldValue }} -> {{ $change.NewValue }}
        {{ end -}}
        {{ end -}}
    push:
      branch: master
  interval: 30m0s
  sourceRef:
    kind: GitRepository
    name: flux-system
  update:
    path: ./kubernetes/common
    strategy: Setters

This resource:
- Checks out the Git repository’s master branch
- Configures commit details with a template that includes what was changed
- Pushes changes back to the master branch
- Runs every 30 minutes
- Updates files in the ./kubernetes/common path using the “Setters” strategy (looking for image policy annotations)
With this configuration, the NLWeb deployment automatically stays up-to-date with the latest compatible container images without manual intervention, while maintaining a complete audit trail of all changes through Git history.
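Conceptually, the “Setters” strategy scans manifests for the $imagepolicy marker comments shown earlier and rewrites the value on that line. A simplified, hypothetical Python version of that rewrite might look like this (FluxCD's real implementation is more general and operates on parsed YAML):

```python
# Simplified illustration of a "Setters"-style update; not Flux's real implementation.
import re

# Matches:   tag: <value> # {"$imagepolicy": "<namespace>:<policy>:tag"}
MARKER = re.compile(r'^(\s*tag:\s*)(\S+)(\s*#\s*\{"\$imagepolicy":\s*"([^"]+):tag"\})')

def apply_setters(manifest: str, resolved: dict) -> str:
    """Rewrite 'tag:' lines whose marker comment names a known image policy."""
    out = []
    for line in manifest.splitlines():
        m = MARKER.match(line)
        if m and m.group(4) in resolved:
            line = f"{m.group(1)}{resolved[m.group(4)]}{m.group(3)}"
        out.append(line)
    return "\n".join(out)

manifest = 'image:\n  repository: iunera/nlweb\n  tag: 1.2.3 # {"$imagepolicy": "flux-system:nlweb:tag"}'
updated = apply_setters(manifest, {"flux-system:nlweb": "1.2.4"})
```

The marker comment survives the rewrite, so the same line remains eligible for future automated updates.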
Docker Build and CI/CD Pipeline
The NLWeb Docker image build and deployment process follows a comprehensive CI/CD pipeline that integrates with the FluxCD GitOps workflow:
Dockerfile Structure and Multi-Stage Build
The NLWeb Dockerfile uses a multi-stage Docker build to create an efficient and secure deployment package:
# Stage 1: Build stage
FROM python:3.13-slim AS builder
# Install build dependencies
RUN apt-get update && \
apt-get install -y --no-install-recommends gcc python3-dev && \
pip install --no-cache-dir --upgrade pip && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
WORKDIR /app
# Copy requirements file
COPY code/requirements.txt .
# Install Python packages
RUN pip install --no-cache-dir -r requirements.txt
# Copy requirements file
COPY docker_requirements.txt .
# Install Python packages
RUN pip install --no-cache-dir -r docker_requirements.txt
# Stage 2: Runtime stage
FROM python:3.13-slim
# Apply security updates
RUN apt-get update && \
apt-get install -y --no-install-recommends --only-upgrade \
$(apt-get --just-print upgrade | grep "^Inst" | grep -i securi | awk '{print $2}') && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
WORKDIR /app
# Create a non-root user and set permissions
RUN groupadd -r nlweb && \
useradd -r -g nlweb -d /app -s /bin/bash nlweb && \
chown -R nlweb:nlweb /app
USER nlweb
# Copy application code (--chown ensures the files are owned by the non-root user)
COPY --chown=nlweb:nlweb code/ /app/
COPY --chown=nlweb:nlweb static/ /app/static/
# Copy installed packages from builder stage
COPY --from=builder /usr/local/lib/python3.13/site-packages /usr/local/lib/python3.13/site-packages
COPY --from=builder /usr/local/bin /usr/local/bin
# Expose the port the app runs on
EXPOSE 8000
# Set environment variables
ENV NLWEB_OUTPUT_DIR=/app
ENV PYTHONPATH=/app
ENV PORT=8000
ENV VERSION=1.2.4
# Command to run the application
CMD ["python", "app-file.py"]

Key aspects of the Dockerfile:
- Stage 1 (Builder): Installs all dependencies and build tools
- Stage 2 (Runtime): Creates a minimal runtime environment
- Security Features: Non-root user, security updates, minimal dependencies
- Version Definition: ENV VERSION=1.2.4 defines the version that will be used for tagging
GitHub Actions Workflow
When changes are pushed to the iuneracustomizations branch and the Dockerfile is modified, the GitHub Actions CI/CD automation workflow in .github/workflows/prod-build.yml is triggered:
name: prod-build

on:
  push:
    branches:
      - iuneracustomizations
    paths:
      - Dockerfile

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Log in to Private Registry
        uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3
      - name: Extract Version from Dockerfile
        id: extract_version
        run: |
          # Extract the VERSION from Dockerfile
          VERSION=$(grep "ENV VERSION=" Dockerfile | cut -d= -f2)
          echo "VERSION=${VERSION}" >> $GITHUB_ENV
          echo "Using version from Dockerfile: ${VERSION}"
      - name: Build the Docker image
        run: |
          docker build -t iunera/nlweb:latest -t iunera/nlweb:${{ env.VERSION }} .
          docker push iunera/nlweb:latest
          docker push iunera/nlweb:${{ env.VERSION }}
          echo "Built and pushed Docker image with tags: latest, ${{ env.VERSION }}"
      - name: Inspect
        run: |
          docker image inspect iunera/nlweb:latest
      - name: Create and Push Git Tag
        run: |
          git config --global user.name "GitHub Actions"
          git config --global user.email "actions@github.com"
          git tag -a v${{ env.VERSION }} -m "Release version ${{ env.VERSION }}"
          git push origin v${{ env.VERSION }}

The workflow performs these steps:
- Checkout Repository: Clones the repository to the GitHub Actions runner
- Set up Docker Buildx: Configures Docker with multi-architecture build support
- Log in to Docker Hub: Authenticates with Docker Hub using repository secrets
- Set up QEMU: Enables building for multiple architectures (ARM64, AMD64)
- Extract Version: Parses the Dockerfile to extract the VERSION environment variable
- Build and Push: Builds the Docker image with two tags (latest and the version number) and pushes both to Docker Hub
- Inspect: Displays information about the built image for verification
- Create Git Tag: Creates a Git tag for the version and pushes it to the repository
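The version-extraction step can be reproduced in a few lines of Python, which is handy for testing the parsing logic locally without running the workflow. This sketch mirrors the grep/cut pipeline above against a made-up Dockerfile snippet:

```python
# Local re-implementation of the workflow's version extraction
# (grep "ENV VERSION=" Dockerfile | cut -d= -f2).
def extract_version(dockerfile_text: str):
    """Return the value after the first 'ENV VERSION=' occurrence, or None."""
    for line in dockerfile_text.splitlines():
        if "ENV VERSION=" in line:
            return line.split("=", 1)[1].strip()
    return None

dockerfile = 'FROM python:3.13-slim\nENV VERSION=1.2.4\nCMD ["python", "app-file.py"]'
print(extract_version(dockerfile))  # -> 1.2.4
```

Keeping the version in one well-known place (the Dockerfile) means the CI pipeline, image tags, and Git tags all derive from a single value.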
Complete CI/CD to Deployment Flow
The complete flow from Dockerfile to deployment involves:
- Development: A developer updates the Dockerfile, potentially changing the VERSION
- CI/CD: GitHub Actions builds and pushes the Docker image to Docker Hub
- Automation: FluxCD detects the new image version in Docker Hub
- GitOps: FluxCD updates the Kubernetes manifests with the new image version and commits the changes back to the Git repository
- Deployment: FluxCD applies the changes to the Kubernetes cluster, creating new pods with the updated image
This GitOps approach ensures that:
- The Git repository is the single source of truth
- All changes are tracked and auditable
- Deployments are automated and consistent
- Rollbacks are simple and reliable
Local Development Environment
While the GitHub Actions workflow handles production builds, local development uses Docker Compose:
services:
  nlweb:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: nlweb
    ports:
      - "8000:8000"
    env_file:
      - ./code/.env
    environment:
      - PYTHONPATH=/app
      - PORT=8000
    volumes:
      - ./data:/data
      - ./code/config:/app/config:ro
    healthcheck:
      test: ["CMD-SHELL", "python -c \"import urllib.request; urllib.request.urlopen('http://localhost:8000')\""]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s
    restart: unless-stopped
    user: nlweb

This setup:
- Uses the same Dockerfile as production
- Mounts local directories for data and configuration
- Loads environment variables from a local .env file
- Includes healthchecks for monitoring
- Runs as the non-root nlweb user
The combination of GitHub Actions for CI/CD and FluxCD for GitOps creates a robust and automated pipeline for building and deploying NLWeb, ensuring consistency between development and production environments.
Azure Integration: Cloud-Native AI Infrastructure
NLWeb’s integration with Azure services makes it an ideal choice for organizations already invested in Microsoft’s cloud ecosystem. The platform natively supports:
Azure Cognitive Search: For vector search capabilities, NLWeb integrates with Azure’s vector search service, providing scalable and performant similarity search across large datasets.
Azure OpenAI Service: Direct integration with Azure’s OpenAI offerings, including GPT-4 and embedding models, ensures enterprise-grade AI capabilities with proper governance and compliance.
Azure Container Registry: Seamless integration with ACR for container image management and security scanning.
The configuration for Azure services is handled through environment variables and ConfigMaps, making it easy to manage different environments and maintain security best practices:
env:
  - name: AZURE_VECTOR_SEARCH_ENDPOINT
    value: "https://your-vector-search-db.search.windows.net"
  - name: AZURE_OPENAI_ENDPOINT
    value: "https://your-openai-instance.openai.azure.com/"

Production-Ready Features and Best Practices
Multi-Provider LLM Support
One of NLWeb’s standout features is its support for multiple LLM providers, ensuring vendor independence and cost optimization. The platform supports:
- OpenAI: GPT-4.1 and GPT-4.1-mini models
- Anthropic: Claude-3-7-sonnet-latest and Claude-3-5-haiku-latest
- Azure OpenAI: Enterprise-grade OpenAI models with Azure’s security and compliance
- Google Gemini: Gemini models (such as gemini-1.5-pro and gemini-1.5-flash) for diverse AI capabilities
- Snowflake: Arctic embedding models and Claude integration
- Hugging Face: Open-source models including Qwen2.5 series
This multi-provider approach allows organizations to:
- Optimize costs by using different models for different use cases
- Ensure service availability through provider redundancy
- Experiment with cutting-edge models without vendor lock-in
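The provider redundancy described above can be sketched as a simple fallback loop: try the preferred provider first and move down an ordered list on failure. This is a hypothetical helper for illustration, not NLWeb's actual routing code, and the provider callables are stand-ins:

```python
# Hypothetical provider-fallback loop illustrating multi-provider redundancy.
def complete_with_fallback(prompt, providers):
    """Try each (name, callable) provider in order; return the first success."""
    errors = {}
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real client would catch specific error types
            errors[name] = exc
    raise RuntimeError(f"all providers failed: {list(errors)}")

def flaky_azure(prompt):           # stand-in for an Azure OpenAI client call
    raise TimeoutError("Azure OpenAI unavailable")

def working_anthropic(prompt):     # stand-in for an Anthropic client call
    return f"answer to: {prompt}"

name, answer = complete_with_fallback(
    "hi", [("azure_openai", flaky_azure), ("anthropic", working_anthropic)]
)
```

In a real deployment the ordered list would come from configuration (see the preferred_endpoint and fallback settings in the LLM ConfigMap examples later in this article).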
Performance Optimization and Caching
Iunera’s customizations of NLWeb implement sophisticated caching mechanisms to optimize performance and reduce API costs:
cache:
  enable: true
  max_size: 1000
  ttl: 0  # No expiration
  include_schema: true
  include_provider: true
  include_model: true
The caching system considers multiple factors including schema, provider, and model when generating cache keys, ensuring accurate cache hits while maintaining response quality.
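A cache key built this way can be sketched as a hash over the normalized query plus the schema, provider, and model fields. This is an illustration of the idea only, not the exact key format used by the implementation:

```python
# Illustrative cache-key construction over schema/provider/model, as described above.
import hashlib
import json

def cache_key(query, schema, provider, model):
    """Stable key: identical inputs hash identically; changing any field misses."""
    payload = json.dumps(
        {"q": query.strip().lower(), "schema": schema,
         "provider": provider, "model": model},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()

k1 = cache_key("What is NLWeb?", "schema.org", "azure_openai", "gpt-4o")
k2 = cache_key("what is nlweb?  ", "schema.org", "azure_openai", "gpt-4o")  # same key
k3 = cache_key("What is NLWeb?", "schema.org", "openai", "gpt-4o")          # different key
```

Including the provider and model in the key prevents a response generated by one model from being served as a cache hit for another.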
Enterprise Data Integration
Building on the foundation laid out in the comprehensive guide to exposing enterprise data with Java and Spring for AI indexing, NLWeb provides seamless integration with enterprise data sources. The platform supports:
- JSON-LD and Schema.org: Structured data integration for semantic web capabilities
- Vector Database Integration: Support for various vector databases including Azure Cognitive Search
- Real-time Data Processing: Stream processing capabilities for dynamic content updates
- Enterprise Security: Role-based access control and data governance features
Deployment Comparison: NLWeb vs Traditional Approaches
| Feature | NLWeb GitOps | Azure Web Apps | Traditional Linux Install |
|---|---|---|---|
| Scalability | Auto-scaling with HPA | Limited vertical scaling | Manual scaling required |
| Deployment Speed | Automated via GitOps | Manual deployment | Manual configuration |
| Configuration Management | Git-based versioning | Portal-based settings | File-based configuration |
| Multi-environment Support | Native Kubernetes namespaces | Separate app instances | Separate servers |
| Rollback Capabilities | Git-based rollbacks | Limited rollback options | Manual rollback process |
| Cost Optimization | Resource-based pricing | App Service Plan pricing | Infrastructure costs |
| Monitoring & Observability | Kubernetes-native tools | Azure Monitor integration | Custom monitoring setup |
| Security | Pod security contexts | Azure security features | Manual security hardening |
The iunera helm charts provide a significant advantage in this comparison, offering production-tested configurations that eliminate common deployment pitfalls.
Advanced Configuration Examples
This section provides practical, production-ready configuration examples for deploying NLWeb in various environments. These examples can be used as templates for your own deployments, with customization as needed for your specific requirements.
Note: The following examples are organized by use case to help you find the most relevant configurations for your needs.
Complete Helm Installation Manifest Examples
Basic Development Setup
For development environments, here’s a minimal helm installation manifest:
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: nlweb-dev
  namespace: nlweb-dev
spec:
  releaseName: nlweb-dev
  targetNamespace: nlweb-dev
  chart:
    spec:
      chart: nlweb
      version: ">=1.1.0"
      sourceRef:
        kind: HelmRepository
        name: iunera-helm-charts
        namespace: helmrepos
  interval: 5m0s
  install:
    createNamespace: true
  values:
    replicaCount: 1
    image:
      repository: iunera/nlweb
      tag: "latest"
      pullPolicy: Always
    env:
      - name: NLWEB_LOGGING_PROFILE
        value: development
      - name: OPENAI_API_KEY
        valueFrom:
          secretKeyRef:
            name: nlweb-secrets
            key: openai-api-key
    ingress:
      enabled: true
      annotations:
        kubernetes.io/ingress.class: nginx
      hosts:
        - host: nlweb-dev.local
          paths:
            - path: /
              pathType: ImplementationSpecific
    resources:
      requests:
        cpu: 100m
        memory: 512Mi
      limits:
        cpu: 500m
        memory: 1Gi

Production-Ready Setup with Multi-Provider LLM Support
For production environments with comprehensive AI provider integration:
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: nlweb-prod
  namespace: nlweb
spec:
  releaseName: nlweb
  targetNamespace: nlweb
  chart:
    spec:
      chart: nlweb
      version: ">=1.1.0"
      sourceRef:
        kind: HelmRepository
        name: iunera-helm-charts
        namespace: helmrepos
  interval: 1m0s
  install:
    createNamespace: false
  upgrade:
    remediation:
      retries: 3
  values:
    replicaCount: 3
    image:
      repository: iunera/nlweb
      tag: "1.2.4"
      pullPolicy: IfNotPresent
    env:
      - name: NLWEB_LOGGING_PROFILE
        value: production
      - name: AZURE_VECTOR_SEARCH_ENDPOINT
        value: "https://nlweb-prod.search.windows.net"
      - name: AZURE_VECTOR_SEARCH_API_KEY
        valueFrom:
          secretKeyRef:
            name: nlweb-azure-secrets
            key: vector-search-key
      - name: OPENAI_API_KEY
        valueFrom:
          secretKeyRef:
            name: nlweb-openai-secrets
            key: api-key
      - name: ANTHROPIC_API_KEY
        valueFrom:
          secretKeyRef:
            name: nlweb-anthropic-secrets
            key: api-key
      - name: AZURE_OPENAI_API_KEY
        valueFrom:
          secretKeyRef:
            name: nlweb-azure-openai-secrets
            key: api-key
    ingress:
      enabled: true
      annotations:
        kubernetes.io/ingress.class: nginx
        kubernetes.io/tls-acme: "true"
        cert-manager.io/cluster-issuer: letsencrypt-prod
        nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
        nginx.ingress.kubernetes.io/enable-modsecurity: "true"
        nginx.ingress.kubernetes.io/enable-owasp-core-rules: "true"
        nginx.ingress.kubernetes.io/rate-limit: "100"
        nginx.ingress.kubernetes.io/rate-limit-window: "1m"
      hosts:
        - host: nlweb.example.com
          paths:
            - path: /
              pathType: ImplementationSpecific
      tls:
        - secretName: nlweb-tls
          hosts:
            - nlweb.example.com
    resources:
      requests:
        cpu: 200m
        memory: 1Gi
      limits:
        cpu: 1000m
        memory: 2Gi
    autoscaling:
      enabled: true
      minReplicas: 3
      maxReplicas: 10
      targetCPUUtilizationPercentage: 70
      targetMemoryUtilizationPercentage: 80

Comprehensive ConfigMap Customization Examples
Web Server Configuration for Different Environments
Development Environment ConfigMap:
volumes:
  configMaps:
    - name: nlweb-dev-config
      mountPath: /app/config
      data:
        config_webserver.yaml: |-
          port: 8000
          static_directory: ../../
          mode: development
          server:
            host: 0.0.0.0
            enable_cors: true
            cors_trusted_origins: "*"  # Allow all origins in dev
            max_connections: 50
            timeout: 60
          logging:
            level: debug
            file: ./logs/webserver.log
            console: true
          static:
            enable_cache: false  # Disable caching in dev
            gzip_enabled: false

Production Environment ConfigMap:
volumes:
  configMaps:
    - name: nlweb-prod-config
      mountPath: /app/config
      data:
        config_webserver.yaml: |-
          port: 8000
          static_directory: ../../
          mode: production
          server:
            host: 0.0.0.0
            enable_cors: true
            cors_trusted_origins:
              - https://nlweb.example.com
              - https://api.example.com
              - https://admin.example.com
            max_connections: 200
            timeout: 30
            ssl:
              enabled: true
              cert_file_env: SSL_CERT_FILE
              key_file_env: SSL_KEY_FILE
          logging:
            level: info
            file: ./logs/webserver.log
            console: false
            rotation:
              max_size: 100MB
              max_files: 10
          static:
            enable_cache: true
            cache_max_age: 86400  # 24 hours
            gzip_enabled: true
            compression_level: 6

Multi-Provider LLM Configuration
Enterprise LLM Setup with Fallback Providers:
volumes:
  configMaps:
    - name: nlweb-llm-config
      mountPath: /app/config
      data:
        config_llm.yaml: |-
          preferred_endpoint: azure_openai
          fallback_strategy: round_robin
          endpoints:
            azure_openai:
              api_key_env: AZURE_OPENAI_API_KEY
              api_endpoint_env: AZURE_OPENAI_ENDPOINT
              api_version_env: "2024-12-01-preview"
              llm_type: azure_openai
              models:
                high: gpt-4o
                low: gpt-4o-mini
              rate_limits:
                requests_per_minute: 1000
                tokens_per_minute: 150000
              retry_config:
                max_retries: 3
                backoff_factor: 2
            openai:
              api_key_env: OPENAI_API_KEY
              api_endpoint_env: OPENAI_ENDPOINT
              llm_type: openai
              models:
                high: gpt-4-turbo
                low: gpt-3.5-turbo
              rate_limits:
                requests_per_minute: 500
                tokens_per_minute: 90000
            anthropic:
              api_key_env: ANTHROPIC_API_KEY
              llm_type: anthropic
              models:
                high: claude-3-opus-20240229
                low: claude-3-haiku-20240307
              rate_limits:
                requests_per_minute: 300
                tokens_per_minute: 60000
            gemini:
              api_key_env: GCP_PROJECT
              llm_type: gemini
              models:
                high: gemini-1.5-pro
                low: gemini-1.5-flash
              rate_limits:
                requests_per_minute: 200
                tokens_per_minute: 40000

Embedding Provider Configuration for Vector Search
Multi-Provider Embedding Setup:
volumes:
  configMaps:
    - name: nlweb-embedding-config
      mountPath: /app/config
      data:
        config_embedding.yaml: |-
          preferred_provider: azure_openai
          fallback_providers:
            - openai
            - snowflake
          providers:
            azure_openai:
              api_key_env: AZURE_OPENAI_API_KEY
              api_endpoint_env: AZURE_OPENAI_ENDPOINT
              api_version_env: "2024-10-21"
              model: text-embedding-3-large
              dimensions: 3072
              batch_size: 100
              rate_limits:
                requests_per_minute: 1000
            openai:
              api_key_env: OPENAI_API_KEY
              api_endpoint_env: OPENAI_ENDPOINT
              model: text-embedding-3-large
              dimensions: 3072
              batch_size: 100
              rate_limits:
                requests_per_minute: 500
            snowflake:
              api_key_env: SNOWFLAKE_PAT
              api_endpoint_env: SNOWFLAKE_ACCOUNT_URL
              api_version_env: "2024-10-01"
              model: snowflake-arctic-embed-l
              dimensions: 1024
              batch_size: 50
              rate_limits:
                requests_per_minute: 200
            huggingface:
              api_key_env: HF_TOKEN
              model: sentence-transformers/all-mpnet-base-v2
              dimensions: 768
              local_inference: true
              device: cpu

Performance Optimization Configuration
High-Performance Caching Setup:
volumes:
  configMaps:
    - name: nlweb-performance-config
      mountPath: /app/config
      data:
        config_llm_performance.yaml: |-
          # LLM Performance Settings
          representation:
            use_compact: true
            limit: 10
            include_metadata: true
          cache:
            enable: true
            max_size: 10000
            ttl: 3600  # 1 hour
            include_schema: true
            include_provider: true
            include_model: true
            include_user_context: false
            compression: gzip
          rate_limiting:
            enable: true
            requests_per_minute: 1000
            burst_size: 100
            per_user_limit: 50
          monitoring:
            enable_metrics: true
            metrics_port: 9090
            health_check_interval: 30
            performance_logging: true

Environment-Specific Volume Configurations
Development with Hot Reloading:
volumes:
  enabled: true
  emptyDirs:
    - name: data
      mountPath: /app/data
    - name: logs
      mountPath: /app/logs
    - name: tmp
      mountPath: /tmp
    - name: cache
      mountPath: /app/cache
  # Development: Use hostPath for easy file access
  hostPaths:
    - name: dev-config
      hostPath: /local/dev/nlweb/config
      mountPath: /app/config
      type: DirectoryOrCreate

Production with Persistent Storage:
volumes:
  enabled: true
  emptyDirs:
    - name: tmp
      mountPath: /tmp
      sizeLimit: 1Gi
  pvc:
    enabled: true
    storageClass: fast-ssd
    size: 50Gi
    accessMode: ReadWriteOnce
    mountPath: /app/data
  # Production: Use ConfigMaps for configuration
  configMaps:
    - name: nlweb-prod-config
      mountPath: /app/config
    - name: nlweb-llm-config
      mountPath: /app/config/llm
    - name: nlweb-embedding-config
      mountPath: /app/config/embedding
  # Production: Use Secrets for sensitive data
  existingSecrets:
    - name: nlweb-api-keys
      mountPath: /app/secrets
      defaultMode: 0400

Step-by-Step Helm Installation Guide
Prerequisites Setup
Before deploying NLWeb, ensure you have the following prerequisites:
1. Add the Iunera Helm Repository:
helm repo add iunera https://iunera.github.io/helm-charts/
helm repo update
2. Create Namespace and Secrets:
# Create namespace
kubectl create namespace nlweb

# Create secrets for API keys
kubectl create secret generic nlweb-openai-secrets \
  --from-literal=api-key="your-openai-api-key" \
  -n nlweb

kubectl create secret generic nlweb-azure-secrets \
  --from-literal=vector-search-key="your-azure-search-key" \
  --from-literal=openai-api-key="your-azure-openai-key" \
  -n nlweb
3. Install with Custom Values:
# Create custom values file
cat > nlweb-values.yaml << EOF
replicaCount: 2
image:
  repository: iunera/nlweb
  tag: "1.2.4"
env:
  - name: NLWEB_LOGGING_PROFILE
    value: production
  - name: OPENAI_API_KEY
    valueFrom:
      secretKeyRef:
        name: nlweb-openai-secrets
        key: api-key
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
    - host: nlweb.yourdomain.com
      paths:
        - path: /
          pathType: ImplementationSpecific
  tls:
    - secretName: nlweb-tls
      hosts:
        - nlweb.yourdomain.com
resources:
  requests:
    cpu: 200m
    memory: 1Gi
  limits:
    cpu: 1000m
    memory: 2Gi
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 8
  targetCPUUtilizationPercentage: 70
EOF

# Install NLWeb
helm install nlweb iunera/nlweb \
  --namespace nlweb \
  --values nlweb-values.yaml \
  --wait --timeout 10m

4. Verify Installation:
# Check pod status
kubectl get pods -n nlweb

# Check service status
kubectl get svc -n nlweb

# Check ingress
kubectl get ingress -n nlweb

# View logs
kubectl logs -f deployment/nlweb -n nlweb
Security Considerations and Best Practices
Security is a critical aspect of any production NLWeb deployment. This section outlines key security considerations and best practices to protect your NLWeb deployment and the sensitive data it processes.
API Key Management
NLWeb handles multiple API keys for various AI providers. Best practices include:
- Using Kubernetes Secrets for sensitive data
- Implementing secret rotation policies
- Leveraging Azure Key Vault integration
- Monitoring API key usage and costs
Network Security
networkPolicies:
  enabled: true
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: ingress-nginx
  egress:
    - to: []
      ports:
        - protocol: TCP
          port: 443  # HTTPS to AI providers

Pod Security Standards
The deployment implements Pod Security Standards at the restricted level:
securityContext:
  capabilities:
    drop:
      - ALL
  readOnlyRootFilesystem: true
  runAsNonRoot: true
  runAsUser: 999
  allowPrivilegeEscalation: false

Conclusion
NLWeb's deployment in Kubernetes using GitOps methodologies represents the convergence of several technological trends: AI-powered applications, cloud-native infrastructure, and modern DevOps practices. The combination provides organizations with a robust, scalable, and maintainable platform for building next-generation web applications.
The integration with FluxCD ensures that deployments remain consistent and auditable, while the comprehensive Helm charts eliminate much of the complexity traditionally associated with Kubernetes deployments. Azure's AI services provide enterprise-grade capabilities, and the multi-provider LLM support ensures flexibility and cost optimization.
As organizations continue to embrace AI-powered applications, the patterns and practices outlined in this guide will become increasingly valuable. The GitOps approach to NLWeb deployment not only simplifies operations but also provides the foundation for scaling AI applications across enterprise environments.
The future of web applications is undoubtedly AI-powered, and NLWeb's Kubernetes GitOps deployment model provides a clear path forward for organizations ready to embrace this transformation.
Frequently Asked Questions
What sets NLWeb Deployment in Kubernetes GitOps Style apart from traditional deployment methods?
NLWeb's GitOps deployment offers automated configuration management, version-controlled infrastructure, and seamless rollback capabilities. Unlike traditional deployments, it provides declarative configuration management through Git repositories, ensuring consistency across environments and eliminating configuration drift. The integration with FluxCD enables continuous deployment with minimal manual intervention.
How does NLWeb handle multiple LLM providers in a Kubernetes environment?
NLWeb's architecture supports multiple LLM providers through a unified configuration system. The platform can simultaneously connect to OpenAI, Anthropic, Azure OpenAI, Gemini, Snowflake, and Hugging Face models. This multi-provider approach is managed through environment variables and ConfigMaps, allowing for easy switching between providers based on cost, performance, or availability requirements.
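A sketch of how such provider selection might be wired through a ConfigMap is shown below. The key names are hypothetical, since NLWeb's actual configuration schema may differ; consult the project's configuration reference for the real variables:

```yaml
# Hypothetical ConfigMap for provider selection; key names are illustrative.
apiVersion: v1
kind: ConfigMap
metadata:
  name: nlweb-llm-config
  namespace: nlweb
data:
  LLM_PROVIDER: "azure_openai"
  LLM_FALLBACK_PROVIDER: "anthropic"
  EMBEDDING_PROVIDER: "openai"
```

Because the ConfigMap is declarative, switching providers becomes a one-line Git commit that FluxCD reconciles automatically, with full history of when and why the change was made.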
What are the resource requirements for running NLWeb in production?
A typical production NLWeb deployment requires a minimum of 200m CPU and 1Gi memory per pod, with recommended limits of 1000m CPU and 2Gi memory. For high-traffic scenarios, horizontal pod autoscaling can scale from 2 to 8 replicas based on CPU utilization, matching the autoscaling values shown earlier. Storage requirements vary based on caching configuration and data persistence needs, typically starting at 10Gi for persistent volumes.
How does the GitOps approach improve security for NLWeb deployments?
GitOps enhances security through immutable infrastructure, audit trails, and declarative configuration management. All changes are tracked in Git, providing complete visibility into who made what changes and when. The approach eliminates direct cluster access for deployments, reducing the attack surface. Additionally, secrets management is handled through Kubernetes native resources and can be integrated with external secret management systems.
Can NLWeb be deployed across multiple cloud providers?
Yes, NLWeb's cloud-agnostic design allows deployment across multiple cloud providers. While it has deep Azure integration, the Kubernetes-native architecture supports deployment on AWS EKS, Google GKE, or on-premises clusters. The Helm charts abstract cloud-specific configurations, making multi-cloud deployments straightforward.
What monitoring and observability tools work best with NLWeb?
NLWeb integrates well with the Kubernetes ecosystem's monitoring tools including Prometheus for metrics collection, Grafana for visualization, and Jaeger for distributed tracing. The application exposes health check endpoints and custom metrics for AI query performance, cache hit rates, and LLM provider response times. Integration with cloud-native monitoring solutions like Azure Monitor or AWS CloudWatch is also supported.
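On clusters running the Prometheus Operator, scraping can be declared with a ServiceMonitor like the sketch below. The label selector, service port name, and metrics path are assumptions about the chart's conventions, not documented NLWeb values:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: nlweb
  namespace: nlweb
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: nlweb  # assumed chart label
  endpoints:
    - port: http       # assumed Service port name
      path: /metrics   # assumed metrics path
      interval: 30s
```

Keeping this manifest in the same Git repository as the Helm values means monitoring is versioned and reconciled alongside the application itself.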
How does NLWeb handle data privacy and compliance requirements?
NLWeb implements several privacy and compliance features including data encryption in transit and at rest, configurable data retention policies, and audit logging. The platform supports role-based access control and can be configured to meet various compliance standards including GDPR, HIPAA, and SOC 2. Integration with enterprise identity providers ensures proper authentication and authorization.
What's the recommended approach for testing NLWeb deployments?
The recommended testing approach includes unit tests for individual components, integration tests for AI provider connectivity, and end-to-end tests for complete user workflows. The GitOps deployment model supports multiple environments (development, staging, production) with environment-specific configurations. Automated testing can be integrated into the CI/CD pipeline to validate deployments before they reach production.
How does NLWeb's caching system improve performance and reduce costs?
NLWeb implements intelligent caching that considers multiple factors including query schema, AI provider, and model type when generating cache keys. This approach significantly reduces API calls to expensive LLM providers while maintaining response accuracy. The cache can be configured with custom TTL values and size limits, and supports both in-memory and persistent storage options.
What are the backup and disaster recovery options for NLWeb?
NLWeb supports comprehensive backup strategies including persistent volume snapshots, configuration backup through Git repositories, and database backups for vector stores. The GitOps approach inherently provides configuration recovery through Git history. For disaster recovery, the platform supports cross-region deployments and can be quickly restored in different availability zones or cloud regions using the same Helm charts and configuration.
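For persistent-volume snapshots specifically, clusters with a CSI snapshot controller can use the standard VolumeSnapshot API. In the sketch below, the snapshot class and PVC name are assumptions about the chart's naming, shown for illustration only:

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: nlweb-data-snapshot
  namespace: nlweb
spec:
  volumeSnapshotClassName: csi-snapclass   # assumed snapshot class
  source:
    persistentVolumeClaimName: nlweb-data  # assumed PVC name
```

Combined with the Git history that GitOps already provides for configuration, volume snapshots cover the remaining stateful pieces, so a full recovery is a `helm install` from the repository plus a restore from the latest snapshot.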