The landscape of AI-powered web applications is evolving rapidly, and at the forefront of this revolution stands NLWeb — Microsoft’s groundbreaking open-source protocol that transforms traditional websites into intelligent, AI-driven knowledge hubs. When combined with the Kubernetes container orchestration platform and GitOps methodologies, NLWeb creates a production-ready ecosystem that’s both scalable and maintainable. This comprehensive guide explores how to deploy NLWeb using modern DevOps practices, leveraging FluxCD for continuous deployment and Azure’s robust cloud infrastructure for Kubernetes.

- What Makes NLWeb Revolutionary in the AI Web Space?
- Understanding the GitOps Advantage for NLWeb Deployments
- Technical Architecture: NLWeb on Kubernetes
- FluxCD Integration: Continuous Deployment Made Simple
- Azure Integration: Cloud-Native AI Infrastructure
- Production-Ready Features and Best Practices
- Deployment Comparison: NLWeb vs Traditional Approaches
- Advanced Configuration Examples
- Security Considerations and Best Practices
- Conclusion
- Frequently Asked Questions
What Makes NLWeb Revolutionary in the AI Web Space?
NLWeb represents a paradigm shift in how we think about web applications. Unlike traditional static websites or even dynamic web applications, NLWeb enables AI-powered websites that can understand, process, and respond to user queries with unprecedented intelligence. The platform seamlessly integrates with vector databases, multiple LLM providers, and enterprise data sources to create truly interactive web experiences.
The protocol’s architecture is designed with modern cloud-native principles and CNCF best practices in mind. It supports multiple embedding providers including OpenAI, Azure OpenAI, Gemini, and Snowflake, while offering flexible LLM integration with providers ranging from Anthropic’s Claude AI assistant to Hugging Face models. This multi-provider approach ensures resilience and allows organizations to optimize costs while maintaining performance.
Key Points:
- Intelligent Interactions: Enables natural language understanding and contextual responses
- Multi-Provider Support: Integrates with various AI providers for flexibility and redundancy
- Not Yet Enterprise-Ready: Designed with production deployments and ease of use in mind, the project is currently at an early stage. We contribute bug fixes and enhancements to help it mature.
Understanding the GitOps Advantage for NLWeb Deployments
GitOps, a declarative methodology for managing infrastructure, has emerged as the gold standard for Kubernetes deployments, and NLWeb’s architecture aligns perfectly with this approach. By treating Git repositories as the single source of truth for infrastructure and application configurations, teams can achieve unprecedented levels of automation, auditability, and reliability.
The iunera helm charts repository provides production-ready Helm charts specifically designed for NLWeb deployments. These charts encapsulate years of operational experience and best practices, making it straightforward to deploy NLWeb in any Kubernetes environment while maintaining consistency across development, staging, and production environments. If you are interested in a general-purpose Helm chart for basically any kind of simple deployment, the Spring Boot chart is worth a look.
FluxCD serves as the GitOps operator, continuously monitoring the Git repository for changes and automatically applying them to the Kubernetes cluster. This approach eliminates configuration drift, reduces manual intervention, and provides a complete audit trail of all changes made to the system.
GitOps Benefits for NLWeb:
- Declarative Infrastructure: Everything defined as code in Git repositories
- Automated Deployments: Changes automatically applied when committed to Git
- Version Control: Complete history of all configuration changes
- Rollback Capability: Easy reversion to previous known-good states
- Consistency: Same deployment process across all environments
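Conceptually, a GitOps operator like FluxCD runs a reconciliation loop: it reads the desired state from Git, compares it with the actual cluster state, and applies whatever has drifted. The following Python sketch illustrates that loop in miniature; it is a deliberate simplification with hypothetical function names, not FluxCD's actual implementation.

```python
# Illustrative sketch of a GitOps reconciliation loop (not FluxCD's real code).
# "Desired state" comes from Git; "actual state" comes from the cluster.

def diff_states(desired: dict, actual: dict) -> dict:
    """Return the keys whose desired value differs from the actual value."""
    return {k: v for k, v in desired.items() if actual.get(k) != v}

def reconcile(desired: dict, actual: dict) -> dict:
    """Apply every drifted key so the cluster converges on the Git state."""
    for key, value in diff_states(desired, actual).items():
        actual[key] = value  # in a real operator: patch the Kubernetes object
    return actual

# Example: Git says 3 replicas and image 1.2.4; the cluster has drifted.
desired = {"replicas": 3, "image": "iunera/nlweb:1.2.4"}
actual = {"replicas": 2, "image": "iunera/nlweb:1.2.3"}
print(reconcile(desired, actual))
```

Because the loop only ever moves the cluster toward what Git declares, manual changes made directly on the cluster are overwritten on the next reconciliation, which is exactly how configuration drift is eliminated.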
Now, let’s explore the technical architecture of NLWeb on Kubernetes to understand how these components work together.
A different use case we have implemented is the production-grade deployment of Apache Druid using the Druid Operator and FluxCD.
Technical Architecture: NLWeb on Kubernetes
Core Components and Configuration
NLWeb’s Kubernetes deployment consists of several key components that work together to deliver AI-powered web experiences:
Application Layer: The core NLWeb application runs as a Python-based service, typically deployed using the iunera/nlweb Docker image. The application serves on port 8000 and includes comprehensive health checks for both liveness and readiness probes.
Configuration Management: NLWeb uses a sophisticated configuration system with multiple YAML files:
- config_webserver.yaml: Handles server settings, CORS policies, SSL configuration, and static file serving
- config_llm.yaml: Manages LLM provider configurations and model selections
- config_embedding.yaml: Controls embedding provider settings and model preferences
- config_llm_performance.yaml: Optimizes performance through caching and response management
Security Context: The deployment implements Kubernetes pod security standards and best practices including:
- Non-root user execution (UID 999)
- Read-only root filesystem
- Dropped capabilities
- Security contexts for both pod and container levels
This architecture provides a secure, scalable foundation for deploying NLWeb in production environments.
Helm Chart Structure and Values
The NLWeb Helm chart provides extensive customization options through its values.yaml configuration:
```yaml
replicaCount: 1
image:
  repository: iunera/nlweb
  pullPolicy: IfNotPresent
service:
  type: ClusterIP
  port: 8000
env:
  - name: PYTHONPATH
    value: "/app"
  - name: PORT
    value: "8000"
  - name: NLWEB_LOGGING_PROFILE
    value: production
```
The chart supports advanced features including:
- Autoscaling: Horizontal Pod Autoscaler configuration with CPU-based scaling
- Ingress: NGINX ingress controller integration with SSL/TLS termination
- Volumes: Persistent volume claims, ConfigMaps, and EmptyDir volumes
- ConfigMaps: Supply NLWeb configuration such as LLM and vector endpoint settings
- Security: Pod security contexts and network policies
FluxCD Integration: Continuous Deployment Made Simple
FluxCD, a continuous delivery tool for Kubernetes, is a critical component in the GitOps deployment strategy for NLWeb. It connects your Git repository to your Kubernetes cluster, ensuring that any changes to your deployment manifests are automatically applied.
HelmRelease Controller
The GitOps deployment of NLWeb leverages FluxCD’s HelmRelease custom resource to manage the application lifecycle. Here’s how the integration works:
```yaml
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: nlweb
  namespace: nlweb
spec:
  releaseName: nlweb
  targetNamespace: nlweb
  chart:
    spec:
      chart: nlweb
      version: ">=1.1.0"
      sourceRef:
        kind: HelmRepository
        name: iunera-helm-charts
        namespace: helmrepos
  interval: 1m0s
```
This configuration ensures that FluxCD continuously monitors the Helm repository for updates and automatically applies them to the cluster. The `interval: 1m0s` setting means FluxCD checks for changes every minute, providing near real-time deployment capabilities.
Image Automation and Version Management
FluxCD’s image automation capabilities work seamlessly with NLWeb deployments. The system can automatically detect new container image versions and update the deployment manifests accordingly. This is particularly valuable for maintaining up-to-date deployments while ensuring proper testing and validation workflows.
Image Policy Configuration
NLWeb deployments leverage FluxCD’s image automation controllers to automatically update container images when new versions are published. This is configured through special annotations in the HelmRelease manifest:
```yaml
image:
  repository: iunera/nlweb # {"$imagepolicy": "flux-system:nlweb:name"}
  tag: 1.2.4 # {"$imagepolicy": "flux-system:nlweb:tag"}
```
These annotations tell FluxCD to automatically update the image repository and tag values based on the image policy defined in the `nlweb.imagerepo.yaml` file. When a new image version is detected that matches the policy criteria, FluxCD automatically updates the manifest and commits the changes to the Git repository.
Image Repository and Policy Configuration
The image automation is configured through two key resources defined in the `nlweb.imagerepo.yaml` file:
```yaml
# ImageRepository defines the Docker image repository to monitor
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageRepository
metadata:
  name: nlweb
  namespace: flux-system
spec:
  image: iunera/nlweb
  interval: 10m
  secretRef:
    name: iunera
---
# ImagePolicy defines which image versions to select
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImagePolicy
metadata:
  name: nlweb
  namespace: flux-system
spec:
  imageRepositoryRef:
    name: nlweb
  policy:
    semver:
      range: ">=1.0.0"
```
The ImageRepository resource specifies:
- The Docker image to monitor (iunera/nlweb)
- How often to check for new versions (interval: 10m)
- Authentication credentials for the Docker registry (secretRef: name: iunera)
The ImagePolicy resource defines the selection criteria for image versions using semantic versioning, in this case selecting any version greater than or equal to 1.0.0.
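To illustrate how a semver range such as `>=1.0.0` picks a tag, here is a small Python sketch of the selection logic. FluxCD's controller is written in Go and handles full semver semantics (pre-releases, build metadata); this sketch only mirrors the basic idea and all function names are hypothetical.

```python
# Hypothetical sketch of semver-range tag selection, mirroring what an
# ImagePolicy with `semver: range: ">=1.0.0"` does. Pre-release handling
# and full semver semantics are omitted for brevity.

def parse(tag: str) -> tuple:
    """Turn '1.2.4' into (1, 2, 4) for numeric comparison."""
    return tuple(int(part) for part in tag.split("."))

def select_latest(tags: list, minimum: str = "1.0.0") -> str:
    """Pick the highest tag that satisfies the >= minimum constraint."""
    eligible = [t for t in tags if parse(t) >= parse(minimum)]
    return max(eligible, key=parse)

print(select_latest(["0.9.1", "1.0.0", "1.2.3", "1.2.4"]))  # -> 1.2.4
```

Note that `0.9.1` is filtered out by the range before the newest eligible tag is chosen, which is why pinning the minimum version prevents accidental rollbacks to pre-1.0 images.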
Automation Workflow
The complete automation workflow is managed by the ImageUpdateAutomation resource:
```yaml
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageUpdateAutomation
metadata:
  name: flux-system
  namespace: flux-system
spec:
  git:
    checkout:
      ref:
        branch: master
    commit:
      author:
        email: [email protected]
        name: fluxcdbot
      messageTemplate: |
        Automated image update

        Automation name: {{ .AutomationObject }}

        Files:
        {{ range $filename, $_ := .Changed.FileChanges -}}
        - {{ $filename }}
        {{ end -}}

        Objects:
        {{ range $resource, $changes := .Changed.Objects -}}
        - {{ $resource.Kind }} {{ $resource.Name }}
          Changes:
        {{- range $_, $change := $changes }}
          - {{ $change.OldValue }} -> {{ $change.NewValue }}
        {{ end -}}
        {{ end -}}
    push:
      branch: master
  interval: 30m0s
  sourceRef:
    kind: GitRepository
    name: flux-system
  update:
    path: ./kubernetes/common
    strategy: Setters
```
This resource:
- Checks out the Git repository’s master branch
- Configures commit details with a template that includes what was changed
- Pushes changes back to the master branch
- Runs every 30 minutes
- Updates files in the ./kubernetes/common path using the “Setters” strategy (looking for image policy annotations)
With this configuration, the NLWeb deployment automatically stays up-to-date with the latest compatible container images without manual intervention, while maintaining a complete audit trail of all changes through Git history.
Docker Build and CI/CD Pipeline
The NLWeb Docker image build and deployment process follows a comprehensive CI/CD pipeline that integrates with the FluxCD GitOps workflow:
Dockerfile Structure and Multi-Stage Build
The NLWeb Dockerfile uses a multi-stage Docker build to create an efficient, optimized, and secure deployment package:
```dockerfile
# Stage 1: Build stage
FROM python:3.13-slim AS builder

# Install build dependencies
RUN apt-get update && \
    apt-get install -y --no-install-recommends gcc python3-dev && \
    pip install --no-cache-dir --upgrade pip && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Copy requirements file
COPY code/requirements.txt .

# Install Python packages
RUN pip install --no-cache-dir -r requirements.txt

# Copy requirements file
COPY docker_requirements.txt .

# Install Python packages
RUN pip install --no-cache-dir -r docker_requirements.txt

# Stage 2: Runtime stage
FROM python:3.13-slim

# Apply security updates
RUN apt-get update && \
    apt-get install -y --no-install-recommends --only-upgrade \
        $(apt-get --just-print upgrade | grep "^Inst" | grep -i securi | awk '{print $2}') && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Create a non-root user and set permissions
RUN groupadd -r nlweb && \
    useradd -r -g nlweb -d /app -s /bin/bash nlweb && \
    chown -R nlweb:nlweb /app

USER nlweb

# Copy application code
COPY code/ /app/
COPY static/ /app/static/

# Copy installed packages from builder stage
COPY --from=builder /usr/local/lib/python3.13/site-packages /usr/local/lib/python3.13/site-packages
COPY --from=builder /usr/local/bin /usr/local/bin

# Expose the port the app runs on
EXPOSE 8000

# Set environment variables
ENV NLWEB_OUTPUT_DIR=/app
ENV PYTHONPATH=/app
ENV PORT=8000
ENV VERSION=1.2.4

# Command to run the application
CMD ["python", "app-file.py"]
```
Key aspects of the Dockerfile:
- Stage 1 (Builder): Installs all dependencies and build tools
- Stage 2 (Runtime): Creates a minimal runtime environment
- Security Features: Non-root user, security updates, minimal dependencies
- Version Definition: ENV VERSION=1.2.4 defines the version that will be used for tagging
GitHub Actions Workflow
When changes are pushed to the iuneracustomizations branch and the Dockerfile is modified, the GitHub Actions CI/CD workflow in .github/workflows/prod-build.yml is triggered:
```yaml
name: prod-build

on:
  push:
    branches:
      - iuneracustomizations
    paths:
      - Dockerfile

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Log in to Private Registry
        uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3

      - name: Extract Version from Dockerfile
        id: extract_version
        run: |
          # Extract the VERSION from Dockerfile
          VERSION=$(grep "ENV VERSION=" Dockerfile | cut -d= -f2)
          echo "VERSION=${VERSION}" >> $GITHUB_ENV
          echo "Using version from Dockerfile: ${VERSION}"

      - name: Build the Docker image
        run: |
          docker build -t iunera/nlweb:latest -t iunera/nlweb:${{ env.VERSION }} .
          docker push iunera/nlweb:latest
          docker push iunera/nlweb:${{ env.VERSION }}
          echo "Built and pushed Docker image with tags: latest, ${{ env.VERSION }}"

      - name: Inspect
        run: |
          docker image inspect iunera/nlweb:latest

      - name: Create and Push Git Tag
        run: |
          git config --global user.name "GitHub Actions"
          git config --global user.email "[email protected]"
          git tag -a v${{ env.VERSION }} -m "Release version ${{ env.VERSION }}"
          git push origin v${{ env.VERSION }}
```
The workflow performs these steps:
- Checkout Repository: Clones the repository to the GitHub Actions runner
- Set up Docker Buildx: Configures Docker with multi-architecture build support
- Log in to Docker Hub: Authenticates with Docker Hub using repository secrets
- Set up QEMU: Enables building for multiple architectures (ARM64, AMD64)
- Extract Version: Parses the Dockerfile to extract the VERSION environment variable
- Build and Push: Builds the Docker image with two tags (latest and the version number) and pushes both to Docker Hub
- Inspect: Displays information about the built image for verification
- Create Git Tag: Creates a Git tag for the version and pushes it to the repository
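The version-extraction step above relies on finding the `ENV VERSION=` line in the Dockerfile. The same parsing can be reproduced outside the workflow; the Python sketch below mirrors the grep/cut pipeline (the sample Dockerfile text here is illustrative):

```python
import re

def extract_version(dockerfile_text: str) -> str:
    """Find the `ENV VERSION=<x.y.z>` line, as the workflow's grep/cut does."""
    match = re.search(r"^ENV VERSION=(\S+)", dockerfile_text, re.MULTILINE)
    if not match:
        raise ValueError("no ENV VERSION= line found in Dockerfile")
    return match.group(1)

sample = "FROM python:3.13-slim\nENV PORT=8000\nENV VERSION=1.2.4\n"
print(extract_version(sample))  # -> 1.2.4
```

Keeping the version in a single `ENV VERSION=` line means the Dockerfile remains the one place where the release number is defined, and both the image tags and the Git tag are derived from it.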
Complete CI/CD to Deployment Flow
The complete flow from Dockerfile to deployment involves:
- Development: A developer updates the Dockerfile, potentially changing the VERSION
- CI/CD: GitHub Actions builds and pushes the Docker image to Docker Hub
- Automation: FluxCD detects the new image version in Docker Hub
- GitOps: FluxCD updates the Kubernetes manifests with the new image version and commits the changes back to the Git repository
- Deployment: FluxCD applies the changes to the Kubernetes cluster, creating new pods with the updated image
This GitOps approach ensures that:
- The Git repository is the single source of truth
- All changes are tracked and auditable
- Deployments are automated and consistent
- Rollbacks are simple and reliable
Local Development Environment
While the GitHub Actions workflow handles production builds, local development uses Docker Compose:
```yaml
services:
  nlweb:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: nlweb
    ports:
      - "8000:8000"
    env_file:
      - ./code/.env
    environment:
      - PYTHONPATH=/app
      - PORT=8000
    volumes:
      - ./data:/data
      - ./code/config:/app/config:ro
    healthcheck:
      test: ["CMD-SHELL", "python -c \"import urllib.request; urllib.request.urlopen('http://localhost:8000')\""]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s
    restart: unless-stopped
    user: nlweb
```
This setup:
- Uses the same Dockerfile as production
- Mounts local directories for data and configuration
- Loads environment variables from a local .env file
- Includes healthchecks for monitoring
- Runs as the non-root nlweb user
The combination of GitHub Actions for CI/CD and FluxCD for GitOps creates a robust and automated pipeline for building and deploying NLWeb, ensuring consistency between development and production environments.
Azure Integration: Cloud-Native AI Infrastructure
NLWeb’s integration with Azure services makes it an ideal choice for organizations already invested in Microsoft’s cloud ecosystem. The platform natively supports:
Azure Cognitive Search: For vector search capabilities, NLWeb integrates with Azure’s vector search service, providing scalable and performant similarity search across large datasets.
Azure OpenAI Service: Direct integration with Azure’s OpenAI offerings, including GPT-4 and embedding models, ensures enterprise-grade AI capabilities with proper governance and compliance.
Azure Container Registry: Seamless integration with ACR for container image management and security scanning.
The configuration for Azure services is handled through environment variables and ConfigMaps, making it easy to manage different environments and maintain security best practices:
```yaml
env:
  - name: AZURE_VECTOR_SEARCH_ENDPOINT
    value: "https://your-vector-search-db.search.windows.net"
  - name: AZURE_OPENAI_ENDPOINT
    value: "https://your-openai-instance.openai.azure.com/"
```
Production-Ready Features and Best Practices
Multi-Provider LLM Support
One of NLWeb’s standout features is its support for multiple LLM providers, ensuring vendor independence and cost optimization. The platform supports:
- OpenAI: GPT-4.1 and GPT-4.1-mini models
- Anthropic: Claude-3-7-sonnet-latest and Claude-3-5-haiku-latest
- Azure OpenAI: Enterprise-grade OpenAI models with Azure’s security and compliance
- Google Gemini: chat-bison models for diverse AI capabilities
- Snowflake: Arctic embedding models and Claude integration
- Hugging Face: Open-source models including Qwen2.5 series
This multi-provider approach allows organizations to:
- Optimize costs by using different models for different use cases
- Ensure service availability through provider redundancy
- Experiment with cutting-edge models without vendor lock-in
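The provider-redundancy idea above can be sketched as a simple fallback dispatcher: try the preferred provider, and on failure fall through to the next one. This is an illustrative Python sketch with hypothetical names, not NLWeb's actual dispatch code.

```python
# Illustrative sketch of multi-provider fallback (all names hypothetical;
# NLWeb's actual provider dispatch logic may differ).

class ProviderError(Exception):
    pass

def ask_with_fallback(prompt: str, providers: list) -> str:
    """Try each (name, call) pair in preference order; fall through on failure."""
    errors = []
    for name, call in providers:
        try:
            return call(prompt)
        except ProviderError as exc:
            errors.append(f"{name}: {exc}")  # record the failure, try the next
    raise ProviderError("all providers failed: " + "; ".join(errors))

def azure_down(prompt):
    raise ProviderError("rate limited")

def openai_ok(prompt):
    return f"answer to: {prompt}"

providers = [("azure_openai", azure_down), ("openai", openai_ok)]
print(ask_with_fallback("hello", providers))  # -> answer to: hello
```

In a production setting the provider list would come from configuration (compare the `preferred_endpoint` and `fallback_strategy` keys in the LLM ConfigMap examples later in this guide), so reordering providers for cost or availability reasons requires no code change.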
Performance Optimization and Caching
Iunera’s customizations of NLWeb implement sophisticated caching mechanisms to optimize performance and reduce API costs:
```yaml
cache:
  enable: true
  max_size: 1000
  ttl: 0  # No expiration
  include_schema: true
  include_provider: true
  include_model: true
```
The caching system considers multiple factors including schema, provider, and model when generating cache keys, ensuring accurate cache hits while maintaining response quality.
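To make the key-derivation idea concrete, here is a hypothetical Python sketch of a cache key that folds in schema, provider, and model alongside the query, so responses from different providers or models never collide. NLWeb's real key derivation is not shown in this guide and may differ.

```python
import hashlib
import json

# Hypothetical sketch: a cache key that includes schema, provider, and
# model, matching the include_* flags in the cache config above.

def cache_key(query: str, schema: str, provider: str, model: str) -> str:
    payload = json.dumps(
        {"query": query, "schema": schema, "provider": provider, "model": model},
        sort_keys=True,  # stable ordering so equal inputs hash identically
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

k1 = cache_key("best pizza?", "Recipe", "azure_openai", "gpt-4o")
k2 = cache_key("best pizza?", "Recipe", "openai", "gpt-4o")
print(k1 != k2)  # different provider -> different cache entry
```

Because the provider and model are part of the key, switching providers never serves a stale answer generated by a different model, at the cost of a cold cache per provider/model pair.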
Enterprise Data Integration
Building on the foundation laid out in the comprehensive guide to exposing enterprise data with Java and Spring for AI indexing, NLWeb provides seamless integration with enterprise data sources. The platform supports:
- JSON-LD and Schema.org: Structured data integration for semantic web capabilities
- Vector Database Integration: Support for various vector databases including Azure Cognitive Search
- Real-time Data Processing: Stream processing capabilities for dynamic content updates
- Enterprise Security: Role-based access control and data governance features
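As a minimal illustration of the JSON-LD and Schema.org integration mentioned above, the following Python sketch serializes a Schema.org `Article` as a JSON-LD document. The markup shape follows Schema.org conventions; the specific fields are examples, not NLWeb requirements.

```python
import json

def jsonld_article(headline: str, author: str) -> str:
    """Serialize a minimal Schema.org Article as a JSON-LD string."""
    doc = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Organization", "name": author},
    }
    return json.dumps(doc, indent=2)

print(jsonld_article("Deploying NLWeb with GitOps", "iunera"))
```

Embedding such a block in a page (typically inside a `<script type="application/ld+json">` tag) is what gives structured-data-aware systems like NLWeb a machine-readable view of the content.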
Deployment Comparison: NLWeb vs Traditional Approaches
Feature | NLWeb GitOps | Azure Web Apps | Traditional Linux Install |
---|---|---|---|
Scalability | Auto-scaling with HPA | Limited vertical scaling | Manual scaling required |
Deployment Speed | Automated via GitOps | Manual deployment | Manual configuration |
Configuration Management | Git-based versioning | Portal-based settings | File-based configuration |
Multi-environment Support | Native Kubernetes namespaces | Separate app instances | Separate servers |
Rollback Capabilities | Git-based rollbacks | Limited rollback options | Manual rollback process |
Cost Optimization | Resource-based pricing | App Service Plan pricing | Infrastructure costs |
Monitoring & Observability | Kubernetes-native tools | Azure Monitor integration | Custom monitoring setup |
Security | Pod security contexts | Azure security features | Manual security hardening |
The iunera helm charts provide a significant advantage in this comparison, offering production-tested configurations that eliminate common deployment pitfalls.
Advanced Configuration Examples
This section provides practical, production-ready configuration examples for deploying NLWeb in various environments. These examples can be used as templates for your own deployments, with customization as needed for your specific requirements.
Note: The following examples are organized by use case to help you find the most relevant configurations for your needs.
Complete Helm Installation Manifest Examples
Basic Development Setup
For development environments, here’s a minimal helm installation manifest:
```yaml
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: nlweb-dev
  namespace: nlweb-dev
spec:
  releaseName: nlweb-dev
  targetNamespace: nlweb-dev
  chart:
    spec:
      chart: nlweb
      version: ">=1.1.0"
      sourceRef:
        kind: HelmRepository
        name: iunera-helm-charts
        namespace: helmrepos
  interval: 5m0s
  install:
    createNamespace: true
  values:
    replicaCount: 1
    image:
      repository: iunera/nlweb
      tag: "latest"
      pullPolicy: Always
    env:
      - name: NLWEB_LOGGING_PROFILE
        value: development
      - name: OPENAI_API_KEY
        valueFrom:
          secretKeyRef:
            name: nlweb-secrets
            key: openai-api-key
    ingress:
      enabled: true
      annotations:
        kubernetes.io/ingress.class: nginx
      hosts:
        - host: nlweb-dev.local
          paths:
            - path: /
              pathType: ImplementationSpecific
    resources:
      requests:
        cpu: 100m
        memory: 512Mi
      limits:
        cpu: 500m
        memory: 1Gi
```
Production-Ready Setup with Multi-Provider LLM Support
For production environments with comprehensive AI provider integration:
```yaml
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: nlweb-prod
  namespace: nlweb
spec:
  releaseName: nlweb
  targetNamespace: nlweb
  chart:
    spec:
      chart: nlweb
      version: ">=1.1.0"
      sourceRef:
        kind: HelmRepository
        name: iunera-helm-charts
        namespace: helmrepos
  interval: 1m0s
  install:
    createNamespace: false
  upgrade:
    remediation:
      retries: 3
  values:
    replicaCount: 3
    image:
      repository: iunera/nlweb
      tag: "1.2.4"
      pullPolicy: IfNotPresent
    env:
      - name: NLWEB_LOGGING_PROFILE
        value: production
      - name: AZURE_VECTOR_SEARCH_ENDPOINT
        value: "https://nlweb-prod.search.windows.net"
      - name: AZURE_VECTOR_SEARCH_API_KEY
        valueFrom:
          secretKeyRef:
            name: nlweb-azure-secrets
            key: vector-search-key
      - name: OPENAI_API_KEY
        valueFrom:
          secretKeyRef:
            name: nlweb-openai-secrets
            key: api-key
      - name: ANTHROPIC_API_KEY
        valueFrom:
          secretKeyRef:
            name: nlweb-anthropic-secrets
            key: api-key
      - name: AZURE_OPENAI_API_KEY
        valueFrom:
          secretKeyRef:
            name: nlweb-azure-openai-secrets
            key: api-key
    ingress:
      enabled: true
      annotations:
        kubernetes.io/ingress.class: nginx
        kubernetes.io/tls-acme: "true"
        cert-manager.io/cluster-issuer: letsencrypt-prod
        nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
        nginx.ingress.kubernetes.io/enable-modsecurity: "true"
        nginx.ingress.kubernetes.io/enable-owasp-core-rules: "true"
        nginx.ingress.kubernetes.io/rate-limit: "100"
        nginx.ingress.kubernetes.io/rate-limit-window: "1m"
      hosts:
        - host: nlweb.example.com
          paths:
            - path: /
              pathType: ImplementationSpecific
      tls:
        - secretName: nlweb-tls
          hosts:
            - nlweb.example.com
    resources:
      requests:
        cpu: 200m
        memory: 1Gi
      limits:
        cpu: 1000m
        memory: 2Gi
    autoscaling:
      enabled: true
      minReplicas: 3
      maxReplicas: 10
      targetCPUUtilizationPercentage: 70
      targetMemoryUtilizationPercentage: 80
```
Comprehensive ConfigMap Customization Examples
Web Server Configuration for Different Environments
Development Environment ConfigMap:
```yaml
volumes:
  configMaps:
    - name: nlweb-dev-config
      mountPath: /app/config
      data:
        config_webserver.yaml: |-
          port: 8000
          static_directory: ../../
          mode: development
          server:
            host: 0.0.0.0
            enable_cors: true
            cors_trusted_origins: "*"  # Allow all origins in dev
            max_connections: 50
            timeout: 60
          logging:
            level: debug
            file: ./logs/webserver.log
            console: true
          static:
            enable_cache: false  # Disable caching in dev
            gzip_enabled: false
```
Production Environment ConfigMap:
```yaml
volumes:
  configMaps:
    - name: nlweb-prod-config
      mountPath: /app/config
      data:
        config_webserver.yaml: |-
          port: 8000
          static_directory: ../../
          mode: production
          server:
            host: 0.0.0.0
            enable_cors: true
            cors_trusted_origins:
              - https://nlweb.example.com
              - https://api.example.com
              - https://admin.example.com
            max_connections: 200
            timeout: 30
          ssl:
            enabled: true
            cert_file_env: SSL_CERT_FILE
            key_file_env: SSL_KEY_FILE
          logging:
            level: info
            file: ./logs/webserver.log
            console: false
            rotation:
              max_size: 100MB
              max_files: 10
          static:
            enable_cache: true
            cache_max_age: 86400  # 24 hours
            gzip_enabled: true
            compression_level: 6
```
Multi-Provider LLM Configuration
Enterprise LLM Setup with Fallback Providers:
```yaml
volumes:
  configMaps:
    - name: nlweb-llm-config
      mountPath: /app/config
      data:
        config_llm.yaml: |-
          preferred_endpoint: azure_openai
          fallback_strategy: round_robin
          endpoints:
            azure_openai:
              api_key_env: AZURE_OPENAI_API_KEY
              api_endpoint_env: AZURE_OPENAI_ENDPOINT
              api_version_env: "2024-12-01-preview"
              llm_type: azure_openai
              models:
                high: gpt-4o
                low: gpt-4o-mini
              rate_limits:
                requests_per_minute: 1000
                tokens_per_minute: 150000
              retry_config:
                max_retries: 3
                backoff_factor: 2
            openai:
              api_key_env: OPENAI_API_KEY
              api_endpoint_env: OPENAI_ENDPOINT
              llm_type: openai
              models:
                high: gpt-4-turbo
                low: gpt-3.5-turbo
              rate_limits:
                requests_per_minute: 500
                tokens_per_minute: 90000
            anthropic:
              api_key_env: ANTHROPIC_API_KEY
              llm_type: anthropic
              models:
                high: claude-3-opus-20240229
                low: claude-3-haiku-20240307
              rate_limits:
                requests_per_minute: 300
                tokens_per_minute: 60000
            gemini:
              api_key_env: GCP_PROJECT
              llm_type: gemini
              models:
                high: gemini-1.5-pro
                low: gemini-1.5-flash
              rate_limits:
                requests_per_minute: 200
                tokens_per_minute: 40000
```
Embedding Provider Configuration for Vector Search
Multi-Provider Embedding Setup:
```yaml
volumes:
  configMaps:
    - name: nlweb-embedding-config
      mountPath: /app/config
      data:
        config_embedding.yaml: |-
          preferred_provider: azure_openai
          fallback_providers:
            - openai
            - snowflake
          providers:
            azure_openai:
              api_key_env: AZURE_OPENAI_API_KEY
              api_endpoint_env: AZURE_OPENAI_ENDPOINT
              api_version_env: "2024-10-21"
              model: text-embedding-3-large
              dimensions: 3072
              batch_size: 100
              rate_limits:
                requests_per_minute: 1000
            openai:
              api_key_env: OPENAI_API_KEY
              api_endpoint_env: OPENAI_ENDPOINT
              model: text-embedding-3-large
              dimensions: 3072
              batch_size: 100
              rate_limits:
                requests_per_minute: 500
            snowflake:
              api_key_env: SNOWFLAKE_PAT
              api_endpoint_env: SNOWFLAKE_ACCOUNT_URL
              api_version_env: "2024-10-01"
              model: snowflake-arctic-embed-l
              dimensions: 1024
              batch_size: 50
              rate_limits:
                requests_per_minute: 200
            huggingface:
              api_key_env: HF_TOKEN
              model: sentence-transformers/all-mpnet-base-v2
              dimensions: 768
              local_inference: true
              device: cpu
```
Performance Optimization Configuration
High-Performance Caching Setup:
```yaml
volumes:
  configMaps:
    - name: nlweb-performance-config
      mountPath: /app/config
      data:
        config_llm_performance.yaml: |-
          # LLM Performance Settings
          representation:
            use_compact: true
            limit: 10
            include_metadata: true
          cache:
            enable: true
            max_size: 10000
            ttl: 3600  # 1 hour
            include_schema: true
            include_provider: true
            include_model: true
            include_user_context: false
            compression: gzip
          rate_limiting:
            enable: true
            requests_per_minute: 1000
            burst_size: 100
            per_user_limit: 50
          monitoring:
            enable_metrics: true
            metrics_port: 9090
            health_check_interval: 30
            performance_logging: true
```
Environment-Specific Volume Configurations
Development with Hot Reloading:
```yaml
volumes:
  enabled: true
  emptyDirs:
    - name: data
      mountPath: /app/data
    - name: logs
      mountPath: /app/logs
    - name: tmp
      mountPath: /tmp
    - name: cache
      mountPath: /app/cache
  # Development: Use hostPath for easy file access
  hostPaths:
    - name: dev-config
      hostPath: /local/dev/nlweb/config
      mountPath: /app/config
      type: DirectoryOrCreate
```
Production with Persistent Storage:
```yaml
volumes:
  enabled: true
  emptyDirs:
    - name: tmp
      mountPath: /tmp
      sizeLimit: 1Gi
  pvc:
    enabled: true
    storageClass: fast-ssd
    size: 50Gi
    accessMode: ReadWriteOnce
    mountPath: /app/data
  # Production: Use ConfigMaps for configuration
  configMaps:
    - name: nlweb-prod-config
      mountPath: /app/config
    - name: nlweb-llm-config
      mountPath: /app/config/llm
    - name: nlweb-embedding-config
      mountPath: /app/config/embedding
  # Production: Use Secrets for sensitive data
  existingSecrets:
    - name: nlweb-api-keys
      mountPath: /app/secrets
      defaultMode: 0400
```
Step-by-Step Helm Installation Guide
Prerequisites Setup
Before deploying NLWeb, ensure you have the following prerequisites:
1. Add the Iunera Helm Repository:
```shell
helm repo add iunera https://iunera.github.io/helm-charts/
helm repo update
```
2. Create Namespace and Secrets:
```shell
# Create namespace
kubectl create namespace nlweb

# Create secrets for API keys
kubectl create secret generic nlweb-openai-secrets \
  --from-literal=api-key="your-openai-api-key" \
  -n nlweb

kubectl create secret generic nlweb-azure-secrets \
  --from-literal=vector-search-key="your-azure-search-key" \
  --from-literal=openai-api-key="your-azure-openai-key" \
  -n nlweb
```
3. Install with Custom Values:
```shell
# Create custom values file
cat > nlweb-values.yaml << EOF
replicaCount: 2
image:
  repository: iunera/nlweb
  tag: "1.2.4"
env:
  - name: NLWEB_LOGGING_PROFILE
    value: production
  - name: OPENAI_API_KEY
    valueFrom:
      secretKeyRef:
        name: nlweb-openai-secrets
        key: api-key
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
    - host: nlweb.yourdomain.com
      paths:
        - path: /
          pathType: ImplementationSpecific
  tls:
    - secretName: nlweb-tls
      hosts:
        - nlweb.yourdomain.com
resources:
  requests:
    cpu: 200m
    memory: 1Gi
  limits:
    cpu: 1000m
    memory: 2Gi
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 8
  targetCPUUtilizationPercentage: 70
EOF

# Install NLWeb
helm install nlweb iunera/nlweb \
  --namespace nlweb \
  --values nlweb-values.yaml \
  --wait --timeout 10m
```
4. Verify Installation:
```shell
# Check pod status
kubectl get pods -n nlweb

# Check service status
kubectl get svc -n nlweb

# Check ingress
kubectl get ingress -n nlweb

# View logs
kubectl logs -f deployment/nlweb -n nlweb
```
Security Considerations and Best Practices
Security is a critical aspect of any production NLWeb deployment. This section outlines key security considerations and best practices to protect your NLWeb deployment and the sensitive data it processes.
API Key Management
NLWeb handles multiple API keys for various AI providers. Best practices include:
- Using Kubernetes Secrets for sensitive data
- Implementing secret rotation policies
- Leveraging Azure Key Vault integration
- Monitoring API key usage and costs
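In practice, Kubernetes Secrets surface to the application either as mounted files or as environment variables. The following Python sketch shows one common pattern, preferring the mounted file and falling back to an environment variable; the path and variable names are examples, not NLWeb's actual configuration.

```python
import os
import tempfile

# Hypothetical sketch: read an API key from a mounted Kubernetes Secret
# file, falling back to an environment variable.

def load_api_key(secret_path: str, env_var: str) -> str:
    if os.path.exists(secret_path):
        with open(secret_path) as handle:
            return handle.read().strip()  # Secrets mount as plain files
    value = os.environ.get(env_var)
    if value:
        return value
    raise RuntimeError(f"no API key at {secret_path} or ${env_var}")

# Demo: write a fake mounted secret to a temp file and read it back.
with tempfile.NamedTemporaryFile("w", suffix="-api-key", delete=False) as fh:
    fh.write("sk-demo-123\n")
print(load_api_key(fh.name, "OPENAI_API_KEY"))  # -> sk-demo-123
```

File-mounted secrets pair well with rotation: when the Secret object is updated, the mounted file is refreshed by the kubelet without restarting the pod, whereas environment variables are fixed at container start.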
Network Security
```yaml
networkPolicies:
  enabled: true
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: ingress-nginx
  egress:
    - to: []
      ports:
        - protocol: TCP
          port: 443  # HTTPS to AI providers
```
Pod Security Standards
The deployment implements Pod Security Standards at the restricted level:
```yaml
securityContext:
  capabilities:
    drop:
      - ALL
  readOnlyRootFilesystem: true
  runAsNonRoot: true
  runAsUser: 999
  allowPrivilegeEscalation: false
```
Conclusion
NLWeb's deployment in Kubernetes using GitOps methodologies represents the convergence of several technological trends: AI-powered applications, cloud-native infrastructure, and modern DevOps practices. The combination provides organizations with a robust, scalable, and maintainable platform for building next-generation web applications.
The integration with FluxCD ensures that deployments remain consistent and auditable, while the comprehensive Helm charts eliminate much of the complexity traditionally associated with Kubernetes deployments. Azure's AI services provide enterprise-grade capabilities, and the multi-provider LLM support ensures flexibility and cost optimization.
As organizations continue to embrace AI-powered applications, the patterns and practices outlined in this guide will become increasingly valuable. The GitOps approach to NLWeb deployment not only simplifies operations but also provides the foundation for scaling AI applications across enterprise environments.
The future of web applications is undoubtedly AI-powered, and NLWeb's Kubernetes GitOps deployment model provides a clear path forward for organizations ready to embrace this transformation.
Frequently Asked Questions
What sets NLWeb Deployment in Kubernetes GitOps Style apart from traditional deployment methods?
NLWeb's GitOps deployment offers automated configuration management, version-controlled infrastructure, and seamless rollback capabilities. Unlike traditional deployments, it provides declarative configuration management through Git repositories, ensuring consistency across environments and eliminating configuration drift. The integration with FluxCD enables continuous deployment with minimal manual intervention.
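As a hedged illustration of this declarative model, a FluxCD `HelmRelease` for the chart used earlier might look like the following. The repository name, reconciliation interval, chart version, and values ConfigMap are assumptions, not confirmed project conventions:

```yaml
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: nlweb
  namespace: nlweb
spec:
  interval: 10m                 # how often Flux reconciles desired vs. actual state
  chart:
    spec:
      chart: nlweb
      version: "1.2.4"          # assumption: chart version pinned alongside the image tag
      sourceRef:
        kind: HelmRepository
        name: iunera            # assumption: a HelmRepository resource for the iunera charts
        namespace: flux-system
  valuesFrom:
    - kind: ConfigMap
      name: nlweb-values        # the values from the manual install, now tracked in Git
```

Because this resource lives in Git, a rollback is simply a `git revert` that Flux reconciles automatically.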
How does NLWeb handle multiple LLM providers in a Kubernetes environment?
NLWeb's architecture supports multiple LLM providers through a unified configuration system. The platform can simultaneously connect to OpenAI, Anthropic, Azure OpenAI, Gemini, Snowflake, and Hugging Face models. This multi-provider approach is managed through environment variables and ConfigMaps, allowing for easy switching between providers based on cost, performance, or availability requirements.
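To make provider switching concrete, the selection could be externalized into a ConfigMap and exposed to the pods via `envFrom`. The variable names below are purely illustrative assumptions, not the chart's documented keys:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: nlweb-llm-config
  namespace: nlweb
data:
  NLWEB_LLM_PROVIDER: "azure_openai"   # hypothetical key: swap providers without rebuilding
  NLWEB_LLM_MODEL: "gpt-4o"            # hypothetical key
  NLWEB_EMBEDDING_PROVIDER: "openai"   # hypothetical key
```

Changing a provider then becomes a one-line Git commit that the GitOps pipeline rolls out automatically.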
What are the resource requirements for running NLWeb in production?
A typical production NLWeb deployment requires a minimum of 200m CPU and 1Gi memory per pod, with recommended limits of 1000m CPU and 2Gi memory. For high-traffic scenarios, horizontal pod autoscaling can scale from 2 to 10 replicas based on CPU utilization. Storage requirements vary based on caching configuration and data persistence needs, typically starting at 10Gi for persistent volumes.
How does the GitOps approach improve security for NLWeb deployments?
GitOps enhances security through immutable infrastructure, audit trails, and declarative configuration management. All changes are tracked in Git, providing complete visibility into who made what changes and when. The approach eliminates direct cluster access for deployments, reducing the attack surface. Additionally, secrets management is handled through Kubernetes native resources and can be integrated with external secret management systems.
Can NLWeb be deployed across multiple cloud providers?
Yes, NLWeb's cloud-agnostic design allows deployment across multiple cloud providers. While it has deep Azure integration, the Kubernetes-native architecture supports deployment on AWS EKS, Google GKE, or on-premises clusters. The Helm charts abstract cloud-specific configurations, making multi-cloud deployments straightforward.
What monitoring and observability tools work best with NLWeb?
NLWeb integrates well with the Kubernetes ecosystem's monitoring tools including Prometheus for metrics collection, Grafana for visualization, and Jaeger for distributed tracing. The application exposes health check endpoints and custom metrics for AI query performance, cache hit rates, and LLM provider response times. Integration with cloud-native monitoring solutions like Azure Monitor or AWS CloudWatch is also supported.
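If the chart exposes pod annotations in its values, Prometheus' conventional scrape annotations can be attached there. The metrics port and path below are assumptions; verify them against the image's documentation:

```yaml
podAnnotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "8080"     # assumption: NLWeb's metrics port
  prometheus.io/path: "/metrics" # assumption: standard metrics endpoint
```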
How does NLWeb handle data privacy and compliance requirements?
NLWeb implements several privacy and compliance features including data encryption in transit and at rest, configurable data retention policies, and audit logging. The platform supports role-based access control and can be configured to meet various compliance standards including GDPR, HIPAA, and SOC 2. Integration with enterprise identity providers ensures proper authentication and authorization.
What's the recommended approach for testing NLWeb deployments?
The recommended testing approach includes unit tests for individual components, integration tests for AI provider connectivity, and end-to-end tests for complete user workflows. The GitOps deployment model supports multiple environments (development, staging, production) with environment-specific configurations. Automated testing can be integrated into the CI/CD pipeline to validate deployments before they reach production.
How does NLWeb's caching system improve performance and reduce costs?
NLWeb implements intelligent caching that considers multiple factors including query schema, AI provider, and model type when generating cache keys. This approach significantly reduces API calls to expensive LLM providers while maintaining response accuracy. The cache can be configured with custom TTL values and size limits, and supports both in-memory and persistent storage options.
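A hypothetical values fragment for tuning this cache might look like the sketch below; these keys are illustrative and not confirmed chart options:

```yaml
cache:
  enabled: true
  ttlSeconds: 3600        # hypothetical: how long a cached response stays valid
  maxEntries: 10000       # hypothetical: upper bound on cached responses
  persistence:
    enabled: true
    size: 10Gi            # matches the typical starting volume size noted earlier
```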
What are the backup and disaster recovery options for NLWeb?
NLWeb supports comprehensive backup strategies including persistent volume snapshots, configuration backup through Git repositories, and database backups for vector stores. The GitOps approach inherently provides configuration recovery through Git history. For disaster recovery, the platform supports cross-region deployments and can be quickly restored in different availability zones or cloud regions using the same Helm charts and configuration.
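A persistent volume snapshot can be declared with the standard CSI snapshot API. The snapshot class and claim names below are assumptions for an Azure setup; substitute the ones your cluster actually provides:

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: nlweb-data-snapshot
  namespace: nlweb
spec:
  volumeSnapshotClassName: csi-azuredisk-vsc   # assumption: cluster-specific snapshot class
  source:
    persistentVolumeClaimName: nlweb-data      # assumption: NLWeb's data PVC name
```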