Skip to main content
Background Image
  1. Projects/

Ollama - Self-hosted Multi Language Model

·185 words·1 min·
Nguyen Trung Kien
Author
Nguyen Trung Kien
Software Developer | Linux Enthusiast

Project Overview
#

Duration: 12 months
Team: Solo project
Purpose: Networking study and self-hosted AI infrastructure

Tech Stack
#

Operating System: Ubuntu Server
Containerization: Docker
Networking: Cloudflare Tunnel
Monitoring: Prometheus, Grafana
AI Platform: Ollama

Architecture
#

Built a dual-PC infrastructure optimizing for both energy efficiency and computational power:

  • Low-Energy PC - Hosts website via Cloudflare tunnel with personal domain
  • High-Energy Server - Handles compute-intensive tasks including large language models

Key Features
#

  • 🖥️ Dual-PC Setup - Optimized resource allocation between networking and computing
  • 🐳 Full Containerization - All services Dockerized for efficient management
  • 🔄 Auto-Restart - Automatic service recovery and management
  • 📊 Performance Monitoring - Real-time metrics with Prometheus
  • 📈 Log Visualization - Streamlined analysis through Grafana dashboards
  • 🌐 Secure Access - Cloudflare tunnel integration with personal domain

Technical Implementation
#

  • Containerized website, Cloudflare tunnel, and LLM services
  • Implemented comprehensive monitoring stack for performance tracking
  • Configured automated restart mechanisms for service reliability
  • Established secure remote access through Cloudflare infrastructure

Learning Outcomes
#

  • Advanced networking concepts and tunnel configuration
  • Container orchestration and service management
  • Infrastructure monitoring and observability
  • Self-hosted AI deployment strategies
  • Energy-efficient server architecture design