Metal GPU acceleration
Ollama runs natively via Homebrew, not in Docker — direct access to the Metal API on Apple Silicon. The other services live behind Docker Compose so state stays portable; only the LLM hot path skips the Docker boundary.
One-command installer for a full local AI/RAG stack on macOS. Deploys Dify, Open WebUI, Ollama, Weaviate/Qdrant, PostgreSQL, Redis, Nginx — with optional Open Notebook and DB-GPT — optimized for Mac Studio and Mac Pro with native Metal GPU acceleration.
Running AI locally on macOS means dealing with Docker quirks, Ollama networking,
Compose profiles, nginx routing, LaunchAgents instead of systemd,
and BSD tools instead of GNU. AGmind handles all of it in one command.
Ollama runs natively via Homebrew, not in Docker — direct access to the Metal API on Apple Silicon. The other services live behind Docker Compose so state stays portable; only the LLM hot path skips the Docker boundary.
Wizard auto-detects your hardware and recommends models by unified memory — gemma3:4b on 8 GB, qwen2.5:7b on 16 GB, qwen2.5:14b on 32 GB, gemma3:27b on 64 GB, qwen2.5:72b on 96 GB+. Smart defaults, no manual YAML.
Add either with a single «yes» in the wizard, or env-vars
INSTALL_OPEN_NOTEBOOK=1 INSTALL_DBGPT=1. Compose-profile-gated;
both connect to native Ollama via host.docker.internal:11434.
brew instead of apt-get. LaunchAgents instead of
systemctl/cron. sysctl hw.memsize, lsof -iTCP,
BSD sed -i ''. Bash 3.2 stock-shell compatible — no gymnastics.
agmind CLIEight commands: status · doctor · start · stop · restart · logs · backup · uninstall.
Doctor returns PASS/WARN/FAIL. Start brings up Ollama first, then Compose; stop reverses
the order. Backup is rotated; uninstall is full removal with confirmation.
Every script is shellcheck-clean. 230 unit tests in tests/unit/,
a sandboxed 9-phase integration run, and a --dry-run mode that validates
the entire flow in ~1 second without sudo or system changes.
Three modes — interactive (9 questions with smart defaults), non-interactive
(env vars), and --dry-run (no sudo, no Docker, no system changes).
All flags below are documented in the README.
--verbose · debug-level output--non-interactive · skip prompts, read from env vars--dry-run · full 9-phase run without system changes--force-phase N · re-run phase N even if completelan (no TLS) and offline (air-gapped, no internet)# Wizard asks 9 questions with smart defaults $ git clone https://github.com/botAGI/AGmind-macos.git $ cd AGmind-macos $ bash install.sh
$ NON_INTERACTIVE=1 \ DEPLOY_PROFILE=lan \ LLM_MODEL=qwen2.5:14b \ EMBED_MODEL=nomic-embed-text \ VECTOR_DB=weaviate \ INSTALL_OPEN_NOTEBOOK=1 \ INSTALL_DBGPT=1 \ bash install.sh
$ bash install.sh --dry-run Names below match the README phase table verbatim. Re-run picks up where it
left off; --force-phase N re-runs a specific phase even if marked complete.
Validates macOS 13+, RAM, disk, ports, brew, Docker.
Interactive or non-interactive config — profile, models, vector DB, optional tools.
Docker (Colima or Docker Desktop) + Compose v2.
Native Ollama via Homebrew — Metal GPU acceleration.
Generates .env, nginx.conf, docker-compose.yml, TOML configs, secrets.
docker compose up + admin credential injection.
Polls all containers + Ollama API until healthy.
Pulls LLM + embedding models via Ollama.
LaunchAgents, agmind CLI install, final summary.
Verbatim from the README stack table. Ollama is the only native (Homebrew-managed) component — everything else is Docker Compose.
| Component | Runs as | Access |
|---|---|---|
| Dify (API + Worker + Web) | Docker | http://<ip>/apps/ |
| Open WebUI | Docker | http://<ip>/ |
| Ollama | Native (brew) | http://localhost:11434 |
| Weaviate or Qdrant | Docker | internal |
| PostgreSQL + Redis | Docker | internal |
| Nginx | Docker | port 80 |
| Open Notebook (optional) | Docker | http://<ip>/notebook/ |
| DB-GPT (optional) | Docker | http://<ip>/dbgpt/ |
Numbers come straight from the README requirements table — minimums for a working stack.
Apache 2.0. 230 BATS tests. macOS 13+ with Apple Silicon or Intel. Auto-detects your hardware, recommends models by RAM, your data stays local.