TL;DR:
- Clone the MCP Memory Service repository
- Install CPU-only Python dependencies
- Configure SQLite-vec storage backend
- Test compilation and basic functionality
Why Compile Without CUDA
MCP Memory Service supports GPU acceleration via CUDA for improved performance, but many systems lack CUDA-compatible GPUs (most laptops, ARM devices, and some servers). This guide focuses on CPU-only compilation and installation, providing full functionality while maintaining reasonable performance through optimized CPU libraries.
Enhanced Fork with CPU Support
Note: For a CPU-only build without CUDA dependencies, consider using my fork at jpmrblood/mcp-memory-service. It adds dedicated CPU build profiles and Dockerfile.cpu support, plus the pyproject.toml changes detailed below.
pyproject.toml Changes (Most Important)
The most critical improvement in the fork is the enhanced pyproject.toml configuration that enables true CPU-only installations:
Added CPU Dependency Profile
    [project.optional-dependencies]
    cpu = [
        "sentence-transformers[onnx]>=2.2.2",
        "torch>=2.0.0"
    ]

This creates an --extra cpu installation option that:
- Forces CPU-only PyTorch without CUDA dependencies
- Uses ONNX-optimized sentence transformers for better performance
- Eliminates GPU package bloat from CPU installations
Astral UV Integration
    [[tool.uv.index]]
    name = "pytorch-cpu"
    url = "https://download.pytorch.org/whl/cpu"
    explicit = true

    [tool.uv.sources]
    torch = [
        { index = "pytorch-cpu", extra = "cpu" }
    ]

This sophisticated UV configuration:
- Defines a custom PyTorch CPU index that never pulls CUDA packages
- Automatically switches PyTorch to CPU-only wheels when using --extra cpu
- Ensures deterministic CPU-only builds regardless of system configuration
- Prevents accidental CUDA package installation
Usage with UV
    # CPU-optimized installation (recommended)
    uv sync --extra cpu

Or,

    uv pip install ".[cpu]"

This configuration eliminates the common issue where pip install torch automatically pulls CUDA-enabled wheels even on CPU-only systems, saving disk space and preventing compatibility issues.
Prerequisites
My system ships with Python 3.13, but I need Python 3.12 or earlier, so I installed 3.11:
    sudo apt update
    sudo apt install python3.11 python3.11-venv python3-pip git build-essential

Installation Steps
1. Clone the Repository
Clone the latest MCP Memory Service code:
    git clone https://github.com/jpmrblood/mcp-memory-service.git
    cd mcp-memory-service

2. Create Virtual Environment
Set up an isolated Python environment:
    # uv creates the environment in .venv by default
    uv venv --python 3.11
    source .venv/bin/activate

3. Install CPU-Only Dependencies
Install dependencies using Astral UV for optimized CPU-only compilation:
    # Install with the CPU profile using uv
    uv sync --extra cpu

Alternatively (or afterwards), install the project itself with the CPU extra:

    # Install the local project with the cpu extra enabled
    uv pip install ".[cpu]"

4. Verify CPU Installation
Ensure the installation uses CPU-only libraries:
    python3 -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}'); print(f'PyTorch version: {torch.__version__}')"

Expected output should show CUDA available: False.
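One caveat: torch.cuda.is_available() also returns False when a CUDA-enabled wheel is installed on a machine without a usable GPU, so it does not prove that the CPU-only wheel was picked. A slightly stricter check, sketched below, also looks at torch.version.cuda (None on builds compiled without CUDA) and at the version suffix. The helper name is_cpu_only_build is hypothetical, and the "+cu" suffix test assumes wheels from the PyTorch indexes, which tag CUDA builds that way.

```python
from typing import Optional


def is_cpu_only_build(version: str, cuda_version: Optional[str], cuda_available: bool) -> bool:
    """Heuristic: no CUDA toolkit compiled in, no usable device, no '+cu' version tag."""
    return cuda_version is None and not cuda_available and "+cu" not in version


def main() -> None:
    try:
        import torch  # imported lazily so the helper above works without torch installed
    except ImportError:
        print("torch is not installed in this environment")
        return
    cpu_only = is_cpu_only_build(
        torch.__version__, torch.version.cuda, torch.cuda.is_available()
    )
    print(f"PyTorch {torch.__version__}: CPU-only build = {cpu_only}")


if __name__ == "__main__":
    main()
```

Run it inside the activated virtual environment so that it inspects the torch installed by the cpu extra rather than a system-wide copy.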
5. Download Model for Embedding
Let's install the Hugging Face CLI:

    uv pip install -U "huggingface_hub[cli]"

Download the model:

    uv run hf download sentence-transformers/all-MiniLM-L6-v2

6. Create a running script
I want this to run as a shared service because I use several free clients that burn through tokens easily.
Put the following in ~/.local/bin/run-mcp-memory.sh:
    #!/bin/bash
    # run-mcp-memory
    export MCP_HTTP_ENABLED=true
    export MCP_SERVER_HOST=0.0.0.0
    export MCP_SERVER_PORT=8000
    export MCP_MEMORY_STORAGE_BACKEND=sqlite_vec
    export MCP_COMMAND="uv --directory=$HOME/mcp-memory-service run memory server"

    npx -y supergateway --stdio "${MCP_COMMAND}" --outputTransport streamableHttp --port 8000

Yes, I run this through supergateway, a gateway that exposes stdio MCP servers over HTTP streaming and SSE.
Make the script executable:
    chmod +x ~/.local/bin/run-mcp-memory.sh

The script is now ready to run.
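Once the script has been started (see the next step), it can be handy to confirm that something is actually listening before pointing a client at it. The following is a small sketch using only the standard library; wait_for_port is a hypothetical helper, and port 8000 is taken from the --port value in the script above. It only tests that the TCP port accepts connections, not that the MCP endpoint responds correctly.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means some process is accepting connections.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: wait briefly and retry.
            time.sleep(interval)
    return False


if __name__ == "__main__":
    # Port 8000 matches the --port value passed to supergateway.
    print("gateway reachable:", wait_for_port("127.0.0.1", 8000, timeout=5.0))
```

Polling with a deadline is useful here because npx may need a few seconds to fetch supergateway on the first run.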
7. Run as server
    run-mcp-memory.sh &

8. Install to the Editors
    # -t selects the transport; -e would set an environment variable instead
    gemini mcp add -t sse -s user memory http://localhost:8000/sse

References
- MCP Memory Service GitHub Repository
- PyTorch CPU Installation Guide
- Using uv with PyTorch
- SQLite-vec Documentation
Compiling MCP Memory Service without CUDA provides a lightweight, accessible solution for CPU-only environments while remaining fully compatible with the Model Context Protocol (MCP) clients it serves, such as Claude.
