
Running IBM Granite 3.2-2B on Raspberry Pi 5: Simple CPU Setup Guide

Author: Julian A. Gonzalez, IBM Champion 2025

Date: 6-29-2025



Introduction

Welcome to this beginner-friendly guide for running IBM’s Granite 3.2-2B language model on a Raspberry Pi 5! This guide shows you how to set up a local AI assistant using only your Raspberry Pi’s built-in hardware - no external drives or complex configurations needed.

What to Expect:
The Granite 3.2-2B model has 2 billion parameters and runs well on the Raspberry Pi 5’s CPU. Expect responses in about 5-15 seconds on an 8GB Pi 5 and 10-20 seconds on a 4GB Pi 5. Your Pi will get warm under sustained load, so good ventilation is recommended.

What You Need

Required Hardware

Important: This guide uses only the microSD card and the Raspberry Pi 5’s onboard memory. No external drives needed!

Table of Contents

  1. Setting Up Your Raspberry Pi 5
  2. Installing Ollama
  3. Basic Performance Optimization
  4. Downloading Granite 3.2-2B
  5. Testing Your Setup
  6. Simple Usage Examples
  7. Troubleshooting
  8. What’s Next

Setting Up Your Raspberry Pi 5

Step 1: Install the Operating System

  1. Download Raspberry Pi Imager from raspberrypi.com/software
  2. Flash your SD card:
    • Insert your microSD card into your computer
    • Open Raspberry Pi Imager
    • Choose “Raspberry Pi OS (64-bit)” - Must be 64-bit!
    • Select your SD card
    • Click the gear icon for advanced options (newer Imager versions offer “Edit Settings” after you click Next instead):
      • Set a username and password
      • Enable SSH if you want remote access
      • Configure Wi-Fi if needed
    • Click “Write” and wait for it to finish
  3. First boot:
    • Insert the SD card into your Raspberry Pi 5
    • Connect monitor, keyboard, mouse, and power
    • Follow the setup wizard

Step 2: Update Your System

Open a terminal and run these commands:

sudo apt update
sudo apt upgrade -y
sudo reboot

After reboot, verify you have the 64-bit system:

uname -m

You should see aarch64 (this means 64-bit ARM).
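If you plan to script the setup, the same check can be wrapped in a small function. This is a minimal sketch: the name check_arch is hypothetical, and the optional argument exists only so the function can be tried on a machine that is not a Pi.

```bash
#!/bin/bash
# Hypothetical helper: confirm the OS reports a 64-bit ARM architecture.
# With no argument it inspects the running system via uname -m.
check_arch() {
    local arch="${1:-$(uname -m)}"
    if [ "$arch" = "aarch64" ]; then
        echo "OK: 64-bit ARM ($arch)"
    else
        echo "WARNING: expected aarch64, found $arch"
        return 1
    fi
}
```

Run check_arch before installing Ollama. A non-zero exit status means the system is not reporting aarch64 and the SD card should be re-flashed with the 64-bit image.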

Installing Ollama

Ollama makes it easy to run AI models on your Raspberry Pi. Here’s how to install it:

Simple Installation

Run this single command to install Ollama:

curl -fsSL https://ollama.com/install.sh | sh

That’s it! Ollama is now installed and running.

Verify Installation

Check if Ollama is working:

ollama --version

You should see version information displayed.
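The version command only confirms that the command-line tool is installed. To also confirm that the background server is answering, you can query its HTTP API. The sketch below assumes Ollama's default address, http://localhost:11434; the function name ollama_api_up is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: succeed only if an Ollama server answers at the given URL.
# http://localhost:11434 is Ollama's default listen address.
ollama_api_up() {
    local base_url="${1:-http://localhost:11434}"
    curl -fsS --max-time 5 "$base_url/api/version" >/dev/null 2>&1
}
```

For example, ollama_api_up && echo "Ollama server is responding". If it fails, sudo systemctl status ollama usually shows why.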

Basic Performance Optimization

Let’s make a few simple changes to get the best performance from your Pi’s CPU:

Step 1: Set CPU to Performance Mode

This makes your Pi’s CPU run at full speed:

echo "performance" | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

Step 2: Make Performance Mode Permanent

Create a simple service to set performance mode on every boot:

sudo nano /etc/systemd/system/cpu-performance.service

Copy and paste this content:

[Unit]
Description=Set CPU to Performance Mode
After=multi-user.target

[Service]
Type=oneshot
ExecStart=/bin/bash -c 'echo performance | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor'
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target

Save the file (Ctrl+X, then Y, then Enter), then enable it:

sudo systemctl enable cpu-performance.service
sudo systemctl start cpu-performance.service
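To confirm the governor took effect on every core (not just cpu0), you can read the value back from sysfs. This is a minimal sketch: the function name show_governors is hypothetical, and the optional directory argument is there only to make it testable away from the Pi.

```bash
#!/bin/bash
# Hypothetical helper: print the active governor for every CPU core.
# The optional argument overrides the sysfs base directory (useful for testing).
show_governors() {
    local base="${1:-/sys/devices/system/cpu}"
    local file core
    for file in "$base"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$file" ] || continue
        core="${file#"$base"/}"    # e.g. cpu0/cpufreq/scaling_governor
        core="${core%%/*}"         # e.g. cpu0
        echo "$core: $(cat "$file")"
    done
}
```

On a Pi 5 with the service active, every line printed by show_governors should end in performance.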

Step 3: Optimize Ollama for CPU

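Ollama normally detects the number of CPU cores on its own, so this step is optional. Note that OLLAMA_NUM_THREADS does not appear among Ollama’s documented environment variables, so the override below may have no effect; the documented way to pin the thread count is the num_thread model parameter (for example, PARAMETER num_thread 4 in a Modelfile). To try the override anyway:

sudo systemctl edit ollama.service

Add these lines in the editor that opens:

[Service]
Environment="OLLAMA_NUM_THREADS=4"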

Save and exit, then restart Ollama:

sudo systemctl daemon-reload
sudo systemctl restart ollama

Downloading Granite 3.2-2B

Now for the exciting part - getting the AI model!

Method 1: Direct Download (Easiest)

Try this first - it might work directly:

ollama pull granite3.2:2b

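If this works, the model is available under the name granite3.2:2b. The rest of this guide refers to the model as granite-pi, so either substitute granite3.2:2b in the commands that follow, or create granite-pi with Method 2 (in that case, change the FROM line to FROM granite3.2:2b so it builds on the model you just pulled).

Method 2: Custom Modelfile

If the direct method doesn’t work, or you want a model preconfigured for the Pi, create a custom setup: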

  1. Create a model configuration file:
    nano granite-pi.Modelfile
    
  2. Add this content (copy and paste):
    FROM ibm/granite-3.2-2b
    PARAMETER temperature 0.7
    PARAMETER num_ctx 2048
    SYSTEM """
    You are a helpful AI assistant running on Raspberry Pi 5 hardware.
    Keep responses concise and efficient due to hardware limitations.
    """
    
  3. Create the model:
    ollama create granite-pi -f granite-pi.Modelfile
    
  4. Verify the model is ready:
    ollama list
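In a script, you may want to check for the model rather than read the list by eye. The sketch below assumes the usual ollama list layout (one header row, model name in the first column); the function name has_model is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: read `ollama list` output on stdin and succeed if the
# named model is present. A name without a tag is treated as NAME:latest.
has_model() {
    local want="$1"
    case "$want" in
        *:*) ;;
        *) want="$want:latest" ;;
    esac
    # Skip the header row, then compare the first column exactly.
    awk -v m="$want" 'NR > 1 && $1 == m { found = 1 } END { exit !found }'
}
```

For example: ollama list | has_model granite-pi && echo "granite-pi is ready".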
    

Testing Your Setup

Let’s make sure everything works!

Basic Test

Start a conversation with your AI:

ollama run granite-pi

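Try asking it something simple, such as a short factual question.

To exit the conversation, type /bye or press Ctrl+D. (Ctrl+C only interrupts the response that is currently being generated.)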

Monitor Temperature (Optional)

Create a simple temperature monitor to keep an eye on your Pi:

cat > ~/check_temp.sh << 'EOF'
#!/bin/bash
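# vcgencmd prints e.g. temp=48.3'C; keep only the number
temp=$(vcgencmd measure_temp | cut -d= -f2 | cut -d\' -f1)
echo "CPU Temperature: ${temp}°C"
# Compare with awk so the script does not depend on bc being installed
if awk -v t="$temp" 'BEGIN { exit !(t > 70) }'; then
    echo "⚠️  Getting warm! Consider better cooling."
fi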
EOF

chmod +x ~/check_temp.sh

Run it anytime with:

~/check_temp.sh
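If you would rather keep a history of readings than check by hand, the parsing step can be split into reusable functions. This is a sketch under stated assumptions: the function names and the ~/temp.log path are hypothetical, and the 70 degree default simply mirrors the threshold used in the script above.

```bash
#!/bin/bash
# Extract the numeric part of vcgencmd output such as temp=48.3'C.
parse_temp() {
    local raw="${1#temp=}"
    echo "${raw%%[!0-9.]*}"   # drop everything from the first non-numeric character
}

# Succeed if the temperature ($1) exceeds the limit ($2, default 70).
is_hot() {
    awk -v t="$1" -v limit="${2:-70}" 'BEGIN { exit !(t + 0 > limit + 0) }'
}

# Hypothetical logger: append a timestamped reading to ~/temp.log.
log_temp() {
    local temp
    temp=$(parse_temp "$(vcgencmd measure_temp)")
    echo "$(date '+%F %T') ${temp}" >> "$HOME/temp.log"
}
```

Calling log_temp from a loop or a cron job while the model is answering gives a simple record of how hot the Pi gets under load.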

Simple Usage Examples

Here are some easy ways to use your new AI assistant:

Quick Questions

# Ask a quick question (the prompt is passed as an argument; the answer prints as plain text)
ollama run granite-pi "Explain what machine learning is in simple terms"
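For scripts that need structured output, Ollama’s local HTTP API returns a JSON object whose response field holds the answer, which jq can extract. The sketch below assumes the default server address http://localhost:11434 and a model named granite-pi; the function names are hypothetical.

```bash
#!/bin/bash
# Build the JSON request body with jq so quotes in the prompt are escaped safely.
build_generate_payload() {
    local model="$1" prompt="$2"
    jq -n --arg model "$model" --arg prompt "$prompt" \
        '{model: $model, prompt: $prompt, stream: false}'
}

# Send one prompt to the local Ollama server and print only the answer text.
# Assumes the default address http://localhost:11434 and a model named granite-pi.
ask_granite() {
    local prompt="$1" model="${2:-granite-pi}"
    build_generate_payload "$model" "$prompt" |
        curl -fsS http://localhost:11434/api/generate -d @- |
        jq -r '.response'
}
```

For example: ask_granite "Explain what machine learning is in simple terms". Setting stream to false makes the server return one complete JSON object instead of a stream of partial ones.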

Create a Simple Chat Script

cat > ~/chat.sh << 'EOF'
#!/bin/bash
echo "🤖 Granite AI Assistant (type 'quit' to exit)"
echo "=================================="

while true; do
    # -r keeps backslashes literal; leave the loop cleanly on Ctrl+D (end of input)
    read -r -p "You: " input || break
    
    if [ "$input" = "quit" ]; then
        echo "Goodbye!"
        break
    fi
    
    echo -n "AI: "
    # Each question is sent on its own; the model does not remember earlier turns
    ollama run granite-pi "$input"
    echo ""
done
EOF

chmod +x ~/chat.sh

Use it with:

~/chat.sh

Check System Status

cat > ~/ai_status.sh << 'EOF'
#!/bin/bash
echo "🔍 Raspberry Pi AI Status"
echo "========================"
echo "Temperature: $(vcgencmd measure_temp)"
echo "Memory usage: $(free -h | grep Mem | awk '{print $3 "/" $2}')"
echo "CPU governor: $(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor)"
echo "Ollama status: $(systemctl is-active ollama)"
echo "Available models:"
ollama list
EOF

chmod +x ~/ai_status.sh

Troubleshooting

Problem: Model Won’t Load

Symptoms: Error messages when trying to run the model
Solutions:

  1. Check available memory:
    free -h
    
  2. Restart Ollama:
    sudo systemctl restart ollama
    
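  3. Try a smaller context size. ollama run has no --num_ctx flag, so start a session with ollama run granite-pi and enter this at the prompt:
    /set parameter num_ctx 1024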
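Before loading the model from a script, you can check the memory figure programmatically instead of reading free -h. This is a minimal sketch: the function names are hypothetical, and the threshold you pass is your own choice rather than a figure from this guide.

```bash
#!/bin/bash
# Hypothetical helper: print available memory in whole megabytes.
# Reads MemAvailable (reported in kB) from /proc/meminfo or a file given as $1.
mem_available_mb() {
    local meminfo="${1:-/proc/meminfo}"
    awk '/^MemAvailable:/ { print int($2 / 1024) }' "$meminfo"
}

# Succeed if at least $1 MB are available. The threshold is the caller's choice.
enough_memory() {
    local needed_mb="$1" meminfo="${2:-/proc/meminfo}"
    [ "$(mem_available_mb "$meminfo")" -ge "$needed_mb" ]
}
```

For example: enough_memory 2000 || echo "Close some applications before starting the model". Here 2000 is only an illustrative value.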
    

Problem: Very Slow Responses

Symptoms: Responses take more than 30 seconds
Solutions:

  1. Check CPU temperature:
    vcgencmd measure_temp
    
  2. Verify performance mode:
    cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
    
  3. Close other applications to free up resources.
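Temperature alone does not tell you whether the firmware has actually slowed the CPU. The vcgencmd get_throttled command reports this as a hexadecimal bit mask, and the sketch below decodes the commonly documented bits into plain text. The function name decode_throttled is hypothetical.

```bash
#!/bin/bash
# Decode the bit mask printed by `vcgencmd get_throttled`, e.g. throttled=0x50005.
decode_throttled() {
    local raw="${1#throttled=}"
    local value=$((raw))
    if (( value == 0 )); then
        echo "no throttling reported"
        return 0
    fi
    if (( value & 0x1 )); then echo "under-voltage detected now"; fi
    if (( value & 0x2 )); then echo "ARM frequency capped now"; fi
    if (( value & 0x4 )); then echo "currently throttled"; fi
    if (( value & 0x8 )); then echo "soft temperature limit active now"; fi
    if (( value & 0x10000 )); then echo "under-voltage has occurred since boot"; fi
    if (( value & 0x20000 )); then echo "ARM frequency capping has occurred since boot"; fi
    if (( value & 0x40000 )); then echo "throttling has occurred since boot"; fi
    if (( value & 0x80000 )); then echo "soft temperature limit has occurred since boot"; fi
    return 0
}
```

Run it as decode_throttled "$(vcgencmd get_throttled)". Under-voltage messages point to the power supply rather than to cooling.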

Problem: Pi Becomes Unresponsive

Symptoms: System freezes or becomes very slow
Solutions:

  1. This usually means overheating or memory exhaustion. Improve cooling first.
  2. If you have the 4GB model, consider moving to the 8GB model.
  3. Reduce the model context size. Inside an ollama run granite-pi session, enter:
    /set parameter num_ctx 1024
    

Problem: “Out of Memory” Errors

Symptoms: Ollama crashes with memory errors
Solutions:

  1. Close all other applications.
  2. Reboot your Pi to clear memory:
    sudo reboot
    
  3. Consider upgrading to the 8GB RAM model.

What’s Next?

Congratulations! You now have a working AI assistant on your Raspberry Pi. Here are some ideas for what to do next:

Experiment with Different Prompts

Try asking your AI to:

Create Useful Scripts

# Daily briefing script
cat > ~/daily_brief.sh << 'EOF'
#!/bin/bash
echo "Good morning! Here's your daily tech tip:"
ollama run granite-pi "Give me one short, practical tech tip."
EOF

chmod +x ~/daily_brief.sh

Learn More About AI

Your Raspberry Pi is a great platform for learning about:

Join the Community

Performance Expectations

| Raspberry Pi Model | Response Time | Best Use Cases                    |
|--------------------|---------------|-----------------------------------|
| Pi 5 (4GB RAM)     | 10-20 seconds | Simple questions, learning        |
| Pi 5 (8GB RAM)     | 5-15 seconds  | General use, longer conversations |
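To see where your own Pi falls in these ranges, you can time a single prompt. The sketch below uses bash’s built-in SECONDS counter, so it reports whole seconds only; the function name time_command is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: run a command, discard its output, and report elapsed
# wall-clock time in whole seconds using bash's SECONDS counter.
time_command() {
    local start=$SECONDS status=0
    "$@" >/dev/null || status=$?
    echo "elapsed: $((SECONDS - start))s"
    return "$status"
}
```

For example: time_command ollama run granite-pi "What is a Raspberry Pi?". The first run includes the time needed to load the model into memory, so time a second run for a fairer number. ollama run also accepts --verbose, which prints its own timing statistics after each answer.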

Tips for Best Performance

  1. Keep it cool: Good cooling = better performance
  2. Close other apps: Give your AI all available resources
  3. Use shorter prompts: Longer questions take more time
  4. Be patient: AI on a $100 computer is still amazing!

Safety and Care

Protecting Your Pi

Responsible AI Use

Conclusion

You’ve successfully set up a powerful AI assistant using just your Raspberry Pi 5’s built-in hardware! This simple setup proves that you don’t need expensive equipment to experiment with AI technology.

The Granite 3.2-2B model running on your Pi’s CPU provides a great introduction to local AI without any cloud dependencies. You have complete control over your data and can experiment freely.

Remember:

This is just the beginning of your AI journey. Enjoy exploring the possibilities!

Julian A. Gonzalez, IBM Champion 2025