Howto ramalama
install
- fedora
install podman:
sudo dnf -y install podman podman-compose
install ramalama:
sudo dnf -y install python3-ramalama
install via pypi:
pip install ramalama
- debian/ubuntu
install podman:
apt install podman podman-compose -y
install ramalama:
curl -fsSL https://ramalama.ai/install.sh | bash
- archlinux
install podman:
pacman -Sy podman podman-compose --noconfirm
install yay using chaotic repo:
https://wiki.vidalinux.org/index.php?title=Howto_NVK#enable_chaotic_repo
install ramalama using yay:
yay -S ramalama
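verify the install worked (the version subcommand is assumed here; ramalama --help also works):
ramalama version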
usage
pull model openai gpt-oss:
ramalama pull gpt-oss
pull model deepseek-r1:
ramalama pull deepseek
pull model granite:
ramalama pull granite
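list the pulled models to confirm the downloads (assuming the list subcommand in current ramalama releases):
ramalama list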
run model ibm granite:
ramalama run granite
serve model:
ramalama serve gpt-oss
serve model with vulkan backend:
ramalama serve --image=quay.io/ramalama/ramalama:latest deepseek
serve model with intel-gpu backend:
ramalama serve --image=quay.io/ramalama/intel-gpu:latest deepseek
serve model with nvidia-gpu backend:
ramalama serve --image=quay.io/ramalama/cuda:latest deepseek
serve model with amd-gpu backend:
ramalama serve --image=quay.io/ramalama/rocm:latest deepseek
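check which gpu/accelerator ramalama detects before picking an image (the info subcommand is assumed from upstream docs):
ramalama info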
serve model as daemon:
ramalama serve --port 8080 --name llamaserver -d deepseek
open the web chat ui in a browser:
http://localhost:8080
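the served model also answers an openai-compatible rest api; a hedged example (the /v1/chat/completions path and the model name are assumptions based on the llama.cpp server ramalama uses):
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "deepseek", "messages": [{"role": "user", "content": "hello"}]}'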
show container runtime command output without executing it:
ramalama --dryrun run deepseek
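list the running model containers to find the service name to stop (ps is assumed to be an alias of ramalama containers; plain podman ps also works):
ramalama ps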
stop model service:
ramalama stop llamaserver
convert specified model to an oci formatted ai model:
ramalama convert ollama://tinyllama:latest oci://quay.io/rhatdan/tiny:latest
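to publish the converted image, log in to the registry and push it (ramalama push is assumed here; pushing the same oci reference with podman should also work):
podman login quay.io
ramalama push oci://quay.io/rhatdan/tiny:latest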
file and dir location
shortname file location:
/usr/share/ramalama/shortnames.conf
ramalama.conf file location:
/usr/share/ramalama/ramalama.conf
models directory location running as normal user:
~/.local/share/ramalama/store/
models directory location running as root user:
/var/lib/ramalama/store
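check how much disk space the downloaded models use:
du -sh ~/.local/share/ramalama/store/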
running as daemon
to run the container as a systemd daemon, create this directory:
mkdir ~/.config/containers/systemd/
generate kubernetes yaml from the running container:
CONTAINERID=$(podman ps | grep -v CONTAINER | awk '{print $1}')
podman kube generate ${CONTAINERID} -f ~/.config/containers/systemd/llamaserver.yaml
remove all running containers:
podman rm -af
create the quadlet systemd unit file:
cat > ~/.config/containers/systemd/llamaserver.kube << EOF
[Unit]
Description = Run Kubernetes YAML with podman kube play

[Kube]
Yaml=llamaserver.yaml
EOF
reload systemd:
systemctl --user daemon-reload
start service using systemd:
systemctl --user start llamaserver.service
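check that the service came up, follow its logs, and keep user services running after logout (enable-linger is a standard systemd-logind feature, not ramalama-specific):
systemctl --user status llamaserver.service
journalctl --user -u llamaserver.service -f
loginctl enable-linger $USER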