ggml-org/llama.cpp

33 workflows · maturity 67% · 12 patterns · GitHub ↗

Security 1.52/100

Practices

✓ Matrix✓ Permissions○ Security scan○ AI review✓ Cache✓ Concurrency✓ Reusable workflows

Detected patterns

ai-code-review chaos-engineering cross-version-compat dogfooding hardware-matrix least-privilege-permissions multi-channel-release multi-stage-release multiple-build-systems per-sample-ci performance-tracking reusable-workflows

Security dimensions

permissions

1.5

security scan

supply chain

secret handling

harden runner

Workflows (33)

ai-issues .github/workflows/ai-issues.yml

Triggers

issues

Runs on

self-hosted, opencode

Jobs

find-related

Commands

rm AGENTS.md rm CLAUDE.md timeout 5m opencode run -m llama.cpp-dgx/ai-review-issues-find-similar --thinking "A new issue has been created: Issue number: ${{ github.event.issue.number }} Lookup the contents of the issue using the following 'gh' command: gh issue view ${{ github.event.issue.number }} --json title,body,url,number Next, perform the following task and then post a SINGLE comment (if needed). --- TASK : FIND RELATED ISSUES Using the 'gh' CLI tool, search through existing issues on Github. Find related or similar issues to the newly created one and list them. Do not list the new issue itself (it is #${{ github.event.issue.number }}). Consider: 1. Similar titles or descriptions 2. Same error messages or symptoms 3. Related functionality or components 4. Similar feature requests --- POSTING YOUR COMMENT: Based on your findings, post a SINGLE comment on issue #${{ github.event.issue.number }}. Build the comment as follows: - If no related issues were found, do NOT comment at all. - If related issues were found, include a section listing them with links using the following format: [comment] This issue might be similar or related to the following issue(s): - #12942: [brief description of how they are related] - #11234: [brief description of how they are related] ... _This comment was auto-generated locally using **$GA_ENGINE** on **$GA_MACHINE**_ [/comment] Remember: - Do not include the comment tags in your actual comment. - Post at most ONE comment combining all findings. - If you didn't find issues that are related enough, post nothing. - You have access only to the 'gh' CLI tool - don't try to use other tools. - If the output from a tool call is too long, try to limit down the search. "

View raw YAML

name: AI review (issues)

on:
  issues:
    types: [opened]

jobs:
  find-related:
    if: github.event.action == 'opened'
    runs-on: [self-hosted, opencode]

    permissions:
      contents: read
      issues: write

    steps:
      - name: Checkout repository
        uses: actions/checkout@v6
        with:
          fetch-depth: 1

      - name: Find related
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          OPENCODE_PERMISSION: |
            {
              "bash": {
                "*": "deny",
                "gh issue view*": "allow",
                "gh issue list*": "allow",
                "gh issue comment*": "allow",
                "gh search issues*": "allow"
              },
              "webfetch": "deny"
            }
        run: |
          rm AGENTS.md
          rm CLAUDE.md

          timeout 5m opencode run -m llama.cpp-dgx/ai-review-issues-find-similar --thinking "A new issue has been created:

          Issue number: ${{ github.event.issue.number }}

          Lookup the contents of the issue using the following 'gh' command:

          gh issue view ${{ github.event.issue.number }} --json title,body,url,number

          Next, perform the following task and then post a SINGLE comment (if needed).

          ---

          TASK : FIND RELATED ISSUES

          Using the 'gh' CLI tool, search through existing issues on Github.
          Find related or similar issues to the newly created one and list them.
          Do not list the new issue itself (it is #${{ github.event.issue.number }}).

          Consider:
          1. Similar titles or descriptions
          2. Same error messages or symptoms
          3. Related functionality or components
          4. Similar feature requests

          ---

          POSTING YOUR COMMENT:

          Based on your findings, post a SINGLE comment on issue #${{ github.event.issue.number }}. Build the comment as follows:

          - If no related issues were found, do NOT comment at all.
          - If related issues were found, include a section listing them with links using the following format:

          [comment]
          This issue might be similar or related to the following issue(s):

            - #12942: [brief description of how they are related]
            - #11234: [brief description of how they are related]
            ...

          _This comment was auto-generated locally using **$GA_ENGINE** on **$GA_MACHINE**_
          [/comment]

          Remember:
            - Do not include the comment tags in your actual comment.
            - Post at most ONE comment combining all findings.
            - If you didn't find issues that are related enough, post nothing.
            - You have access only to the 'gh' CLI tool - don't try to use other tools.
            - If the output from a tool call is too long, try to limit down the search.
          "

build matrix .github/workflows/build.yml

Triggers

workflow_dispatch, push, pull_request

Runs on

macos-latest, macos-15-intel, macos-latest, ${{ matrix.os }}, ubuntu-latest, ${{ 'ubuntu-24.04-arm' || 'ubuntu-24.04' }}, ubuntu-24.04, ${{ 'ubuntu-24.04-arm' || 'ubuntu-24.04' }}, ubuntu-22.04, ubuntu-22.04, ubuntu-22.04, ubuntu-22.04, ${{ fromJSON(matrix.runner) }}, windows-2025, ubuntu-latest, windows-2022, windows-2022, windows-2022, RISCV64, ubuntu-22.04, ubuntu-22.04-arm, ubuntu-22.04, ubuntu-22.04-arm, ubuntu-22.04-arm, ubuntu-22.04-arm, ah-ubuntu_22_04-c8g_8x

Jobs

build-cmake-pkg, macOS-latest-arm64, macOS-latest-x64, macOS-latest-arm64-webgpu, ubuntu-cpu, ubuntu-latest-rpc, ubuntu-24-vulkan, ubuntu-24-webgpu, ubuntu-24-webgpu-wasm, ubuntu-22-hip, ubuntu-22-musa, ubuntu-22-sycl, ubuntu-22-sycl-fp16, ubuntu-24-openvino, windows-latest, ubuntu-latest-cuda, windows-2022-cuda, windows-latest-sycl, windows-latest-hip, ubuntu-cpu-riscv64-native, ggml-ci-x64-cpu-low-perf, ggml-ci-arm64-cpu-low-perf, ggml-ci-x64-cpu-high-perf, ggml-ci-arm64-cpu-high-perf, ggml-ci-arm64-cpu-high-perf-sve, ggml-ci-arm64-cpu-kleidiai, ggml-ci-arm64-cpu-kleidiai-graviton4

Matrix

cuda, include, include.arch, include.build, include.defines, include.openvino_device, include.os, include.runner, include.variant→ "ubuntu-24.04", -DCMAKE_BUILD_TYPE=Release -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML_BACKEND_DL=ON -DGGML_CPU_ALL_VARIANTS=ON -DGGML_VULKAN=ON, -G "Ninja Multi-Config" -D CMAKE_TOOLCHAIN_FILE=cmake/arm64-windows-llvm.cmake -DCMAKE_PREFIX_PATH="$env:RUNNER_TEMP/opencl-arm64-release" -DGGML_OPENCL=ON -DGGML_OPENCL_USE_ADRENO_KERNELS=ON, -G "Ninja Multi-Config" -D CMAKE_TOOLCHAIN_FILE=cmake/arm64-windows-llvm.cmake -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON, -G "Ninja Multi-Config" -D CMAKE_TOOLCHAIN_FILE=cmake/x64-windows-llvm.cmake -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DBUILD_SHARED_LIBS=OFF, -G "Ninja Multi-Config" -D CMAKE_TOOLCHAIN_FILE=cmake/x64-windows-llvm.cmake -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML_BACKEND_DL=ON -DGGML_CPU_ALL_VARIANTS=ON -DGGML_OPENMP=OFF -DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS -DBLAS_INCLUDE_DIRS="$env:RUNNER_TEMP/openblas/include" -DBLAS_LIBRARIES="$env:RUNNER_TEMP/openblas/lib/openblas.lib", 12.4, CPU, GPU, ["self-hosted","Linux","X64","Intel"], arm64, cpu, cpu-x64 (static), gpu, llvm-arm64, llvm-arm64-opencl-adreno, openblas-x64, ppc64le, s390x, ubuntu-22.04, ubuntu-22.04-arm, ubuntu-24.04-ppc64le, ubuntu-24.04-s390x, vulkan-x64, x64

Actions

ggml-org/ccache-action, ggml-org/ccache-action, ggml-org/ccache-action, ggml-org/ccache-action, ggml-org/ccache-action, ggml-org/ccache-action, ggml-org/ccache-action, ggml-org/ccache-action, ggml-org/ccache-action, ggml-org/ccache-action, ggml-org/ccache-action, ggml-org/ccache-action, ggml-org/ccache-action, ggml-org/ccache-action, ggml-org/ccache-action, ggml-org/ccache-action, ggml-org/ccache-action, ggml-org/ccache-action, ggml-org/ccache-action, ggml-org/ccache-action, ggml-org/ccache-action, ggml-org/ccache-action

Commands

sysctl -a cmake -B build \ -DCMAKE_BUILD_RPATH="@loader_path" \ -DLLAMA_FATAL_WARNINGS=ON \ -DLLAMA_BUILD_BORINGSSL=ON \ -DGGML_METAL_USE_BF16=ON \ -DGGML_METAL_EMBED_LIBRARY=OFF \ -DGGML_METAL_SHADER_DEBUG=ON \ -DGGML_RPC=ON time cmake --build build --config Release -j $(sysctl -n hw.logicalcpu) leaks -atExit -- ./build/bin/test-thread-safety -hf ggml-org/gemma-3-270m-qat-GGUF -ngl 99 -p "$(printf 'hello %.0s' {1..128})" -n 16 -c 512 -ub 32 -np 2 -t 2 -lv 1
cd build ctest -L main -E "test-llama-archs" --verbose --timeout 900
sysctl -a # Metal is disabled due to intermittent failures with Github runners not having a GPU: # https://github.com/ggml-org/llama.cpp/actions/runs/8635935781/job/23674807267#step:5:2313 cmake -B build \ -DCMAKE_BUILD_RPATH="@loader_path" \ -DLLAMA_FATAL_WARNINGS=ON \ -DLLAMA_BUILD_BORINGSSL=ON \ -DGGML_METAL=OFF \ -DGGML_RPC=ON \ -DCMAKE_OSX_DEPLOYMENT_TARGET=13.3 time cmake --build build --config Release -j $(sysctl -n hw.logicalcpu)
cd build ctest -L main --verbose --timeout 900
DAWN_VERSION="v2.0.0" DAWN_OWNER="reeselevine" DAWN_REPO="dawn" DAWN_ASSET_NAME="Dawn-5e9a4865b1635796ccc77dd30057f2b4002a1355-macos-latest-Release" echo "Fetching release asset from https://github.com/${DAWN_OWNER}/${DAWN_REPO}/releases/download/${DAWN_VERSION}/${DAWN_ASSET_NAME}.zip" curl -L -o artifact.zip \ "https://github.com/${DAWN_OWNER}/${DAWN_REPO}/releases/download/${DAWN_VERSION}/${DAWN_ASSET_NAME}.zip" mkdir dawn unzip artifact.zip tar -xvf ${DAWN_ASSET_NAME}.tar.gz -C dawn --strip-components=1
export CMAKE_PREFIX_PATH=dawn cmake -B build -G "Ninja" -DCMAKE_BUILD_TYPE=Release -DGGML_WEBGPU=ON -DGGML_METAL=OFF -DGGML_BLAS=OFF time cmake --build build --config Release -j $(sysctl -n hw.logicalcpu)
cd build ctest -L main --verbose --timeout 900
sudo apt-get update sudo apt-get install -y --no-install-recommends \ python3 python3-pip python3-dev \ libjpeg-dev build-essential libssl-dev \ git-lfs

View raw YAML

name: CI

on:
  workflow_dispatch: # allows manual triggering
  push:
    branches:
      - master
    paths: [
      '.github/workflows/build.yml',
      '.github/workflows/build-cmake-pkg.yml',
      '**/CMakeLists.txt',
      '**/.cmake',
      '**/*.h',
      '**/*.hpp',
      '**/*.c',
      '**/*.cpp',
      '**/*.cu',
      '**/*.cuh',
      '**/*.swift',
      '**/*.m',
      '**/*.metal',
      '**/*.comp',
      '**/*.glsl',
      '**/*.wgsl'
    ]

  pull_request:
    types: [opened, synchronize, reopened]
    paths: [
      '.github/workflows/build.yml',
      '.github/workflows/build-cmake-pkg.yml',
      '**/CMakeLists.txt',
      '**/.cmake',
      '**/*.h',
      '**/*.hpp',
      '**/*.c',
      '**/*.cpp',
      '**/*.cu',
      '**/*.cuh',
      '**/*.swift',
      '**/*.m',
      '**/*.metal',
      '**/*.comp',
      '**/*.glsl',
      '**/*.wgsl'
    ]

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}
  cancel-in-progress: true

env:
  GGML_NLOOP: 3
  GGML_N_THREADS: 1
  LLAMA_LOG_COLORS: 1
  LLAMA_LOG_PREFIX: 1
  LLAMA_LOG_TIMESTAMPS: 1

jobs:
  build-cmake-pkg:
    uses: ./.github/workflows/build-cmake-pkg.yml

  macOS-latest-arm64:
    runs-on: macos-latest

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: macOS-latest-arm64
          evict-old-files: 1d
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Build
        id: cmake_build
        run: |
          sysctl -a
          cmake -B build \
            -DCMAKE_BUILD_RPATH="@loader_path" \
            -DLLAMA_FATAL_WARNINGS=ON \
            -DLLAMA_BUILD_BORINGSSL=ON \
            -DGGML_METAL_USE_BF16=ON \
            -DGGML_METAL_EMBED_LIBRARY=OFF \
            -DGGML_METAL_SHADER_DEBUG=ON \
            -DGGML_RPC=ON
          time cmake --build build --config Release -j $(sysctl -n hw.logicalcpu)
          leaks -atExit -- ./build/bin/test-thread-safety -hf ggml-org/gemma-3-270m-qat-GGUF -ngl 99 -p "$(printf 'hello %.0s' {1..128})" -n 16 -c 512 -ub 32 -np 2 -t 2 -lv 1

      - name: Test
        id: cmake_test
        run: |
          cd build
          ctest -L main -E "test-llama-archs" --verbose --timeout 900

  macOS-latest-x64:
    runs-on: macos-15-intel

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: macOS-latest-x64
          evict-old-files: 1d
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Build
        id: cmake_build
        run: |
          sysctl -a
          # Metal is disabled due to intermittent failures with Github runners not having a GPU:
          # https://github.com/ggml-org/llama.cpp/actions/runs/8635935781/job/23674807267#step:5:2313
          cmake -B build \
            -DCMAKE_BUILD_RPATH="@loader_path" \
            -DLLAMA_FATAL_WARNINGS=ON \
            -DLLAMA_BUILD_BORINGSSL=ON \
            -DGGML_METAL=OFF \
            -DGGML_RPC=ON \
            -DCMAKE_OSX_DEPLOYMENT_TARGET=13.3
          time cmake --build build --config Release -j $(sysctl -n hw.logicalcpu)

      - name: Test
        id: cmake_test
        run: |
          cd build
          ctest -L main --verbose --timeout 900

  macOS-latest-arm64-webgpu:
    runs-on: macos-latest

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: macOS-latest-arm64-webgpu
          evict-old-files: 1d
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Dawn Dependency
        id: dawn-depends
        run: |
          DAWN_VERSION="v2.0.0"
          DAWN_OWNER="reeselevine"
          DAWN_REPO="dawn"
          DAWN_ASSET_NAME="Dawn-5e9a4865b1635796ccc77dd30057f2b4002a1355-macos-latest-Release"
          echo "Fetching release asset from https://github.com/${DAWN_OWNER}/${DAWN_REPO}/releases/download/${DAWN_VERSION}/${DAWN_ASSET_NAME}.zip"
          curl -L -o artifact.zip \
            "https://github.com/${DAWN_OWNER}/${DAWN_REPO}/releases/download/${DAWN_VERSION}/${DAWN_ASSET_NAME}.zip"
          mkdir dawn
          unzip artifact.zip
          tar -xvf ${DAWN_ASSET_NAME}.tar.gz -C dawn --strip-components=1

      - name: Build
        id: cmake_build
        run: |
          export CMAKE_PREFIX_PATH=dawn
          cmake -B build -G "Ninja" -DCMAKE_BUILD_TYPE=Release -DGGML_WEBGPU=ON -DGGML_METAL=OFF -DGGML_BLAS=OFF
          time cmake --build build --config Release -j $(sysctl -n hw.logicalcpu)

      - name: Test
        id: cmake_test
        run: |
          cd build
          ctest -L main --verbose --timeout 900

  ubuntu-cpu:
    strategy:
      matrix:
        include:
          - build: 'x64'
            os: ubuntu-22.04
          - build: 'arm64'
            os: ubuntu-22.04-arm
          - build: 's390x'
            os: ubuntu-24.04-s390x
          - build: 'ppc64le'
            os: ubuntu-24.04-ppc64le

    runs-on: ${{ matrix.os }}

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: ccache
        if: ${{ matrix.build != 's390x' && matrix.build != 'ppc64le' }}
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: ubuntu-cpu-${{ matrix.build }}
          evict-old-files: 1d
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Build Dependencies
        id: build_depends
        run: |
          sudo apt-get update
          sudo apt-get install -y --no-install-recommends \
            python3 python3-pip python3-dev \
            libjpeg-dev build-essential libssl-dev \
            git-lfs

      - name: Python Dependencies
        id: python_depends
        run: |
          python3 -m pip install --upgrade pip
          pip3 install ./gguf-py

      - name: Swap Endianness
        id: endianness
        if: ${{ matrix.build == 's390x' }}
        run: |
          for f in models/*.gguf; do
            echo YES | python3 gguf-py/gguf/scripts/gguf_convert_endian.py $f big
          done

      - name: Build
        id: cmake_build
        run: |
          cmake -B build \
            -DLLAMA_FATAL_WARNINGS=ON \
            -DGGML_RPC=ON
          time cmake --build build --config Release -j $(nproc)

      - name: Test
        id: cmake_test
        run: |
          cd build
          ctest -L main --verbose --timeout 900

      - name: Test llama2c conversion
        id: llama2c_test
        if: ${{ matrix.build != 's390x' }}
        run: |
          cd build
          echo "Fetch tokenizer"
          wget https://huggingface.co/karpathy/tinyllamas/resolve/main/stories260K/tok512.bin
          echo "Fetch llama2c model"
          wget https://huggingface.co/karpathy/tinyllamas/resolve/main/stories260K/stories260K.bin
          ./bin/llama-convert-llama2c-to-ggml --copy-vocab-from-model ./tok512.bin --llama2c-model stories260K.bin --llama2c-output-model stories260K.gguf
          ./bin/llama-completion -m stories260K.gguf -p "One day, Lily met a Shoggoth" -n 500 -c 256

      - name: Test llama2c (s390x)
        id: llama2c_test_s390x
        if: ${{ matrix.build == 's390x' }}
        run: |
          cd build
          echo "Fetch llama2c big-endian model"
          wget https://huggingface.co/ggml-org/models/resolve/main/tinyllamas/stories260K-be.gguf
          ./bin/llama-completion -m stories260K-be.gguf -p "One day, Lily met a Shoggoth" -n 500 -c 256

  ubuntu-latest-rpc:
    runs-on: ubuntu-latest

    continue-on-error: true

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: Dependencies
        id: depends
        run: |
          sudo apt-get update
          sudo apt-get install build-essential libssl-dev ninja-build

      - name: Build
        id: cmake_build
        run: |
          cmake -B build \
            -G "Ninja" \
            -DCMAKE_BUILD_TYPE=Release \
            -DGGML_RPC=ON
          time cmake --build build --config Release -j $(nproc)

      - name: Test
        id: cmake_test
        run: |
          cd build
          ctest -L main --verbose

  ubuntu-24-vulkan:
    runs-on: ${{ 'ubuntu-24.04-arm' || 'ubuntu-24.04' }}

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: Dependencies
        id: depends
        run: |
          sudo apt-get install -y glslc libvulkan-dev libssl-dev ninja-build

      - name: Configure
        id: cmake_configure
        run: |
          cmake -B build \
            -G "Ninja" \
            -DCMAKE_BUILD_TYPE=RelWithDebInfo \
            -DGGML_BACKEND_DL=ON \
            -DGGML_CPU_ALL_VARIANTS=ON \
            -DGGML_VULKAN=ON

      - name: Build
        id: cmake_build
        run: |
          time cmake --build build -j $(nproc)

  ubuntu-24-webgpu:
    runs-on: ubuntu-24.04

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: ubuntu-24-webgpu
          evict-old-files: 1d
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Dependencies
        id: depends
        run: |
          sudo add-apt-repository -y ppa:kisak/kisak-mesa
          sudo apt-get update -y
          sudo apt-get install -y build-essential mesa-vulkan-drivers \
            libxcb-xinput0 libxcb-xinerama0 libxcb-cursor-dev libssl-dev

      - name: Get latest Vulkan SDK version
        id: vulkan_sdk_version
        run: |
          echo "VULKAN_SDK_VERSION=$(curl https://vulkan.lunarg.com/sdk/latest/linux.txt)" >> "$GITHUB_ENV"

      - name: Use Vulkan SDK Cache
        uses: actions/cache@v5
        id: cache-sdk
        with:
          path: ./vulkan_sdk
          key: vulkan-sdk-${{ env.VULKAN_SDK_VERSION }}-${{ runner.os }}

      - name: Setup Vulkan SDK
        if: steps.cache-sdk.outputs.cache-hit != 'true'
        uses: ./.github/actions/linux-setup-vulkan
        with:
          path: ./vulkan_sdk
          version: ${{ env.VULKAN_SDK_VERSION }}

      - name: Dawn Dependency
        id: dawn-depends
        run: |
          sudo apt-get install -y libxrandr-dev libxinerama-dev libxcursor-dev mesa-common-dev libx11-xcb-dev libxi-dev
          DAWN_VERSION="v2.0.0"
          DAWN_OWNER="reeselevine"
          DAWN_REPO="dawn"
          DAWN_ASSET_NAME="Dawn-5e9a4865b1635796ccc77dd30057f2b4002a1355-ubuntu-latest-Release"
          echo "Fetching release asset from https://github.com/${DAWN_OWNER}/${DAWN_REPO}/releases/download/${DAWN_VERSION}/${DAWN_ASSET_NAME}.zip"
          curl -L -o artifact.zip \
            "https://github.com/${DAWN_OWNER}/${DAWN_REPO}/releases/download/${DAWN_VERSION}/${DAWN_ASSET_NAME}.zip"
          mkdir dawn
          unzip artifact.zip
          tar -xvf ${DAWN_ASSET_NAME}.tar.gz -C dawn --strip-components=1

      - name: Build
        id: cmake_build
        run: |
          export Dawn_DIR=dawn/lib64/cmake/Dawn
          cmake -B build \
            -DGGML_WEBGPU=ON
          time cmake --build build --config Release -j $(nproc)

      - name: Test
        id: cmake_test
        run: |
          cd build
          # This is using llvmpipe and runs slower than other backends
          ctest -L main --verbose --timeout 900

  ubuntu-24-webgpu-wasm:
    runs-on: ${{ 'ubuntu-24.04-arm' || 'ubuntu-24.04' }}

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: Install Emscripten
        run: |
          git clone https://github.com/emscripten-core/emsdk.git
          cd emsdk
          ./emsdk install latest
          ./emsdk activate latest

      - name: Fetch emdawnwebgpu
        run: |
          DAWN_TAG="v20251027.212519"
          EMDAWN_PKG="emdawnwebgpu_pkg-${DAWN_TAG}.zip"
          echo "Downloading ${EMDAWN_PKG}"
          curl -L -o emdawn.zip \
            "https://github.com/google/dawn/releases/download/${DAWN_TAG}/${EMDAWN_PKG}"
          unzip emdawn.zip

      - name: Build WASM WebGPU
        run: |
          source emsdk/emsdk_env.sh
          emcmake cmake -B build-wasm \
            -G "Ninja" \
            -DCMAKE_BUILD_TYPE=Release \
            -DGGML_WEBGPU=ON \
            -DLLAMA_OPENSSL=OFF \
            -DEMDAWNWEBGPU_DIR=emdawnwebgpu_pkg

          time cmake --build build-wasm --config Release --target test-backend-ops -j $(nproc)

  ubuntu-22-hip:
    runs-on: ubuntu-22.04
    container: rocm/dev-ubuntu-22.04:6.1.2

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: Dependencies
        id: depends
        run: |
          sudo apt-get update
          sudo apt-get install -y build-essential git cmake rocblas-dev hipblas-dev libssl-dev rocwmma-dev

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: ubuntu-22-hip
          evict-old-files: 1d
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Build with native CMake HIP support
        id: cmake_build
        run: |
          cmake -B build -S . \
            -DCMAKE_HIP_COMPILER="$(hipconfig -l)/clang" \
            -DGGML_HIP_ROCWMMA_FATTN=ON \
            -DGGML_HIP=ON
          cmake --build build --config Release -j $(nproc)

  ubuntu-22-musa:
    runs-on: ubuntu-22.04
    container: mthreads/musa:rc4.3.0-devel-ubuntu22.04-amd64

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: Dependencies
        id: depends
        run: |
          apt-get update
          apt-get install -y build-essential git cmake libssl-dev

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: ubuntu-22-musa
          evict-old-files: 1d
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Build with native CMake MUSA support
        id: cmake_build
        run: |
          cmake -B build -S . \
            -DGGML_MUSA=ON
          time cmake --build build --config Release -j $(nproc)

  ubuntu-22-sycl:
    runs-on: ubuntu-22.04

    continue-on-error: true

    steps:
      - uses: actions/checkout@v6

      - name: add oneAPI to apt
        shell: bash
        run: |
          cd /tmp
          wget https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB
          sudo apt-key add GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB
          rm GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB
          sudo add-apt-repository "deb https://apt.repos.intel.com/oneapi all main"

      - name: install oneAPI dpcpp compiler
        shell: bash
        run: |
          sudo apt update
          sudo apt install intel-oneapi-compiler-dpcpp-cpp libssl-dev

      - name: install oneAPI MKL library
        shell: bash
        run: |
          sudo apt install intel-oneapi-mkl-devel

      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: ubuntu-22-sycl
          evict-old-files: 1d
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Build
        id: cmake_build
        run: |
          source /opt/intel/oneapi/setvars.sh
          cmake -B build \
            -DGGML_SYCL=ON \
            -DCMAKE_C_COMPILER=icx \
            -DCMAKE_CXX_COMPILER=icpx
          time cmake --build build --config Release -j $(nproc)

  ubuntu-22-sycl-fp16:
    runs-on: ubuntu-22.04

    continue-on-error: true

    steps:
      - uses: actions/checkout@v6

      - name: add oneAPI to apt
        shell: bash
        run: |
          cd /tmp
          wget https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB
          sudo apt-key add GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB
          rm GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB
          sudo add-apt-repository "deb https://apt.repos.intel.com/oneapi all main"

      - name: install oneAPI dpcpp compiler
        shell: bash
        run: |
          sudo apt update
          sudo apt install intel-oneapi-compiler-dpcpp-cpp libssl-dev ninja-build

      - name: install oneAPI MKL library
        shell: bash
        run: |
          sudo apt install intel-oneapi-mkl-devel

      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: ubuntu-22-sycl-fp16
          evict-old-files: 1d
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Build
        id: cmake_build
        run: |
          source /opt/intel/oneapi/setvars.sh
          cmake -B build \
            -G "Ninja" \
            -DCMAKE_BUILD_TYPE=Release \
            -DGGML_SYCL=ON \
            -DCMAKE_C_COMPILER=icx \
            -DCMAKE_CXX_COMPILER=icpx \
            -DGGML_SYCL_F16=ON
          time cmake --build build --config Release -j $(nproc)

  ubuntu-24-openvino:
      name: ubuntu-24-openvino-${{ matrix.openvino_device }}
      strategy:
        matrix:
          include:
            - variant: cpu
              runner: '"ubuntu-24.04"'
              openvino_device: "CPU"
            - variant: gpu
              runner: '["self-hosted","Linux","X64","Intel"]'
              openvino_device: "GPU"

      runs-on: ${{ fromJSON(matrix.runner) }}

      env:
        # Sync versions in build.yml, build-self-hosted.yml, release.yml, build-cache.yml, .devops/openvino.Dockerfile
        OPENVINO_VERSION_MAJOR: "2026.0"
        OPENVINO_VERSION_FULL: "2026.0.0.20965.c6d6a13a886"

      steps:
        - name: Clone
          id: checkout
          uses: actions/checkout@v6

        - name: ccache
          if: runner.environment == 'github-hosted'
          uses: ggml-org/ccache-action@v1.2.21
          with:
            key: ubuntu-24-openvino-${{ matrix.variant }}-no-preset-v1
            evict-old-files: 1d
            save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

        - name: Dependencies
          id: depends
          run: |
            sudo apt-get update
            sudo apt-get install -y build-essential libssl-dev libtbb12 cmake ninja-build python3-pip
            sudo apt-get install -y ocl-icd-opencl-dev opencl-headers opencl-clhpp-headers intel-opencl-icd

        - name: Use OpenVINO Toolkit Cache
          if: runner.environment == 'github-hosted'
          uses: actions/cache@v5
          id: cache-openvino
          with:
            path: ./openvino_toolkit
            key: openvino-toolkit-v${{ env.OPENVINO_VERSION_FULL }}-${{ runner.os }}

        - name: Setup OpenVINO Toolkit
          if: steps.cache-openvino.outputs.cache-hit != 'true'
          uses: ./.github/actions/linux-setup-openvino
          with:
            path: ./openvino_toolkit
            version_major: ${{ env.OPENVINO_VERSION_MAJOR }}
            version_full: ${{ env.OPENVINO_VERSION_FULL }}

        - name: Install OpenVINO dependencies
          run: |
            cd ./openvino_toolkit
            chmod +x ./install_dependencies/install_openvino_dependencies.sh
            echo "Y" | sudo -E ./install_dependencies/install_openvino_dependencies.sh

        - name: Build
          id: cmake_build
          run: |
            source ./openvino_toolkit/setupvars.sh
            cmake -B build/ReleaseOV -G Ninja \
              -DCMAKE_BUILD_TYPE=Release \
              -DGGML_OPENVINO=ON
            time cmake --build build/ReleaseOV --config Release -j $(nproc)

        - name: Test
          id: cmake_test
          # TODO: fix and re-enable the `test-llama-archs` test below
          run: |
            cd ${{ github.workspace }}
            if [ "${{ matrix.openvino_device }}" = "GPU" ]; then
              export GGML_OPENVINO_DEVICE=GPU
            fi
            ctest --test-dir build/ReleaseOV -L main -E "test-llama-archs" --verbose --timeout 2000

  windows-latest:
    runs-on: windows-2025

    env:
      OPENBLAS_VERSION: 0.3.23
      SDE_VERSION: 9.33.0-2024-01-07
      VULKAN_VERSION: 1.4.313.2

    strategy:
      matrix:
        include:
          - build: 'cpu-x64 (static)'
            arch: 'x64'
            defines: '-G "Ninja Multi-Config" -D CMAKE_TOOLCHAIN_FILE=cmake/x64-windows-llvm.cmake -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DBUILD_SHARED_LIBS=OFF'
          - build: 'openblas-x64'
            arch: 'x64'
            defines: '-G "Ninja Multi-Config" -D CMAKE_TOOLCHAIN_FILE=cmake/x64-windows-llvm.cmake -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML_BACKEND_DL=ON -DGGML_CPU_ALL_VARIANTS=ON -DGGML_OPENMP=OFF -DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS -DBLAS_INCLUDE_DIRS="$env:RUNNER_TEMP/openblas/include" -DBLAS_LIBRARIES="$env:RUNNER_TEMP/openblas/lib/openblas.lib"'
          - build: 'vulkan-x64'
            arch: 'x64'
            defines: '-DCMAKE_BUILD_TYPE=Release -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML_BACKEND_DL=ON -DGGML_CPU_ALL_VARIANTS=ON -DGGML_VULKAN=ON'
          - build: 'llvm-arm64'
            arch: 'arm64'
            defines: '-G "Ninja Multi-Config" -D CMAKE_TOOLCHAIN_FILE=cmake/arm64-windows-llvm.cmake -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON'
          - build: 'llvm-arm64-opencl-adreno'
            arch: 'arm64'
            defines: '-G "Ninja Multi-Config" -D CMAKE_TOOLCHAIN_FILE=cmake/arm64-windows-llvm.cmake -DCMAKE_PREFIX_PATH="$env:RUNNER_TEMP/opencl-arm64-release" -DGGML_OPENCL=ON -DGGML_OPENCL_USE_ADRENO_KERNELS=ON'

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: windows-latest-${{ matrix.build }}
          variant: ccache
          evict-old-files: 1d
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Download OpenBLAS
        id: get_openblas
        if: ${{ matrix.build == 'openblas-x64' }}
        run: |
          curl.exe -o $env:RUNNER_TEMP/openblas.zip -L "https://github.com/xianyi/OpenBLAS/releases/download/v${env:OPENBLAS_VERSION}/OpenBLAS-${env:OPENBLAS_VERSION}-x64.zip"
          curl.exe -o $env:RUNNER_TEMP/OpenBLAS.LICENSE.txt -L "https://github.com/xianyi/OpenBLAS/raw/v${env:OPENBLAS_VERSION}/LICENSE"
          mkdir $env:RUNNER_TEMP/openblas
          tar.exe -xvf $env:RUNNER_TEMP/openblas.zip -C $env:RUNNER_TEMP/openblas
          $vcdir = $(vswhere -latest -products * -requires Microsoft.VisualStudio.Component.VC.Tools.x86.x64 -property installationPath)
          $msvc = $(join-path $vcdir $('VC\Tools\MSVC\'+$(gc -raw $(join-path $vcdir 'VC\Auxiliary\Build\Microsoft.VCToolsVersion.default.txt')).Trim()))
          $lib =  $(join-path $msvc 'bin\Hostx64\x64\lib.exe')
          & $lib /machine:x64 "/def:${env:RUNNER_TEMP}/openblas/lib/libopenblas.def" "/out:${env:RUNNER_TEMP}/openblas/lib/openblas.lib" /name:openblas.dll

      - name: Install Vulkan SDK
        id: get_vulkan
        if: ${{ matrix.build == 'vulkan-x64' }}
        run: |
          curl.exe -o $env:RUNNER_TEMP/VulkanSDK-Installer.exe -L "https://sdk.lunarg.com/sdk/download/${env:VULKAN_VERSION}/windows/vulkansdk-windows-X64-${env:VULKAN_VERSION}.exe"
          & "$env:RUNNER_TEMP\VulkanSDK-Installer.exe" --accept-licenses --default-answer --confirm-command install
          Add-Content $env:GITHUB_ENV "VULKAN_SDK=C:\VulkanSDK\${env:VULKAN_VERSION}"
          Add-Content $env:GITHUB_PATH "C:\VulkanSDK\${env:VULKAN_VERSION}\bin"

      - name: Install Ninja
        id: install_ninja
        run: |
          choco install ninja

      - name: Install OpenCL Headers and Libs
        id: install_opencl
        if: ${{ matrix.build == 'llvm-arm64-opencl-adreno' }}
        run: |
          git clone https://github.com/KhronosGroup/OpenCL-Headers
          cd OpenCL-Headers
          cmake -B build `
            -DBUILD_TESTING=OFF `
            -DOPENCL_HEADERS_BUILD_TESTING=OFF `
            -DOPENCL_HEADERS_BUILD_CXX_TESTS=OFF `
            -DCMAKE_INSTALL_PREFIX="$env:RUNNER_TEMP/opencl-arm64-release"
          cmake --build build --target install
          git clone https://github.com/KhronosGroup/OpenCL-ICD-Loader
          cd OpenCL-ICD-Loader
          cmake -B build-arm64-release `
            -A arm64 `
            -DCMAKE_PREFIX_PATH="$env:RUNNER_TEMP/opencl-arm64-release" `
            -DCMAKE_INSTALL_PREFIX="$env:RUNNER_TEMP/opencl-arm64-release"
          cmake --build build-arm64-release --target install --config release

      - name: Build
        id: cmake_build
        run: |
          cmake -S . -B build ${{ matrix.defines }} `
            -DLLAMA_BUILD_BORINGSSL=ON
          cmake --build build --config Release -j ${env:NUMBER_OF_PROCESSORS}

      - name: Add libopenblas.dll
        id: add_libopenblas_dll
        if: ${{ matrix.build == 'openblas-x64' }}
        run: |
          cp $env:RUNNER_TEMP/openblas/bin/libopenblas.dll ./build/bin/Release/openblas.dll
          cp $env:RUNNER_TEMP/OpenBLAS.LICENSE.txt ./build/bin/Release/OpenBLAS-${env:OPENBLAS_VERSION}.txt

      - name: Test
        id: cmake_test
        if: ${{ matrix.arch == 'x64' }}
        run: |
          cd build
          ctest -L main -C Release --verbose --timeout 900

      # TODO: disabled for now, consider adding tests for all CPU variants instead
      # - name: Test (Intel SDE)
      #   id: cmake_test_sde
      #   if: ${{ matrix.build == 'avx512-x64' && env.HAS_AVX512F == '0' }} # use Intel SDE for AVX-512 emulation
      #   run: |
      #     curl.exe -o $env:RUNNER_TEMP/sde.tar.xz -L "https://downloadmirror.intel.com/813591/sde-external-${env:SDE_VERSION}-win.tar.xz"
      #     # for some weird reason windows tar doesn't like sde tar.xz
      #     7z x "-o${env:RUNNER_TEMP}" $env:RUNNER_TEMP/sde.tar.xz
      #     7z x "-o${env:RUNNER_TEMP}" $env:RUNNER_TEMP/sde.tar
      #     $sde = $(join-path $env:RUNNER_TEMP sde-external-${env:SDE_VERSION}-win/sde.exe)
      #     cd build
      #     $env:LLAMA_SKIP_TESTS_SLOW_ON_EMULATOR = 1
      #     & $sde -future -- ctest -L main -C Release --verbose --timeout 900

  ubuntu-latest-cuda:
    runs-on: ubuntu-latest
    container: nvidia/cuda:12.6.2-devel-ubuntu24.04

    steps:
        - name: Clone
          id: checkout
          uses: actions/checkout@v6

        - name: Install dependencies
          env:
            DEBIAN_FRONTEND: noninteractive
          run: |
              apt update
              apt install -y cmake build-essential ninja-build libgomp1 git libssl-dev

        - name: ccache
          uses: ggml-org/ccache-action@v1.2.21
          with:
            key: ubuntu-latest-cuda
            evict-old-files: 1d
            save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

        - name: Build with CMake
          # TODO: Remove GGML_CUDA_CUB_3DOT2 flag once CCCL 3.2 is bundled within CTK and that CTK version is used in this project
          run: |
            cmake -S . -B build -G Ninja \
              -DLLAMA_FATAL_WARNINGS=ON \
              -DCMAKE_BUILD_TYPE=Release \
              -DCMAKE_CUDA_ARCHITECTURES=89-real \
              -DCMAKE_EXE_LINKER_FLAGS=-Wl,--allow-shlib-undefined \
              -DGGML_NATIVE=OFF \
              -DGGML_CUDA=ON \
              -DGGML_CUDA_CUB_3DOT2=ON
            cmake --build build

  windows-2022-cuda:
    runs-on: windows-2022

    strategy:
      matrix:
        cuda: ['12.4']

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: Install ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: windows-cuda-${{ matrix.cuda }}
          variant: ccache
          evict-old-files: 1d
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Install Cuda Toolkit
        uses: ./.github/actions/windows-setup-cuda
        with:
          cuda_version: ${{ matrix.cuda }}

      - name: Install Ninja
        id: install_ninja
        run: |
          choco install ninja

      - name: Build
        id: cmake_build
        shell: cmd
        # TODO: Remove GGML_CUDA_CUB_3DOT2 flag once CCCL 3.2 is bundled within CTK and that CTK version is used in this project
        run: |
          call "C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Auxiliary\Build\vcvarsall.bat" x64
          cmake -S . -B build -G "Ninja Multi-Config" ^
            -DLLAMA_BUILD_SERVER=ON ^
            -DLLAMA_BUILD_BORINGSSL=ON ^
            -DGGML_NATIVE=OFF ^
            -DGGML_BACKEND_DL=ON ^
            -DGGML_CPU_ALL_VARIANTS=ON ^
            -DGGML_CUDA=ON ^
            -DGGML_RPC=ON ^
            -DGGML_CUDA_CUB_3DOT2=ON
          set /A NINJA_JOBS=%NUMBER_OF_PROCESSORS%-1
          cmake --build build --config Release -j %NINJA_JOBS% -t ggml
          cmake --build build --config Release

  windows-latest-sycl:
    runs-on: windows-2022

    defaults:
      run:
        shell: bash

    env:
      WINDOWS_BASEKIT_URL: https://registrationcenter-download.intel.com/akdlm/IRC_NAS/24751ead-ddc5-4479-b9e6-f9fe2ff8b9f2/intel-deep-learning-essentials-2025.2.1.25_offline.exe
      WINDOWS_DPCPP_MKL: intel.oneapi.win.cpp-dpcpp-common:intel.oneapi.win.mkl.devel:intel.oneapi.win.dnnl:intel.oneapi.win.tbb.devel
      ONEAPI_ROOT: "C:/Program Files (x86)/Intel/oneAPI"
    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: windows-latest-sycl
          variant: ccache
          evict-old-files: 1d
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Install
        run:  |
          scripts/install-oneapi.bat $WINDOWS_BASEKIT_URL $WINDOWS_DPCPP_MKL

      # TODO: add ssl support ; we will also need to modify win-build-sycl.bat to accept user-specified args

      - name: Build
        id: cmake_build
        run:  examples/sycl/win-build-sycl.bat

  windows-latest-hip:
    runs-on: windows-2022

    env:
      # Make sure this is in sync with build-cache.yml
      HIPSDK_INSTALLER_VERSION: "26.Q1"

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: Grab rocWMMA package
        id: grab_rocwmma
        run: |
          curl -o rocwmma.deb "https://repo.radeon.com/rocm/apt/7.2/pool/main/r/rocwmma-dev/rocwmma-dev_2.2.0.70200-43~24.04_amd64.deb"
          7z x rocwmma.deb
          7z x data.tar

      - name: Use ROCm Installation Cache
        uses: actions/cache@v5
        id: cache-rocm
        with:
          path: C:\Program Files\AMD\ROCm
          key: rocm-${{ env.HIPSDK_INSTALLER_VERSION }}-${{ runner.os }}

      - name: Setup ROCm
        if: steps.cache-rocm.outputs.cache-hit != 'true'
        uses: ./.github/actions/windows-setup-rocm
        with:
          version: ${{ env.HIPSDK_INSTALLER_VERSION }}

      - name: Verify ROCm
        id: verify
        run: |
          # Find and test ROCm installation
          $clangPath = Get-ChildItem 'C:\Program Files\AMD\ROCm\*\bin\clang.exe' | Select-Object -First 1
          if (-not $clangPath) {
            Write-Error "ROCm installation not found"
            exit 1
          }
          & $clangPath.FullName --version

      - name: Install ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: ${{ github.job }}
          evict-old-files: 1d
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Build
        id: cmake_build
        run: |
          $env:HIP_PATH=$(Resolve-Path 'C:\Program Files\AMD\ROCm\*\bin\clang.exe' | split-path | split-path)
          $env:CMAKE_PREFIX_PATH="${env:HIP_PATH}"
          cmake -G "Unix Makefiles" -B build -S . `
            -DCMAKE_C_COMPILER="${env:HIP_PATH}\bin\clang.exe" `
            -DCMAKE_CXX_COMPILER="${env:HIP_PATH}\bin\clang++.exe" `
            -DCMAKE_CXX_FLAGS="-I$($PWD.Path.Replace('\', '/'))/opt/rocm-7.2.0/include/" `
            -DCMAKE_BUILD_TYPE=Release `
            -DLLAMA_BUILD_BORINGSSL=ON `
            -DROCM_DIR="${env:HIP_PATH}" `
            -DGGML_HIP=ON `
            -DGGML_HIP_ROCWMMA_FATTN=ON `
            -DGGML_RPC=ON
          cmake --build build -j ${env:NUMBER_OF_PROCESSORS}

  ubuntu-cpu-riscv64-native:
    runs-on: RISCV64

    steps:
      - name: Install dependencies
        run: |
          sudo apt-get update

          # Install necessary packages
          sudo apt-get install -y libatomic1 libtsan2 gcc-14 g++-14 rustup cmake build-essential libssl-dev wget ccache git-lfs

          # Set gcc-14 and g++-14 as the default compilers
          sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-14 100
          sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-14 100
          sudo ln -sf /usr/bin/gcc-14 /usr/bin/gcc
          sudo ln -sf /usr/bin/g++-14 /usr/bin/g++

          # Install Rust stable version
          rustup install stable
          rustup default stable

          git lfs install

      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: Check environment
        run: |
          uname -a
          gcc --version
          g++ --version
          ldd --version
          cmake --version
          rustc --version

      - name: Setup ccache
        run: |
          # Set unique cache directory for this job
          export CCACHE_DIR="$HOME/.ccache/cpu-cmake-rv64-native"
          mkdir -p "$CCACHE_DIR"

          # Configure ccache for optimal performance
          ccache --set-config=max_size=5G
          ccache --set-config=compression=true
          ccache --set-config=compression_level=6
          ccache --set-config=cache_dir="$CCACHE_DIR"

          # Enable more aggressive caching
          ccache --set-config=sloppiness=file_macro,time_macros,include_file_mtime,include_file_ctime
          ccache --set-config=hash_dir=false

          # Export for subsequent steps
          echo "CCACHE_DIR=$CCACHE_DIR" >> $GITHUB_ENV
          echo "PATH=/usr/lib/ccache:$PATH" >> $GITHUB_ENV

      - name: Build
        id: cmake_build
        run: |
          cmake -B build \
            -DCMAKE_BUILD_TYPE=Release \
            -DGGML_OPENMP=OFF \
            -DLLAMA_BUILD_EXAMPLES=ON \
            -DLLAMA_BUILD_TOOLS=ON \
            -DLLAMA_BUILD_TESTS=ON \
            -DCMAKE_C_COMPILER_LAUNCHER=ccache \
            -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
            -DGGML_RPC=ON \
            -DCMAKE_C_COMPILER=riscv64-linux-gnu-gcc-14 \
            -DCMAKE_CXX_COMPILER=riscv64-linux-gnu-g++-14

          time cmake --build build --config Release -j $(nproc)

      - name: Test
        id: cmake_test
        run: |
          cd build
          ctest -L main --verbose --timeout 900

      - name: Test llama2c conversion
        id: llama2c_test
        run: |
          cd build
          echo "Fetch tokenizer"
          wget https://huggingface.co/karpathy/tinyllamas/resolve/main/stories260K/tok512.bin
          echo "Fetch llama2c model"
          wget https://huggingface.co/karpathy/tinyllamas/resolve/main/stories260K/stories260K.bin
          ./bin/llama-convert-llama2c-to-ggml --copy-vocab-from-model ./tok512.bin --llama2c-model stories260K.bin --llama2c-output-model stories260K.gguf
          ./bin/llama-completion -m stories260K.gguf -p "One day, Lily met a Shoggoth" -n 500 -c 256

# TODO: simplify the following workflows using a matrix
# TODO: run lighter CI on PRs and the full CI only on master (if needed)
  ggml-ci-x64-cpu-low-perf:
    runs-on: ubuntu-22.04

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: ggml-ci-x64-cpu-low-perf
          evict-old-files: 1d
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Dependencies
        id: depends
        run: |
          sudo apt-get update
          sudo apt-get install build-essential

      - name: Test
        id: ggml-ci
        run: |
          LLAMA_ARG_THREADS=$(nproc) GG_BUILD_LOW_PERF=1 bash ./ci/run.sh ./tmp/results ./tmp/mnt

  ggml-ci-arm64-cpu-low-perf:
    runs-on: ubuntu-22.04-arm

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: ggml-ci-arm64-cpu-low-perf
          evict-old-files: 1d
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Dependencies
        id: depends
        run: |
          sudo apt-get update
          sudo apt-get install build-essential

      - name: Test
        id: ggml-ci
        run: |
          LLAMA_ARG_THREADS=$(nproc) GG_BUILD_LOW_PERF=1 bash ./ci/run.sh ./tmp/results ./tmp/mnt

  ggml-ci-x64-cpu-high-perf:
    runs-on: ubuntu-22.04

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: ggml-ci-x64-cpu-high-perf
          evict-old-files: 1d
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Dependencies
        id: depends
        run: |
          sudo apt-get update
          sudo apt-get install build-essential

      - name: Test
        id: ggml-ci
        run: |
          LLAMA_ARG_THREADS=$(nproc) GG_BUILD_HIGH_PERF=1 bash ./ci/run.sh ./tmp/results ./tmp/mnt

  ggml-ci-arm64-cpu-high-perf:
    runs-on: ubuntu-22.04-arm

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: ggml-ci-arm64-cpu-high-perf
          evict-old-files: 1d
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Dependencies
        id: depends
        run: |
          sudo apt-get update
          sudo apt-get install build-essential

      - name: Test
        id: ggml-ci
        run: |
          LLAMA_ARG_THREADS=$(nproc) GG_BUILD_HIGH_PERF=1 GG_BUILD_NO_SVE=1 GG_BUILD_NO_BF16=1 GG_BUILD_EXTRA_TESTS_0=1 bash ./ci/run.sh ./tmp/results ./tmp/mnt

  ggml-ci-arm64-cpu-high-perf-sve:
    runs-on: ubuntu-22.04-arm

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: ggml-ci-arm64-cpu-high-perf-sve
          evict-old-files: 1d
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Dependencies
        id: depends
        run: |
          sudo apt-get update
          sudo apt-get install build-essential

      - name: Test
        id: ggml-ci
        run: |
          LLAMA_ARG_THREADS=$(nproc) GG_BUILD_NO_BF16=1 GG_BUILD_EXTRA_TESTS_0=1 bash ./ci/run.sh ./tmp/results ./tmp/mnt

  ggml-ci-arm64-cpu-kleidiai:
     runs-on: ubuntu-22.04-arm

     steps:
       - name: Clone
         id: checkout
         uses: actions/checkout@v6

       - name: ccache
         uses: ggml-org/ccache-action@v1.2.21
         with:
           key: ggml-ci-arm64-cpu-kleidiai
           evict-old-files: 1d
           save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

       - name: Dependencies
         id: depends
         run: |
           sudo apt-get update
           sudo apt-get install -y build-essential

       - name: Test
         id: ggml-ci
         run: |
           GG_BUILD_KLEIDIAI=1 GG_BUILD_EXTRA_TESTS_0=1 bash ./ci/run.sh ./tmp/results ./tmp/mnt

  ggml-ci-arm64-cpu-kleidiai-graviton4:
     runs-on: ah-ubuntu_22_04-c8g_8x

     steps:
       - name: Clone
         id: checkout
         uses: actions/checkout@v6

       - name: Dependencies
         id: depends
         run: |
           set -euxo pipefail
           sudo apt-get update
           sudo DEBIAN_FRONTEND=noninteractive NEEDRESTART_MODE=a \
           apt-get install -y \
            build-essential \
            python3-venv \
            gpg \
            wget \
            time \
            git-lfs

           git lfs install

           # install the latest cmake
           sudo install -d /usr/share/keyrings
           wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc \
            | gpg --dearmor \
            | sudo tee /usr/share/keyrings/kitware-archive-keyring.gpg >/dev/null
           echo 'deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ jammy main' \
            | sudo tee /etc/apt/sources.list.d/kitware.list
           sudo apt-get update
           sudo apt-get install -y cmake

       - name: ccache
         uses: ggml-org/ccache-action@v1.2.21
         with:
           key: ggml-ci-arm64-cpu-kleidiai-graviton4
           evict-old-files: 1d
           save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

       - name: Test
         id: ggml-ci
         run: |
           GG_BUILD_KLEIDIAI=1 \
           GG_BUILD_EXTRA_TESTS_0=1 \
           bash ./ci/run.sh ./tmp/results ./tmp/mnt

build-3rd-party .github/workflows/build-3rd-party.yml

Triggers

workflow_dispatch, push

Runs on

${{ 'ubuntu-24.04-arm' || 'ubuntu-24.04' }}

Jobs

ubuntu-24-llguidance

Commands

sudo apt-get update sudo apt-get install build-essential libssl-dev
cmake -B build \ -DLLAMA_FATAL_WARNINGS=ON \ -DLLAMA_LLGUIDANCE=ON cmake --build build --config Release -j $(nproc)
cd build ctest -L main --verbose --timeout 900

View raw YAML

name: CI (3rd-party)

on:
  workflow_dispatch: # allows manual triggering
  push:
    branches:
      - master
    paths: [
      '.github/workflows/build-3rd-party.yml',
      '**/CMakeLists.txt',
      '**/.cmake',
      '**/*.h',
      '**/*.hpp',
      '**/*.c',
      '**/*.cpp'
    ]

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}
  cancel-in-progress: true

env:
  GGML_NLOOP: 3
  GGML_N_THREADS: 1
  LLAMA_LOG_COLORS: 1
  LLAMA_LOG_PREFIX: 1
  LLAMA_LOG_TIMESTAMPS: 1

jobs:
  ubuntu-24-llguidance:
    runs-on: ${{ 'ubuntu-24.04-arm' || 'ubuntu-24.04' }}

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: Dependencies
        id: depends
        run: |
          sudo apt-get update
          sudo apt-get install build-essential libssl-dev

      - name: Build
        id: cmake_build
        run: |
          cmake -B build \
            -DLLAMA_FATAL_WARNINGS=ON \
            -DLLAMA_LLGUIDANCE=ON
          cmake --build build --config Release -j $(nproc)

      - name: Test
        id: cmake_test
        run: |
          cd build
          ctest -L main --verbose --timeout 900

build-android matrix .github/workflows/build-android.yml

Triggers

workflow_dispatch, push, pull_request

Runs on

ubuntu-latest, ubuntu-latest

Jobs

android, android-ndk

Matrix

include, include.build, include.defines→ --preset arm64-android-snapdragon-release, -D ANDROID_ABI=arm64-v8a -D ANDROID_PLATFORM=android-31 -D CMAKE_TOOLCHAIN_FILE=${ANDROID_NDK_ROOT}/build/cmake/android.toolchain.cmake -D GGML_NATIVE=OFF -DGGML_CPU_ARM_ARCH=armv8.5-a+fp16+i8mm -G Ninja -D LLAMA_OPENSSL=OFF -D GGML_OPENMP=OFF, arm64-cpu, arm64-snapdragon

Actions

android-actions/setup-android

Commands

cd examples/llama.android ./gradlew build --no-daemon
if [[ "${{ matrix.build }}" == "arm64-snapdragon" ]]; then cp docs/backend/snapdragon/CMakeUserPresets.json . fi cmake ${{ matrix.defines }} -B build cmake --build build cmake --install build --prefix pkg-adb/llama.cpp

View raw YAML

name: CI (android)

on:
  workflow_dispatch: # allows manual triggering
  push:
    branches:
      - master
    paths: [
      '.github/workflows/build-android.yml',
      '**/CMakeLists.txt',
      '**/.cmake',
      '**/*.h',
      '**/*.hpp',
      '**/*.c',
      '**/*.cpp'
    ]

  pull_request:
    types: [opened, synchronize, reopened]
    paths: [
      '.github/workflows/build-android.yml',
      'examples/llama.android/**'
    ]

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}
  cancel-in-progress: true

env:
  GGML_NLOOP: 3
  GGML_N_THREADS: 1
  LLAMA_LOG_COLORS: 1
  LLAMA_LOG_PREFIX: 1
  LLAMA_LOG_TIMESTAMPS: 1

jobs:
  android:
    runs-on: ubuntu-latest

    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          fetch-depth: 0
          lfs: false

      - name: Set up JDK
        uses: actions/setup-java@v5
        with:
          java-version: 17
          distribution: zulu

      - name: Setup Android SDK
        uses: android-actions/setup-android@9fc6c4e9069bf8d3d10b2204b1fb8f6ef7065407 # v3
        with:
          log-accepted-android-sdk-licenses: false

      - name: Build
        run: |
          cd examples/llama.android
          ./gradlew build --no-daemon

  android-ndk:
    runs-on: ubuntu-latest
    container:
      image: 'ghcr.io/snapdragon-toolchain/arm64-android:v0.3'
    defaults:
      run:
        shell: bash
    strategy:
      matrix:
        include:
          - build: 'arm64-cpu'
            defines: '-D ANDROID_ABI=arm64-v8a -D ANDROID_PLATFORM=android-31 -D CMAKE_TOOLCHAIN_FILE=${ANDROID_NDK_ROOT}/build/cmake/android.toolchain.cmake -D GGML_NATIVE=OFF -DGGML_CPU_ARM_ARCH=armv8.5-a+fp16+i8mm -G Ninja -D LLAMA_OPENSSL=OFF -D GGML_OPENMP=OFF'
          - build: 'arm64-snapdragon'
            defines: '--preset arm64-android-snapdragon-release'

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6
        with:
          fetch-depth: 0
          lfs: false

      - name: Build Llama.CPP for Hexagon Android
        id: build_llama_cpp_hexagon_android
        run: |
          if [[ "${{ matrix.build }}" == "arm64-snapdragon" ]]; then
            cp docs/backend/snapdragon/CMakeUserPresets.json .
          fi
          cmake ${{ matrix.defines }} -B build
          cmake --build build
          cmake --install build --prefix pkg-adb/llama.cpp

      - name: Upload Llama.CPP Hexagon Android Build Artifact
        if: ${{ always() && steps.build_llama_cpp_hexagon_android.outcome == 'success' }}
        uses: actions/upload-artifact@v6
        with:
          name: llama-cpp-android-${{ matrix.build }}
          path: pkg-adb/llama.cpp

build-apple matrix .github/workflows/build-apple.yml

Triggers

workflow_dispatch, push, pull_request

Runs on

macos-latest, macos-latest, macos-latest, macos-latest, macos-latest

Jobs

macOS-latest-ios, macos-latest-ios-xcode, macOS-latest-tvos, macOS-latest-visionos, macOS-latest-swift

Matrix

destination→ generic/platform=iOS, generic/platform=macOS, generic/platform=tvOS

Actions

ggml-org/ccache-action, ggml-org/setup-xcode, ggml-org/ccache-action, ggml-org/ccache-action

Commands

sysctl -a cmake -B build -G Xcode \ -DGGML_METAL_USE_BF16=ON \ -DGGML_METAL_EMBED_LIBRARY=ON \ -DLLAMA_BUILD_COMMON=OFF \ -DLLAMA_BUILD_EXAMPLES=OFF \ -DLLAMA_BUILD_TOOLS=OFF \ -DLLAMA_BUILD_TESTS=OFF \ -DLLAMA_BUILD_SERVER=OFF \ -DCMAKE_SYSTEM_NAME=iOS \ -DCMAKE_OSX_DEPLOYMENT_TARGET=14.0 \ -DCMAKE_XCODE_ATTRIBUTE_DEVELOPMENT_TEAM=ggml cmake --build build --config Release -j $(sysctl -n hw.logicalcpu) -- CODE_SIGNING_ALLOWED=NO
sysctl -a cmake -B build -G Xcode \ -DGGML_METAL_USE_BF16=ON \ -DGGML_METAL_EMBED_LIBRARY=ON \ -DLLAMA_OPENSSL=OFF \ -DLLAMA_BUILD_EXAMPLES=OFF \ -DLLAMA_BUILD_TOOLS=OFF \ -DLLAMA_BUILD_TESTS=OFF \ -DLLAMA_BUILD_SERVER=OFF \ -DCMAKE_SYSTEM_NAME=iOS \ -DCMAKE_OSX_DEPLOYMENT_TARGET=14.0 \ -DCMAKE_XCODE_ATTRIBUTE_DEVELOPMENT_TEAM=ggml cmake --build build --config Release -j $(sysctl -n hw.logicalcpu) -- CODE_SIGNING_ALLOWED=NO
./build-xcframework.sh
xcodebuild -downloadPlatform iOS xcodebuild -project examples/llama.swiftui/llama.swiftui.xcodeproj -scheme llama.swiftui -sdk iphoneos CODE_SIGNING_REQUIRED=NO CODE_SIGN_IDENTITY= -destination 'generic/platform=iOS' FRAMEWORK_FOLDER_PATH=./build-ios build
sysctl -a cmake -B build -G Xcode \ -DGGML_METAL_USE_BF16=ON \ -DGGML_METAL_EMBED_LIBRARY=ON \ -DLLAMA_BUILD_COMMON=OFF \ -DLLAMA_BUILD_EXAMPLES=OFF \ -DLLAMA_BUILD_TOOLS=OFF \ -DLLAMA_BUILD_TESTS=OFF \ -DLLAMA_BUILD_SERVER=OFF \ -DCMAKE_SYSTEM_NAME=tvOS \ -DCMAKE_OSX_DEPLOYMENT_TARGET=14.0 \ -DCMAKE_XCODE_ATTRIBUTE_DEVELOPMENT_TEAM=ggml cmake --build build --config Release -j $(sysctl -n hw.logicalcpu) -- CODE_SIGNING_ALLOWED=NO
sysctl -a cmake -B build -G Xcode \ -DGGML_METAL_USE_BF16=ON \ -DGGML_METAL_EMBED_LIBRARY=ON \ -DLLAMA_BUILD_COMMON=OFF \ -DLLAMA_BUILD_EXAMPLES=OFF \ -DLLAMA_BUILD_TOOLS=OFF \ -DLLAMA_BUILD_TESTS=OFF \ -DLLAMA_BUILD_SERVER=OFF \ -DCMAKE_SYSTEM_NAME=visionOS \ -DCMAKE_OSX_DEPLOYMENT_TARGET=1.0 \ -DCMAKE_XCODE_ATTRIBUTE_DEVELOPMENT_TEAM=ggml cmake --build build --config Release -j $(sysctl -n hw.logicalcpu) -- CODE_SIGNING_ALLOWED=NO
sysctl -a cmake -B build -G Xcode \ -DGGML_METAL_USE_BF16=ON \ -DGGML_METAL_EMBED_LIBRARY=ON \ -DLLAMA_OPENSSL=OFF \ -DLLAMA_BUILD_EXAMPLES=OFF \ -DLLAMA_BUILD_TOOLS=OFF \ -DLLAMA_BUILD_TESTS=OFF \ -DLLAMA_BUILD_SERVER=OFF \ -DCMAKE_OSX_ARCHITECTURES="arm64;x86_64" cmake --build build --config Release -j $(sysctl -n hw.logicalcpu)

View raw YAML

name: CI (apple)

on:
  workflow_dispatch: # allows manual triggering
  push:
    branches:
      - master
    paths: [
      '.github/workflows/build-apple.yml',
      '**/CMakeLists.txt',
      '**/.cmake',
      '**/*.h',
      '**/*.hpp',
      '**/*.c',
      '**/*.cpp',
      '**/*.swift',
      '**/*.m',
      '**/*.metal'
    ]

  pull_request:
    types: [opened, synchronize, reopened]
    paths: [
      '.github/workflows/build-apple.yml',
      'ggml/src/ggml-metal/**'
    ]

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}
  cancel-in-progress: true

env:
  GGML_NLOOP: 3
  GGML_N_THREADS: 1
  LLAMA_LOG_COLORS: 1
  LLAMA_LOG_PREFIX: 1
  LLAMA_LOG_TIMESTAMPS: 1

jobs:
  macOS-latest-ios:
    runs-on: macos-latest

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: macOS-latest-ios
          evict-old-files: 1d
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Build
        id: cmake_build
        run: |
          sysctl -a
          cmake -B build -G Xcode \
            -DGGML_METAL_USE_BF16=ON \
            -DGGML_METAL_EMBED_LIBRARY=ON \
            -DLLAMA_BUILD_COMMON=OFF \
            -DLLAMA_BUILD_EXAMPLES=OFF \
            -DLLAMA_BUILD_TOOLS=OFF \
            -DLLAMA_BUILD_TESTS=OFF \
            -DLLAMA_BUILD_SERVER=OFF \
            -DCMAKE_SYSTEM_NAME=iOS \
            -DCMAKE_OSX_DEPLOYMENT_TARGET=14.0 \
            -DCMAKE_XCODE_ATTRIBUTE_DEVELOPMENT_TEAM=ggml
          cmake --build build --config Release -j $(sysctl -n hw.logicalcpu) -- CODE_SIGNING_ALLOWED=NO

  macos-latest-ios-xcode:
    runs-on: macos-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v6

      - name: Setup Xcode
        uses: ggml-org/setup-xcode@v1
        with:
          xcode-version: latest-stable

      - name: Build
        id: cmake_build
        run: |
          sysctl -a
          cmake -B build -G Xcode \
            -DGGML_METAL_USE_BF16=ON \
            -DGGML_METAL_EMBED_LIBRARY=ON \
            -DLLAMA_OPENSSL=OFF \
            -DLLAMA_BUILD_EXAMPLES=OFF \
            -DLLAMA_BUILD_TOOLS=OFF \
            -DLLAMA_BUILD_TESTS=OFF \
            -DLLAMA_BUILD_SERVER=OFF \
            -DCMAKE_SYSTEM_NAME=iOS \
            -DCMAKE_OSX_DEPLOYMENT_TARGET=14.0 \
            -DCMAKE_XCODE_ATTRIBUTE_DEVELOPMENT_TEAM=ggml
          cmake --build build --config Release -j $(sysctl -n hw.logicalcpu) -- CODE_SIGNING_ALLOWED=NO

      - name: xcodebuild for swift package
        id: xcodebuild
        run: |
          ./build-xcframework.sh

      - name: Upload xcframework artifact
        uses: actions/upload-artifact@v6
        with:
          name: llama-xcframework
          path: build-apple/llama.xcframework/
          retention-days: 1

      - name: Build Xcode project
        run: |
          xcodebuild -downloadPlatform iOS
          xcodebuild -project examples/llama.swiftui/llama.swiftui.xcodeproj -scheme llama.swiftui -sdk iphoneos CODE_SIGNING_REQUIRED=NO CODE_SIGN_IDENTITY= -destination 'generic/platform=iOS' FRAMEWORK_FOLDER_PATH=./build-ios build

  macOS-latest-tvos:
    runs-on: macos-latest

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: macOS-latest-tvos
          evict-old-files: 1d
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Build
        id: cmake_build
        run: |
          sysctl -a
          cmake -B build -G Xcode \
            -DGGML_METAL_USE_BF16=ON \
            -DGGML_METAL_EMBED_LIBRARY=ON \
            -DLLAMA_BUILD_COMMON=OFF \
            -DLLAMA_BUILD_EXAMPLES=OFF \
            -DLLAMA_BUILD_TOOLS=OFF \
            -DLLAMA_BUILD_TESTS=OFF \
            -DLLAMA_BUILD_SERVER=OFF \
            -DCMAKE_SYSTEM_NAME=tvOS \
            -DCMAKE_OSX_DEPLOYMENT_TARGET=14.0 \
            -DCMAKE_XCODE_ATTRIBUTE_DEVELOPMENT_TEAM=ggml
          cmake --build build --config Release -j $(sysctl -n hw.logicalcpu) -- CODE_SIGNING_ALLOWED=NO

  macOS-latest-visionos:
    runs-on: macos-latest

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: Build
        id: cmake_build
        run: |
          sysctl -a
          cmake -B build -G Xcode \
            -DGGML_METAL_USE_BF16=ON \
            -DGGML_METAL_EMBED_LIBRARY=ON \
            -DLLAMA_BUILD_COMMON=OFF \
            -DLLAMA_BUILD_EXAMPLES=OFF \
            -DLLAMA_BUILD_TOOLS=OFF \
            -DLLAMA_BUILD_TESTS=OFF \
            -DLLAMA_BUILD_SERVER=OFF \
            -DCMAKE_SYSTEM_NAME=visionOS \
            -DCMAKE_OSX_DEPLOYMENT_TARGET=1.0 \
            -DCMAKE_XCODE_ATTRIBUTE_DEVELOPMENT_TEAM=ggml
          cmake --build build --config Release -j $(sysctl -n hw.logicalcpu) -- CODE_SIGNING_ALLOWED=NO

  macOS-latest-swift:
    runs-on: macos-latest
    needs: macos-latest-ios-xcode

    strategy:
      matrix:
        destination: ['generic/platform=macOS', 'generic/platform=iOS', 'generic/platform=tvOS']

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: macOS-latest-swift
          evict-old-files: 1d
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Download xcframework artifact
        uses: actions/download-artifact@v7
        with:
          name: llama-xcframework
          path: build-apple/llama.xcframework/

      - name: Build llama.cpp with CMake
        id: cmake_build
        run: |
          sysctl -a
          cmake -B build -G Xcode \
            -DGGML_METAL_USE_BF16=ON \
            -DGGML_METAL_EMBED_LIBRARY=ON \
            -DLLAMA_OPENSSL=OFF \
            -DLLAMA_BUILD_EXAMPLES=OFF \
            -DLLAMA_BUILD_TOOLS=OFF \
            -DLLAMA_BUILD_TESTS=OFF \
            -DLLAMA_BUILD_SERVER=OFF \
            -DCMAKE_OSX_ARCHITECTURES="arm64;x86_64"
          cmake --build build --config Release -j $(sysctl -n hw.logicalcpu)

build-cache .github/workflows/build-cache.yml

Triggers

workflow_dispatch, schedule

Runs on

ubuntu-24.04, ubuntu-24.04, windows-2022

Jobs

ubuntu-24-vulkan-cache, ubuntu-24-openvino-cache, windows-2022-rocm-cache

Commands

echo "VULKAN_SDK_VERSION=$(curl https://vulkan.lunarg.com/sdk/latest/linux.txt)" >> "$GITHUB_ENV"

View raw YAML

name: Build Actions Cache

on:
  workflow_dispatch: # allows manual triggering
  schedule:
    - cron: '0 * * * *'

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}
  cancel-in-progress: true

jobs:
  ubuntu-24-vulkan-cache:
    runs-on: ubuntu-24.04

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: Get latest Vulkan SDK version
        id: vulkan_sdk_version
        run: |
          echo "VULKAN_SDK_VERSION=$(curl https://vulkan.lunarg.com/sdk/latest/linux.txt)" >> "$GITHUB_ENV"

      - name: Setup Cache
        uses: actions/cache@v5
        id: cache-sdk
        with:
          path: ./vulkan_sdk
          key: vulkan-sdk-${{ env.VULKAN_SDK_VERSION }}-${{ runner.os }}

      - name: Setup Vulkan SDK
        if: steps.cache-sdk.outputs.cache-hit != 'true'
        uses: ./.github/actions/linux-setup-vulkan
        with:
          path: ./vulkan_sdk
          version: ${{ env.VULKAN_SDK_VERSION }}

  #ubuntu-24-spacemit-cache:
  #  runs-on: ubuntu-24.04

  #  env:
  #    # Make sure this is in sync with build-linux-cross.yml
  #    SPACEMIT_IME_TOOLCHAIN_VERSION: "1.1.2"

  #  steps:
  #    - name: Clone
  #      id: checkout
  #      uses: actions/checkout@v6

  #    - name: Setup Cache
  #      uses: actions/cache@v5
  #      id: cache-toolchain
  #      with:
  #        path: ./spacemit_toolchain
  #        key: spacemit-ime-toolchain-v${{ env.SPACEMIT_IME_TOOLCHAIN_VERSION }}-${{ runner.os }}

  #    - name: Setup SpacemiT Toolchain
  #      if: steps.cache-toolchain.outputs.cache-hit != 'true'
  #      uses: ./.github/actions/linux-setup-spacemit
  #      with:
  #        path: ./spacemit_toolchain
  #        version: ${{ env.SPACEMIT_IME_TOOLCHAIN_VERSION }}

  ubuntu-24-openvino-cache:
    runs-on: ubuntu-24.04

    env:
      # Sync versions in build.yml, build-self-hosted.yml, release.yml, build-cache.yml, .devops/openvino.Dockerfile
      OPENVINO_VERSION_MAJOR: "2026.0"
      OPENVINO_VERSION_FULL: "2026.0.0.20965.c6d6a13a886"

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: Setup Cache
        uses: actions/cache@v5
        id: cache-openvino
        with:
          path: ./openvino_toolkit
          key: openvino-toolkit-v${{ env.OPENVINO_VERSION_FULL }}-${{ runner.os }}

      - name: Setup OpenVINO Toolkit
        if: steps.cache-openvino.outputs.cache-hit != 'true'
        uses: ./.github/actions/linux-setup-openvino
        with:
          path: ./openvino_toolkit
          version_major: ${{ env.OPENVINO_VERSION_MAJOR }}
          version_full: ${{ env.OPENVINO_VERSION_FULL }}

  windows-2022-rocm-cache:
    runs-on: windows-2022

    env:
      # Make sure this is in sync with build.yml
      HIPSDK_INSTALLER_VERSION: "26.Q1"

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: Setup Cache
        uses: actions/cache@v5
        id: cache-rocm
        with:
          path: C:\Program Files\AMD\ROCm
          key: rocm-${{ env.HIPSDK_INSTALLER_VERSION }}-${{ runner.os }}

      - name: Setup ROCm
        if: steps.cache-rocm.outputs.cache-hit != 'true'
        uses: ./.github/actions/windows-setup-rocm
        with:
          version: ${{ env.HIPSDK_INSTALLER_VERSION }}

build-cann matrix .github/workflows/build-cann.yml

Triggers

workflow_dispatch, push, pull_request

Runs on

${{ matrix.arch == 'aarch64' && 'ubuntu-24.04-arm' || 'ubuntu-24.04' }}

Jobs

openEuler-latest-cann

Matrix

arch, build, chip_type, exclude, exclude.chip_type, exclude.use_acl_graph, use_acl_graph→ 310p, 910b, Release, aarch64, off, on, x86

Actions

ggml-org/free-disk-space

Commands

image="ascendai/cann:${{ matrix.chip_type == '910b' && '8.5.0-910b-openeuler24.03-py3.11' || '8.5.0-310p-openeuler24.03-py3.11' }}" echo "image=${image}" >> "${GITHUB_OUTPUT}"
docker pull "${{ steps.cann-image.outputs.image }}"
HOST_UID=$(id -u) HOST_GID=$(id -g) docker run --rm \ -v "${PWD}:/workspace" \ -w /workspace \ -e SOC_TYPE=${SOC_TYPE} \ -e BUILD_TYPE=${BUILD_TYPE} \ -e USE_ACL_GRAPH=${USE_ACL_GRAPH} \ "${{ steps.cann-image.outputs.image }}" \ bash -lc ' set -e yum install -y --setopt=install_weak_deps=False --setopt=tsflags=nodocs git gcc gcc-c++ make cmake openssl-devel yum clean all && rm -rf /var/cache/yum git config --global --add safe.directory "/workspace" export LD_LIBRARY_PATH=${ASCEND_TOOLKIT_HOME}/lib64:${ASCEND_TOOLKIT_HOME}/$(uname -m)-linux/devlib/:${LD_LIBRARY_PATH} cmake -S . -B build \ -DCMAKE_BUILD_TYPE=${BUILD_TYPE} \ -DGGML_CANN=on \ -DSOC_TYPE=${SOC_TYPE} \ -DUSE_ACL_GRAPH=${USE_ACL_GRAPH} cmake --build build -j $(nproc) chown -R '"${HOST_UID}"':'"${HOST_GID}"' /workspace/build '

View raw YAML

name: CI (cann)

on:
  workflow_dispatch: # allows manual triggering
  push:
    branches:
      - master
    paths: [
      '.github/workflows/build-cann.yml',
      '**/CMakeLists.txt',
      '**/.cmake',
      '**/*.h',
      '**/*.hpp',
      '**/*.c',
      '**/*.cpp'
    ]

  pull_request:
    types: [opened, synchronize, reopened]
    paths: [
      '.github/workflows/build-cann.yml',
      'ggml/src/ggml-cann/**'
    ]

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}
  cancel-in-progress: true

env:
  GGML_NLOOP: 3
  GGML_N_THREADS: 1
  LLAMA_LOG_COLORS: 1
  LLAMA_LOG_PREFIX: 1
  LLAMA_LOG_TIMESTAMPS: 1

jobs:
  openEuler-latest-cann:
    defaults:
      run:
        shell: bash -el {0}
    strategy:
      matrix:
        arch: [x86, aarch64]
        chip_type: ['910b', '310p']
        build: ['Release']
        use_acl_graph: ['on', 'off']
        exclude:
          # 310P does not support USE_ACL_GRAPH=on
          - chip_type: '310p'
            use_acl_graph: 'on'
    runs-on: ${{ matrix.arch == 'aarch64' && 'ubuntu-24.04-arm' || 'ubuntu-24.04' }}
    steps:
      - name: Checkout
        uses: actions/checkout@v6
        with:
          fetch-depth: 0

      - name: Free up disk space
        uses: ggml-org/free-disk-space@v1.3.1
        with:
          tool-cache: true

      - name: Set container image
        id: cann-image
        run: |
          image="ascendai/cann:${{ matrix.chip_type == '910b' &&  '8.5.0-910b-openeuler24.03-py3.11' || '8.5.0-310p-openeuler24.03-py3.11' }}"
          echo "image=${image}" >> "${GITHUB_OUTPUT}"

      - name: Pull container image
        run: docker pull "${{ steps.cann-image.outputs.image }}"

      - name: Build
        env:
          BUILD_TYPE: ${{ matrix.build }}
          SOC_TYPE: ascend${{ matrix.chip_type }}
          USE_ACL_GRAPH: ${{ matrix.use_acl_graph }}
        run: |
          HOST_UID=$(id -u)
          HOST_GID=$(id -g)

          docker run --rm \
            -v "${PWD}:/workspace" \
            -w /workspace \
            -e SOC_TYPE=${SOC_TYPE} \
            -e BUILD_TYPE=${BUILD_TYPE} \
            -e USE_ACL_GRAPH=${USE_ACL_GRAPH} \
            "${{ steps.cann-image.outputs.image }}" \
            bash -lc '
              set -e
              yum install -y --setopt=install_weak_deps=False --setopt=tsflags=nodocs git gcc gcc-c++ make cmake openssl-devel
              yum clean all && rm -rf /var/cache/yum
              git config --global --add safe.directory "/workspace"
              export LD_LIBRARY_PATH=${ASCEND_TOOLKIT_HOME}/lib64:${ASCEND_TOOLKIT_HOME}/$(uname -m)-linux/devlib/:${LD_LIBRARY_PATH}
              cmake -S . -B build \
                  -DCMAKE_BUILD_TYPE=${BUILD_TYPE} \
                  -DGGML_CANN=on \
                  -DSOC_TYPE=${SOC_TYPE} \
                  -DUSE_ACL_GRAPH=${USE_ACL_GRAPH}
              cmake --build build -j $(nproc)

              chown -R '"${HOST_UID}"':'"${HOST_GID}"' /workspace/build
            '

build-cmake-pkg .github/workflows/build-cmake-pkg.yml

Triggers

workflow_dispatch, workflow_call

Runs on

ubuntu-slim

Jobs

linux

Commands

sudo apt update sudo apt install -y build-essential tcl cmake
PREFIX="$(pwd)"/inst cmake -S . -B build -DCMAKE_PREFIX_PATH="$PREFIX" \ -DLLAMA_OPENSSL=OFF -DLLAMA_BUILD_TESTS=OFF -DLLAMA_BUILD_TOOLS=OFF \ -DLLAMA_BUILD_EXAMPLES=OFF -DCMAKE_BUILD_TYPE=Release cmake --build build --config Release cmake --install build --prefix "$PREFIX" --config Release export LLAMA_CONFIG="$PREFIX"/lib/cmake/llama/llama-config.cmake tclsh <<'EOF' set build(commit) [string trim [exec git rev-parse --short HEAD]] set build(number) [string trim [exec git rev-list --count HEAD]] set build(version) "0.0.$build(number)" set llamaconfig [read [open "$env(LLAMA_CONFIG)" r]] set checks [list "set\$LLAMA_VERSION \\s+$build(version)\$" \ "set\$LLAMA_BUILD_COMMIT\\s+$build(commit)\$" \ "set\$LLAMA_BUILD_NUMBER\\s+$build(number)\$"] puts -nonewline "Checking llama-config.cmake version... " foreach check $checks { if {![regexp -expanded -- $check $llamaconfig]} { puts "\"$check\" failed!" exit 1 } } puts "success." EOF cd examples/simple-cmake-pkg cmake -S . -B build -DCMAKE_PREFIX_PATH="$PREFIX"/lib/cmake cmake --build build

View raw YAML

name: Build relocatable cmake package
on:
  workflow_dispatch:
  workflow_call:

jobs:
  linux:
    runs-on: ubuntu-slim
    steps:
      - uses: actions/checkout@v6
        with:
          fetch-depth: 0

      - name: Install dependencies
        run: |
          sudo apt update
          sudo apt install -y build-essential tcl cmake

      - name: Build
        run: |
          PREFIX="$(pwd)"/inst
          cmake -S . -B build -DCMAKE_PREFIX_PATH="$PREFIX" \
                -DLLAMA_OPENSSL=OFF -DLLAMA_BUILD_TESTS=OFF -DLLAMA_BUILD_TOOLS=OFF \
                -DLLAMA_BUILD_EXAMPLES=OFF -DCMAKE_BUILD_TYPE=Release
          cmake --build build --config Release
          cmake --install build --prefix "$PREFIX" --config Release

          export LLAMA_CONFIG="$PREFIX"/lib/cmake/llama/llama-config.cmake
          tclsh <<'EOF'
          set build(commit)  [string trim [exec git rev-parse --short HEAD]]
          set build(number)  [string trim [exec git rev-list  --count HEAD]]
          set build(version) "0.0.$build(number)"

          set llamaconfig [read [open "$env(LLAMA_CONFIG)" r]]
          set checks [list "set\\(LLAMA_VERSION     \\s+$build(version)\\)" \
                           "set\\(LLAMA_BUILD_COMMIT\\s+$build(commit)\\)" \
                           "set\\(LLAMA_BUILD_NUMBER\\s+$build(number)\\)"]

          puts -nonewline "Checking llama-config.cmake version... "
          foreach check $checks {
              if {![regexp -expanded -- $check $llamaconfig]} {
                  puts "\"$check\" failed!"
                  exit 1
              }
          }
          puts "success."
          EOF

          cd examples/simple-cmake-pkg
          cmake -S . -B build -DCMAKE_PREFIX_PATH="$PREFIX"/lib/cmake
          cmake --build build

build-cross .github/workflows/build-cross.yml

Triggers

workflow_dispatch, push, schedule

Runs on

${{ 'ubuntu-24.04-arm' || 'ubuntu-24.04' }}, ${{ 'ubuntu-24.04-arm' || 'ubuntu-24.04' }}, ubuntu-24.04

Jobs

debian-13-loongarch64-cpu-cross, debian-13-loongarch64-vulkan-cross, ubuntu-24-riscv64-cpu-spacemit-ime-cross

Commands

rm -f /etc/apt/sources.list.d/* cat << EOF | tee /etc/apt/sources.list.d/debian-ports.list deb http://snapshot.debian.org/archive/debian/20250515T202920Z/ trixie main EOF ( echo 'quiet "true";'; \ echo 'APT::Get::Assume-Yes "true";'; \ echo 'APT::Install-Recommends "false";'; \ echo 'Acquire::Check-Valid-Until "false";'; \ echo 'Acquire::Retries "5";'; \ ) > /etc/apt/apt.conf.d/99snapshot-repos apt-get update apt-get install -y ca-certificates debian-ports-archive-keyring cmake git zip dpkg --add-architecture loong64 # Add arch-specific repositories for non-amd64 architectures cat << EOF | tee /etc/apt/sources.list.d/loong64-ports.list deb [arch=loong64] http://snapshot.debian.org/archive/debian-ports/20250515T194251Z/ sid main EOF apt-get update || true ;# Prevent failure due to missing URLs. apt-get install -y --no-install-recommends \ build-essential \ gcc-14-loongarch64-linux-gnu \ g++-14-loongarch64-linux-gnu
cmake -B build -DLLAMA_OPENSSL=OFF \ -DCMAKE_BUILD_TYPE=Release \ -DGGML_OPENMP=OFF \ -DLLAMA_BUILD_EXAMPLES=ON \ -DLLAMA_BUILD_TOOLS=ON \ -DLLAMA_BUILD_TESTS=OFF \ -DCMAKE_SYSTEM_NAME=Linux \ -DCMAKE_SYSTEM_PROCESSOR=loongarch64 \ -DCMAKE_C_COMPILER=loongarch64-linux-gnu-gcc-14 \ -DCMAKE_CXX_COMPILER=loongarch64-linux-gnu-g++-14 \ -DCMAKE_POSITION_INDEPENDENT_CODE=ON \ -DCMAKE_FIND_ROOT_PATH=/usr/lib/loongarch64-linux-gnu \ -DCMAKE_FIND_ROOT_PATH_MODE_PROGRAM=NEVER \ -DCMAKE_FIND_ROOT_PATH_MODE_LIBRARY=ONLY \ -DCMAKE_FIND_ROOT_PATH_MODE_INCLUDE=BOTH cmake --build build --config Release -j $(nproc)
rm -f /etc/apt/sources.list.d/* cat << EOF | tee /etc/apt/sources.list.d/debian-ports.list deb http://snapshot.debian.org/archive/debian/20250515T202920Z/ trixie main EOF ( echo 'quiet "true";'; \ echo 'APT::Get::Assume-Yes "true";'; \ echo 'APT::Install-Recommends "false";'; \ echo 'Acquire::Check-Valid-Until "false";'; \ echo 'Acquire::Retries "5";'; \ ) > /etc/apt/apt.conf.d/99snapshot-repos apt-get update apt-get install -y ca-certificates debian-ports-archive-keyring cmake git zip dpkg --add-architecture loong64 # Add arch-specific repositories for non-amd64 architectures cat << EOF | tee /etc/apt/sources.list.d/loong64-ports.list deb [arch=loong64] http://snapshot.debian.org/archive/debian-ports/20250515T194251Z/ sid main EOF apt-get update || true ;# Prevent failure due to missing URLs. apt-get install -y --no-install-recommends \ build-essential \ glslc \ gcc-14-loongarch64-linux-gnu \ g++-14-loongarch64-linux-gnu \ libvulkan-dev:loong64
cmake -B build -DLLAMA_OPENSSL=OFF \ -DCMAKE_BUILD_TYPE=Release \ -DGGML_VULKAN=ON \ -DGGML_OPENMP=OFF \ -DLLAMA_BUILD_EXAMPLES=ON \ -DLLAMA_BUILD_TOOLS=ON \ -DLLAMA_BUILD_TESTS=OFF \ -DCMAKE_SYSTEM_NAME=Linux \ -DCMAKE_SYSTEM_PROCESSOR=loongarch64 \ -DCMAKE_C_COMPILER=loongarch64-linux-gnu-gcc-14 \ -DCMAKE_CXX_COMPILER=loongarch64-linux-gnu-g++-14 \ -DCMAKE_POSITION_INDEPENDENT_CODE=ON \ -DCMAKE_FIND_ROOT_PATH=/usr/lib/loongarch64-linux-gnu \ -DCMAKE_FIND_ROOT_PATH_MODE_PROGRAM=NEVER \ -DCMAKE_FIND_ROOT_PATH_MODE_LIBRARY=ONLY \ -DCMAKE_FIND_ROOT_PATH_MODE_INCLUDE=BOTH cmake --build build --config Release -j $(nproc)
export RISCV_ROOT_PATH=${PWD}/spacemit_toolchain cmake -B build -DLLAMA_OPENSSL=OFF \ -DCMAKE_BUILD_TYPE=Release \ -DGGML_OPENMP=OFF \ -DLLAMA_BUILD_EXAMPLES=ON \ -DLLAMA_BUILD_TOOLS=ON \ -DLLAMA_BUILD_TESTS=OFF \ -DGGML_CPU_RISCV64_SPACEMIT=ON \ -DGGML_RVV=ON \ -DGGML_RV_ZFH=ON \ -DGGML_RV_ZICBOP=ON \ -DGGML_RV_ZIHINTPAUSE=ON \ -DRISCV64_SPACEMIT_IME_SPEC=RISCV64_SPACEMIT_IME1 \ -DCMAKE_TOOLCHAIN_FILE=${PWD}/cmake/riscv64-spacemit-linux-gnu-gcc.cmake cmake --build build --config Release -j $(nproc)

View raw YAML

name: CI (cross)
on:
  # only manual triggers due to low-importance of the workflows
  # TODO: for regular runs, provision dedicated self-hosted runners
  workflow_dispatch:
  push:
    branches:
      - master
    paths: [
      '.github/workflows/build-cross.yml',
      'ggml/src/spacemit/*',
      'ggml/src/arch/loongarch/*'
    ]
  # run once every week
  schedule:
    - cron: '0 0 * * 0'

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}
  cancel-in-progress: true


jobs:
  # ubuntu-24-riscv64-cpu-cross:
  #   runs-on: ubuntu-24.04

  #   steps:
  #     - uses: actions/checkout@v6
  #     - name: Setup Riscv
  #       run: |
  #         sudo dpkg --add-architecture riscv64

  #         # Add arch-specific repositories for non-amd64 architectures
  #         cat << EOF | sudo tee /etc/apt/sources.list.d/riscv64-ports.list
  #         deb [arch=riscv64] http://ports.ubuntu.com/ubuntu-ports/ noble main universe
  #         deb [arch=riscv64] http://ports.ubuntu.com/ubuntu-ports/ noble-updates main universe
  #         deb [arch=riscv64] http://ports.ubuntu.com/ubuntu-ports/ noble-security main universe
  #         deb [arch=riscv64] http://ports.ubuntu.com/ubuntu-ports/ noble-backports main universe
  #         EOF

  #         sudo apt-get update || true    ;# Prevent failure due to missing URLs.

  #         sudo apt-get install -y --no-install-recommends \
  #                 build-essential \
  #                 gcc-14-riscv64-linux-gnu \
  #                 g++-14-riscv64-linux-gnu

  #     - name: Build
  #       run: |
  #         cmake -B build -DLLAMA_OPENSSL=OFF \
  #                        -DCMAKE_BUILD_TYPE=Release \
  #                        -DGGML_OPENMP=OFF \
  #                        -DLLAMA_BUILD_EXAMPLES=ON \
  #                        -DLLAMA_BUILD_TOOLS=ON \
  #                        -DLLAMA_BUILD_TESTS=OFF \
  #                        -DCMAKE_SYSTEM_NAME=Linux \
  #                        -DCMAKE_SYSTEM_PROCESSOR=riscv64 \
  #                        -DCMAKE_C_COMPILER=riscv64-linux-gnu-gcc-14 \
  #                        -DCMAKE_CXX_COMPILER=riscv64-linux-gnu-g++-14 \
  #                        -DCMAKE_POSITION_INDEPENDENT_CODE=ON \
  #                        -DCMAKE_FIND_ROOT_PATH=/usr/lib/riscv64-linux-gnu \
  #                        -DCMAKE_FIND_ROOT_PATH_MODE_PROGRAM=NEVER \
  #                        -DCMAKE_FIND_ROOT_PATH_MODE_LIBRARY=ONLY \
  #                        -DCMAKE_FIND_ROOT_PATH_MODE_INCLUDE=BOTH

  #         cmake --build build --config Release -j $(nproc)

  # ubuntu-24-riscv64-vulkan-cross:
  #   runs-on: ubuntu-24.04

  #   steps:
  #     - uses: actions/checkout@v6
  #     - name: Setup Riscv
  #       run: |
  #         sudo dpkg --add-architecture riscv64

  #         # Add arch-specific repositories for non-amd64 architectures
  #         cat << EOF | sudo tee /etc/apt/sources.list.d/riscv64-ports.list
  #         deb [arch=riscv64] http://ports.ubuntu.com/ubuntu-ports/ noble main universe
  #         deb [arch=riscv64] http://ports.ubuntu.com/ubuntu-ports/ noble-updates main universe
  #         deb [arch=riscv64] http://ports.ubuntu.com/ubuntu-ports/ noble-security main universe
  #         deb [arch=riscv64] http://ports.ubuntu.com/ubuntu-ports/ noble-backports main universe
  #         EOF

  #         sudo apt-get update || true    ;# Prevent failure due to missing URLs.

  #         sudo apt-get install -y --no-install-recommends \
  #                 build-essential \
  #                 glslc \
  #                 gcc-14-riscv64-linux-gnu \
  #                 g++-14-riscv64-linux-gnu \
  #                 libvulkan-dev:riscv64

  #     - name: Build
  #       run: |
  #         cmake -B build -DLLAMA_OPENSSL=OFF \
  #                        -DCMAKE_BUILD_TYPE=Release \
  #                        -DGGML_VULKAN=ON \
  #                        -DGGML_OPENMP=OFF \
  #                        -DLLAMA_BUILD_EXAMPLES=ON \
  #                        -DLLAMA_BUILD_TOOLS=ON \
  #                        -DLLAMA_BUILD_TESTS=OFF \
  #                        -DCMAKE_SYSTEM_NAME=Linux \
  #                        -DCMAKE_SYSTEM_PROCESSOR=riscv64 \
  #                        -DCMAKE_C_COMPILER=riscv64-linux-gnu-gcc-14 \
  #                        -DCMAKE_CXX_COMPILER=riscv64-linux-gnu-g++-14 \
  #                        -DCMAKE_POSITION_INDEPENDENT_CODE=ON \
  #                        -DCMAKE_FIND_ROOT_PATH=/usr/lib/riscv64-linux-gnu \
  #                        -DCMAKE_FIND_ROOT_PATH_MODE_PROGRAM=NEVER \
  #                        -DCMAKE_FIND_ROOT_PATH_MODE_LIBRARY=ONLY \
  #                        -DCMAKE_FIND_ROOT_PATH_MODE_INCLUDE=BOTH

  #         cmake --build build --config Release -j $(nproc)

  # ubuntu-24-arm64-vulkan-cross:
  #   runs-on: ubuntu-24.04

  #   steps:
  #     - uses: actions/checkout@v6
  #     - name: Setup Arm64
  #       run: |
  #         sudo dpkg --add-architecture arm64

  #         # Add arch-specific repositories for non-amd64 architectures
  #         cat << EOF | sudo tee /etc/apt/sources.list.d/arm64-ports.list
  #         deb [arch=arm64] http://ports.ubuntu.com/ubuntu-ports/ noble main universe
  #         deb [arch=arm64] http://ports.ubuntu.com/ubuntu-ports/ noble-updates main universe
  #         deb [arch=arm64] http://ports.ubuntu.com/ubuntu-ports/ noble-security main universe
  #         deb [arch=arm64] http://ports.ubuntu.com/ubuntu-ports/ noble-backports main universe
  #         EOF

  #         sudo apt-get update || true    ;# Prevent failure due to missing URLs.

  #         sudo apt-get install -y --no-install-recommends \
  #                 build-essential \
  #                 glslc \
  #                 crossbuild-essential-arm64 \
  #                 libvulkan-dev:arm64

  #     - name: Build
  #       run: |
  #         cmake -B build -DLLAMA_OPENSSL=OFF \
  #                        -DCMAKE_BUILD_TYPE=Release \
  #                        -DGGML_VULKAN=ON \
  #                        -DGGML_OPENMP=OFF \
  #                        -DLLAMA_BUILD_EXAMPLES=ON \
  #                        -DLLAMA_BUILD_TOOLS=ON \
  #                        -DLLAMA_BUILD_TESTS=OFF \
  #                        -DCMAKE_SYSTEM_NAME=Linux \
  #                        -DCMAKE_SYSTEM_PROCESSOR=aarch64 \
  #                        -DCMAKE_C_COMPILER=aarch64-linux-gnu-gcc \
  #                        -DCMAKE_CXX_COMPILER=aarch64-linux-gnu-g++ \
  #                        -DCMAKE_POSITION_INDEPENDENT_CODE=ON \
  #                        -DCMAKE_FIND_ROOT_PATH=/usr/lib/aarch64-linux-gnu \
  #                        -DCMAKE_FIND_ROOT_PATH_MODE_PROGRAM=NEVER \
  #                        -DCMAKE_FIND_ROOT_PATH_MODE_LIBRARY=ONLY \
  #                        -DCMAKE_FIND_ROOT_PATH_MODE_INCLUDE=BOTH

  #         cmake --build build --config Release -j $(nproc)

  debian-13-loongarch64-cpu-cross:
    runs-on: ${{ 'ubuntu-24.04-arm' || 'ubuntu-24.04' }}
    container: debian@sha256:653dfb9f86c3782e8369d5f7d29bb8faba1f4bff9025db46e807fa4c22903671

    steps:
      - uses: actions/checkout@v6
      - name: Setup LoongArch
        run: |
          rm -f /etc/apt/sources.list.d/*
          cat << EOF | tee /etc/apt/sources.list.d/debian-ports.list
          deb http://snapshot.debian.org/archive/debian/20250515T202920Z/ trixie main
          EOF
          ( echo 'quiet "true";'; \
            echo 'APT::Get::Assume-Yes "true";'; \
            echo 'APT::Install-Recommends "false";'; \
            echo 'Acquire::Check-Valid-Until "false";'; \
            echo 'Acquire::Retries "5";'; \
          ) > /etc/apt/apt.conf.d/99snapshot-repos

          apt-get update
          apt-get install -y ca-certificates debian-ports-archive-keyring cmake git zip
          dpkg --add-architecture loong64

          # Add arch-specific repositories for non-amd64 architectures
          cat << EOF | tee /etc/apt/sources.list.d/loong64-ports.list
          deb [arch=loong64] http://snapshot.debian.org/archive/debian-ports/20250515T194251Z/ sid main
          EOF

          apt-get update || true    ;# Prevent failure due to missing URLs.

          apt-get install -y --no-install-recommends \
                  build-essential \
                  gcc-14-loongarch64-linux-gnu \
                  g++-14-loongarch64-linux-gnu

      - name: Build
        run: |
          cmake -B build -DLLAMA_OPENSSL=OFF \
                         -DCMAKE_BUILD_TYPE=Release \
                         -DGGML_OPENMP=OFF \
                         -DLLAMA_BUILD_EXAMPLES=ON \
                         -DLLAMA_BUILD_TOOLS=ON \
                         -DLLAMA_BUILD_TESTS=OFF \
                         -DCMAKE_SYSTEM_NAME=Linux \
                         -DCMAKE_SYSTEM_PROCESSOR=loongarch64 \
                         -DCMAKE_C_COMPILER=loongarch64-linux-gnu-gcc-14 \
                         -DCMAKE_CXX_COMPILER=loongarch64-linux-gnu-g++-14 \
                         -DCMAKE_POSITION_INDEPENDENT_CODE=ON \
                         -DCMAKE_FIND_ROOT_PATH=/usr/lib/loongarch64-linux-gnu \
                         -DCMAKE_FIND_ROOT_PATH_MODE_PROGRAM=NEVER \
                         -DCMAKE_FIND_ROOT_PATH_MODE_LIBRARY=ONLY \
                         -DCMAKE_FIND_ROOT_PATH_MODE_INCLUDE=BOTH

          cmake --build build --config Release -j $(nproc)

  debian-13-loongarch64-vulkan-cross:
    runs-on: ${{ 'ubuntu-24.04-arm' || 'ubuntu-24.04' }}
    container: debian@sha256:653dfb9f86c3782e8369d5f7d29bb8faba1f4bff9025db46e807fa4c22903671

    steps:
      - uses: actions/checkout@v6
      - name: Setup LoongArch
        run: |
          rm -f /etc/apt/sources.list.d/*
          cat << EOF | tee /etc/apt/sources.list.d/debian-ports.list
          deb http://snapshot.debian.org/archive/debian/20250515T202920Z/ trixie main
          EOF
          ( echo 'quiet "true";'; \
            echo 'APT::Get::Assume-Yes "true";'; \
            echo 'APT::Install-Recommends "false";'; \
            echo 'Acquire::Check-Valid-Until "false";'; \
            echo 'Acquire::Retries "5";'; \
          ) > /etc/apt/apt.conf.d/99snapshot-repos

          apt-get update
          apt-get install -y ca-certificates debian-ports-archive-keyring cmake git zip
          dpkg --add-architecture loong64

          # Add arch-specific repositories for non-amd64 architectures
          cat << EOF | tee /etc/apt/sources.list.d/loong64-ports.list
          deb [arch=loong64] http://snapshot.debian.org/archive/debian-ports/20250515T194251Z/ sid main
          EOF

          apt-get update || true    ;# Prevent failure due to missing URLs.

          apt-get install -y --no-install-recommends \
                  build-essential \
                  glslc \
                  gcc-14-loongarch64-linux-gnu \
                  g++-14-loongarch64-linux-gnu \
                  libvulkan-dev:loong64

      - name: Build
        run: |
          cmake -B build -DLLAMA_OPENSSL=OFF \
                         -DCMAKE_BUILD_TYPE=Release \
                         -DGGML_VULKAN=ON \
                         -DGGML_OPENMP=OFF \
                         -DLLAMA_BUILD_EXAMPLES=ON \
                         -DLLAMA_BUILD_TOOLS=ON \
                         -DLLAMA_BUILD_TESTS=OFF \
                         -DCMAKE_SYSTEM_NAME=Linux \
                         -DCMAKE_SYSTEM_PROCESSOR=loongarch64 \
                         -DCMAKE_C_COMPILER=loongarch64-linux-gnu-gcc-14 \
                         -DCMAKE_CXX_COMPILER=loongarch64-linux-gnu-g++-14 \
                         -DCMAKE_POSITION_INDEPENDENT_CODE=ON \
                         -DCMAKE_FIND_ROOT_PATH=/usr/lib/loongarch64-linux-gnu \
                         -DCMAKE_FIND_ROOT_PATH_MODE_PROGRAM=NEVER \
                         -DCMAKE_FIND_ROOT_PATH_MODE_LIBRARY=ONLY \
                         -DCMAKE_FIND_ROOT_PATH_MODE_INCLUDE=BOTH

          cmake --build build --config Release -j $(nproc)

  ubuntu-24-riscv64-cpu-spacemit-ime-cross:
    runs-on: ubuntu-24.04

    env:
      # Make sure this is in sync with build-cache.yml
      SPACEMIT_IME_TOOLCHAIN_VERSION: "1.1.2"

    steps:
      - uses: actions/checkout@v6

      #- name: Use SpacemiT Toolchain Cache
      #  uses: actions/cache@v5
      #  id: cache-toolchain
      #  with:
      #    path: ./spacemit_toolchain
      #    key: spacemit-ime-toolchain-v${{ env.SPACEMIT_IME_TOOLCHAIN_VERSION }}-${{ runner.os }}

      - name: Setup SpacemiT Toolchain
        #if: steps.cache-toolchain.outputs.cache-hit != 'true'
        uses: ./.github/actions/linux-setup-spacemit
        with:
          path: ./spacemit_toolchain
          version: ${{ env.SPACEMIT_IME_TOOLCHAIN_VERSION }}

      - name: Build
        run: |
          export RISCV_ROOT_PATH=${PWD}/spacemit_toolchain
          cmake -B build -DLLAMA_OPENSSL=OFF \
                         -DCMAKE_BUILD_TYPE=Release \
                         -DGGML_OPENMP=OFF \
                         -DLLAMA_BUILD_EXAMPLES=ON \
                         -DLLAMA_BUILD_TOOLS=ON \
                         -DLLAMA_BUILD_TESTS=OFF \
                         -DGGML_CPU_RISCV64_SPACEMIT=ON \
                         -DGGML_RVV=ON \
                         -DGGML_RV_ZFH=ON \
                         -DGGML_RV_ZICBOP=ON \
                         -DGGML_RV_ZIHINTPAUSE=ON \
                         -DRISCV64_SPACEMIT_IME_SPEC=RISCV64_SPACEMIT_IME1 \
                         -DCMAKE_TOOLCHAIN_FILE=${PWD}/cmake/riscv64-spacemit-linux-gnu-gcc.cmake

          cmake --build build --config Release -j $(nproc)

build-msys matrix .github/workflows/build-msys.yml

Triggers

workflow_dispatch, schedule

Runs on

windows-2025

Jobs

windows-msys2

Matrix

include, include.build, include.env, include.sys→ CLANG64, Release, UCRT64, clang-x86_64, ucrt-x86_64

Actions

msys2/setup-msys2

Commands

cmake -B build cmake --build build --config ${{ matrix.build }} -j $(nproc)
rm -rf build
cmake -B build -DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS cmake --build build --config ${{ matrix.build }} -j $(nproc)

View raw YAML

name: CI (msys)

on:
  # only manual triggers due to low-importance of the workflows
  # TODO: for regular runs, provision dedicated self-hosted runners
  workflow_dispatch:
  # run once every week
  schedule:
    - cron: '0 0 * * 0'

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}
  cancel-in-progress: true

env:
  GGML_NLOOP: 3
  GGML_N_THREADS: 1
  LLAMA_LOG_COLORS: 1
  LLAMA_LOG_PREFIX: 1
  LLAMA_LOG_TIMESTAMPS: 1

jobs:
  windows-msys2:
    runs-on: windows-2025

    strategy:
      fail-fast: false
      matrix:
        include:
          - { sys: UCRT64,  env: ucrt-x86_64,  build: Release }
          - { sys: CLANG64, env: clang-x86_64, build: Release }

    steps:
      - name: Clone
        uses: actions/checkout@v6

      #- name: ccache
      #  uses: ggml-org/ccache-action@v1.2.16
      #  with:
      #    key: windows-msys2
      #    variant: ccache
      #    evict-old-files: 1d
      #    save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Setup ${{ matrix.sys }}
        uses: msys2/setup-msys2@cafece8e6baf9247cf9b1bf95097b0b983cc558d # v2
        with:
          update: true
          msystem: ${{matrix.sys}}
          install: >-
            base-devel
            git
            mingw-w64-${{matrix.env}}-toolchain
            mingw-w64-${{matrix.env}}-cmake
            mingw-w64-${{matrix.env}}-openblas

      - name: Build using CMake
        shell: msys2 {0}
        run: |
            cmake -B build
            cmake --build build --config ${{ matrix.build }} -j $(nproc)

      - name: Clean after building using CMake
        shell: msys2 {0}
        run: |
            rm -rf build

      - name: Build using CMake w/ OpenBLAS
        shell: msys2 {0}
        run: |
            cmake -B build -DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS
            cmake --build build --config ${{ matrix.build }} -j $(nproc)

build-riscv matrix .github/workflows/build-riscv.yml

Triggers

workflow_dispatch, push, pull_request

Runs on

RISCV64

Jobs

ubuntu-riscv64-native-sanitizer

Matrix

build_type, sanitizer→ ADDRESS, Debug, THREAD, UNDEFINED

Commands

sudo apt-get update # Install necessary packages sudo apt-get install -y libatomic1 libtsan2 gcc-14 g++-14 rustup cmake build-essential wget ccache git-lfs # Set gcc-14 and g++-14 as the default compilers sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-14 100 sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-14 100 sudo ln -sf /usr/bin/gcc-14 /usr/bin/gcc sudo ln -sf /usr/bin/g++-14 /usr/bin/g++ # Install Rust stable version rustup install stable rustup default stable git lfs install
gcc --version g++ --version
# Unique cache directory per matrix combination export CCACHE_DIR="$HOME/.ccache/sanitizer-${{ matrix.sanitizer }}-${{ matrix.build_type }}" mkdir -p "$CCACHE_DIR" # Configure ccache ccache --set-config=max_size=5G ccache --set-config=compression=true ccache --set-config=compression_level=6 ccache --set-config=cache_dir="$CCACHE_DIR" ccache --set-config=sloppiness=file_macro,time_macros,include_file_mtime,include_file_ctime ccache --set-config=hash_dir=false # Export for subsequent steps echo "CCACHE_DIR=$CCACHE_DIR" >> $GITHUB_ENV echo "PATH=/usr/lib/ccache:$PATH" >> $GITHUB_ENV
cmake -B build \ -DLLAMA_OPENSSL=OFF \ -DCMAKE_BUILD_TYPE=${{ matrix.build_type }} \ -DGGML_OPENMP=ON \ -DLLAMA_BUILD_EXAMPLES=ON \ -DLLAMA_BUILD_TOOLS=ON \ -DLLAMA_BUILD_TESTS=OFF \ -DCMAKE_C_COMPILER_LAUNCHER=ccache \ -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \ -DLLAMA_SANITIZE_${{ matrix.sanitizer }}=ON \ -DCMAKE_C_COMPILER=riscv64-linux-gnu-gcc-14 \ -DCMAKE_CXX_COMPILER=riscv64-linux-gnu-g++-14 cmake --build build --config ${{ matrix.build_type }} -j $(nproc)
cmake -B build \ -DLLAMA_OPENSSL=OFF \ -DCMAKE_BUILD_TYPE=${{ matrix.build_type }} \ -DGGML_OPENMP=OFF \ -DLLAMA_BUILD_EXAMPLES=ON \ -DLLAMA_BUILD_TOOLS=ON \ -DLLAMA_BUILD_TESTS=OFF \ -DCMAKE_C_COMPILER_LAUNCHER=ccache \ -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \ -DLLAMA_SANITIZE_${{ matrix.sanitizer }}=ON \ -DCMAKE_C_COMPILER=riscv64-linux-gnu-gcc-14 \ -DCMAKE_CXX_COMPILER=riscv64-linux-gnu-g++-14 cmake --build build --config ${{ matrix.build_type }} -j $(nproc)
cd build ctest -L main --verbose --timeout 900

View raw YAML

name: CI (riscv)

on:
  workflow_dispatch: # allows manual triggering
  push:
    branches:
      - master
    paths: [
      '.github/workflows/build-riscv.yml',
      '**/CMakeLists.txt',
      '**/.cmake',
      '**/*.h',
      '**/*.hpp',
      '**/*.c',
      '**/*.cpp'
    ]

  pull_request:
    types: [opened, synchronize, reopened]
    paths: [
      '.github/workflows/build-riscv.yml',
      'ggml/src/ggml-cpu/arch/riscv/**'
    ]

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}
  cancel-in-progress: true

env:
  GGML_NLOOP: 3
  GGML_N_THREADS: 1
  LLAMA_LOG_COLORS: 1
  LLAMA_LOG_PREFIX: 1
  LLAMA_LOG_TIMESTAMPS: 1

jobs:
  ubuntu-riscv64-native-sanitizer:
    runs-on: RISCV64

    continue-on-error: true

    strategy:
      matrix:
        sanitizer: [ADDRESS, THREAD, UNDEFINED]
        build_type: [Debug]

    steps:
      - name: Install dependencies
        run: |
          sudo apt-get update

          # Install necessary packages
          sudo apt-get install -y libatomic1 libtsan2 gcc-14 g++-14 rustup cmake build-essential wget ccache git-lfs

          # Set gcc-14 and g++-14 as the default compilers
          sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-14 100
          sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-14 100
          sudo ln -sf /usr/bin/gcc-14 /usr/bin/gcc
          sudo ln -sf /usr/bin/g++-14 /usr/bin/g++

          # Install Rust stable version
          rustup install stable
          rustup default stable

          git lfs install

      - name: GCC version check
        run: |
          gcc --version
          g++ --version

      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: Setup ccache
        run: |
          # Unique cache directory per matrix combination
          export CCACHE_DIR="$HOME/.ccache/sanitizer-${{ matrix.sanitizer }}-${{ matrix.build_type }}"
          mkdir -p "$CCACHE_DIR"

          # Configure ccache
          ccache --set-config=max_size=5G
          ccache --set-config=compression=true
          ccache --set-config=compression_level=6
          ccache --set-config=cache_dir="$CCACHE_DIR"
          ccache --set-config=sloppiness=file_macro,time_macros,include_file_mtime,include_file_ctime
          ccache --set-config=hash_dir=false

          # Export for subsequent steps
          echo "CCACHE_DIR=$CCACHE_DIR" >> $GITHUB_ENV
          echo "PATH=/usr/lib/ccache:$PATH" >> $GITHUB_ENV

      - name: Build
        id: cmake_build
        if: ${{ matrix.sanitizer != 'THREAD' }}
        run: |
          cmake -B build \
            -DLLAMA_OPENSSL=OFF \
            -DCMAKE_BUILD_TYPE=${{ matrix.build_type }} \
            -DGGML_OPENMP=ON \
            -DLLAMA_BUILD_EXAMPLES=ON \
            -DLLAMA_BUILD_TOOLS=ON \
            -DLLAMA_BUILD_TESTS=OFF \
            -DCMAKE_C_COMPILER_LAUNCHER=ccache \
            -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
            -DLLAMA_SANITIZE_${{ matrix.sanitizer }}=ON \
            -DCMAKE_C_COMPILER=riscv64-linux-gnu-gcc-14 \
            -DCMAKE_CXX_COMPILER=riscv64-linux-gnu-g++-14

          cmake --build build --config ${{ matrix.build_type }} -j $(nproc)

      - name: Build (no OpenMP)
        id: cmake_build_no_openmp
        if: ${{ matrix.sanitizer == 'THREAD' }}
        run: |
          cmake -B build \
            -DLLAMA_OPENSSL=OFF \
            -DCMAKE_BUILD_TYPE=${{ matrix.build_type }} \
            -DGGML_OPENMP=OFF \
            -DLLAMA_BUILD_EXAMPLES=ON \
            -DLLAMA_BUILD_TOOLS=ON \
            -DLLAMA_BUILD_TESTS=OFF \
            -DCMAKE_C_COMPILER_LAUNCHER=ccache \
            -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
            -DLLAMA_SANITIZE_${{ matrix.sanitizer }}=ON \
            -DCMAKE_C_COMPILER=riscv64-linux-gnu-gcc-14 \
            -DCMAKE_CXX_COMPILER=riscv64-linux-gnu-g++-14

          cmake --build build --config ${{ matrix.build_type }} -j $(nproc)

      - name: Test
        id: cmake_test
        run: |
          cd build
          ctest -L main --verbose --timeout 900

build-sanitize matrix .github/workflows/build-sanitize.yml

Triggers

workflow_dispatch, push

Runs on

ubuntu-latest

Jobs

ubuntu-latest-sanitizer

Matrix

build_type, sanitizer→ ADDRESS, Debug, THREAD, UNDEFINED

Actions

ggml-org/ccache-action

Commands

sudo apt-get update sudo apt-get install build-essential libssl-dev
cmake -B build \ -DLLAMA_FATAL_WARNINGS=ON \ -DLLAMA_SANITIZE_${{ matrix.sanitizer }}=ON \ -DGGML_SANITIZE_${{ matrix.sanitizer }}=ON \ -DCMAKE_BUILD_TYPE=${{ matrix.build_type }} cmake --build build --config ${{ matrix.build_type }} -j $(nproc)
cmake -B build \ -DLLAMA_FATAL_WARNINGS=ON \ -DLLAMA_SANITIZE_${{ matrix.sanitizer }}=ON \ -DGGML_SANITIZE_${{ matrix.sanitizer }}=ON \ -DCMAKE_BUILD_TYPE=${{ matrix.build_type }} \ -DGGML_OPENMP=OFF cmake --build build --config ${{ matrix.build_type }} -j $(nproc)
cd build ctest -L main --verbose --timeout 900

View raw YAML

name: CI (sanitize)

on:
  workflow_dispatch: # allows manual triggering
  push:
    branches:
      - master
    paths: [
      '.github/workflows/build-sanitize.yml',
      '**/CMakeLists.txt',
      '**/.cmake',
      '**/*.h',
      '**/*.hpp',
      '**/*.c',
      '**/*.cpp'
    ]

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}
  cancel-in-progress: true

env:
  GGML_NLOOP: 3
  GGML_N_THREADS: 1
  LLAMA_LOG_COLORS: 1
  LLAMA_LOG_PREFIX: 1
  LLAMA_LOG_TIMESTAMPS: 1

jobs:
  ubuntu-latest-sanitizer:
    runs-on: ubuntu-latest

    continue-on-error: true

    strategy:
      matrix:
        sanitizer: [ADDRESS, THREAD, UNDEFINED]
        build_type: [Debug]

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: ubuntu-latest-sanitizer-${{ matrix.sanitizer }}
          evict-old-files: 1d
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Dependencies
        id: depends
        run: |
          sudo apt-get update
          sudo apt-get install build-essential libssl-dev

      - name: Build
        id: cmake_build
        if: ${{ matrix.sanitizer != 'THREAD' }}
        run: |
          cmake -B build \
            -DLLAMA_FATAL_WARNINGS=ON \
            -DLLAMA_SANITIZE_${{ matrix.sanitizer }}=ON \
            -DGGML_SANITIZE_${{ matrix.sanitizer }}=ON \
            -DCMAKE_BUILD_TYPE=${{ matrix.build_type }}

          cmake --build build --config ${{ matrix.build_type }} -j $(nproc)

      - name: Build (no OpenMP)
        id: cmake_build_no_openmp
        if: ${{ matrix.sanitizer == 'THREAD' }}
        run: |
          cmake -B build \
            -DLLAMA_FATAL_WARNINGS=ON \
            -DLLAMA_SANITIZE_${{ matrix.sanitizer }}=ON \
            -DGGML_SANITIZE_${{ matrix.sanitizer }}=ON \
            -DCMAKE_BUILD_TYPE=${{ matrix.build_type }} \
            -DGGML_OPENMP=OFF

          cmake --build build --config ${{ matrix.build_type }} -j $(nproc)

      - name: Test
        id: cmake_test
        run: |
          cd build
          ctest -L main --verbose --timeout 900

build-self-hosted .github/workflows/build-self-hosted.yml

Triggers

workflow_dispatch, push, pull_request

Runs on

self-hosted, Linux, NVIDIA, self-hosted, Linux, NVIDIA, self-hosted, Linux, NVIDIA, COOPMAT2, self-hosted, Linux, Intel, self-hosted, Linux, Intel, OpenVINO

Jobs

ggml-ci-nvidia-cuda, ggml-ci-nvidia-vulkan-cm, ggml-ci-nvidia-vulkan-cm2, ggml-ci-linux-intel-vulkan, ggml-ci-intel-openvino-gpu-low-perf

Commands

nvidia-smi GG_BUILD_CUDA=1 bash ./ci/run.sh ~/results/llama.cpp /mnt/llama.cpp
vulkaninfo --summary GG_BUILD_VULKAN=1 GGML_VK_DISABLE_COOPMAT2=1 bash ./ci/run.sh ~/results/llama.cpp /mnt/llama.cpp
vulkaninfo --summary GG_BUILD_VULKAN=1 bash ./ci/run.sh ~/results/llama.cpp /mnt/llama.cpp
vulkaninfo --summary GG_BUILD_VULKAN=1 bash ./ci/run.sh ~/results/llama.cpp ~/mnt/llama.cpp
cd ./openvino_toolkit chmod +x ./install_dependencies/install_openvino_dependencies.sh echo "Y" | sudo -E ./install_dependencies/install_openvino_dependencies.sh
source ./openvino_toolkit/setupvars.sh GG_BUILD_OPENVINO=1 GGML_OPENVINO_DEVICE=GPU GG_BUILD_LOW_PERF=1 bash ./ci/run.sh ./tmp/results ./tmp/mnt

View raw YAML

name: CI (self-hosted)

on:
  workflow_dispatch: # allows manual triggering
  push:
    branches:
      - master
    paths: [
      '.github/workflows/build.yml',
      '**/CMakeLists.txt',
      '**/.cmake',
      '**/*.h',
      '**/*.hpp',
      '**/*.c',
      '**/*.cpp',
      '**/*.cu',
      '**/*.cuh',
      '**/*.swift',
      '**/*.m',
      '**/*.metal',
      '**/*.comp',
      '**/*.glsl',
      '**/*.wgsl'
    ]

  pull_request:
    types: [opened, synchronize, reopened]
    paths: [
      '.github/workflows/build-self-hosted.yml',
      '**/CMakeLists.txt',
      '**/.cmake',
      '**/*.h',
      '**/*.hpp',
      '**/*.c',
      '**/*.cpp',
      '**/*.cu',
      '**/*.cuh',
      '**/*.swift',
      '**/*.m',
      '**/*.metal',
      '**/*.comp',
      '**/*.glsl',
      '**/*.wgsl'
    ]

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}
  cancel-in-progress: true

env:
  GGML_NLOOP: 3
  GGML_N_THREADS: 1
  LLAMA_LOG_COLORS: 1
  LLAMA_LOG_PREFIX: 1
  LLAMA_LOG_TIMESTAMPS: 1

jobs:
  ggml-ci-nvidia-cuda:
    runs-on: [self-hosted, Linux, NVIDIA]

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: Test
        id: ggml-ci
        run: |
          nvidia-smi
          GG_BUILD_CUDA=1 bash ./ci/run.sh ~/results/llama.cpp /mnt/llama.cpp

  ggml-ci-nvidia-vulkan-cm:
    runs-on: [self-hosted, Linux, NVIDIA]

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: Test
        id: ggml-ci
        run: |
          vulkaninfo --summary
          GG_BUILD_VULKAN=1 GGML_VK_DISABLE_COOPMAT2=1 bash ./ci/run.sh ~/results/llama.cpp /mnt/llama.cpp

  ggml-ci-nvidia-vulkan-cm2:
    runs-on: [self-hosted, Linux, NVIDIA, COOPMAT2]

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: Test
        id: ggml-ci
        run: |
          vulkaninfo --summary
          GG_BUILD_VULKAN=1 bash ./ci/run.sh ~/results/llama.cpp /mnt/llama.cpp

  # TODO: provision AMX-compatible machine
  #ggml-ci-cpu-amx:
  #  runs-on: [self-hosted, Linux, CPU, AMX]

  #  steps:
  #    - name: Clone
  #      id: checkout
  #      uses: actions/checkout@v6

  #    - name: Test
  #      id: ggml-ci
  #      run: |
  #        bash ./ci/run.sh ~/results/llama.cpp /mnt/llama.cpp

  # TODO: provision AMD GPU machine
  # ggml-ci-amd-vulkan:
  #   runs-on: [self-hosted, Linux, AMD]

  #   steps:
  #     - name: Clone
  #       id: checkout
  #       uses: actions/checkout@v6

  #     - name: Test
  #       id: ggml-ci
  #       run: |
  #         vulkaninfo --summary
  #         GG_BUILD_VULKAN=1 bash ./ci/run.sh ~/results/llama.cpp /mnt/llama.cpp

  # TODO: provision AMD GPU machine
  # ggml-ci-amd-rocm:
  #   runs-on: [self-hosted, Linux, AMD]

  #   steps:
  #     - name: Clone
  #       id: checkout
  #       uses: actions/checkout@v6

  #     - name: Test
  #       id: ggml-ci
  #       run: |
  #         amd-smi static
  #         GG_BUILD_ROCM=1 GG_BUILD_AMDGPU_TARGETS="gfx1101" bash ./ci/run.sh ~/results/llama.cpp /mnt/llama.cpp

  # TODO: sandbox Mac runners
  #  ggml-ci-mac-metal:
  #    runs-on: [self-hosted, macOS, ARM64]
  #
  #    steps:
  #      - name: Clone
  #        id: checkout
  #        uses: actions/checkout@v6
  #
  #      - name: Test
  #        id: ggml-ci
  #        run: |
  #          GG_BUILD_METAL=1 bash ./ci/run.sh ~/results/llama.cpp ~/mnt/llama.cpp
  #
  #  ggml-ci-mac-webgpu:
  #    runs-on: [self-hosted, macOS, ARM64]
  #
  #    steps:
  #      - name: Clone
  #        id: checkout
  #        uses: actions/checkout@v6
  #
  #      - name: Dawn Dependency
  #        id: dawn-depends
  #        run: |
  #          DAWN_VERSION="v2.0.0"
  #          DAWN_OWNER="reeselevine"
  #          DAWN_REPO="dawn"
  #          DAWN_ASSET_NAME="Dawn-5e9a4865b1635796ccc77dd30057f2b4002a1355-macos-latest-Release"
  #          echo "Fetching release asset from https://github.com/${DAWN_OWNER}/${DAWN_REPO}/releases/download/${DAWN_VERSION}/${DAWN_ASSET_NAME}.zip"
  #          curl -L -o artifact.zip \
  #            "https://github.com/${DAWN_OWNER}/${DAWN_REPO}/releases/download/${DAWN_VERSION}/${DAWN_ASSET_NAME}.zip"
  #          mkdir dawn
  #          unzip artifact.zip
  #          tar -xvf ${DAWN_ASSET_NAME}.tar.gz -C dawn --strip-components=1
  #
  #      - name: Test
  #        id: ggml-ci
  #        run: |
  #          GG_BUILD_WEBGPU=1 GG_BUILD_WEBGPU_DAWN_PREFIX="$GITHUB_WORKSPACE/dawn" \
  #            bash ./ci/run.sh ~/results/llama.cpp ~/mnt/llama.cpp
  #
  #  ggml-ci-mac-vulkan:
  #    runs-on: [self-hosted, macOS, ARM64]
  #
  #    steps:
  #      - name: Clone
  #        id: checkout
  #        uses: actions/checkout@v6
  #
  #      - name: Test
  #        id: ggml-ci
  #        run: |
  #          vulkaninfo --summary
  #          GG_BUILD_VULKAN=1 bash ./ci/run.sh ~/results/llama.cpp ~/mnt/llama.cpp

  ggml-ci-linux-intel-vulkan:
    runs-on: [self-hosted, Linux, Intel]

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6
        with:
          persist-credentials: false

      - name: Test
        id: ggml-ci
        run: |
          vulkaninfo --summary
          GG_BUILD_VULKAN=1 bash ./ci/run.sh ~/results/llama.cpp ~/mnt/llama.cpp

  ggml-ci-intel-openvino-gpu-low-perf:
    runs-on: [self-hosted, Linux, Intel, OpenVINO]

    env:
      # Sync versions in build.yml, build-self-hosted.yml, release.yml, build-cache.yml, .devops/openvino.Dockerfile
      OPENVINO_VERSION_MAJOR: "2026.0"
      OPENVINO_VERSION_FULL: "2026.0.0.20965.c6d6a13a886"

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: Setup OpenVINO Toolkit
        uses: ./.github/actions/linux-setup-openvino
        with:
          path: ./openvino_toolkit
          version_major: ${{ env.OPENVINO_VERSION_MAJOR }}
          version_full: ${{ env.OPENVINO_VERSION_FULL }}

      - name: Install OpenVINO dependencies
        run: |
          cd ./openvino_toolkit
          chmod +x ./install_dependencies/install_openvino_dependencies.sh
          echo "Y" | sudo -E ./install_dependencies/install_openvino_dependencies.sh

      - name: Test
        id: ggml-ci
        run: |
          source ./openvino_toolkit/setupvars.sh
          GG_BUILD_OPENVINO=1 GGML_OPENVINO_DEVICE=GPU GG_BUILD_LOW_PERF=1 bash ./ci/run.sh ./tmp/results ./tmp/mnt

build-vulkan .github/workflows/build-vulkan.yml

Triggers

workflow_dispatch, push, pull_request

Runs on

ubuntu-24.04

Jobs

ubuntu-24-vulkan-llvmpipe

Actions

ggml-org/ccache-action

Commands

sudo add-apt-repository -y ppa:kisak/kisak-mesa sudo apt-get update -y sudo apt-get install -y build-essential mesa-vulkan-drivers libxcb-xinput0 libxcb-xinerama0 libxcb-cursor-dev libssl-dev
echo "VULKAN_SDK_VERSION=$(curl https://vulkan.lunarg.com/sdk/latest/linux.txt)" >> "$GITHUB_ENV"
source ./vulkan_sdk/setup-env.sh cmake -B build \ -DGGML_VULKAN=ON cmake --build build --config Release -j $(nproc)
cd build export GGML_VK_VISIBLE_DEVICES=0 export GGML_VK_DISABLE_F16=1 export GGML_VK_DISABLE_COOPMAT=1 # This is using llvmpipe and runs slower than other backends ctest -L main --verbose --timeout 4800

View raw YAML

name: CI (vulkan)

on:
  workflow_dispatch: # allows manual triggering
  push:
    branches:
      - master
    paths: [
      '.github/workflows/build-vulkan.yml',
      '**/CMakeLists.txt',
      '**/.cmake',
      '**/*.h',
      '**/*.hpp',
      '**/*.c',
      '**/*.cpp',
      '**/*.comp',
      '**/*.glsl'
    ]

  pull_request:
    types: [opened, synchronize, reopened]
    paths: [
      '.github/workflows/build-vulkan.yml',
      'ggml/src/ggml-vulkan/**'
    ]

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}
  cancel-in-progress: true

env:
  GGML_NLOOP: 3
  GGML_N_THREADS: 1
  LLAMA_LOG_COLORS: 1
  LLAMA_LOG_PREFIX: 1
  LLAMA_LOG_TIMESTAMPS: 1

jobs:
  ubuntu-24-vulkan-llvmpipe:
    runs-on: ubuntu-24.04

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: ubuntu-24-vulkan-llvmpipe
          evict-old-files: 1d
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Dependencies
        id: depends
        run: |
          sudo add-apt-repository -y ppa:kisak/kisak-mesa
          sudo apt-get update -y
          sudo apt-get install -y build-essential mesa-vulkan-drivers libxcb-xinput0 libxcb-xinerama0 libxcb-cursor-dev libssl-dev

      - name: Get latest Vulkan SDK version
        id: vulkan_sdk_version
        run: |
          echo "VULKAN_SDK_VERSION=$(curl https://vulkan.lunarg.com/sdk/latest/linux.txt)" >> "$GITHUB_ENV"

      - name: Use Vulkan SDK Cache
        uses: actions/cache@v5
        id: cache-sdk
        with:
          path: ./vulkan_sdk
          key: vulkan-sdk-${{ env.VULKAN_SDK_VERSION }}-${{ runner.os }}

      - name: Setup Vulkan SDK
        if: steps.cache-sdk.outputs.cache-hit != 'true'
        uses: ./.github/actions/linux-setup-vulkan-llvmpipe
        with:
          path: ./vulkan_sdk
          version: ${{ env.VULKAN_SDK_VERSION }}

      - name: Build
        id: cmake_build
        run: |
          source ./vulkan_sdk/setup-env.sh
          cmake -B build \
            -DGGML_VULKAN=ON
          cmake --build build --config Release -j $(nproc)

      - name: Test
        id: cmake_test
        run: |
          cd build
          export GGML_VK_VISIBLE_DEVICES=0
          export GGML_VK_DISABLE_F16=1
          export GGML_VK_DISABLE_COOPMAT=1
          # This is using llvmpipe and runs slower than other backends
          ctest -L main --verbose --timeout 4800

check-vendor .github/workflows/check-vendor.yml

Triggers

workflow_dispatch, push, pull_request

Runs on

ubuntu-slim

Jobs

check-vendor

Commands

set -euo pipefail python3 scripts/sync_vendor.py
set -euo pipefail # detect modified or untracked files changed=$(git status --porcelain --untracked-files=all || true) if [ -n "$changed" ]; then echo "Vendor sync modified files:" echo "$changed" | awk '{ print $2 }' | sed '/^$/d' echo "Failing because vendor files mismatch. Please update scripts/sync_vendor.py" exit 1 else echo "Vendor files are up-to-date." fi

View raw YAML

name: Check vendor

on:
  workflow_dispatch: # allows manual triggering
  push:
    branches:
      - master
    paths: [
      'vendor/**',
      'scripts/sync_vendor.py'
    ]

  pull_request:
    types: [opened, synchronize, reopened]
    paths: [
      'vendor/**',
      'scripts/sync_vendor.py'
    ]

jobs:
  check-vendor:
    runs-on: ubuntu-slim

    steps:
      - name: Checkout
        uses: actions/checkout@v6
        with:
          fetch-depth: 0

      - name: Setup Python
        uses: actions/setup-python@v6
        with:
          python-version: '3.x'

      - name: Run vendor sync
        run: |
          set -euo pipefail
          python3 scripts/sync_vendor.py

      - name: Check for changes
        run: |
          set -euo pipefail
          # detect modified or untracked files
          changed=$(git status --porcelain --untracked-files=all || true)
          if [ -n "$changed" ]; then
            echo "Vendor sync modified files:"
            echo "$changed" | awk '{ print $2 }' | sed '/^$/d'
            echo "Failing because vendor files mismatch. Please update scripts/sync_vendor.py"
            exit 1
          else
            echo "Vendor files are up-to-date."
          fi

close-issue perms .github/workflows/close-issue.yml

Triggers: schedule
Runs on: ubuntu-slim
Jobs: close-issues
Actions: actions/stale

View raw YAML

name: Close inactive issues
on:
  schedule:
    - cron: "42 0 * * *"

# Fine-grant permission
# https://docs.github.com/en/actions/security-for-github-actions/security-guides/automatic-token-authentication#modifying-the-permissions-for-the-github_token
permissions:
  issues: write

jobs:
  close-issues:
    runs-on: ubuntu-slim
    permissions:
      issues: write
      pull-requests: write
    steps:
      - uses: actions/stale@v10
        with:
          exempt-issue-labels: "refactoring,help wanted,good first issue,research 🔬,bug,roadmap"
          days-before-issue-stale: 30
          days-before-issue-close: 14
          stale-issue-label: "stale"
          close-issue-message: "This issue was closed because it has been inactive for 14 days since being marked as stale."
          days-before-pr-stale: -1
          days-before-pr-close: -1
          operations-per-run: 10000
          repo-token: ${{ secrets.GITHUB_TOKEN }}

copilot-setup-steps .github/workflows/copilot-setup-steps.yml

Triggers

workflow_dispatch, push, pull_request

Runs on

ubuntu-latest

Jobs

copilot-setup-steps

Actions

ggml-org/ccache-action

Commands

sudo apt-get update sudo apt-get install build-essential libssl-dev # Install git-clang-format script for formatting only changed code wget -O /tmp/git-clang-format https://raw.githubusercontent.com/llvm/llvm-project/release/18.x/clang/tools/clang-format/git-clang-format sudo cp /tmp/git-clang-format /usr/local/bin/git-clang-format sudo chmod +x /usr/local/bin/git-clang-format
python3 -m venv .venv source .venv/bin/activate pip install -r requirements/requirements-all.txt -r tools/server/tests/requirements.txt

View raw YAML

name: "Copilot Setup Steps"

# Automatically run the setup steps when they are changed to allow for easy validation, and
# allow manual testing through the repository's "Actions" tab
on:
  workflow_dispatch:
  push:
    paths:
      - .github/workflows/copilot-setup-steps.yml
  pull_request:
    paths:
      - .github/workflows/copilot-setup-steps.yml

jobs:
  # The job MUST be called `copilot-setup-steps` or it will not be picked up by Copilot.
  copilot-setup-steps:
    runs-on: ubuntu-latest

    # Set the permissions to the lowest permissions possible needed for your steps.
    # Copilot will be given its own token for its operations.
    permissions:
      # If you want to clone the repository as part of your setup steps, for example to install dependencies, you'll need the `contents: read` permission. If you don't clone the repository in your setup steps, Copilot will do this for you automatically after the steps complete.
      contents: read

    # You can define any steps you want, and they will run before the agent starts.
    # If you do not check out your code, Copilot will do this for you.
    steps:
      - name: Checkout code
        uses: actions/checkout@v6

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: copilot-setup-steps
          evict-old-files: 1d

      - name: Dependencies
        id: depends
        run: |
          sudo apt-get update
          sudo apt-get install build-essential libssl-dev
          # Install git-clang-format script for formatting only changed code
          wget -O /tmp/git-clang-format https://raw.githubusercontent.com/llvm/llvm-project/release/18.x/clang/tools/clang-format/git-clang-format
          sudo cp /tmp/git-clang-format /usr/local/bin/git-clang-format
          sudo chmod +x /usr/local/bin/git-clang-format

      - name: Set up Python
        uses: actions/setup-python@v6
        with:
          python-version: '3.11'

      - name: Install Python dependencies
        run: |
          python3 -m venv .venv
          source .venv/bin/activate
          pip install -r requirements/requirements-all.txt -r tools/server/tests/requirements.txt

docker matrix perms .github/workflows/docker.yml

Triggers

workflow_dispatch, schedule

Runs on

${{ matrix.config.runs_on }}, ubuntu-22.04

Jobs

push_to_registry, create_tag

Matrix

config, config.cuda_version, config.dockerfile, config.free_disk_space, config.full, config.light, config.platforms, config.runs_on, config.server, config.tag, config.ubuntu_version→ .devops/cpu.Dockerfile, .devops/cuda-new.Dockerfile, .devops/cuda.Dockerfile, .devops/intel.Dockerfile, .devops/musa.Dockerfile, .devops/openvino.Dockerfile, .devops/rocm.Dockerfile, .devops/s390x.Dockerfile, .devops/vulkan.Dockerfile, 12.4.0, 13.1.0, 22.04, 24.04, False, True, cpu, cuda cuda12, cuda13, intel, linux/amd64, linux/arm64, linux/s390x, musa, openvino, rocm, s390x, ubuntu-24.04, ubuntu-24.04-s390x, vulkan

Actions

docker/setup-qemu-action, docker/setup-buildx-action, docker/login-action, ggml-org/free-disk-space, docker/build-push-action, docker/build-push-action, docker/build-push-action

Commands

REPO_OWNER="${GITHUB_REPOSITORY_OWNER@L}" # to lower case REPO_NAME="${{ github.event.repository.name }}" PREFIX="ghcr.io/${REPO_OWNER}/${REPO_NAME}:" # list all tags possible tags="${{ matrix.config.tag }}" for tag in $tags; do if [[ "$tag" == "cpu" ]]; then TYPE="" else TYPE="-$tag" fi CACHETAGS="${PREFIX}buildcache${TYPE}" FULLTAGS="${FULLTAGS:+$FULLTAGS,}${PREFIX}full${TYPE},${PREFIX}full${TYPE}-${{ steps.srctag.outputs.name }}" LIGHTTAGS="${LIGHTTAGS:+$LIGHTTAGS,}${PREFIX}light${TYPE},${PREFIX}light${TYPE}-${{ steps.srctag.outputs.name }}" SERVERTAGS="${SERVERTAGS:+$SERVERTAGS,}${PREFIX}server${TYPE},${PREFIX}server${TYPE}-${{ steps.srctag.outputs.name }}" done echo "cache_output_tags=$CACHETAGS" >> $GITHUB_OUTPUT echo "full_output_tags=$FULLTAGS" >> $GITHUB_OUTPUT echo "light_output_tags=$LIGHTTAGS" >> $GITHUB_OUTPUT echo "server_output_tags=$SERVERTAGS" >> $GITHUB_OUTPUT echo "cache_output_tags=$CACHETAGS" # print out for debugging echo "full_output_tags=$FULLTAGS" # print out for debugging echo "light_output_tags=$LIGHTTAGS" # print out for debugging echo "server_output_tags=$SERVERTAGS" # print out for debugging
git tag ${{ steps.srctag.outputs.name }} || exit 0 git push origin ${{ steps.srctag.outputs.name }} || exit 0

View raw YAML

# This workflow uses actions that are not certified by GitHub.
# They are provided by a third-party and are governed by
# separate terms of service, privacy policy, and support
# documentation.

# GitHub recommends pinning actions to a commit SHA.
# To get a newer version, you will need to update the SHA.
# You can also reference a tag or branch, but the action may change without warning.

name: Publish Docker image

on:
  workflow_dispatch: # allows manual triggering
  schedule:
    # Rebuild daily rather than on every push because it is expensive
    - cron: '12 4 * * *'

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}
  cancel-in-progress: true

# Fine-grant permission
# https://docs.github.com/en/actions/security-for-github-actions/security-guides/automatic-token-authentication#modifying-the-permissions-for-the-github_token
permissions:
  packages: write

jobs:
  push_to_registry:
    name: Push Docker image to Docker Hub

    runs-on: ${{ matrix.config.runs_on }}
    env:
      COMMIT_SHA: ${{ github.sha }}
    strategy:
      fail-fast: false
      matrix:
        config:
          # Multi-stage build
          - { tag: "cpu", dockerfile: ".devops/cpu.Dockerfile", platforms: "linux/arm64", full: true, light: true, server: true, free_disk_space: false, runs_on: "ubuntu-24.04" }
          - { tag: "cpu", dockerfile: ".devops/cpu.Dockerfile", platforms: "linux/amd64", full: true, light: true, server: true, free_disk_space: false, runs_on: "ubuntu-24.04" }
          - { tag: "cuda cuda12", dockerfile: ".devops/cuda.Dockerfile", platforms: "linux/amd64", full: true, light: true, server: true, free_disk_space: true,  runs_on: "ubuntu-24.04", cuda_version: "12.4.0", ubuntu_version: "22.04" }
          - { tag: "cuda13", dockerfile: ".devops/cuda-new.Dockerfile",  platforms: "linux/amd64", full: true, light: true, server: true, free_disk_space: true,  runs_on: "ubuntu-24.04", cuda_version: "13.1.0", ubuntu_version: "24.04" }
          - { tag: "musa",   dockerfile: ".devops/musa.Dockerfile",   platforms: "linux/amd64", full: true, light: true, server: true, free_disk_space: true,  runs_on: "ubuntu-24.04" }
          - { tag: "intel",  dockerfile: ".devops/intel.Dockerfile",  platforms: "linux/amd64", full: true, light: true, server: true, free_disk_space: true,  runs_on: "ubuntu-24.04" }
          - { tag: "vulkan", dockerfile: ".devops/vulkan.Dockerfile", platforms: "linux/amd64", full: true, light: true, server: true, free_disk_space: false, runs_on: "ubuntu-24.04" }
          - { tag: "s390x",  dockerfile: ".devops/s390x.Dockerfile",  platforms: "linux/s390x", full: true, light: true, server: true, free_disk_space: false, runs_on: "ubuntu-24.04-s390x" }
          - { tag: "rocm",   dockerfile: ".devops/rocm.Dockerfile",   platforms: "linux/amd64", full: true, light: true, server: true, free_disk_space: true,  runs_on: "ubuntu-24.04" }
          - { tag: "openvino", dockerfile: ".devops/openvino.Dockerfile", platforms: "linux/amd64", full: true, light: true, server: true, free_disk_space: false, runs_on: "ubuntu-24.04" }
    steps:
      - name: Check out the repo
        uses: actions/checkout@v6
        with:
          fetch-depth: 0 # preserve git history, so we can determine the build number

      - name: Set up QEMU
        if: ${{ matrix.config.tag != 's390x' }}
        uses: docker/setup-qemu-action@c7c53464625b32c7a7e944ae62b3e17d2b600130 # v3
        with:
          image: tonistiigi/binfmt:qemu-v10.2.1

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3

      - name: Log in to Docker Hub
        uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # v3
        with:
          registry: ghcr.io
          username: ${{ github.repository_owner }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Determine source tag name
        id: srctag
        uses: ./.github/actions/get-tag-name
        env:
          BRANCH_NAME: ${{ github.head_ref || github.ref_name }}

      - name: Determine image tag name
        id: tag
        shell: bash
        run: |
          REPO_OWNER="${GITHUB_REPOSITORY_OWNER@L}"  # to lower case
          REPO_NAME="${{ github.event.repository.name }}"
          PREFIX="ghcr.io/${REPO_OWNER}/${REPO_NAME}:"

          # list all tags possible
          tags="${{ matrix.config.tag }}"
          for tag in $tags; do
              if [[ "$tag" == "cpu" ]]; then
                  TYPE=""
              else
                  TYPE="-$tag"
              fi
              CACHETAGS="${PREFIX}buildcache${TYPE}"
              FULLTAGS="${FULLTAGS:+$FULLTAGS,}${PREFIX}full${TYPE},${PREFIX}full${TYPE}-${{ steps.srctag.outputs.name }}"
              LIGHTTAGS="${LIGHTTAGS:+$LIGHTTAGS,}${PREFIX}light${TYPE},${PREFIX}light${TYPE}-${{ steps.srctag.outputs.name }}"
              SERVERTAGS="${SERVERTAGS:+$SERVERTAGS,}${PREFIX}server${TYPE},${PREFIX}server${TYPE}-${{ steps.srctag.outputs.name }}"
          done
          echo "cache_output_tags=$CACHETAGS" >> $GITHUB_OUTPUT
          echo "full_output_tags=$FULLTAGS" >> $GITHUB_OUTPUT
          echo "light_output_tags=$LIGHTTAGS" >> $GITHUB_OUTPUT
          echo "server_output_tags=$SERVERTAGS" >> $GITHUB_OUTPUT
          echo "cache_output_tags=$CACHETAGS"  # print out for debugging
          echo "full_output_tags=$FULLTAGS"  # print out for debugging
          echo "light_output_tags=$LIGHTTAGS"  # print out for debugging
          echo "server_output_tags=$SERVERTAGS"  # print out for debugging
        env:
          GITHUB_REPOSITORY_OWNER: '${{ github.repository_owner }}'

      - name: Free Disk Space (Ubuntu)
        if: ${{ matrix.config.free_disk_space == true }}
        uses: ggml-org/free-disk-space@v1.3.1
        with:
          # this might remove tools that are actually needed,
          # if set to "true" but frees about 6 GB
          tool-cache: false

          # all of these default to true, but feel free to set to
          # "false" if necessary for your workflow
          android: true
          dotnet: true
          haskell: true
          large-packages: true
          docker-images: true
          swap-storage: true

      - name: Build and push Full Docker image (tagged + versioned)
        if: ${{ (github.event_name == 'push' || github.event_name == 'schedule' || github.event_name == 'workflow_dispatch') && matrix.config.full == true }}
        uses: docker/build-push-action@10e90e3645eae34f1e60eeb005ba3a3d33f178e8 # v6
        with:
          context: .
          push: true
          platforms: ${{ matrix.config.platforms }}
          # tag list is generated from step above
          tags: ${{ steps.tag.outputs.full_output_tags }}
          file: ${{ matrix.config.dockerfile }}
          target: full
          provenance: false
          build-args: |
            ${{ matrix.config.ubuntu_version && format('UBUNTU_VERSION={0}', matrix.config.ubuntu_version) || '' }}
            ${{ matrix.config.cuda_version && format('CUDA_VERSION={0}', matrix.config.cuda_version) || '' }}
          # using github experimental cache
          #cache-from: type=gha
          #cache-to: type=gha,mode=max
          # return to this if the experimental github cache is having issues
          #cache-to: type=local,dest=/tmp/.buildx-cache
          #cache-from: type=local,src=/tmp/.buildx-cache
          # using registry cache (no storage limit)
          cache-from: type=registry,ref=${{ steps.tag.outputs.cache_output_tags }}
          cache-to: type=registry,ref=${{ steps.tag.outputs.cache_output_tags }},mode=max

      - name: Build and push Light Docker image (tagged + versioned)
        if: ${{ (github.event_name == 'push' || github.event_name == 'schedule' || github.event_name == 'workflow_dispatch') && matrix.config.light == true }}
        uses: docker/build-push-action@10e90e3645eae34f1e60eeb005ba3a3d33f178e8 # v6
        with:
          context: .
          push: true
          platforms: ${{ matrix.config.platforms }}
          # tag list is generated from step above
          tags: ${{ steps.tag.outputs.light_output_tags }}
          file: ${{ matrix.config.dockerfile }}
          target: light
          provenance: false
          build-args: |
            ${{ matrix.config.ubuntu_version && format('UBUNTU_VERSION={0}', matrix.config.ubuntu_version) || '' }}
            ${{ matrix.config.cuda_version && format('CUDA_VERSION={0}', matrix.config.cuda_version) || '' }}
          # using github experimental cache
          #cache-from: type=gha
          #cache-to: type=gha,mode=max
          # return to this if the experimental github cache is having issues
          #cache-to: type=local,dest=/tmp/.buildx-cache
          #cache-from: type=local,src=/tmp/.buildx-cache
          # using registry cache (no storage limit)
          cache-from: type=registry,ref=${{ steps.tag.outputs.cache_output_tags }}
          cache-to: type=registry,ref=${{ steps.tag.outputs.cache_output_tags }},mode=max

      - name: Build and push Server Docker image (tagged + versioned)
        if: ${{ (github.event_name == 'push' || github.event_name == 'schedule' || github.event_name == 'workflow_dispatch') && matrix.config.server == true }}
        uses: docker/build-push-action@10e90e3645eae34f1e60eeb005ba3a3d33f178e8 # v6
        with:
          context: .
          push: true
          platforms: ${{ matrix.config.platforms }}
          # tag list is generated from step above
          tags: ${{ steps.tag.outputs.server_output_tags }}
          file: ${{ matrix.config.dockerfile }}
          target: server
          provenance: false
          build-args: |
            ${{ matrix.config.ubuntu_version && format('UBUNTU_VERSION={0}', matrix.config.ubuntu_version) || '' }}
            ${{ matrix.config.cuda_version && format('CUDA_VERSION={0}', matrix.config.cuda_version) || '' }}
          # using github experimental cache
          #cache-from: type=gha
          #cache-to: type=gha,mode=max
          # return to this if the experimental github cache is having issues
          #cache-to: type=local,dest=/tmp/.buildx-cache
          #cache-from: type=local,src=/tmp/.buildx-cache
          # using registry cache (no storage limit)
          cache-from: type=registry,ref=${{ steps.tag.outputs.cache_output_tags }}
          cache-to: type=registry,ref=${{ steps.tag.outputs.cache_output_tags }},mode=max

  create_tag:
    name: Create and push git tag
    runs-on: ubuntu-22.04
    permissions:
      contents: write

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6
        with:
          fetch-depth: 0

      - name: Determine source tag name
        id: srctag
        uses: ./.github/actions/get-tag-name
        env:
          BRANCH_NAME: ${{ github.head_ref || github.ref_name }}

      - name: Create and push git tag
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          git tag ${{ steps.srctag.outputs.name }} || exit 0
          git push origin ${{ steps.srctag.outputs.name }} || exit 0

editorconfig .github/workflows/editorconfig.yml

Triggers

workflow_dispatch, push, pull_request

Runs on

ubuntu-slim

Jobs

editorconfig

Actions

editorconfig-checker/action-editorconfig-checker

Commands

editorconfig-checker

View raw YAML

name: EditorConfig Checker

on:
  workflow_dispatch: # allows manual triggering
    inputs:
      create_release:
        description: 'Create new release'
        required: true
        type: boolean
  push:
    branches:
      - master
  pull_request:
    branches:
      - master

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}
  cancel-in-progress: true

jobs:
  editorconfig:
    runs-on: ubuntu-slim
    steps:
      - uses: actions/checkout@v6
      - uses: editorconfig-checker/action-editorconfig-checker@840e866d93b8e032123c23bac69dece044d4d84c # v2.2.0
        with:
          version: v3.0.3
      - run: editorconfig-checker

gguf-publish .github/workflows/gguf-publish.yml

Triggers

workflow_dispatch, push

Runs on

ubuntu-latest

Jobs

deploy

Actions

pypa/gh-action-pypi-publish

Commands

cd gguf-py python -m pip install poetry==2.3.2 poetry install
cd gguf-py && poetry build

View raw YAML

# This workflow will upload a Python Package using Twine when a GGUF release is created
# For more information see: https://help.github.com/en/actions/language-and-framework-guides/using-python-with-github-actions#publishing-to-package-registries

# See `gguf-py/README.md` for how to make a release.

# This workflow uses actions that are not certified by GitHub.
# They are provided by a third-party and are governed by
# separate terms of service, privacy policy, and support
# documentation.

name: Upload Python Package

on:
  workflow_dispatch:
  push:
    # Pattern matched against refs/tags
    tags:
      - 'gguf-v*'           # Push events to every version tag


jobs:
  deploy:

    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v6
    - name: Set up Python
      uses: actions/setup-python@v6
      with:
        python-version: '3.11'
    - name: Install dependencies
      run: |
        cd gguf-py
        python -m pip install poetry==2.3.2
        poetry install

    - name: Build package
      run: cd gguf-py && poetry build
    - name: Publish package
      uses: pypa/gh-action-pypi-publish@ed0c53931b1dc9bd32cbe73a98c7f6766f8a527e # release/v1
      with:
        password: ${{ secrets.PYPI_API_TOKEN }}
        packages-dir: gguf-py/dist

hip-quality-check .github/workflows/hip-quality-check.yml

Triggers

workflow_dispatch, push, pull_request

Runs on

ubuntu-22.04

Jobs

ubuntu-22-hip-quality-check

Actions

ggml-org/ccache-action

Commands

sudo apt-get update sudo apt-get install -y build-essential git cmake rocblas-dev hipblas-dev libssl-dev python3
cmake -B build -S . \ -DCMAKE_HIP_COMPILER="$(hipconfig -l)/clang" \ -DGPU_TARGETS=gfx908 \ -DGGML_HIP=ON \ -DGGML_HIP_EXPORT_METRICS=Off \ -DCMAKE_HIP_FLAGS="-Werror -Wno-tautological-compare" \ -DCMAKE_BUILD_TYPE=Release cd build make -j $(nproc)
cmake -B build -S . \ -DCMAKE_HIP_COMPILER="$(hipconfig -l)/clang" \ -DGPU_TARGETS=gfx908 \ -DGGML_HIP=ON \ -DGGML_HIP_EXPORT_METRICS=On \ -DCMAKE_HIP_FLAGS="" \ -DCMAKE_BUILD_TYPE=Release cd build make -j $(nproc) 2>&1 | tee metrics.log | grep -v 'Rpass-analysis=kernel-resource-usage\|remark:\|^$' python3 ../scripts/hip/gcn-cdna-vgpr-check.py metrics.log

View raw YAML

name: HIP quality check

on:
  workflow_dispatch: # allows manual triggering
  push:
    branches:
      - master
    paths: [
      '.github/workflows/hip-quality-check.yml',
      '**/*.cu',
      '**/*.cuh',
      'scripts/hip/gcn-cdna-vgpr-check.py'
    ]

  pull_request:
    types: [opened, synchronize, reopened]
    paths: [
      '.github/workflows/hip-quality-check.yml',
      '**/*.cu',
      '**/*.cuh',
      'scripts/hip/gcn-cdna-vgpr-check.py'
    ]

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}
  cancel-in-progress: true

env:
  GGML_NLOOP: 3
  GGML_N_THREADS: 1
  LLAMA_LOG_COLORS: 1
  LLAMA_LOG_PREFIX: 1
  LLAMA_LOG_TIMESTAMPS: 1

jobs:
  ubuntu-22-hip-quality-check:
    runs-on: ubuntu-22.04
    container: rocm/dev-ubuntu-22.04:7.2
    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: Dependencies
        id: depends
        run: |
          sudo apt-get update
          sudo apt-get install -y build-essential git cmake rocblas-dev hipblas-dev libssl-dev python3

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: ubuntu-22-hip-quality-check
          evict-old-files: 1d
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Build with Werror
        id: cmake_build
        run: |
          cmake -B build -S . \
            -DCMAKE_HIP_COMPILER="$(hipconfig -l)/clang" \
            -DGPU_TARGETS=gfx908 \
            -DGGML_HIP=ON \
            -DGGML_HIP_EXPORT_METRICS=Off \
            -DCMAKE_HIP_FLAGS="-Werror -Wno-tautological-compare" \
            -DCMAKE_BUILD_TYPE=Release
          cd build
          make -j $(nproc)

      - name: Check for major VGPR spills
        id: vgpr_check
        run: |
          cmake -B build -S . \
            -DCMAKE_HIP_COMPILER="$(hipconfig -l)/clang" \
            -DGPU_TARGETS=gfx908 \
            -DGGML_HIP=ON \
            -DGGML_HIP_EXPORT_METRICS=On \
            -DCMAKE_HIP_FLAGS="" \
            -DCMAKE_BUILD_TYPE=Release
          cd build
          make -j $(nproc) 2>&1 | tee metrics.log | grep -v 'Rpass-analysis=kernel-resource-usage\|remark:\|^$'
          python3 ../scripts/hip/gcn-cdna-vgpr-check.py metrics.log

labeler .github/workflows/labeler.yml

Triggers: pull_request_target
Runs on: ubuntu-slim
Jobs: labeler
Actions: actions/labeler

View raw YAML

name: "Pull Request Labeler"
on:
- pull_request_target

jobs:
  labeler:
    permissions:
      contents: read
      pull-requests: write
    runs-on: ubuntu-slim
    steps:
    - uses: actions/checkout@v6
      with:
        repository: "ggml-org/llama.cpp"
    - uses: actions/labeler@v6
      with:
        configuration-path: '.github/labeler.yml'

pre-tokenizer-hashes .github/workflows/pre-tokenizer-hashes.yml

Triggers

push, pull_request

Runs on

ubuntu-slim

Jobs

pre-tokenizer-hashes

Commands

python3 -m venv .venv .venv/bin/pip install -r requirements/requirements-convert_hf_to_gguf_update.txt
cp convert_hf_to_gguf.py /tmp .venv/bin/python convert_hf_to_gguf_update.py --check-missing
if ! diff -q convert_hf_to_gguf.py /tmp/convert_hf_to_gguf.py; then echo "Model pre-tokenizer hashes (in convert_hf_to_gguf.py) do not match generated hashes (from convert_hf_to_gguf_update.py)." echo "To fix: run ./convert_hf_to_gguf_update.py and commit the updated convert_hf_to_gguf.py along with your changes" echo "Differences found:" diff convert_hf_to_gguf.py /tmp/convert_hf_to_gguf.py || true exit 1 fi echo "Model pre-tokenizer hashes are up to date."

View raw YAML

name: Check Pre-Tokenizer Hashes

on:
    push:
        paths:
            - 'convert_hf_to_gguf.py'
            - 'convert_hf_to_gguf_update.py'
    pull_request:
        paths:
            - 'convert_hf_to_gguf.py'
            - 'convert_hf_to_gguf_update.py'

jobs:
    pre-tokenizer-hashes:
        runs-on: ubuntu-slim

        steps:
        - name: Checkout repository
          uses: actions/checkout@v6

        - name: Set up Python
          uses: actions/setup-python@v6
          with:
              python-version: '3.11'

        - name: Install Python dependencies
          run: |
              python3 -m venv .venv
              .venv/bin/pip install -r requirements/requirements-convert_hf_to_gguf_update.txt

        - name: Update pre-tokenizer hashes
          run: |
              cp convert_hf_to_gguf.py /tmp
              .venv/bin/python convert_hf_to_gguf_update.py --check-missing

        - name: Check if committed pre-tokenizer hashes matches generated version
          run: |
              if ! diff -q convert_hf_to_gguf.py /tmp/convert_hf_to_gguf.py; then
                  echo "Model pre-tokenizer hashes (in convert_hf_to_gguf.py) do not match generated hashes (from convert_hf_to_gguf_update.py)."
                  echo "To fix: run ./convert_hf_to_gguf_update.py and commit the updated convert_hf_to_gguf.py along with your changes"
                  echo "Differences found:"
                  diff convert_hf_to_gguf.py /tmp/convert_hf_to_gguf.py || true
                  exit 1
              fi
              echo "Model pre-tokenizer hashes are up to date."

python-check-requirements .github/workflows/python-check-requirements.yml

Triggers

push, pull_request

Runs on

ubuntu-slim

Jobs

python-check-requirements

Commands

bash scripts/check-requirements.sh

View raw YAML

name: Python check requirements.txt

on:
  push:
    paths:
      - '.github/workflows/python-check-requirements.yml'
      - 'scripts/check-requirements.sh'
      - 'convert*.py'
      - '**/requirements*.txt'
  pull_request:
    paths:
      - '.github/workflows/python-check-requirements.yml'
      - 'scripts/check-requirements.sh'
      - 'convert*.py'
      - '**/requirements*.txt'

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}
  cancel-in-progress: true

jobs:
  python-check-requirements:
    runs-on: ubuntu-slim
    name: check-requirements
    steps:
      - name: Check out source repository
        uses: actions/checkout@v6
      - name: Set up Python environment
        uses: actions/setup-python@v6
        with:
          python-version: "3.11"
      - name: Run check-requirements.sh script
        run:  bash scripts/check-requirements.sh

python-lint .github/workflows/python-lint.yml

Triggers: push, pull_request
Runs on: ubuntu-slim
Jobs: flake8-lint
Actions: py-actions/flake8

View raw YAML

name: flake8 Lint

on:
  push:
    branches:
      - master
    paths: [
      '.github/workflows/python-lint.yml',
      '**/*.py'
    ]
  pull_request:
    types: [opened, synchronize, reopened]
    paths: [
      '.github/workflows/python-lint.yml',
      '**/*.py'
    ]

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}
  cancel-in-progress: true

jobs:
  flake8-lint:
    runs-on: ubuntu-slim
    name: Lint
    steps:
      - name: Check out source repository
        uses: actions/checkout@v6
      - name: Set up Python environment
        uses: actions/setup-python@v6
        with:
          python-version: "3.11"
      - name: flake8 Lint
        uses: py-actions/flake8@84ec6726560b6d5bd68f2a5bed83d62b52bb50ba # v2
        with:
            plugins: "flake8-no-print"

python-type-check .github/workflows/python-type-check.yml

Triggers

push, pull_request

Runs on

ubuntu-slim

Jobs

python-type-check

Commands

ty check --output-format=github

View raw YAML

name: Python Type-Check

on:
  push:
    paths:
      - '.github/workflows/python-type-check.yml'
      - 'ty.toml'
      - '**.py'
      - '**/requirements*.txt'
      # - 'pyrightconfig.json'
  pull_request:
    paths:
      - '.github/workflows/python-type-check.yml'
      - 'ty.toml'
      - '**.py'
      - '**/requirements*.txt'
      # - 'pyrightconfig.json'

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}
  cancel-in-progress: true

jobs:
  python-type-check:
    runs-on: ubuntu-slim
    name: python type-check
    steps:
      - name: Check out source repository
        uses: actions/checkout@v6
      - name: Set up Python environment
        uses: actions/setup-python@v6
        with:
          python-version: "3.11"
          pip-install: -r requirements/requirements-all.txt ty==0.0.24
      # - name: Type-check with Pyright
      #   uses: jakebailey/pyright-action@v2
      #   with:
      #     version: 1.1.382
      #     level: warning
      #     warnings: true
      - name: Type-check with ty
        run: |
            ty check --output-format=github

release matrix .github/workflows/release.yml

Triggers

workflow_dispatch, push

Runs on

macos-14, macos-15-intel, ${{ matrix.os }}, ubuntu-22.04, ubuntu-24.04, windows-2025, windows-2025, windows-2022, windows-2022, ubuntu-22.04, windows-2022, macos-15, ${{ matrix.arch == 'aarch64' && 'ubuntu-24.04-arm' || 'ubuntu-24.04' }}, ubuntu-slim

Jobs

macOS-arm64, macOS-x64, ubuntu-22-cpu, ubuntu-22-vulkan, ubuntu-24-openvino, windows-cpu, windows, windows-cuda, windows-sycl, ubuntu-22-rocm, windows-hip, ios-xcode-build, openEuler-cann, release

Matrix

cuda, include, include.ROCM_VERSION, include.arch, include.backend, include.build, include.chip_type, include.defines, include.gpu_targets, include.name, include.os, include.target, include.use_acl_graph→ -DGGML_VULKAN=ON, -G "Ninja Multi-Config" -D CMAKE_TOOLCHAIN_FILE=cmake/arm64-windows-llvm.cmake -DCMAKE_PREFIX_PATH="$env:RUNNER_TEMP/opencl-arm64-release" -DGGML_OPENCL=ON -DGGML_OPENCL_USE_ADRENO_KERNELS=ON, 12.4, 13.1, 310p, 7.2, 910b, Release, aarch64, arm64, gfx1150;gfx1151;gfx1200;gfx1201;gfx1100;gfx1101;gfx1102;gfx1030;gfx1031;gfx1032, gfx908;gfx90a;gfx942;gfx1030;gfx1100;gfx1101;gfx1151;gfx1150;gfx1200;gfx1201, ggml-opencl, ggml-vulkan, off, on, opencl-adreno, radeon, s390x, ubuntu-22.04, ubuntu-24.04-s390x, vulkan, x64, x86

Actions

Commands

sysctl -a cmake -B build \ -DCMAKE_INSTALL_RPATH='@loader_path' \ -DCMAKE_BUILD_WITH_INSTALL_RPATH=ON \ -DLLAMA_FATAL_WARNINGS=ON \ -DLLAMA_BUILD_BORINGSSL=ON \ -DGGML_METAL_USE_BF16=ON \ -DGGML_METAL_EMBED_LIBRARY=ON \ -DGGML_RPC=ON \ ${{ env.CMAKE_ARGS }} cmake --build build --config Release -j $(sysctl -n hw.logicalcpu)
cp LICENSE ./build/bin/ tar -czvf llama-${{ steps.tag.outputs.name }}-bin-macos-arm64.tar.gz -s ",./,llama-${{ steps.tag.outputs.name }}/," -C ./build/bin .
sysctl -a # Metal is disabled due to intermittent failures with Github runners not having a GPU: # https://github.com/ggml-org/llama.cpp/actions/runs/8635935781/job/23674807267#step:5:2313 cmake -B build \ -DCMAKE_INSTALL_RPATH='@loader_path' \ -DCMAKE_BUILD_WITH_INSTALL_RPATH=ON \ -DLLAMA_FATAL_WARNINGS=ON \ -DLLAMA_BUILD_BORINGSSL=ON \ -DGGML_METAL=OFF \ -DGGML_RPC=ON \ -DCMAKE_OSX_DEPLOYMENT_TARGET=13.3 cmake --build build --config Release -j $(sysctl -n hw.logicalcpu)
cp LICENSE ./build/bin/ tar -czvf llama-${{ steps.tag.outputs.name }}-bin-macos-x64.tar.gz -s ",./,llama-${{ steps.tag.outputs.name }}/," -C ./build/bin .
sudo apt-get update sudo apt-get install build-essential libssl-dev
cmake -B build \ -DCMAKE_INSTALL_RPATH='$ORIGIN' \ -DCMAKE_BUILD_WITH_INSTALL_RPATH=ON \ -DGGML_BACKEND_DL=ON \ -DGGML_NATIVE=OFF \ -DGGML_CPU_ALL_VARIANTS=ON \ -DLLAMA_FATAL_WARNINGS=ON \ ${{ env.CMAKE_ARGS }} cmake --build build --config Release -j $(nproc)
cp LICENSE ./build/bin/ tar -czvf llama-${{ steps.tag.outputs.name }}-bin-ubuntu-${{ matrix.build }}.tar.gz --transform "s,./,llama-${{ steps.tag.outputs.name }}/," -C ./build/bin .
wget -qO - https://packages.lunarg.com/lunarg-signing-key-pub.asc | sudo apt-key add - sudo wget -qO /etc/apt/sources.list.d/lunarg-vulkan-jammy.list https://packages.lunarg.com/vulkan/lunarg-vulkan-jammy.list sudo apt-get update -y sudo apt-get install -y build-essential mesa-vulkan-drivers vulkan-sdk libssl-dev

View raw YAML

name: Release

on:
  workflow_dispatch: # allows manual triggering
    inputs:
      create_release:
        description: 'Create new release'
        required: true
        type: boolean
  push:
    branches:
      - master
    paths: [
      '.github/workflows/release.yml',
      '**/CMakeLists.txt',
      '**/.cmake',
      '**/*.h',
      '**/*.hpp',
      '**/*.c',
      '**/*.cpp',
      '**/*.cu',
      '**/*.cuh',
      '**/*.swift',
      '**/*.m',
      '**/*.metal',
      '**/*.comp',
      '**/*.glsl'
    ]

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}
  cancel-in-progress: true

env:
  BRANCH_NAME: ${{ github.head_ref || github.ref_name }}
  CMAKE_ARGS: "-DLLAMA_BUILD_EXAMPLES=OFF -DLLAMA_BUILD_TESTS=OFF -DLLAMA_BUILD_TOOLS=ON -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON"

jobs:
  macOS-arm64:
    runs-on: macos-14

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6
        with:
          fetch-depth: 0

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: macOS-latest-arm64
          evict-old-files: 1d

      - name: Build
        id: cmake_build
        run: |
          sysctl -a
          cmake -B build \
            -DCMAKE_INSTALL_RPATH='@loader_path' \
            -DCMAKE_BUILD_WITH_INSTALL_RPATH=ON \
            -DLLAMA_FATAL_WARNINGS=ON \
            -DLLAMA_BUILD_BORINGSSL=ON \
            -DGGML_METAL_USE_BF16=ON \
            -DGGML_METAL_EMBED_LIBRARY=ON \
            -DGGML_RPC=ON \
            ${{ env.CMAKE_ARGS }}
          cmake --build build --config Release -j $(sysctl -n hw.logicalcpu)

      - name: Determine tag name
        id: tag
        uses: ./.github/actions/get-tag-name

      - name: Pack artifacts
        id: pack_artifacts
        run: |
          cp LICENSE ./build/bin/
          tar -czvf llama-${{ steps.tag.outputs.name }}-bin-macos-arm64.tar.gz -s ",./,llama-${{ steps.tag.outputs.name }}/," -C ./build/bin .

      - name: Upload artifacts
        uses: actions/upload-artifact@v6
        with:
          path: llama-${{ steps.tag.outputs.name }}-bin-macos-arm64.tar.gz
          name: llama-bin-macos-arm64.tar.gz

  macOS-x64:
    runs-on: macos-15-intel

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6
        with:
          fetch-depth: 0

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: macOS-latest-x64
          evict-old-files: 1d

      - name: Build
        id: cmake_build
        run: |
          sysctl -a
          # Metal is disabled due to intermittent failures with Github runners not having a GPU:
          # https://github.com/ggml-org/llama.cpp/actions/runs/8635935781/job/23674807267#step:5:2313
          cmake -B build \
            -DCMAKE_INSTALL_RPATH='@loader_path' \
            -DCMAKE_BUILD_WITH_INSTALL_RPATH=ON \
            -DLLAMA_FATAL_WARNINGS=ON \
            -DLLAMA_BUILD_BORINGSSL=ON \
            -DGGML_METAL=OFF \
            -DGGML_RPC=ON \
            -DCMAKE_OSX_DEPLOYMENT_TARGET=13.3
          cmake --build build --config Release -j $(sysctl -n hw.logicalcpu)

      - name: Determine tag name
        id: tag
        uses: ./.github/actions/get-tag-name

      - name: Pack artifacts
        id: pack_artifacts
        run: |
          cp LICENSE ./build/bin/
          tar -czvf llama-${{ steps.tag.outputs.name }}-bin-macos-x64.tar.gz -s ",./,llama-${{ steps.tag.outputs.name }}/," -C ./build/bin .

      - name: Upload artifacts
        uses: actions/upload-artifact@v6
        with:
          path: llama-${{ steps.tag.outputs.name }}-bin-macos-x64.tar.gz
          name: llama-bin-macos-x64.tar.gz

  ubuntu-22-cpu:
    strategy:
      matrix:
        include:
          - build: 'x64'
            os: ubuntu-22.04
          - build: 's390x'
            os: ubuntu-24.04-s390x
          # GGML_BACKEND_DL and GGML_CPU_ALL_VARIANTS are not currently supported on arm
          # - build: 'arm64'
          #   os: ubuntu-22.04-arm

    runs-on: ${{ matrix.os }}

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6
        with:
          fetch-depth: 0

      - name: ccache
        if: ${{ matrix.build != 's390x' }}
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: ubuntu-cpu-${{ matrix.build }}
          evict-old-files: 1d

      - name: Dependencies
        id: depends
        run: |
          sudo apt-get update
          sudo apt-get install build-essential libssl-dev

      - name: Build
        id: cmake_build
        run: |
          cmake -B build \
            -DCMAKE_INSTALL_RPATH='$ORIGIN' \
            -DCMAKE_BUILD_WITH_INSTALL_RPATH=ON \
            -DGGML_BACKEND_DL=ON \
            -DGGML_NATIVE=OFF \
            -DGGML_CPU_ALL_VARIANTS=ON \
            -DLLAMA_FATAL_WARNINGS=ON \
            ${{ env.CMAKE_ARGS }}
          cmake --build build --config Release -j $(nproc)

      - name: Determine tag name
        id: tag
        uses: ./.github/actions/get-tag-name

      - name: Pack artifacts
        id: pack_artifacts
        run: |
          cp LICENSE ./build/bin/
          tar -czvf llama-${{ steps.tag.outputs.name }}-bin-ubuntu-${{ matrix.build }}.tar.gz --transform "s,./,llama-${{ steps.tag.outputs.name }}/," -C ./build/bin .

      - name: Upload artifacts
        uses: actions/upload-artifact@v6
        with:
          path: llama-${{ steps.tag.outputs.name }}-bin-ubuntu-${{ matrix.build }}.tar.gz
          name: llama-bin-ubuntu-${{ matrix.build }}.tar.gz

  ubuntu-22-vulkan:
    runs-on: ubuntu-22.04

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6
        with:
          fetch-depth: 0

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: ubuntu-22-vulkan
          evict-old-files: 1d

      - name: Dependencies
        id: depends
        run: |
          wget -qO - https://packages.lunarg.com/lunarg-signing-key-pub.asc | sudo apt-key add -
          sudo wget -qO /etc/apt/sources.list.d/lunarg-vulkan-jammy.list https://packages.lunarg.com/vulkan/lunarg-vulkan-jammy.list
          sudo apt-get update -y
          sudo apt-get install -y build-essential mesa-vulkan-drivers vulkan-sdk libssl-dev

      - name: Build
        id: cmake_build
        run: |
          cmake -B build \
            -DCMAKE_INSTALL_RPATH='$ORIGIN' \
            -DCMAKE_BUILD_WITH_INSTALL_RPATH=ON \
            -DGGML_BACKEND_DL=ON \
            -DGGML_NATIVE=OFF \
            -DGGML_CPU_ALL_VARIANTS=ON \
            -DGGML_VULKAN=ON \
            ${{ env.CMAKE_ARGS }}
          cmake --build build --config Release -j $(nproc)

      - name: Determine tag name
        id: tag
        uses: ./.github/actions/get-tag-name

      - name: Pack artifacts
        id: pack_artifacts
        run: |
          cp LICENSE ./build/bin/
          tar -czvf llama-${{ steps.tag.outputs.name }}-bin-ubuntu-vulkan-x64.tar.gz --transform "s,./,llama-${{ steps.tag.outputs.name }}/," -C ./build/bin .

      - name: Upload artifacts
        uses: actions/upload-artifact@v6
        with:
          path: llama-${{ steps.tag.outputs.name }}-bin-ubuntu-vulkan-x64.tar.gz
          name: llama-bin-ubuntu-vulkan-x64.tar.gz

  ubuntu-24-openvino:
    runs-on: ubuntu-24.04

    outputs:
      openvino_version: ${{ steps.openvino_version.outputs.value }}

    env:
      # Sync versions in build.yml, build-self-hosted.yml, release.yml, build-cache.yml, .devops/openvino.Dockerfile
      OPENVINO_VERSION_MAJOR: "2026.0"
      OPENVINO_VERSION_FULL: "2026.0.0.20965.c6d6a13a886"

    steps:
      - name: Set OpenVINO version output
        id: openvino_version
        run: echo "value=${{ env.OPENVINO_VERSION_MAJOR }}" >> $GITHUB_OUTPUT

      - name: Clone
        id: checkout
        uses: actions/checkout@v6
        with:
          fetch-depth: 0

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: ubuntu-24-openvino-release-no-preset-v1
          evict-old-files: 1d

      - name: Dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y build-essential libssl-dev libtbb12 cmake ninja-build python3-pip
          sudo apt install ocl-icd-opencl-dev opencl-headers opencl-clhpp-headers intel-opencl-icd

      - name: Use OpenVINO Toolkit Cache
        uses: actions/cache@v5
        id: cache-openvino
        with:
          path: ./openvino_toolkit
          key: openvino-toolkit-v${{ env.OPENVINO_VERSION_FULL }}-${{ runner.os }}

      - name: Setup OpenVINO Toolkit
        if: steps.cache-openvino.outputs.cache-hit != 'true'
        uses: ./.github/actions/linux-setup-openvino
        with:
          path: ./openvino_toolkit
          version_major: ${{ env.OPENVINO_VERSION_MAJOR }}
          version_full: ${{ env.OPENVINO_VERSION_FULL }}

      - name: Install OpenVINO dependencies
        run: |
          cd ./openvino_toolkit
          chmod +x ./install_dependencies/install_openvino_dependencies.sh
          echo "Y" | sudo -E ./install_dependencies/install_openvino_dependencies.sh

      - name: Build
        id: cmake_build
        run: |
          source ./openvino_toolkit/setupvars.sh
          cmake -B build/ReleaseOV -G Ninja \
            -DCMAKE_BUILD_TYPE=Release \
            -DGGML_OPENVINO=ON
          cmake --build build/ReleaseOV --config Release -j $(nproc)

      - name: Determine tag name
        id: tag
        uses: ./.github/actions/get-tag-name

      - name: Pack artifacts
        id: pack_artifacts
        run: |
          cp LICENSE ./build/ReleaseOV/bin/
          tar -czvf llama-${{ steps.tag.outputs.name }}-bin-ubuntu-openvino-${{ env.OPENVINO_VERSION_MAJOR }}-x64.tar.gz --transform "s,./,llama-${{ steps.tag.outputs.name }}/," -C ./build/ReleaseOV/bin .

      - name: Upload artifacts
        uses: actions/upload-artifact@v6
        with:
          path: llama-${{ steps.tag.outputs.name }}-bin-ubuntu-openvino-${{ env.OPENVINO_VERSION_MAJOR }}-x64.tar.gz
          name: llama-bin-ubuntu-openvino-${{ env.OPENVINO_VERSION_MAJOR }}-x64.tar.gz

  windows-cpu:
    runs-on: windows-2025

    strategy:
      matrix:
        include:
          - arch: 'x64'
          - arch: 'arm64'

    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          fetch-depth: 0

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: windows-latest-cpu-${{ matrix.arch }}
          variant: ccache
          evict-old-files: 1d

      - name: Install Ninja
        run: |
          choco install ninja

      - name: Build
        shell: cmd
        run: |
          call "C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Auxiliary\Build\vcvarsall.bat" ${{ matrix.arch == 'x64' && 'x64' || 'amd64_arm64' }}
          cmake -S . -B build -G "Ninja Multi-Config" ^
            -D CMAKE_TOOLCHAIN_FILE=cmake/${{ matrix.arch }}-windows-llvm.cmake ^
            -DLLAMA_BUILD_BORINGSSL=ON ^
            -DGGML_NATIVE=OFF ^
            -DGGML_BACKEND_DL=ON ^
            -DGGML_CPU_ALL_VARIANTS=${{ matrix.arch == 'x64' && 'ON' || 'OFF' }} ^
            -DGGML_OPENMP=ON ^
            ${{ env.CMAKE_ARGS }}
          cmake --build build --config Release

      - name: Pack artifacts
        id: pack_artifacts
        run: |
          Copy-Item "C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Redist\MSVC\14.44.35112\debug_nonredist\${{ matrix.arch }}\Microsoft.VC143.OpenMP.LLVM\libomp140.${{ matrix.arch == 'x64' && 'x86_64' || 'aarch64' }}.dll" .\build\bin\Release\
          7z a -snl llama-bin-win-cpu-${{ matrix.arch }}.zip .\build\bin\Release\*

      - name: Upload artifacts
        uses: actions/upload-artifact@v6
        with:
          path: llama-bin-win-cpu-${{ matrix.arch }}.zip
          name: llama-bin-win-cpu-${{ matrix.arch }}.zip

  windows:
    runs-on: windows-2025

    env:
      OPENBLAS_VERSION: 0.3.23
      VULKAN_VERSION: 1.4.313.2

    strategy:
      matrix:
        include:
          - backend: 'vulkan'
            arch: 'x64'
            defines: '-DGGML_VULKAN=ON'
            target: 'ggml-vulkan'
          - backend: 'opencl-adreno'
            arch: 'arm64'
            defines: '-G "Ninja Multi-Config" -D CMAKE_TOOLCHAIN_FILE=cmake/arm64-windows-llvm.cmake -DCMAKE_PREFIX_PATH="$env:RUNNER_TEMP/opencl-arm64-release" -DGGML_OPENCL=ON -DGGML_OPENCL_USE_ADRENO_KERNELS=ON'
            target: 'ggml-opencl'

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: windows-latest-${{ matrix.backend }}-${{ matrix.arch }}
          variant: ccache
          evict-old-files: 1d

      - name: Install Vulkan SDK
        id: get_vulkan
        if: ${{ matrix.backend == 'vulkan' }}
        run: |
          curl.exe -o $env:RUNNER_TEMP/VulkanSDK-Installer.exe -L "https://sdk.lunarg.com/sdk/download/${env:VULKAN_VERSION}/windows/vulkansdk-windows-X64-${env:VULKAN_VERSION}.exe"
          & "$env:RUNNER_TEMP\VulkanSDK-Installer.exe" --accept-licenses --default-answer --confirm-command install
          Add-Content $env:GITHUB_ENV "VULKAN_SDK=C:\VulkanSDK\${env:VULKAN_VERSION}"
          Add-Content $env:GITHUB_PATH "C:\VulkanSDK\${env:VULKAN_VERSION}\bin"

      - name: Install Ninja
        id: install_ninja
        run: |
          choco install ninja

      - name: Install OpenCL Headers and Libs
        id: install_opencl
        if: ${{ matrix.backend == 'opencl-adreno' && matrix.arch == 'arm64' }}
        run: |
          git clone https://github.com/KhronosGroup/OpenCL-Headers
          cd OpenCL-Headers
          cmake -B build `
            -DBUILD_TESTING=OFF `
            -DOPENCL_HEADERS_BUILD_TESTING=OFF `
            -DOPENCL_HEADERS_BUILD_CXX_TESTS=OFF `
            -DCMAKE_INSTALL_PREFIX="$env:RUNNER_TEMP/opencl-arm64-release"
          cmake --build build --target install
          git clone https://github.com/KhronosGroup/OpenCL-ICD-Loader
          cd OpenCL-ICD-Loader
          cmake -B build-arm64-release `
            -A arm64 `
            -DCMAKE_PREFIX_PATH="$env:RUNNER_TEMP/opencl-arm64-release" `
            -DCMAKE_INSTALL_PREFIX="$env:RUNNER_TEMP/opencl-arm64-release"
          cmake --build build-arm64-release --target install --config release

      - name: Build
        id: cmake_build
        run: |
          cmake -S . -B build ${{ matrix.defines }} -DGGML_NATIVE=OFF -DGGML_CPU=OFF -DGGML_BACKEND_DL=ON -DLLAMA_BUILD_BORINGSSL=ON
          cmake --build build --config Release --target ${{ matrix.target }}

      - name: Pack artifacts
        id: pack_artifacts
        run: |
          7z a -snl llama-bin-win-${{ matrix.backend }}-${{ matrix.arch }}.zip .\build\bin\Release\${{ matrix.target }}.dll

      - name: Upload artifacts
        uses: actions/upload-artifact@v6
        with:
          path: llama-bin-win-${{ matrix.backend }}-${{ matrix.arch }}.zip
          name: llama-bin-win-${{ matrix.backend }}-${{ matrix.arch }}.zip

  windows-cuda:
    runs-on: windows-2022

    strategy:
      matrix:
        cuda: ['12.4', '13.1']

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: Install ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: windows-cuda-${{ matrix.cuda }}
          variant: ccache
          evict-old-files: 1d

      - name: Install Cuda Toolkit
        uses: ./.github/actions/windows-setup-cuda
        with:
          cuda_version: ${{ matrix.cuda }}

      - name: Install Ninja
        id: install_ninja
        run: |
          choco install ninja

      - name: Build
        id: cmake_build
        shell: cmd
        # TODO: Remove GGML_CUDA_CUB_3DOT2 flag once CCCL 3.2 is bundled within CTK and that CTK version is used in this project
        run: |
          call "C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Auxiliary\Build\vcvarsall.bat" x64
          cmake -S . -B build -G "Ninja Multi-Config" ^
            -DGGML_BACKEND_DL=ON ^
            -DGGML_NATIVE=OFF ^
            -DGGML_CPU=OFF ^
            -DGGML_CUDA=ON ^
            -DLLAMA_BUILD_BORINGSSL=ON ^
            -DGGML_CUDA_CUB_3DOT2=ON
          set /A NINJA_JOBS=%NUMBER_OF_PROCESSORS%-1
          cmake --build build --config Release -j %NINJA_JOBS% --target ggml-cuda

      - name: Pack artifacts
        id: pack_artifacts
        run: |
          7z a -snl llama-bin-win-cuda-${{ matrix.cuda }}-x64.zip .\build\bin\Release\ggml-cuda.dll

      - name: Upload artifacts
        uses: actions/upload-artifact@v6
        with:
          path: llama-bin-win-cuda-${{ matrix.cuda }}-x64.zip
          name: llama-bin-win-cuda-${{ matrix.cuda }}-x64.zip

      - name: Copy and pack Cuda runtime
        run: |
          echo "Cuda install location: ${{ env.CUDA_PATH }}"
          $dst='.\build\bin\cudart\'
          robocopy "${{env.CUDA_PATH}}\bin" $dst cudart64_*.dll cublas64_*.dll cublasLt64_*.dll
          robocopy "${{env.CUDA_PATH}}\lib" $dst cudart64_*.dll cublas64_*.dll cublasLt64_*.dll
          robocopy "${{env.CUDA_PATH}}\bin\x64" $dst cudart64_*.dll cublas64_*.dll cublasLt64_*.dll
          7z a cudart-llama-bin-win-cuda-${{ matrix.cuda }}-x64.zip $dst\*

      - name: Upload Cuda runtime
        uses: actions/upload-artifact@v6
        with:
          path: cudart-llama-bin-win-cuda-${{ matrix.cuda }}-x64.zip
          name: cudart-llama-bin-win-cuda-${{ matrix.cuda }}-x64.zip

  windows-sycl:
    runs-on: windows-2022

    defaults:
      run:
        shell: bash

    env:
      WINDOWS_BASEKIT_URL: https://registrationcenter-download.intel.com/akdlm/IRC_NAS/24751ead-ddc5-4479-b9e6-f9fe2ff8b9f2/intel-deep-learning-essentials-2025.2.1.25_offline.exe
      WINDOWS_DPCPP_MKL: intel.oneapi.win.cpp-dpcpp-common:intel.oneapi.win.mkl.devel:intel.oneapi.win.dnnl:intel.oneapi.win.tbb.devel
      ONEAPI_ROOT: "C:/Program Files (x86)/Intel/oneAPI"

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: windows-latest-sycl
          variant: ccache
          evict-old-files: 1d

      - name: Install
        run:  |
          scripts/install-oneapi.bat $WINDOWS_BASEKIT_URL $WINDOWS_DPCPP_MKL

      - name: Build
        id: cmake_build
        shell: cmd
        run: |
          call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat" intel64 --force
          cmake -G "Ninja" -B build ^
            -DCMAKE_C_COMPILER=cl -DCMAKE_CXX_COMPILER=icx ^
            -DCMAKE_BUILD_TYPE=Release ^
            -DGGML_BACKEND_DL=ON -DBUILD_SHARED_LIBS=ON ^
            -DGGML_CPU=OFF -DGGML_SYCL=ON ^
            -DLLAMA_BUILD_BORINGSSL=ON
          cmake --build build --target ggml-sycl -j

      - name: Build the release package
        id: pack_artifacts
        run: |
          echo "cp oneAPI running time dll files in ${{ env.ONEAPI_ROOT }} to ./build/bin"

          cp "${{ env.ONEAPI_ROOT }}/mkl/latest/bin/mkl_sycl_blas.5.dll" ./build/bin
          cp "${{ env.ONEAPI_ROOT }}/mkl/latest/bin/mkl_core.2.dll" ./build/bin
          cp "${{ env.ONEAPI_ROOT }}/mkl/latest/bin/mkl_tbb_thread.2.dll" ./build/bin

          cp "${{ env.ONEAPI_ROOT }}/compiler/latest/bin/ur_adapter_level_zero.dll" ./build/bin
          cp "${{ env.ONEAPI_ROOT }}/compiler/latest/bin/ur_adapter_level_zero_v2.dll" ./build/bin
          cp "${{ env.ONEAPI_ROOT }}/compiler/latest/bin/ur_adapter_opencl.dll" ./build/bin
          cp "${{ env.ONEAPI_ROOT }}/compiler/latest/bin/ur_loader.dll" ./build/bin
          cp "${{ env.ONEAPI_ROOT }}/compiler/latest/bin/ur_win_proxy_loader.dll" ./build/bin

          cp "${{ env.ONEAPI_ROOT }}/compiler/latest/bin/sycl8.dll" ./build/bin
          cp "${{ env.ONEAPI_ROOT }}/compiler/latest/bin/svml_dispmd.dll" ./build/bin
          cp "${{ env.ONEAPI_ROOT }}/compiler/latest/bin/libmmd.dll" ./build/bin
          cp "${{ env.ONEAPI_ROOT }}/compiler/latest/bin/libiomp5md.dll" ./build/bin
          cp "${{ env.ONEAPI_ROOT }}/compiler/latest/bin/sycl-ls.exe" ./build/bin
          cp "${{ env.ONEAPI_ROOT }}/compiler/latest/bin/libsycl-fallback-bfloat16.spv" ./build/bin
          cp "${{ env.ONEAPI_ROOT }}/compiler/latest/bin/libsycl-native-bfloat16.spv" ./build/bin

          cp "${{ env.ONEAPI_ROOT }}/dnnl/latest/bin/dnnl.dll" ./build/bin
          cp "${{ env.ONEAPI_ROOT }}/tbb/latest/bin/tbb12.dll" ./build/bin

          cp "${{ env.ONEAPI_ROOT }}/tcm/latest/bin/tcm.dll" ./build/bin
          cp "${{ env.ONEAPI_ROOT }}/tcm/latest/bin/libhwloc-15.dll" ./build/bin
          cp "${{ env.ONEAPI_ROOT }}/umf/latest/bin/umf.dll" ./build/bin

          echo "cp oneAPI running time dll files to ./build/bin done"
          7z a -snl llama-bin-win-sycl-x64.zip ./build/bin/*

      - name: Upload the release package
        uses: actions/upload-artifact@v6
        with:
          path: llama-bin-win-sycl-x64.zip
          name: llama-bin-win-sycl-x64.zip

  ubuntu-22-rocm:
    runs-on: ubuntu-22.04

    strategy:
      matrix:
        include:
          - ROCM_VERSION: "7.2"
            gpu_targets: "gfx908;gfx90a;gfx942;gfx1030;gfx1100;gfx1101;gfx1151;gfx1150;gfx1200;gfx1201"
            build: 'x64'

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6
        with:
          fetch-depth: 0

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: ubuntu-rocm-${{ matrix.ROCM_VERSION }}-${{ matrix.build }}
          evict-old-files: 1d

      - name: Dependencies
        id: depends
        run: |
          sudo apt install -y build-essential git cmake wget

      - name: Setup Legacy ROCm
        if: matrix.ROCM_VERSION == '7.2'
        id: legacy_env
        run: |
          sudo mkdir --parents --mode=0755 /etc/apt/keyrings
          wget https://repo.radeon.com/rocm/rocm.gpg.key -O - | \
            gpg --dearmor | sudo tee /etc/apt/keyrings/rocm.gpg > /dev/null

          sudo tee /etc/apt/sources.list.d/rocm.list << EOF
          deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/${{ matrix.ROCM_VERSION }} jammy main
          EOF

          sudo tee /etc/apt/preferences.d/rocm-pin-600 << EOF
          Package: *
          Pin: release o=repo.radeon.com
          Pin-Priority: 600
          EOF

          sudo apt update
          sudo apt-get install -y libssl-dev rocm-hip-sdk

      - name: Setup TheRock
        if: matrix.ROCM_VERSION != '7.2'
        id: therock_env
        run: |
          wget https://repo.amd.com/rocm/tarball/therock-dist-linux-gfx1151-${{ matrix.ROCM_VERSION }}.tar.gz
          mkdir install
          tar -xf *.tar.gz -C install
          export ROCM_PATH=$(pwd)/install
          echo ROCM_PATH=$ROCM_PATH >> $GITHUB_ENV
          echo PATH=$PATH:$ROCM_PATH/bin >> $GITHUB_ENV
          echo LD_LIBRARY_PATH=$ROCM_PATH/lib:$ROCM_PATH/llvm/lib:$ROCM_PATH/lib/rocprofiler-systems >> $GITHUB_ENV

      - name: Build with native CMake HIP support
        id: cmake_build
        run: |
          cmake -B build -S . \
            -DCMAKE_HIP_COMPILER="$(hipconfig -l)/clang" \
            -DCMAKE_HIP_FLAGS="-mllvm --amdgpu-unroll-threshold-local=600" \
            -DCMAKE_BUILD_TYPE=Release \
            -DGGML_BACKEND_DL=ON \
            -DGGML_NATIVE=OFF \
            -DCMAKE_INSTALL_RPATH='$ORIGIN' \
            -DCMAKE_BUILD_WITH_INSTALL_RPATH=ON \
            -DGGML_CPU_ALL_VARIANTS=ON \
            -DGPU_TARGETS="${{ matrix.gpu_targets }}" \
            -DGGML_HIP=ON \
            -DHIP_PLATFORM=amd \
            -DGGML_HIP_ROCWMMA_FATTN=ON \
            ${{ env.CMAKE_ARGS }}
          cmake --build build --config Release -j $(nproc)

      - name: Determine tag name
        id: tag
        uses: ./.github/actions/get-tag-name

      - name: Pack artifacts
        id: pack_artifacts
        run: |
          cp LICENSE ./build/bin/
          tar -czvf llama-${{ steps.tag.outputs.name }}-bin-ubuntu-rocm-${{ matrix.ROCM_VERSION }}-${{ matrix.build }}.tar.gz --transform "s,./,llama-${{ steps.tag.outputs.name }}/," -C ./build/bin .

      - name: Upload artifacts
        uses: actions/upload-artifact@v6
        with:
          path: llama-${{ steps.tag.outputs.name }}-bin-ubuntu-rocm-${{ matrix.ROCM_VERSION }}-${{ matrix.build }}.tar.gz
          name: llama-bin-ubuntu-rocm-${{ matrix.ROCM_VERSION }}-${{ matrix.build }}.tar.gz

  windows-hip:
    runs-on: windows-2022

    env:
      HIPSDK_INSTALLER_VERSION: "26.Q1"

    strategy:
      matrix:
        include:
          - name: "radeon"
            gpu_targets: "gfx1150;gfx1151;gfx1200;gfx1201;gfx1100;gfx1101;gfx1102;gfx1030;gfx1031;gfx1032"

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: Grab rocWMMA package
        id: grab_rocwmma
        run: |
          curl -o rocwmma.deb "https://repo.radeon.com/rocm/apt/7.2/pool/main/r/rocwmma-dev/rocwmma-dev_2.2.0.70200-43~24.04_amd64.deb"
          7z x rocwmma.deb
          7z x data.tar

      - name: Cache ROCm Installation
        id: cache-rocm
        uses: actions/cache@v5
        with:
          path: C:\Program Files\AMD\ROCm
          key: rocm-${{ env.HIPSDK_INSTALLER_VERSION }}-${{ runner.os }}

      - name: ccache
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: windows-latest-hip-${{ env.HIPSDK_INSTALLER_VERSION }}-${{ matrix.name }}-x64
          evict-old-files: 1d

      - name: Install ROCm
        if: steps.cache-rocm.outputs.cache-hit != 'true'
        id: depends
        run: |
          $ErrorActionPreference = "Stop"
          write-host "Downloading AMD HIP SDK Installer"
          Invoke-WebRequest -Uri "https://download.amd.com/developer/eula/rocm-hub/AMD-Software-PRO-Edition-${{ env.HIPSDK_INSTALLER_VERSION }}-Win11-For-HIP.exe" -OutFile "${env:RUNNER_TEMP}\rocm-install.exe"
          write-host "Installing AMD HIP SDK"
          $proc = Start-Process "${env:RUNNER_TEMP}\rocm-install.exe" -ArgumentList '-install' -NoNewWindow -PassThru
          $completed = $proc.WaitForExit(600000)
          if (-not $completed) {
              Write-Error "ROCm installation timed out after 10 minutes. Killing the process"
              $proc.Kill()
              exit 1
          }
          if ($proc.ExitCode -ne 0) {
              Write-Error "ROCm installation failed with exit code $($proc.ExitCode)"
              exit 1
          }
          write-host "Completed AMD HIP SDK installation"

      - name: Verify ROCm
        id: verify
        run: |
          # Find and test ROCm installation
          $clangPath = Get-ChildItem 'C:\Program Files\AMD\ROCm\*\bin\clang.exe' | Select-Object -First 1
          if (-not $clangPath) {
            Write-Error "ROCm installation not found"
            exit 1
          }
          & $clangPath.FullName --version

      - name: Build
        id: cmake_build
        run: |
          $env:HIP_PATH=$(Resolve-Path 'C:\Program Files\AMD\ROCm\*\bin\clang.exe' | split-path | split-path)
          $env:CMAKE_PREFIX_PATH="${env:HIP_PATH}"
          cmake -G "Unix Makefiles" -B build -S . `
            -DCMAKE_C_COMPILER="${env:HIP_PATH}\bin\clang.exe" `
            -DCMAKE_CXX_COMPILER="${env:HIP_PATH}\bin\clang++.exe" `
            -DCMAKE_CXX_FLAGS="-I$($PWD.Path.Replace('\', '/'))/opt/rocm-7.2.0/include/ -Wno-ignored-attributes -Wno-nested-anon-types" `
            -DCMAKE_BUILD_TYPE=Release `
            -DGGML_BACKEND_DL=ON `
            -DGGML_NATIVE=OFF `
            -DGGML_CPU=OFF `
            -DGPU_TARGETS="${{ matrix.gpu_targets }}" `
            -DGGML_HIP_ROCWMMA_FATTN=ON `
            -DGGML_HIP=ON `
            -DLLAMA_BUILD_BORINGSSL=ON
          cmake --build build --target ggml-hip -j ${env:NUMBER_OF_PROCESSORS}
          md "build\bin\rocblas\library\"
          md "build\bin\hipblaslt\library"
          cp "${env:HIP_PATH}\bin\libhipblas.dll" "build\bin\"
          cp "${env:HIP_PATH}\bin\libhipblaslt.dll" "build\bin\"
          cp "${env:HIP_PATH}\bin\rocblas.dll" "build\bin\"
          cp "${env:HIP_PATH}\bin\rocblas\library\*" "build\bin\rocblas\library\"
          cp "${env:HIP_PATH}\bin\hipblaslt\library\*" "build\bin\hipblaslt\library\"

      - name: Pack artifacts
        id: pack_artifacts
        run: |
          7z a -snl llama-bin-win-hip-${{ matrix.name }}-x64.zip .\build\bin\*

      - name: Upload artifacts
        uses: actions/upload-artifact@v6
        with:
          path: llama-bin-win-hip-${{ matrix.name }}-x64.zip
          name: llama-bin-win-hip-${{ matrix.name }}-x64.zip

  ios-xcode-build:
    runs-on: macos-15

    steps:
      - name: Checkout code
        uses: actions/checkout@v6
        with:
          fetch-depth: 0

      - name: Setup Xcode
        run: |
          sudo xcode-select -s /Applications/Xcode_16.4.app

      - name: Build
        id: cmake_build
        run: |
          sysctl -a
          cmake -B build -G Xcode \
            -DGGML_METAL_USE_BF16=ON \
            -DGGML_METAL_EMBED_LIBRARY=ON \
            -DLLAMA_OPENSSL=OFF \
            -DLLAMA_BUILD_EXAMPLES=OFF \
            -DLLAMA_BUILD_TOOLS=OFF \
            -DLLAMA_BUILD_TESTS=OFF \
            -DLLAMA_BUILD_SERVER=OFF \
            -DCMAKE_SYSTEM_NAME=iOS \
            -DCMAKE_OSX_DEPLOYMENT_TARGET=14.0 \
            -DCMAKE_XCODE_ATTRIBUTE_DEVELOPMENT_TEAM=ggml
          cmake --build build --config Release -j $(sysctl -n hw.logicalcpu) -- CODE_SIGNING_ALLOWED=NO

      - name: xcodebuild for swift package
        id: xcodebuild
        run: |
          ./build-xcframework.sh

      - name: Build Xcode project
        run: xcodebuild -project examples/llama.swiftui/llama.swiftui.xcodeproj -scheme llama.swiftui -sdk iphoneos CODE_SIGNING_REQUIRED=NO CODE_SIGN_IDENTITY= -destination 'generic/platform=iOS' FRAMEWORK_FOLDER_PATH=./build-ios build

      - name: Determine tag name
        id: tag
        uses: ./.github/actions/get-tag-name

      - name: Pack artifacts
        id: pack_artifacts
        run: |
          # Zip file is required for Swift Package Manager, which does not support tar.gz for binary targets.
          # For more details, see https://developer.apple.com/documentation/xcode/distributing-binary-frameworks-as-swift-packages
          zip -r -y llama-${{ steps.tag.outputs.name }}-xcframework.zip build-apple/llama.xcframework

      - name: Upload artifacts
        uses: actions/upload-artifact@v6
        with:
          path: llama-${{ steps.tag.outputs.name }}-xcframework.zip
          name: llama-${{ steps.tag.outputs.name }}-xcframework.zip


  openEuler-cann:
    strategy:
      matrix:
        include:
          # 910b with aclgraph (both architectures)
          - arch: x86
            chip_type: '910b'
            build: 'Release'
            use_acl_graph: 'on'
          - arch: aarch64
            chip_type: '910b'
            build: 'Release'
            use_acl_graph: 'on'
          # 310p without aclgraph (both architectures)
          - arch: x86
            chip_type: '310p'
            build: 'Release'
            use_acl_graph: 'off'
          - arch: aarch64
            chip_type: '310p'
            build: 'Release'
            use_acl_graph: 'off'
    runs-on: ${{ matrix.arch == 'aarch64' && 'ubuntu-24.04-arm' || 'ubuntu-24.04' }}
    steps:
      - name: Checkout
        uses: actions/checkout@v6
        with:
          fetch-depth: 0

      - name: Free up disk space
        uses: ggml-org/free-disk-space@v1.3.1
        with:
          tool-cache: true

      - name: Set container image
        id: cann-image
        run: |
          image="ascendai/cann:${{ matrix.chip_type == '910b' &&  '8.5.0-910b-openeuler24.03-py3.11' || '8.5.0-310p-openeuler24.03-py3.11' }}"
          echo "image=${image}" >> "${GITHUB_OUTPUT}"

      - name: Pull container image
        run: docker pull "${{ steps.cann-image.outputs.image }}"

      - name: Build
        env:
          BUILD_TYPE: ${{ matrix.build }}
          SOC_TYPE: ascend${{ matrix.chip_type }}
          USE_ACL_GRAPH: ${{ matrix.use_acl_graph }}
        run: |
          HOST_UID=$(id -u)
          HOST_GID=$(id -g)

          docker run --rm \
            -v "${PWD}:/workspace" \
            -w /workspace \
            -e SOC_TYPE=${SOC_TYPE} \
            -e BUILD_TYPE=${BUILD_TYPE} \
            -e USE_ACL_GRAPH=${USE_ACL_GRAPH} \
            "${{ steps.cann-image.outputs.image }}" \
            bash -lc '
              set -e
              yum install -y --setopt=install_weak_deps=False --setopt=tsflags=nodocs git gcc gcc-c++ make cmake openssl-devel
              yum clean all && rm -rf /var/cache/yum
              git config --global --add safe.directory "/workspace"
              export LD_LIBRARY_PATH=${ASCEND_TOOLKIT_HOME}/lib64:${ASCEND_TOOLKIT_HOME}/$(uname -m)-linux/devlib/:${LD_LIBRARY_PATH}
              cmake -S . -B build \
                  -DCMAKE_BUILD_TYPE=${BUILD_TYPE} \
                  -DGGML_CANN=on \
                  -DSOC_TYPE=${SOC_TYPE} \
                  -DUSE_ACL_GRAPH=${USE_ACL_GRAPH}
              cmake --build build -j $(nproc)

              chown -R '"${HOST_UID}"':'"${HOST_GID}"' /workspace/build
            '

      - name: Determine tag name
        id: tag
        uses: ./.github/actions/get-tag-name

      - name: Pack artifacts
        run: |
          cp LICENSE ./build/bin/
          tar -czvf llama-${{ steps.tag.outputs.name }}-bin-${{ matrix.chip_type }}-openEuler-${{ matrix.arch }}${{ matrix.use_acl_graph == 'on' && '-aclgraph' || '' }}.tar.gz --transform "s,./,llama-${{ steps.tag.outputs.name }}/," -C ./build/bin .

      - name: Upload artifacts
        uses: actions/upload-artifact@v6
        with:
          path: llama-${{ steps.tag.outputs.name }}-bin-${{ matrix.chip_type }}-openEuler-${{ matrix.arch }}${{ matrix.use_acl_graph == 'on' && '-aclgraph' || '' }}.tar.gz
          name: llama-bin-${{ matrix.chip_type }}-openEuler-${{ matrix.arch }}${{ matrix.use_acl_graph == 'on' && '-aclgraph' || '' }}.tar.gz

  release:
    if: ${{ ( github.event_name == 'push' && github.ref == 'refs/heads/master' ) || github.event.inputs.create_release == 'true' }}

    # Fine-grant permission
    # https://docs.github.com/en/actions/security-for-github-actions/security-guides/automatic-token-authentication#modifying-the-permissions-for-the-github_token
    permissions:
        contents: write # for creating release

    runs-on: ubuntu-slim

    needs:
      - windows
      - windows-cpu
      - windows-cuda
      - windows-sycl
      - windows-hip
      - ubuntu-22-rocm
      - ubuntu-22-cpu
      - ubuntu-22-vulkan
      - ubuntu-24-openvino
      - macOS-arm64
      - macOS-x64
      - ios-xcode-build
      - openEuler-cann

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6
        with:
          fetch-depth: 0

      - name: Determine tag name
        id: tag
        uses: ./.github/actions/get-tag-name

      - name: Download artifacts
        id: download-artifact
        uses: actions/download-artifact@v7
        with:
          path: ./artifact
          merge-multiple: true

      - name: Move artifacts
        id: move_artifacts
        run: |
          mkdir -p release

          echo "Adding CPU backend files to existing zips..."
          for arch in x64 arm64; do
            cpu_zip="artifact/llama-bin-win-cpu-${arch}.zip"
            temp_dir=$(mktemp -d)
            echo "Extracting CPU backend for $arch..."
            unzip "$cpu_zip" -d "$temp_dir"

            echo "Adding CPU files to $arch zips..."
            for target_zip in artifact/llama-bin-win-*-${arch}.zip; do
              if [[ "$target_zip" == "$cpu_zip" ]]; then
                continue
              fi
              echo "Adding CPU backend to $(basename "$target_zip")"
              realpath_target_zip=$(realpath "$target_zip")
              (cd "$temp_dir" && zip -r "$realpath_target_zip" .)
            done

            rm -rf "$temp_dir"
          done

          echo "Renaming and moving zips to release..."
          for zip_file in artifact/llama-bin-win-*.zip; do
            base_name=$(basename "$zip_file" .zip)
            zip_name="llama-${{ steps.tag.outputs.name }}-${base_name#llama-}.zip"
            echo "Moving $zip_file to release/$zip_name"
            mv "$zip_file" "release/$zip_name"
          done

          echo "Moving other artifacts..."
          mv -v artifact/*.zip release
          mv -v artifact/*.tar.gz release

      - name: Create release
        id: create_release
        uses: ggml-org/action-create-release@v1
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          tag_name: ${{ steps.tag.outputs.name }}
          body: |
            <details open>

            ${{ github.event.head_commit.message }}

            </details>

            **macOS/iOS:**
            - [macOS Apple Silicon (arm64)](https://github.com/ggml-org/llama.cpp/releases/download/${{ steps.tag.outputs.name }}/llama-${{ steps.tag.outputs.name }}-bin-macos-arm64.tar.gz)
            - [macOS Intel (x64)](https://github.com/ggml-org/llama.cpp/releases/download/${{ steps.tag.outputs.name }}/llama-${{ steps.tag.outputs.name }}-bin-macos-x64.tar.gz)
            - [iOS XCFramework](https://github.com/ggml-org/llama.cpp/releases/download/${{ steps.tag.outputs.name }}/llama-${{ steps.tag.outputs.name }}-xcframework.zip)

            **Linux:**
            - [Ubuntu x64 (CPU)](https://github.com/ggml-org/llama.cpp/releases/download/${{ steps.tag.outputs.name }}/llama-${{ steps.tag.outputs.name }}-bin-ubuntu-x64.tar.gz)
            - [Ubuntu x64 (Vulkan)](https://github.com/ggml-org/llama.cpp/releases/download/${{ steps.tag.outputs.name }}/llama-${{ steps.tag.outputs.name }}-bin-ubuntu-vulkan-x64.tar.gz)
            - [Ubuntu x64 (ROCm 7.2)](https://github.com/ggml-org/llama.cpp/releases/download/${{ steps.tag.outputs.name }}/llama-${{ steps.tag.outputs.name }}-bin-ubuntu-rocm-7.2-x64.tar.gz)
            - [Ubuntu s390x (CPU)](https://github.com/ggml-org/llama.cpp/releases/download/${{ steps.tag.outputs.name }}/llama-${{ steps.tag.outputs.name }}-bin-ubuntu-s390x.tar.gz)
            - [Ubuntu x64 (OpenVINO)](https://github.com/ggml-org/llama.cpp/releases/download/${{ steps.tag.outputs.name }}/llama-${{ steps.tag.outputs.name }}-bin-ubuntu-openvino-${{ needs.ubuntu-24-openvino.outputs.openvino_version }}-x64.tar.gz)

            **Windows:**
            - [Windows x64 (CPU)](https://github.com/ggml-org/llama.cpp/releases/download/${{ steps.tag.outputs.name }}/llama-${{ steps.tag.outputs.name }}-bin-win-cpu-x64.zip)
            - [Windows arm64 (CPU)](https://github.com/ggml-org/llama.cpp/releases/download/${{ steps.tag.outputs.name }}/llama-${{ steps.tag.outputs.name }}-bin-win-cpu-arm64.zip)
            - [Windows x64 (CUDA 12)](https://github.com/ggml-org/llama.cpp/releases/download/${{ steps.tag.outputs.name }}/llama-${{ steps.tag.outputs.name }}-bin-win-cuda-12.4-x64.zip) - [CUDA 12.4 DLLs](https://github.com/ggml-org/llama.cpp/releases/download/${{ steps.tag.outputs.name }}/cudart-llama-bin-win-cuda-12.4-x64.zip)
            - [Windows x64 (CUDA 13)](https://github.com/ggml-org/llama.cpp/releases/download/${{ steps.tag.outputs.name }}/llama-${{ steps.tag.outputs.name }}-bin-win-cuda-13.1-x64.zip) - [CUDA 13.1 DLLs](https://github.com/ggml-org/llama.cpp/releases/download/${{ steps.tag.outputs.name }}/cudart-llama-bin-win-cuda-13.1-x64.zip)
            - [Windows x64 (Vulkan)](https://github.com/ggml-org/llama.cpp/releases/download/${{ steps.tag.outputs.name }}/llama-${{ steps.tag.outputs.name }}-bin-win-vulkan-x64.zip)
            - [Windows x64 (SYCL)](https://github.com/ggml-org/llama.cpp/releases/download/${{ steps.tag.outputs.name }}/llama-${{ steps.tag.outputs.name }}-bin-win-sycl-x64.zip)
            - [Windows x64 (HIP)](https://github.com/ggml-org/llama.cpp/releases/download/${{ steps.tag.outputs.name }}/llama-${{ steps.tag.outputs.name }}-bin-win-hip-radeon-x64.zip)

            **openEuler:**
            - [openEuler x86 (310p)](https://github.com/ggml-org/llama.cpp/releases/download/${{ steps.tag.outputs.name }}/llama-${{ steps.tag.outputs.name }}-bin-310p-openEuler-x86.tar.gz)
            - [openEuler x86 (910b, ACL Graph)](https://github.com/ggml-org/llama.cpp/releases/download/${{ steps.tag.outputs.name }}/llama-${{ steps.tag.outputs.name }}-bin-910b-openEuler-x86-aclgraph.tar.gz)
            - [openEuler aarch64 (310p)](https://github.com/ggml-org/llama.cpp/releases/download/${{ steps.tag.outputs.name }}/llama-${{ steps.tag.outputs.name }}-bin-310p-openEuler-aarch64.tar.gz)
            - [openEuler aarch64 (910b, ACL Graph)](https://github.com/ggml-org/llama.cpp/releases/download/${{ steps.tag.outputs.name }}/llama-${{ steps.tag.outputs.name }}-bin-910b-openEuler-aarch64-aclgraph.tar.gz)

      - name: Upload release
        id: upload_release
        uses: actions/github-script@v8
        with:
          github-token: ${{secrets.GITHUB_TOKEN}}
          script: |
            const path = require('path');
            const fs = require('fs');
            const release_id = '${{ steps.create_release.outputs.id }}';
            for (let file of await fs.readdirSync('./release')) {
              if (path.extname(file) === '.zip' || file.endsWith('.tar.gz')) {
                console.log('uploadReleaseAsset', file);
                await github.rest.repos.uploadReleaseAsset({
                  owner: context.repo.owner,
                  repo: context.repo.repo,
                  release_id: release_id,
                  name: file,
                  data: await fs.readFileSync(`./release/${file}`)
                });
              }
            }

server matrix .github/workflows/server.yml

Triggers

workflow_dispatch, push, pull_request

Runs on

ubuntu-latest, windows-2022

Jobs

server, server-windows

Matrix

build_type, include, include.build_type, include.extra_args, include.wf_name, wf_name→ , LLAMA_ARG_BACKEND_SAMPLING=1, Release, backend-sampling, default

Commands

sudo apt-get update sudo apt-get -y install \ build-essential \ xxd \ git \ cmake \ curl \ wget \ language-pack-en \ libssl-dev
cmake -B build \ -DLLAMA_BUILD_BORINGSSL=ON \ -DGGML_SCHED_NO_REALLOC=ON cmake --build build --config ${{ matrix.build_type }} -j $(nproc) --target llama-server
cd tools/server/tests export ${{ matrix.extra_args }} pytest -v -x -m "not slow"
cd tools/server/tests export ${{ matrix.extra_args }} SLOW_TESTS=1 pytest -v -x
cmake -B build -DLLAMA_BUILD_BORINGSSL=ON -DGGML_SCHED_NO_REALLOC=ON cmake --build build --config Release -j ${env:NUMBER_OF_PROCESSORS} --target llama-server
cd tools/server/tests $env:PYTHONIOENCODING = ":replace" pytest -v -x -m "not slow"
cd tools/server/tests $env:SLOW_TESTS = "1" pytest -v -x

View raw YAML

name: Server

on:
  workflow_dispatch: # allows manual triggering
    inputs:
      sha:
        description: 'Commit SHA1 to build'
        required: false
        type: string
      slow_tests:
        description: 'Run slow tests'
        required: true
        type: boolean
  push:
    branches:
      - master
    paths: [
      '.github/workflows/server.yml',
      '**/CMakeLists.txt',
      '**/Makefile',
      '**/*.h',
      '**/*.hpp',
      '**/*.c',
      '**/*.cpp',
      '**/*.cu',
      '**/*.swift',
      '**/*.m',
      'tools/server/**.*'
    ]
  pull_request:
    types: [opened, synchronize, reopened]
    paths: [
      '.github/workflows/server.yml',
      '**/CMakeLists.txt',
      '**/Makefile',
      '**/*.h',
      '**/*.hpp',
      '**/*.c',
      '**/*.cpp',
      '**/*.cu',
      '**/*.swift',
      '**/*.m',
      'tools/server/**.*'
    ]

env:
  LLAMA_LOG_COLORS: 1
  LLAMA_LOG_PREFIX: 1
  LLAMA_LOG_TIMESTAMPS: 1
  LLAMA_LOG_VERBOSITY: 10

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

jobs:
  server:
    runs-on: ubuntu-latest

    name: server (${{ matrix.wf_name }})
    strategy:
      matrix:
        build_type: [Release]
        wf_name: ["default"]
        include:
          - build_type: Release
            extra_args: ""
            wf_name:    "default"
          - build_type: Release
            extra_args: "LLAMA_ARG_BACKEND_SAMPLING=1"
            wf_name:    "backend-sampling"
      fail-fast: false

    steps:
      - name: Dependencies
        id: depends
        run: |
          sudo apt-get update
          sudo apt-get -y install \
            build-essential \
            xxd \
            git \
            cmake \
            curl \
            wget \
            language-pack-en \
            libssl-dev

      - name: Clone
        id: checkout
        uses: actions/checkout@v6
        with:
          fetch-depth: 0
          ref: ${{ github.event.inputs.sha || github.event.pull_request.head.sha || github.sha || github.head_ref || github.ref_name }}

      - name: Build
        id: cmake_build
        run: |
          cmake -B build \
            -DLLAMA_BUILD_BORINGSSL=ON \
            -DGGML_SCHED_NO_REALLOC=ON
          cmake --build build --config ${{ matrix.build_type }} -j $(nproc) --target llama-server

      - name: Python setup
        id: setup_python
        uses: actions/setup-python@v6
        with:
          python-version: '3.11'
          pip-install: -r tools/server/tests/requirements.txt

      - name: Tests
        id: server_integration_tests
        if: ${{ (!matrix.disabled_on_pr || !github.event.pull_request) }}
        run: |
          cd tools/server/tests
          export ${{ matrix.extra_args }}
          pytest -v -x -m "not slow"

      - name: Slow tests
        id: server_integration_tests_slow
        if: ${{ (github.event.schedule || github.event.inputs.slow_tests == 'true') && matrix.build_type == 'Release' }}
        run: |
          cd tools/server/tests
          export ${{ matrix.extra_args }}
          SLOW_TESTS=1 pytest -v -x

  server-windows:
    runs-on: windows-2022

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6
        with:
          fetch-depth: 0
          ref: ${{ github.event.inputs.sha || github.event.pull_request.head.sha || github.sha || github.head_ref || github.ref_name }}

      - name: Build
        id: cmake_build
        run: |
          cmake -B build -DLLAMA_BUILD_BORINGSSL=ON -DGGML_SCHED_NO_REALLOC=ON
          cmake --build build --config Release -j ${env:NUMBER_OF_PROCESSORS} --target llama-server

      - name: Python setup
        id: setup_python
        uses: actions/setup-python@v6
        with:
          python-version: '3.11'
          pip-install: -r tools/server/tests/requirements.txt

      - name: Tests
        id: server_integration_tests
        if: ${{ !matrix.disabled_on_pr || !github.event.pull_request }}
        run: |
          cd tools/server/tests
          $env:PYTHONIOENCODING = ":replace"
          pytest -v -x -m "not slow"

      - name: Slow tests
        id: server_integration_tests_slow
        if: ${{ (github.event.schedule || github.event.inputs.slow_tests == 'true') && matrix.build_type == 'Release' }}
        run: |
          cd tools/server/tests
          $env:SLOW_TESTS = "1"
          pytest -v -x

server-sanitize matrix .github/workflows/server-sanitize.yml

Triggers

workflow_dispatch, push

Runs on

ubuntu-latest

Jobs

server

Matrix

build_type, sanitizer→ ADDRESS, RelWithDebInfo, UNDEFINED

Commands

sudo apt-get update sudo apt-get -y install \ build-essential \ xxd \ git \ cmake \ curl \ wget \ language-pack-en \ libssl-dev
cmake -B build \ -DLLAMA_BUILD_BORINGSSL=ON \ -DGGML_SCHED_NO_REALLOC=ON \ -DGGML_SANITIZE_ADDRESS=${{ matrix.sanitizer == 'ADDRESS' }} \ -DGGML_SANITIZE_THREAD=${{ matrix.sanitizer == 'THREAD' }} \ -DGGML_SANITIZE_UNDEFINED=${{ matrix.sanitizer == 'UNDEFINED' }} \ -DLLAMA_SANITIZE_ADDRESS=${{ matrix.sanitizer == 'ADDRESS' }} \ -DLLAMA_SANITIZE_THREAD=${{ matrix.sanitizer == 'THREAD' }} \ -DLLAMA_SANITIZE_UNDEFINED=${{ matrix.sanitizer == 'UNDEFINED' }} cmake --build build --config ${{ matrix.build_type }} -j $(nproc) --target llama-server
cd tools/server/tests export ${{ matrix.extra_args }} pytest -v -x -m "not slow"
cd tools/server/tests export ${{ matrix.extra_args }} SLOW_TESTS=1 pytest -v -x

View raw YAML

name: Server (sanitize)

on:
  workflow_dispatch: # allows manual triggering
    inputs:
      sha:
        description: 'Commit SHA1 to build'
        required: false
        type: string
      slow_tests:
        description: 'Run slow tests'
        required: true
        type: boolean
  push:
    branches:
      - master
    paths: [
      '.github/workflows/server-sanitize.yml',
      '**/CMakeLists.txt',
      '**/Makefile',
      '**/*.h',
      '**/*.hpp',
      '**/*.c',
      '**/*.cpp',
      'tools/server/**.*'
    ]

env:
  LLAMA_LOG_COLORS: 1
  LLAMA_LOG_PREFIX: 1
  LLAMA_LOG_TIMESTAMPS: 1
  LLAMA_LOG_VERBOSITY: 10

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

jobs:
  server:
    runs-on: ubuntu-latest

    strategy:
      matrix:
        sanitizer: [ADDRESS, UNDEFINED] # THREAD is very slow
        build_type: [RelWithDebInfo]
      fail-fast: false

    steps:
      - name: Dependencies
        id: depends
        run: |
          sudo apt-get update
          sudo apt-get -y install \
            build-essential \
            xxd \
            git \
            cmake \
            curl \
            wget \
            language-pack-en \
            libssl-dev

      - name: Clone
        id: checkout
        uses: actions/checkout@v6
        with:
          fetch-depth: 0
          ref: ${{ github.event.inputs.sha || github.event.pull_request.head.sha || github.sha || github.head_ref || github.ref_name }}

      - name: Build
        id: cmake_build
        run: |
          cmake -B build \
            -DLLAMA_BUILD_BORINGSSL=ON \
            -DGGML_SCHED_NO_REALLOC=ON \
            -DGGML_SANITIZE_ADDRESS=${{ matrix.sanitizer == 'ADDRESS' }} \
            -DGGML_SANITIZE_THREAD=${{ matrix.sanitizer == 'THREAD' }} \
            -DGGML_SANITIZE_UNDEFINED=${{ matrix.sanitizer == 'UNDEFINED' }} \
            -DLLAMA_SANITIZE_ADDRESS=${{ matrix.sanitizer == 'ADDRESS' }} \
            -DLLAMA_SANITIZE_THREAD=${{ matrix.sanitizer == 'THREAD' }} \
            -DLLAMA_SANITIZE_UNDEFINED=${{ matrix.sanitizer == 'UNDEFINED' }}
          cmake --build build --config ${{ matrix.build_type }} -j $(nproc) --target llama-server

      - name: Python setup
        id: setup_python
        uses: actions/setup-python@v6
        with:
          python-version: '3.11'
          pip-install: -r tools/server/tests/requirements.txt

      - name: Tests
        id: server_integration_tests
        if: ${{ (!matrix.disabled_on_pr || !github.event.pull_request) }}
        run: |
          cd tools/server/tests
          export ${{ matrix.extra_args }}
          pytest -v -x -m "not slow"

      - name: Slow tests
        id: server_integration_tests_slow
        if: ${{ (github.event.schedule || github.event.inputs.slow_tests == 'true') && matrix.build_type == 'Release' }}
        run: |
          cd tools/server/tests
          export ${{ matrix.extra_args }}
          SLOW_TESTS=1 pytest -v -x

server-self-hosted matrix .github/workflows/server-self-hosted.yml

Triggers

workflow_dispatch, push

Runs on

self-hosted, llama-server, macOS, ARM64, self-hosted, llama-server, Linux, NVIDIA

Jobs

server-metal, server-cuda

Matrix

build_type, include, include.build_type, include.extra_args, include.wf_name, wf_name→ GGML_METAL_DEVICES=2, GGML_METAL_DEVICES=2 LLAMA_ARG_BACKEND_SAMPLING=1, GPUx1, GPUx1, backend-sampling, GPUx2, GPUx2, backend-sampling, LLAMA_ARG_BACKEND_SAMPLING=1, Release

Commands

cmake -B build -DGGML_SCHED_NO_REALLOC=ON cmake --build build --config ${{ matrix.build_type }} -j $(sysctl -n hw.logicalcpu) --target llama-server
cd tools/server/tests python3 -m venv venv source venv/bin/activate pip install -r requirements.txt export ${{ matrix.extra_args }} pytest -v -x -m "not slow"
cmake -B build -DGGML_SCHED_NO_REALLOC=ON cmake --build build --config ${{ matrix.build_type }} -j $(sysctl -n hw.logicalcpu) --target llama-server
cd tools/server/tests python3 -m venv venv source venv/bin/activate pip install -r requirements.txt export ${{ matrix.extra_args }} pytest -v -x -m "not slow"

View raw YAML

name: Server (self-hosted)

on:
  workflow_dispatch: # allows manual triggering
    inputs:
      sha:
        description: 'Commit SHA1 to build'
        required: false
        type: string
      slow_tests:
        description: 'Run slow tests'
        required: true
        type: boolean
  push:
    branches:
      - master
    paths: [
      '.github/workflows/server-self-hosted.yml',
      '**/CMakeLists.txt',
      '**/Makefile',
      '**/*.h',
      '**/*.hpp',
      '**/*.c',
      '**/*.cpp',
      '**/*.cu',
      '**/*.swift',
      '**/*.m',
      'tools/server/**.*'
    ]

env:
  LLAMA_LOG_COLORS: 1
  LLAMA_LOG_PREFIX: 1
  LLAMA_LOG_TIMESTAMPS: 1
  LLAMA_LOG_VERBOSITY: 10

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

jobs:
  server-metal:
    runs-on: [self-hosted, llama-server, macOS, ARM64]

    name: server-metal (${{ matrix.wf_name }})
    strategy:
      matrix:
        build_type: [Release]
        wf_name: ["GPUx1"]
        include:
          - build_type: Release
            extra_args: "LLAMA_ARG_BACKEND_SAMPLING=1"
            wf_name:    "GPUx1, backend-sampling"
          - build_type: Release
            extra_args: "GGML_METAL_DEVICES=2"
            wf_name:    "GPUx2"
          - build_type: Release
            extra_args: "GGML_METAL_DEVICES=2 LLAMA_ARG_BACKEND_SAMPLING=1"
            wf_name:    "GPUx2, backend-sampling"
      fail-fast: false

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6
        with:
          fetch-depth: 0
          ref: ${{ github.event.inputs.sha || github.event.pull_request.head.sha || github.sha || github.head_ref || github.ref_name }}

      - name: Build
        id: cmake_build
        run: |
          cmake -B build -DGGML_SCHED_NO_REALLOC=ON
          cmake --build build --config ${{ matrix.build_type }} -j $(sysctl -n hw.logicalcpu) --target llama-server

      - name: Tests
        id: server_integration_tests
        if: ${{ (!matrix.disabled_on_pr || !github.event.pull_request) }}
        run: |
          cd tools/server/tests
          python3 -m venv venv
          source venv/bin/activate
          pip install -r requirements.txt
          export ${{ matrix.extra_args }}
          pytest -v -x -m "not slow"

  server-cuda:
    runs-on: [self-hosted, llama-server, Linux, NVIDIA]

    name: server-cuda (${{ matrix.wf_name }})
    strategy:
      matrix:
        build_type: [Release]
        wf_name: ["GPUx1"]
        include:
          - build_type: Release
            extra_args: "LLAMA_ARG_BACKEND_SAMPLING=1"
            wf_name:    "GPUx1, backend-sampling"
      fail-fast: false

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6
        with:
          fetch-depth: 0
          ref: ${{ github.event.inputs.sha || github.event.pull_request.head.sha || github.sha || github.head_ref || github.ref_name }}

      - name: Build
        id: cmake_build
        run: |
          cmake -B build -DGGML_SCHED_NO_REALLOC=ON
          cmake --build build --config ${{ matrix.build_type }} -j $(sysctl -n hw.logicalcpu) --target llama-server

      - name: Tests
        id: server_integration_tests
        if: ${{ (!matrix.disabled_on_pr || !github.event.pull_request) }}
        run: |
          cd tools/server/tests
          python3 -m venv venv
          source venv/bin/activate
          pip install -r requirements.txt
          export ${{ matrix.extra_args }}
          pytest -v -x -m "not slow"

server-webui .github/workflows/server-webui.yml

Triggers

workflow_dispatch, push, pull_request

Runs on

${{ 'ubuntu-24.04-arm' || 'ubuntu-24.04' }}

Jobs

webui-check

Commands

npm ci
npm run check
npm run lint
npm run build
npx playwright install --with-deps
npm run build-storybook
npm run test:client
npm run test:unit

View raw YAML

name: Server WebUI

on:
  workflow_dispatch: # allows manual triggering
    inputs:
      sha:
        description: 'Commit SHA1 to build'
        required: false
        type: string
  push:
    branches:
      - master
    paths: [
      '.github/workflows/server-webui.yml',
      'tools/server/webui/**.*',
      'tools/server/tests/**.*',
      'tools/server/public/**'
    ]
  pull_request:
    types: [opened, synchronize, reopened]
    paths: [
      '.github/workflows/server-webui.yml',
      'tools/server/webui/**.*',
      'tools/server/tests/**.*',
      'tools/server/public/**'
    ]

env:
  LLAMA_LOG_COLORS: 1
  LLAMA_LOG_PREFIX: 1
  LLAMA_LOG_TIMESTAMPS: 1
  LLAMA_LOG_VERBOSITY: 10

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

jobs:
  webui-check:
    name: WebUI Checks
    runs-on: ${{ 'ubuntu-24.04-arm' || 'ubuntu-24.04' }}
    continue-on-error: true
    steps:
      - name: Checkout code
        uses: actions/checkout@v6
        with:
          fetch-depth: 0
          ref: ${{ github.event.inputs.sha || github.event.pull_request.head.sha || github.sha || github.head_ref || github.ref_name }}

      - name: Setup Node.js
        id: node
        uses: actions/setup-node@v6
        with:
          node-version: "22"
          cache: "npm"
          cache-dependency-path: "tools/server/webui/package-lock.json"

      - name: Install dependencies
        id: setup
        if: ${{ steps.node.conclusion == 'success' }}
        run: npm ci
        working-directory: tools/server/webui

      - name: Run type checking
        if: ${{ always() && steps.setup.conclusion == 'success' }}
        run: npm run check
        working-directory: tools/server/webui

      - name: Run linting
        if: ${{ always() && steps.setup.conclusion == 'success' }}
        run: npm run lint
        working-directory: tools/server/webui

      - name: Build application
        if: ${{ always() && steps.setup.conclusion == 'success' }}
        run: npm run build
        working-directory: tools/server/webui

      - name: Install Playwright browsers
        id: playwright
        if: ${{ always() && steps.setup.conclusion == 'success' }}
        run: npx playwright install --with-deps
        working-directory: tools/server/webui

      - name: Build Storybook
        if: ${{ always() && steps.playwright.conclusion == 'success' }}
        run: npm run build-storybook
        working-directory: tools/server/webui

      - name: Run Client tests
        if: ${{ always() && steps.playwright.conclusion == 'success' }}
        run: npm run test:client
        working-directory: tools/server/webui

      - name: Run Unit tests
        if: ${{ always() && steps.playwright.conclusion == 'success' }}
        run: npm run test:unit
        working-directory: tools/server/webui

      - name: Run UI tests
        if: ${{ always() && steps.playwright.conclusion == 'success' }}
        run: npm run test:ui -- --testTimeout=60000
        working-directory: tools/server/webui

      - name: Run E2E tests
        if: ${{ always() && steps.playwright.conclusion == 'success' }}
        run: npm run test:e2e
        working-directory: tools/server/webui

update-ops-docs .github/workflows/update-ops-docs.yml

Triggers

push, pull_request

Runs on

ubuntu-slim

Jobs

update-ops-docs

Commands

mkdir -p /tmp/ops_check ./scripts/create_ops_docs.py /tmp/ops_check/ops.md
if ! diff -q docs/ops.md /tmp/ops_check/ops.md; then echo "Operations documentation (docs/ops.md) is not up to date with the backend CSV files." echo "To fix: run ./scripts/create_ops_docs.py and commit the updated docs/ops.md along with your changes" echo "Differences found:" diff docs/ops.md /tmp/ops_check/ops.md || true exit 1 fi echo "Operations documentation is up to date."

View raw YAML

name: Update Operations Documentation

on:
    push:
        paths:
            - 'docs/ops.md'
            - 'docs/ops/**'
            - 'scripts/create_ops_docs.py'
    pull_request:
        paths:
            - 'docs/ops.md'
            - 'docs/ops/**'
            - 'scripts/create_ops_docs.py'

jobs:
    update-ops-docs:
        runs-on: ubuntu-slim

        steps:
        - name: Checkout repository
          uses: actions/checkout@v6

        - name: Set up Python
          uses: actions/setup-python@v6
          with:
              python-version: '3.x'

        - name: Generate operations documentation to temporary file
          run: |
              mkdir -p /tmp/ops_check
              ./scripts/create_ops_docs.py /tmp/ops_check/ops.md

        - name: Check if docs/ops.md matches generated version
          run: |
              if ! diff -q docs/ops.md /tmp/ops_check/ops.md; then
                  echo "Operations documentation (docs/ops.md) is not up to date with the backend CSV files."
                  echo "To fix: run ./scripts/create_ops_docs.py and commit the updated docs/ops.md along with your changes"
                  echo "Differences found:"
                  diff docs/ops.md /tmp/ops_check/ops.md || true
                  exit 1
              fi
              echo "Operations documentation is up to date."

winget .github/workflows/winget.yml

Triggers

workflow_dispatch, schedule

Runs on

ubuntu-latest

Jobs

update

Actions

cargo-bins/cargo-binstall

Commands

cargo binstall komac@2.15.0 -y
echo "Updating manifest..." komac update --version ${{ steps.find_latest_release.outputs.VERSION }} \ --urls "${{ steps.find_latest_release.outputs.ASSETURL }}" \ --token ${{ secrets.WINGET_GITHUB_TOKEN }} \ --submit \ ggml.llamacpp

View raw YAML

name: Update Winget Package

on:
  workflow_dispatch: # allows manual triggering
  schedule:
    - cron: '28 5 * * *' # Update every day at 5:28 UTC

jobs:
  update:
    name: Update Winget Package
    runs-on: ubuntu-latest
    if: github.repository_owner == 'ggml-org'

    steps:
      - name: Install cargo binstall
        uses: cargo-bins/cargo-binstall@268643a6b5ea099f5718ee5cd3ff7dc89a5eb49b

      - name: Install komac
        run: |
          cargo binstall komac@2.15.0 -y

      - name: Find latest release
        id: find_latest_release
        uses: actions/github-script@v8
        with:
          script: |
            const { data: releases } = await github.rest.repos.listReleases({
              owner: context.repo.owner,
              repo: context.repo.repo,
            });
            const { tag_name: version, assets: assets } = releases.find(({assets}) => assets.find(asset => asset.name.includes('win-vulkan')));
            const { browser_download_url: asset_url } = assets.find(asset => asset.name.includes('win-vulkan'));
            console.log("Latest release:", version);
            core.setOutput('VERSION', version);
            core.setOutput('ASSETURL', asset_url);

      - name: Update manifest
        run: |
          echo "Updating manifest..."
          komac update --version ${{ steps.find_latest_release.outputs.VERSION }} \
            --urls "${{ steps.find_latest_release.outputs.ASSETURL }}" \
            --token ${{ secrets.WINGET_GITHUB_TOKEN }} \
            --submit \
            ggml.llamacpp