runpod-serverless-template

RunPod Serverless Template Management via GraphQL

When to Use

API Endpoint

POST https://api.runpod.io/graphql
Authorization: Bearer <RUNPOD_API_KEY>
Content-Type: application/json
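
The request above can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names build_graphql_request and run_graphql are hypothetical, and the body shape ({"query": ...}) is the usual GraphQL-over-HTTP convention rather than something stated in this document.

```python
import json
import urllib.request

RUNPOD_GRAPHQL_URL = "https://api.runpod.io/graphql"


def build_graphql_request(query: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the GraphQL endpoint."""
    # Assumption: the API accepts the conventional {"query": "..."} JSON body.
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        RUNPOD_GRAPHQL_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_graphql(query: str, api_key: str) -> dict:
    """Send the request and decode the JSON response (needs network access)."""
    with urllib.request.urlopen(build_graphql_request(query, api_key)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to inspect the headers and body before anything goes over the network.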

Query Endpoints with Full Detail

{
  myself {
    endpoints {
      id name templateId gpuIds workersMin workersMax idleTimeout
      env { key value }
      template {
        id name imageName dockerArgs env { key value }
        startJupyter startSsh
      }
    }
  }
}
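
Once the query returns, the fields that matter most for template debugging are templateId, imageName, and dockerArgs. The sketch below flattens a response into one row per endpoint. It assumes the standard GraphQL envelope ({"data": {...}}); summarize_endpoints is a hypothetical helper and the sample values are made up for illustration.

```python
def summarize_endpoints(response: dict) -> list[dict]:
    """Return one small dict per endpoint with the template fields of interest."""
    # "data" may be missing or null when the server reports errors.
    data = response.get("data") or {}
    endpoints = (data.get("myself") or {}).get("endpoints") or []
    summary = []
    for ep in endpoints:
        template = ep.get("template") or {}
        summary.append(
            {
                "endpoint": ep.get("name"),
                "templateId": ep.get("templateId"),
                "imageName": template.get("imageName"),
                "dockerArgs": template.get("dockerArgs"),
            }
        )
    return summary


# Hypothetical response, shaped like the query above; values are placeholders.
sample_response = {
    "data": {
        "myself": {
            "endpoints": [
                {
                    "id": "ep-1",
                    "name": "demo-endpoint",
                    "templateId": "tpl-1",
                    "template": {
                        "id": "tpl-1",
                        "imageName": "dockerhub-user/image:tag",
                        "dockerArgs": "",
                    },
                }
            ]
        }
    }
}

for row in summarize_endpoints(sample_response):
    print(row)
```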

Update Template (saveTemplate)

mutation {
  saveTemplate(input: {
    id: "<TEMPLATE_ID>"
    name: "Template Name"
    imageName: "dockerhub-user/image:tag"
    dockerArgs: ""
    containerDiskInGb: 20
    volumeInGb: 0
    startJupyter: false
    startSsh: false
    isServerless: true
    env: [
      { key: "KEY", value: "VALUE" }
    ]
  }) {
    id name imageName dockerArgs
  }
}
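
Instead of inlining values into the mutation text, the same call can be expressed with GraphQL variables, which avoids quoting problems inside the query string. This is a sketch under assumptions: the source only shows the inline form, so whether the API accepts the variable declaration below is untested here, and build_save_template_payload is a hypothetical helper.

```python
import json

# Assumption: the server accepts standard GraphQL variables typed as
# SaveTemplateInput (the input type named in the pitfalls below).
SAVE_TEMPLATE_MUTATION = """
mutation SaveTemplate($input: SaveTemplateInput!) {
  saveTemplate(input: $input) {
    id name imageName dockerArgs
  }
}
"""


def build_save_template_payload(
    template_id: str,
    name: str,
    image_name: str,
    env: dict[str, str],
    container_disk_gb: int = 20,
    docker_args: str = "",
) -> dict:
    """Build the JSON body for saveTemplate, mirroring the fields shown above."""
    template_input = {
        "id": template_id,
        "name": name,
        "imageName": image_name,
        "dockerArgs": docker_args,
        "containerDiskInGb": container_disk_gb,
        "volumeInGb": 0,  # still sent, even for serverless templates
        "startJupyter": False,
        "startSsh": False,
        "isServerless": True,
        # env is sent as a list of {key, value} pairs, as in the mutation above.
        "env": [{"key": k, "value": v} for k, v in env.items()],
    }
    return {"query": SAVE_TEMPLATE_MUTATION, "variables": {"input": template_input}}


payload = build_save_template_payload(
    "<TEMPLATE_ID>", "Template Name", "dockerhub-user/image:tag", {"KEY": "VALUE"}
)
print(json.dumps(payload, indent=2))
```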

Pitfalls

  1. Mutation name: The mutation is saveTemplate, NOT updateTemplate. Its input type is SaveTemplateInput.
  2. Required fields: containerDiskInGb (Int!), volumeInGb (Int!), and env ([EnvironmentVariableInput]!) are all required, even if you are not changing them. Omitting any of them causes a validation error.
  3. volumeInGb quirk: Serverless templates "don't support" volumeInGb but the field is still required. Use volumeInGb: 0 to satisfy validation without attaching a volume.
  4. Shell escaping: Write the JSON payload to a temp file and send it with curl -d @/tmp/payload.json. This avoids having to escape the nested quotes of a GraphQL document on the command line.
  5. dockerArgs: When using a custom Docker image with a proper CMD/ENTRYPOINT, set dockerArgs: "" (empty string) so the image's own CMD is used. Non-empty dockerArgs override the image CMD.
  6. Common failure: If workers exit with code 2 and the logs show "No such file or directory" for the handler path, dockerArgs likely contains a git clone guarded by || true. When the clone fails (private repo, network issue), the failure is silently ignored and the handler file never exists. Fix: bake the handler into the Docker image instead of cloning at runtime.
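
Several of the pitfalls above can be caught before the request is sent. The following is a minimal sketch, assuming the template input is held as a plain dict; check_template_input and write_payload_file are hypothetical helpers, and the checks are heuristics drawn from this list rather than the API's own validation.

```python
import json
import tempfile

# Fields that pitfall 2 says must always be present.
REQUIRED_FIELDS = ("containerDiskInGb", "volumeInGb", "env")


def check_template_input(template_input: dict) -> list[str]:
    """Return human-readable problems found in a saveTemplate input dict."""
    problems = [
        f"missing required field: {field}"
        for field in REQUIRED_FIELDS
        if field not in template_input
    ]
    # Pitfall 3: serverless templates should send volumeInGb: 0.
    if template_input.get("isServerless") and template_input.get("volumeInGb", 0) != 0:
        problems.append("serverless templates should use volumeInGb: 0")
    # Pitfall 6: '|| true' can hide a failed git clone (heuristic string match).
    docker_args = template_input.get("dockerArgs") or ""
    if "|| true" in docker_args:
        problems.append("dockerArgs contains '|| true', which can hide a failed command")
    return problems


def write_payload_file(payload: dict) -> str:
    """Write the JSON payload to a temp file and return its path (pitfall 4)."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".json", delete=False, encoding="utf-8"
    ) as fh:
        json.dump(payload, fh)
        return fh.name


template_input = {
    "id": "<TEMPLATE_ID>",
    "dockerArgs": "",
    "containerDiskInGb": 20,
    "volumeInGb": 0,
    "isServerless": True,
    "env": [],
}
if not check_template_input(template_input):
    path = write_payload_file({"query": "<MUTATION>", "variables": {"input": template_input}})
    print(f"curl -d @{path} ...")  # remaining curl flags omitted here
```

The file path printed at the end can be passed to curl -d @<path>, so no GraphQL text has to pass through shell quoting.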

Debugging Worker Failures