lolwierd
2026-04-24 13:46:01 +05:30
commit 17cb564448
4 changed files with 520 additions and 0 deletions

24
.gitignore vendored Normal file

@@ -0,0 +1,24 @@
# macOS
.DS_Store
# Editors
.vscode/
.idea/
*.swp
*~
# Agent / local installs
node_modules/
.npm/
.claude/
.agents/
.augment/
# Local builds / tmp
*.log
tmp/
build/
# Env files (never commit)
.env
.env.*

104
AGENTS.md Normal file

@@ -0,0 +1,104 @@
# AGENTS.md
Guidance for AI coding agents (Claude Code, Cursor, Codex, opencode, ...) and human contributors editing this repository. This file is loaded automatically by agents that honour `AGENTS.md` conventions.
## What this repo is
A collection of [agent skills](https://agentskills.io/) for working with Excloud. Skills are `SKILL.md` playbooks that an agent loads on demand; they are prose, not code. They encode Excloud-specific knowledge (auth, safety rails, command syntax, error recovery) that the agent would otherwise have to rediscover each run.
Skills are installed with [`npx skills add`](https://www.npmjs.com/package/skills); see `README.md` for user-facing install instructions.
## Directory layout
```
skills/
<skill-name>/ # kebab-case, matches the `name:` in SKILL.md frontmatter
SKILL.md # required; the playbook
scripts/ # optional; executable helpers the skill references
*.sh # prefer bash; mark chmod +x
references/ # optional; long-form docs the skill links to
*.md
```
One skill per directory. Do not put multiple skills' playbooks in the same `SKILL.md`.
## SKILL.md conventions
### Frontmatter
```yaml
---
name: <kebab-case-name>
description: <one or two sentences: what the skill does and when an agent should load it>
---
```
- `name` must match the parent directory name and be unique across the repo.
- `description` is what the agent sees before it decides to load the skill — write it as a trigger hint. Name the resource or verb, the pain point it addresses, and ideally a couple of trigger phrases.
- Optional: `metadata.internal: true` hides the skill from default discovery; users opt in with `INSTALL_INTERNAL_SKILLS=1`.
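The `name` rule above can be checked mechanically before opening a PR. The following is a minimal sketch, assuming the `skills/<skill-name>/SKILL.md` layout and unquoted `name:` values; `check_skill_names` is a hypothetical helper, not part of the `skills` CLI:

```bash
# Hypothetical helper: verify that each SKILL.md frontmatter `name:`
# matches its parent directory. Mismatches are printed to stderr.
check_skill_names() {
  local root=${1:-skills} status=0 file dir name
  for file in "$root"/*/SKILL.md; do
    [ -f "$file" ] || continue
    dir=$(basename "$(dirname "$file")")
    # Read `name:` only from the first frontmatter block
    # (the lines between the first two `---` fences).
    name=$(awk '
      /^---[[:space:]]*$/ { fences++; if (fences == 2) exit; next }
      fences == 1 && /^name:/ {
        sub(/^name:[[:space:]]*/, "")
        sub(/[[:space:]]+$/, "")
        print
        exit
      }
    ' "$file")
    if [ "$name" != "$dir" ]; then
      echo "name mismatch in $file: name='$name' dir='$dir'" >&2
      status=1
    fi
  done
  return "$status"
}
```

Run it from the repo root as `check_skill_names skills`; a non-zero exit means at least one skill needs its directory or frontmatter renamed.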
### Body
Write for an agent that has never seen this surface. The best skills:
- Tell the agent what to **discover before acting** (`list` / `get` / `--help`) rather than hard-coding IDs, flags, or versions.
- Call out destructive operations explicitly and tell the agent to confirm first.
- State the auth model once, plus what "not authenticated" looks like.
- Capture real error strings the CLI or API emits, paired with the actual cause and fix. These are pure gold — they turn agent flailing into one-shot recovery.
- Document output shapes (table vs. JSON vs. Go-struct dump) so the agent picks the right parser (`awk`, `jq`, or "don't parse this").
- Prefer "here is the shape you'll see" over "here is the schema" when the surface shifts often.
- Open with a disclaimer that the installed tool's `--help` is canonical and this file can drift. The agent should trust the tool over the skill when they disagree.
Avoid:
- Hard-coded account / org / resource IDs.
- Personal paths like `~/Projects/…` or `/Users/<name>/…` (those leak into generated commands).
- Source-tree pointers (consumers of the skill won't have the repo checked out).
- Any secret material (tokens, keys, passwords) — even in examples.
- `"last updated"` timestamps or version numbers in the body; they rot fast.
### Length
Keep `SKILL.md` under ~400 lines when you can. If a topic needs more, split it into a `references/<topic>.md` and link to it from `SKILL.md`.
## Adding a new skill
1. `mkdir -p skills/<skill-name>` (kebab-case; matches `name:` in frontmatter).
2. Create `skills/<skill-name>/SKILL.md` with the frontmatter above.
3. Verify the skill loads cleanly:
```bash
# list only — verifies the skill is discoverable without touching your agent dirs
npx skills add ./ --list
# real install from the local checkout
npx skills add ./ --skill <skill-name>
```
4. Update the "Available skills" section of `README.md` with a short blurb and the main use-cases.
5. Open a PR. One skill per PR keeps review easy.
## Editing an existing skill
- Changes to `SKILL.md` should be reviewable as prose — small, focused edits, with the commit message explaining the _intent_ (e.g. "document --download flag on compute scp" beats "update skill").
- If you verified a new behaviour against a live tool, mention the tool version or the date of verification in the PR description (not in the skill body).
- If a section becomes stale because the upstream tool's surface shifted, prefer rewriting it to point the agent at `--help` rather than chasing every flag change.
## Testing
There is no build step. A "test" is: does an agent loaded with this skill do the right thing on a representative prompt? A useful flow:
1. Install the branch locally: `npx skills add <path-to-local-checkout> --skill <name> -a claude-code`.
2. Ask the agent to do something the skill targets (create a VM, fetch a kubeconfig, etc.).
3. Watch for: does it discover IDs from `list`? Does it confirm before destructive ops? Does it recognise real error strings?
If all three feel right, the skill is doing its job.
## Style
- Markdown only. Plain prose; lists for flag enumerations; fenced code blocks for commands.
- Backticks around every flag, file path, env var, and command substring.
- American or British English: pick one and stay consistent within a skill.
- No emojis in `SKILL.md` unless the user explicitly asked for a playful tone somewhere.
## Release / publish
There is no package to publish. `npx skills add` clones this repo directly from `git.excloud.in` and reads `SKILL.md` files straight from the default branch, so merging to `main` is the release — no tags, no npm publish step.

82
README.md Normal file

@@ -0,0 +1,82 @@
# Excloud Agent Skills
A collection of [agent skills](https://agentskills.io/) for working with [Excloud](https://excloud.in) from AI coding agents (Claude Code, Cursor, Codex, opencode, etc.).
Skills are `SKILL.md` playbooks the agent can load on demand — authenticated paths, safety guardrails, command syntax, and error recovery for a specific surface. Install them with the [`skills`](https://www.npmjs.com/package/skills) CLI.
## Install
This repo lives on Excloud's self-hosted Gitea at <https://git.excloud.in/excloud-in/excloud-skills>. Point `npx skills add` at the full clone URL (note the `.git` suffix — without it, the CLI looks for a `.well-known/agent-skills/index.json` manifest rather than cloning):
```bash
npx skills add https://git.excloud.in/excloud-in/excloud-skills.git
```
Install a single skill:
```bash
npx skills add https://git.excloud.in/excloud-in/excloud-skills.git --skill excloud-cli
```
Install all skills into every supported agent without prompts:
```bash
npx skills add https://git.excloud.in/excloud-in/excloud-skills.git --all
```
Target specific agents with repeated `-a` flags (defaults to global, user-level install):
```bash
npx skills add https://git.excloud.in/excloud-in/excloud-skills.git -a claude-code -a opencode --all
```
Dry-run — list what the repo offers without installing:
```bash
npx skills add https://git.excloud.in/excloud-in/excloud-skills.git --list
```
Private-repo users: set up git auth for `git.excloud.in` first (an HTTPS credential helper / `~/.netrc`, or use the SSH form `git@git.excloud.in:excloud-in/excloud-skills.git`). `npx skills add` shells out to plain `git clone`, so anything `git clone` can read, it can install from.
By default skills land under the agent's standard directory (`~/.claude/skills/`, `.agents/skills/`, `.augment/skills/`, ...). Run `npx skills --help` for the full option list, and `npx skills list` to see what is currently installed.
## Available skills
### `excloud-cli`
Safe end-to-end control of Excloud resources through the `exc` CLI. Covers compute (create / inspect / resize / restart / terminate, delete protection, exec / scp / console), networking (subnets, public IPv4, security groups and rules/bindings), volumes and snapshots, SSH keys, Kubernetes (clusters, workers, kubeconfig fetch / merge), IAM (accounts, service accounts, API keys, policies), billing, quota, serial console logs, and metrics.
**Use when:** the user asks to plan or run `exc` commands, provision / introspect / tear down VMs, attach a public IP, adjust a security group, pull a kubeconfig, debug a stuck boot via serial logs, or exec / scp against a VM.
**Key guidance the skill encodes:**
- Auth precedence (`EXCLOUD_ACCESS_TOKEN` / env tokens / `~/.exc/config`) and what "not authenticated" means.
- Discovery first: lookup tables (`exc compute instancetype list`, `image list`, `subnet list`, `securitygroup list`, ...) as the source of truth — no hard-coded IDs.
- Safety guardrails around `terminate`, `publicip release`, `rule delete`, `cluster delete`, `apikey delete`, and destructive shell commands over `exec`.
- Interactive access patterns: when to use `exec` (one-shot, bash-interpreted), when `scp` (upload/download, symlinks rejected), when `console` (interactive TTY, SSH ↔ WS fallback).
- Output-format buckets so agents pipe to `awk`/`jq` correctly.
- A cheat sheet of error messages the CLI actually emits, paired with what each means and how to fix it.
Prerequisites on the user's machine: the `exc` CLI on `$PATH` and an `exc login` session. `k8s cluster kubeconfig merge` additionally needs `kubectl`.
## Repository layout
```
excloud-skills/
├── README.md # this file
├── AGENTS.md # contributor / agent guidance for editing skills
├── .gitignore
└── skills/
└── excloud-cli/
└── SKILL.md # the skill itself
```
Each skill lives in `skills/<skill-name>/` with a single `SKILL.md`. Scripts (`scripts/`) and reference docs (`references/`) can live alongside `SKILL.md` when a skill needs them.
## Contributing a new skill
See [`AGENTS.md`](./AGENTS.md) for conventions (directory name, `SKILL.md` frontmatter, when to add scripts vs. references, how to keep skills discovery-friendly). Open a PR with the new `skills/<skill-name>/` directory; once merged, the agent-side install is `npx skills add https://git.excloud.in/excloud-in/excloud-skills.git --skill <name>`.
## License
TBD.

310
skills/excloud-cli/SKILL.md Normal file

@@ -0,0 +1,310 @@
---
name: excloud-cli
description: Drive Excloud resources (compute, networking, security groups, volumes, snapshots, public IPs, IAM, billing, Kubernetes) through the `exc` CLI. Use when a user asks to plan or execute `exc` commands - creating / inspecting / updating / deleting VMs, running commands on them via `exec` / `scp` / `console`, managing security groups and public IPs, or pulling Kubernetes kubeconfigs - with safety guardrails and auth checks.
---
# Excloud CLI
This skill is a _starting guide_, not a spec. The `exc` CLI is generated from a live OpenAPI surface, so commands and flags change. **Whenever a command or flag in this file disagrees with `exc <command> --help`, trust the CLI.** Re-read the relevant `--help` before shaping a real command, and prefer discovering the surface interactively over memorising it from here.
```
exc --help
exc <group> --help
exc <group> <subcommand> --help
```
Everything below has been observed working at some point; the agent should still verify before running anything destructive.
---
## Workflow principles
- Prefer `exc` for all Excloud actions unless the user explicitly asks for direct API / SDK use.
- Confirm before anything destructive (see Safety).
- If authentication is missing or expired, tell the user to run `exc login` and stop — do not invent tokens.
- When flag names or behaviours look odd, run `exc <...> --help` rather than guessing. Generated CLIs evolve between releases.
- Read `list` / `get` output shapes carefully before trying to parse them; there is no universal `-o json` flag today (see Output formats).
## Authentication
The CLI reads credentials in this precedence order:
1. `EXCLOUD_ACCESS_TOKEN` or `ACCESS_TOKEN` env var.
2. `EXCLOUD_ID_TOKEN` or `ID_TOKEN` env var.
3. `~/.exc/config` (JSON) written by `exc login` — contains the default account, default org, default zone, and per-account `id_token` / `access_token` material.
If none of those are present or valid, commands that need a token (`exec`, `scp`, `console`, `k8s cluster kubeconfig get/merge`) fail with `` not authenticated; run `exc login` ``. `exc login` opens a browser flow and serves a callback on `http://localhost:7899/callback`.
`exc me`, `exc org list`, `exc account list`, `exc config list` are useful "where am I?" probes after login.
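The precedence order above can be mirrored in a small preflight check. This is a sketch only: `exc_credential_source` is a hypothetical helper, it tests for the *presence* of credentials rather than their validity, and the optional path argument exists purely so the function can be pointed at a different file when testing.

```bash
# Hypothetical preflight: report which credential source the documented
# precedence order would pick. Presence only; an expired token still "wins".
exc_credential_source() {
  local config_path=${1:-"$HOME/.exc/config"}
  if [ -n "${EXCLOUD_ACCESS_TOKEN:-}" ] || [ -n "${ACCESS_TOKEN:-}" ]; then
    echo "env:access_token"
  elif [ -n "${EXCLOUD_ID_TOKEN:-}" ] || [ -n "${ID_TOKEN:-}" ]; then
    echo "env:id_token"
  elif [ -s "$config_path" ]; then
    echo "file:$config_path"
  else
    echo "none"
    return 1
  fi
}
```

If it prints `none`, stop and ask the user to run `exc login`. Any other answer is only a hint, so follow it with a real probe such as `exc me`.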
## Safety guardrails
Require explicit user confirmation before running any of these:
- `exc compute terminate` (especially with `--delete_root_volume`).
- `exc compute volume delete`, `exc compute snapshot delete`, `exc compute key delete`.
- `exc compute publicip release`, `exc compute publicip disassociate`.
- `exc securitygroup delete`, `exc securitygroup rule delete`, `exc securitygroup binding delete`.
- `exc k8s cluster delete`, `exc k8s cluster worker delete`.
- `exc account revoke`, `exc serviceaccount delete`, `exc apikey delete`, `exc policy delete`, `exc policy binding delete`.
For shell commands delivered through `exc compute exec` or an `exec` script file, refuse or confirm explicitly before running anything like `shutdown`, `reboot`, `rm -rf`, `mkfs`, `dd`, `wipefs`, rewrites of `/etc/fstab`, bootloader edits, or `systemctl stop ssh*` (the last one will make the VM unreachable over SSH — see Interactive access).
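One way to make the list above actionable is a classifier that runs before any `exc` invocation. The sketch below is illustrative: `is_destructive_exc` is a hypothetical helper, and its pattern list is copied from this section, so it goes stale whenever the CLI adds or renames a destructive subcommand.

```bash
# Hypothetical guard: succeed (exit 0) when the given exc arguments name a
# subcommand from the "confirm first" list in this section.
is_destructive_exc() {
  local arg words=""
  # Collect the leading subcommand words and stop at the first flag, e.g.
  # "compute volume delete --volume_id x" -> "compute volume delete".
  for arg in "$@"; do
    case "$arg" in -*) break ;; esac
    words="${words:+$words }$arg"
  done
  case "$words" in
    "compute terminate"|"compute volume delete"|"compute snapshot delete"|\
    "compute key delete"|"compute publicip release"|"compute publicip disassociate"|\
    "securitygroup delete"|"securitygroup rule delete"|"securitygroup binding delete"|\
    "k8s cluster delete"|"k8s cluster worker delete"|\
    "account revoke"|"serviceaccount delete"|"apikey delete"|\
    "policy delete"|"policy binding delete")
      return 0 ;;
    *)
      return 1 ;;
  esac
}
```

A caller would check `is_destructive_exc compute terminate --vm_id <id>` and, on success, pause for explicit user confirmation before running the real command. It does not inspect shell payloads passed through `exec`; those still need the manual review described above.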
## Discoverability and authoritative lookups
The skill does _not_ hard-code IDs, instance type names, image IDs, subnet IDs, security group IDs, or zone IDs. Those change per account and over time. Before any `create` / `rule create` / `binding create` call, confirm the IDs with the relevant `list` command:
- `exc compute instancetype list` — CPU / memory / disk for each advertised type. Pick the smallest type whose CPU/MEMORY columns cover the workload; default to the cheapest advertised micro for scratch work and step up for real workloads.
- `exc compute instancetype capacity --instance_type <type>` — per-zone availability probe (`available=true|false`). Unknown types return `false` gracefully rather than 404, so `true` is the only reliable signal.
- `exc compute image list` — authoritative image catalog. Image IDs vary per org; do not hard-code them.
- `exc compute subnet list` + `exc compute subnet get --id <id>` — check `DISABLE_IPV4_PUBLIC_IP`: subnets with this set cannot take `--allocate_public_ipv4=true` at create time.
- `exc securitygroup list` + `exc securitygroup rule list --security_group_id <id>` + `exc securitygroup binding list --security_group_id <id>` (or `--interface_id <id>`) — confirm what a SG allows and where it's bound before relying on it.
- `exc compute publicip list` / `exc compute key list` / `exc compute volume list` / `exc compute snapshot list` — authoritative inventories for each resource type.
If `--help` on the installed CLI shows commands or flags not documented here, prefer `--help`.
## Common VM lifecycle
### Create
Required flags for `exc compute create`:
- `--name <dns-compatible-name>` (lowercase, `[a-z0-9][a-z0-9-]*[a-z0-9]`).
- `--subnet_id <id>` (zone of the subnet must match your default zone).
- `--allocate_public_ipv4=true|false` — the flag must be explicit.
- `--image_id <id>`
- `--instance_type <type>`
- `--root_volume_size_gib <n>`
Useful optional flags (verify via `--help`):
- `--security_group_ids <id1,id2>` — attach one or more SGs to the primary interface at create time. **If you omit this, the VM may come up with no SG attached** — set at least one.
- `--ssh_pubkey "<key or key name>"` — inline SSH public key string _or_ the `name` of a key managed via `exc compute key`.
- `--public_ipv4_reservation_id <id>` — attach an existing reserved public IPv4 instead of allocating a new ephemeral one.
- `--root_password <pw>` — for console / emergency access only; SSH keys are strongly preferred.
- `--root_volume_id <id>` **or** `--root_volume_source_snapshot_id <id>` (mutually exclusive) — reuse an existing volume or clone from a snapshot for the root disk.
- `--root_volume_baseline_iops <n>` / `--root_volume_baseline_throughput_mbps <n>` — provisioned performance for EBS-backed roots.
- `--user_data <inline>` or `--user-data-file <path>` — first-boot script. See User data below.
Do not pass flags the help output does not list; deprecated flags (e.g. `--root_volume_perf_tier`) are removed or hidden and will error or be ignored.
`create` prints a one-row table with at minimum `ID`, `NAME`, `STATE` (usually `STARTING` or `CREATING`), `ZONE`, `SUBNET`, `ROOT_VOLUME_ID`, `PUBLIC_IPV4`, `INTERFACE_IPV4`, `INTERFACE_IPV6`. Note that this row does **not** include `INTERFACE_ID`; fetch that later with `exc compute get --id <vm_id>`.
### Wait for RUNNING (no native `--wait`)
The CLI does not provide a wait primitive. Poll `compute get` and key off the `STATE` column:
```bash
until [ "$(exc compute get --id <vm_id> | awk 'NR==2 {for (i=1;i<=NF;i++) if ($i ~ /^(CREATING|STARTING|RUNNING|STOPPING|STOPPED|RESTARTING|TERMINATING|TERMINATED)$/) print $i}')" = "RUNNING" ]; do sleep 3; done
```
(The loop scans the data row for a known state value rather than reading a fixed column index, because the column ordering in `compute get` has shifted between releases; do not hard-code `$4`.)
Typical progression for a fresh VM: `CREATING` → `STARTING` → `RUNNING` in roughly half a minute, plus another 15 to 20 seconds before cloud-init finishes and SSH answers. After the VM reports `RUNNING`, wait that extra interval before the first `exc compute exec` or SSH connection; earlier attempts are unreliable.
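The one-line poll never returns if the VM is terminated or the state never arrives. A bounded variant is sketched here under the same assumptions about the `compute get` table (header on row 1, data on row 2); `vm_state` and `wait_for_running` are hypothetical helper names and the retry defaults are arbitrary.

```bash
# Print the first token on the data row that is a known VM state.
vm_state() {
  exc compute get --id "$1" 2>/dev/null | awk 'NR == 2 {
    for (i = 1; i <= NF; i++)
      if ($i ~ /^(CREATING|STARTING|RUNNING|STOPPING|STOPPED|RESTARTING|TERMINATING|TERMINATED)$/) {
        print $i
        exit
      }
  }'
}

# Poll until RUNNING. Returns 0 on RUNNING, 2 if the VM is being or has been
# terminated, 1 if max_tries polls pass without reaching RUNNING.
wait_for_running() {
  local vm_id=$1 max_tries=${2:-40} delay=${3:-3}
  local tries=0 state=""
  while [ "$tries" -lt "$max_tries" ]; do
    state=$(vm_state "$vm_id")
    case "$state" in
      RUNNING) return 0 ;;
      TERMINATING|TERMINATED)
        echo "vm $vm_id is $state" >&2
        return 2 ;;
    esac
    tries=$((tries + 1))
    sleep "$delay"
  done
  echo "timed out waiting for $vm_id (last state: ${state:-unknown})" >&2
  return 1
}
```

Call it as `wait_for_running <vm_id>`; on success, still allow the extra settling time before the first `exec` or SSH attempt.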
### Inspect and control
- `exc compute list` — hides `TERMINATED` VMs by default. Use this for "what is alive now".
- `exc compute instances list` — rich-metadata variant that shows **all** states unless filtered; add `--states running,stopped`, `--created_after <rfc3339>`, `--created_before <rfc3339>` as appropriate.
- `exc compute get --id <vm_id>` — single VM detail. Shows `INTERFACE_ID` (needed for publicip / SG binding ops) but not `ROOT_VOLUME_ID`.
- `exc compute rename --vm_id <id> --name <new_name>`
- `exc compute resize --vm_id <id> --instance_type <type>` — generally requires the VM to be STOPPED first.
- `exc compute start --vm_id <id>`
- `exc compute stop --vm_id <id> [--reserve_public_ipv4]` — pass `--reserve_public_ipv4` to keep the ephemeral public IPv4 across the stop.
- `exc compute restart --vm_id <id>` — a full API-level restart; useful to recover a VM whose SSH stack you broke from `exec`.
- `exc compute terminate --vm_id <id> [--delete_root_volume]` — without `--delete_root_volume` the root volume is kept and can be reused via `create --root_volume_id <id>`.
### Delete protection
Three commands can change the `delete_protection` flag; all return the updated VM as JSON:
- `exc compute protect --vm-id <id>` — enable protection.
- `exc compute unprotect --vm-id <id>` — disable protection.
- `exc compute rename --vm_id <id> --name <name> [--delete_protection=true|false]` — rename the VM and, if `--delete_protection` is passed, set protection in the same call. Omitting the flag on `rename` leaves the protection flag untouched, so a bare rename will not accidentally clear it.
While protection is enabled, `exc compute terminate` returns `VM delete protection is enabled. Disable delete protection before terminating this instance.` (exit 1). Run `unprotect` first, then retry `terminate`.
### Termination clean-up
After terminate with `--delete_root_volume`, confirm both with:
```bash
exc compute get --id <vm_id> # STATE should become TERMINATED in a few seconds
exc compute volume list # the root volume should disappear / move to DELETING
```
## User data
- `--user-data-file <path>` wins over `--user_data <inline>` if both are set (the inline one is ignored with a warning).
- The CLI is permissive — it only warns when content looks neither like a shell script nor a cloud-init document. Accepted heuristics:
- Shebang start: `#!/bin/bash`, `#!/usr/bin/env bash`, `#!/bin/sh`.
- First non-empty line begins with `#cloud-` (e.g. `#cloud-config`, `#cloud-boothook`).
- Prefer real `#!/bin/bash` scripts or `#cloud-config` YAML; other content will run but triggers the warning.
## Interactive access: `connect`, `exec`, `scp`, `console`
`exc compute connect` is the low-level session primitive; `exec`, `scp` and `console` all build on it.
- `exc compute connect --vm_id <id> [--user ubuntu] [--return_private_key]` — returns a short-lived session ID and, when `--return_private_key` is set, a base64-encoded PEM authorised for the VM.
- `exc compute exec --vm-id <id> (--command "<cmd>" | --script-file <path>) [--user ubuntu] [--timeout <seconds>]`
- `--command` and `--script-file` are mutually exclusive; exactly one is required.
- `--script-file` is **interpreted as bash on the VM** (piped into `bash -s`). It is not a plain upload — plain-text files that contain non-command lines will fail with `command not found`. For transferring files verbatim, use `scp`.
- `--timeout` has a sensible default (tens of seconds) and a hard backend cap (check `--help`). A timed-out command prints `command timed out` and returns exit 124.
- Remote exit codes propagate: `exit 42` on the VM → local exit 42, with `Process exited with status 42` on stderr.
- On success and failure alike the command emits `warning: host key not verified` on stderr — that is expected (the CLI trusts the instance-connect key without pinning). Redirect stderr when scripting.
- SSH targets are tried in order: public IPv4 → any interface private IPv4 → any interface IPv6. If all SSH targets fail, `exec` automatically falls back to the WebSocket console transport. The fallback uses a unique marker to capture the remote exit code. Whether the WS transport succeeds depends on the compute service — if it rejects the session (`unknown session`) or times out (`Timeout connecting to the instance`), `exec` will fail with a 255 exit. In that case, confirm the VM is actually reachable via its public IPv4 (security group / sshd status) rather than relying on WS.
- `exc compute scp --vm-id <id> --src <src> --dst <dst> [--user ubuntu] [--recursive] [--download] [--timeout <seconds>]`
- Default direction is **upload** (local → VM). Pass `--download` to pull files from the VM to local.
- `--recursive` is required for directory transfers in either direction.
- Symlinks are **rejected** — an encountered symlink fails the whole transfer with `symlink entries are not supported: <path>` (exit 1). Dereference or archive them locally (e.g. `tar -czhf ...`) before calling `scp`.
- `scp` does **not** fall back to the WebSocket transport when SSH is unreachable; it errors out. Use `scp` only on VMs whose SSH is reachable.
- If the destination requires elevation, upload to a writable path (e.g. `/tmp/...`) and move with `sudo` via `exc compute exec`.
- `exc compute console --vm-id <id> [--user ubuntu] [--timeout <seconds>] [--ssh | --ws]`
- Opens an **interactive** shell on the VM. By default it tries SSH first, then falls back to the WebSocket console.
- `--ssh` forces SSH only, `--ws` forces WebSocket only.
- Requires a real TTY — piping input or running inside a non-interactive shell will fail with `failed to set terminal to raw mode: inappropriate ioctl for device`. For scripted one-shots use `exec`; for interactive work suggest the user run `exc compute console` directly.
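Because `scp` rejects symlinks and cannot write to privileged paths, a common pattern is to archive locally with dereferencing, upload the archive to `/tmp`, and unpack it with `exec`. The following is a sketch under those assumptions; `upload_tree` and the remote file name `/tmp/upload-tree.tgz` are hypothetical, and the remote directory must not contain single quotes.

```bash
# Hypothetical helper: upload a directory tree that may contain symlinks.
# Usage: upload_tree <vm_id> <local_dir> <remote_dir>
upload_tree() {
  local vm_id=$1 src_dir=$2 remote_dir=$3
  local archive rc=0
  archive=$(mktemp "${TMPDIR:-/tmp}/upload-tree.XXXXXX") || return 1
  # -h dereferences symlinks, so the archive holds regular files only.
  if ! tar -czhf "$archive" -C "$src_dir" .; then
    rm -f "$archive"
    return 1
  fi
  {
    exc compute scp --vm-id "$vm_id" --src "$archive" --dst /tmp/upload-tree.tgz &&
      exc compute exec --vm-id "$vm_id" --command \
        "mkdir -p '$remote_dir' && tar -xzf /tmp/upload-tree.tgz -C '$remote_dir' && rm -f /tmp/upload-tree.tgz"
  } || rc=$?
  rm -f "$archive"
  return "$rc"
}
```

If `remote_dir` needs elevation, prefix the remote `mkdir` and `tar` with `sudo`. The helper inherits the `scp` limitation above: it fails outright when SSH to the VM is unreachable.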
### Troubleshooting SSH / exec failures
1. Does the VM have a reachable address? `exc compute get --id <vm_id>` — check `PUBLIC_IPV4`, `INTERFACE_IPV4`, `INTERFACE_IPV6`.
2. Is a security group bound and does it permit SSH?
- `exc securitygroup binding list --interface_id <if_id>`
- `exc securitygroup binding create --interface_id <if_id> --security_group_id <sg_id>`
- `exc securitygroup rule list --security_group_id <sg_id>`
3. Is there an ingress rule for port 22 from your source IP? If not, create one:
- `exc securitygroup rule create --security_group_id <sg_id> --is_ingress=true --protocol TCPv4 --port_range 22 --cidr "<your_ip>/32"`
4. Is there an egress rule for the VM to reach the internet? Most setups want a broad egress rule:
- `exc securitygroup rule create --security_group_id <sg_id> --is_ingress=false --protocol IPv4 --port_range ANY --cidr 0.0.0.0/0`
5. If `exec` says `connection refused` on port 22, sshd is likely not running. `exc compute restart --vm_id <id>` brings it back (the API-level restart does not need SSH).
## Serial console logs
`exc compute seriallogs --id <vm_id> [--boot_id <id>] [--offset <n> --direction older|newer] [--limit <n>] [-f]`
- Omitting `--boot_id` returns the latest boot.
- `--offset` and `--direction` must be set together; the valid directions are `older` and `newer`.
- `--limit` must be positive when set; typical default is ~200 and the backend has a hard cap.
- `-f / --follow` polls for newer lines every couple of seconds — not a native stream.
- Lines are prefixed with `[<rfc3339 timestamp> offset=<n>]`. Look for `Cloud-init ... finished`, `Reached target ... cloud-init.target`, and the login banner (`Ubuntu X.Y.Z ip-a-b-c-d ttyS0`) to confirm a clean boot.
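The boot markers above lend themselves to a simple filter. This sketch assumes the log text arrives on stdin; `boot_finished` is a hypothetical helper, and the patterns are loose matches for the two cloud-init markers named above, not an exhaustive definition of a healthy boot.

```bash
# Hypothetical filter: succeed if stdin contains a cloud-init completion marker.
# Reads all input (no early exit) so the producing command is not cut short.
boot_finished() {
  grep -E 'Cloud-init .*finished|Reached target .*cloud-init\.target' >/dev/null
}
```

Typical use is `exc compute seriallogs --id <vm_id> --limit 200 | boot_finished && echo "cloud-init done"`. A failure only means the marker is absent from the lines fetched, so widen the window before concluding the boot is stuck.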
## Networking
### Subnets
- `exc compute subnet list` — the `DISABLE_IPV4_PUBLIC_IP` column is the gate on whether `--allocate_public_ipv4=true` is legal.
- `exc compute subnet get --id <id>`
### Public IPv4
- `exc compute publicip list` / `exc compute publicip get --id <reservation_id>`
- `exc compute publicip reserve --name <name> [--interface_id <if_id>]` — if `--interface_id` is passed the new reservation is also attached in one step.
- `exc compute publicip associate --interface_id <if_id> --reservation_id <id>`
- `exc compute publicip disassociate --reservation_id <id>`
- `exc compute publicip rename --reservation_id <id> --name <new_name>`
- `exc compute publicip release --reservation_id <id>` (destructive).
### Local IP check
`exc compute localip --ip <addr>` asks the service whether a given IP falls inside Excloud's local ranges. It returns `{ip, is_local}` and is a backend-defined membership probe — not a "what is my public IP" helper (observed returning `is_local=true` for some clearly non-Excloud addresses, so do not use it as a precise classifier). To learn the caller's public IP, use an external service (e.g. `curl -s https://api.ipify.org`).
## Security groups
- `exc securitygroup create --name <name> [--description "..."]`
- `exc securitygroup list`
- `exc securitygroup get --id <sg_id>` (note: the flag here is `--id`, not `--security_group_id`).
- `exc securitygroup delete --security_group_id <sg_id>`
### Rules
- `exc securitygroup rule create --security_group_id <id> --is_ingress=true|false --protocol <proto> --port_range <range> --cidr <cidr> [--description "..."]`
- `--is_ingress` is **required**. Pass `=true` for ingress, `=false` for egress. Omitting it errors with `required flag(s) "is_ingress" not set`.
- `--protocol` takes Excloud family strings such as `TCPv4`, `UDPv4`, `ICMPv4`, `IPv4` — verify current valid values via a successful `rule list` if unsure.
- `--port_range` accepts single ports (`22`), ranges (`80-443`), or `ANY`.
- Rules are not updatable — to change one, `rule delete` and `rule create` again.
- `exc securitygroup rule list --security_group_id <id>`
- `exc securitygroup rule delete --security_group_rule_id <id>` (destructive).
### Bindings
- `exc securitygroup binding create --interface_id <if_id> --security_group_id <sg_id>`
- `exc securitygroup binding list (--interface_id <id> | --security_group_id <id>)` — at least one filter is required.
- `exc securitygroup binding delete --interface_id <if_id> --security_group_id <sg_id>`
## Volumes and snapshots
- `exc compute volume list` / `exc compute volume get --id <id>`
- `exc compute volume create --name <name> --size_gib <n> [--source_snapshot_id <id>] [--baseline_iops <n>] [--baseline_throughput_mbps <n>]` — zone is injected from config; there is no `--zone_id` flag.
- `exc compute volume rename --volume_id <id> --name <new_name>`
- `exc compute volume resize --volume_id <id> --new_size_gib <n> [--baseline_iops <n>] [--baseline_throughput_mbps <n>]`
- `exc compute volume delete --volume_id <id>` (destructive).
- `exc compute snapshot list` / `exc compute snapshot create --volume_id <id>` / `exc compute snapshot delete --snapshot_id <id>`
## SSH key catalog
- `exc compute key list` / `exc compute key get --id <id>`
- `exc compute key create --name <name> (--ssh-public-key "<pub>" | --ssh-public-key-path <file>)`
- `exc compute key delete --id <id>`
- The key `name` can be passed to `compute create --ssh_pubkey` in place of a raw public key string.
## Kubernetes
- `exc k8s health`
- `exc k8s cluster list`
- `exc k8s cluster create --control_plane_image_id <id> --control_plane_instance_type <type> --subnet_id <id> --root_volume_size_gib <n> [--allocate_public_ipv4] [--security_group_ids <id1,id2>] [--ssh_pubkey "<pubkey>"] [-o <path>]`
- The response contains the admin kubeconfig inline. Passing `-o <path>` writes it to disk (mode 0600, creating parent dirs) and strips it from stdout — strongly preferred.
- `exc k8s cluster delete --cluster_id <id>` (destructive).
- `exc k8s cluster worker list --cluster_id <id>`
- `exc k8s cluster worker create --cluster_id <id> --worker_image_id <id> --worker_instance_type <type> --subnet_id <id> --root_volume_size_gib <n> [--allocate_public_ipv4] [--security_group_ids <ids>] [--ssh_pubkey "<pubkey>"]`
- `exc k8s cluster worker delete --cluster_id <id> --worker_id <id>` (destructive).
- `exc k8s cluster kubeconfig get --cluster_id <id> [-o <path>]` — fetches the current kubeconfig and prints to stdout (or writes to `-o` with mode 0600). Returns a clear 404 if the cluster id is unknown.
- `exc k8s cluster kubeconfig merge --cluster_id <id> [--kubeconfig <path>] [--backup=true|false]` — merges into `~/.kube/config` (or `--kubeconfig`) using `kubectl config view --merge --flatten --raw`. Requires `kubectl` on PATH. `--backup` defaults to `true` and writes `<path>.bak`, `<path>.bak1`, ... before overwriting.
- `exc k8s bootstrap controlplane get --vm_id <id> --x-exc-imds-token <token>` — operator bootstrap path; the IMDS token must come from inside the VM's IMDS agent, not be invented.
## IAM, billing, quota
- `exc org list`
- `exc account list` / `exc account invite --email <email>` / `exc account revoke --email <email>` (the revoke flag is `--email`, not an invite id).
- `exc serviceaccount list` / `exc serviceaccount delete --name <name>`
- `exc apikey list` / `exc apikey create` (prints the new key once — capture it immediately) / `exc apikey delete --hash <hash>`
- `exc policy list` / `exc policy delete --id <policy_id>`
- `exc policy binding list (--account_id <id> | --service_account_id <id>)`: at least one filter is required; passing neither fails with `either account_id or service_account_id must be provided`.
- `exc policy binding delete --policy_id <id> (--account_id <id> | --service_account_id <id>)`
- `exc billing get` / `exc quota`
## Config and misc
- `exc me` / `exc version` / `exc completion <bash|zsh|fish|powershell>`
- `exc config list` — shows the current default account / org / zone and configured accounts.
- `exc config set [-a|--account <account_id>] [-o|--org <org_id>]` — no `--zone` here; default zone is set at login time.
## Output formats
Every command either prints a column table (or TSV) or prints JSON — no command should print raw Go-struct dumps anymore. Both shapes are machine-parseable; pick your tool accordingly.
- **Column tables / TSV** (awk / `cut` / `awk -F\t` friendly): `compute list`, `compute instances list`, `compute get`, `compute create`, `compute terminate` (TSV `vm_id\tstate`), `compute instancetype list` / `capacity`, `compute image list`, `compute subnet list`, `compute volume list`, `compute volume get`, `compute snapshot list`, `compute publicip list`, `compute key list`, `securitygroup list` / `rule list` / `binding list`, `org list`, `account list`, `apikey list`, `policy list`, `config list`, `compute seriallogs`.
- **JSON** (pipe through `jq`): `me`, `quota`, `billing get`, `compute health` (`{"raw":"OK"}`), `k8s health`, `compute subnet get`, `compute publicip get`, `compute key get`, `securitygroup get`, `compute metrics`, `compute connect`, `serviceaccount list`, `compute protect`, `compute unprotect`, `compute rename`, `k8s cluster kubeconfig get` (raw kubeconfig YAML, not JSON-wrapped), and the inline `kubeconfig` field inside the JSON response from `k8s cluster create` when `-o` is not set.
Before scripting heavy logic against a command, run it once and check the shape. The split between "table" and "JSON" is not always guessable — lists tend to be tables, getters tend to be JSON, but verify.
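For the table-shaped commands, selecting a column by its header name is more robust than a fixed index. This is a minimal sketch, assuming a whitespace-separated table with a single header row; `table_column` is a hypothetical helper. It will misalign on rows with empty cells or values that contain spaces, so check the real output first.

```bash
# Hypothetical helper: print the values of the column whose header equals $1.
# Reads a whitespace-separated table from stdin; prints nothing if the
# header is not found.
table_column() {
  awk -v want="$1" '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == want) col = i; next }
    col     { print $col }
  '
}
```

For example, `exc compute list | table_column STATE` would print one state per VM, provided the installed CLI still labels that column `STATE`.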
## Metrics
`exc compute metrics --vm_id <id> --start <rfc3339> --end <rfc3339> [--family <family>]`
- Only `cpu` is currently supported. Omitting `--family` defaults to CPU. Any other family (`memory`, `network`, `diskio`, ...) returns `Requested metrics family is not supported for this endpoint.` with exit 1. Re-check `--help` and the above claim if the backend later adds families.
- Output is JSON: `{"series":[{"family":"cpu","period_seconds":5,"points":[{"timestamp":"...","average":<n>,"max":<n>,"min":<n>}, ...],"unit":"Percent"}]}`. Parse with `jq` (e.g. `jq '.series[0].points[-1].average'`).
## Error messages to recognise
- `` not authenticated; run `exc login` ``: no valid token in env or `~/.exc/config`.
- `required flag(s) "<name>" not set` — cobra-level enforcement. Read `--help` again.
- `Could not parse your request!! Are you sure you passed the correct flags?` — generic backend 400. Typically means an unknown ID, a value of the wrong type, or a server-side required field that the CLI accepted as empty. Verify every ID against a `list` before retrying.
- `Oops could not find the <Resource> you specified, maybe try checking if the <resource> exists?` — backend 404-ish. Trust the hint.
- `Oops the IP provided is invalid` — syntactic IP validation on `compute localip`.
- `Something went wrong on our end!!` — backend 500. Observed on `compute connect` for a non-existent VM. Verify the VM exists via `compute get`; do not retry blindly.
- `VM delete protection is enabled. Disable delete protection before terminating this instance.` — run `exc compute unprotect --vm-id <id>` first, then retry `terminate`.
- `At least one field must be provided: name or delete_protection.` — you hit `compute rename` / `compute update` with neither flag set. Pass `--name <name>` and/or use `protect` / `unprotect` instead of `rename --delete_protection=...` for protection changes.
- `command timed out` (exit 124) — `exec --timeout` elapsed. Raise the timeout, or launch the work in the background on the VM (`nohup`, systemd unit) and poll with subsequent `exec` calls.
- `invalid --direction "<x>": must be one of older or newer` / `use --offset and --direction together` / `--limit must be greater than 0` — `seriallogs` argument validation.
- `either account_id or service_account_id must be provided` — `policy binding list` needs at least one filter.
- `symlink entries are not supported: <path>` (exit 1) — `scp --recursive` refuses trees containing symlinks; archive or dereference locally first.
- `unknown session` / `Timeout connecting to the instance!` from `exec` WS fallback — the server-side console rejected the session. SSH is the only reliable path right now; tell the user to ensure the VM has a reachable SSH address and permissive SG rather than relying on WS fallback.
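The `exec` exit codes documented in this file (124 for a timeout, 255 when every transport fails, anything else propagated from the remote command) can be turned into a first-pass hint. `explain_exec_exit` is a hypothetical helper for illustration. A remote command can itself exit 124 or 255, so treat the result as a hint and confirm against stderr.

```bash
# Hypothetical helper: map an `exc compute exec` exit status to a next step.
explain_exec_exit() {
  case "$1" in
    0)   echo "ok" ;;
    124) echo "timed out: raise --timeout, or run the work in the background and poll" ;;
    255) echo "transport failure: check public IPv4, security group rules, and sshd" ;;
    *)   echo "remote command exited with status $1" ;;
  esac
}
```

A caller would capture the status right after the command, for example `exc compute exec --vm-id <id> --command "<cmd>"; explain_exec_exit $?`.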