Files
2026-05-19 14:53:39 +02:00

1.9 KiB

name, description
name description
sysadmin Linux system administration, networking diagnostics, and production hardening workflows. Use when handling SSH/connectivity incidents, DNS/routing/firewall issues, host health checks, systemd/service failures, disk or memory pressure, log triage, baseline security checks, or when the user asks for repeatable Linux ops runbooks.

Sysadmin

Overview

Execute Linux and network operations with a diagnose-first approach. Prefer minimal-risk commands, capture evidence before changes, and verify outcome after every fix.

Workflow

  1. Confirm scope and blast radius.
  2. Capture current state with scripts/sysdiag.sh when possible.
  3. Isolate layer: host, service, network path, DNS, or policy.
  4. Apply the smallest reversible fix.
  5. Re-check service health and user-facing behavior.
  6. Summarize root cause, change made, and follow-up hardening actions.

Triage Decision Map

  • Connection refused or timeout: Check ss -tulpn, service status, local firewall (nft list ruleset or iptables -S), and routing (ip route).
  • Name resolves incorrectly: Check /etc/resolv.conf, resolvectl status, dig, and local cache behavior.
  • Service flapping: Check systemctl status, journalctl -u <service> --since "-30m", restart policy, and resource pressure.
  • Packet loss or latency spikes: Check ping, mtr (if present), interface errors via ip -s link, and host saturation.
  • Host unhealthy: Check CPU, memory, disk inode usage, and top failing units from systemctl --failed.

Command Guardrails

  • Prefer read-only diagnostics first.
  • Ask before destructive actions (mass deletes, firewall flush, forced reboot).
  • For privileged reads, run with sudo only when required.
  • Before config edits, back up file: cp <file> <file>.bak.<timestamp>.
  • After change, validate with targeted checks and logs.

Resources

  • Incident runbook and command matrix: references/runbook.md
  • Snapshot collector: scripts/sysdiag.sh