1.9 KiB
1.9 KiB
name, description
| name | description |
|---|---|
| sysadmin | Linux system administration, networking diagnostics, and production hardening workflows. Use when handling SSH/connectivity incidents, DNS/routing/firewall issues, host health checks, systemd/service failures, disk or memory pressure, log triage, baseline security checks, or when the user asks for repeatable Linux ops runbooks. |
Sysadmin
Overview
Execute Linux and network operations with a diagnose-first approach. Prefer minimal-risk commands, capture evidence before changes, and verify outcome after every fix.
Workflow
- Confirm scope and blast radius.
- Capture current state with
scripts/sysdiag.shwhen possible. - Isolate layer: host, service, network path, DNS, or policy.
- Apply the smallest reversible fix.
- Re-check service health and user-facing behavior.
- Summarize root cause, change made, and follow-up hardening actions.
Triage Decision Map
- Connection refused or timeout:
Check
ss -tulpn, service status, local firewall (nft list rulesetoriptables -S), and routing (ip route). - Name resolves incorrectly:
Check
/etc/resolv.conf,resolvectl status,dig, and local cache behavior. - Service flapping:
Check
systemctl status,journalctl -u <service> --since "-30m", restart policy, and resource pressure. - Packet loss or latency spikes:
Check
ping,mtr(if present), interface errors viaip -s link, and host saturation. - Host unhealthy:
Check CPU, memory, disk inode usage, and top failing units from
systemctl --failed.
Command Guardrails
- Prefer read-only diagnostics first.
- Ask before destructive actions (mass deletes, firewall flush, forced reboot).
- For privileged reads, run with
sudoonly when required. - Before config edits, back up file:
cp <file> <file>.bak.<timestamp>. - After change, validate with targeted checks and logs.
Resources
- Incident runbook and command matrix:
references/runbook.md - Snapshot collector:
scripts/sysdiag.sh