feat: initial commit

This commit is contained in:
nikola
2026-05-19 14:53:39 +02:00
commit 630a3a0985
5 changed files with 171 additions and 0 deletions
+46
View File
@@ -0,0 +1,46 @@
---
name: sysadmin
description: Linux system administration, networking diagnostics, and production hardening workflows. Use when handling SSH/connectivity incidents, DNS/routing/firewall issues, host health checks, systemd/service failures, disk or memory pressure, log triage, baseline security checks, or when the user asks for repeatable Linux ops runbooks.
---
# Sysadmin
## Overview
Execute Linux and network operations with a diagnose-first approach.
Prefer minimal-risk commands, capture evidence before changes, and verify outcome after every fix.
## Workflow
1. Confirm scope and blast radius.
2. Capture current state with `scripts/sysdiag.sh` when possible.
3. Isolate layer: host, service, network path, DNS, or policy.
4. Apply the smallest reversible fix.
5. Re-check service health and user-facing behavior.
6. Summarize root cause, change made, and follow-up hardening actions.
## Triage Decision Map
- Connection refused or timeout:
Check `ss -tulpn`, service status, local firewall (`nft list ruleset` or `iptables -S`), and routing (`ip route`).
- Name resolves incorrectly:
Check `/etc/resolv.conf`, `resolvectl status`, `dig`, and local cache behavior.
- Service flapping:
Check `systemctl status`, `journalctl -u <service> --since "-30m"`, restart policy, and resource pressure.
- Packet loss or latency spikes:
Check `ping`, `mtr` (if present), interface errors via `ip -s link`, and host saturation.
- Host unhealthy:
Check CPU, memory, disk inode usage, and top failing units from `systemctl --failed`.
## Command Guardrails
- Prefer read-only diagnostics first.
- Ask before destructive actions (mass deletes, firewall flush, forced reboot).
- For privileged reads, run with `sudo` only when required.
- Before config edits, back up file: `cp <file> <file>.bak.<timestamp>`.
- After change, validate with targeted checks and logs.
## Resources
- Incident runbook and command matrix: `references/runbook.md`
- Snapshot collector: `scripts/sysdiag.sh`
+6
View File
@@ -0,0 +1,6 @@
interface:
display_name: "Sysadmin"
short_description: "Linux ops, network triage, and hardening"
default_prompt: "Use $sysadmin to triage and fix this Linux/network issue with verifiable steps."
policy:
allow_implicit_invocation: true
+43
View File
@@ -0,0 +1,43 @@
# Linux/Network Incident Runbook
## 1) SSH nedostupan
- Provera puta: `ping <host>` i `traceroute <host>` (ako postoji)
- Provera porta: `nc -vz <host> 22` ili `telnet <host> 22`
- Na hostu: `ss -tulpn | rg ':22'`
- Servis: `systemctl status sshd` ili `systemctl status ssh`
- Firewall: `nft list ruleset | rg '22|ssh'`
## 2) DNS problemi
- Rezolucija: `dig +short <fqdn>`
- Autoritativna provera: `dig <fqdn> @<dns-server>`
- Lokalni resolver: `resolvectl status`
- Konfiguracija: `cat /etc/resolv.conf`
## 3) Aplikacija ne odgovara
- Proces i socket: `ss -tulpn | rg '<port>|<proc>'`
- Unit health: `systemctl status <service>`
- Logovi: `journalctl -u <service> --since '-30m' --no-pager`
- Resursi: `free -h`, `df -hT`, `top`
## 4) Latencija/packet loss
- RTT i gubitak: `ping -c 20 <target>`
- Hop analiza: `mtr -rwzbc 100 <target>` (ako postoji)
- NIC greške: `ip -s link`
- TCP state: `ss -s`
## 5) Hardening minimum
- Otvoreni portovi: `ss -tulpn`
- Neuspešni servisi: `systemctl --failed`
- Kritični CVE pipeline: proveri SBOM/dependency skener u CI
- Audit konfiguracije: baseline CIS/OS hardening check-list
## Promene bezbedno
- Uvek snimi stanje pre izmene (`scripts/sysdiag.sh`).
- Menjaj jednu stvar po iteraciji.
- Posle izmene uradi health-check i rollback plan.
+70
View File
@@ -0,0 +1,70 @@
#!/usr/bin/env bash
set -euo pipefail
# Collect a concise host and network snapshot for incident triage.
now="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
host="$(hostname 2>/dev/null || echo unknown)"
echo "=== sysdiag snapshot ==="
echo "timestamp_utc: $now"
echo "host: $host"
echo
echo "--- os ---"
uname -a || true
if [ -f /etc/os-release ]; then
sed -n '1,12p' /etc/os-release || true
fi
echo
echo "--- uptime/load ---"
uptime || true
echo
echo "--- cpu/memory ---"
free -h || true
echo
echo "--- disk ---"
df -hT || true
df -ih || true
echo
echo "--- interfaces ---"
ip -br addr 2>/dev/null || true
ip -s link 2>/dev/null || true
echo
echo "--- routing ---"
ip route show 2>/dev/null || true
ip rule show 2>/dev/null || true
echo
echo "--- listening sockets ---"
ss -tulpn 2>/dev/null || true
echo
echo "--- dns ---"
if command -v resolvectl >/dev/null 2>&1; then
resolvectl status 2>/dev/null || true
fi
cat /etc/resolv.conf 2>/dev/null || true
echo
echo "--- firewall ---"
if command -v nft >/dev/null 2>&1; then
nft list ruleset 2>/dev/null || true
elif command -v iptables >/dev/null 2>&1; then
iptables -S 2>/dev/null || true
else
echo "No nftables/iptables binary found"
fi
echo
echo "--- services ---"
systemctl --failed --no-pager 2>/dev/null || true
echo
echo "--- recent critical logs ---"
journalctl -p 3 -xb --no-pager -n 120 2>/dev/null || true