docs(profile): refresh Gitea infrastructure profile

2026-05-27 10:27:14 -04:00
parent 95646eb281
commit 6fcdaa9f84
2 changed files with 152 additions and 166 deletions
+76 -83
@@ -1,126 +1,119 @@
# 🛡️ Patrick Beane
# Patrick Beane
**SRE | Security Engineer | Self-Hosted Infra & Detection**
**Infrastructure & Security Engineer | SRE | Cloud-Native Platforms**
I design and operate **security-first, self-hosted infrastructure** focused on detection, resilience, and sovereignty.
My lab functions as a live production environment where threat intelligence, automation, and reliability engineering intersect.
I design and operate a self-directed production infrastructure platform focused on security automation, reliability engineering, observability, vulnerability management, and recoverability.
The environment spans Kubernetes, Linux, multi-cloud infrastructure, identity controls, threat detection, backup verification, and public operational dashboards. My goal is to build systems that are secure by default, observable in production, and recoverable under failure.
---
## 🛰 The Fleet (10 Nodes)
## Production Infrastructure Overview
> This environment blends production, research, and continuous experimentation.
> Availability and controls are intentionally tuned per node role.
This environment blends production operations, security research, and continuous infrastructure improvement. Services are distributed across cloud and self-hosted nodes, with each node scoped to a specific operational role to reduce blast radius and simplify ownership.
| Node | Role | Specs | Status |
| :--- | :--- | :--- | :--- |
| **Argus** | SIEM / Brain / node-health Failover | Xeon E5-2660v2 (1 core) | 🟢 Online |
| **Triton** | High Performance Compute | EPYC 9634 (8 cores) | 🟢 Online |
| **Ares** | Gitea / Kubernetes Management Node (MicroK8s) | Ryzen 9 9950X (8 cores) | 🟢 Online |
| **Zephyrus** | Container Host | Ryzen 9 7950X (4 cores) | 🟢 Online |
| **Iris** | NGINX / PHP Edge | Vultr | 🟢 Online |
| **Vault** | Secrets Management | GCP (Vaultwarden) | 🟢 Online |
| **Apollo** | Intel Dashboard (Flask) | AWS | 🟢 Online |
| **Hermes** | Public API (Frontend) | Oracle Cloud | 🟢 Online |
| **Hades** | Public API (Backend) | Oracle Cloud | 🟢 Online |
| **Zeus** | Monitoring / Metrics NOC | Xeon Gold 6150 (1 core) | 🟢 Online |
| Node | Primary Role | Function |
| :--- | :--- | :--- |
| **Argus** | Security telemetry and failover automation | Threat detection, event correlation, and node-health automation |
| **Triton** | Observability and internal tooling | Prometheus, Grafana, Authelia, code-server, CrowdSec bouncers |
| **Ares** | Kubernetes and source control | Gitea, PostgreSQL, Valkey, CI runners, Kubernetes control plane |
| **Zephyrus** | Container hosting | Docker workloads and service hosting |
| **Iris** | Edge services | NGINX/PHP ingress and public-facing services |
| **Vault** | Secrets and identity-adjacent services | Vaultwarden and protected internal services |
| **Apollo** | Threat intelligence dashboard | Flask-based analytics and reporting |
| **Hermes** | Public API frontend | Public API and frontend service layer |
| **Hades** | Public API backend | Backend service support for public APIs |
| **Zeus** | Monitoring and metrics | Centralized observability and service-health tracking |
---
## 🌐 Infrastructure Strategy
## Infrastructure Strategy
- **Compute Layer:** Zen 5 (9950X), Zen 4 (7950X), EPYC 9634 for sustained workloads.
- **Edge Layer:** Oracle Cloud & Vultr for low-latency public ingress.
- **Sentinel Layer:** **Argus SIEM** correlating telemetry and enforcing distributed decisions across nodes.
- **Observability:** Zeus as the centralized NOC and metrics authority.
- **Compute layer:** Heterogeneous self-hosted and cloud infrastructure scoped by workload type
- **Edge layer:** Cloud and VPS ingress for public services and low-latency routing
- **Security telemetry:** Multi-node detection and mitigation workflows using CrowdSec and custom automation
- **Observability:** Centralized monitoring with Prometheus, Grafana, Netdata, exporters, and public dashboards
- **Resilience:** Automated health checks, DNS failover, backup verification, and role-scoped service design
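A minimal sketch of how a health-check-driven failover decision could look, assuming a simple "consecutive failed checks" counter per node. The `NodeHealth` type, the `pick_active` function, and the threshold of 3 are hypothetical illustrations, not the logic actually running in this environment, and the DNS update itself is left out.

```python
from dataclasses import dataclass


@dataclass
class NodeHealth:
    name: str
    consecutive_failures: int  # failed health checks in a row (hypothetical metric)


def pick_active(primary: NodeHealth, peers: list[NodeHealth], threshold: int = 3) -> str:
    """Return the name of the node that DNS should point at."""
    # Stay on the primary until it has failed `threshold` checks in a row.
    if primary.consecutive_failures < threshold:
        return primary.name
    # Otherwise fail over to the first peer with no recent failures.
    for peer in peers:
        if peer.consecutive_failures == 0:
            return peer.name
    # No healthy peer: keep the primary rather than flap to another failing node.
    return primary.name
```

Requiring several consecutive failures before switching is one common way to avoid flapping on a single missed probe; the right threshold depends on the check interval.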
---
## 🛡️ Detection & Response Lifecycle
## Security Detection and Response
- **Triage:** Telemetry ingested from 7 active nodes into the Argus engine.
- **Escalation:** Post-exploitation indicators (e.g. webshells) trigger immediate `PERM_BAN`.
- **Retention:**
- 24 hours for lower confidence scenarios
- 14 days for high-confidence IOCs
- 7 days for offender watchlist
- **Notification:** High-severity events dynamically pushed to Discord.
Security controls are integrated directly into the platform rather than handled as one-off manual checks.
Current detection and response patterns include:
- Telemetry ingestion from 7 active nodes
- CrowdSec-based detection and mitigation
- MITRE ATT&CK mapping for selected security events
- Escalation logic for high-confidence indicators
- Watchlist and retention policies based on event confidence
- High-severity event notifications through Discord
- Runtime visibility through public and private dashboards
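The confidence-based retention described in this section can be sketched roughly as follows, using the windows stated here (24 hours for lower-confidence scenarios, 14 days for high-confidence IOCs, 7 days for the offender watchlist) and treating `PERM_BAN` as never expiring. The category keys and the `expires_at` helper are hypothetical names chosen for illustration; the real implementation may differ.

```python
from datetime import datetime, timedelta
from typing import Optional

# Retention windows taken from the policy stated in this section.
# The dictionary keys are hypothetical labels.
RETENTION = {
    "low_confidence": timedelta(hours=24),
    "high_confidence_ioc": timedelta(days=14),
    "watchlist": timedelta(days=7),
}


def expires_at(kind: str, seen: datetime) -> Optional[datetime]:
    """Return when an entry should expire, or None for a permanent ban."""
    if kind == "post_exploitation":
        # Post-exploitation indicators (e.g. webshells) map to PERM_BAN.
        return None
    return seen + RETENTION[kind]
```

Keeping the windows in one table makes the policy easy to audit and change without touching the escalation logic.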
---
## 🛠 The Arsenal
## Technical Stack
**Languages:** Python (Flask, Gunicorn), Bash, JavaScript (React, Node.js)
**Infrastructure:** Kubernetes (K8s), Docker, Caddy, NGINX
**Security:** Argus (Custom SIEM), CrowdSec, Trivy, SQLite, Vaultwarden
**Observability:** Prometheus, Blackbox Exporter, Node Exporter
**Backups:** Borgmatic, Rsync.net (Encrypted Offsite)
**Languages:** Python, Bash, JavaScript (React, Node.js)
**Infrastructure:** Kubernetes, Docker, Caddy, NGINX, Linux
**Security:** CrowdSec, Trivy, Authelia, OIDC, MFA, Fail2Ban, Vaultwarden
**Cloud and Networking:** AWS, GCP, Oracle Cloud, Vultr, Cloudflare, DNS automation
**Observability:** Prometheus, Grafana, Netdata, Blackbox Exporter, Node Exporter, cAdvisor
**Backups:** Borgmatic, encrypted offsite backups, restore verification
**CI/CD and Source Control:** Git, GitHub Actions, Gitea, container image scanning
**Infrastructure as Code:** Terraform
---
### 🧠 Supporting Tooling & Concepts
## Operational Metrics
Actively used across this environment or in adjacent projects:
- **Security & Identity:** Fail2Ban, MITRE ATT&CK mapping, OIDC, Authelia, MFA, TLS hardening
- **Infrastructure & Cloud:** Linux (Debian/Ubuntu), Terraform, AWS, GCP, Oracle Cloud, Vultr
- **CI / Ops:** Git, GitHub Actions, container image scanning
- **Observability (Extended):** Grafana, Netdata
Current platform highlights:
- 10-node distributed infrastructure environment
- 7-node security telemetry and detection footprint
- `REPLACE_ME_LOC` lines of custom code across infrastructure, security, and automation projects
- `REPLACE_ME_COMMITS` commits since January 1 across active repositories
- Automated failover between AWS and peer infrastructure
- Public dashboards for uptime, vulnerabilities, backups, threat telemetry, and service health
- Multiple daily encrypted Borgmatic snapshots shipped offsite
- Recurring backup verification and restore-oriented operational workflows
---
## ⚡ Efficiency Metrics
- **Codebase Growth:** `REPLACE_ME_LOC` lines of custom code across all our repositories
- **Commit Velocity:** `REPLACE_ME_COMMITS` commits since Jan 1
- **Ares:** Ryzen 9 9950X sustaining ~0.06 load avg while running Gitea and a Kubernetes control plane
- **Resilience:** Automated failover between AWS and peer nodes
## 📈 Activity
## Activity
![Commit heatmap](public/heatmap.svg)
---
### 🧩 Deployment Patterns
- **Reverse Proxy:** Caddy/NGINX (Cloudflare where applicable)
- **Observability:** Prometheus + Node Exporter + cAdvisor
- **Lifecycle:** Watchtower for controlled auto-updates
- **Access Control:** Authelia where exposed
- **Management:** Portainer (loopback-bound where possible)
## Deployment Patterns
> Nodes are intentionally heterogeneous.
> Each host is scoped to its role to reduce blast radius and cognitive load.
- **Reverse proxy:** Caddy and NGINX, with Cloudflare where applicable
- **Observability:** Prometheus, Grafana, Node Exporter, Blackbox Exporter, cAdvisor, Netdata
- **Access control:** Authelia, OIDC, MFA, TLS hardening, and protected reverse-proxy routes
- **Lifecycle management:** Controlled container updates and service monitoring
- **Service isolation:** Nodes scoped by role to reduce blast radius and simplify recovery
- **Backup strategy:** Encrypted offsite backups with recurring verification
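As one illustration of what recurring backup verification might check, the sketch below tests whether the newest snapshot is recent enough. It assumes snapshot timestamps have already been collected elsewhere; it does not call Borgmatic, and the `newest_is_fresh` name and 12-hour default are hypothetical. A full restore test would go further than this freshness check.

```python
from datetime import datetime, timedelta


def newest_is_fresh(
    snapshots: list[datetime],
    now: datetime,
    max_age: timedelta = timedelta(hours=12),  # hypothetical threshold
) -> bool:
    """Return True if the most recent snapshot is no older than max_age."""
    if not snapshots:
        # No snapshots at all is treated as a verification failure.
        return False
    return now - max(snapshots) <= max_age
```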
---
#### 📍 Triton
Primary high-density services node running:
- Prometheus + Grafana
- Code-server
- Authelia
- Trilium
- CrowdSec bouncers
## Selected Public Projects
Optimized for sustained workloads and observability aggregation.
---
### 🔗 Live Projects
- **Threat Decisions & Telemetry:** `threats.beane.me`
- **Threat Intelligence & Analytics:** `intel.beane.me`
- **Vulnerability Scanning (Trivy):** `vuln.beane.me`
- **Backups & Restore Verification:** `backups.beane.me`
- **Portfolio:** `beane.me`
- **Threat Decisions and Telemetry:** `threats.beane.me`
- **Threat Intelligence and Analytics:** `intel.beane.me`
- **Vulnerability Scanning and Trends:** `vuln.beane.me`
- **Backup and Restore Verification:** `backups.beane.me`
- **Threat Decision Observability:** `observe.beane.me`
- **Source Control (Gitea + K8s):** `git.beane.me`
- **Health and Failover Dashboard:** `health.beane.me`
- **Source Control:** `git.beane.me`
- **Terraform Threat Modeling:** `tfstride.beane.me`
---
## 🚜 Resource Management
## Engineering Philosophy
- **Compute Density:** Kubernetes control plane with Postgres and CI workloads on Zen 5 hardware
- **Sovereignty:** All code, telemetry, and backups remain self-hosted
- **Backups:** Multiple daily encrypted Borgmatic snapshots shipped offsite
Production systems should be observable, automated, recoverable, and secure from the start.
> *"Production systems should be observable, automated, recoverable, and secure from the start."*
I focus on infrastructure that explains itself: clear telemetry, deterministic automation, evidence-backed security findings, documented recovery paths, and controls that improve reliability without slowing delivery.