docs(profile): refresh Gitea infrastructure profile

2026-05-27 10:27:14 -04:00
parent 95646eb281
commit 6fcdaa9f84
2 changed files with 152 additions and 166 deletions
+76 -83
@@ -1,126 +1,119 @@
# 🛡️ Patrick Beane # Patrick Beane
**SRE | Security Engineer | Self-Hosted Infra & Detection** **Infrastructure & Security Engineer | SRE | Cloud-Native Platforms**
I design and operate **security-first, self-hosted infrastructure** focused on detection, resilience, and sovereignty. I design and operate a self-directed production infrastructure platform focused on security automation, reliability engineering, observability, vulnerability management, and recoverability.
My lab functions as a live production environment where threat intelligence, automation, and reliability engineering intersect.
The environment spans Kubernetes, Linux, multi-cloud infrastructure, identity controls, threat detection, backup verification, and public operational dashboards. My goal is to build systems that are secure-by-default, observable in production, and recoverable under failure.
--- ---
## 🛰 The Fleet (10 Nodes) ## Production Infrastructure Overview
> This environment blends production, research, and continuous experimentation. This environment blends production operations, security research, and continuous infrastructure improvement. Services are distributed across cloud and self-hosted nodes, with each node scoped to a specific operational role to reduce blast radius and simplify ownership.
> Availability and controls are intentionally tuned per node role.
| Node | Role | Specs | Status | | Node | Primary Role | Function |
| :--- | :--- | :--- | :--- | | :--- | :--- | :--- |
| **Argus** | SIEM / Brain / node-health Failover | Xeon E5-2660v2 (1 core) | 🟢 Online | | **Argus** | Security telemetry and failover automation | Threat detection, event correlation, and node-health automation |
| **Triton** | High Performance Compute | EPYC 9634 (8 cores) | 🟢 Online | | **Triton** | Observability and internal tooling | Prometheus, Grafana, Authelia, code-server, CrowdSec bouncers |
| **Ares** | Gitea / Kubernetes Management Node (MicroK8s) | Ryzen 9 9950X (8 cores) | 🟢 Online | | **Ares** | Kubernetes and source control | Gitea, PostgreSQL, Valkey, CI runners, Kubernetes control plane |
| **Zephyrus** | Container Host | Ryzen 9 7950X (4 cores) | 🟢 Online | | **Zephyrus** | Container hosting | Docker workloads and service hosting |
| **Iris** | NGINX / PHP Edge | Vultr | 🟢 Online | | **Iris** | Edge services | NGINX/PHP ingress and public-facing services |
| **Vault** | Secrets Management | GCP (Vaultwarden) | 🟢 Online | | **Vault** | Secrets and identity-adjacent services | Vaultwarden and protected internal services |
| **Apollo** | Intel Dashboard (Flask) | AWS | 🟢 Online | | **Apollo** | Threat intelligence dashboard | Flask-based analytics and reporting |
| **Hermes** | Public API (Frontend) | Oracle Cloud | 🟢 Online | | **Hermes** | Public API frontend | Public API and frontend service layer |
| **Hades** | Public API (Backend) | Oracle Cloud | 🟢 Online | | **Hades** | Public API backend | Backend service support for public APIs |
| **Zeus** | Monitoring / Metrics NOC | Xeon Gold 6150 (1 core) | 🟢 Online | | **Zeus** | Monitoring and metrics | Centralized observability and service-health tracking |
--- ---
## 🌐 Infrastructure Strategy ## Infrastructure Strategy
- **Compute Layer:** Zen 5 (9950X), Zen 4 (7950X), EPYC 9634 for sustained workloads. - **Compute layer:** Heterogeneous self-hosted and cloud infrastructure scoped by workload type
- **Edge Layer:** Oracle Cloud & Vultr for low-latency public ingress. - **Edge layer:** Cloud and VPS ingress for public services and low-latency routing
- **Sentinel Layer:** **Argus SIEM** correlating telemetry and enforcing distributed decisions across nodes. - **Security telemetry:** Multi-node detection and mitigation workflows using CrowdSec and custom automation
- **Observability:** Zeus as the centralized NOC and metrics authority. - **Observability:** Centralized monitoring with Prometheus, Grafana, Netdata, exporters, and public dashboards
- **Resilience:** Automated health checks, DNS failover, backup verification, and role-scoped service design
--- ---
## 🛡️ Detection & Response Lifecycle ## Security Detection and Response
- **Triage:** Telemetry ingested from 7 active nodes into the Argus engine. Security controls are integrated directly into the platform rather than handled as one-off manual checks.
- **Escalation:** Post-exploitation indicators (e.g. webshells) trigger immediate `PERM_BAN`.
- **Retention:** Current detection and response patterns include:
- 24 hours for lower confidence scenarios
- 14 days for high-confidence IOCs - Telemetry ingestion from 7 active nodes
- 7 days for offender watchlist - CrowdSec-based detection and mitigation
- **Notification:** High-severity events dynamically pushed to Discord. - MITRE ATT&CK mapping for selected security events
- Escalation logic for high-confidence indicators
- Watchlist and retention policies based on event confidence
- High-severity event notifications through Discord
- Runtime visibility through public and private dashboards
--- ---
## 🛠 The Arsenal ## Technical Stack
**Languages:** Python (Flask, Gunicorn), Bash, JavaScript (React, Node.js) **Languages:** Python, Bash, JavaScript, React, Node.js
**Infrastructure:** Kubernetes (K8s), Docker, Caddy, NGINX **Infrastructure:** Kubernetes, Docker, Caddy, NGINX, Linux
**Security:** Argus (Custom SIEM), CrowdSec, Trivy, SQLite, Vaultwarden **Security:** CrowdSec, Trivy, Authelia, OIDC, MFA, Fail2Ban, Vaultwarden
**Observability:** Prometheus, Blackbox Exporter, Node Exporter **Cloud and Networking:** AWS, GCP, Oracle Cloud, Vultr, Cloudflare, DNS automation
**Backups:** Borgmatic, Rsync.net (Encrypted Offsite) **Observability:** Prometheus, Grafana, Netdata, Blackbox Exporter, Node Exporter, cAdvisor
**Backups:** Borgmatic, encrypted offsite backups, restore verification
**CI/CD and Source Control:** Git, GitHub Actions, Gitea, container image scanning
**Infrastructure as Code:** Terraform
--- ---
### 🧠 Supporting Tooling & Concepts ## Operational Metrics
Actively used across this environment or in adjacent projects: Current platform highlights:
- **Security & Identity:** Fail2Ban, MITRE ATT&CK mapping, OIDC, Authelia, MFA, TLS hardening
- **Infrastructure & Cloud:** Linux (Debian/Ubuntu), Terraform, AWS, GCP, Oracle Cloud, Vultr
- **CI / Ops:** Git, GitHub Actions, container image scanning
- **Observability (Extended):** Grafana, Netdata
- 10-node distributed infrastructure environment
- 7-node security telemetry and detection footprint
- `107096` lines of custom code across infrastructure, security, and automation projects
- `1366` commits since January 1 across active repositories
- Automated failover between AWS and peer infrastructure
- Public dashboards for uptime, vulnerabilities, backups, threat telemetry, and service health
- Multiple daily encrypted Borgmatic snapshots shipped offsite
- Recurring backup verification and restore-oriented operational workflows
--- ---
## ⚡ Efficiency Metrics ## Activity
- **Codebase Growth:** `107096` lines of custom code across all our repositories
- **Commit Velocity:** `1366` commits since Jan 1
- **Ares:** Ryzen 9 9950X sustaining ~0.06 load avg while running Gitea and a Kubernetes control plane
- **Resilience:** Automated failover between AWS and peer nodes
## 📈 Activity
![Commit heatmap](public/heatmap.svg) ![Commit heatmap](public/heatmap.svg)
--- ---
### 🧩 Deployment Patterns ## Deployment Patterns
- **Reverse Proxy:** Caddy/NGINX (Cloudflare where applicable)
- **Observability:** Prometheus + Node Exporter + cAdvisor
- **Lifecycle:** Watchtower for controlled auto-updates
- **Access Control:** Authelia where exposed
- **Management:** Portainer (loopback-bound where possible)
> Nodes are intentionally heterogeneous. - **Reverse proxy:** Caddy and NGINX, with Cloudflare where applicable
> Each host is scoped to its role to reduce blast radius and cognitive load. - **Observability:** Prometheus, Grafana, Node Exporter, Blackbox Exporter, cAdvisor, Netdata
- **Access control:** Authelia, OIDC, MFA, TLS hardening, and protected reverse-proxy routes
- **Lifecycle management:** Controlled container updates and service monitoring
- **Service isolation:** Nodes scoped by role to reduce blast radius and simplify recovery
- **Backup strategy:** Encrypted offsite backups with recurring verification
--- ---
#### 📍 Triton ## Selected Public Projects
Primary high-density services node running:
- Prometheus + Grafana
- Code-server
- Authelia
- Trilium
- CrowdSec bouncers
Optimized for sustained workloads and observability aggregation. - **Portfolio:** `beane.me`
- **Threat Decisions and Telemetry:** `threats.beane.me`
--- - **Threat Intelligence and Analytics:** `intel.beane.me`
- **Vulnerability Scanning and Trends:** `vuln.beane.me`
### 🔗 Live Projects - **Backup and Restore Verification:** `backups.beane.me`
- **Threat Decisions & Telemetry:** `threats.beane.me`
- **Threat Intelligence & Analytics:** `intel.beane.me`
- **Vulnerability Scanning (Trivy):** `vuln.beane.me`
- **Backups & Restore Verification:** `backups.beane.me`
- **Threat Decision Observability:** `observe.beane.me` - **Threat Decision Observability:** `observe.beane.me`
- **Source Control (Gitea + K8s):** `git.beane.me` - **Health and Failover Dashboard:** `health.beane.me`
- **Source Control:** `git.beane.me`
- **Terraform Threat Modeling:** `tfstride.beane.me`
--- ---
## 🚜 Resource Management ## Engineering Philosophy
- **Compute Density:** Kubernetes control plane with Postgres and CI workloads on Zen 5 hardware Production systems should be observable, automated, recoverable, and secure from the start.
- **Sovereignty:** All code, telemetry, and backups remain self-hosted
- **Backups:** Multiple daily encrypted Borgmatic snapshots shipped offsite
> *"Production systems should be observable, automated, recoverable, and secure from the start."* I focus on infrastructure that explains itself: clear telemetry, deterministic automation, evidence-backed security findings, documented recovery paths, and controls that improve reliability without slowing delivery.
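Review note: the removed "Detection & Response Lifecycle" bullets describe a tiered retention policy (24 hours for lower-confidence scenarios, 7 days for the offender watchlist, 14 days for high-confidence IOCs, and an immediate `PERM_BAN` for post-exploitation indicators such as webshells). A minimal sketch of that mapping follows. The tier keys, the `POST_EXPLOITATION` set, and the `ban_duration` function are hypothetical names chosen for illustration. They are not taken from the Argus codebase, which this diff does not show.

```python
from datetime import timedelta
from typing import Optional

# Durations come from the removed README text; the tier keys are hypothetical.
RETENTION = {
    "low_confidence": timedelta(hours=24),
    "watchlist": timedelta(days=7),
    "high_confidence_ioc": timedelta(days=14),
}

# Hypothetical labels for post-exploitation indicators (the README names webshells).
POST_EXPLOITATION = {"webshell"}


def ban_duration(tier: str, indicator: Optional[str] = None) -> Optional[timedelta]:
    """Return how long a decision is retained, or None for a permanent ban."""
    if indicator in POST_EXPLOITATION:
        # Post-exploitation evidence overrides the tier: PERM_BAN.
        return None
    return RETENTION[tier]
```

Returning `None` for the permanent case keeps the "no expiry" state distinct from any finite duration, so a caller has to handle it explicitly.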
+76 -83
@@ -1,126 +1,119 @@
# 🛡️ Patrick Beane # Patrick Beane
**SRE | Security Engineer | Self-Hosted Infra & Detection** **Infrastructure & Security Engineer | SRE | Cloud-Native Platforms**
I design and operate **security-first, self-hosted infrastructure** focused on detection, resilience, and sovereignty. I design and operate a self-directed production infrastructure platform focused on security automation, reliability engineering, observability, vulnerability management, and recoverability.
My lab functions as a live production environment where threat intelligence, automation, and reliability engineering intersect.
The environment spans Kubernetes, Linux, multi-cloud infrastructure, identity controls, threat detection, backup verification, and public operational dashboards. My goal is to build systems that are secure-by-default, observable in production, and recoverable under failure.
--- ---
## 🛰 The Fleet (10 Nodes) ## Production Infrastructure Overview
> This environment blends production, research, and continuous experimentation. This environment blends production operations, security research, and continuous infrastructure improvement. Services are distributed across cloud and self-hosted nodes, with each node scoped to a specific operational role to reduce blast radius and simplify ownership.
> Availability and controls are intentionally tuned per node role.
| Node | Role | Specs | Status | | Node | Primary Role | Function |
| :--- | :--- | :--- | :--- | | :--- | :--- | :--- |
| **Argus** | SIEM / Brain / node-health Failover | Xeon E5-2660v2 (1 core) | 🟢 Online | | **Argus** | Security telemetry and failover automation | Threat detection, event correlation, and node-health automation |
| **Triton** | High Performance Compute | EPYC 9634 (8 cores) | 🟢 Online | | **Triton** | Observability and internal tooling | Prometheus, Grafana, Authelia, code-server, CrowdSec bouncers |
| **Ares** | Gitea / Kubernetes Management Node (MicroK8s) | Ryzen 9 9950X (8 cores) | 🟢 Online | | **Ares** | Kubernetes and source control | Gitea, PostgreSQL, Valkey, CI runners, Kubernetes control plane |
| **Zephyrus** | Container Host | Ryzen 9 7950X (4 cores) | 🟢 Online | | **Zephyrus** | Container hosting | Docker workloads and service hosting |
| **Iris** | NGINX / PHP Edge | Vultr | 🟢 Online | | **Iris** | Edge services | NGINX/PHP ingress and public-facing services |
| **Vault** | Secrets Management | GCP (Vaultwarden) | 🟢 Online | | **Vault** | Secrets and identity-adjacent services | Vaultwarden and protected internal services |
| **Apollo** | Intel Dashboard (Flask) | AWS | 🟢 Online | | **Apollo** | Threat intelligence dashboard | Flask-based analytics and reporting |
| **Hermes** | Public API (Frontend) | Oracle Cloud | 🟢 Online | | **Hermes** | Public API frontend | Public API and frontend service layer |
| **Hades** | Public API (Backend) | Oracle Cloud | 🟢 Online | | **Hades** | Public API backend | Backend service support for public APIs |
| **Zeus** | Monitoring / Metrics NOC | Xeon Gold 6150 (1 core) | 🟢 Online | | **Zeus** | Monitoring and metrics | Centralized observability and service-health tracking |
--- ---
## 🌐 Infrastructure Strategy ## Infrastructure Strategy
- **Compute Layer:** Zen 5 (9950X), Zen 4 (7950X), EPYC 9634 for sustained workloads. - **Compute layer:** Heterogeneous self-hosted and cloud infrastructure scoped by workload type
- **Edge Layer:** Oracle Cloud & Vultr for low-latency public ingress. - **Edge layer:** Cloud and VPS ingress for public services and low-latency routing
- **Sentinel Layer:** **Argus SIEM** correlating telemetry and enforcing distributed decisions across nodes. - **Security telemetry:** Multi-node detection and mitigation workflows using CrowdSec and custom automation
- **Observability:** Zeus as the centralized NOC and metrics authority. - **Observability:** Centralized monitoring with Prometheus, Grafana, Netdata, exporters, and public dashboards
- **Resilience:** Automated health checks, DNS failover, backup verification, and role-scoped service design
--- ---
## 🛡️ Detection & Response Lifecycle ## Security Detection and Response
- **Triage:** Telemetry ingested from 7 active nodes into the Argus engine. Security controls are integrated directly into the platform rather than handled as one-off manual checks.
- **Escalation:** Post-exploitation indicators (e.g. webshells) trigger immediate `PERM_BAN`.
- **Retention:** Current detection and response patterns include:
- 24 hours for lower confidence scenarios
- 14 days for high-confidence IOCs - Telemetry ingestion from 7 active nodes
- 7 days for offender watchlist - CrowdSec-based detection and mitigation
- **Notification:** High-severity events dynamically pushed to Discord. - MITRE ATT&CK mapping for selected security events
- Escalation logic for high-confidence indicators
- Watchlist and retention policies based on event confidence
- High-severity event notifications through Discord
- Runtime visibility through public and private dashboards
--- ---
## 🛠 The Arsenal ## Technical Stack
**Languages:** Python (Flask, Gunicorn), Bash, JavaScript (React, Node.js) **Languages:** Python, Bash, JavaScript, React, Node.js
**Infrastructure:** Kubernetes (K8s), Docker, Caddy, NGINX **Infrastructure:** Kubernetes, Docker, Caddy, NGINX, Linux
**Security:** Argus (Custom SIEM), CrowdSec, Trivy, SQLite, Vaultwarden **Security:** CrowdSec, Trivy, Authelia, OIDC, MFA, Fail2Ban, Vaultwarden
**Observability:** Prometheus, Blackbox Exporter, Node Exporter **Cloud and Networking:** AWS, GCP, Oracle Cloud, Vultr, Cloudflare, DNS automation
**Backups:** Borgmatic, Rsync.net (Encrypted Offsite) **Observability:** Prometheus, Grafana, Netdata, Blackbox Exporter, Node Exporter, cAdvisor
**Backups:** Borgmatic, encrypted offsite backups, restore verification
**CI/CD and Source Control:** Git, GitHub Actions, Gitea, container image scanning
**Infrastructure as Code:** Terraform
--- ---
### 🧠 Supporting Tooling & Concepts ## Operational Metrics
Actively used across this environment or in adjacent projects: Current platform highlights:
- **Security & Identity:** Fail2Ban, MITRE ATT&CK mapping, OIDC, Authelia, MFA, TLS hardening
- **Infrastructure & Cloud:** Linux (Debian/Ubuntu), Terraform, AWS, GCP, Oracle Cloud, Vultr
- **CI / Ops:** Git, GitHub Actions, container image scanning
- **Observability (Extended):** Grafana, Netdata
- 10-node distributed infrastructure environment
- 7-node security telemetry and detection footprint
- `REPLACE_ME_LOC` lines of custom code across infrastructure, security, and automation projects
- `REPLACE_ME_COMMITS` commits since January 1 across active repositories
- Automated failover between AWS and peer infrastructure
- Public dashboards for uptime, vulnerabilities, backups, threat telemetry, and service health
- Multiple daily encrypted Borgmatic snapshots shipped offsite
- Recurring backup verification and restore-oriented operational workflows
--- ---
## ⚡ Efficiency Metrics ## Activity
- **Codebase Growth:** `REPLACE_ME_LOC` lines of custom code across all our repositories
- **Commit Velocity:** `REPLACE_ME_COMMITS` commits since Jan 1
- **Ares:** Ryzen 9 9950X sustaining ~0.06 load avg while running Gitea and a Kubernetes control plane
- **Resilience:** Automated failover between AWS and peer nodes
## 📈 Activity
![Commit heatmap](public/heatmap.svg) ![Commit heatmap](public/heatmap.svg)
--- ---
### 🧩 Deployment Patterns ## Deployment Patterns
- **Reverse Proxy:** Caddy/NGINX (Cloudflare where applicable)
- **Observability:** Prometheus + Node Exporter + cAdvisor
- **Lifecycle:** Watchtower for controlled auto-updates
- **Access Control:** Authelia where exposed
- **Management:** Portainer (loopback-bound where possible)
> Nodes are intentionally heterogeneous. - **Reverse proxy:** Caddy and NGINX, with Cloudflare where applicable
> Each host is scoped to its role to reduce blast radius and cognitive load. - **Observability:** Prometheus, Grafana, Node Exporter, Blackbox Exporter, cAdvisor, Netdata
- **Access control:** Authelia, OIDC, MFA, TLS hardening, and protected reverse-proxy routes
- **Lifecycle management:** Controlled container updates and service monitoring
- **Service isolation:** Nodes scoped by role to reduce blast radius and simplify recovery
- **Backup strategy:** Encrypted offsite backups with recurring verification
--- ---
#### 📍 Triton ## Selected Public Projects
Primary high-density services node running:
- Prometheus + Grafana
- Code-server
- Authelia
- Trilium
- CrowdSec bouncers
Optimized for sustained workloads and observability aggregation. - **Portfolio:** `beane.me`
- **Threat Decisions and Telemetry:** `threats.beane.me`
--- - **Threat Intelligence and Analytics:** `intel.beane.me`
- **Vulnerability Scanning and Trends:** `vuln.beane.me`
### 🔗 Live Projects - **Backup and Restore Verification:** `backups.beane.me`
- **Threat Decisions & Telemetry:** `threats.beane.me`
- **Threat Intelligence & Analytics:** `intel.beane.me`
- **Vulnerability Scanning (Trivy):** `vuln.beane.me`
- **Backups & Restore Verification:** `backups.beane.me`
- **Threat Decision Observability:** `observe.beane.me` - **Threat Decision Observability:** `observe.beane.me`
- **Source Control (Gitea + K8s):** `git.beane.me` - **Health and Failover Dashboard:** `health.beane.me`
- **Source Control:** `git.beane.me`
- **Terraform Threat Modeling:** `tfstride.beane.me`
--- ---
## 🚜 Resource Management ## Engineering Philosophy
- **Compute Density:** Kubernetes control plane with Postgres and CI workloads on Zen 5 hardware Production systems should be observable, automated, recoverable, and secure from the start.
- **Sovereignty:** All code, telemetry, and backups remain self-hosted
- **Backups:** Multiple daily encrypted Borgmatic snapshots shipped offsite
> *"Production systems should be observable, automated, recoverable, and secure from the start."* I focus on infrastructure that explains itself: clear telemetry, deterministic automation, evidence-backed security findings, documented recovery paths, and controls that improve reliability without slowing delivery.
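Review note: the two changed files differ only in the metrics lines, where the first carries `107096` and `1366` and the second carries `REPLACE_ME_LOC` and `REPLACE_ME_COMMITS`. That suggests the second file is a template from which the first is rendered, although the commit does not show the rendering step. A minimal sketch of such a substitution, assuming plain-text placeholders with a `REPLACE_ME_` prefix, is shown below. The `render` function and its strict handling of missing keys are assumptions, not the repository's actual tooling.

```python
import re
from typing import Mapping

# Matches placeholders such as REPLACE_ME_LOC or REPLACE_ME_COMMITS.
PLACEHOLDER = re.compile(r"REPLACE_ME_[A-Z_]+")


def render(template: str, values: Mapping[str, str]) -> str:
    """Replace every REPLACE_ME_* token, failing loudly if a value is missing."""

    def substitute(match: "re.Match[str]") -> str:
        key = match.group(0)
        if key not in values:
            raise KeyError(f"no value supplied for {key}")
        return values[key]

    return PLACEHOLDER.sub(substitute, template)


line = "- `REPLACE_ME_LOC` lines of custom code, `REPLACE_ME_COMMITS` commits since January 1"
print(render(line, {"REPLACE_ME_LOC": "107096", "REPLACE_ME_COMMITS": "1366"}))
```

Raising on an unknown placeholder, rather than leaving it in place, prevents a literal `REPLACE_ME_*` token from being published in the rendered profile.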