# Patrick Beane

**Infrastructure & Security Engineer | SRE | Cloud-Native Platforms**

I design and operate a self-directed production infrastructure platform focused on security automation, reliability engineering, observability, vulnerability management, and recoverability. The environment spans Kubernetes, Linux, multi-cloud infrastructure, identity controls, threat detection, backup verification, and public operational dashboards.

My goal is to build systems that are secure-by-default, observable in production, and recoverable under failure.

---

## Production Infrastructure Overview

This environment blends production operations, security research, and continuous infrastructure improvement. Services are distributed across cloud and self-hosted nodes, with each node scoped to a specific operational role to reduce blast radius and simplify ownership.

| Node | Primary Role | Function |
| :--- | :--- | :--- |
| **Argus** | Security telemetry and failover automation | Threat detection, event correlation, and node-health automation |
| **Triton** | Observability and internal tooling | Prometheus, Grafana, Authelia, code-server, CrowdSec bouncers |
| **Ares** | Kubernetes and source control | Gitea, PostgreSQL, Valkey, CI runners, Kubernetes control plane |
| **Zephyrus** | Container hosting | Docker workloads and service hosting |
| **Iris** | Edge services | NGINX/PHP ingress and public-facing services |
| **Vault** | Secrets and identity-adjacent services | Vaultwarden and protected internal services |
| **Apollo** | Threat intelligence dashboard | Flask-based analytics and reporting |
| **Hermes** | Public API frontend | Public API and frontend service layer |
| **Hades** | Public API backend | Backend service support for public APIs |
| **Zeus** | Monitoring and metrics | Centralized observability and service-health tracking |

---

## Infrastructure Strategy

- **Compute layer:** Heterogeneous self-hosted and cloud infrastructure scoped by workload type
- **Edge layer:** Cloud and VPS ingress for public services and low-latency routing
- **Security telemetry:** Multi-node detection and mitigation workflows using CrowdSec and custom automation
- **Observability:** Centralized monitoring with Prometheus, Grafana, Netdata, exporters, and public dashboards
- **Resilience:** Automated health checks, DNS failover, backup verification, and role-scoped service design

---

## Security Detection and Response

Security controls are integrated directly into the platform rather than handled as one-off manual checks.

Current detection and response patterns include:

- Telemetry ingestion from 7 active nodes
- CrowdSec-based detection and mitigation
- MITRE ATT&CK mapping for selected security events
- Escalation logic for high-confidence indicators
- Watchlist and retention policies based on event confidence
- High-severity event notifications through Discord
- Runtime visibility through public and private dashboards

---

## Technical Stack

- **Languages:** Python, Bash, JavaScript, React, Node.js
- **Infrastructure:** Kubernetes, Docker, Caddy, NGINX, Linux
- **Security:** CrowdSec, Trivy, Authelia, OIDC, MFA, Fail2Ban, Vaultwarden
- **Cloud and Networking:** AWS, GCP, Oracle Cloud, Vultr, Cloudflare, DNS automation
- **Observability:** Prometheus, Grafana, Netdata, Blackbox Exporter, Node Exporter, cAdvisor
- **Backups:** Borgmatic, encrypted offsite backups, restore verification
- **CI/CD and Source Control:** Git, GitHub Actions, Gitea, container image scanning
- **Infrastructure as Code:** Terraform

---

## Operational Metrics

Current platform highlights:

- 10-node distributed infrastructure environment
- 7-node security telemetry and detection footprint
- `132638` lines of custom code across infrastructure, security, and automation projects
- `1482` commits since January 1 across active repositories
- Automated failover between AWS and peer infrastructure
- Public dashboards for uptime, vulnerabilities, backups, threat telemetry, and service health
- Multiple daily encrypted Borgmatic snapshots shipped offsite
- Recurring backup verification and restore-oriented operational workflows
- Nightly metrics refresh via Gitea Actions and `tokei`

---

## Activity

![Commit heatmap](public/heatmap.svg)

---

## Deployment Patterns

- **Reverse proxy:** Caddy and NGINX, with Cloudflare where applicable
- **Observability:** Prometheus, Grafana, Node Exporter, Blackbox Exporter, cAdvisor, Netdata
- **Access control:** Authelia, OIDC, MFA, TLS hardening, and protected reverse-proxy routes
- **Lifecycle management:** Controlled container updates and service monitoring
- **Service isolation:** Nodes scoped by role to reduce blast radius and simplify recovery
- **Backup strategy:** Encrypted offsite backups with recurring verification

---

## Selected Public Projects

- **Portfolio:** [beane.me](https://beane.me)
- **Threat Decisions and Telemetry:** [threats.beane.me](https://threats.beane.me)
- **Threat Intelligence and Analytics:** [intel.beane.me](https://intel.beane.me)
- **Vulnerability Scanning and Trends:** [vuln.beane.me](https://vuln.beane.me)
- **Backup and Restore Verification:** [backups.beane.me](https://backups.beane.me)
- **Threat Decision Observability:** [observe.beane.me](https://observe.beane.me)
- **Health and Failover Dashboard:** [health.beane.me](https://health.beane.me)
- **Source Control:** [git.beane.me](https://git.beane.me)
- **Terraform Threat Modeling:** [tfstride.beane.me](https://tfstride.beane.me)

---

## Engineering Philosophy

Production systems should be observable, automated, recoverable, and secure from the start.

I focus on infrastructure that explains itself: clear telemetry, deterministic automation, evidence-backed security findings, documented recovery paths, and controls that improve reliability without slowing delivery.
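
As a rough illustration of the confidence-based escalation and retention patterns listed under Security Detection and Response, the sketch below maps an event's confidence score to a handling tier and a retention window. It is a minimal example, not the platform's actual code: the `SecurityEvent` type, the tier names, the thresholds, and the retention windows are all hypothetical placeholders.

```python
from dataclasses import dataclass
from datetime import timedelta

# Hypothetical values for illustration only; none of these thresholds,
# tier names, or retention windows come from the platform itself.
ESCALATE_AT = 0.9
WATCHLIST_AT = 0.5
RETENTION = {
    "escalate": timedelta(days=90),
    "watchlist": timedelta(days=30),
    "log": timedelta(days=7),
}


@dataclass(frozen=True)
class SecurityEvent:
    """Hypothetical event shape: where it came from and how sure we are."""
    source_ip: str
    scenario: str      # name of the detection scenario that fired
    confidence: float  # 0.0 (likely noise) to 1.0 (near-certain)


def classify(event: SecurityEvent) -> str:
    """Map an event's confidence score to a handling tier."""
    if not 0.0 <= event.confidence <= 1.0:
        raise ValueError(f"confidence out of range: {event.confidence}")
    if event.confidence >= ESCALATE_AT:
        return "escalate"
    if event.confidence >= WATCHLIST_AT:
        return "watchlist"
    return "log"


def retention_for(event: SecurityEvent) -> timedelta:
    """Return how long to keep the event, based on its tier."""
    return RETENTION[classify(event)]


if __name__ == "__main__":
    # 203.0.113.0/24 is a documentation-only address range.
    event = SecurityEvent("203.0.113.7", "ssh-bruteforce", 0.95)
    print(classify(event), retention_for(event).days)
```

Keeping the tier decision in one pure function makes the automation deterministic and easy to test: the same confidence score always yields the same tier and retention window, and notification or blocking actions can be layered on top of the returned tier.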