# Patrick Beane

**Infrastructure & Security Engineer | SRE | Cloud-Native Platforms**

I design and operate a self-directed production infrastructure platform focused on security automation, reliability engineering, observability, vulnerability management, and recoverability.

The environment spans Kubernetes, Linux, multi-cloud infrastructure, identity controls, threat detection, backup verification, and public operational dashboards. My goal is to build systems that are secure-by-default, observable in production, and recoverable under failure.

---
## Production Infrastructure Overview

This environment blends production operations, security research, and continuous infrastructure improvement. Services are distributed across cloud and self-hosted nodes, with each node scoped to a specific operational role to reduce blast radius and simplify ownership.

| Node | Primary Role | Function |
| :--- | :--- | :--- |
| **Argus** | Security telemetry and failover automation | Threat detection, event correlation, and node-health automation |
| **Triton** | Observability and internal tooling | Prometheus, Grafana, Authelia, code-server, CrowdSec bouncers |
| **Ares** | Kubernetes and source control | Gitea, PostgreSQL, Valkey, CI runners, Kubernetes control plane |
| **Zephyrus** | Container hosting | Docker workloads and service hosting |
| **Iris** | Edge services | NGINX/PHP ingress and public-facing services |
| **Vault** | Secrets and identity-adjacent services | Vaultwarden and protected internal services |
| **Apollo** | Threat intelligence dashboard | Flask-based analytics and reporting |
| **Hermes** | Public API frontend | Public API and frontend service layer |
| **Hades** | Public API backend | Backend service support for public APIs |
| **Zeus** | Monitoring and metrics | Centralized observability and service-health tracking |

---
## Infrastructure Strategy

- **Compute layer:** Heterogeneous self-hosted and cloud infrastructure scoped by workload type
- **Edge layer:** Cloud and VPS ingress for public services and low-latency routing
- **Security telemetry:** Multi-node detection and mitigation workflows using CrowdSec and custom automation
- **Observability:** Centralized monitoring with Prometheus, Grafana, Netdata, exporters, and public dashboards
- **Resilience:** Automated health checks, DNS failover, backup verification, and role-scoped service design
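
The health-check and DNS-failover pairing in the resilience item can be illustrated with a minimal sketch. This is not the platform's actual automation: `NodeHealth`, `record_check`, `pick_dns_target`, the node names, and the three-failure threshold are all hypothetical, chosen only to show the general pattern of failing over after repeated failed checks and returning to the primary once it recovers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeHealth:
    """Hypothetical per-node state: how many checks have failed in a row."""
    name: str
    consecutive_failures: int = 0


def record_check(node: NodeHealth, healthy: bool) -> NodeHealth:
    """Return updated state: reset the counter on success, increment on failure."""
    failures = 0 if healthy else node.consecutive_failures + 1
    return NodeHealth(node.name, failures)


def pick_dns_target(primary: NodeHealth, standby: NodeHealth, threshold: int = 3) -> str:
    """Choose which node DNS should point at.

    Fail over only when the primary has reached the (assumed) failure
    threshold and the standby has not; otherwise stay on the primary.
    """
    primary_down = primary.consecutive_failures >= threshold
    standby_down = standby.consecutive_failures >= threshold
    if primary_down and not standby_down:
        return standby.name
    return primary.name


# Illustrative names only; real node names and check sources would differ.
primary = NodeHealth("aws-primary")
standby = NodeHealth("peer-standby")
for _ in range(3):
    primary = record_check(primary, healthy=False)
print(pick_dns_target(primary, standby))
```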

---

## Security Detection and Response

Security controls are integrated directly into the platform rather than handled as one-off manual checks.

Current detection and response patterns include:

- Telemetry ingestion from 7 active nodes
- CrowdSec-based detection and mitigation
- MITRE ATT&CK mapping for selected security events
- Escalation logic for high-confidence indicators
- Watchlist and retention policies based on event confidence
- High-severity event notifications through Discord
- Runtime visibility through public and private dashboards
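
As a rough illustration of confidence-based escalation, watchlisting, and retention, the sketch below routes an event into one of three tiers. Everything in it is an assumption made for the example: the `SecurityEvent` fields, the `triage` function, the 0.9 and 0.5 thresholds, and the retention periods are hypothetical and do not describe the platform's real policy or CrowdSec's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecurityEvent:
    """Hypothetical normalized event; confidence is assumed to be in [0.0, 1.0]."""
    source_ip: str
    scenario: str
    confidence: float


def triage(event: SecurityEvent, escalate_at: float = 0.9, watch_at: float = 0.5) -> dict:
    """Map an event's confidence to an action, a notify flag, and a retention period.

    Thresholds and retention days are illustrative placeholders.
    """
    if event.confidence >= escalate_at:
        return {"action": "escalate", "notify": True, "retention_days": 90}
    if event.confidence >= watch_at:
        return {"action": "watchlist", "notify": False, "retention_days": 30}
    return {"action": "log", "notify": False, "retention_days": 7}


# 203.0.113.0/24 is a documentation-only address range.
event = SecurityEvent("203.0.113.10", "ssh-bruteforce", confidence=0.95)
print(triage(event)["action"])
```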

---

## Technical Stack

- **Languages and runtimes:** Python, Bash, JavaScript (React, Node.js)
- **Infrastructure:** Kubernetes, Docker, Caddy, NGINX, Linux
- **Security:** CrowdSec, Trivy, Authelia, OIDC, MFA, Fail2Ban, Vaultwarden
- **Cloud and Networking:** AWS, GCP, Oracle Cloud, Vultr, Cloudflare, DNS automation
- **Observability:** Prometheus, Grafana, Netdata, Blackbox Exporter, Node Exporter, cAdvisor
- **Backups:** Borgmatic, encrypted offsite backups, restore verification
- **CI/CD and Source Control:** Git, GitHub Actions, Gitea, container image scanning
- **Infrastructure as Code:** Terraform

---
## Operational Metrics

Current platform highlights:

- 10-node distributed infrastructure environment
- 7-node security telemetry and detection footprint
- `107104` lines of custom code across infrastructure, security, and automation projects
- `1369` commits since January 1 across active repositories
- Automated failover between AWS and peer infrastructure
- Public dashboards for uptime, vulnerabilities, backups, threat telemetry, and service health
- Multiple daily encrypted Borgmatic snapshots shipped offsite
- Recurring backup verification and restore-oriented operational workflows
- Nightly metrics refresh via Gitea Actions and `tokei`
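
A line-count figure like the one above could be derived from a `tokei` JSON report along these lines. This is a sketch under assumptions, not the actual nightly job: it assumes the report is a mapping from language name to an object with a `code` count, possibly with an aggregate `Total` entry that must be skipped to avoid double counting. The `total_code_lines` helper and the sample numbers are hypothetical.

```python
import json


def total_code_lines(report: dict) -> int:
    """Sum per-language `code` counts, skipping an aggregate "Total" entry if present."""
    return sum(
        stats.get("code", 0)
        for language, stats in report.items()
        if language != "Total"
    )


# Made-up sample in the assumed shape; real reports carry more fields.
SAMPLE_REPORT = json.loads("""
{
  "Python": {"code": 1200, "comments": 90, "blanks": 150},
  "Shell":  {"code": 300,  "comments": 40, "blanks": 35},
  "Total":  {"code": 1500, "comments": 130, "blanks": 185}
}
""")

print(total_code_lines(SAMPLE_REPORT))
```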

---

## Activity

---
## Deployment Patterns

- **Reverse proxy:** Caddy and NGINX, with Cloudflare where applicable
- **Observability:** Prometheus, Grafana, Node Exporter, Blackbox Exporter, cAdvisor, Netdata
- **Access control:** Authelia, OIDC, MFA, TLS hardening, and protected reverse-proxy routes
- **Lifecycle management:** Controlled container updates and service monitoring
- **Service isolation:** Nodes scoped by role to reduce blast radius and simplify recovery
- **Backup strategy:** Encrypted offsite backups with recurring verification

---
## Selected Public Projects

- **Portfolio:** [beane.me](https://beane.me)
- **Threat Decisions and Telemetry:** [threats.beane.me](https://threats.beane.me)
- **Threat Intelligence and Analytics:** [intel.beane.me](https://intel.beane.me)
- **Vulnerability Scanning and Trends:** [vuln.beane.me](https://vuln.beane.me)
- **Backup and Restore Verification:** [backups.beane.me](https://backups.beane.me)
- **Threat Decision Observability:** [observe.beane.me](https://observe.beane.me)
- **Health and Failover Dashboard:** [health.beane.me](https://health.beane.me)
- **Source Control:** [git.beane.me](https://git.beane.me)
- **Terraform Threat Modeling:** [tfstride.beane.me](https://tfstride.beane.me)

---
## Engineering Philosophy

Production systems should be observable, automated, recoverable, and secure from the start.

I focus on infrastructure that explains itself: clear telemetry, deterministic automation, evidence-backed security findings, documented recovery paths, and controls that improve reliability without slowing delivery.