Autoglue Repository Architecture & Agents
Overview
Autoglue is a Kubernetes cluster management platform built with Go that manages the lifecycle of K3s clusters across GlueOps-supported cloud providers. It provides a REST API for cluster provisioning, configuration, and management, along with a web UI and Terraform provider.
Repository Structure
autoglue/
├── cmd/ # CLI commands
├── internal/ # Internal packages
│ ├── api/ # HTTP API routes and middleware
│ ├── app/ # Application setup
│ ├── auth/ # Authentication logic
│ ├── bg/ # Background job workers
│ ├── common/ # Common utilities
│ ├── config/ # Configuration management
│ ├── db/ # Database operations
│ ├── handlers/ # HTTP request handlers
│ ├── keys/ # Cryptographic key management
│ ├── mapper/ # Data mapping utilities
│ ├── models/ # Database models
│ ├── utils/ # Utility functions
│ ├── version/ # Version information
│ └── web/ # Web UI integration
├── sdk/ # Generated SDKs
│ └── ts/ # TypeScript SDK
├── ui/ # Frontend application (React)
├── docs/ # OpenAPI/Swagger documentation
├── postgres/ # PostgreSQL configuration
├── main.go # Application entry point
├── schema.sql # Database schema
└── docker-compose.yml # Development environment
Core Components
1. API Layer (internal/api)
The API layer provides RESTful endpoints for managing cloud resources:
- Authentication (mount_auth_routes.go): OAuth/OIDC integration, JWT tokens
- Clusters (mount_cluster_routes.go): Kubernetes cluster management
- Servers (mount_server_routes.go): Server resource management
- SSH Keys (mount_ssh_routes.go): SSH key generation and management
- DNS (mount_dns_routes.go): DNS record management
- Load Balancers (mount_load_balancer_routes.go): Load balancer configuration
- Node Pools (mount_node_pool_routes.go): Worker node pool management
- Credentials (mount_credential_routes.go): Cloud provider credentials
- Organizations (mount_org_routes.go): Multi-tenant organization management
Middleware:
- Request logging with zerolog
- Rate limiting (1000 requests/minute per IP)
- CORS handling
- Security headers
- Authentication/Authorization
- Request body size limits (10MB)
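Two of the middleware entries above, the body size limit and the per-IP rate limit, can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the project's actual middleware: `limitBody`, `ipLimiter`, `rateLimit`, `readAll`, and `statuses` are hypothetical names, and the fixed-window counter is a stand-in for whatever limiter the repository really uses. The demo uses tiny limits so the behavior is visible; the documented values would be `10<<20` bytes and 1000 requests per minute.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
	"time"
)

// limitBody caps the request body at max bytes (the document lists 10MB).
func limitBody(max int64, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		r.Body = http.MaxBytesReader(w, r.Body, max)
		next.ServeHTTP(w, r)
	})
}

// ipLimiter is a hypothetical fixed-window counter keyed by client IP.
type ipLimiter struct {
	mu     sync.Mutex
	limit  int
	window time.Duration
	start  time.Time
	counts map[string]int
}

func newIPLimiter(limit int, window time.Duration) *ipLimiter {
	return &ipLimiter{limit: limit, window: window, start: time.Now(), counts: map[string]int{}}
}

// allow counts one request for ip and reports whether it fits the budget.
func (l *ipLimiter) allow(ip string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if now := time.Now(); now.Sub(l.start) >= l.window {
		l.start, l.counts = now, map[string]int{}
	}
	l.counts[ip]++
	return l.counts[ip] <= l.limit
}

// rateLimit answers 429 once an IP exceeds the limiter's budget.
func rateLimit(l *ipLimiter, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		ip, _, err := net.SplitHostPort(r.RemoteAddr)
		if err != nil {
			ip = r.RemoteAddr
		}
		if !l.allow(ip) {
			http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// readAll is a stand-in handler that answers 413 when the body is too large.
func readAll(w http.ResponseWriter, r *http.Request) {
	if _, err := io.ReadAll(r.Body); err != nil {
		http.Error(w, "body too large", http.StatusRequestEntityTooLarge)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// statuses sends one request per body through the chain and collects status codes.
func statuses(maxBody int64, perMinute int, bodies ...string) []int {
	h := rateLimit(newIPLimiter(perMinute, time.Minute), limitBody(maxBody, http.HandlerFunc(readAll)))
	var out []int
	for _, b := range bodies {
		rec := httptest.NewRecorder()
		h.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, "/", strings.NewReader(b)))
		out = append(out, rec.Code)
	}
	return out
}

func main() {
	// 8-byte body cap and 2 requests per minute, chosen only to keep the demo short.
	fmt.Println(statuses(8, 2, "small", "far too large", "small"))
}
```

The rate limiter sits outside the body limit here so that rejected clients never have their bodies read at all.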
2. Background Jobs (internal/bg)
Autoglue uses the Archer job queue system with PostgreSQL-backed job persistence.
Active Job Workers:
| Worker | Purpose | Timeout |
|---|---|---|
| bootstrap_bastion | Provision and configure bastion host servers | Configurable (default 60s) |
| archer_cleanup | Clean up old job records | 5 minutes |
| tokens_cleanup | Purge expired refresh tokens | 5 minutes |
| db_backup_s3 | Backup database to S3 | 15 minutes |
| dns_reconcile | Synchronize DNS records with Route53 | 2 minutes |
| org_key_sweeper | Remove expired organization API keys | 5 minutes |
| cluster_action | Execute cluster lifecycle actions | Configurable |
Planned Job Workers (Currently Disabled):
The following workers exist in the codebase but are currently commented out:
- prepare_cluster - Prepare infrastructure for cluster deployment
- cluster_setup - Initial cluster configuration
- cluster_bootstrap - Full Kubernetes cluster bootstrapping process
Configuration:
- archer.instances: Number of worker instances (default: 1)
- archer.timeoutSec: Job timeout in seconds (default: 60)
- archer.cleanup_retain_days: Job retention period (default: 7 days)
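The three settings and their defaults can be modeled as a small struct. This is a sketch only: `ArcherConfig`, `defaultArcherConfig`, and `archerConfigFrom` are hypothetical names, and reading values from a `map[string]string` is an assumption made to keep the example self-contained. The repository's real configuration loader may work differently.

```go
package main

import (
	"fmt"
	"strconv"
)

// ArcherConfig is a hypothetical struct holding the three settings listed above.
type ArcherConfig struct {
	Instances         int // archer.instances
	TimeoutSec        int // archer.timeoutSec
	CleanupRetainDays int // archer.cleanup_retain_days
}

// defaultArcherConfig returns the documented defaults.
func defaultArcherConfig() ArcherConfig {
	return ArcherConfig{Instances: 1, TimeoutSec: 60, CleanupRetainDays: 7}
}

// archerConfigFrom overlays positive integer values from settings onto the defaults.
// Missing, non-numeric, or non-positive values keep the default.
func archerConfigFrom(settings map[string]string) ArcherConfig {
	cfg := defaultArcherConfig()
	set := func(key string, dst *int) {
		if n, err := strconv.Atoi(settings[key]); err == nil && n > 0 {
			*dst = n
		}
	}
	set("archer.instances", &cfg.Instances)
	set("archer.timeoutSec", &cfg.TimeoutSec)
	set("archer.cleanup_retain_days", &cfg.CleanupRetainDays)
	return cfg
}

func main() {
	cfg := archerConfigFrom(map[string]string{
		"archer.instances":  "4",
		"archer.timeoutSec": "oops", // invalid, so the default of 60 is kept
	})
	fmt.Printf("%+v\n", cfg)
}
```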
3. Data Models (internal/models)
Core Models:
- User - User accounts with OAuth integration
- Organization - Multi-tenant organizations
- Membership - User-organization relationships
- ApiKey - API authentication tokens (user and org-level)
- OrganizationKey - Organization-level credentials with auto-expiry
- Cluster - Kubernetes cluster definitions
- NodePool - Worker node group configurations
- Server - Individual server instances
- SshKey - SSH keypair management with encryption
- LoadBalancer - Load balancer configurations
- Domain - DNS domain management
- Credential - Cloud provider API credentials (AWS, etc.)
- Job - Background job queue records
- SigningKey - JWT signing keys with rotation
- RefreshToken - OAuth refresh token storage
- MasterKey - Master encryption key for data at rest
- Label, Annotation, Taint - Kubernetes resource metadata
4. Handlers (internal/handlers)
Request handlers implement business logic for API endpoints:
- auth.go - OAuth flows, token issuance
- clusters.go - Cluster CRUD operations
- servers.go - Server provisioning
- ssh_keys.go - SSH key generation with Ed25519/RSA support
- dns.go - DNS record management via Route53
- load_balancers.go - Load balancer configuration
- node_pools.go - Node pool management with labels/annotations/taints
- credentials.go - Cloud credential storage
- orgs.go - Organization management
- me.go - Current user information
- me_keys.go - User API key management
- health.go - Health check endpoints
- version.go - Version information
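As an illustration of the Ed25519 key generation mentioned for ssh_keys.go, the sketch below creates a keypair with the Go standard library and encodes the private key as PKCS#8 PEM. The function name `generateEd25519` is hypothetical, and the PKCS#8 encoding is an assumption: the actual handler may emit OpenSSH-formatted keys, which would need the `golang.org/x/crypto/ssh` package instead.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"crypto/x509"
	"encoding/pem"
	"fmt"
)

// generateEd25519 creates a keypair and returns the private key as PKCS#8 PEM
// together with the raw 32-byte public key.
func generateEd25519() ([]byte, ed25519.PublicKey, error) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	der, err := x509.MarshalPKCS8PrivateKey(priv)
	if err != nil {
		return nil, nil, err
	}
	privPEM := pem.EncodeToMemory(&pem.Block{Type: "PRIVATE KEY", Bytes: der})
	return privPEM, pub, nil
}

func main() {
	privPEM, pub, err := generateEd25519()
	if err != nil {
		fmt.Println("key generation failed:", err)
		return
	}
	fmt.Printf("public key: %d bytes, private PEM: %d bytes\n", len(pub), len(privPEM))
}
```

In a service like the one described here, the PEM bytes would be encrypted before being written to the database rather than stored in the clear.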
5. Security & Encryption
Cryptography:
- Master Key: AES-256-GCM encryption for root secrets
- Organization Keys: Per-org encryption keys derived from master key
- SSH Keys: Secure generation and encrypted storage
- JWT Tokens: RS256 signing with key rotation
- API Keys: Argon2id hashing for token storage
- At-Rest Encryption: All sensitive data (kubeconfigs, credentials, SSH keys) is encrypted before storage
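AES-256-GCM, named above for the master key, can be sketched with Go's `crypto/aes` and `crypto/cipher` packages. This is a generic illustration of the primitive, not the repository's key-management code: `sealGCM` and `openGCM` are hypothetical names, and prepending the nonce to the ciphertext is a common convention assumed here.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-GCM AEAD; a 32-byte key selects AES-256.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealGCM encrypts plaintext and returns nonce || ciphertext || tag.
func sealGCM(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Passing nonce as dst makes Seal append the ciphertext after it.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// openGCM reverses sealGCM and fails if the data was modified.
func openGCM(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

func main() {
	key := make([]byte, 32) // demo only; a real key must come from a secure random source
	sealed, err := sealGCM(key, []byte("kubeconfig contents"))
	if err != nil {
		fmt.Println("seal failed:", err)
		return
	}
	plain, err := openGCM(key, sealed)
	fmt.Println(string(plain), err)
}
```

Because GCM is authenticated, any change to the stored bytes makes `openGCM` return an error instead of corrupted plaintext.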
Authentication Methods:
- OAuth/OIDC (Google Workspace integration)
- Bearer tokens (JWT)
- Organization Key/Secret pairs
- User API keys
6. CLI Commands (cmd)
- serve - Start the API server (default command)
- keys generate - Generate JWT signing keys
- encrypt create-master - Create master encryption key
- db - Database management utilities
- version - Display version information
7. Integration Points
Cloud Providers:
- AWS (Route53 for DNS, S3 for backups)
- Support for multi-cloud credentials
External Services:
- PostgreSQL (primary data store)
- S3-compatible storage (backups)
- OAuth providers (Google)
SDKs:
- TypeScript SDK (sdk/ts/) - Generated from OpenAPI spec
- Go SDK (consumed via module alias) - Used by external integrations
External Integrations:
- Terraform Provider - Separate repository providing IaC support for Autoglue resources
Development Workflow
Prerequisites
- Go 1.25.4+
- Docker & Docker Compose
- PostgreSQL (via docker-compose)
- Node.js (for UI development)
Setup
# 1. Configure environment
cp .env.example .env
# 2. Start database
docker compose up -d
# 3. Generate JWT keys
go run . keys generate
# 4. Create master encryption key
go run . encrypt create-master
# 5. Update OpenAPI docs and SDKs
make swagger
make sdk-all
# 6. Start API server with embedded UI
go run .
Build & Test
# Build application
go build -o autoglue .
# Run tests
go test ./...
# Build UI
make ui
Note: The Terraform provider is maintained in a separate repository.
API Architecture
Request Flow
Client → CORS → Rate Limit → Logger → Auth → Handler → DB/Jobs → Response
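The ordering in the flow above can be expressed as a small middleware chain. The sketch below is an assumption-laden illustration: `middleware`, `chain`, `stage`, and `requestOrder` are hypothetical names, and each stage is a placeholder that only records its name, so the example shows ordering and nothing else.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

type middleware func(http.Handler) http.Handler

// chain wraps h so that the first middleware listed is the first to see the request.
func chain(h http.Handler, mws ...middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// stage returns a placeholder middleware that records its name, then calls next.
func stage(name string, trace *[]string) middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			*trace = append(*trace, name)
			next.ServeHTTP(w, r)
		})
	}
}

// requestOrder runs one request through the pipeline and returns the visit order.
func requestOrder() string {
	var trace []string
	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		trace = append(trace, "Handler")
	})
	h := chain(handler,
		stage("CORS", &trace), stage("Rate Limit", &trace),
		stage("Logger", &trace), stage("Auth", &trace))
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
	return strings.Join(trace, " → ")
}

func main() {
	fmt.Println(requestOrder()) // CORS → Rate Limit → Logger → Auth → Handler
}
```

Wrapping in reverse order is what lets the call site list stages in the same order the request meets them.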
Authentication Flow
- User logs in via OAuth (Google)
- Backend validates token with provider
- JWT access token issued (short-lived)
- Refresh token stored in DB
- Organization context from X-Org-ID header
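The last step of the flow, taking the organization context from the X-Org-ID header, might look like the sketch below. Only the header name comes from the text above. `withOrg`, `orgFrom`, and `call` are hypothetical names, the `/api/v1/clusters` path is used purely as an example, and rejecting requests that lack the header with 400 is an assumption rather than documented behavior.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// orgKey is an unexported context key type, which avoids collisions with other packages.
type orgKey struct{}

// withOrg copies the X-Org-ID header into the request context.
// Rejecting requests without the header is an assumption of this sketch.
func withOrg(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		org := r.Header.Get("X-Org-ID")
		if org == "" {
			http.Error(w, "missing X-Org-ID", http.StatusBadRequest)
			return
		}
		ctx := context.WithValue(r.Context(), orgKey{}, org)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// orgFrom returns the organization ID stored by withOrg, if any.
func orgFrom(ctx context.Context) (string, bool) {
	org, ok := ctx.Value(orgKey{}).(string)
	return org, ok
}

// call sends one request with the given header value and returns status and body.
func call(orgID string) (int, string) {
	h := withOrg(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		org, _ := orgFrom(r.Context())
		fmt.Fprint(w, org)
	}))
	req := httptest.NewRequest(http.MethodGet, "/api/v1/clusters", nil)
	if orgID != "" {
		req.Header.Set("X-Org-ID", orgID)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	fmt.Println(call("org-123"))
	code, _ := call("")
	fmt.Println(code)
}
```

A real implementation would also need to check that the authenticated user is a member of the requested organization before trusting the header.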
Job Execution Flow
- Handler enqueues job via Jobs.Enqueue()
- Archer worker picks up job from PostgreSQL
- Worker executes task with timeout
- Result stored in jobs table
- Retries on failure (configurable)
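The "executes task with timeout" and "retries on failure" steps can be sketched with `context.WithTimeout` and a bounded loop. This does not use or reproduce the Archer API. `task`, `runJob`, `flaky`, `slow`, `demo`, and `demoTimeout` are hypothetical names, and the immediate retry without backoff is a simplification.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// task is a hypothetical job body; it must honor ctx cancellation.
type task func(ctx context.Context) error

// runJob tries t up to attempts times, giving each try its own timeout.
// It returns the number of tries made and the last error (nil on success).
func runJob(ctx context.Context, attempts int, timeout time.Duration, t task) (int, error) {
	var err error
	for try := 1; try <= attempts; try++ {
		tryCtx, cancel := context.WithTimeout(ctx, timeout)
		err = t(tryCtx)
		cancel()
		if err == nil {
			return try, nil
		}
	}
	return attempts, err
}

// flaky returns a task that fails its first n calls, then succeeds.
func flaky(n int) task {
	calls := 0
	return func(ctx context.Context) error {
		calls++
		if calls <= n {
			return errors.New("transient failure")
		}
		return nil
	}
}

// slow blocks until its context expires, as a stuck job would.
func slow(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}

// demo runs a flaky task and reports tries used and whether it succeeded.
func demo(failures, attempts int) (int, bool) {
	tries, err := runJob(context.Background(), attempts, time.Second, flaky(failures))
	return tries, err == nil
}

// demoTimeout reports whether a stuck task is cut off by the per-try deadline.
func demoTimeout() bool {
	_, err := runJob(context.Background(), 1, 10*time.Millisecond, slow)
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println(demo(2, 3))    // 3 true
	fmt.Println(demo(5, 3))    // 3 false
	fmt.Println(demoTimeout()) // true
}
```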
Database Schema
Key Tables:
- users - User accounts
- accounts - OAuth provider linkage
- organizations - Tenant isolation
- memberships - User-org relationships
- api_keys - Authentication tokens
- clusters - K8s cluster definitions
- node_pools - Worker node groups
- servers - Compute instances
- ssh_keys - SSH keypair storage
- load_balancers - LB configurations
- domains - DNS domains
- credentials - Cloud API credentials
- jobs - Background job queue
- signing_keys - JWT key rotation
- refresh_tokens - OAuth token storage
- master_keys - Encryption key hierarchy
Configuration
Environment variables (.env):
- DATABASE_URL - PostgreSQL connection string
- JWT_PRIVATE_ENC_KEY - JWT private key encryption
- GOOGLE_CLIENT_ID / GOOGLE_CLIENT_SECRET - OAuth
- ALLOWED_ORIGINS - CORS configuration
- archer.* - Job queue settings
- AWS credentials for Route53/S3
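Reading a few of these variables at startup could look like the sketch below. The variable names come from the list above; everything else is an assumption. `Config` and `loadConfig` are hypothetical, DATABASE_URL is treated as required, and ALLOWED_ORIGINS is assumed to be a comma-separated list. The repository's own config package may behave differently.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// Config is a hypothetical subset of the settings listed above.
type Config struct {
	DatabaseURL    string
	GoogleClientID string
	AllowedOrigins []string
}

// loadConfig reads settings through getenv (os.Getenv in production).
// Treating ALLOWED_ORIGINS as comma-separated is an assumption of this sketch.
func loadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		DatabaseURL:    getenv("DATABASE_URL"),
		GoogleClientID: getenv("GOOGLE_CLIENT_ID"),
	}
	if cfg.DatabaseURL == "" {
		return Config{}, errors.New("DATABASE_URL is required")
	}
	for _, o := range strings.Split(getenv("ALLOWED_ORIGINS"), ",") {
		if o = strings.TrimSpace(o); o != "" {
			cfg.AllowedOrigins = append(cfg.AllowedOrigins, o)
		}
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("%d allowed origin(s)\n", len(cfg.AllowedOrigins))
}
```

Passing the lookup function in, rather than calling `os.Getenv` directly, keeps the loader testable without touching the process environment.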
Deployment
Docker
docker build -t autoglue .
docker run -p 8080:8080 --env-file .env autoglue
Production Considerations
- Database connection pooling
- Rate limiting configuration
- CORS allowed origins
- JWT key rotation schedule
- Backup retention policies
- Worker instance scaling
- Monitoring and alerting
API Documentation
- Swagger UI: http://localhost:8080/swagger/index.html
- OpenAPI Spec: docs/openapi.yaml
- SDK Documentation: sdk/ts/README.md
Testing
The repository includes:
- Unit tests for handlers (*_test.go)
- Test utilities (internal/testutil/)
- Integration tests with embedded PostgreSQL
Run tests:
go test ./internal/handlers/
go test -v ./...
Key Features
- Multi-tenancy: Organization-based resource isolation
- Encryption at Rest: All sensitive data encrypted per-org
- Async Job Processing: Background tasks with retry logic
- API Key Management: Multiple authentication methods
- SSH Key Generation: Automated keypair creation (RSA/Ed25519)
- DNS Automation: Route53 integration for DNS records
- Kubernetes Management: Cluster lifecycle automation
- Terraform Provider: Infrastructure-as-Code support
- Web UI: React-based management interface
- OpenAPI/Swagger: Auto-generated API documentation
Architecture Patterns
- Repository Pattern: Data access abstraction via GORM
- Dependency Injection: Dependencies passed to handlers
- Middleware Chain: Request processing pipeline
- Job Queue: Async processing with Archer
- Multi-tenant: Organization-scoped data isolation
- Encryption: Key hierarchy (master → org → resource)
Future Enhancements
Based on commented-out code and the repository structure:
- Full cluster provisioning automation
- Additional cloud provider support
- Enhanced monitoring and observability
- Cluster backup and restore
- Advanced RBAC controls
- Custom resource definitions
Resources
- GitHub: https://github.com/GlueOps/autoglue
- Production API: https://autoglue.glueopshosted.com/api/v1
- Pre-prod API: https://autoglue.glueopshosted.rocks/api/v1
- Staging API: https://autoglue.apps.nonprod.earth.onglueops.rocks/api/v1