Security Validation: OPA Gatekeeper enforcement was successfully validated. Non-compliant resources (missing seccompProfile, improper runAsGroup, missing ALB annotations, incorrect image pull policies) were consistently blocked at admission time. Deployments had to be patched via Kustomize before they were admitted, confirming that policy enforcement did not impede valid deployments but did require explicit configuration alignment.
Connectivity: End-to-end connectivity from ingress to ODM services was successfully established. TLS termination and internal routing functioned as expected once services were healthy. The pilot confirmed that ingress configuration (including annotations and TLS settings) can securely expose ODM endpoints while maintaining encrypted communication within the cluster.
Lessons Learned
Configuration Challenges: The default Helm chart required significant augmentation to meet enterprise policy requirements. Key challenges included enforcing security contexts (runAsGroup, supplementalGroups, seccomp), overriding image pull policies for init containers, and patching components not fully configurable via values.yaml (e.g., test jobs). Additionally, ingress configuration required precise annotations (TLS enforcement, backend protocol, routing paths) to align with cluster ingress controller behavior.
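The Kustomize overlay below is an added example (not the pilot's own files) of the augmentation described above and in the validation results: it patches the Helm-rendered ODM Deployment with the security-context fields the Gatekeeper constraints reject when absent, overrides the init container's image pull policy, and adds the ALB ingress annotations for TLS enforcement and backend protocol. The rendered-manifest path, resource names, GIDs and the init container name are placeholder assumptions.

```yaml
# kustomization.yaml: added example, not the pilot's own files. Assumes the ODM
# chart was first rendered with `helm template` into base/odm-rendered.yaml;
# resource names, GIDs and the init container name are placeholders.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - base/odm-rendered.yaml
patches:
  # Pod hardening: the fields the Gatekeeper constraints reject when absent.
  - target:
      kind: Deployment
      name: odm-decisionserverruntime      # placeholder Deployment name
    patch: |-
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: odm-decisionserverruntime
      spec:
        template:
          spec:
            securityContext:
              runAsNonRoot: true
              runAsGroup: 1001             # placeholder non-root GID
              supplementalGroups: [1001]
              seccompProfile:
                type: RuntimeDefault       # absent profile -> admission denied
            initContainers:
              # strategic merge matches init containers by name, so this must
              # equal the chart's init container name (placeholder here)
              - name: odm-init
                imagePullPolicy: IfNotPresent
  # Ingress alignment: annotations the AWS Load Balancer Controller needs for
  # an HTTPS listener, HTTP->HTTPS redirect and TLS to the ODM backend.
  - target:
      kind: Ingress
      name: odm-ingress                    # placeholder Ingress name
    patch: |-
      apiVersion: networking.k8s.io/v1
      kind: Ingress
      metadata:
        name: odm-ingress
        annotations:
          alb.ingress.kubernetes.io/scheme: internal
          alb.ingress.kubernetes.io/listen-ports: '[{"HTTP":80},{"HTTPS":443}]'
          alb.ingress.kubernetes.io/ssl-redirect: "443"
          alb.ingress.kubernetes.io/backend-protocol: HTTPS
```

Running `kubectl kustomize . | kubectl apply --dry-run=server -f -` sends the patched manifests through the Gatekeeper admission webhook without creating anything, so policy violations surface before a real deployment.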
Operational Friction: Manual configuration of environment-specific parameters introduced complexity and room for error. Standardizing these inputs and integrating them into a controlled deployment pipeline such as Jenkins will improve the deployment process.
Unexpected Behavior: Kubernetes services only route to pods that pass readiness checks, meaning application-level issues can manifest as networking failures. Ingress resources may appear correctly configured even when backend services are unavailable, requiring inspection of service endpoints and pod health. Additionally, overlapping environments using the same hostname can create routing ambiguity, reinforcing the importance of clear environment isolation.
Recommendations for Production
Automation:
Implement a fully automated deployment pipeline (e.g., Jenkins or GitHub Actions) that renders Helm templates and applies Kustomize overlays to enforce consistent, policy-compliant configurations
Integrate centralized secret management (e.g., External Secrets Operator or Vault) to eliminate manual credential handling
Standardize environment configuration (DB, ingress, TLS) as reusable templates
Observability:
Integrate monitoring and logging (e.g., Prometheus and Grafana, or cloud-native equivalents such as CloudWatch) to track pod readiness, restart patterns, and decision service latency
Implement alerting for failed readiness probes, missing service endpoints, and dependency failures (e.g., database connectivity)
Enhance visibility into ODM application logs to reduce time to diagnose issues across components
Scaling:
Implement Horizontal Pod Autoscaler (HPA) for decision runtime components based on CPU utilization and request throughput
Scale Decision Center independently from runtime components, as it serves administrative rather than high-throughput workloads
Plan for backend database scaling and connection management, as ODM performance is tightly coupled to database responsiveness under load