DevOps · Mar 28, 2026

Platform Engineering Ate DevOps: Building Your Internal Developer Platform in 2026

Open Soft Team

Engineering Team

80% of Large Orgs Have Platform Teams — And You Should Too

Gartner’s 2026 Engineering Effectiveness Report confirms what many of us have been feeling: 80% of large engineering organizations (500+ developers) now have dedicated platform engineering teams, up from 45% in 2024. The industry has voted with headcount, and the verdict is clear — platform engineering is not a trend, it is the operating model.

The shift happened because DevOps, as originally conceived, hit a scaling wall. “You build it, you run it” works beautifully for a 20-person startup. At 200 engineers, it becomes “you build it, you run it, and you spend 40% of your time on undifferentiated infrastructure work.” Platform engineering is the answer: centralize the infrastructure expertise, expose it through self-service interfaces, and let application developers focus on shipping features.

What Is an Internal Developer Platform?

An Internal Developer Platform (IDP) is a set of tools, workflows, and self-service capabilities that abstract away infrastructure complexity for application developers. It is not a single product — it is an integration layer that connects your existing tools into a coherent developer experience.

The core principle: developers should be able to deploy a new service to production without filing a ticket, waiting for an ops team, or reading a 50-page runbook.

IDP Architecture

A production IDP in 2026 typically consists of five layers:

+------------------------------------------------------------------+
|                    Developer Portal (Backstage)                   |
|   Service catalog, docs, templates, scaffolding, search          |
+------------------------------------------------------------------+
|                    Self-Service Infrastructure                    |
|   Deploy service, provision database, create environment          |
|   Request resources, view costs, manage secrets                  |
+------------------------------------------------------------------+
|                    CI/CD Pipeline (Standardized)                  |
|   Build, test, scan, deploy — with AI-assisted optimization      |
+------------------------------------------------------------------+
|                    Pre-Approved Infrastructure                    |
|   Terraform modules, Kubernetes operators, database-as-a-service |
|   All security-scanned, compliance-validated, cost-tagged         |
+------------------------------------------------------------------+
|                    Guardrails & Policies                          |
|   OPA/Kyverno policies, cost limits, security baselines          |
|   Automated compliance checks, drift detection                   |
+------------------------------------------------------------------+

Layer 1: Developer Portal (Backstage)

Backstage, the CNCF-graduated developer portal originally created at Spotify, has become the de facto standard interface for IDPs. As of March 2026:

  • 3,200+ companies use Backstage in production (up from 900 in 2024)
  • 700+ open-source plugins available in the Backstage marketplace
  • Backstage 2.0 (released January 2026) introduced a new frontend framework, declarative UI extensions, and native support for platform actions

Backstage serves as the single entry point for developers to:

  • Browse the service catalog — Every service, library, and infrastructure component is registered with metadata (owner, documentation, dependencies, API specs, deployment status)
  • Scaffold new services — Software templates generate new projects with CI/CD, monitoring, and deployment configured out of the box
  • View documentation — TechDocs renders Markdown documentation alongside the service catalog, so docs live next to the code they describe
  • Search everything — Unified search across services, APIs, documentation, runbooks, and incidents
  • Trigger platform actions — Deploy a service, provision a database, rotate secrets, create a new environment — all through the portal

Example software template:

# Backstage Software Template for a new microservice
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: microservice-template
  title: Production Microservice
  description: Creates a new microservice with CI/CD, monitoring, and K8s deployment
spec:
  owner: platform-team
  type: service
  parameters:
    - title: Service Details
      required: [name, language]
      properties:
        name:
          title: Service Name
          type: string
          pattern: "^[a-z][a-z0-9-]*$"
        language:
          title: Language
          type: string
          enum: [rust, go, typescript, python]
        database:
          title: Database
          type: string
          enum: [postgresql, none]
  steps:
    - id: scaffold
      action: fetch:template
      input:
        url: ./skeleton
        values:
          name: ${{ parameters.name }}
          language: ${{ parameters.language }}
    - id: create-repo
      action: publish:gitlab
      input:
        repoUrl: gitlab.com?repo=${{ parameters.name }}&owner=backend
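    - id: provision-infra
      # Not a built-in scaffolder action: assumes the platform team has registered a custom terraform:apply action
      action: terraform:apply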
      input:
        module: microservice-base
        vars:
          service_name: ${{ parameters.name }}
          database: ${{ parameters.database }}
    - id: register-catalog
      action: catalog:register
      input:
        repoContentsUrl: ${{ steps.create-repo.output.repoContentsUrl }}

Layer 2: Self-Service Infrastructure

The self-service layer provides developers with pre-approved infrastructure resources that can be provisioned instantly:

  • Databases — PostgreSQL, Redis, MongoDB instances with automated backups, monitoring, and connection pooling
  • Message queues — Kafka topics, RabbitMQ vhosts, NATS subjects
  • Environments — Ephemeral preview environments for pull requests, staging environments with production-like data
  • Secrets — Vault-managed secrets with automatic rotation and injection
  • DNS and certificates — Automatic DNS record creation and TLS certificate provisioning via cert-manager

The key word is pre-approved. The platform team has already reviewed, security-scanned, and cost-optimized each resource type. Developers choose from a menu of validated options rather than writing raw Terraform from scratch.
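
As a rough illustration of the "menu of validated options" idea, the sketch below checks a self-service request against a small catalog. The catalog contents, the `ResourceRequest` type, and the size names are all hypothetical; a real platform would load them from its own module registry.

```python
from dataclasses import dataclass

# Hypothetical menu of pre-approved resource types and the sizes allowed for each.
APPROVED_CATALOG = {
    "postgresql": {"small", "medium", "large"},
    "redis": {"small", "medium"},
    "kafka-topic": {"standard"},
}


@dataclass(frozen=True)
class ResourceRequest:
    team: str
    resource_type: str
    size: str


def validate_request(request: ResourceRequest) -> list[str]:
    """Return a list of problems; an empty list means the request can be self-served."""
    allowed_sizes = APPROVED_CATALOG.get(request.resource_type)
    if allowed_sizes is None:
        return [f"'{request.resource_type}' is not a pre-approved resource type"]
    if request.size not in allowed_sizes:
        return [f"size '{request.size}' is not approved for {request.resource_type}"]
    return []
```

Requests that return an empty list can be provisioned immediately; anything else would fall back to a review process.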

Layer 3: Standardized CI/CD

The platform team provides standardized CI/CD pipelines that enforce organizational standards:

# Platform-provided CI/CD pipeline (developers do not write this)
# Automatically attached to every service created through the portal
stages:
  - build:
      steps:
        - compile
        - unit-test
        - lint
  - security:
      steps:
        - sast-scan        # Static analysis (Semgrep, CodeQL)
        - dependency-audit  # Known vulnerability scan
        - container-scan    # Image vulnerability scan (Trivy)
        - secrets-scan      # Prevent credential leaks (Gitleaks)
  - deploy-staging:
      steps:
        - deploy-to-staging
        - integration-test
        - performance-test
  - deploy-production:
      steps:
        - canary-deploy-10-percent
        - automated-rollback-on-error-spike
        - progressive-rollout-to-100-percent
        - post-deploy-smoke-test

Developers do not configure pipelines — they just push code. The platform handles build, test, scan, and deploy automatically.

Layer 4: Pre-Approved Infrastructure Modules

The platform team maintains a library of Terraform modules and Kubernetes operators that encode organizational best practices:

  • Every module is versioned, tested, and security-reviewed
  • Modules enforce tagging conventions, network policies, resource limits, and backup schedules
  • Cost estimates are calculated before provisioning
  • Drift detection alerts when infrastructure diverges from the declared state
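
Drift detection, at its simplest, is a diff between declared and observed configuration. The sketch below assumes both sides are available as flat dictionaries (the attribute names in the usage note are made up); real tools compare full Terraform state, which is considerably more involved.

```python
def detect_drift(declared: dict, actual: dict) -> dict:
    """Return {key: (declared_value, actual_value)} for every attribute that differs.

    A value of None on either side means the key is absent on that side.
    """
    missing = object()  # sentinel so that absent keys are distinguishable during comparison
    drift = {}
    for key in declared.keys() | actual.keys():
        want = declared.get(key, missing)
        have = actual.get(key, missing)
        if want != have:
            drift[key] = (
                None if want is missing else want,
                None if have is missing else have,
            )
    return drift
```

For example, comparing `{"instance_class": "db.small"}` with `{"instance_class": "db.large", "public": True}` reports both the changed instance class and the undeclared `public` attribute.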

Layer 5: Guardrails and Policies

Guardrails are the secret ingredient that makes self-service safe. Without them, self-service becomes “developers provision whatever they want and the bill explodes.”

OPA (Open Policy Agent) and Kyverno enforce policies at multiple levels:

  • Kubernetes admission — Block deployments that lack resource limits, health checks, or security contexts
  • Terraform plan — Reject infrastructure changes that violate cost budgets or compliance rules
  • CI/CD gates — Fail builds that introduce critical vulnerabilities or skip required tests
  • Runtime — Alert on or block runtime behavior that violates security baselines

Example Kyverno policy:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-resource-limits
spec:
  validationFailureAction: Enforce
  rules:
    - name: check-limits
      match:
        any:
          - resources:
              kinds: ["Pod"]
      validate:
        message: "All containers must have CPU and memory limits"
        pattern:
          spec:
            containers:
              - resources:
                  limits:
                    memory: "?*"
                    cpu: "?*"
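
To make the policy's intent concrete, here is a small Python sketch of the same check applied to a pod manifest loaded as a dictionary. It is only an illustration of the rule's logic, not how Kyverno evaluates patterns internally, and the function name is hypothetical.

```python
def containers_missing_limits(pod: dict) -> list[str]:
    """Return the names of containers that lack a non-empty CPU or memory limit."""
    offenders = []
    for container in pod.get("spec", {}).get("containers", []):
        # Treat a missing or null "resources"/"limits" block as having no limits.
        limits = (container.get("resources") or {}).get("limits") or {}
        if not limits.get("cpu") or not limits.get("memory"):
            offenders.append(container.get("name", "<unnamed>"))
    return offenders
```

A pod would pass the policy only when this function returns an empty list.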

AI in CI/CD: 76% Adoption and 3x Fewer Deployment Failures

The 2026 State of DevOps Report reveals that 76% of engineering organizations now use AI in their CI/CD pipelines, up from 31% in 2024. The impact is measurable: teams using AI-assisted CI/CD report 3x fewer deployment failures and 40% shorter lead times.

Where AI Fits in the Pipeline

| Stage | AI Application | Impact |
| --- | --- | --- |
| Code review | AI-generated review comments, security suggestions | 30% fewer bugs reaching CI |
| Test generation | AI generates unit and integration tests from code changes | 60% higher test coverage |
| Test selection | AI predicts which tests are relevant to a change | 70% shorter test suite execution |
| Deployment risk | AI scores deployment risk based on change characteristics | 50% fewer high-severity incidents |
| Incident response | AI correlates deployment with production anomalies | 65% faster MTTR |
| Rollback decision | AI recommends rollback based on error rate trends | 80% faster rollback initiation |

AI-Powered Test Selection

One of the highest-ROI AI applications in CI/CD is predictive test selection. Instead of running the entire test suite on every commit (which can take 30-60 minutes for large codebases), AI models predict which tests are likely to fail based on the changed files:

  • Launchable and Gradle Predictive Test Selection are the leading tools
  • They analyze historical test results and code change patterns
  • Typical result: run 20% of the test suite, catch 99% of failures
  • Average CI time reduction: 60-70%
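
The commercial tools use trained models, but the underlying idea can be sketched with a simple frequency heuristic: rank tests by how often they failed in past runs that touched the same files. Everything below (function names, the shape of the history records) is an assumption for illustration, not any vendor's API.

```python
from collections import defaultdict


def build_failure_index(history):
    """history: iterable of (changed_files, failed_tests) pairs from past CI runs."""
    index = defaultdict(lambda: defaultdict(int))
    for changed_files, failed_tests in history:
        for path in changed_files:
            for test in failed_tests:
                index[path][test] += 1
    return index


def select_tests(index, changed_files, budget):
    """Rank tests by past failures alongside the changed files; keep the top `budget`."""
    scores = defaultdict(int)
    for path in changed_files:
        for test, count in index.get(path, {}).items():
            scores[test] += count
    # Highest score first; ties broken alphabetically so the result is deterministic.
    ranked = sorted(scores, key=lambda test: (-scores[test], test))
    return ranked[:budget]
```

A change that touches only files with no failure history selects nothing, so a real setup would still run a periodic full suite as a safety net.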

AI-Assisted Deployment Risk Scoring

Platform teams are training models to score deployment risk based on:

  • Size of the change (lines of code, files modified)
  • Blast radius (number of dependent services)
  • Author experience with the codebase
  • Time since last deployment
  • Historical failure rate for similar changes

High-risk deployments automatically receive additional safeguards: smaller canary percentages, longer bake times, and human approval gates.
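
A minimal sketch of such a scorer, using a hand-weighted sum in place of a trained model. The weights, normalization caps, thresholds, and safeguard values are illustrative assumptions, not figures from any report.

```python
from dataclasses import dataclass


@dataclass
class Change:
    lines_changed: int
    dependent_services: int
    author_prior_commits: int           # author's earlier commits to this codebase
    days_since_last_deploy: int
    similar_change_failure_rate: float  # 0.0 to 1.0


def risk_score(change: Change) -> float:
    """Combine the five signals into a 0-1 score; each signal is capped at 1.0."""
    size = min(change.lines_changed / 1000, 1.0)
    blast = min(change.dependent_services / 20, 1.0)
    inexperience = 1.0 - min(change.author_prior_commits / 50, 1.0)
    staleness = min(change.days_since_last_deploy / 30, 1.0)
    history = min(max(change.similar_change_failure_rate, 0.0), 1.0)
    return 0.25 * size + 0.25 * blast + 0.15 * inexperience + 0.10 * staleness + 0.25 * history


def safeguards(score: float) -> dict:
    """Map a score to rollout settings: riskier changes get smaller canaries and longer bakes."""
    if score >= 0.6:
        return {"canary_percent": 1, "bake_minutes": 60, "human_approval": True}
    if score >= 0.3:
        return {"canary_percent": 5, "bake_minutes": 30, "human_approval": False}
    return {"canary_percent": 10, "bake_minutes": 10, "human_approval": False}
```

Starting with transparent weights like these makes the scores easy to explain to developers before a learned model replaces them.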

DevSecOps: Security Scanning Automated and Embedded

The “shift left” movement has matured from a slogan into an automated reality. In a modern IDP, security scanning is embedded in the platform — developers do not choose whether to run it.

The Security Scanning Stack

| Layer | Tool | What It Catches |
| --- | --- | --- |
| IDE | Semgrep, Snyk IDE | Bugs during development |
| Pre-commit | Gitleaks, TruffleHog | Leaked secrets |
| SAST | Semgrep, CodeQL | Code vulnerabilities |
| SCA | Snyk, Dependabot, Trivy | Vulnerable dependencies |
| Container | Trivy, Grype | Image vulnerabilities |
| IaC | Checkov, tfsec | Infrastructure misconfigurations |
| DAST | ZAP, Nuclei | Runtime vulnerabilities |
| Runtime | Falco, Tetragon | Anomalous behavior |

The platform team configures these tools once, integrates them into the standardized CI/CD pipeline, and sets policies for severity thresholds. Critical vulnerabilities block deployment automatically. High-severity findings create tickets with SLA-driven deadlines. Medium and low findings are tracked but do not block.
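
The severity policy described above can be sketched as a small triage function. The finding format (`id`, `severity`) is hypothetical, and treating an unrecognized severity as blocking is an assumption made here so the gate fails closed.

```python
def triage(findings):
    """Return (deploy_allowed, decisions) for a list of scanner findings."""
    actions = {"critical": "block", "high": "ticket", "medium": "track", "low": "track"}
    decisions = [
        # Unknown severities default to "block" so that surprises stop the deploy.
        (finding["id"], actions.get(finding["severity"].lower(), "block"))
        for finding in findings
    ]
    deploy_allowed = all(action != "block" for _, action in decisions)
    return deploy_allowed, decisions
```

Findings mapped to `ticket` would be handed to the issue tracker with the relevant SLA deadline; that integration is left out of the sketch.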

Supply Chain Security

Software supply chain attacks have driven adoption of:

  • SLSA Level 3 build provenance for all artifacts
  • Sigstore/cosign for container image signing
  • SBOM generation (SPDX or CycloneDX) for every deployed artifact
  • VEX (Vulnerability Exploitability eXchange) documents for dependency vulnerabilities

The platform automates all of this. Developers do not generate SBOMs or sign images manually — the CI/CD pipeline does it transparently.

Developer Experience as a Metric

The most forward-thinking platform teams have adopted Developer Experience (DevEx) as a first-class metric, measured through a combination of quantitative and qualitative signals:

DORA Metrics (Quantitative)

The four DORA metrics remain the gold standard for measuring software delivery performance:

| Metric | Elite Performer Threshold | How Platform Engineering Helps |
| --- | --- | --- |
| Deployment frequency | On-demand (multiple per day) | Self-service deploy, automated pipelines |
| Lead time for changes | Less than 1 hour | Pre-built templates, AI test selection |
| Change failure rate | Less than 5% | Automated scanning, canary deployments |
| Time to restore service | Less than 1 hour | Automated rollback, incident tooling |
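
For teams that want to compute these four numbers from their own deployment log, a minimal sketch follows. The `Deployment` record and its fields are assumptions; in practice the data would come from the CI/CD system and the incident tracker.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None


def dora_metrics(deployments, window_days):
    """Compute the four DORA metrics over a window; durations are in minutes."""
    lead = [(d.deployed_at - d.committed_at).total_seconds() / 60 for d in deployments]
    failures = [d for d in deployments if d.failed]
    restore = [
        (d.restored_at - d.deployed_at).total_seconds() / 60
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deploys_per_day": len(deployments) / window_days,
        "median_lead_time_min": median(lead) if lead else None,
        "change_failure_rate": len(failures) / len(deployments) if deployments else None,
        "median_time_to_restore_min": median(restore) if restore else None,
    }
```

Medians are used rather than means so that a single slow deployment or long outage does not dominate the result.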

SPACE Framework (Qualitative)

The SPACE framework (Satisfaction and well-being, Performance, Activity, Communication and collaboration, Efficiency and flow) captures what DORA misses: the human experience of using the platform. Useful signals include:

  • Developer satisfaction surveys — Quarterly surveys asking developers to rate the platform on a 1-10 scale
  • Time-to-first-deploy — How long does it take a new hire to deploy their first change to production? (Target: <1 day)
  • Cognitive load index — How many tools, systems, and processes must a developer understand to do their job? (Target: minimal, the platform abstracts the rest)
  • Toil ratio — What percentage of developer time is spent on undifferentiated infrastructure work vs feature development? (Target: <10%)

Measuring Platform Adoption

Platform teams should track:

  • Portal adoption — What percentage of developers use the portal weekly?
  • Template usage — What percentage of new services use platform templates vs custom setups?
  • Self-service ratio — What percentage of infrastructure requests are self-served vs ticket-based?
  • Time-to-provision — How long from request to resource availability?
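
These four measures reduce to simple ratios plus one duration. A sketch of a report function, with hypothetical parameter names and provisioning times given in minutes:

```python
from statistics import median


def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place; 0.0 when the denominator is zero."""
    return round(100 * part / whole, 1) if whole else 0.0


def adoption_report(weekly_portal_users, developers, templated_services,
                    new_services, self_served, requests, provision_minutes):
    """Summarize the four adoption measures listed above."""
    return {
        "portal_adoption_pct": pct(weekly_portal_users, developers),
        "template_usage_pct": pct(templated_services, new_services),
        "self_service_ratio_pct": pct(self_served, requests),
        "median_time_to_provision_min": median(provision_minutes) if provision_minutes else None,
    }
```

Tracking the same report every month shows whether adoption is trending up, which tends to matter more than any single reading.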

Building Your IDP: A 12-Week Roadmap

For teams starting from scratch, here is a pragmatic roadmap:

Weeks 1-3: Foundation

  • Deploy Backstage with basic service catalog
  • Register existing services (name, owner, repo, docs link)
  • Create your first software template for the most common service type
  • Set up a platform team channel for developer feedback

Weeks 4-6: CI/CD Standardization

  • Define a standard CI/CD pipeline for your primary language/framework
  • Integrate security scanning (SAST, SCA, container scanning)
  • Implement automated canary deployments for production
  • Measure baseline DORA metrics

Weeks 7-9: Self-Service Infrastructure

  • Build Terraform modules for common resources (database, cache, queue)
  • Expose them through Backstage actions or a self-service API
  • Implement cost tagging and visibility
  • Deploy OPA/Kyverno guardrails

Weeks 10-12: Polish and Measure

  • Run a developer satisfaction survey
  • Measure time-to-first-deploy for a mock new hire
  • Identify top 3 developer pain points and address them
  • Document the platform architecture and publish it in Backstage

Frequently Asked Questions

Does platform engineering eliminate the need for DevOps engineers?

No. Platform engineering reorganizes DevOps work rather than eliminating it. DevOps engineers become platform engineers: instead of supporting individual teams, they build and maintain the shared platform. The skills are the same (infrastructure, automation, reliability), but the scope shifts from team-level to organization-level.

How big should a platform team be?

A common ratio is 1 platform engineer per 15-25 application developers. A 200-person engineering org typically needs 8-12 platform engineers. Start smaller (3-4 people) and grow based on demand.
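
The arithmetic behind that range can be checked with a short calculation. It assumes the 200-person figure includes the platform engineers themselves, so the platform headcount p satisfies p = (200 - p) / ratio; the function name is hypothetical.

```python
def platform_team_size(total_engineers: int, devs_per_platform_engineer: int) -> float:
    """Solve p = (total - p) / ratio for p, the platform headcount."""
    return total_engineers / (devs_per_platform_engineer + 1)


low = platform_team_size(200, 25)   # one platform engineer per 25 developers
high = platform_team_size(200, 15)  # one platform engineer per 15 developers
print(f"{low:.1f} to {high:.1f} platform engineers")
```

Under that assumption the result is roughly 7.7 to 12.5, which rounds to the 8-12 range quoted above.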

Is Backstage the only option for a developer portal?

Backstage is the most popular open-source option, but alternatives exist. Port, Cortex, and OpsLevel offer commercial developer portals with less operational overhead. Some teams build custom portals on top of their existing tools. However, Backstage’s plugin ecosystem and community make it the default choice for most organizations.

What if developers resist using the platform?

Resistance usually comes from two sources: the platform does not solve their actual problems, or it feels like a constraint rather than an enabler. The fix is the same: talk to developers, understand their pain points, and build the platform around their needs — not around what the platform team thinks they need. Make the platform the path of least resistance, not a mandate.

How do you handle teams with unique requirements?

The platform should cover 80% of common needs through standardized paths. For the remaining 20%, provide escape hatches — the ability to customize pipelines, bring your own Terraform modules, or request non-standard resources through a lightweight review process. The goal is “golden paths, not golden cages.”