Building Dynamic Frontend Preview Environments with Relay Support on GCP Cloud Build and Cloud Run


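Our frontend/backend collaboration has long had one point of friction. A frontend developer builds UI on a feature branch that depends on the API from a matching backend branch. When the Pull Request goes up for review, product managers and testers have no direct way to preview the change. They have to pull both branches locally, install dependencies, and start the services. The process is tedious, and inconsistent local environments regularly produce "works on my machine" problems that slow iteration down considerably.

What we needed was not more documentation or verbal syncing but an automated preview system. The goal is clear: whenever a PR is opened or updated, the system builds the full-stack environment for that PR (frontend plus backend) and deploys it to a temporary, isolated environment reachable through a public URL. That URL is posted automatically as a comment on the PR so team members can click through to the preview.

The initial technology choices follow our existing setup: Google Cloud Platform (GCP) for infrastructure, with every application already containerized. The frontend uses React and Relay; the backend is Node.js. The challenge is that the frontend is a statically built single-page application (SPA) whose Relay environment normally needs the API backend address at build time, whereas in a preview environment the backend address is generated dynamically and differs on every deployment. Injecting that dynamic backend address cleanly into a static frontend container within the CI/CD flow is the core technical problem of this write-up.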

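Architecture and Technology Decisions

The whole flow is event-driven. A PR event in the Git repository is the starting point; it triggers the automation on GCP, which ends with a usable preview environment.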

graph TD
    A[Developer Pushes to PR Branch] --> B{GitHub Webhook};
    B --> C[GCP Cloud Build Trigger];
    C --> D[Cloud Build Pipeline Execution];
    D --> E{Build & Push Backend Image};
    D --> F{Build & Push Frontend Image};
    E --> G[Deploy Backend to Cloud Run];
    G --> H{Get Backend Service URL};
    F & H --> I[Deploy Frontend to Cloud Run with Backend URL];
    I --> J{Get Frontend Service URL};
    J --> K[Post URL to GitHub PR Comment];

    subgraph GCP Project
        C
        D
        E
        F
        G
        H
        I
        J
    end

    style F fill:#cde4ff
    style I fill:#cde4ff
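
  1. CI/CD orchestration: GCP Cloud Build
    We chose Cloud Build as the CI/CD tool because it integrates natively with the GCP ecosystem (Container Registry, Cloud Run, and so on), is simple to configure, and is billed by build time, which suits bursty, parallel build workloads like this one. All build and deploy steps are defined in a single cloudbuild.yaml file.

  2. Application hosting: GCP Cloud Run
    Cloud Run was the obvious choice. It is a fully managed serverless container platform that scales with traffic and can scale down to zero. For preview environments, which receive no traffic most of the time, the cost is very low. Each PR branch gets its own Cloud Run services, so the environments are isolated from one another.

  3. The core challenge: dynamic configuration injection
    A Relay frontend needs an explicit GraphQL server URL when it initializes. That URL is usually written in through a build-time environment variable (such as process.env.REACT_APP_API_URL). In our case, however, the backend URL is not known until the backend has been deployed.

    A common mistake is to hard-code a placeholder URL while building the frontend image. That cannot work: the value is baked into the bundle at build time, and the real URL does not exist yet. The right approach is to defer configuration injection, completing the final configuration when the container starts rather than when it is built. We use a "template replacement" scheme for this runtime configuration.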

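Implementing the Cloud Build Pipeline

cloudbuild.yaml is the command center of the whole automation. It has to handle branch-name parsing, image builds, service deployment, and more. A real project's configuration will be more involved, but the core logic looks like this: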

# cloudbuild.yaml
# This pipeline builds and deploys both backend and frontend services for a PR preview.
steps:
  # Step 1: Install dependencies for scripting steps
  - name: 'gcr.io/cloud-builders/npm'
    id: 'Install Deps'
    args: ['install']

  # Step 2: Dynamically determine the service name based on branch or PR number.
  # This ensures each PR gets a unique, identifiable service name.
  - name: 'node'
    id: 'Set Service Name'
    entrypoint: 'node'
    args:
      - '-e'
      - |
        const branchName = "${_BRANCH_NAME}".toLowerCase().replace(/[\/_\.]/g, '-');
        const safeBranchName = branchName.slice(0, 20); // Cloud Run service names have length limits.
        // Plain concatenation: Cloud Build would treat a JS template-literal placeholder as a substitution.
        const serviceName = 'pr-preview-' + safeBranchName;
        console.log('SERVICE_NAME=' + serviceName);
        // The trailing newline matters: later steps append to this file with >>.
        require('fs').writeFileSync('./workspace-vars.env', 'SERVICE_NAME=' + serviceName + '\n');
    # _BRANCH_NAME is a substitution provided by the Cloud Build trigger.

  # Step 3: Build and push the backend image.
  # The image is tagged with the commit SHA for traceability.
  - name: 'gcr.io/cloud-builders/docker'
    id: 'Build Backend'
    args:
      - 'build'
      - '-t'
      - 'gcr.io/$PROJECT_ID/my-backend:${SHORT_SHA}'
      - './backend'
      - '-f'
      - './backend/Dockerfile'

  - name: 'gcr.io/cloud-builders/docker'
    id: 'Push Backend'
    args: ['push', 'gcr.io/$PROJECT_ID/my-backend:${SHORT_SHA}']

  # Step 4: Deploy the backend to Cloud Run.
  # We deploy it first to obtain its unique URL.
  - name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
    id: 'Deploy Backend'
    entrypoint: 'bash'
    args:
      - '-c'
      - |
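        source ./workspace-vars.env
        # NOTE: --no-allow-unauthenticated means callers need an IAM identity token.
        # If the browser calls this URL directly, plan for that (and for CORS) or the requests will be rejected.
        gcloud run deploy $$SERVICE_NAME-backend \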
          --image=gcr.io/$PROJECT_ID/my-backend:${SHORT_SHA} \
          --region=us-central1 \
          --platform=managed \
          --no-allow-unauthenticated \
          --set-env-vars=DATABASE_URL=... # Add other backend envs here
        
        # Capture the backend URL and write it to the shared environment file.
        BACKEND_URL=$$(gcloud run services describe $$SERVICE_NAME-backend --platform=managed --region=us-central1 --format='value(status.url)')
        echo "BACKEND_URL=$$BACKEND_URL" >> ./workspace-vars.env
    
  # Step 5: Build and push the frontend image.
  - name: 'gcr.io/cloud-builders/docker'
    id: 'Build Frontend'
    args:
      - 'build'
      - '-t'
      - 'gcr.io/$PROJECT_ID/my-frontend:${SHORT_SHA}'
      - './frontend'
      - '-f'
      - './frontend/Dockerfile'

  - name: 'gcr.io/cloud-builders/docker'
    id: 'Push Frontend'
    args: ['push', 'gcr.io/$PROJECT_ID/my-frontend:${SHORT_SHA}']

  # Step 6: Deploy the frontend to Cloud Run.
  # This is the critical step where we inject the dynamic backend URL.
  - name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
    id: 'Deploy Frontend'
    entrypoint: 'bash'
    args:
      - '-c'
      - |
        source ./workspace-vars.env
        gcloud run deploy $$SERVICE_NAME-frontend \
          --image=gcr.io/$PROJECT_ID/my-frontend:${SHORT_SHA} \
          --region=us-central1 \
          --platform=managed \
          --allow-unauthenticated \
          --set-env-vars="API_URL=$$BACKEND_URL/graphql" # This env var is used by the container at startup.
    
        # Optional: Post the frontend URL back to the PR as a comment.
        FRONTEND_URL=$$(gcloud run services describe $$SERVICE_NAME-frontend --platform=managed --region=us-central1 --format='value(status.url)')
        # ... logic to call GitHub/GitLab API using curl ...

# Available substitutions from Cloud Build trigger: $PROJECT_ID, $SHORT_SHA, $_BRANCH_NAME
substitutions:
  _BRANCH_NAME: 'main' # Default value

options:
  logging: CLOUD_LOGGING_ONLY

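The key point is the hand-off between step 4 and step 6. We deploy the backend first, capture its generated URL with a gcloud command, and write it to a temporary file, workspace-vars.env. When deploying the frontend, we read the URL back from that file and pass it through the --set-env-vars flag as an environment variable on the frontend's Cloud Run instances.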
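The naming logic from the 'Set Service Name' step can be sketched as a standalone function, which makes it easier to test outside the pipeline. This is only an illustration: `toServiceName`, `SERVICE_PREFIX`, and `MAX_BRANCH_PART` are hypothetical names, and the sanitizing here is slightly stricter than the pipeline step (it replaces every character outside lowercase letters, digits, and hyphens, and trims stray hyphens so the appended `-backend` / `-frontend` suffix does not produce a double hyphen).

```typescript
// Hypothetical helper mirroring the 'Set Service Name' pipeline step.
const SERVICE_PREFIX = 'pr-preview';
const MAX_BRANCH_PART = 20; // same cut-off as the pipeline step

function toServiceName(branchName: string): string {
  const slug = branchName
    .toLowerCase()
    .replace(/[^a-z0-9-]+/g, '-') // anything outside [a-z0-9-] becomes a hyphen
    .replace(/-+/g, '-') // collapse runs of hyphens
    .slice(0, MAX_BRANCH_PART) // keep the name short
    .replace(/^-+|-+$/g, ''); // no leading or trailing hyphen after the cut
  // Fall back to a fixed word if nothing usable is left.
  return `${SERVICE_PREFIX}-${slug === '' ? 'unnamed' : slug}`;
}

console.log(toServiceName('feature/New_Button.v2')); // pr-preview-feature-new-button-v
```

Note that two long branch names sharing the same first 20 characters would map to the same service; appending the PR number is one way to avoid that.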

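Runtime Configuration for the Frontend Container

The frontend container now has an environment variable named API_URL. But the React app is a set of compiled static files and cannot read server-side environment variables directly. The fix is to generate a configuration file when the container starts and have index.html load it.

1. Nginx configuration

We serve the static files with Nginx. The configuration stays simple: it handles routing and falls back to the main index.html.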

# frontend/nginx.conf
server {
  listen 8080;
  server_name localhost;

  # Serve the generated config file
  location /config.js {
    root /usr/share/nginx/html;
    add_header 'Cache-Control' 'no-store, no-cache, must-revalidate, proxy-revalidate, max-age=0';
  }

  location / {
    root /usr/share/nginx/html;
    index index.html;
    try_files $uri /index.html;
  }
}

2. Runtime configuration template

In the public directory we create a template for the configuration file.

// frontend/public/config.template.js
// This is a template file. The placeholder will be replaced by the entrypoint script.
window.APP_CONFIG = {
  API_URL: '${API_URL}'
};

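3. Dockerfile and entrypoint script

This is the glue of the whole scheme. The Dockerfile builds the React app and installs a custom entrypoint script, entrypoint.sh. The script runs before Nginx starts and generates the configuration file.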

# frontend/Dockerfile

# --- Build Stage ---
FROM node:18-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
# Note: We don't need REACT_APP_API_URL here anymore.
RUN npm run build

# --- Production Stage ---
FROM nginx:1.23-alpine
# Copy built static files
COPY --from=build /app/build /usr/share/nginx/html
# Copy Nginx config
COPY nginx.conf /etc/nginx/conf.d/default.conf
# Copy the config template and the entrypoint script
COPY public/config.template.js /usr/share/nginx/html/config.template.js
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh

# The container will run this script on startup.
ENTRYPOINT ["/entrypoint.sh"]
# The script will then start Nginx.
CMD ["nginx", "-g", "daemon off;"]

The entrypoint.sh script is the heart of it. It uses the envsubst tool to replace environment variables in the template.

#!/bin/sh
# frontend/entrypoint.sh

# This script generates the final config.js from a template and environment variables.
# It allows us to inject dynamic configuration into a static frontend build at runtime.

# Path to the template and the final output file.
TEMPLATE_FILE="/usr/share/nginx/html/config.template.js"
OUTPUT_FILE="/usr/share/nginx/html/config.js"

echo "Generating config.js from environment variables..."

# Use envsubst to substitute environment variables in the template.
# We must export the variables for envsubst to see them.
export API_URL

# Important: The single quotes around '$API_URL' tell envsubst which variables to substitute.
# If we had more variables, it would be '$VAR1,$VAR2'.
envsubst '$API_URL' < "$TEMPLATE_FILE" > "$OUTPUT_FILE"

echo "config.js generated successfully:"
cat "$OUTPUT_FILE"

# The CMD from the Dockerfile is passed to this script as its arguments.
# `exec "$@"` replaces this shell with that command (nginx -g 'daemon off;').
exec "$@"
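To make the behaviour of `envsubst '$API_URL'` concrete, here is a small TypeScript imitation of it. It is not the real tool, only a sketch of the two properties the entrypoint relies on: variables outside the allow-list are left untouched, and an allowed variable that is unset becomes an empty string. The `substitute` function and the `backend.example.invalid` URL are made up for illustration.

```typescript
// Illustrative imitation of `envsubst '$API_URL'`; not the real tool.
function substitute(
  template: string,
  allowed: string[],
  env: Record<string, string | undefined>,
): string {
  // Match both the $NAME and the ${NAME} forms.
  const pattern = /\$(?:\{([A-Za-z_][A-Za-z0-9_]*)\}|([A-Za-z_][A-Za-z0-9_]*))/g;
  return template.replace(pattern, (match: string, braced?: string, bare?: string) => {
    const name = braced ?? bare ?? '';
    // Variables outside the allow-list are left exactly as written.
    if (!allowed.includes(name)) return match;
    // Allowed but unset variables become an empty string.
    return env[name] ?? '';
  });
}

// Same content as frontend/public/config.template.js.
const configTemplate = "window.APP_CONFIG = {\n  API_URL: '${API_URL}'\n};";

const renderedConfig = substitute(configTemplate, ['API_URL'], {
  API_URL: 'https://backend.example.invalid/graphql', // placeholder URL
});
console.log(renderedConfig);
```

The allow-list is what keeps any other `$`-prefixed text in the template safe from accidental replacement.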

4. Consuming the configuration in the frontend

Finally, index.html loads the generated configuration file, and the Relay environment reads from it.

<!-- frontend/public/index.html -->
<!DOCTYPE html>
<html lang="en">
  <head>
    <!-- ... other head elements ... -->
    <script src="%PUBLIC_URL%/config.js"></script>
  </head>
  <body>
    <!-- ... -->
  </body>
</html>

The Relay Environment setup then becomes straightforward.

// frontend/src/RelayEnvironment.ts
import {
  Environment,
  Network,
  RecordSource,
  Store,
  RequestParameters,
  Variables,
} from 'relay-runtime';

// Declare the global that /config.js defines, so TypeScript accepts window.APP_CONFIG.
declare global {
  interface Window {
    APP_CONFIG?: { API_URL?: string };
  }
}

// Access the configuration from the global window object.
const API_URL = window.APP_CONFIG?.API_URL ?? '';

if (!API_URL || API_URL.includes('undefined')) {
  // A simple sanity check for production environments.
  // In a real project, this should be handled more gracefully, perhaps showing an error page.
  console.error("Critical Error: API_URL is not configured. Check the deployment.");
  // You might want to throw an error to halt execution.
  // throw new Error("API URL is missing");
}


async function fetchGraphQL(params: RequestParameters, variables: Variables) {
  const response = await fetch(API_URL, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      // ... other headers like Authorization
    },
    body: JSON.stringify({
      query: params.text,
      variables,
    }),
  });

  return await response.json();
}

export default new Environment({
  network: Network.create(fetchGraphQL),
  store: new Store(new RecordSource()),
});

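This approach has proven robust in a real project. It fully decouples build-time from runtime configuration, which makes the frontend container a portable, environment-agnostic artifact.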
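The sanity check in RelayEnvironment.ts could be pulled into a small pure function so the failure modes are explicit and testable. The sketch below is one possible shape, not the project's actual code; `resolveApiUrl` and `AppConfig` are hypothetical names. It distinguishes three cases: the variable was never set, the template placeholder was never replaced, and the value is not an absolute URL (for example `/graphql`, which is what the pipeline would produce if the backend URL came back empty).

```typescript
// Shape of the object that /config.js assigns to window.APP_CONFIG.
interface AppConfig {
  API_URL?: string;
}

// Hypothetical validator: returns a usable endpoint or throws with a reason.
function resolveApiUrl(config: AppConfig | undefined): string {
  const raw = (config?.API_URL ?? '').trim();
  if (raw === '') {
    throw new Error('API_URL is empty: was the variable set on the Cloud Run service?');
  }
  if (raw.includes('${')) {
    throw new Error('API_URL still contains the template placeholder: did the entrypoint run?');
  }
  if (!/^https?:\/\/[^\s/]+/.test(raw)) {
    throw new Error(`API_URL is not an absolute http(s) URL: ${raw}`);
  }
  return raw;
}
```

In the browser it would be called as `resolveApiUrl(window.APP_CONFIG)`, with the thrown error caught at the app root to render an error page.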

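A First Cut of the IDP Panel: Iterating Quickly with UnoCSS

To manage these preview environments, we quickly built an internal panel. UnoCSS was a great help here: it lets us write styles without leaving the HTML/JSX, which noticeably speeds up UI work.

Below is a simple React component that lists the currently active preview environments. The data can come from a small backend API that queries the list of Cloud Run services through gcloud commands or the GCP SDK.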

// components/PreviewList.jsx
// This component uses UnoCSS for styling.
// In your project setup, UnoCSS would be configured to scan this file.

import React from 'react';

// Mock data, in a real app this would come from an API call.
const previews = [
  { id: 1, name: 'pr-preview-feature-new-button', url: 'https://...', status: 'Active' },
  { id: 2, name: 'pr-preview-fix-login-modal', url: 'https://...', status: 'Active' },
  { id: 3, name: 'pr-preview-refactor-api-client', url: 'https://...', status: 'Terminating' },
];

function StatusBadge({ status }) {
  const baseClasses = 'px-2 py-1 text-xs font-semibold rounded-full';
  const statusClasses = {
    Active: 'bg-green-100 text-green-800',
    Terminating: 'bg-yellow-100 text-yellow-800',
  };
  return <span className={`${baseClasses} ${statusClasses[status]}`}>{status}</span>;
}

export function PreviewList() {
  return (
    <div className="p-4 sm:p-6 lg:p-8 bg-gray-50 min-h-screen">
      <div className="max-w-4xl mx-auto">
        <h1 className="text-2xl font-bold text-gray-900 mb-4">
          Active Preview Environments
        </h1>
        <div className="bg-white shadow-md rounded-lg overflow-hidden">
          <ul className="divide-y divide-gray-200">
            {previews.map((preview) => (
              <li key={preview.id} className="p-4 flex justify-between items-center hover:bg-gray-50 transition-colors">
                <div>
                  <p className="font-mono text-sm text-gray-800">{preview.name}</p>
                  <a 
                    href={preview.url} 
                    target="_blank" 
                    rel="noopener noreferrer" 
                    className="text-xs text-blue-600 hover:underline"
                  >
                    {preview.url}
                  </a>
                </div>
                <StatusBadge status={preview.status} />
              </li>
            ))}
          </ul>
        </div>
      </div>
    </div>
  );
}

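Because UnoCSS is atomic and generates styles on demand, the panel's CSS is tiny and loads quickly. For internal tools built for developers, efficiency and practicality come first, and UnoCSS fits that priority well.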
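The mock `previews` array in the component could be derived from a Cloud Run service listing along the following lines. This is a sketch under assumptions: the `CloudRunService` interface models only the two fields used here from the JSON that `gcloud run services list --format=json` prints, the `-frontend` suffix follows the naming in the pipeline, and `toPreviewRows` is a hypothetical name. Every row is marked 'Active'; detecting a 'Terminating' state is left out.

```typescript
// Assumed subset of one entry in the JSON printed by
// `gcloud run services list --format=json`; real entries carry many more fields.
interface CloudRunService {
  metadata: { name: string };
  status?: { url?: string };
}

// Row shape consumed by the PreviewList component.
interface PreviewRow {
  id: number;
  name: string;
  url: string;
  status: 'Active';
}

const PREVIEW_PREFIX = 'pr-preview-';
const FRONTEND_SUFFIX = '-frontend';

function toPreviewRows(services: CloudRunService[]): PreviewRow[] {
  return services
    // Keep only the frontend service of each preview environment.
    .filter(
      (s) =>
        s.metadata.name.startsWith(PREVIEW_PREFIX) &&
        s.metadata.name.endsWith(FRONTEND_SUFFIX),
    )
    // Skip services that have no URL yet.
    .filter((s) => Boolean(s.status?.url))
    .map((s, index): PreviewRow => ({
      id: index + 1,
      name: s.metadata.name.slice(0, -FRONTEND_SUFFIX.length),
      url: s.status?.url ?? '',
      status: 'Active',
    }));
}
```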

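Limitations and Future Iterations

This system has greatly improved our collaboration, but it is not without drawbacks. In a real production setting, the following points also need attention:

  1. Cost management: Scaling to zero on Cloud Run saves a lot, but with many PRs the accumulated preview environments still cost money (mainly from minimum instance counts and container image storage). A lifecycle policy is a must, for example automatically triggering another Cloud Build job to delete the corresponding Cloud Run services when a PR is merged or closed.

  2. Data isolation: All preview environments currently connect to the same shared development database, so test data from different branches can interfere with each other. A more thorough solution is to create a separate database instance or schema per preview environment, for example with Cloud SQL's clone feature, but that adds significant architectural complexity and cost.

  3. Security: The preview frontend is currently public, protected only by a hard-to-guess URL. For projects that handle sensitive data this is unacceptable. GCP Identity-Aware Proxy (IAP) can be integrated to add access control based on team members' Google accounts to each preview environment.

  4. Build speed: As the project grows, docker build and npm install become the bottleneck. This calls for deeper optimization of the multi-stage Dockerfile, use of Cloud Build's caching, and possibly more efficient build tooling.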
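The lifecycle policy from item 1 could start from something as small as a function that produces the delete commands for one preview environment, to be run by a cleanup build when a PR closes. The sketch below only builds the command strings and does not execute anything; `cleanupCommands` is a hypothetical name, and the `-backend` / `-frontend` suffixes and the `us-central1` default follow the pipeline shown earlier.

```typescript
// Only names made of lowercase letters, digits, and hyphens are accepted,
// so the generated command line cannot be altered by a crafted branch name.
const SERVICE_NAME_PATTERN = /^[a-z][a-z0-9-]*$/;

// Hypothetical helper: returns the gcloud commands that remove both
// services of one preview environment.
function cleanupCommands(serviceName: string, region: string = 'us-central1'): string[] {
  if (!SERVICE_NAME_PATTERN.test(serviceName) || !SERVICE_NAME_PATTERN.test(region)) {
    throw new Error(`Refusing to build a command from: ${serviceName} / ${region}`);
  }
  return ['backend', 'frontend'].map(
    (suffix) =>
      `gcloud run services delete ${serviceName}-${suffix} ` +
      `--region=${region} --platform=managed --quiet`,
  );
}

for (const command of cleanupCommands('pr-preview-fix-login')) {
  console.log(command);
}
```

The `--quiet` flag suppresses the interactive confirmation, which is needed when the command runs unattended in a build step.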

