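Server Operations Notes: GitLab CI/CD in Practice

This is the 13th article in the Server Operations Notes series and focuses on practical use of GitLab CI/CD. It moves from basic concepts to advanced configuration and from single-host deployment to Kubernetes integration, with configurations meant to be adapted and reused.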

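1. GitLab CI Basics

1.1 Architecture Overview

The core GitLab CI/CD architecture consists of three roles:

GitLab Server: hosts the code repository, parses .gitlab-ci.yml, and schedules pipeline jobs
GitLab Runner: the agent that executes individual pipeline jobs; it can be deployed on any machine
Executor: the execution environment inside the Runner (Docker, Shell, Kubernetes, and so on)

Workflow: a developer pushes code → GitLab parses the CI configuration → jobs are scheduled stage by stage → a Runner fetches the code and executes the job → results are reported back to the GitLab UI.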

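1.2 Runner Types

Type	Description	Typical use
Shared Runner	Shared globally, available to all projects (called an instance runner in recent GitLab versions)	Small teams, common resource pool
Group Runner	Shared at group level, available to projects in the group	Department or team isolation
Specific Runner	Bound to a specific project (called a project runner in recent GitLab versions)	Sensitive projects, special environment requirements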

1.3 Core .gitlab-ci.yml Syntax

# Define the pipeline stages (executed in order)
stages:
  - build
  - test
  - deploy

# Define a job
build-app:
  stage: build
  image: node:18-alpine
  script:
    - npm ci
    - npm run build
  artifacts:
    paths:
      - dist/

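Key concepts:

stages: defines the order of stages; jobs in the same stage run in parallel, and stages run one after another
jobs: any top-level key that is not a reserved keyword; each job belongs to one stage (test if none is specified)
script: the list of commands the job actually runs (required)
image: the Docker image to use (takes effect with the Docker Executor)
tags: labels used to select the Runners that may pick up the job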
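
The tags keyword is easiest to see in a small job. The sketch below routes a job to Runners by tag; the tag names docker and linux are assumptions and have to match whatever tags your own Runners carry.

```yaml
# Sketch: route a job to Runners by tag (tag names are assumptions).
stages:
  - build

build-on-docker-runner:
  stage: build
  image: node:18-alpine
  tags:
    - docker        # a Runner must have ALL listed tags to pick up the job
    - linux
  script:
    - node --version
```

If no Runner carries every listed tag, the job stays pending until one becomes available.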

2. Runner Configuration

2.1 Installation and Registration

Install GitLab Runner (Ubuntu/Debian shown here):

# Add the official repository
curl -L "https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.deb.sh" | sudo bash

# Install
sudo apt-get install gitlab-runner

# Check the version
gitlab-runner --version

Register the Runner:

sudo gitlab-runner register \
  --non-interactive \
  --url "https://gitlab.example.com/" \
  --registration-token "PROJECT_TOKEN_HERE" \
  --executor "docker" \
  --docker-image "alpine:latest" \
  --description "my-docker-runner" \
  --tag-list "docker,linux" \
  --run-untagged="true" \
  --locked="false"

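After registration, the configuration file is located at /etc/gitlab-runner/config.toml.

Note: registration tokens (--registration-token) are deprecated in recent GitLab versions. There, a runner is first created in the GitLab UI and then registered with its authentication token via --token, and settings such as tags are managed in the UI rather than on the command line. The command above uses the older registration-token flow.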

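2.2 Docker Executor Configuration

The Docker Executor is the most commonly used executor. Each job runs in its own container, so jobs are isolated from one another by design.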

# /etc/gitlab-runner/config.toml
concurrent = 4          # maximum number of concurrent jobs
check_interval = 3      # polling interval in seconds

[[runners]]
  name = "docker-runner"
  url = "https://gitlab.example.com/"
  token = "RUNNER_TOKEN"
  executor = "docker"

  [runners.docker]
    image = "alpine:latest"           # default image
    privileged = false                # privileged mode (required for Docker-in-Docker)
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/cache"]              # mount a cache volume
    shm_size = 0                      # shared memory size
    pull_policy = "if-not-present"    # pull policy: always / if-not-present / never
    allowed_images = ["ruby:*", "python:*", "node:*"]  # image allowlist

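Docker-in-Docker (DinD): use this when the CI job itself needs to build Docker images. The Runner must run these jobs with privileged = true, and with TLS enabled the Runner's volumes setting should also include "/certs/client" so that the job container can read the certificates generated by the dind service.

build-docker-image:
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  variables:
    DOCKER_TLS_CERTDIR: "/certs"
  script:
    # Build with the full registry path so that the tag being pushed actually exists
    - docker build -t registry.example.com/my-app:$CI_COMMIT_SHA .
    - docker push registry.example.com/my-app:$CI_COMMIT_SHA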

2.3 Shell Executor Configuration

The Shell Executor runs commands directly on the host. It has the least overhead and the weakest isolation.

[[runners]]
  name = "shell-runner"
  url = "https://gitlab.example.com/"
  token = "RUNNER_TOKEN"
  executor = "shell"

  [runners.custom_build_dir]
    enabled = true    # allow custom build directories

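⚠️ With the Shell Executor, all jobs share the same host environment, so watch out for dependency conflicts. Use it only for specific cases, such as jobs that need direct access to host hardware or GPUs.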

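2.4 Autoscaling Configuration (Docker Machine / Autoscaler)

At larger scale, the Runner can be configured to autoscale, creating and destroying Runner instances on demand. Docker Machine is no longer maintained by Docker; GitLab maintains its own fork for this executor.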

[[runners]]
  name = "autoscale-runner"
  url = "https://gitlab.example.com/"
  token = "RUNNER_TOKEN"
  executor = "docker+machine"
  limit = 10                    # maximum number of instances

  [runners.machine]
    IdleCount = 2               # number of idle instances to keep
    IdleTime = 600              # idle timeout in seconds
    MaxBuilds = 100             # maximum number of builds per instance
    MachineName = "runner-%s"
    MachineDriver = "amazonec2"
    MachineOptions = [
      "amazonec2-instance-type=t3.medium",
      "amazonec2-region=cn-north-1",
      "amazonec2-vpc-id=vpc-xxx",
      "amazonec2-subnet-id=subnet-xxx",
      "amazonec2-security-group=sg-xxx",
    ]

    [[runners.machine.autoscaling]]
      Periods = ["* * 8-18 * * mon-fri *"]   # working hours (08:00-18:59, Monday to Friday)
      IdleCount = 5
      IdleTime = 300
      Timezone = "Asia/Shanghai"

    [[runners.machine.autoscaling]]
      Periods = ["* * * * * sat,sun *"]       # weekends
      IdleCount = 0
      IdleTime = 60

3. Pipeline Design

3.1 Stages and Jobs

stages:
  - lint
  - build
  - test
  - security
  - deploy

# Jobs in the same stage run in parallel
lint-js:
  stage: lint
  script:
    - npx eslint . --max-warnings=0

lint-python:
  stage: lint
  script:
    - flake8 src/
    - mypy src/

3.2 Dependencies and Passing Artifacts

Use dependencies to control which artifacts a job receives from earlier jobs and to avoid unnecessary downloads.

build-frontend:
  stage: build
  script:
    - npm ci && npm run build
  artifacts:
    paths:
      - dist/frontend/

build-backend:
  stage: build
  script:
    - go build -o bin/server ./cmd/server
  artifacts:
    paths:
      - bin/

deploy-staging:
  stage: deploy
  dependencies:
    - build-frontend     # download artifacts from these two jobs only
    - build-backend
  script:
    - scp dist/frontend/* staging:/var/www/html/
    - scp bin/server staging:/opt/app/

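3.3 Artifacts Configuration

test-unit:
  stage: test
  script:
    - go test ./... -coverprofile=coverage.out -v 2>&1 | tee test-output.txt
  after_script:
    # Raw `go test` output is neither JUnit XML nor Cobertura, so convert it.
    # after_script also runs when the tests fail.
    # Assumption: go-junit-report and gocover-cobertura are installed in the job image.
    - go-junit-report < test-output.txt > junit.xml
    - gocover-cobertura < coverage.out > coverage.xml
  artifacts:
    when: always              # always / on_success / on_failure
    paths:
      - coverage.out
      - test-output.txt
    reports:
      junit: junit.xml        # parsed and shown in the merge request
      coverage_report:
        coverage_format: cobertura
        path: coverage.xml
    expire_in: 7 days         # artifact expiry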

3.4 Cache Configuration

variables:
  npm_config_cache: "$CI_PROJECT_DIR/.npm"

cache:
  key:
    files:
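      - package-lock.json     # the cache key is derived from the file contents
  paths:
    - .npm/                   # npm download cache, reused by `npm ci --cache .npm`
    - node_modules/           # note: `npm ci` deletes node_modules before installing, so this entry only helps jobs that skip `npm ci`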

build:
  stage: build
  script:
    - npm ci --cache .npm
    - npm run build

4. Variable Management

4.1 Predefined Variables

GitLab provides a large number of built-in variables. Commonly used ones include:

# Use them directly in script
deploy-production:
  stage: deploy
  script:
    - echo "Project: $CI_PROJECT_NAME"
    - echo "Branch: $CI_COMMIT_REF_NAME"
    - echo "Commit: $CI_COMMIT_SHA"
    - echo "Short SHA: $CI_COMMIT_SHORT_SHA"
    - echo "Tag: $CI_COMMIT_TAG"
    - echo "MR source branch: $CI_MERGE_REQUEST_SOURCE_BRANCH_NAME"
    - echo "Pipeline ID: $CI_PIPELINE_ID"
    - echo "Job ID: $CI_JOB_ID"
    - echo "Runner description: $CI_RUNNER_DESCRIPTION"
    - echo "Registry: $CI_REGISTRY"
    - echo "Registry image: $CI_REGISTRY_IMAGE"

For the complete list of predefined variables, see the official GitLab documentation → CI/CD → Predefined variables.
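
Predefined variables are not limited to script; they can also drive rules. The job below is a minimal sketch (the job name print-context is made up for illustration) that runs only on the project's default branch and derives an image tag from the branch slug.

```yaml
# Sketch: predefined variables used in rules and in script.
print-context:
  stage: test
  rules:
    # Run only for commits on the default branch (for example main)
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
  script:
    - echo "Building $CI_PROJECT_PATH at $CI_COMMIT_SHORT_SHA"
    - echo "Image tag would be $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG"
```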

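4.2 Custom Variables

Variables can be defined at several levels. Within .gitlab-ci.yml, a job-level variable overrides a global one with the same name. Variables defined under project or group Settings → CI/CD → Variables take precedence over both, and variables supplied when a pipeline is triggered or run manually rank higher still.

# Global variables (apply to every job)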
variables:
  APP_NAME: "my-awesome-app"
  DEPLOY_ENV: "staging"

build:
  stage: build
  variables:
    BUILD_TYPE: "release"       # job-level variable
  script:
    - echo "Building $APP_NAME ($BUILD_TYPE)"
    - make BUILD_TYPE=$BUILD_TYPE
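
How the two YAML levels interact can be sketched with a pair of jobs. The job names are hypothetical, and the outputs in the comments assume that no variable named DEPLOY_ENV is defined anywhere outside this file.

```yaml
# Sketch: a job-level variable overriding a global one of the same name.
variables:
  DEPLOY_ENV: "staging"        # global default for every job

show-default:
  script:
    - echo "$DEPLOY_ENV"       # prints "staging"

show-override:
  variables:
    DEPLOY_ENV: "production"   # job-level value replaces the global one
  script:
    - echo "$DEPLOY_ENV"       # prints "production"
```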

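4.3 Protected and Masked Variables

Configure these under Settings → CI/CD → Variables:

Protected: only available to pipelines running on protected branches or protected tags
Masked: hidden in job logs (shown as [MASKED]); the value must meet the masking requirements, for example a single line of at least 8 characters without spaces, using only the supported character set

# Use Protected + Masked variables (configured in Settings, never exposed in YAML)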
deploy-production:
  stage: deploy
  script:
    - echo "$PROD_DB_PASSWORD" | docker secret create db_password -  # the value appears as [MASKED] in the log
  only:
    - main                          # run on the main branch only
  environment:
    name: production

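4.4 Secrets Management (Vault Integration)

For stricter security requirements, GitLab can integrate with HashiCorp Vault:

# Requires Vault to trust GitLab's ID tokens (JWT auth) and the CI/CD variable
# VAULT_SERVER_URL (and usually VAULT_AUTH_ROLE) to be set for the project
deploy-with-vault:
  stage: deploy
  id_tokens:
    VAULT_ID_TOKEN:
      aud: https://vault.example.com
  secrets:
    DB_PASSWORD:
      vault:
        engine: { name: kv-v2, path: secret }
        path: production/db
        field: password
      token: $VAULT_ID_TOKEN
      file: false            # expose the value itself; by default the variable holds the path to a temporary file
  script:
    - echo "DB password loaded from Vault"   # never print the actual value
    - deploy --db-password "$DB_PASSWORD"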

5. Common Scenarios

5.1 Build → Test → Deploy (Complete Pipeline)

stages:
  - build
  - test
  - deploy

variables:
  DOCKER_REGISTRY: "registry.example.com"

# ============ Build ============
build:
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  script:
    - docker build -t $DOCKER_REGISTRY/$CI_PROJECT_NAME:$CI_COMMIT_SHA .
    - docker push $DOCKER_REGISTRY/$CI_PROJECT_NAME:$CI_COMMIT_SHA

# ============ Test ============
test-unit:
  stage: test
  image: $DOCKER_REGISTRY/$CI_PROJECT_NAME:$CI_COMMIT_SHA
  script:
    - npm test -- --coverage
  coverage: '/All files\s*\|\s*([\d.]+)/'
  artifacts:
    reports:
      junit: junit.xml
      coverage_report:
        coverage_format: cobertura
        path: coverage.xml

test-integration:
  stage: test
  image: $DOCKER_REGISTRY/$CI_PROJECT_NAME:$CI_COMMIT_SHA
  services:
    - postgres:15
    - redis:7
  variables:
    POSTGRES_DB: test_db
    POSTGRES_USER: test
    POSTGRES_PASSWORD: test
    DATABASE_URL: "postgresql://test:test@postgres:5432/test_db"
    REDIS_URL: "redis://redis:6379"
  script:
    - npm run test:integration

# ============ Deploy ============
deploy-staging:
  stage: deploy
  environment:
    name: staging
    url: https://staging.example.com
  script:
    - kubectl set image deployment/$CI_PROJECT_NAME
        app=$DOCKER_REGISTRY/$CI_PROJECT_NAME:$CI_COMMIT_SHA
  only:
    - develop
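  when: manual                  # triggered manually

5.2 Multi-Environment Deployment

.deploy_template: &deploy_template
  stage: deploy
  image:
    name: bitnami/kubectl:latest
    entrypoint: [""]            # the image's default entrypoint is kubectl, which would break the job script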
  script:
    - kubectl config use-context $KUBE_CONTEXT
    - envsubst < k8s/deployment.yml | kubectl apply -f -
    - kubectl rollout status deployment/$CI_PROJECT_NAME -n $K8S_NAMESPACE

deploy-staging:
  <<: *deploy_template
  variables:
    KUBE_CONTEXT: staging
    K8S_NAMESPACE: staging
    REPLICAS: "2"
  environment:
    name: staging
    url: https://staging.example.com
  only:
    - develop

deploy-production:
  <<: *deploy_template
  variables:
    KUBE_CONTEXT: production
    K8S_NAMESPACE: production
    REPLICAS: "5"
  environment:
    name: production
    url: https://www.example.com
  only:
    - main
  when: manual
  allow_failure: false          # the pipeline stays blocked until this manual job is run
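
The envsubst step in the template above reads k8s/deployment.yml and substitutes the job's variables before the manifest is applied. The article does not show that file, so the manifest below is only a hypothetical sketch of what it could look like; the container name, labels, and port are assumptions for illustration.

```yaml
# k8s/deployment.yml (hypothetical): ${...} placeholders are filled in by envsubst
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ${CI_PROJECT_NAME}
  namespace: ${K8S_NAMESPACE}
spec:
  replicas: ${REPLICAS}            # "2" for staging, "5" for production
  selector:
    matchLabels:
      app: ${CI_PROJECT_NAME}
  template:
    metadata:
      labels:
        app: ${CI_PROJECT_NAME}
    spec:
      containers:
        - name: app                # assumed container name
          image: ${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA}
          ports:
            - containerPort: 8080  # assumed application port
```

Because every environment-specific value comes from job variables, the same manifest serves both staging and production.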

5.3 Parallel Jobs

test-suite:
  stage: test
  parallel: 4                   # start 4 parallel jobs
  script:
    - |
      case $CI_NODE_INDEX in
        1) TEST_PATTERN="tests/unit/models/**" ;;
        2) TEST_PATTERN="tests/unit/services/**" ;;
        3) TEST_PATTERN="tests/unit/controllers/**" ;;
        4) TEST_PATTERN="tests/unit/middleware/**" ;;
      esac
    - pytest $TEST_PATTERN --junitxml=results-$CI_NODE_INDEX.xml
  artifacts:
    reports:
      junit:
        - results-*.xml

5.4 Matrix Builds

build-matrix:
  stage: build
  parallel:
    matrix:
      - GO_VERSION: ["1.21", "1.22"]
        OS: ["linux", "darwin"]
        ARCH: ["amd64", "arm64"]
  image: golang:$GO_VERSION
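  script:
    # Include GO_VERSION in the file name so the 8 matrix jobs do not produce identically named artifacts
    - GOOS=$OS GOARCH=$ARCH go build -o bin/app-go$GO_VERSION-$OS-$ARCH ./cmd/app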
  artifacts:
    paths:
      - bin/

6. Artifact Management

6.1 Detailed Artifacts Configuration

build:
  stage: build
  script:
    - make build
    - make docs
  artifacts:
    name: "$CI_PROJECT_NAME-$CI_COMMIT_REF_SLUG-$CI_PIPELINE_ID"
    paths:
      - bin/
      - docs/
      - config/production.yml
    exclude:
      - "**/*.test"             # exclude test files
      - "**/*.log"
    when: on_success            # on_success (default) / on_failure / always
    expire_in: 30 days
    access: all                 # all / developer / none

6.2 Expiry Policy

# Intermediate artifacts that expire quickly
test-results:
  artifacts:
    expire_in: 3 days

# Release artifacts kept permanently
release-package:
  artifacts:
    expire_in: never            # never expires
  only:
    - tags                      # runs for tags only

6.3 Controlling Artifact Dependencies

build-frontend:
  stage: build
  script: npm run build
  artifacts:
    paths: [dist/]

build-backend:
  stage: build
  script: go build ./...
  artifacts:
    paths: [bin/]

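# Default behavior: each job downloads the artifacts of all jobs in earlier stages
# Explicit control:
deploy:
  stage: deploy
  dependencies:                 # download artifacts from the listed jobs only
    - build-frontend
  script:
    - ls dist/                  # ✅ frontend artifacts are present
    - ls bin/ 2>/dev/null       # ❌ backend artifacts are not

# Download no artifacts at all
lint:
  stage: test
  dependencies: []              # empty array = download nothing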
  script: eslint .

7. Deployment Strategies

7.1 SSH Deployment

deploy-ssh:
  stage: deploy
  before_script:
    - apt-get update && apt-get install -y openssh-client
    - eval $(ssh-agent -s)
    - echo "$SSH_PRIVATE_KEY" | tr -d '\r' | ssh-add -
    - mkdir -p ~/.ssh && chmod 700 ~/.ssh
    - ssh-keyscan -H $DEPLOY_HOST >> ~/.ssh/known_hosts
  script:
    - scp -r dist/* deploy@$DEPLOY_HOST:/opt/app/
    - ssh deploy@$DEPLOY_HOST "cd /opt/app && docker compose pull && docker compose up -d"
  environment:
    name: production
    url: https://www.example.com

7.2 Docker Deployment (with a Registry)

build-and-push:
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  before_script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin $CI_REGISTRY
  script:
    - |
      docker build \
        --cache-from $CI_REGISTRY_IMAGE:latest \
        --tag $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA \
        --tag $CI_REGISTRY_IMAGE:latest \
        .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
    - docker push $CI_REGISTRY_IMAGE:latest

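deploy-docker:
  stage: deploy
  before_script:
    - eval $(ssh-agent -s)
    - echo "$SSH_PRIVATE_KEY" | tr -d '\r' | ssh-add -
    - mkdir -p ~/.ssh && chmod 700 ~/.ssh
    - ssh-keyscan -H $DEPLOY_HOST >> ~/.ssh/known_hosts
  script:
    # Unquoted EOF: the CI variables are expanded on the Runner before the
    # commands are sent, because they are not defined on the remote host
    - |
      ssh deploy@$DEPLOY_HOST << EOF
        echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
        docker pull $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
        docker stop app || true
        docker rm app || true
        docker run -d --name app \
          --restart unless-stopped \
          -p 8080:8080 \
          -e DATABASE_URL="$DATABASE_URL" \
          $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
      EOF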

7.3 Kubernetes Deployment

deploy-k8s:
  stage: deploy
  image:
    name: bitnami/kubectl:latest
    entrypoint: [""]
  before_script:
    - echo "$KUBE_CONFIG" | base64 -d > /tmp/kubeconfig
    - export KUBECONFIG=/tmp/kubeconfig
  script:
    # Update the image tag
    - kubectl set image deployment/$CI_PROJECT_NAME
        app=$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
        -n $K8S_NAMESPACE
    - kubectl rollout status deployment/$CI_PROJECT_NAME
        -n $K8S_NAMESPACE
        --timeout=300s
  after_script:
    - rm -f /tmp/kubeconfig
  environment:
    name: production
    url: https://www.example.com
    kubernetes:
      namespace: production

Deploying with Helm:

deploy-helm:
  stage: deploy
  image:
    name: alpine/helm:3.14
    entrypoint: [""]
  before_script:
    - echo "$KUBE_CONFIG" | base64 -d > /tmp/kubeconfig
    - export KUBECONFIG=/tmp/kubeconfig
    - helm repo add myrepo https://charts.example.com
    - helm repo update
  script:
    - |
      helm upgrade --install $CI_PROJECT_NAME myrepo/app-chart \
        --namespace $K8S_NAMESPACE \
        --create-namespace \
        --set image.repository=$CI_REGISTRY_IMAGE \
        --set image.tag=$CI_COMMIT_SHA \
        --set ingress.host=$DEPLOY_HOST \
        --values values-production.yaml \
        --wait --timeout 5m

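7.4 Auto DevOps

GitLab's built-in Auto DevOps can run the whole flow from build to deployment with almost no configuration:

# Option 1: a .gitlab-ci.yml in the project root that only includes the template
include:
  - template: Auto-DevOps.gitlab-ci.yml

# Option 2 (an alternative file, not to be combined with option 1): include the template and override variables
# Kubernetes is the default deployment target, so no target variable is set here
variables:
  KUBE_NAMESPACE: my-namespace
  HELM_UPGRADE_EXTRA_ARGS: "--set replicas=3"

include:
  - template: Auto-DevOps.gitlab-ci.yml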

8. Cache Optimization

8.1 Global Cache

# Global cache configuration, inherited by all jobs
cache:
  key: "${CI_COMMIT_REF_SLUG}"    # separate cache per branch
  paths:
    - node_modules/
    - .npm/
    - .cache/
  policy: pull-push                # pull / push / pull-push

# Pull only (the cache is not updated; intended for read-only jobs)
test:
  stage: test
  cache:
    key: "${CI_COMMIT_REF_SLUG}"
    paths:
      - node_modules/
    policy: pull                   # read-only cache
  script:
    - npm test

8.2 Per-Job Cache

build-frontend:
  stage: build
  cache:
    - key: frontend-deps
      paths:
        - frontend/node_modules/
    - key: frontend-build-cache
      paths:
        - frontend/.cache/
  script:
    - cd frontend && npm ci && npm run build

build-backend:
  stage: build
  cache:
    - key: backend-deps
      paths:
        - backend/vendor/
  script:
    - cd backend && go mod download && go build ./...

8.3 Cache Key Strategies

# Strategy 1: derive the cache key from the lock file (recommended)
cache:
  key:
    files:
      - package-lock.json          # file contents change → new cache
  paths:
    - node_modules/

# Strategy 2: branch plus lock file
cache:
  key:
    files:
      - package-lock.json
    prefix: $CI_COMMIT_REF_SLUG
  paths:
    - node_modules/

# Strategy 3: fixed key with manual versioning
cache:
  key: "deps-v2"                   # bump the version to invalidate the cache
  paths:
    - node_modules/

# Ways to clear the cache:
# Option 1: in the GitLab UI, go to CI/CD → Pipelines → Clear runner caches
# Option 2: change the version number in the cache key
# Both approaches make jobs use a new cache key; old cache archives stay in the
# Runner's cache storage until they are removed there

9. Security Scanning

9.1 SAST (Static Application Security Testing)

include:
  - template: Security/SAST.gitlab-ci.yml

sast:
  variables:
    SAST_EXCLUDED_ANALYZERS: "spotbugs"     # exclude specific analyzers
    SAST_EXCLUDED_PATHS: "test,spec,docs"   # exclude directories

9.2 DAST (Dynamic Application Security Testing)

include:
  - template: Security/DAST.gitlab-ci.yml

dast:
  variables:
    DAST_WEBSITE: "https://staging.example.com"
    DAST_FULL_SCAN_ENABLED: "true"
    DAST_AUTH_URL: "https://staging.example.com/login"
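    DAST_USERNAME: "scanner@example.com"
    # DAST_PASSWORD is defined under Settings → CI/CD → Variables (masked) and is passed to the job automatically, so it is not repeated here

9.3 Dependency Scanning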

include:
  - template: Security/Dependency-Scanning.gitlab-ci.yml

dependency_scanning:
  variables:
    DS_EXCLUDED_PATHS: "test,spec,docs"

9.4 Container Scanning

include:
  - template: Security/Container-Scanning.gitlab-ci.yml

container_scanning:
  variables:
    CS_IMAGE: "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"
    CS_SEVERITY_THRESHOLD: "high"           # report only high severity and above

Complete security scanning pipeline example:

include:
  - template: Security/SAST.gitlab-ci.yml
  - template: Security/Dependency-Scanning.gitlab-ci.yml
  - template: Security/Container-Scanning.gitlab-ci.yml
  - template: Security/Secret-Detection.gitlab-ci.yml

stages:
  - build
  - test
  - security-sast
  - security-container
  - deploy

build:
  stage: build
  script:
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA

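# All security scanning jobs inherit their configuration from the included templates.
# They run in the test stage by default, so custom stages such as security-sast stay
# empty unless a job's stage is overridden.
# Scan results are merged into the security widget of the merge request automatically.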
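
To make the scan jobs actually run in the custom stages, their stage can be overridden after the include. The following is a sketch that assumes the template jobs are named sast and container_scanning, as in the sections above, and that the stage names match the stages list of the example pipeline.

```yaml
# Sketch: move template-provided scan jobs into dedicated stages.
include:
  - template: Security/SAST.gitlab-ci.yml
  - template: Security/Container-Scanning.gitlab-ci.yml

stages:
  - build
  - test
  - security-sast
  - security-container
  - deploy

# A job declared with the same name as an included job overrides its keys.
sast:
  stage: security-sast

container_scanning:
  stage: security-container
  variables:
    CS_IMAGE: "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"
```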

10. Best Practices

10.1 Reusing Configuration with YAML Anchors

# Define reusable job fragments
.docker_login: &docker_login
  before_script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin $CI_REGISTRY

.base_deploy: &base_deploy
  stage: deploy
  image: bitnami/kubectl:latest
  <<: *docker_login
  when: manual
  allow_failure: false

# Extend jobs with the anchors
deploy-staging:
  <<: *base_deploy
  variables:
    K8S_NAMESPACE: staging
  environment:
    name: staging
  only:
    - develop

deploy-production:
  <<: *base_deploy
  variables:
    K8S_NAMESPACE: production
  environment:
    name: production
  only:
    - main

10.2 Using `extends` (Recommended)

.docker_login:
  before_script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin $CI_REGISTRY

.base_deploy:
  stage: deploy
  image: bitnami/kubectl:latest
  extends: .docker_login
  when: manual

deploy-staging:
  extends: .base_deploy
  variables:
    K8S_NAMESPACE: staging
  environment:
    name: staging

deploy-production:
  extends: .base_deploy
  variables:
    K8S_NAMESPACE: production
  environment:
    name: production

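💡 extends is more flexible than YAML anchors: it supports multi-level inheritance, works across files pulled in with include, and deep-merges hashes key by key. Arrays such as script are not merged; the child's array replaces the parent's. Prefer extends where possible.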
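
Multi-level inheritance and key-by-key merging of hashes can be sketched as follows. The job names, image tag, and variable names are made up for illustration, and the echoed output assumes these variables are not overridden anywhere else.

```yaml
# Sketch: two levels of extends with the variables hash merged key by key.
.base:
  image: alpine:3.19
  variables:
    LOG_LEVEL: "info"
    REGION: "eu"

.deploy:
  extends: .base
  variables:
    REGION: "cn"               # overrides one key; LOG_LEVEL is inherited

deploy-demo:
  extends: .deploy
  variables:
    K8S_NAMESPACE: "demo"      # added on top of the inherited keys
  script:
    - echo "$LOG_LEVEL $REGION $K8S_NAMESPACE"   # prints "info cn demo"
```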

10.3 Monorepo Strategy

# Use rules:changes to trigger jobs based on the paths of changed files
build-frontend:
  stage: build
  rules:
    - changes:
        - frontend/**/*
        - shared/**/*
      when: on_success
  script:
    - cd frontend && npm ci && npm run build

build-backend:
  stage: build
  rules:
    - changes:
        - backend/**/*
        - shared/**/*
      when: on_success
  script:
    - cd backend && go build ./...

deploy-frontend:
  stage: deploy
  rules:
    - changes:
        - frontend/**/*
      when: manual
  script:
    - echo "Deploy frontend"

deploy-backend:
  stage: deploy
  rules:
    - changes:
        - backend/**/*
      when: manual
  script:
    - echo "Deploy backend"

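10.4 Debugging Tips

debug-job:
  stage: test
  image: ubuntu:22.04
  variables:
    CI_DEBUG_TRACE: "true"          # enable debug tracing (set -x); this prints every variable, including secrets, so restrict access to the job log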
  script:
    - echo "Debug info:"
    - echo "Shell: $0"
    - echo "User: $(whoami)"
    - echo "PWD: $(pwd)"
    - echo "ENV:"
    - env | sort
    - echo "Network:"
    - ip addr show 2>/dev/null || ifconfig 2>/dev/null || hostname -i   # the base image may ship neither ip nor ifconfig
    - echo "Disk:"
    - df -h

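Other debugging methods:

# 1. Run a job locally (requires gitlab-runner)
#    Note: `gitlab-runner exec` was deprecated and then removed in GitLab Runner 16.0,
#    so this only works with older Runner versions
gitlab-runner exec docker build-job

# 2. Use CI_DEBUG_SERVICES to see service container logs
#    Set CI_DEBUG_SERVICES=true in Variables

# 3. Inspect a running job interactively (Docker Executor)
#    The Docker Executor removes job containers when the job ends; for live inspection,
#    consider the interactive web terminal, which requires a [session_server] section in config.toml

# 4. Use after_script to inspect the environment after a failure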
job:
  script:
    - make test
  after_script:
    - ls -la /builds/
    - cat /tmp/*.log 2>/dev/null || true

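10.5 Common Pitfalls and Fixes

Pitfall 1: Variable values leaking into job logs

# ❌ Wrong: printing the password directly
deploy:
  script:
    - echo "Connecting with password: $DB_PASSWORD"    # visible in the log!
    - mysql -u root -p$DB_PASSWORD

# ✅ Right: use a Masked variable and avoid printing it
deploy:
  script:
    - mysql -u root -p"$DB_PASSWORD" -e "SELECT 1"    # the variable is marked as Masked in Settings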

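Pitfall 2: File permission problems with the Docker Executor

# ❌ Files created in a container running as root cannot be accessed by the runner user on the host
build:
  script:
    - touch output.txt

# ✅ Run the container as a specific user (--user), or fix the permissions in script
build:
  script:
    - touch output.txt
    - chmod 644 output.txt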

Pitfall 3: Cache not taking effect

# ❌ Different branches share one cache key although their lock files differ
cache:
  key: "deps"

# ✅ Derive the key from the lock file
cache:
  key:
    files:
      - package-lock.json

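Pitfall 4: Mixing rules with only/except

# ❌ rules and only/except cannot be combined in the same job; GitLab rejects the configuration
#    (only/except is no longer actively developed; rules is the preferred keyword)
job:
  only:
    - main
  rules:
    - if: $CI_COMMIT_BRANCH == "main"

# ✅ Use rules consistently
job:
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
    - if: $CI_COMMIT_TAG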
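
As a rough migration guide, the only-based jobs used elsewhere in this article could be written with rules as sketched below. The job bodies are placeholders, and the mapping is an illustration rather than an exhaustive equivalence.

```yaml
# Sketch: rules equivalents of common only/when combinations.
release-package:
  stage: deploy
  script:
    - echo "Packaging $CI_COMMIT_TAG"
  rules:
    - if: $CI_COMMIT_TAG                  # replaces `only: [tags]`

deploy-production:
  stage: deploy
  script:
    - echo "Deploying $CI_COMMIT_SHORT_SHA"
  rules:
    - if: $CI_COMMIT_BRANCH == "main"     # replaces `only: [main]`
      when: manual                        # manual trigger moves inside the rule
```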

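Pitfall 5: Oversized artifacts causing upload failures

# ❌ Packaging the whole node_modules directory (can reach gigabytes)
artifacts:
  paths:
    - node_modules/

# ✅ Package build output only and manage dependencies with cache
artifacts:
  paths:
    - dist/
  expire_in: 7 days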

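Appendix: Complete Project Configuration Example

The following is a complete .gitlab-ci.yml template that covers most of the features discussed in this article:

# ============================================
# .gitlab-ci.yml - complete project template
# ============================================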

stages:
  - prepare
  - build
  - test
  - security
  - deploy

variables:
  DOCKER_REGISTRY: "$CI_REGISTRY"
  DOCKER_IMAGE: "$CI_REGISTRY_IMAGE"
  K8S_NAMESPACE: "default"

# ---- Global cache ----
default:
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
      - .npm/
  image: node:18-alpine

# ---- Reusable templates ----
.docker_auth: &docker_auth
  before_script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin $CI_REGISTRY

.kubectl_base: &kubectl_base
  image:
    name: bitnami/kubectl:latest
    entrypoint: [""]
  before_script:
    - echo "$KUBE_CONFIG" | base64 -d > /tmp/kubeconfig
    - export KUBECONFIG=/tmp/kubeconfig

# ============ Prepare Stage ============
install-deps:
  stage: prepare
  script:
    - npm ci --cache .npm
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
      - .npm/
    policy: pull-push

# ============ Build Stage ============
build-app:
  stage: build
  script:
    - npm run build
  artifacts:
    paths:
      - dist/
    expire_in: 7 days

build-docker:
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  <<: *docker_auth
  script:
    - docker build --cache-from $DOCKER_IMAGE:latest -t $DOCKER_IMAGE:$CI_COMMIT_SHA -t $DOCKER_IMAGE:latest .
    - docker push $DOCKER_IMAGE:$CI_COMMIT_SHA
    - docker push $DOCKER_IMAGE:latest

# ============ Test Stage ============
test-unit:
  stage: test
  script:
    - npm run test:unit -- --coverage
  coverage: '/Lines\s*:\s*([\d.]+)/'
  artifacts:
    reports:
      junit: junit.xml
      coverage_report:
        coverage_format: cobertura
        path: coverage.xml

test-integration:
  stage: test
  services:
    - postgres:15
    - redis:7
  variables:
    POSTGRES_DB: test
    POSTGRES_USER: test
    POSTGRES_PASSWORD: test
    DATABASE_URL: "postgresql://test:test@postgres:5432/test"
  script:
    - npm run test:integration

# ============ Security Scans (template jobs run in the test stage by default) ============
include:
  - template: Security/SAST.gitlab-ci.yml
  - template: Security/Dependency-Scanning.gitlab-ci.yml
  - template: Security/Secret-Detection.gitlab-ci.yml

# ============ Deploy Stage ============
deploy-staging:
  <<: *kubectl_base
  stage: deploy
  script:
    - kubectl set image deployment/app app=$DOCKER_IMAGE:$CI_COMMIT_SHA -n staging
    - kubectl rollout status deployment/app -n staging --timeout=300s
  environment:
    name: staging
    url: https://staging.example.com
  rules:
    - if: $CI_COMMIT_BRANCH == "develop"

deploy-production:
  <<: *kubectl_base
  stage: deploy
  script:
    - kubectl set image deployment/app app=$DOCKER_IMAGE:$CI_COMMIT_SHA -n production
    - kubectl rollout status deployment/app -n production --timeout=300s
  environment:
    name: production
    url: https://www.example.com
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      when: manual
  allow_failure: false

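📝 Closing notes: GitLab CI/CD is a very capable continuous integration and continuous deployment platform. This article covers the configurations and scenarios that come up most often in day-to-day operations and can serve as a quick reference. In real projects, choose the Runner architecture and pipeline design that fit your team size and business needs, and avoid over-engineering. Start simple and improve iteratively.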

Category: Operations

Source: https://acf1sh.top/console/overview/archives/fu-wu-qi-yun-wei-bi-ji-gitlab-ci-cd-shi-zhan

Author: fish

Published: 2026-06-15

Updated: 2026-06-10

License: CC BY 4.0
