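This is part 13 of the server operations notes series, focused on practical use of GitLab CI/CD. It runs from basic concepts to advanced configuration and from single-host deployment to Kubernetes integration, as a hands-on guide you can reuse directly.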
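1. GitLab CI Basics
1.1 Architecture overview
The core GitLab CI/CD architecture has three roles:
GitLab Server: hosts the code repositories, parses .gitlab-ci.yml, and schedules pipeline jobs
GitLab Runner: the agent that executes the individual jobs in a pipeline; it can be deployed on any machine
Executor: the execution environment inside the Runner (Docker, Shell, Kubernetes, and so on)
Workflow: a developer pushes code → GitLab parses the CI configuration → jobs are scheduled by stage → a Runner fetches the code and runs the job → results are reported back to the GitLab UI.
1.2 Runner types
1.3 Core .gitlab-ci.yml syntax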
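# Define the pipeline stages (run in order)
stages:
  - build
  - test
  - deploy

# Define a job
build-app:
  stage: build
  image: node:18-alpine
  script:
    - npm ci
    - npm run build
  artifacts:
    paths:
      - dist/
Core concepts:
stages: defines the stage order; jobs in the same stage run in parallel, and stages run one after another
jobs: any top-level key that is not a reserved keyword; each job belongs to one stage (test is the default when stage is omitted)
script: the list of commands the job actually runs (required)
image: the Docker image to use (takes effect with the Docker executor)
tags: labels used to select specific Runners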
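The ordering rule for stages can be seen in a minimal sketch. The job names and echo commands below are hypothetical and only illustrate which jobs wait for which.

```yaml
# Hypothetical job names and commands, used only to illustrate ordering.
stages:
  - build
  - test

compile:            # runs first (stage: build)
  stage: build
  script:
    - echo "compiling"

unit-tests:         # starts only after every build-stage job has succeeded
  stage: test
  script:
    - echo "running unit tests"

lint:               # same stage as unit-tests, so the two may run in parallel
  stage: test
  script:
    - echo "linting"
```

Whether unit-tests and lint actually run at the same moment depends on Runner capacity, for example the concurrent setting in config.toml.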
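2. Runner Configuration
2.1 Installation and registration
Install GitLab Runner (Ubuntu/Debian shown here):
# Add the official repository
curl -L "https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.deb.sh" | sudo bash
# Install
sudo apt-get install gitlab-runner
# Check the version
gitlab-runner --version
Register the Runner:
# Note: registration tokens are the legacy workflow. Newer GitLab versions create the
# runner in the UI and pass its authentication token with --token; in that flow the
# tag, run-untagged and locked settings are set in the UI rather than as flags.
sudo gitlab-runner register \
  --non-interactive \
  --url "https://gitlab.example.com/" \
  --registration-token "PROJECT_TOKEN_HERE" \
  --executor "docker" \
  --docker-image "alpine:latest" \
  --description "my-docker-runner" \
  --tag-list "docker,linux" \
  --run-untagged="true" \
  --locked="false"
After registration, the configuration file is at /etc/gitlab-runner/config.toml.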
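To show how the tags given at registration are used, here is a small sketch of a job that targets the Runner registered with --tag-list "docker,linux". The job name and script are hypothetical.

```yaml
# Sketch: route a job to the runner registered with the tags "docker" and "linux".
# The job name and the command are placeholders.
build-on-docker-runner:
  stage: build
  tags:
    - docker
    - linux          # a runner must carry every tag listed here to pick up the job
  image: alpine:latest
  script:
    - uname -a
```

Because the Runner was registered with --run-untagged="true", it also picks up jobs that declare no tags at all.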
2.2 Docker executor configuration
The Docker executor is the most commonly used executor. Each job runs in its own container, which gives isolation by default.
# /etc/gitlab-runner/config.toml
concurrent = 4        # maximum number of concurrent jobs
check_interval = 3    # polling interval (seconds)

[[runners]]
  name = "docker-runner"
  url = "https://gitlab.example.com/"
  token = "RUNNER_TOKEN"
  executor = "docker"
  [runners.docker]
    image = "alpine:latest"          # default image
    privileged = false               # privileged mode (must be true for Docker-in-Docker)
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/cache"]             # mount the cache volume
    shm_size = 0                     # shared memory size
    pull_policy = "if-not-present"   # pull policy: always / if-not-present / never
    allowed_images = ["ruby:*", "python:*", "node:*"]   # image allowlist
Docker-in-Docker (DinD): use this when a CI job needs to build Docker images.
build-docker-image:
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  variables:
    DOCKER_TLS_CERTDIR: "/certs"
  script:
    # Build with the full registry path so the tag matches the one that is pushed
    - docker build -t registry.example.com/my-app:$CI_COMMIT_SHA .
    - docker push registry.example.com/my-app:$CI_COMMIT_SHA
2.3 Shell executor configuration
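The Shell executor runs commands directly on the host. It has the least overhead and the weakest isolation.
[[runners]]
  name = "shell-runner"
  url = "https://gitlab.example.com/"
  token = "RUNNER_TOKEN"
  executor = "shell"
  [runners.custom_build_dir]
    enabled = true   # allow custom build directories
⚠️ With the Shell executor, all jobs share the same environment, so watch for dependency conflicts. Use it only for specific cases, such as jobs that need access to host hardware or GPUs.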
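As an illustration of the GPU case mentioned above, a job can be pinned to a Shell-executor Runner through tags. This is a sketch under assumptions: the "shell" and "gpu" tags, the presence of nvidia-smi on the host, and the make target are all hypothetical.

```yaml
# Sketch only: the tags and the commands are assumptions about how such a
# runner might be labelled and what is installed on its host.
gpu-smoke-test:
  stage: test
  tags:
    - shell
    - gpu
  script:
    - nvidia-smi          # runs directly on the host, not in a container
    - make test-gpu       # hypothetical project target
```

Any image keyword on such a job has no effect, since the Shell executor does not start a container.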
2.4 Autoscaling configuration (Docker Machine / Autoscaler)
For large-scale setups, the Runner can be configured to autoscale, creating and destroying Runner instances on demand.
[[runners]]
  name = "autoscale-runner"
  url = "https://gitlab.example.com/"
  token = "RUNNER_TOKEN"
  executor = "docker+machine"
  limit = 10                      # maximum number of instances
  [runners.machine]
    IdleCount = 2                 # number of idle instances to keep
    IdleTime = 600                # idle timeout (seconds)
    MaxBuilds = 100               # maximum builds per instance
    MachineName = "runner-%s"
    MachineDriver = "amazonec2"
    MachineOptions = [
      "amazonec2-instance-type=t3.medium",
      "amazonec2-region=cn-north-1",
      "amazonec2-vpc-id=vpc-xxx",
      "amazonec2-subnet-id=subnet-xxx",
      "amazonec2-security-group=sg-xxx",
    ]
    [[runners.machine.autoscaling]]
      Periods = ["* * 8-18 * * mon-fri *"]   # working hours
      IdleCount = 5
      IdleTime = 300
      Timezone = "Asia/Shanghai"
    [[runners.machine.autoscaling]]
      Periods = ["* * * * * sat,sun *"]      # weekends
      IdleCount = 0
      IdleTime = 60
3. Pipeline Design
3.1 Stages and jobs
stages:
  - lint
  - build
  - test
  - security
  - deploy

# Jobs in the same stage run in parallel
lint-js:
  stage: lint
  script:
    - npx eslint . --max-warnings=0

lint-python:
  stage: lint
  script:
    - flake8 src/
    - mypy src/
3.2 Dependencies and passing artifacts
Use dependencies to control which artifacts are passed between jobs and to avoid unnecessary downloads.
build-frontend:
  stage: build
  script:
    - npm ci && npm run build
  artifacts:
    paths:
      - dist/frontend/

build-backend:
  stage: build
  script:
    - go build -o bin/server ./cmd/server
  artifacts:
    paths:
      - bin/

deploy-staging:
  stage: deploy
  dependencies:
    - build-frontend   # fetch the frontend artifacts
    - build-backend    # fetch the backend artifacts
  script:
    - scp dist/frontend/* staging:/var/www/html/
    - scp bin/server staging:/opt/app/
3.3 Artifacts configuration
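test-unit:
  stage: test
  script:
    - go test ./... -coverprofile=coverage.out -v 2>&1 | tee test-output.txt
    # go test prints plain text, so convert it before GitLab parses it.
    # Assumes go-junit-report and gocover-cobertura are installed in the job image.
    - go-junit-report < test-output.txt > report.xml
    - gocover-cobertura < coverage.out > coverage.xml
  artifacts:
    when: always   # always / on_success / on_failure
    paths:
      - coverage.out
      - test-output.txt
    reports:
      junit: report.xml   # parsed and shown in the merge request
      coverage_report:
        coverage_format: cobertura
        path: coverage.xml
    expire_in: 7 days   # artifact expiry
3.4 Cache configuration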
variables:
  npm_config_cache: "$CI_PROJECT_DIR/.npm"

cache:
  key:
    files:
      - package-lock.json   # cache key derived from the file contents
  paths:
    - .npm/
    - node_modules/

build:
  stage: build
  script:
    - npm ci --cache .npm
    - npm run build
4. Variable Management
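4.1 Predefined variables
GitLab provides many built-in variables. Commonly used ones include:
# Use them directly in script.
# Commands containing ": " are wrapped in single quotes so YAML reads them as strings.
deploy-production:
  stage: deploy
  script:
    - 'echo "Project: $CI_PROJECT_NAME"'
    - 'echo "Branch: $CI_COMMIT_REF_NAME"'
    - 'echo "Commit: $CI_COMMIT_SHA"'
    - 'echo "Short SHA: $CI_COMMIT_SHORT_SHA"'
    - 'echo "Tag: $CI_COMMIT_TAG"'
    - 'echo "MR source branch: $CI_MERGE_REQUEST_SOURCE_BRANCH_NAME"'
    - 'echo "Pipeline ID: $CI_PIPELINE_ID"'
    - 'echo "Job ID: $CI_JOB_ID"'
    - 'echo "Runner description: $CI_RUNNER_DESCRIPTION"'
    - 'echo "Registry: $CI_REGISTRY"'
    - 'echo "Registry image: $CI_REGISTRY_IMAGE"'
The full list of predefined variables is in the official GitLab documentation → CI/CD → Predefined variables.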
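Predefined variables are most useful when combined. The sketch below builds an image tag from several of them; the job name and the IMAGE_TAG variable are hypothetical, and it assumes a Runner that allows Docker-in-Docker.

```yaml
# Sketch: compose an image tag from predefined variables.
# IMAGE_TAG and the job name are illustrative; the CI_* variables are predefined by GitLab.
tag-image:
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  variables:
    IMAGE_TAG: "$CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG-$CI_COMMIT_SHORT_SHA"
  script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    - docker build -t "$IMAGE_TAG" .
    - docker push "$IMAGE_TAG"
```

Using the branch slug plus the short SHA keeps tags readable while still pointing at one exact commit.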
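4.2 Custom variables
Variables can be defined at several levels. Within .gitlab-ci.yml, job-level variables override pipeline-level (global) ones. Variables defined in project or group Settings take precedence over both, so a YAML value cannot override a variable of the same name set in Settings.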
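A small sketch of the override between the two YAML levels. The variable and job names are hypothetical, and it assumes no variable called DEPLOY_ENV is set in Settings or passed when the pipeline is triggered.

```yaml
# Hypothetical variable and job names, for illustration only.
variables:
  DEPLOY_ENV: "staging"        # global default for every job

show-default:
  stage: test
  script:
    - echo "env=$DEPLOY_ENV"   # uses the global value

show-override:
  stage: test
  variables:
    DEPLOY_ENV: "production"   # the job-level value wins inside this job
  script:
    - echo "env=$DEPLOY_ENV"
```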
# Pipeline-level variables
variables:
  APP_NAME: "my-awesome-app"
  DEPLOY_ENV: "staging"

build:
  stage: build
  variables:
    BUILD_TYPE: "release"   # job-level variable
  script:
    - echo "Building $APP_NAME ($BUILD_TYPE)"
    - make BUILD_TYPE=$BUILD_TYPE
4.3 Protected and Masked variables
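Configure these under Settings → CI/CD → Variables:
Protected: only available to pipelines that run on protected branches or tags
Masked: hidden automatically in job logs (shown as [MASKED]); the value must meet the masking requirements, such as being a single line of at least 8 characters drawn from a restricted character set
# Use Protected + Masked variables (configured in Settings, never written in the YAML)
deploy-production:
  stage: deploy
  script:
    - echo "$PROD_DB_PASSWORD" | docker secret create db_password -   # shown as [MASKED] if it ever reaches the log
  only:
    - main   # run on the main branch only
  environment:
    name: production
4.4 Secrets management (Vault integration)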
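For stricter security requirements, integrate HashiCorp Vault:
# Requires the Vault connection to be configured for the project
deploy-with-vault:
  stage: deploy
  id_tokens:
    VAULT_ID_TOKEN:
      aud: https://vault.example.com
  secrets:
    DB_PASSWORD:
      vault:
        engine: { name: kv-v2, path: secret }
        path: production/db
        field: password
      file: false   # expose the value itself; by default the variable holds the path of a temp file
  script:
    - echo "DB password loaded from Vault"   # never print the actual value
    - deploy --db-password "$DB_PASSWORD"
5. Common Scenarios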
5.1 Build → test → deploy (complete pipeline)
stages:
  - build
  - test
  - deploy

variables:
  DOCKER_REGISTRY: "registry.example.com"

# ============ Build ============
build:
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  script:
    - docker build -t $DOCKER_REGISTRY/$CI_PROJECT_NAME:$CI_COMMIT_SHA .
    - docker push $DOCKER_REGISTRY/$CI_PROJECT_NAME:$CI_COMMIT_SHA

# ============ Test ============
test-unit:
  stage: test
  image: $DOCKER_REGISTRY/$CI_PROJECT_NAME:$CI_COMMIT_SHA
  script:
    - npm test -- --coverage
  coverage: '/All files\s*\|\s*([\d.]+)/'
  artifacts:
    reports:
      junit: junit.xml
      coverage_report:
        coverage_format: cobertura
        path: coverage.xml

test-integration:
  stage: test
  image: $DOCKER_REGISTRY/$CI_PROJECT_NAME:$CI_COMMIT_SHA
  services:
    - postgres:15
    - redis:7
  variables:
    POSTGRES_DB: test_db
    POSTGRES_USER: test
    POSTGRES_PASSWORD: test
    DATABASE_URL: "postgresql://test:test@postgres:5432/test_db"
    REDIS_URL: "redis://redis:6379"
  script:
    - npm run test:integration

# ============ Deploy ============
deploy-staging:
  stage: deploy
  environment:
    name: staging
    url: https://staging.example.com
  script:
    - kubectl set image deployment/$CI_PROJECT_NAME
        app=$DOCKER_REGISTRY/$CI_PROJECT_NAME:$CI_COMMIT_SHA
  only:
    - develop
  when: manual   # triggered manually
5.2 Multi-environment deployment
.deploy_template: &deploy_template
  stage: deploy
  image:
    name: bitnami/kubectl:latest
    entrypoint: [""]   # the image's entrypoint is kubectl, so reset it for CI scripts
  script:
    - kubectl config use-context $KUBE_CONTEXT
    - envsubst < k8s/deployment.yml | kubectl apply -f -
    - kubectl rollout status deployment/$CI_PROJECT_NAME -n $K8S_NAMESPACE

deploy-staging:
  <<: *deploy_template
  variables:
    KUBE_CONTEXT: staging
    K8S_NAMESPACE: staging
    REPLICAS: "2"
  environment:
    name: staging
    url: https://staging.example.com
  only:
    - develop

deploy-production:
  <<: *deploy_template
  variables:
    KUBE_CONTEXT: production
    K8S_NAMESPACE: production
    REPLICAS: "5"
  environment:
    name: production
    url: https://www.example.com
  only:
    - main
  when: manual
  allow_failure: false   # the pipeline blocks until this manual job is run
5.3 Parallel jobs
test-suite:
  stage: test
  parallel: 4   # start 4 parallel copies of this job
  script:
    - |
      case $CI_NODE_INDEX in
        1) TEST_PATTERN="tests/unit/models/**" ;;
        2) TEST_PATTERN="tests/unit/services/**" ;;
        3) TEST_PATTERN="tests/unit/controllers/**" ;;
        4) TEST_PATTERN="tests/unit/middleware/**" ;;
      esac
    - pytest $TEST_PATTERN --junitxml=results-$CI_NODE_INDEX.xml
  artifacts:
    reports:
      junit:
        - results-*.xml
5.4 Matrix builds
build-matrix:
  stage: build
  parallel:
    matrix:
      - GO_VERSION: ["1.21", "1.22"]
        OS: ["linux", "darwin"]
        ARCH: ["amd64", "arm64"]
  image: golang:$GO_VERSION
  script:
    - GOOS=$OS GOARCH=$ARCH go build -o bin/app-$OS-$ARCH ./cmd/app
  artifacts:
    paths:
      - bin/
6. Artifact Management
6.1 Detailed artifacts configuration
build:
  stage: build
  script:
    - make build
    - make docs
  artifacts:
    name: "$CI_PROJECT_NAME-$CI_COMMIT_REF_SLUG-$CI_PIPELINE_ID"
    paths:
      - bin/
      - docs/
      - config/production.yml
    exclude:
      - "**/*.test"   # exclude test files
      - "**/*.log"
    when: on_success   # on_success (default) / on_failure / always
    expire_in: 30 days
    access: all   # all / developer / none
6.2 Expiry policy
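# Intermediate output that expires quickly
test-results:
  artifacts:
    expire_in: 3 days

# Release artifacts that are kept permanently
release-package:
  artifacts:
    expire_in: never   # never expires
  only:
    - tags   # run for tags only
6.3 Controlling artifact dependencies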
build-frontend:
  stage: build
  script: npm run build
  artifacts:
    paths: [dist/]

build-backend:
  stage: build
  script: go build ./...
  artifacts:
    paths: [bin/]

# Default behaviour: each job downloads the artifacts of all jobs in earlier stages
# Explicit control:
deploy:
  stage: deploy
  dependencies:   # download artifacts only from the listed jobs
    - build-frontend
  script:
    - ls dist/              # ✅ frontend artifacts are present
    - ls bin/ 2>/dev/null   # ❌ backend artifacts are not

# Download no artifacts at all
lint:
  stage: test
  dependencies: []   # empty array = download nothing
  script: eslint .
7. Deployment Strategies
7.1 SSH deployment
deploy-ssh:
  stage: deploy
  before_script:
    - apt-get update && apt-get install -y openssh-client
    - eval $(ssh-agent -s)
    - echo "$SSH_PRIVATE_KEY" | tr -d '\r' | ssh-add -
    - mkdir -p ~/.ssh && chmod 700 ~/.ssh
    - ssh-keyscan -H $DEPLOY_HOST >> ~/.ssh/known_hosts
  script:
    - scp -r dist/* deploy@$DEPLOY_HOST:/opt/app/
    - ssh deploy@$DEPLOY_HOST "cd /opt/app && docker compose pull && docker compose up -d"
  environment:
    name: production
    url: https://www.example.com
7.2 Docker deployment (with a registry)
build-and-push:
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  before_script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin $CI_REGISTRY
  script:
    - |
      docker build \
        --cache-from $CI_REGISTRY_IMAGE:latest \
        --tag $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA \
        --tag $CI_REGISTRY_IMAGE:latest \
        .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
    - docker push $CI_REGISTRY_IMAGE:latest

deploy-docker:
  stage: deploy
  before_script:
    - eval $(ssh-agent -s)
    - echo "$SSH_PRIVATE_KEY" | ssh-add -
  script:
    # The heredoc delimiter is unquoted so the CI variables are expanded on the Runner
    # before the commands reach the remote host, where those variables are not defined.
    - |
      ssh deploy@$DEPLOY_HOST << EOF
      echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin $CI_REGISTRY
      docker pull $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
      docker stop app || true
      docker rm app || true
      docker run -d --name app \
        --restart unless-stopped \
        -p 8080:8080 \
        -e DATABASE_URL="$DATABASE_URL" \
        $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
      EOF
7.3 Kubernetes deployment
deploy-k8s:
  stage: deploy
  image:
    name: bitnami/kubectl:latest
    entrypoint: [""]
  before_script:
    - echo "$KUBE_CONFIG" | base64 -d > /tmp/kubeconfig
    - export KUBECONFIG=/tmp/kubeconfig
  script:
    # Update the image tag
    - kubectl set image deployment/$CI_PROJECT_NAME
        app=$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
        -n $K8S_NAMESPACE
    - kubectl rollout status deployment/$CI_PROJECT_NAME
        -n $K8S_NAMESPACE
        --timeout=300s
  after_script:
    - rm -f /tmp/kubeconfig
  environment:
    name: production
    url: https://www.example.com
    kubernetes:
      namespace: production
Deploying with Helm:
deploy-helm:
  stage: deploy
  image:
    name: alpine/helm:3.14
    entrypoint: [""]
  before_script:
    - echo "$KUBE_CONFIG" | base64 -d > /tmp/kubeconfig
    - export KUBECONFIG=/tmp/kubeconfig
    - helm repo add myrepo https://charts.example.com
    - helm repo update
  script:
    - |
      helm upgrade --install $CI_PROJECT_NAME myrepo/app-chart \
        --namespace $K8S_NAMESPACE \
        --create-namespace \
        --set image.repository=$CI_REGISTRY_IMAGE \
        --set image.tag=$CI_COMMIT_SHA \
        --set ingress.host=$DEPLOY_HOST \
        --values values-production.yaml \
        --wait --timeout 5m
7.4 Auto DevOps
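GitLab's built-in Auto DevOps can run the whole flow from build to deployment with little or no pipeline configuration:
# Option 1: a .gitlab-ci.yml in the project root that only includes the template
include:
  - template: Auto-DevOps.gitlab-ci.yml

# Option 2 (an alternative file, not combined with option 1): include the template and override variables
variables:
  AUTO_DEVOPS_DEPLOY_TARGET: kubernetes
  KUBE_NAMESPACE: my-namespace
  HELM_UPGRADE_EXTRA_ARGS: "--set replicas=3"
include:
  - template: Auto-DevOps.gitlab-ci.yml
8. Cache Optimization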
8.1 Global cache
# Global cache configuration, inherited by every job
cache:
  key: "${CI_COMMIT_REF_SLUG}"   # separate caches per branch
  paths:
    - node_modules/
    - .npm/
    - .cache/
  policy: pull-push   # pull / push / pull-push

# Pull only (does not update the cache; for read-only jobs)
test:
  stage: test
  cache:
    key: "${CI_COMMIT_REF_SLUG}"
    paths:
      - node_modules/
    policy: pull   # read-only cache
  script:
    - npm test
8.2 Per-job cache
build-frontend:
  stage: build
  cache:
    - key: frontend-deps
      paths:
        - frontend/node_modules/
    - key: frontend-build-cache
      paths:
        - frontend/.cache/
  script:
    - cd frontend && npm ci && npm run build

build-backend:
  stage: build
  variables:
    GOMODCACHE: "$CI_PROJECT_DIR/backend/.gomodcache"   # go mod download fills the module cache, not vendor/
  cache:
    - key: backend-deps
      paths:
        - backend/.gomodcache/
  script:
    - cd backend && go mod download && go build ./...
8.3 Cache key strategies
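# The three strategies below are alternatives; use one cache block per file.
# Strategy 1: derive the cache key from the lock file (recommended)
cache:
  key:
    files:
      - package-lock.json   # file contents change → new cache
  paths:
    - node_modules/

# Strategy 2: combine the branch with the lock file
cache:
  key:
    files:
      - package-lock.json
    prefix: $CI_COMMIT_REF_SLUG
  paths:
    - node_modules/

# Strategy 3: fixed key with manual versioning
cache:
  key: "deps-v2"   # bump the version to invalidate the cache
  paths:
    - node_modules/

# Ways to clear the cache:
# Method 1: in the GitLab UI → CI/CD → Pipelines → Clear runner caches
# Method 2: change the version number in the cache key
# Method 3: call the API (endpoint as given here; check it against the API reference for your GitLab version)
# curl --request DELETE --header "PRIVATE-TOKEN: $TOKEN" \
#   "https://gitlab.example.com/api/v4/projects/$PROJECT_ID/clean"
9. Security Scanning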
9.1 SAST (static application security testing)
include:
  - template: Security/SAST.gitlab-ci.yml

sast:
  variables:
    SAST_EXCLUDED_ANALYZERS: "spotbugs"     # exclude specific analyzers
    SAST_EXCLUDED_PATHS: "test,spec,docs"   # exclude directories
9.2 DAST (dynamic application security testing)
include:
  - template: Security/DAST.gitlab-ci.yml

dast:
  variables:
    DAST_WEBSITE: "https://staging.example.com"
    DAST_FULL_SCAN_ENABLED: "true"
    DAST_AUTH_URL: "https://staging.example.com/login"
    DAST_USERNAME: "scanner@example.com"
    DAST_PASSWORD: "$DAST_PASSWORD"   # set in CI/CD Variables
9.3 Dependency scanning
include:
  - template: Security/Dependency-Scanning.gitlab-ci.yml

dependency_scanning:
  variables:
    DS_EXCLUDED_PATHS: "test,spec,docs"
9.4 Container scanning
include:
  - template: Security/Container-Scanning.gitlab-ci.yml

container_scanning:
  variables:
    CS_IMAGE: "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"
    CS_SEVERITY_THRESHOLD: "high"   # report only high severity and above
Complete security scanning pipeline example:
include:
  - template: Security/SAST.gitlab-ci.yml
  - template: Security/Dependency-Scanning.gitlab-ci.yml
  - template: Security/Container-Scanning.gitlab-ci.yml
  - template: Security/Secret-Detection.gitlab-ci.yml

stages:
  - build
  - test
  - security-sast
  - security-container
  - deploy

build:
  stage: build
  script:
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA

# All security scanning jobs take their configuration from the included templates.
# They run in the test stage unless stage is overridden on the job.
# Scan results are merged automatically into the security widget of the merge request.
10. Best Practices
10.1 Reuse with YAML anchors
# Define reusable job fragments
.docker_login: &docker_login
  before_script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin $CI_REGISTRY

.base_deploy: &base_deploy
  stage: deploy
  image: bitnami/kubectl:latest
  <<: *docker_login
  when: manual
  allow_failure: false

# Extend with the anchors
deploy-staging:
  <<: *base_deploy
  variables:
    K8S_NAMESPACE: staging
  environment:
    name: staging
  only:
    - develop

deploy-production:
  <<: *base_deploy
  variables:
    K8S_NAMESPACE: production
  environment:
    name: production
  only:
    - main
10.2 Using extends (preferred)
.docker_login:
  before_script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin $CI_REGISTRY

.base_deploy:
  stage: deploy
  image: bitnami/kubectl:latest
  extends: .docker_login
  when: manual

deploy-staging:
  extends: .base_deploy
  variables:
    K8S_NAMESPACE: staging
  environment:
    name: staging

deploy-production:
  extends: .base_deploy
  variables:
    K8S_NAMESPACE: production
  environment:
    name: production
💡 extends is more flexible than YAML anchors: it supports multi-level inheritance, works across files pulled in with include, and deep-merges hashes (mappings). Arrays such as script are not merged; the child's array replaces the parent's. Prefer extends.
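The merge behaviour of extends is easiest to see in a small sketch. All job names, variables and commands below are hypothetical.

```yaml
# Hypothetical jobs illustrating how extends merges keys.
.base:
  image: alpine:latest
  variables:
    LOG_LEVEL: info
    REGION: eu
  script:
    - echo "base script"

.with-debug:
  extends: .base          # second level of inheritance
  variables:
    LOG_LEVEL: debug      # overrides one key; REGION is kept from .base

report:
  extends: .with-debug
  script:                 # replaces the script inherited from .base
    - echo "LOG_LEVEL=$LOG_LEVEL REGION=$REGION"
```

In the resulting report job, the variables mapping contains both LOG_LEVEL (debug) and REGION (eu), while script contains only the single command defined in report.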
10.3 Monorepo strategy
# Use rules:changes to trigger jobs by changed file paths
build-frontend:
  stage: build
  rules:
    - changes:
        - frontend/**/*
        - shared/**/*
      when: on_success
  script:
    - cd frontend && npm ci && npm run build

build-backend:
  stage: build
  rules:
    - changes:
        - backend/**/*
        - shared/**/*
      when: on_success
  script:
    - cd backend && go build ./...

deploy-frontend:
  stage: deploy
  rules:
    - changes:
        - frontend/**/*
      when: manual
  script:
    - echo "Deploy frontend"

deploy-backend:
  stage: deploy
  rules:
    - changes:
        - backend/**/*
      when: manual
  script:
    - echo "Deploy backend"
10.4 Debugging tips
debug-job:
  stage: test
  image: ubuntu:22.04
  variables:
    CI_DEBUG_TRACE: "true"   # enable shell tracing (set -x); this can expose secrets in the log
  script:
    - echo "Debug info:"
    - 'echo "Shell: $0"'
    - 'echo "User: $(whoami)"'
    - 'echo "PWD: $(pwd)"'
    - echo "ENV:"
    - env | sort
    - echo "Network:"
    - ip addr show || ifconfig || true   # neither tool is guaranteed to exist in a minimal image
    - echo "Disk:"
    - df -h
Other debugging methods:
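# 1. Run a job locally (requires gitlab-runner; the exec subcommand has been deprecated since
#    GitLab Runner 15.7 and may be unavailable in newer releases)
gitlab-runner exec docker build-job
# 2. Use CI_DEBUG_SERVICES to see service container logs
#    Set CI_DEBUG_SERVICES=true in Variables
# 3. Keep the container after a job fails (Docker executor)
#    Set cleanup = false under [runners.docker] in config.toml
#    (option name as given here; confirm it against the config.toml reference for your Runner version)
# 4. Use after_script to inspect the environment after a failure
job:
  script:
    - make test
  after_script:
    - ls -la /builds/
    - cat /tmp/*.log 2>/dev/null || true
10.5 Common pitfalls and fixes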
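Pitfall 1: variable values leaking into job logs
# ❌ Wrong: printing the password directly
deploy:
  script:
    - 'echo "Connecting with password: $DB_PASSWORD"'   # visible in the log!
    - mysql -u root -p$DB_PASSWORD

# ✅ Right: use a Masked variable and avoid printing it
deploy:
  script:
    - mysql -u root -p"$DB_PASSWORD" -e "SELECT 1"   # the variable is set to Masked in Settings
Pitfall 2: file permission problems with the Docker executor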
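# ❌ Files created in a container running as root cannot be accessed by the runner user on the host
build:
  script:
    - touch output.txt

# ✅ Run the container as a specific user with --user, or fix permissions in script
build:
  script:
    - touch output.txt
    - chmod 644 output.txt
Pitfall 3: the cache is not effective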
# ❌ Different branches share one cache key although their lock files differ
cache:
  key: "deps"

# ✅ Derive the key from the lock file
cache:
  key:
    files:
      - package-lock.json
Pitfall 4: mixing rules with only/except
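# ❌ Do not combine rules with only/except in the same job; GitLab rejects the configuration,
#    and only/except is the legacy syntax
job:
  only:
    - main
  rules:
    - if: $CI_COMMIT_BRANCH == "main"

# ✅ Use rules throughout
job:
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
    - if: $CI_COMMIT_TAG
Pitfall 5: oversized artifacts causing upload failures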
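# ❌ Archiving the whole node_modules directory (possibly gigabytes)
artifacts:
  paths:
    - node_modules/

# ✅ Archive only the build output and manage dependencies with cache
artifacts:
  paths:
    - dist/
  expire_in: 7 days
Appendix: complete project configuration example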
The following is a complete .gitlab-ci.yml template that covers most of the features discussed in this article:
# ============================================
# .gitlab-ci.yml - complete project template
# ============================================
stages:
  - prepare
  - build
  - test
  - security
  - deploy

variables:
  DOCKER_REGISTRY: "$CI_REGISTRY"
  DOCKER_IMAGE: "$CI_REGISTRY_IMAGE"
  K8S_NAMESPACE: "default"

# ---- Global cache ----
default:
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
      - .npm/
  image: node:18-alpine

# ---- Reusable templates ----
.docker_auth: &docker_auth
  before_script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin $CI_REGISTRY

.kubectl_base: &kubectl_base
  image:
    name: bitnami/kubectl:latest
    entrypoint: [""]
  before_script:
    - echo "$KUBE_CONFIG" | base64 -d > /tmp/kubeconfig
    - export KUBECONFIG=/tmp/kubeconfig

# ============ Prepare Stage ============
install-deps:
  stage: prepare
  script:
    - npm ci --cache .npm
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
      - .npm/
    policy: pull-push
# ============ Build Stage ============
build-app:
  stage: build
  script:
    - npm run build
  artifacts:
    paths:
      - dist/
    expire_in: 7 days

build-docker:
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  <<: *docker_auth
  script:
    - docker build --cache-from $DOCKER_IMAGE:latest -t $DOCKER_IMAGE:$CI_COMMIT_SHA -t $DOCKER_IMAGE:latest .
    - docker push $DOCKER_IMAGE:$CI_COMMIT_SHA
    - docker push $DOCKER_IMAGE:latest

# ============ Test Stage ============
test-unit:
  stage: test
  script:
    - npm run test:unit -- --coverage
  coverage: '/Lines\s*:\s*([\d.]+)/'
  artifacts:
    reports:
      junit: junit.xml
      coverage_report:
        coverage_format: cobertura
        path: coverage.xml

test-integration:
  stage: test
  services:
    - postgres:15
    - redis:7
  variables:
    POSTGRES_DB: test
    POSTGRES_USER: test
    POSTGRES_PASSWORD: test
    DATABASE_URL: "postgresql://test:test@postgres:5432/test"
  script:
    - npm run test:integration
# ============ Security Stage ============
# Jobs from these templates run in the test stage unless stage is overridden on them
include:
  - template: Security/SAST.gitlab-ci.yml
  - template: Security/Dependency-Scanning.gitlab-ci.yml
  - template: Security/Secret-Detection.gitlab-ci.yml

# ============ Deploy Stage ============
deploy-staging:
  <<: *kubectl_base
  stage: deploy
  script:
    - kubectl set image deployment/app app=$DOCKER_IMAGE:$CI_COMMIT_SHA -n staging
    - kubectl rollout status deployment/app -n staging --timeout=300s
  environment:
    name: staging
    url: https://staging.example.com
  rules:
    - if: $CI_COMMIT_BRANCH == "develop"

deploy-production:
  <<: *kubectl_base
  stage: deploy
  script:
    - kubectl set image deployment/app app=$DOCKER_IMAGE:$CI_COMMIT_SHA -n production
    - kubectl rollout status deployment/app -n production --timeout=300s
  environment:
    name: production
    url: https://www.example.com
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      when: manual
      allow_failure: false
📝 Closing notes: GitLab CI/CD is a very capable continuous integration and continuous deployment platform. This article covers the configuration and scenarios that come up most often in day-to-day operations, so it can serve as a quick reference. In real projects, choose the Runner architecture and pipeline design to fit your team size and business needs, and avoid over-engineering. Start simple and iterate.