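Infrastructure Service Capabilities and Responsibilities (Excerpt)
InfraService
Service module collaboration
Internal division of responsibilities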
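iacService [IaC task module]
- Validate IaC manifests
- Schedule IaC tasks
- Verify that IaC changes have landed [later phase]
  - Periodically reconcile historical IaC manifests against deployed assets
  - Verify that the current IaC task has landed
templateService [configuration templates / flavor standards]
allDeployService [business and data rollout service]
- Manage the full initial rollout of SQL
- Manage the full initial rollout of Pods
k8sService [K8s direct-query module]
- Cluster queries [later phase]
- Node queries [later phase]
- Node pool queries [later phase]
rdsService [RDS direct-query module]
- Instance queries [current phase]
- Database queries [current phase]
- DB parameter configuration queries [later phase]
ecsService [compute service direct-query module]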
Infrastructure use-case flows
Overall flow
Infra data model
Parameter configuration
Flavor data sample
    {
      "id": "ai1.2xlarge.4",
      "name": "ai1.2xlarge.4",
      "vcpus": "8",
      "ram": 32768,
      "disk": "0",
      "swap": "",
      "attachableQuantity": {
        "free_scsi": 20,
        "free_blk": 12,
        "free_disk": 20,
        "free_nic": 4
      },
      "OS-FLV-EXT-DATA:ephemeral": 0,
      "rxtx_factor": 1,
      "OS-FLV-DISABLED:disabled": false,
      "rxtx_quota": null,
      "rxtx_cap": null,
      "os-flavor-access:is_public": true
    }
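The flavor record above mixes string and numeric fields ("vcpus" and "disk" are strings, "ram" is a number). A minimal sketch of normalizing such a record before it is used for sizing, assuming "ram" is expressed in MiB (the sample does not state the unit) and using a hypothetical helper name `normalize_flavor`:

```python
import json

# Trimmed copy of the flavor sample; only the fields used below are kept.
FLAVOR_JSON = '{"id": "ai1.2xlarge.4", "vcpus": "8", "ram": 32768, "disk": "0"}'

def normalize_flavor(raw):
    """Convert string-typed counters to integers and RAM to GiB.

    Assumption: "ram" is in MiB, so 32768 corresponds to 32 GiB.
    """
    return {
        "id": raw["id"],
        "vcpus": int(raw["vcpus"]),
        "ram_gib": raw["ram"] / 1024,
        "disk_gb": int(raw["disk"]),
    }

flavor = normalize_flavor(json.loads(FLAVOR_JSON))
print(flavor)
```

Normalizing once at the boundary keeps later comparisons (for example, choosing the smallest flavor that satisfies a request) free of string-versus-number surprises.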
RDS configuration sample
    {
      "instance": {
        "name": "rds-instance-rep2",
        "datastore": {
          "type": "MySQL",
          "version": "5.6"
        },
        "flavor_ref": "rds.mysql.s1.large",
        "volume": {
          "type": "ULTRAHIGH",
          "size": 100
        },
        "disk_encryption_id": "2gfdsh-844a-4023-a776-fc5c5fb71fb4",
        "region": "aaa",
        "availability_zone": "bbb",
        "vpc_id": "490a4a08-ef4b-44c5-94be-3051ef9e4fce",
        "subnet_id": "0e2eda62-1d42-4d64-a9d1-4e9aa9cd994f",
        "security_group_id": "2a1f7fc8-3307-42a7-aa6f-42c8b9b8f8c5",
        "port": "8635",
        "backup_strategy": {
          "start_time": "08:15-09:15",
          "keep_days": 12
        },
        "charge_info": {
          "charge_mode": "postPaid"
        },
        "password": "Test@12345678",
        "configuration_id": "452408-ef4b-44c5-94be-305145fg",
        "enterprise_project_id": "fdsa-3rds",
        "time_zone": "UTC+04:00"
      },
      "job_id": "dff1d289-4d03-4942-8b9f-463ea07c000d"
    }
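A configuration like the one above is the kind of input the IaC manifest check would look at before a task is scheduled. The sketch below shows one possible shape for such a check. The rule set (which fields are required, the port range, the backup-window format) and the names `REQUIRED_FIELDS` and `check_rds_config` are assumptions for illustration, not rules taken from the service.

```python
import re

# Hypothetical minimum set of fields; the real check may require more or fewer.
REQUIRED_FIELDS = ("name", "datastore", "flavor_ref", "volume",
                   "vpc_id", "subnet_id", "security_group_id")

def check_rds_config(doc):
    """Return a list of problems; an empty list means every check passed."""
    instance = doc.get("instance", {})
    problems = [f"missing field: {name}"
                for name in REQUIRED_FIELDS if name not in instance]

    port = str(instance.get("port", ""))
    if port and not (port.isdigit() and 1 <= int(port) <= 65535):
        problems.append(f"invalid port: {port}")

    window = instance.get("backup_strategy", {}).get("start_time", "")
    if window and not re.fullmatch(r"\d{2}:\d{2}-\d{2}:\d{2}", window):
        problems.append(f"invalid backup window: {window}")
    return problems

sample = {"instance": {
    "name": "rds-instance-rep2",
    "datastore": {"type": "MySQL", "version": "5.6"},
    "flavor_ref": "rds.mysql.s1.large",
    "volume": {"type": "ULTRAHIGH", "size": 100},
    "vpc_id": "490a4a08-ef4b-44c5-94be-3051ef9e4fce",
    "subnet_id": "0e2eda62-1d42-4d64-a9d1-4e9aa9cd994f",
    "security_group_id": "2a1f7fc8-3307-42a7-aa6f-42c8b9b8f8c5",
    "port": "8635",
    "backup_strategy": {"start_time": "08:15-09:15", "keep_days": 12},
}}
print(check_rds_config(sample))
```

Returning a list of problems rather than raising on the first one lets the caller report everything wrong with a manifest in a single pass.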
K8s node pool sample
    {
      "kind": "NodePool",
      "apiVersion": "v3",
      "metadata": {
        "name": "lc-it-nodepool-79796",
        "uid": "99addaa2-69eb-11ea-a592-0255ac1001bb"
      },
      "spec": {
        "type": "vm",
        "nodeTemplate": {
          "flavor": "s6.large.2",
          "az": "******",
          "os": "EulerOS 2.5",
          "login": {
            "sshKey": "KeyPair-001"
          },
          "rootVolume": { // system disk
            "volumetype": "SAS",
            "size": 40
          },
          "dataVolumes": [ // data disks
            {
              "volumetype": "SAS",
              "size": 100,
              "extendParam": {
                "useType": "docker"
              }
            }
          ],
          "publicIP": { // public IP
            "eip": {
              "bandwidth": {}
            }
          },
          "nodeNicSpec": {
            "primaryNic": {
              "subnetId": "7e767d10-7548-4df5-ad72-aeac1d08bd8a"
            }
          },
          "billingMode": 0, // 0: pay-per-use; 1: yearly/monthly; 2 (deprecated): auto-paid yearly/monthly
          "extendParam": {
            "maxPods": 110
          },
          "k8sTags": {
            "cce.cloud.com/cce-nodepool": "lc-it-nodepool-79796"
          }
        },
        "autoscaling": {
          "enable": 1,
          "minNodeCount": 1,
          "maxNodeCount": 20,
          "scaleDownCooldownTime": 111, // node retention time in minutes; nodes added by a scale-out are not scaled in during this period
          "priority": 1 // node pool weight; a higher weight gets higher priority during scale-out
        },
        "initialNodeCount": 1, // initial node count
        "nodeManagement": {}
      }
    }
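The autoscaling fields in the node pool sample interact with the initial node count. A small sketch of a consistency check over those fields follows. The two rules (minimum not above maximum, initial count inside the range) are assumptions about what a sensible manifest looks like, not documented constraints of the cloud API, and `check_autoscaling` is a hypothetical name.

```python
def check_autoscaling(spec):
    """Return a list of problems found in a node pool "spec" mapping."""
    auto = spec.get("autoscaling", {})
    if not auto.get("enable"):
        return []  # nothing to check when autoscaling is off

    low = auto.get("minNodeCount", 0)
    high = auto.get("maxNodeCount", 0)
    initial = spec.get("initialNodeCount", 0)

    problems = []
    if low > high:
        problems.append("minNodeCount exceeds maxNodeCount")
    if not low <= initial <= high:
        problems.append("initialNodeCount is outside [minNodeCount, maxNodeCount]")
    return problems

sample_spec = {
    "autoscaling": {"enable": 1, "minNodeCount": 1, "maxNodeCount": 20},
    "initialNodeCount": 1,
}
print(check_autoscaling(sample_spec))
```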
Infra manifest use cases
Flavor templates
- SecurityGroup: security group
- Net: network
- Flavor: flavor
- Volume: data disk
- InfraExt: extended infrastructure attributes
Infra manifest
- Infra
- K8sTemplate
- RdsTemplate
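Infra manifest data structure
Resource sizing
Resource-sizing requests do not need to change the parameter template: the selected "flavor" is filled in as a parameter and overrides the value in the original template
Extended parameters
Extended-parameter requests require a custom Infra manifest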
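The two cases above differ in how much of the template is replaced. A minimal sketch, assuming templates and manifests are plain nested dictionaries: `apply_flavor` and `apply_extensions` are hypothetical helper names, and the flavor "rds.mysql.s1.xlarge" is an invented value used only for illustration.

```python
import copy

def apply_flavor(template, flavor):
    """Resource sizing: keep the template and override only the flavor."""
    merged = copy.deepcopy(template)
    merged["flavor"] = flavor
    return merged

def apply_extensions(template, custom):
    """Extended parameters: recursively overlay a custom manifest on the template."""
    merged = copy.deepcopy(template)
    for key, value in custom.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_extensions(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged

rds_template = {"flavor": "rds.mysql.s1.large",
                "volume": {"type": "ULTRAHIGH", "size": 100}}

sized = apply_flavor(rds_template, "rds.mysql.s1.xlarge")  # hypothetical flavor
extended = apply_extensions(rds_template, {"volume": {"size": 200}})
print(sized)
print(extended)
```

Both helpers copy the template first, so a shared default template is never mutated by one tenant's request.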
RDS side
K8s side
Additional structures
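Converting the Infra manifest into a Terraform script
Collating user form input into Infra
Data structure
Infra
Converting the data into a Terraform HCL script
Configuration samples
RDS
Huawei Cloud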
    variable "vpc_id" {}
    variable "subnet_id" {}
    variable "secgroup_id" {}
    variable "availability_zone" {}

    # Create the instance
    resource "huaweicloud_rds_instance" "instance" {
      name              = "terraform_test_rds_instance"
      # NOTE: "rds.pg.*" is a PostgreSQL flavor code; choose a MySQL flavor to match db.type below
      flavor            = "rds.pg.n1.large.2"
      vpc_id            = var.vpc_id              # reserved
      subnet_id         = var.subnet_id           # reserved
      security_group_id = var.secgroup_id         # reserved
      availability_zone = [var.availability_zone] # reserved

      db {
        type     = "MySQL"
        version  = "8.0"
        password = "test"
      }

      volume {
        type = "ULTRAHIGH"
        size = 100
      }

      backup_strategy {
        start_time = "08:00-09:00"
        keep_days  = 1
      }
    }

    # Create the database
    resource "huaweicloud_rds_mysql_database" "test" {
      instance_id   = huaweicloud_rds_instance.instance.id
      name          = "test"
      character_set = "utf8"
      description   = "test database"
    }

    # Create the account
    resource "huaweicloud_rds_mysql_account" "test" {
      instance_id = huaweicloud_rds_instance.instance.id
      name        = "test"
      password    = "Test@12345678"
    }

    # Grant privileges
    resource "huaweicloud_rds_mysql_database_privilege" "test" {
      instance_id = huaweicloud_rds_instance.instance.id
      db_name     = huaweicloud_rds_mysql_database.test.name

      users {
        name     = huaweicloud_rds_mysql_account.test.name
        readonly = false
      }
    }
K8s
Huawei Cloud
    variable "cluster_id" {}
    variable "key_pair" {}
    variable "availability_zone" {}

    resource "huaweicloud_cce_node_pool" "node_pool" {
      cluster_id               = var.cluster_id
      name                     = "testpool"
      os                       = "EulerOS 2.5"
      initial_node_count       = 2
      flavor_id                = "s3.large.4"
      availability_zone        = var.availability_zone
      key_pair                 = var.key_pair
      scall_enable             = true
      min_node_count           = 1
      max_node_count           = 10
      scale_down_cooldown_time = 100
      priority                 = 1
      type                     = "vm"

      root_volume {
        size       = 40
        volumetype = "SAS"
      }
      data_volumes {
        size       = 100
        volumetype = "SAS"
      }
    }
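Scripts like the samples above would have to be generated from the Infra manifest. The following is a minimal sketch of such a renderer, assuming the manifest has already been reduced to a nested dictionary of literal values. `hcl_value`, `render_body`, and `render_resource` are hypothetical helpers, and the sketch deliberately does not handle references such as var.cluster_id, lists, or repeated blocks.

```python
def hcl_value(value):
    """Render a Python literal as an HCL literal (strings, numbers, booleans only)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

def render_body(attrs, depth):
    """Render attributes; a nested dict becomes a nested block."""
    pad = "  " * depth
    lines = []
    for key, value in attrs.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key} {{")
            lines.extend(render_body(value, depth + 1))
            lines.append(f"{pad}}}")
        else:
            lines.append(f"{pad}{key} = {hcl_value(value)}")
    return lines

def render_resource(resource_type, name, attrs):
    header = f'resource "{resource_type}" "{name}" {{'
    return "\n".join([header, *render_body(attrs, 1), "}"])

node_pool = {
    "name": "testpool",
    "initial_node_count": 2,
    "scall_enable": True,
    "root_volume": {"size": 40, "volumetype": "SAS"},
}
hcl = render_resource("huaweicloud_cce_node_pool", "node_pool", node_pool)
print(hcl)
```

A production renderer would more likely emit Terraform's JSON syntax or use an HCL library, which avoids hand-written escaping; string templating is shown here only because it makes the mapping from manifest to script easy to see.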
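Isolation
terraform_remote_state is a data source that reads the Terraform state file written by another Terraform configuration directly
For example, the isolation granularity for databases is the Database rather than the RDS instance, so a single DB is all that is needed
The DB source each Service needs can then be obtained through remote_state
    data "terraform_remote_state" "db" {
      backend = "s3"

      config = {
        bucket = "bucket_name"
        key    = "prod/data-stores/mysql/terraform.tfstate"
        region = "us-east-2"
      }
    }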
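What a consuming Service gets from remote state is the set of root-level outputs of the other configuration. As a rough illustration of that idea only (Terraform does this itself; no service code needs to parse state files), the sketch below pulls the output values out of a state document. The state content, the output names "address" and "port", and the value "db.internal" are all invented for the example.

```python
import json

# Invented, heavily trimmed state document containing only root outputs.
STATE_JSON = """
{
  "version": 4,
  "outputs": {
    "address": {"value": "db.internal", "type": "string"},
    "port": {"value": 3306, "type": "number"}
  },
  "resources": []
}
"""

def root_outputs(state):
    """Return {output name: value} for the root-level outputs of a state document."""
    return {name: entry["value"] for name, entry in state.get("outputs", {}).items()}

outputs = root_outputs(json.loads(STATE_JSON))
print(outputs)
```

This also shows the isolation boundary: only what the database configuration chooses to export as an output is visible to the Services that read it.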
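State management
Terraform script code can be hosted in Git, but State, which records the current state of the cluster, should not be, for the following reasons
- Manual error
Someone forgets to pull the latest state file from Git, or forgets to push the state file to Git after running Terraform
The system then unexpectedly reverts to an earlier state or repeats a previous deployment
- Locking
Git has no locking mechanism; a lock is what prevents multiple instances from running terraform apply against the same state file at the same time
- Secrets
Terraform state files are plain text, and sensitive data related to resources may be written to them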
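The locking point can be made concrete with a toy model. The class below is an in-memory stand-in for a lock table, written only to show the behavior a lock provides (a second run against the same state file is refused until the first releases it). It is not how any real backend implements locking, and all names in it are hypothetical.

```python
class StateLockError(RuntimeError):
    """Raised when another run already holds the lock for a state file."""

class InMemoryLockTable:
    def __init__(self):
        self._holders = {}  # state key -> owner currently holding the lock

    def acquire(self, state_key, owner):
        # setdefault stores the owner only if nobody holds the key yet.
        holder = self._holders.setdefault(state_key, owner)
        if holder != owner:
            raise StateLockError(f"{state_key} is locked by {holder}")

    def release(self, state_key, owner):
        if self._holders.get(state_key) == owner:
            del self._holders[state_key]

locks = InMemoryLockTable()
locks.acquire("prod/terraform.tfstate", "run-A")
try:
    locks.acquire("prod/terraform.tfstate", "run-B")
    second_run_blocked = False
except StateLockError:
    second_run_blocked = True

locks.release("prod/terraform.tfstate", "run-A")
locks.acquire("prod/terraform.tfstate", "run-B")  # succeeds after the release
print(second_run_blocked)
```

Git offers nothing equivalent to the refusal in the `try` block, which is why two people applying from their own checkouts can silently overwrite each other's state.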
State storage
Amazon S3 as remote storage for Terraform state files
    provider "aws" {
      region = "us-east-2"
    }

    resource "aws_s3_bucket" "terraform_state" {
      bucket = "terraform-state"

      # Prevent terraform destroy from deleting the S3 bucket
      lifecycle {
        prevent_destroy = true
      }

      # Enable versioning on the bucket
      versioning {
        enabled = true
      }

      # Server-side encryption
      server_side_encryption_configuration {
        rule {
          apply_server_side_encryption_by_default {
            sse_algorithm = "AES256"
          }
        }
      }
    }
State locking
    resource "aws_dynamodb_table" "terraform_locks" {
      name         = "terraform-locks"
      billing_mode = "PAY_PER_REQUEST"
      hash_key     = "LockID"
      attribute {
        name = "LockID"
        type = "S"
      }
    }
Configuring the backend
    terraform {
      backend "s3" {
        bucket = "terraform-state"
        key    = "/xxx/xxx/terraform.tfstate"
        region = "xxx"

        dynamodb_table = "terraform-locks"
        encrypt        = true
      }
    }
The backend configuration can also be kept in a separate file
    # backend.hcl
    bucket         = "xxx"
    region         = "xxx"
    dynamodb_table = "xxx"
    encrypt        = true
terraform init -backend-config=backend.hcl
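When the service drives Terraform itself, the init command has to be assembled per tenant. A small sketch of doing that, assuming the shared settings live in backend.hcl and only the state key differs per tenant: `terraform_init_argv` is a hypothetical helper and the key "tenant1/terraform.tfstate" is an invented example. The `-backend-config` flag accepts either a file path or a single KEY=VALUE pair and may be repeated.

```python
def terraform_init_argv(backend_file, overrides=None):
    """Build the argument list for `terraform init` with partial backend config."""
    argv = ["terraform", "init", f"-backend-config={backend_file}"]
    for key, value in sorted((overrides or {}).items()):
        argv.append(f"-backend-config={key}={value}")
    return argv

argv = terraform_init_argv("backend.hcl", {"key": "tenant1/terraform.tfstate"})
print(argv)
```

Passing the list to a process runner (rather than joining it into a shell string) avoids quoting problems when a value contains spaces.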
Reusable infrastructure
Default quota: infrastructure reuse
Default quota: an HCL configuration that specifies no additional attributes is treated as using the default quota
infra component: the base cloud infrastructure components, one set per Provider
    provider "alicloud" {
      access_key = "your-access-key"
      secret_key = "your-secret-key"
      region     = "cn-hangzhou"
    }

    provider "huaweicloud" {
      access_key = "your-access-key"
      secret_key = "your-secret-key"
      region     = "cn-north-1"
      project_id = "your-project-id"
    }

    provider "azurerm" {
      features {}
      client_id       = "your-client-id"
      client_secret   = "your-client-secret"
      subscription_id = "your-subscription-id"
      tenant_id       = "your-tenant-id"
    }
If the default configuration we provide is used, go through infra
-> huawei
 | alicloud
 | azure
    - infra
      - alicloud
        - component
          - k8s
            - main.tf
          - rds
            - main.tf

    - tenant1
      - service
        - main.tf
      - databases
        - main.tf

    module "webservice" {
      source = "../../infra/alicloud/component/k8s/..."
    }

    module "databases" {
      source = "../../infra/alicloud/component/rds/..."
    }
If there is any custom configuration, write a fully custom definition instead of importing modules
    - tenant1
      - service
      - databases
        - main.tf

    resource xxx {
      xxx
      ....
    }
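The choice between the two layouts above can be expressed as a small decision function. In this sketch, `module_block` and `DEFAULT_PROVIDERS` are hypothetical names, the directory name used for Azure is an assumption, and the module path stops at the component directory because the remainder of the path is elided in the samples.

```python
# Provider directory names under infra/; "azure" is an assumed directory name.
DEFAULT_PROVIDERS = ("huaweicloud", "alicloud", "azure")

def module_block(name, provider, component, custom=False):
    """Return a module block for the default path, or None for the custom path.

    None signals that the tenant writes its own resource blocks in main.tf
    instead of importing a shared component.
    """
    if custom:
        return None
    if provider not in DEFAULT_PROVIDERS:
        raise ValueError(f"no default components for provider: {provider}")
    source = f"../../infra/{provider}/component/{component}"
    return f'module "{name}" {{\n  source = "{source}"\n}}'

print(module_block("webservice", "alicloud", "k8s"))
print(module_block("databases", "alicloud", "rds", custom=True))
```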
Default quota: tunable parameters
    - infra
      - huaweicloud
        - component
          - k8s
            - main.tf

    - tenant1
      - service
        - var.tf
        - main.tf
      - databases

    # var.tf
    variable "cluster_node_count" {
      description = "Number of nodes in the node pool"
      type        = number
    }

    # main.tf
    module "webservice" {
      source             = "../../infra/huaweicloud/component/k8s/..."
      cluster_node_count = var.cluster_node_count
    }

    # infra/huaweicloud/component/k8s/main.tf
    resource "huaweicloud_cce_node_pool" "test" {
      ...
      initial_node_count = var.cluster_node_count
      scall_enable       = false
      min_node_count     = 0
      max_node_count     = 0
      ...
    }
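With the tunable exposed as a variable, the value a user submits still has to reach Terraform. One option is a JSON variables file, since Terraform automatically loads files named terraform.tfvars.json or ending in .auto.tfvars.json. A minimal sketch, in which `tfvars_json` and the shape of the form data are assumptions:

```python
import json

def tfvars_json(form):
    """Turn submitted form data into the content of a *.tfvars.json file."""
    count = int(form["cluster_node_count"])  # form values often arrive as strings
    if count < 0:
        raise ValueError("cluster_node_count must not be negative")
    return json.dumps({"cluster_node_count": count}, indent=2)

content = tfvars_json({"cluster_node_count": "3"})
print(content)
```

Keeping the tunable in a variables file leaves the tenant's main.tf identical across tenants, so only the data changes between deployments.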
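Deployment
Deploy package
Advantages of Sealos
- A self-hosted cloud comes with a private image registry of the image-cri-shim kind
- LVScare provides lightweight IPVS load balancing across multiple masters
- Images ship with the kubectl and sealos commands, so the deployment work is bundled into a single image
- All business images are downloaded offline into the registry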
Manually installing a self-hosted private cloud
    sealos run labring/kubernetes:v1.25.0 labring/helm:v3.8.2 labring/calico:v3.24.1 \
      --masters 192.168.64.2,192.168.64.22,192.168.64.20 \
      --nodes 192.168.64.21,192.168.64.19 -p [your-ssh-passwd]
Manually rolling out the Deploy package
    sealos run digiwin.com/xxx/deploy:x.x.x
Deploy package structure
    .
    ├── charts
    │   └── nginx
    │       ├── Chart.lock
    │       ├── charts
    │       ├── Chart.yaml
    │       ├── README.md
    │       ├── templates
    │       ├── values.schema.json
    │       └── values.yaml
    ├── images
    │   └── shim
    │       └── nginxImages
    ├── init.sh
    ├── Kubefile
    ├── manifests
    │   └── nginx
    │       ├── deployment.yaml
    │       ├── ingress.yaml
    │       └── service.yaml
    ├── opt
    │   └── helm
    └── registry
Kubefile contents
    FROM scratch
    ENV version v1.1.0
    COPY manifests ./manifests
    COPY registry ./registry
    ENTRYPOINT ["kubectl apply -f manifests/tigera-operator.yaml"]
    CMD ["kubectl apply -f manifests/custom-resources.yaml"]
    # Alternatively, a Python script can do the initialization: it performs the full SQL rollout and runs kubectl apply -f for the cluster's YAML files
    CMD ["manifests/start.py"]
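The Python initialization script mentioned above is not shown on this page. As a sketch of the kubectl half of such a script only (the SQL rollout is left out), the code below plans one `kubectl apply -f` call per YAML manifest and keeps the planning separate from execution. The function names and the example paths are assumptions.

```python
import subprocess

def plan_apply(paths):
    """Return one kubectl command per YAML file, in a stable sorted order."""
    manifests = sorted(p for p in paths if p.endswith((".yaml", ".yml")))
    return [["kubectl", "apply", "-f", path] for path in manifests]

def run(commands):
    """Execute the planned commands, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)

commands = plan_apply([
    "manifests/nginx/service.yaml",
    "manifests/nginx/deployment.yaml",
    "manifests/nginx/ingress.yaml",
    "manifests/start.py",
])
for command in commands:
    print(" ".join(command))
# run(commands)  # left commented out: requires kubectl and a reachable cluster
```

Sorted order is only a placeholder for real dependency ordering; if some manifests must be applied before others, the plan step is where that ordering would be encoded.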
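Building / provisioning / applying the Deploy package
Open questions and materials
Questions
Does the customer offer a sub-account model? A partner model? Are bills requested through the main account? This affects the granularity of the gross-margin calculation
- Usage limits: maximum node usage per cluster per account
- Billing: yearly / monthly / pay-per-use
- Infrastructure asset inventory: identify facilities not linked to any customer Infra manifest, and Infra manifest entries that have not landed
- Workflow
  - RDS purchase
  - RDS logical database configuration
  - K8s NodePool/AgentPool configuration
  - K8s Scale: scaling the instance count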
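The asset inventory item in the list above is essentially a two-way set difference between what manifests declare and what the cloud account contains. A minimal sketch, assuming both sides can be reduced to comparable resource identifiers; `reconcile` and the identifiers in the example are invented for illustration.

```python
def reconcile(manifest_resources, deployed_resources):
    """Compare declared resources with deployed ones.

    not_landed: declared in an Infra manifest but absent from the cloud account.
    unlinked:   present in the cloud account but declared in no Infra manifest.
    """
    declared = set(manifest_resources)
    deployed = set(deployed_resources)
    return {
        "not_landed": sorted(declared - deployed),
        "unlinked": sorted(deployed - declared),
    }

report = reconcile(
    manifest_resources=["rds-a", "nodepool-a", "nodepool-b"],
    deployed_resources=["rds-a", "nodepool-a", "eip-orphan"],
)
print(report)
```

The "unlinked" list is also the natural input for cost cleanup, since resources nobody declared are the ones most likely to keep incurring charges unnoticed.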
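Materials to prepare
- Open sub-accounts
  - Huawei Cloud
    - IAM, Region, project ID, project name
    - Sub-account [programmatic access]
    - CCE FullAccess
    - RDS ManageAccess
    - User Name, Access Key Id, Secret Access Key
  - Microsoft Azure (swqqh@qq.com)
    - Microsoft Entra Privileged Identity Management
    - Credentials: subscription_id, tenant_id, client_id, client_secret
  - Alibaba Cloud
    - RAM sub-account, Region, user group / permissions
    - AliyunCSFullAccess
    - AliyunRDSFullAccess
    - Enable OpenAPI call access
    - AccessKey SECRET_KEY
- Terraform test/development cluster: testing incurs charges
- Teardown: the intermediate routing facilities must be reclaimed as well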