INTRO
The previous post gave a brief overview of the OpenShift Logging Operator.
This post walks through the process of installing the OpenShift Logging Operator.
OpenShift Elasticsearch Operator and OpenShift Logging Operator - CLI installation
Elasticsearch is a memory-hungry application. Because three Elasticsearch nodes are deployed, each needing at least 16GB of memory, the target Infra nodes (or Compute nodes) must have 16GB of memory or more.
If Elasticsearch stops serving due to OOM, scaling out by adding nodes is recommended over increasing the heap memory.
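For reference, a minimal scale-out sketch (this assumes the ClusterLogging instance named "instance" created later in this post; the nodeCount of 5 is only illustrative):
# oc -n openshift-logging patch clusterlogging/instance --type merge \
    -p '{"spec":{"logStore":{"elasticsearch":{"nodeCount":5}}}}'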
Create the namespace (project) where Elasticsearch will be installed.
Set the openshift.io/cluster-monitoring: "true" label so that Prometheus can collect metrics from it.
# cat << EOF > eo-namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-operators-redhat
  annotations:
    openshift.io/node-selector: ""
  labels:
    openshift.io/cluster-monitoring: "true"
EOF
<Create and verify the namespace>
# oc create -f eo-namespace.yaml
namespace/openshift-operators-redhat created
# oc get project openshift-operators-redhat
NAME DISPLAY NAME STATUS
openshift-operators-redhat Active
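You can also confirm that the cluster-monitoring label was applied:
# oc get ns openshift-operators-redhat --show-labels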
Create the namespace where the OpenShift Logging Operator will be installed.
Set the openshift.io/cluster-monitoring: "true" label so that Prometheus can collect metrics from it.
# cat << EOF > olo-namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-logging
  annotations:
    openshift.io/node-selector: ""
  labels:
    openshift.io/cluster-monitoring: "true"
EOF
<Create and verify the namespace>
# oc create -f olo-namespace.yaml
namespace/openshift-logging created
# oc get project openshift-logging
NAME DISPLAY NAME STATUS
openshift-logging Active
Create the Operator group for the Elasticsearch Operator.
Operator group: groups all Operators deployed in the same namespace and provides them with the list of namespaces (or the cluster level) in which to watch for their CRs (Custom Resources)
# cat << EOF > eo-og.yaml
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: openshift-operators-redhat
  namespace: openshift-operators-redhat
spec: {}
EOF
<Create and verify the Operator group>
# oc create -f eo-og.yaml
operatorgroup.operators.coreos.com/openshift-operators-redhat created
# oc get og -n openshift-operators-redhat
NAME AGE
openshift-operators-redhat 13s
Create a Subscription for the Elasticsearch Operator and check the CSV.
The Elasticsearch Operator is installed in the "openshift-operators-redhat" namespace.
- Subscription: tracks a channel in a package to keep the CSV up to date
- CSV (Cluster Service Version): represents a specific version of an Operator running on the cluster
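If you are unsure which channels your catalog provides, the package manifest lists them (assuming the mirrored catalog exposes the package):
# oc get packagemanifest elasticsearch-operator -n openshift-marketplace -o jsonpath='{.status.channels[*].name}'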
# cat << EOF > eo-sub.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: "elasticsearch-operator"
  namespace: "openshift-operators-redhat"
spec:
  channel: "stable"
  installPlanApproval: "Automatic"
  source: "redhat-operator-index-ocptb"
  sourceNamespace: "openshift-marketplace"
  name: "elasticsearch-operator"
EOF
<Create and verify the Subscription>
# oc create -f eo-sub.yaml
subscription.operators.coreos.com/elasticsearch-operator created
# oc get sub -n openshift-operators-redhat
NAME PACKAGE SOURCE CHANNEL
elasticsearch-operator elasticsearch-operator redhat-operator-index-ocptb stable
# oc get csv --all-namespaces |grep elasticsearch-operator
default elasticsearch-operator.v5.7.2 OpenShift Elasticsearch Operator 5.7.2 elasticsearch-operator.v5.7.1 Succeeded
kube-node-lease elasticsearch-operator.v5.7.2 OpenShift Elasticsearch Operator 5.7.2 elasticsearch-operator.v5.7.1 Succeeded
kube-public elasticsearch-operator.v5.7.2 OpenShift Elasticsearch Operator 5.7.2 elasticsearch-operator.v5.7.1 Succeeded
kube-system elasticsearch-operator.v5.7.2 OpenShift Elasticsearch Operator 5.7.2 elasticsearch-operator.v5.7.1 Succeeded
openshift-apiserver-operator elasticsearch-operator.v5.7.2 OpenShift Elasticsearch Operator 5.7.2 elasticsearch-operator.v5.7.1 Succeeded
...(output truncated)
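The CSV appears in every namespace because the Operator group watches all namespaces, so OLM copies the CSV cluster-wide; to check only the install namespace:
# oc get csv -n openshift-operators-redhat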
Create the Operator group for the OpenShift Logging Operator.
# cat << EOF > olo-og.yaml
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: cluster-logging
  namespace: openshift-logging
spec:
  targetNamespaces:
  - openshift-logging
EOF
<Create and verify the Operator group>
# oc create -f olo-og.yaml
operatorgroup.operators.coreos.com/cluster-logging created
# oc get og -n openshift-logging
NAME AGE
cluster-logging 16s
Create a Subscription for the OpenShift Logging Operator and check the CSV.
The OpenShift Logging Operator is installed in the "openshift-logging" namespace.
# cat << EOF > olo-sub.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: cluster-logging
  namespace: openshift-logging
spec:
  channel: "stable"
  name: cluster-logging
  source: redhat-operator-index-ocptb
  sourceNamespace: openshift-marketplace
EOF
<Create and verify the Subscription>
# oc create -f olo-sub.yaml
subscription.operators.coreos.com/cluster-logging created
# oc get sub -n openshift-logging
NAME PACKAGE SOURCE CHANNEL
cluster-logging cluster-logging redhat-operator-index-ocptb stable
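installPlanApproval is omitted in this Subscription, which OLM treats as Automatic; if needed, the generated InstallPlan can be inspected with:
# oc get installplan -n openshift-logging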
# oc get csv -A |grep -i openshift-logging
openshift-logging cluster-logging.v5.7.2 Red Hat OpenShift Logging 5.7.2 cluster-logging.v5.7.1 Succeeded
Creating the OpenShift Logging instance
Write the instance YAML for the OpenShift Logging Operator, then create it.
Because the Infra nodes currently carry a taint, tolerations must be set.
- curator: performs recurring Elasticsearch maintenance tasks such as index creation/deletion, snapshot creation/deletion, and shard routing allocation changes (repeated on a cron schedule)
- .spec.logStore.elasticsearch.storage: {} : entering {} for storage creates an emptyDir volume (a persistent alternative is sketched after this list)
※ emptyDir: a temporary volume created when the Pod starts and deleted together with the Pod
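For production use, the emptyDir should be replaced with persistent storage; a sketch of the storage stanza (the storageClassName below is an assumption for illustration; persistent storage is configured in a later post):
  elasticsearch:
    storage:
      storageClassName: "managed-nfs-storage"   # assumption: replace with a StorageClass that exists in your cluster
      size: 200G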
# cat << EOF > olo-instance.yaml
apiVersion: "logging.openshift.io/v1"
kind: "ClusterLogging"
metadata:
  name: "instance"
  namespace: "openshift-logging"
spec:
  managementState: "Managed"
  logStore:
    type: "elasticsearch"
    retentionPolicy:
      application:
        maxAge: 1d
      infra:
        maxAge: 7d
      audit:
        maxAge: 7d
    elasticsearch:
      nodeCount: 3
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      tolerations:
      - key: node-role.kubernetes.io/infra
        value: reserved
        effect: NoSchedule
      - key: node-role.kubernetes.io/infra
        value: reserved
        effect: NoExecute
      storage: {}
      resources:
        limits:
          memory: "8Gi"
        requests:
          memory: "8Gi"
      proxy:
        resources:
          limits:
            memory: 256Mi
          requests:
            memory: 256Mi
      redundancyPolicy: "SingleRedundancy"
  visualization:
    type: "kibana"
    kibana:
      replicas: 1
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      tolerations:
      - key: node-role.kubernetes.io/infra
        value: reserved
        effect: NoSchedule
      - key: node-role.kubernetes.io/infra
        value: reserved
        effect: NoExecute
  curation:
    type: "curator"
    curator:
      schedule: "30 3 * * *"
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      tolerations:
      - key: node-role.kubernetes.io/infra
        value: reserved
        effect: NoSchedule
      - key: node-role.kubernetes.io/infra
        value: reserved
        effect: NoExecute
  collection:
    logs:
      type: "fluentd"
      fluentd:
        tolerations:
        - key: node-role.kubernetes.io/infra
          value: reserved
          effect: NoSchedule
        - key: node-role.kubernetes.io/infra
          value: reserved
          effect: NoExecute
EOF
<Create and verify the OpenShift Logging instance>
# oc create -f olo-instance.yaml
# oc get pod -n openshift-logging -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
cluster-logging-operator-6f7d96cfb5-gvv84 1/1 Running 0 37m 10.131.0.25 worker2.ocp4.example.com <none> <none>
collector-29d4x 2/2 Running 0 77s 10.128.2.31 worker1.ocp4.example.com <none> <none>
collector-j4hk8 2/2 Running 0 76s 10.129.0.68 master1.ocp4.example.com <none> <none>
collector-n6nd8 2/2 Running 0 77s 10.130.2.28 infra2.ocp4.example.com <none> <none>
collector-qdsqn 2/2 Running 0 77s 10.129.2.23 infra1.ocp4.example.com <none> <none>
collector-r565w 2/2 Running 0 77s 10.128.0.45 master2.ocp4.example.com <none> <none>
collector-tm7mt 2/2 Running 0 77s 10.131.0.31 worker2.ocp4.example.com <none> <none>
collector-xl47z 2/2 Running 0 76s 10.130.0.42 master3.ocp4.example.com <none> <none>
collector-xzkvh 2/2 Running 0 77s 10.131.2.21 infra3.ocp4.example.com <none> <none>
elasticsearch-cdm-ihn3yrdv-1-85cdbd7b4f-mgrtl 2/2 Running 0 22m 10.130.2.14 infra2.ocp4.example.com <none> <none>
elasticsearch-cdm-ihn3yrdv-2-6674db8445-dsk2l 2/2 Running 0 22m 10.131.2.14 infra3.ocp4.example.com <none> <none>
elasticsearch-cdm-ihn3yrdv-3-6997c99649-hmtkv 2/2 Running 0 22m 10.129.2.14 infra1.ocp4.example.com <none> <none>
elasticsearch-im-app-28181595-5zjkd 0/1 Completed 0 3m53s 10.130.2.26 infra2.ocp4.example.com <none> <none>
elasticsearch-im-audit-28181595-ln2cp 0/1 Completed 0 3m53s 10.130.2.25 infra2.ocp4.example.com <none> <none>
elasticsearch-im-infra-28181595-bdhm5 0/1 Completed 0 3m53s 10.130.2.27 infra2.ocp4.example.com <none> <none>
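The elasticsearch-im-* pods above are Jobs spawned by the index management CronJobs that enforce the retentionPolicy; they can be listed with:
# oc get cronjob -n openshift-logging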
Kibana fails to start when deploying the OpenShift Logging instance
When the OpenShift Logging instance is created, the Kibana pod may fail to start.
Check the logs of the Elasticsearch Operator pod.
The logs show that the egress HTTPS calls to the image registry failed, so the resources Kibana needs were missing and the pod never started.
# oc get pod -n openshift-operators-redhat
NAME READY STATUS RESTARTS AGE
elasticsearch-operator-698bb969f4-shkfr 2/2 Running 0 119m
# oc logs -c elasticsearch-operator elasticsearch-operator-698bb969f4-shkfr -n openshift-operators-redhat |more
...(output truncated)
{"_ts":"2023-08-01T12:06:35.919878649Z","_level":"0","_component":"elasticsearch-operator_controllers_Kibana","_message":"Registering future events","name":{"Namespace":"openshift-loggi
ng","Name":"kibana"}}
{"_ts":"2023-08-01T12:06:57.139310355Z","_level":"0","_component":"elasticsearch-operator_controllers_Elasticsearch","_message":"Updated Elasticsearch","cluster":"elasticsearch","namesp
ace":"openshift-logging","retries":0}
{"_ts":"2023-08-01T12:07:08.266232743Z","_level":"0","_component":"elasticsearch-operator","_message":"Reconciler error","Kibana":{"name":"kibana","namespace":"openshift-logging"},"_err
or":{"cause":{"cause":{"ErrStatus":{"metadata":{},"status":"Failure","message":"ConfigMap \"kibana-trusted-ca-bundle\" not found","reason":"NotFound","details":{"name":"kibana-trusted-c
a-bundle","kind":"ConfigMap"},"code":404}},"msg":"failed to get configmap","name":"kibana-trusted-ca-bundle","namespace":"openshift-logging"},"cluster":"kibana","msg":"failed to get tru
sted CA bundle config map"},"controller":"kibana-controller","controllerGroup":"logging.openshift.io","controllerKind":"Kibana","name":"kibana","namespace":"openshift-logging","reconcil
eID":"65b185a2-8ff3-4e35-ad6d-db8e21fda738"}
{"_ts":"2023-08-01T12:07:08.311912381Z","_level":"0","_component":"elasticsearch-operator","_message":"Reconciler error","Kibana":{"name":"kibana","namespace":"openshift-logging"},"_err
or":{"cause":{"msg":"ImageStream tag contains no images","name":"oauth-proxy","namespace":"openshift","tag":"v4.4"},"msg":"Failed to get oauth-proxy image"},"controller":"kibana-control
ler","controllerGroup":"logging.openshift.io","controllerKind":"Kibana","name":"kibana","namespace":"openshift-logging","reconcileID":"2779a52b-491f-46cd-9c9d-703666889eb4"}
Check the oauth-proxy ImageStream to see whether the certificate is valid.
The check shows a CA certificate error against the image registry.
# oc get -o yaml is/oauth-proxy -n openshift
apiVersion: image.openshift.io/v1
kind: ImageStream
metadata:
annotations:
include.release.openshift.io/ibm-cloud-managed: "true"
include.release.openshift.io/self-managed-high-availability: "true"
name: oauth-proxy
namespace: openshift
...(output truncated)
status:
dockerImageRepository: ""
tags:
- conditions:
- generation: 617
lastTransitionTime: "2023-08-01T13:08:33Z"
message: 'Internal error occurred: [harbor.example.com:443/ocp4/openshift4@sha256:330d1bc787a8c84b3f7b02f50fb9be2e840361aefb1580d70ff75cb1f73a4a15:
Get "https://harbor.example.com:443/v2/": x509: certificate signed by unknown
authority, quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:330d1bc787a8c84b3f7b02f50fb9be2e840361aefb1580d70ff75cb1f73a4a15:
Get "https://quay.io/v2/": dial tcp 44.194.1.249:443: connect: connection
refused]'
reason: InternalError
status: "False"
type: ImportSuccess
items: null
tag: v4.4
Check whether the cluster proxy configuration includes the CA certificate.
Nothing is defined in spec, so we will set the "user-ca-bundle" configmap that already exists on the cluster as trustedCA.
# oc get proxy cluster -o yaml
apiVersion: config.openshift.io/v1
kind: Proxy
metadata:
creationTimestamp: "2023-07-25T07:30:44Z"
generation: 4
name: cluster
resourceVersion: "3331271"
uid: 4ceb82ab-7980-4384-b425-27baa08e4980
spec: {}
status: {}
"user-ca-bundle" configmap을 조회합니다.
install-config.yaml에 사용하였던 "additionalTrustBundle" 즉, Image Registry 인증서의 내용이 확인되었습니다.
# oc get cm user-ca-bundle -n openshift-config -o json |jq .data
{
"ca-bundle.crt": "-----BEGIN CERTIFICATE-----\BsZTE...(생략)\n-----END CERTIFICATE-----\n"
}
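To confirm this really is the registry CA, the first certificate in the bundle can be decoded (the jsonpath escapes the dot in the key name):
# oc get cm user-ca-bundle -n openshift-config -o jsonpath='{.data.ca-bundle\.crt}' | openssl x509 -noout -subject -dates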
Add the "user-ca-bundle" configmap identified above to the cluster proxy.
Then verify that the "user-ca-bundle" configmap is defined as trustedCA.
# oc patch proxy/cluster --type merge -p '{"spec":{"trustedCA":{"name": "user-ca-bundle"}}}'
<Verify the trustedCA was added>
# oc get proxy cluster -o json |jq .spec
{
"trustedCA": {
"name": "user-ca-bundle"
}
}
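With trustedCA set, the cluster injects the merged bundle into ConfigMaps labeled config.openshift.io/inject-trusted-cabundle="true", so the kibana-trusted-ca-bundle ConfigMap from the earlier reconcile error should eventually appear:
# oc get cm kibana-trusted-ca-bundle -n openshift-logging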
Delete the problematic oauth-proxy ImageStream and check it again. (It is recreated automatically after a few minutes; never create it manually.)
# oc delete is -n openshift oauth-proxy
<Verify the CA certificate error is resolved>
# oc get is -n openshift oauth-proxy -o json |jq .status
{
"dockerImageRepository": "",
"tags": [
{
"items": [
{
"created": "2023-08-01T13:24:11Z",
"dockerImageReference": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:330d1bc787a8c84b3f7b02f50fb9be2e840361aefb1580d70ff75cb1f73a4a15",
"generation": 2,
"image": "sha256:330d1bc787a8c84b3f7b02f50fb9be2e840361aefb1580d70ff75cb1f73a4a15"
}
],
"tag": "v4.4"
}
]
}
Check that the Kibana pod is now running.
# oc get pod -n openshift-logging
NAME READY STATUS RESTARTS AGE
cluster-logging-operator-6f7d96cfb5-gvv84 1/1 Running 0 88m
collector-29d4x 2/2 Running 0 52m
collector-j4hk8 2/2 Running 0 52m
collector-n6nd8 2/2 Running 0 52m
collector-qdsqn 2/2 Running 0 52m
collector-r565w 2/2 Running 0 52m
collector-tm7mt 2/2 Running 0 52m
collector-xl47z 2/2 Running 0 52m
collector-xzkvh 2/2 Running 0 52m
elasticsearch-cdm-ihn3yrdv-1-85cdbd7b4f-mgrtl 2/2 Running 0 74m
elasticsearch-cdm-ihn3yrdv-2-6674db8445-dsk2l 2/2 Running 0 74m
elasticsearch-cdm-ihn3yrdv-3-6997c99649-hmtkv 2/2 Running 0 74m
elasticsearch-im-app-28181640-sq9m9 0/1 Completed 0 10m
elasticsearch-im-audit-28181640-7vl75 0/1 Completed 0 10m
elasticsearch-im-infra-28181640-8nfwp 0/1 Completed 0 10m
kibana-64985c788c-lg4fz 2/2 Running 0 46m
Finally, look up the Route to find the address for accessing the Kibana console.
Most users connect from PCs that cannot resolve the cluster's wildcard DNS, so the route hostname below must be mapped in the Windows hosts file.
EX) <IP address> kibana-openshift-logging.apps.ocp4.example.com
# oc get route -n openshift-logging
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
kibana kibana-openshift-logging.apps.ocp4.example.com kibana <all> reencrypt/Redirect None
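To find the IP address to map, one option (assuming the default IngressController runs host-networked on cluster nodes) is to check which nodes host the router pods and use one of their node IPs:
# oc get pod -n openshift-ingress -o wide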