[OCTRL-1081] Wrap kubectl into Mesos executor task #805
Open
justonedev1 wants to merge 3 commits into master from OCTRL-1081
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,70 @@ | ||
| # ECS with Kubernetes | ||
|
|
||
| > ⚠️ **Warning** | ||
| > All Kubernetes work is currently at the prototype stage. | ||
|
|
||
| ## Kubernetes Cluster | ||
|
|
||
| While prototyping we used several Kubernetes distributions, namely [`kind`](https://kind.sigs.k8s.io/), [`minikube`](https://minikube.sigs.k8s.io/docs/) and [`k3s`](https://k3s.io/), | ||
| in both local and remote deployments. We used OpenStack for the remote deployments. | ||
| Follow the guides of the individual distributions to create the desired cluster setup. | ||
| `k3s` is recommended for running this prototype, as it is a lightweight | ||
| and easily installed distribution that is also [`CNCF`](https://www.cncf.io/training/certification/) certified. | ||
|
|
||
| All `k3s` settings were left at their defaults except one: the locked-in-memory size limit. Use `ulimit -l` to check | ||
| the limit for the current user, and `LimitMEMLOCK` in the k3s systemd service config | ||
| to set it to the correct value. Right now the `flp` user has an unlimited size (`LimitMEMLOCK=infinity`). | ||
| This setting is necessary because, even if you run Pods with the privileged security context | ||
| under user `flp`, Kubernetes still sets limits according to its internal settings and doesn't | ||
| respect the Linux settings. | ||
|
|
||
| At this moment we also expect the target nodes to be able | ||
| to run Pods with privileged permissions and under user `flp`. | ||
| This means that the machine has to have the `flp` user set up the same way as | ||
| if you had done the installation with [`o2-flp-setup`](https://alice-flp.docs.cern.ch/Operations/Experts/system-configuration/utils/o2-flp-setup/). | ||
|
|
||
| ## Running tasks (`KubectlTask`) | ||
|
|
||
| ECS is set up to run tasks through Mesos on bare metal on all required hosts, with active | ||
| task management (see [`ControllableTask`](/executor/executable/controllabletask.go)) | ||
| and OCC gRPC communication. When running a Docker task through ECS, we could easily | ||
| wrap the command to be run into a Docker container with proper settings | ||
| ([see](/docs/running_docker.md)). This is, however, not possible for Kubernetes | ||
| workloads, as the Pods are "hidden" inside the cluster. So we plan | ||
| to deploy our own Task Controller, which will connect to and drive | ||
| the OCC state machine of the required tasks. Thus we need a custom | ||
| proof-of-concept way to communicate with the Kubernetes cluster from the Mesos executor. | ||
|
|
||
| The reason why we don't call the Kubernetes cluster directly from the ECS core | ||
| is that ECS does a lot of heavy lifting when deploying workloads, | ||
| monitoring them and generating a lot of configuration, which | ||
| is not trivial to replicate manually. However, if we create a component | ||
| that is able to deploy one task into Kubernetes and monitor its | ||
| state, we can replicate the `ControllableTask` workflow, leave ECS | ||
| mostly intact for now, save a lot of work and focus on prototyping | ||
| the Kubernetes operator pattern. | ||
|
|
||
| Thus [`KubectlTask`](/executor/executable/kubectltask.go) was created. It | ||
| is written as a wrapper around the `kubectl` utility for managing a Kubernetes cluster. | ||
| It is based on the following `kubectl` commands: | ||
|
|
||
| * `apply` => `kubectl apply -f manifest.yaml` - deploys the resource described in the given manifest | ||
| * `delete` => `kubectl delete -f manifest.yaml` - deletes the resource from the cluster | ||
| * `patch` => `kubectl patch -f exampletask.yaml --type='json' -p='[{"op": "replace", "path": "/spec/state", "value": "running"}]'` - changes the state of the resource inside the cluster | ||
| * `get` => `kubectl get -f manifest.yaml -o jsonpath='{.spec.state}'` - queries a specific field of the resource (`state` in the example) inside the cluster. | ||
|
|
||
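The four commands listed above can be sketched as argument lists in Go. This is a minimal illustration, not the code in `kubectltask.go`: the function name `kubectlArgs` and the `patchOp` type are hypothetical, and the JSON patch body is produced with `encoding/json` rather than written by hand.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// patchOp is one JSON Patch operation (hypothetical helper type).
type patchOp struct {
	Op    string `json:"op"`
	Path  string `json:"path"`
	Value string `json:"value"`
}

// kubectlArgs returns the arguments that follow "kubectl" for one of the
// four verbs from the list above. The state argument is only used by "patch".
func kubectlArgs(verb, manifest, state string) ([]string, error) {
	switch verb {
	case "apply", "delete":
		return []string{verb, "-f", manifest}, nil
	case "patch":
		body, err := json.Marshal([]patchOp{{Op: "replace", Path: "/spec/state", Value: state}})
		if err != nil {
			return nil, err
		}
		return []string{"patch", "-f", manifest, "--type=json", "-p", string(body)}, nil
	case "get":
		return []string{"get", "-f", manifest, "-o", "jsonpath={.spec.state}"}, nil
	default:
		return nil, fmt.Errorf("unsupported kubectl verb %q", verb)
	}
}

func main() {
	// Print the command lines instead of executing them.
	for _, verb := range []string{"apply", "get", "patch", "delete"} {
		args, err := kubectlArgs(verb, "manifest.yaml", "running")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("kubectl", strings.Join(args, " "))
	}
}
```

Passing the returned slice to `exec.Command("kubectl", args...)` avoids shell quoting entirely, which is why the single quotes from the shell examples do not appear in the arguments.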
| These four commands allow us to deploy the resource and monitor its status | ||
| without the need to interact with it directly. However, `KubectlTask` | ||
| expects the resource to be the CRD [Task](/control-operator/api/v1alpha1/task_types.go). | ||
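One way to monitor a deployed resource with the `get` command is to poll its state until it matches the expected value. The sketch below assumes such a polling loop; whether `KubectlTask` works this way is not stated in this document. The name `waitForState` and the state names other than `running` are hypothetical, and the query is injected as a function so that the example runs without a cluster.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// waitForState calls get until it reports the wanted state or the number of
// attempts is exhausted. In a real setup get would run the `kubectl get`
// query on `.spec.state`; here it is injected by the caller.
func waitForState(get func() (string, error), want string, attempts int, interval time.Duration) error {
	for i := 0; i < attempts; i++ {
		state, err := get()
		if err != nil {
			return fmt.Errorf("querying state: %w", err)
		}
		if strings.TrimSpace(state) == want {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(interval)
		}
	}
	return fmt.Errorf("state %q not reached after %d attempts", want, attempts)
}

func main() {
	// Fake sequence of reported states (hypothetical names except "running").
	reported := []string{"standby", "configured", "running"}
	i := 0
	get := func() (string, error) {
		s := reported[i]
		if i < len(reported)-1 {
			i++
		}
		return s, nil
	}
	err := waitForState(get, "running", 5, 10*time.Millisecond)
	fmt.Println("error:", err)
}
```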
|
Check failure on line 58 in docs/kubernetes_ecs.md
|
||
|
|
||
| To activate `KubectlTask`, you need to change the YAML template | ||
| inside the `ControlWorkflows` directory. Namely: | ||
|
|
||
| * add the path to the kubectl manifest as the first argument in the `.command.arguments` field | ||
|
||
| * change `.control.mode` to either `kubectl_direct` or `kubectl_fairmq` | ||
| You can find a working template in `control-operator/ecs-manifests/control-workflows/*_kube.yaml`. | ||
|
|
||
| Working kubectl manifests can be found in `control-operator/ecs-manifests/kubernetes-manifests`. | ||
| See `*test.yaml` for concrete manifests that can be deployed with `kubectl apply`; the rest | ||
| are templates with variables to be filled in, written in the `${var}` format. `KubectlTask` | ||
| fills these variables from environment variables. | ||
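Filling `${var}` placeholders from environment variables can be sketched with the standard library function `os.Expand`. This is an illustration under assumptions, not the `KubectlTask` implementation: the helper `fillTemplate`, the variable names `TASK_NAME` and `TASK_STATE`, and the manifest fragment are hypothetical. Note that `os.Expand` also expands the bare `$var` form, which a real template format may not want.

```go
package main

import (
	"fmt"
	"os"
)

// fillTemplate replaces ${var} placeholders in tmpl using lookup and returns
// the rendered text together with the names of variables that were not set.
func fillTemplate(tmpl string, lookup func(string) (string, bool)) (string, []string) {
	var missing []string
	out := os.Expand(tmpl, func(name string) string {
		value, ok := lookup(name)
		if !ok {
			missing = append(missing, name)
		}
		return value
	})
	return out, missing
}

func main() {
	// Hypothetical variable and manifest fragment, for illustration only.
	os.Setenv("TASK_NAME", "example-task")
	tmpl := "metadata:\n  name: ${TASK_NAME}\nspec:\n  state: ${TASK_STATE}\n"

	out, missing := fillTemplate(tmpl, os.LookupEnv)
	fmt.Print(out)
	fmt.Println("unset variables:", missing)
}
```

Reporting the unset variables separately makes it possible to refuse to apply a manifest that still has empty fields, instead of sending it to the cluster.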
There is a change in behaviour for hooks though, no? Before, they were getting `direct` instead of `hook`, which actually smells like a bug, but perhaps something is relying on it?
This is probably my misunderstanding, as I thought that it is a bug to implicitly change hook to direct, especially when we have hooktask that is created only if `controlmode.HOOK` is present. see