Improve tutorials for Job

Co-authored-by: Kundan Kumar <kundan.kumar@india.nec.com>
parent 16bb7a1c8a
commit 7e75688d69
---
title: Coarse Parallel Processing Using a Work Queue
content_type: task
weight: 20
---

<!-- overview -->

In this example, you will run a Kubernetes Job with multiple parallel
worker processes.

In this example, as each pod is created, it picks up one unit of work
from a task queue, completes it, deletes it from the queue, and exits.

Here is an overview of the steps in this example:

1. **Start a message queue service.** In this example, you use RabbitMQ, but you could use another
   one. In practice you would set up a message queue service once and reuse it for many jobs.
1. **Create a queue, and fill it with messages.** Each message represents one task to be done. In
   this example, a message is an integer that you will do a lengthy computation on.
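The coarse pattern in these steps, where each pod consumes exactly one message and exits, and the Job is complete once the number of successful exits equals the number of messages, can be simulated outside Kubernetes in plain Python. This is a sketch of the pattern only, not of any Kubernetes API:

```python
import queue
import threading

def run_coarse_parallel(items, parallelism=2):
    """Simulate the coarse work-queue pattern: one work item per worker 'pod'."""
    task_queue = queue.Queue()
    for item in items:
        task_queue.put(item)

    completions = []  # each entry stands for one succeeded "pod"
    lock = threading.Lock()

    def pod():
        # Each pod consumes exactly one message, processes it, then exits.
        item = task_queue.get()
        with lock:
            completions.append(f"processed {item}")

    # The controller keeps up to `parallelism` pods running until the
    # completion count reaches the number of work items (8 in this tutorial).
    while len(completions) < len(items):
        batch = [threading.Thread(target=pod)
                 for _ in range(min(parallelism, len(items) - len(completions)))]
        for t in batch:
            t.start()
        for t in batch:
            t.join()
    return completions

results = run_coarse_parallel(["apple", "banana", "cherry", "date",
                               "fig", "grape", "lemon", "melon"])
```

Here `run_coarse_parallel` plays the role of the Job controller; the real controller tracks completions via pod status rather than a shared list.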
## {{% heading "prerequisites" %}}

You should already be familiar with the basic,
non-parallel, use of [Job](/docs/concepts/workloads/controllers/job/).

{{< include "task-tutorial-prereqs.md" >}}

You will need a container image registry where you can upload images to run in your cluster.

This task example also assumes that you have Docker installed locally.

<!-- steps -->
## Starting a message queue service

Start RabbitMQ as follows:

```shell
# make a Service for the StatefulSet to use
kubectl create -f https://kubernetes.io/examples/application/job/rabbitmq-service.yaml
```
```
service "rabbitmq-service" created
```

```shell
kubectl create -f https://kubernetes.io/examples/application/job/rabbitmq-statefulset.yaml
```
```
statefulset "rabbitmq" created
```

## Testing the message queue service

Now, you can experiment with accessing the message queue.
First create a temporary interactive Pod.

```shell
# Create a temporary interactive container
kubectl run -i --tty temp --image ubuntu:22.04
```
```
Waiting for pod default/temp-loe07 to be running, status is Pending, pod ready: false
```

Note that your pod name and command prompt will be different.

Next install the `amqp-tools` so you can work with message queues.
The next commands show what you need to run inside the interactive shell in that Pod:

```shell
# Install some tools
apt-get update && apt-get install -y curl ca-certificates amqp-tools python dnsutils
```

Later, you will make a container image that includes these packages.

Next, you will check that you can discover the Service for RabbitMQ:

```shell
# Run these commands inside the Pod
# Note the rabbitmq-service has a DNS name, provided by Kubernetes:
nslookup rabbitmq-service
```
```
Server:    10.0.0.10
Address:   10.0.0.10#53

Name:   rabbitmq-service.default.svc.cluster.local
Address: 10.0.147.152
```

(the IP addresses will vary)

If the kube-dns addon is not set up correctly, the previous step may not work for you.
You can also find the IP address for that Service in an environment variable:

```shell
# run this check inside the Pod
env | grep RABBITMQ_SERVICE | grep HOST
```
```
RABBITMQ_SERVICE_SERVICE_HOST=10.0.147.152
```

(the IP address will vary)

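As a sketch of how a program might use that fallback, here is a small Python helper that prefers the Service DNS name and falls back to the environment variable Kubernetes injects. The credentials and port match the tutorial's defaults; the helper itself is illustrative, not part of the examples:

```python
import os

def broker_url(default_host: str = "rabbitmq-service", port: int = 5672) -> str:
    """Prefer the Service DNS name; fall back to the environment variable
    that Kubernetes injects into Pods in the Service's namespace."""
    host = os.environ.get("RABBITMQ_SERVICE_SERVICE_HOST", default_host)
    return f"amqp://guest:guest@{host}:{port}"

url = broker_url()
```

When the environment variable is unset (for example, outside a cluster), the DNS name is used unchanged.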
Next you will verify that you can create a queue, and publish and consume messages.

```shell
# Run these commands inside the Pod
# In the next line, rabbitmq-service is the hostname where the rabbitmq-service
# can be reached. 5672 is the standard port for rabbitmq.
export BROKER_URL=amqp://guest:guest@rabbitmq-service:5672

# If you could not resolve "rabbitmq-service" in the previous step,
# then use this command instead:
# BROKER_URL=amqp://guest:guest@$RABBITMQ_SERVICE_SERVICE_HOST:5672

# Now create a queue:
/usr/bin/amqp-declare-queue --url=$BROKER_URL -q foo -d
```
```
foo
```
Publish one message to the queue:

```shell
# Run these commands inside the Pod
/usr/bin/amqp-publish --url=$BROKER_URL -r foo -p -b Hello

# And get it back:
/usr/bin/amqp-consume --url=$BROKER_URL -q foo -c 1 cat && echo 1>&2
```
```
Hello
```

In the last command, the `amqp-consume` tool takes one message (`-c 1`)
from the queue, and passes that message to the standard input of an arbitrary command.
In this case, the program `cat` prints out the characters read from standard input, and
the echo adds a carriage return so the example is readable.

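What `amqp-consume` does here, taking one message and handing it to a command's standard input, can be sketched in plain Python. This is a stand-in for the real tool with no queue involved; the function name is illustrative:

```python
import subprocess

def consume_one(message: str, command: list) -> str:
    """Hand one queue message to an arbitrary command on stdin,
    the way `amqp-consume ... -c 1 cat` hands "Hello" to cat."""
    result = subprocess.run(
        command, input=message, capture_output=True, text=True, check=True
    )
    return result.stdout

# `cat` copies its standard input to standard output unchanged.
output = consume_one("Hello", ["cat"])
```

Because the worker command only reads stdin, it never needs to know a message queue exists, which is the whole point of this pattern.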
## Fill the queue with tasks

Now, fill the queue with some simulated tasks. In this example, the tasks are strings to be
printed.

In practice, the content of the messages might be:

- configuration parameters to a simulation
- frame numbers of a scene to be rendered

If there is large data that is needed in a read-only mode by all pods
of the Job, you typically put that in a shared file system like NFS and mount
that readonly on all the pods, or write the program in the pod so that it can natively read
data from a cluster file system (for example: HDFS).

For this example, you will create the queue and fill it using the AMQP command line tools.
In practice, you might write a program to fill the queue using an AMQP client library.

```shell
# Run this on your computer, not in the Pod
/usr/bin/amqp-declare-queue --url=$BROKER_URL -q job1 -d
```
```
job1
```

Add items to the queue:

```shell
for f in apple banana cherry date fig grape lemon melon
do
  # publish one message per work item
  /usr/bin/amqp-publish --url=$BROKER_URL -r job1 -p -b $f
done
```

You added 8 messages to the queue.

## Create a container image

Now you are ready to create an image that you will run as a Job.

The job will use the `amqp-consume` utility to read the message
from the queue and run the actual work. Here is a very simple
example program:

{{% code_sample language="python" file="application/job/rabbitmq/worker.py" %}}
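The `worker.py` shipped with the examples is the authoritative version; as a rough stand-in, a worker with the same shape — read the single message that `amqp-consume` delivers on standard input, do some lengthy work, print a result — might look like this. The function names here are illustrative, not taken from the real file:

```python
import io
import sys
import time

def work(item: str) -> str:
    # Stand-in for the lengthy computation the tutorial mentions.
    time.sleep(0.01)
    return f"Working on {item}"

def main(stream=sys.stdin) -> str:
    # amqp-consume delivers exactly one message on stdin per pod.
    item = stream.read().strip()
    return work(item)

# Demonstrate with an in-memory stream instead of a live queue:
print(main(io.StringIO("banana")))  # Working on banana
```

Passing the stream as a parameter keeps the worker testable without a queue; in the Pod, it would read the real stdin.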
Give the script execution permission:

```shell
chmod +x worker.py
```

Now, build an image. Make a temporary directory, change to it,
download the [Dockerfile](/examples/application/job/rabbitmq/Dockerfile)
and [worker.py](/examples/application/job/rabbitmq/worker.py), then
build the image with this command:

```shell
docker tag job-wq-1 <username>/job-wq-1
docker push <username>/job-wq-1
```

If you are using an alternative container image registry, tag the
image and push it there instead.

## Defining a Job

Here is a manifest for a Job. You'll need to make a copy of the Job manifest
(call it `./job.yaml`),
and edit the name of the container image to match the name you used.

{{% code_sample file="application/job/rabbitmq/job.yaml" %}}

In this example, each pod works on one item from the queue and then exits.
So, the completion count of the Job corresponds to the number of work items
done. That is why the example manifest has `.spec.completions` set to `8`.

## Running the Job

Now, run the Job:

```shell
# this assumes you downloaded and then edited the manifest already
kubectl apply -f ./job.yaml
```

```
Labels:        controller-uid=41d75705-92df-11e7-b85e-fa163ee3c11f
Annotations:   <none>
Parallelism:   2
Completions:   8
Start Time:    Wed, 06 Sep 2022 16:42:02 +0000
Pods Statuses: 0 Running / 8 Succeeded / 0 Failed
Pod Template:
  Labels:  controller-uid=41d75705-92df-11e7-b85e-fa163ee3c11f
           job-name=job-wq-1
  Containers:
   c:
    Image:  container-registry.example/causal-jigsaw-637/job-wq-1
    Port:
    Environment:
      BROKER_URL: amqp://guest:guest@rabbitmq-service:5672
```

All the pods for that Job succeeded! You're done.

<!-- discussion -->

## Alternatives

This approach has the advantage that you do not need to modify your "worker" program to be
aware that there is a work queue. You can include the worker program unmodified in your container
image.

Using this approach does require that you run a message queue service.
If running a queue service is inconvenient, you may
want to consider one of the other [job patterns](/docs/concepts/workloads/controllers/job/#job-patterns).

This approach creates a pod for every work item. If your work items only take a few seconds,
though, creating a Pod for every work item may add a lot of overhead. Consider another
design, such as in the [fine parallel work queue example](/docs/tasks/job/fine-parallel-processing-work-queue/),
that executes multiple work items per Pod.

In this example, you used the `amqp-consume` utility to read the message
from the queue and run the actual program. This has the advantage that you
do not need to modify your program to be aware of the queue.
The [fine parallel work queue example](/docs/tasks/job/fine-parallel-processing-work-queue/)
shows how to communicate with the work queue using a client library.

## Caveats

If the number of completions is set to more than the number of items in the queue,
then the Job will not appear to be completed, even though all items in the queue
have been processed. It will start additional pods which will block waiting
for a message.
You would need to make your own mechanism to spot when there is work
to do and measure the size of the queue, setting the number of completions to match.
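A corollary of that caveat: compute `.spec.completions` from the measured queue depth before creating the Job. A trivial sketch — the queue depth itself would come from your queue service, and 8 matches this tutorial:

```python
def completions_for(queue_depth: int) -> int:
    """.spec.completions must equal the number of queued work items;
    a larger value leaves extra pods blocked waiting for messages."""
    if queue_depth < 1:
        raise ValueError("queue is empty; nothing for the Job to do")
    return queue_depth

spec_completions = completions_for(8)
print(spec_completions)  # 8
```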

There is an unlikely race with this pattern. If the container is killed in between the time
that the message is acknowledged by the `amqp-consume` command and the time that the container
exits with success, or if the node crashes before the kubelet is able to post the success of the pod
back to the API server, then the Job will not appear to be complete, even though all items
in the queue have been processed.

---
title: Fine Parallel Processing Using a Work Queue
content_type: task
weight: 30
---

<!-- overview -->

In this example, you will run a Kubernetes Job that runs multiple parallel
tasks as worker processes, each running as a separate Pod.

In this example, as each pod is created, it picks up one unit of work
from a task queue, processes it, and repeats until the end of the queue is reached.

Here is an overview of the steps in this example:

1. **Start a storage service to hold the work queue.** In this example, you will use Redis to store
   work items. In the [previous example](/docs/tasks/job/coarse-parallel-processing-work-queue),
   you used RabbitMQ. In this example, you will use Redis and a custom work-queue client library;
   this is because AMQP does not provide a good way for clients to
   detect when a finite-length work queue is empty. In practice you would set up a store such
   as Redis once and reuse it for the work queues of many jobs, and other things.
1. **Create a queue, and fill it with messages.** Each message represents one task to be done.
{{< include "task-tutorial-prereqs.md" >}}

You will need a container image registry where you can upload images to run in your cluster.
The example uses [Docker Hub](https://hub.docker.com/), but you could adapt it to a different
container image registry.

This task example also assumes that you have Docker installed locally. You use Docker to
build container images.

<!-- steps -->

Be familiar with the basic,
non-parallel, use of [Job](/docs/concepts/workloads/controllers/job/).

## Starting Redis

For this example, for simplicity, you will start a single instance of Redis.
See the [Redis Example](https://github.com/kubernetes/examples/tree/master/guestbook) for an example
of deploying Redis scalably and redundantly.

You could also download the following files directly:

- [`worker.py`](/examples/application/job/redis/worker.py)

## Filling the queue with tasks

Now let's fill the queue with some "tasks". In this example, the tasks are strings to be
printed.

Start a temporary interactive pod for running the Redis CLI.

```shell
kubectl run -i --tty temp --image redis --command "/bin/sh"
```
```
Waiting for pod default/redis2-c7h78 to be running, status is Pending, pod ready: false
Hit enter for command prompt
```

Now hit enter, start the Redis CLI, and create a list with some work items in it.

```shell
redis-cli -h redis
```
```console
redis:6379> rpush job2 "apple"
(integer) 1
redis:6379> rpush job2 "banana"
```
```console
redis:6379> lrange job2 0 -1
...
9) "orange"
```

So, the list with key `job2` will be the work queue.

Note: if you do not have Kube DNS set up correctly, you may need to change
the first step of the above block to `redis-cli -h $REDIS_SERVICE_HOST`.

## Create a container image {#create-an-image}

Now you are ready to create an image that will process the work in that queue.

You're going to use a Python worker program with a Redis client to read
the messages from the message queue.

A simple Redis work queue client library is provided,
called `rediswq.py` ([Download](/examples/application/job/redis/rediswq.py)).

The "worker" program in each Pod of the Job uses the work queue
client library to get work. Here it is:

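The worker source itself is in the linked files. The essential idea behind a reliable work queue like `rediswq.py` — lease an item by moving it to an in-progress list, and delete it only once done, so a crashed worker's item is never silently lost — can be sketched with plain Python lists. The class and method names here are hypothetical, and no Redis is involved:

```python
class ToyWorkQueue:
    """In-memory sketch of a lease-based work queue (the pattern a Redis
    client library can build on list operations like RPOPLPUSH/LMOVE)."""

    def __init__(self, items):
        self.main = list(items)   # items waiting to be processed
        self.processing = []      # items currently leased to a worker

    def lease(self):
        # Move one item from the main queue to "processing".
        if not self.main:
            return None
        item = self.main.pop(0)
        self.processing.append(item)
        return item

    def complete(self, item):
        # Remove the item once the worker finishes; until then a crashed
        # worker's item stays visible in `processing` and can be re-queued.
        self.processing.remove(item)

    def empty(self):
        return not self.main and not self.processing

q = ToyWorkQueue(["apple", "banana"])
item = q.lease()
q.complete(item)
```

With Redis, the lease move is a single atomic command, which is what makes the pattern safe across many concurrent workers.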
You could also download [`worker.py`](/examples/application/job/redis/worker.py),
[`rediswq.py`](/examples/application/job/redis/rediswq.py), and
[`Dockerfile`](/examples/application/job/redis/Dockerfile) files, then build
the container image. Here's an example using Docker to do the image build:

```shell
docker build -t job-wq-2 .
```

```shell
docker push <username>/job-wq-2
```

You need to push to a public repository or [configure your cluster to be able to access
your private repository](/docs/concepts/containers/images/).

## Defining a Job

Here is a manifest for the Job you will create:

{{% code_sample file="application/job/redis/job.yaml" %}}

{{< note >}}
Be sure to edit the manifest to
change `gcr.io/myproject` to your own path.
{{< /note >}}

In this example, each pod works on several items from the queue and then exits when there are no more items.
Since the workers themselves detect when the workqueue is empty, and the Job controller does not
know about the workqueue, it relies on the workers to signal when they are done working.
The workers signal that the queue is empty by exiting with success. So, as soon as **any** worker
exits with success, the controller knows the work is done, and that the Pods will exit soon.
So, you need to set the completion count of the Job to 1. The job controller will wait for
the other pods to complete too.

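The completion semantics just described can be simulated in plain Python: every worker drains the queue and exits with success when it finds the queue empty, mirroring the completion count of 1 with a parallelism of 2. This is a sketch of the pattern, not of the Job controller:

```python
import queue
import threading

def run_fine_parallel(items, parallelism=2):
    """Each worker keeps taking items until the queue is empty, then exits.
    One successful exit is enough to mark the whole Job complete."""
    work = queue.Queue()
    for item in items:
        work.put(item)

    processed = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                item = work.get_nowait()
            except queue.Empty:
                return  # queue drained: exit with success
            with lock:
                processed.append(f"Working on {item}")

    threads = [threading.Thread(target=worker) for _ in range(parallelism)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return processed

results = run_fine_parallel(["apple", "banana", "cherry", "date"])
```

Unlike the coarse pattern, the number of work items never appears in the Job spec; only the workers know when the queue is empty.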
## Running the Job
|
## Running the Job
|
||||||
|
|
||||||
So, now run the Job:
|
So, now run the Job:
|
||||||
|
|
||||||

```shell
# this assumes you downloaded and then edited the manifest already
kubectl apply -f ./job.yaml
```

Now wait a bit, then check on the Job:

```shell
kubectl describe jobs/job-wq-2
```

```
Name:             job-wq-2
Namespace:        default
Selector:         controller-uid=b1c7e4e3-92e1-11e7-b85e-fa163ee3c11f
Labels:           controller-uid=b1c7e4e3-92e1-11e7-b85e-fa163ee3c11f
Annotations:      <none>
Parallelism:      2
Completions:      <unset>
Start Time:       Mon, 11 Jan 2022 17:07:59 +0000
Pods Statuses:    1 Running / 0 Succeeded / 0 Failed
Pod Template:
  Labels:       controller-uid=b1c7e4e3-92e1-11e7-b85e-fa163ee3c11f
                job-name=job-wq-2
  Containers:
   c:
    Image:      container-registry.example/exampleproject/job-wq-2
    Port:
    Environment:        <none>
    Mounts:             <none>
...
Working on date
Working on lemon
```

As you can see, one of the pods for this Job worked on several work units.
<!-- discussion -->
want to consider one of the other
[job patterns](/docs/concepts/workloads/controllers/job/#job-patterns).

If you have a continuous stream of background processing work to run, then
consider running your background workers with a ReplicaSet instead,
and consider running a background processing library such as
[https://github.com/resque/resque](https://github.com/resque/resque).
```yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    component: rabbitmq
  name: rabbitmq-service
spec:
  ports:
  - port: 5672
  selector:
    app.kubernetes.io/name: task-queue
    app.kubernetes.io/component: rabbitmq
```
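
Worker pods in the same namespace reach the broker through this Service's stable DNS name rather than any pod IP. A worker might assemble its connection URL like this (a sketch; `guest`/`guest` are RabbitMQ's out-of-the-box default credentials, which this manifest does not change or configure):

```python
# Build the AMQP URL a worker would use to reach the Service above.
# "rabbitmq-service" resolves via cluster DNS; port 5672 is the port
# the Service exposes. guest/guest is RabbitMQ's default login (an
# assumption about the image defaults, not something set in the manifest).
host = "rabbitmq-service"
port = 5672
amqp_url = f"amqp://guest:guest@{host}:{port}"
print(amqp_url)
```
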
@ -0,0 +1,36 @@
|
||||||
|
apiVersion: apps/v1
|
||||||
|
kind: StatefulSet
|
||||||
|
metadata:
|
||||||
|
labels:
|
||||||
|
component: rabbitmq
|
||||||
|
name: rabbitmq
|
||||||
|
spec:
|
||||||
|
replicas: 1
|
||||||
|
serviceName: rabbitmq-service
|
||||||
|
selector:
|
||||||
|
matchLabels:
|
||||||
|
app.kubernetes.io/name: task-queue
|
||||||
|
app.kubernetes.io/component: rabbitmq
|
||||||
|
template:
|
||||||
|
metadata:
|
||||||
|
labels:
|
||||||
|
app.kubernetes.io/name: task-queue
|
||||||
|
app.kubernetes.io/component: rabbitmq
|
||||||
|
spec:
|
||||||
|
containers:
|
||||||
|
- image: rabbitmq
|
||||||
|
name: rabbitmq
|
||||||
|
ports:
|
||||||
|
- containerPort: 5672
|
||||||
|
resources:
|
||||||
|
requests:
|
||||||
|
memory: 16M
|
||||||
|
limits:
|
||||||
|
cpu: 250m
|
||||||
|
memory: 512M
|
||||||
|
volumeMounts:
|
||||||
|
- mountPath: /var/lib/rabbitmq
|
||||||
|
name: rabbitmq-data
|
||||||
|
volumes:
|
||||||
|
- name: rabbitmq-data
|
||||||
|
emptyDir: {}
|