SCONEAPPS: Tensorflow
Run a Tensorflow workload on a Kubernetes cluster.
Prerequisites
- A Kubernetes cluster;
- Helm 3 client. Please refer to the official setup guide.
Upload the policy for Tensorflow
If you need remote attestation, use the `upload_policy` command to upload the TensorFlow SCONE template, which runs the hello-world example from the TensorFlow overview. You can edit the template to run your own confidential workload.
The `upload_policy` command generates a chart values file (`tensorflow_chart_values.yml`) that you can use to deploy your confidential workload with Helm.
Run `upload_policy` with the default options to create the chart values file, with the default policy uploaded to the latest public CAS.
Create an alias for the `upload_policy` container:

    alias scone="docker run -it --rm \
        -v \"$HOME/.docker/config.json:/root/.docker/config.json\" \
        -v \"\$PWD:/root\" \
        -v \"\$PWD:/values\" \
        registry.scontain.com:5050/sconecuratedimages/experimental:upload-policy "
`upload_policy` comes with the sconeapps templates preloaded. Assuming you created the alias above, run the following command to use the default policy for TensorFlow:

    scone upload_policy templates/tensorflow
If everything went well, `upload_policy` created the `tensorflow_chart_values.yml` file in your current directory.
If you do not want to use the default policy maintained by Scontain, you can upload a custom policy. See `scone upload_policy --help` for more information. The command always generates chart values that contain the CAS address (`SCONE_CAS_ADDR`) and the `SCONE_CONFIG_ID` for remote attestation, both pointing to the policies that were created.
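As a quick sanity check before deploying, you can confirm that the generated file mentions both attestation variables. The sketch below is a minimal example and not part of the SCONE tooling: the helper name `has_attestation_settings` is hypothetical, and it only checks that the two variable names appear somewhere in the file, since the exact layout of the generated values is not described here.

```shell
# Hypothetical helper (not part of SCONE): check that a chart values file
# mentions both attestation variables named in this guide.
# Returns 0 if both names appear in the file, 1 otherwise.
has_attestation_settings() {
  values_file="$1"
  for name in SCONE_CAS_ADDR SCONE_CONFIG_ID; do
    if ! grep -q "$name" "$values_file"; then
      echo "missing $name in $values_file" >&2
      return 1
    fi
  done
  return 0
}

# Example: check the generated file in the current directory, if present.
if [ -f tensorflow_chart_values.yml ]; then
  has_attestation_settings tensorflow_chart_values.yml && echo "attestation settings found"
fi
```

If the check fails, re-run `scone upload_policy` before installing the chart.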
Install the chart
Add the sconeapps repo
If you have not done so yet, add the sconeapps repository to Helm.
Install the chart
Use Helm to run TensorFlow on your cluster. The following command deploys a Helm release called `my-tensorflow` with the default parameters (i.e., a simple model training):

    # Use the tensorflow_chart_values.yml file generated by the upload_policy command
    helm install my-tensorflow sconeapps/tensorflow -f tensorflow_chart_values.yml
Have a look at the Parameters section for a complete list of parameters this chart supports.
SGX device
By default, this Helm chart uses the SCONE SGX Plugin. Hence, it sets the following resource limits:

    resources:
      limits:
        sgx.intel.com/enclave: 1
Alternatively, set `useSGXDevPlugin` to `azure` (e.g., `--set useSGXDevPlugin=azure`) to support Azure's SGX Device Plugin. Since Azure requires you to specify the amount of EPC memory allocated to your application, the parameter `sgxEpcMem` (SGX EPC memory in MiB) becomes required too (e.g., `--set useSGXDevPlugin=azure --set sgxEpcMem=16`).
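The rule above (the Azure plugin needs `sgxEpcMem`, the default SCONE plugin does not) can be sketched as a small shell helper that assembles the `--set` flags. The function name `sgx_plugin_flags` is hypothetical and not part of Helm or the chart; the parameter names and the values `scone` and `azure` are taken from this page. The final line only prints the resulting command so that you can review it before running it.

```shell
# Hypothetical helper: print the --set flags for the chosen SGX device plugin.
# Usage: sgx_plugin_flags <scone|azure> [epc_mem_mib]
sgx_plugin_flags() {
  case "$1" in
    scone)
      echo "--set useSGXDevPlugin=scone"
      ;;
    azure)
      # The Azure plugin requires the EPC memory size (in MiB).
      if [ -z "${2:-}" ]; then
        echo "sgxEpcMem (MiB) is required with the Azure plugin" >&2
        return 1
      fi
      echo "--set useSGXDevPlugin=azure --set sgxEpcMem=$2"
      ;;
    *)
      echo "unknown plugin: $1" >&2
      return 1
      ;;
  esac
}

# Example: print the install command for Azure with 16 MiB of EPC memory.
echo "helm install my-tensorflow sconeapps/tensorflow -f tensorflow_chart_values.yml $(sgx_plugin_flags azure 16)"
```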
In case you do not want to use the SGX plugin, you can remove the resource limit and explicitly mount the local SGX device into your container by setting:

    extraVolumes:
      - name: dev-isgx
        hostPath:
          path: /dev/isgx
    extraVolumeMounts:
      - name: dev-isgx
        path: /dev/isgx
Please note that mounting the local SGX device into your container requires privileged mode, which grants your container access to ALL host devices. To enable privileged mode, set `securityContext`:

    securityContext:
      privileged: true
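Putting the two snippets above together, a minimal sketch of a separate values file might look as follows. The file name `sgx-device-values.yml` is an arbitrary choice, and the keys are copied verbatim from this section (including `path` under `extraVolumeMounts`); how to remove the default resource limit is not covered here. Helm accepts several `-f` files and gives priority to the right-most one, so the file can be layered on top of the generated chart values.

```shell
# Write the overrides from this section to a separate values file.
# The file name is an arbitrary choice for this sketch.
cat > sgx-device-values.yml <<'EOF'
extraVolumes:
  - name: dev-isgx
    hostPath:
      path: /dev/isgx
extraVolumeMounts:
  - name: dev-isgx
    path: /dev/isgx
securityContext:
  privileged: true
EOF

# Print (rather than run) the install command so it can be reviewed first.
echo "helm install my-tensorflow sconeapps/tensorflow -f tensorflow_chart_values.yml -f sgx-device-values.yml"
```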
Before you begin
Attestation
This chart does not submit any sessions to a CAS, so you have to do that beforehand. If you need to pass remote attestation information to your container, use the `upload_policy` container. It adds information such as `SCONE_CONFIG_ID` and `SCONE_CAS_ADDR` through the `extraEnv` parameter in `values.yaml`.
Data output and external volumes
If your workload produces output artifacts that need to be saved before the container is gone, consider adding an auxiliary task that uploads them elsewhere after the main task has finished.
For now, any output volume is considered to be either `hostPath` (meaning that its directory has to exist on the worker node) or `emptyDir`, which maps everything to a randomly generated directory under `/tmp`.
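As an illustration of the `emptyDir` case, the sketch below writes a values file that declares one output volume and mounts it into the container. The file name, the volume name `output`, and the mount path `/output` are arbitrary choices for this example. The `path` key under `extraVolumeMounts` follows the SGX device example on this page, and it is an assumption that the chart passes `emptyDir` volumes through `extraVolumes` in the same way.

```shell
# Write an example values file that mounts an emptyDir volume for outputs.
# Volume name, mount path and file name are hypothetical.
cat > output-values.yml <<'EOF'
extraVolumes:
  - name: output
    emptyDir: {}
extraVolumeMounts:
  - name: output
    path: /output
EOF

# Print (rather than run) the install command so it can be reviewed first.
echo "helm install my-tensorflow sconeapps/tensorflow -f tensorflow_chart_values.yml -f output-values.yml"
```

For a `hostPath` volume, replace `emptyDir: {}` with a `hostPath` entry whose directory already exists on the worker node.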
Parameters
Parameter | Description | Default |
---|---|---|
`image` | TensorFlow image | `registry.scontain.com/sconecuratedimages/datasystems:tensorflow-1.15` |
`imagePullPolicy` | TensorFlow pull policy | `IfNotPresent` |
`imagePullSecrets` | TensorFlow pull secrets, in case of private repositories | `[{"name": "sconeapps"}]` |
`nameOverride` | String to partially override the tensorflow.fullname template (the release name will be prepended) | `nil` |
`fullNameOverride` | String to fully override the tensorflow.fullname template | `nil` |
`podAnnotations` | Additional pod annotations | `{}` |
`securityContext` | Security context for the TensorFlow container | `{}` |
`extraVolumes` | Extra volume definitions | `[]` |
`extraVolumeMounts` | Extra volume mounts for the TensorFlow pod | `[]` |
`extraEnv` | Additional environment variables for the TensorFlow container | `[]` |
`resources` | CPU/Memory resource requests/limits for node. | `{}` |
`nodeSelector` | Node labels for pod assignment (this value is evaluated as a template) | `{}` |
`tolerations` | List of node taints to tolerate (this value is evaluated as a template) | `[]` |
`affinity` | Map of node/pod affinities (this value is evaluated as a template) | `{}` |
`useSGXDevPlugin` | Use SGX Device Plugin to access SGX resources. | `"scone"` |
`sgxEpcMem` | Required for the Azure SGX Device Plugin. Protected EPC memory in MiB | `16` |