# TensorFlow

Run a TensorFlow workload on a Kubernetes cluster.
## Prerequisites

- A Kubernetes cluster;
- A Helm 3 client. Please refer to the official setup guide.
## Install this chart

### Add the repo

If you haven't yet, please add this repo to Helm.
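As a sketch of what that looks like (the repo URL is a placeholder for the one provided to you; `sconeapps` is the repo name used by the install command below):

```sh
# Add the sconeapps repo under the name used in this guide,
# then refresh the local chart index
helm repo add sconeapps <repo-url>
helm repo update
```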
### Install the chart

Use Helm to run TensorFlow on your cluster. The following deploys a Helm release called `my-tensorflow` with default parameters (i.e., a simple model training):

```sh
helm install my-tensorflow sconeapps/tensorflow
```

Have a look at the Parameters section for a complete list of parameters this chart supports.
## SGX device

By default, this Helm chart uses the SCONE SGX Plugin. Hence, it sets the resource limits of the TensorFlow container as follows:

```yaml
resources:
  limits:
    sgx.intel.com/enclave: 1
```
Alternatively, set `useSGXDevPlugin` to `azure` (e.g., `--set useSGXDevPlugin=azure`) to support Azure's SGX Device Plugin. Since Azure requires the amount of EPC memory allocated to your application to be specified, the parameter `sgxEpcMem` (SGX EPC memory in MiB) becomes required too (e.g., `--set useSGXDevPlugin=azure --set sgxEpcMem=16`).
In case you do not want to use the SGX plugin, you can remove the resource limit and explicitly mount the local SGX device into your container by setting:

```yaml
extraVolumes:
  - name: dev-isgx
    hostPath:
      path: /dev/isgx

extraVolumeMounts:
  - name: dev-isgx
    mountPath: /dev/isgx
```
Please note that mounting the local SGX device into your container requires privileged mode, which will grant your container access to ALL host devices. To enable privileged mode, set `securityContext`:

```yaml
securityContext:
  privileged: true
```
## Before you begin

### Attestation

This chart does not submit any sessions to a CAS, so you have to do that beforehand, from a trusted computer. If you need to pass remote attestation information to your container, such as `SCONE_CONFIG_ID` and `SCONE_CAS_ADDR`, use the `extraEnv` parameter in `values.yaml`.
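For instance, a minimal sketch of such an entry in `values.yaml` (the CAS address and session name below are placeholders, not actual values):

```yaml
extraEnv:
  # Placeholder: address of the CAS that holds your session
  - name: SCONE_CAS_ADDR
    value: "cas.example.com"
  # Placeholder: the session/service name submitted beforehand
  - name: SCONE_CONFIG_ID
    value: "my-session/tensorflow"
```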
### Data output and external volumes

If your workload produces output artifacts that must be saved before the container is gone, consider adding an auxiliary task that uploads them to external storage after the main task finishes.

For now, any output volumes are expected to be either `hostPath` (meaning that their respective directories must already exist on the worker node) or `emptyDir`, which maps everything to a randomly generated directory under `/tmp`.
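As an illustration, a `hostPath` output volume could be declared through `extraVolumes` and `extraVolumeMounts` (the volume name and paths below are assumptions for the example):

```yaml
extraVolumes:
  - name: output
    hostPath:
      # Assumed path; must already exist on the worker node
      path: /data/tensorflow-output

extraVolumeMounts:
  - name: output
    # Assumed mount point inside the container
    mountPath: /output
```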
## Parameters

| Parameter | Description | Default |
|---|---|---|
| `image` | TensorFlow image | `registry.scontain.com/sconecuratedimages/datasystems:tensorflow-1.15` |
| `imagePullPolicy` | TensorFlow image pull policy | `IfNotPresent` |
| `imagePullSecrets` | TensorFlow image pull secrets, in case of private repositories | `[{"name": "sconeapps"}]` |
| `nameOverride` | String to partially override the `tensorflow.fullname` template (will prepend the release name) | `nil` |
| `fullNameOverride` | String to fully override the `tensorflow.fullname` template | `nil` |
| `podAnnotations` | Additional pod annotations | `{}` |
| `securityContext` | Security context for the TensorFlow container | `{}` |
| `extraVolumes` | Extra volume definitions | `[]` |
| `extraVolumeMounts` | Extra volume mounts for the TensorFlow pod | `[]` |
| `extraEnv` | Additional environment variables for the TensorFlow container | `[{"name": "SCONE_LAS_ADDR", "valueFrom": {"fieldRef": {"fieldPath": "status.hostIP"}}}]` |
| `resources` | CPU/memory resource requests/limits for the pod | `{}` |
| `nodeSelector` | Node labels for pod assignment (this value is evaluated as a template) | `{}` |
| `tolerations` | List of node taints to tolerate (this value is evaluated as a template) | `[]` |
| `affinity` | Map of node/pod affinities (this value is evaluated as a template) | `{}` |
| `useSGXDevPlugin` | SGX Device Plugin used to access SGX resources (`scone` or `azure`) | `"scone"` |
| `sgxEpcMem` | Required by Azure's SGX Device Plugin. Protected EPC memory in MiB | `nil` |
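Putting several of these parameters together, an install targeting Azure's SGX Device Plugin might look like the following sketch (the EPC size and pull policy are illustrative values, not recommendations):

```sh
helm install my-tensorflow sconeapps/tensorflow \
  --set useSGXDevPlugin=azure \
  --set sgxEpcMem=16 \
  --set imagePullPolicy=Always
```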