The JupyterHub Helm chart is configurable by values in your config.yaml. In this way, you can extend user resources, build off of different Docker images, manage security and authentication, and more.
Below is a description of many but not all of the configurable values for the Helm chart. To see all configurable options, inspect their default values defined here.
For more guided information about some specific things you can do with modifications to the helm chart, see the Advanced Installation.
A 32-byte cryptographically secure randomly generated string used to sign values of secure cookies set by the hub. If unset, jupyterhub will generate one on startup and save it in the file jupyterhub_cookie_secret in the /srv/jupyterhub directory of the hub container. A value set here will make JupyterHub overwrite any previous file.
You do not need to set this at all if you are using the default configuration for storing databases - sqlite on a persistent volume (with hub.db.type set to the default sqlite-pvc). If you are using an external database, then you must set this value explicitly - or your users will keep getting logged out each time the hub pod restarts.
Changing this value will cause all user logins to be invalidated. If this secret leaks, immediately change it to something else, or user data can be compromised.
# to generate a value, run
openssl rand -hex 32
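Where openssl is not available, a value of the same shape can be produced with Python's standard library. This is a minimal sketch rather than an official procedure, and the variable name is only illustrative.

```python
import secrets

# 32 random bytes from the operating system's secure random source,
# rendered as 64 hexadecimal characters (the same shape as the
# output of `openssl rand -hex 32`).
cookie_secret = secrets.token_hex(32)
print(cookie_secret)
```

The printed string can then be pasted into config.yaml as the cookie secret value.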
Set the imagePullPolicy on the hub pod.
See the Kubernetes docs for more info on what the values mean.
Creates an image pull secret for you and makes the hub pod utilize it, allowing it to pull images from private image registries.
Using this configuration option automates the following steps that are normally required to pull from private image registries.
# you won't need to run this manually...
kubectl create secret docker-registry hub-image-credentials \
  --docker-server=<REGISTRY> \
  --docker-username=<USERNAME> \
  --docker-email=<EMAIL> \
  --docker-password=<PASSWORD>

# you won't need to specify this manually...
spec:
  imagePullSecrets:
    - name: hub-image-credentials
To learn the username and password fields to access a gcr.io registry from a Kubernetes cluster not associated with the same google cloud credentials, look into this guide and read the notes about the password.
Enable the creation of a Kubernetes Secret containing credentials to access an image registry. By enabling this, the hub pod will also be configured to use these credentials when it pulls its container image.
Name of the private registry you want to create a credential set for. It will default to Docker Hub’s image registry.
Examples:
https://index.docker.io/v1/
quay.io
eu.gcr.io
alexmorreale.privatereg.net
Name of the user you want to use to connect to your private registry. For external gcr.io, use _json_key.
alexmorreale
alex@pfc.com
Password of the user you want to use to connect to your private registry.
plaintextpassword
abc123SECRETzyx098
For gcr.io registries, the password will be a big JSON blob for a Google Cloud service account. It should look something like the example below.
password: |-
  {
    "type": "service_account",
    "project_id": "jupyter-se",
    "private_key_id": "f2ba09118a8d3123b3321bd9a7d6d0d9dc6fdb85",
    ...
  }
Learn more in this guide.
Set custom image name / tag for the hub pod.
Use this to customize which hub image is used. Note that you must use a version of the hub image that was bundled with this particular version of the helm-chart - using other images might not work.
Name of the image, without the tag.
# example names
yuvipanda/wikimedia-hub
gcr.io/my-project/my-hub
The tag of the image to pull.
This is the value after the : in your full image name.
# example tags
v1.11.1
zhy270a
Use an existing kubernetes secret to pull the custom image.
# example existing pull secret
hub:
  image:
    pullSecrets:
      - gcr-pull
Type of database backend to use for the hub database.
The Hub requires a persistent database to function, and this lets you specify where it should be stored.
The various options are:
sqlite-pvc
Use a sqlite database kept on a persistent volume attached to the hub.
By default, this disk is created by the cloud provider using dynamic provisioning configured by a storage class. You can customize how this disk is created / attached by setting various properties under hub.db.pvc.
This is the default setting, and should work well for most cloud provider deployments.
sqlite-memory
Use an in-memory sqlite database. This should only be used for testing, since the database is erased whenever the hub pod restarts - causing the hub to lose all memory of users who had logged in before.
When using this for testing, make sure you delete all other objects that the hub has created (such as user pods, user PVCs, etc) every time the hub restarts. Otherwise you might run into errors about duplicate resources.
mysql
Use an externally hosted mysql database.
If you use this option, you have to specify a SQLAlchemy connection string for the mysql database you want to connect to in hub.db.url.
The general format of the connection string is:
mysql+pymysql://<db-username>:<db-password>@<db-hostname>:<db-port>/<db-name>
The user specified in the connection string must have the rights to create tables in the database specified.
Note that if you use this, you must also set hub.cookieSecret.
postgres
Use an externally hosted postgres database.
If you use this option, you have to specify a SQLAlchemy connection string for the postgres database you want to connect to in hub.db.url.
The general format of the connection string is:
postgresql+psycopg2://<db-username>:<db-password>@<db-hostname>:<db-port>/<db-name>
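Passwords that contain characters such as @ or / would break the connection-string format shown above unless they are percent-encoded. The sketch below assembles a URL in that format. The helper name, host, and credentials are hypothetical and not part of the chart.

```python
from urllib.parse import quote_plus


def build_db_url(dialect, username, password, hostname, port, db_name):
    """Assemble <dialect>://<user>:<password>@<host>:<port>/<db>.

    Hypothetical helper: username and password are percent-encoded so
    that reserved characters cannot be mistaken for URL delimiters.
    """
    return (
        f"{dialect}://{quote_plus(username)}:{quote_plus(password)}"
        f"@{hostname}:{port}/{db_name}"
    )


# Illustrative values only.
url = build_db_url(
    "mysql+pymysql", "hub", "p@ss/word", "db.example.org", 3306, "jupyterhub"
)
print(url)  # mysql+pymysql://hub:p%40ss%2Fword@db.example.org:3306/jupyterhub
```

The same helper works for the postgres form by passing a different dialect string.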
Customize the Persistent Volume Claim used when hub.db.type is sqlite-pvc.
Annotations to apply to the PVC containing the sqlite database.
See the Kubernetes documentation for more details about annotations.
Label selectors to set for the PVC containing the sqlite database.
Useful when you are using a specific PV, and want to bind to that and only that.
See the Kubernetes documentation for more details about using a label selector for what PV to bind to.
Size of disk to request for the database disk.
Connection string when hub.db.type is mysql or postgres.
See documentation for hub.db.type for more details on the format of this property.
Password for the database when hub.db.type is mysql or postgres.
Extra labels to add to the hub pod.
See the Kubernetes docs to learn more about labels.
List of initContainers to be run with the hub pod. See the Kubernetes docs.
hub:
  initContainers:
    - name: init-myservice
      image: busybox:1.28
      command: ['sh', '-c', 'command1']
    - name: init-mydb
      image: busybox:1.28
      command: ['sh', '-c', 'command2']
Extra environment variables that should be set for the hub pod.
Environment variables are usually used to:
Pass parameters to some custom code in hub.extraConfig.
Configure code running in the hub pod, such as an authenticator or spawner.
String literals with $(ENV_VAR_NAME) will be expanded by Kubelet which is a part of Kubernetes.
hub:
  extraEnv:
    # basic notation (for literal values only)
    MY_ENV_VARS_NAME1: "my env var value 1"

    # explicit notation (the "name" field takes precedence)
    HUB_NAMESPACE:
      name: HUB_NAMESPACE
      valueFrom:
        fieldRef:
          fieldPath: metadata.namespace

    # implicit notation (the "name" field is implied)
    PREFIXED_HUB_NAMESPACE:
      value: "my-prefix-$(HUB_NAMESPACE)"
    SECRET_VALUE:
      valueFrom:
        secretKeyRef:
          name: my-k8s-secret
          key: password
For more information, see the Kubernetes EnvVar specification.
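To make the $(ENV_VAR_NAME) expansion more concrete, the following sketch imitates its effect on a string. It is a simplified stand-in, not Kubelet's implementation: it ignores the $$ escape sequence and only accepts simple variable names. The namespace value is hypothetical.

```python
import re

# Matches $(NAME) where NAME is a simple identifier (a simplification).
_REFERENCE = re.compile(r"\$\(([A-Za-z_][A-Za-z0-9_]*)\)")


def expand_refs(value, env):
    """Replace each $(NAME) with env[NAME]; leave unknown references as-is."""
    return _REFERENCE.sub(lambda m: env.get(m.group(1), m.group(0)), value)


env = {"HUB_NAMESPACE": "jhub"}  # hypothetical value of metadata.namespace
print(expand_refs("my-prefix-$(HUB_NAMESPACE)", env))  # my-prefix-jhub
print(expand_refs("$(UNDEFINED)-suffix", env))         # $(UNDEFINED)-suffix
```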
Arbitrary extra python based configuration that should be in jupyterhub_config.py.
This is the escape hatch - if you want to configure JupyterHub to do something specific that is not present here as an option, you can write the raw Python to do it here.
extraConfig is a dict, so there can be multiple configuration snippets under different names. The configuration sections are run in alphabetical order.
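Because the snippets run in alphabetical order of their names, a numeric prefix is one way to control the sequence. The sketch below assumes plain string sorting and uses hypothetical snippet names; only the resulting order matters.

```python
# Hypothetical extraConfig entries: snippet name -> Python source text.
extra_config = {
    "20-spawner": "c.Spawner.default_url = '/lab'",
    "10-auth": "c.Authenticator.admin_users = {'alice'}",
    "myConfig.py": "c.JupyterHub.something = 'something'",
}

# Plain string sorting: digits sort before lowercase letters.
run_order = sorted(extra_config)
print(run_order)  # ['10-auth', '20-spawner', 'myConfig.py']
```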
Non-exhaustive examples of things you can do here:
Subclass authenticator / spawner to do a custom thing
Dynamically launch different images for different sets of users
Inject an auth token from GitHub authenticator into user pod
Anything else you can think of!
Since this is usually a multi-line string, you want to format it using YAML’s | operator.
For example:
hub:
  extraConfig:
    myConfig.py: |
      c.JupyterHub.something = 'something'
      c.Spawner.somethingelse = 'something else'
No validation of this Python is performed! If you make a mistake here, it will probably manifest as either the hub pod going into Error or CrashLoopBackOff states, or in some special cases, the hub running but… just doing very random things. Be careful!
The UID the hub process should be running as. Use this only if you are building your own image & know that a user with this uid exists inside the hub container! Advanced feature, handle with care! Defaults to 1000, which is the uid of the jovyan user that is present in the default hub image.
The gid the hub process should be using when touching any volumes mounted. Use this only if you are building your own image & know that a group with this gid exists inside the hub container! Advanced feature, handle with care! Defaults to 1000, which is the gid of the jovyan user that is present in the default hub image.
Object to configure the service the JupyterHub will be exposed on by the Kubernetes server.
The Kubernetes ServiceType to be used.
The default type is ClusterIP. See the Kubernetes docs to learn more about service types.
Object to configure the ports the hub service will be deployed on.
The nodePort to deploy the hub service on.
Kubernetes annotations to apply to the hub service.
Set the Pod Disruption Budget for the hub pod.
See the Kubernetes documentation for more details about disruptions.
Whether PodDisruptionBudget is enabled for the hub pod.
Minimum number of pods to be available during the voluntary disruptions.
Name of the existing secret in the kubernetes cluster, typically the hub-secret.
This secret should represent the structure as otherwise generated by this chart:
apiVersion: v1
data:
  proxy.token: < FILL IN >
  values.yaml: < FILL IN >
kind: Secret
metadata:
  name: hub-secret
NOTE: if you choose to manage the secret yourself, you are in charge of ensuring the secret has the proper contents.
Configure the configurable-http-proxy (chp) pod managed by jupyterhub to route traffic both to itself and to user pods.
Extra environment variables that should be set for the chp pod.
Environment variables are usually used here to:
override HUB_SERVICE_PORT or HUB_SERVICE_HOST default values
set CONFIGPROXY_SSL_KEY_PASSPHRASE for setting passphrase of SSL keys
proxy:
  chp:
    extraEnv:
      # basic notation (for literal values only)
      MY_ENV_VARS_NAME1: "my env var value 1"

      # explicit notation (the "name" field takes precedence)
      CHP_NAMESPACE:
        name: CHP_NAMESPACE
        valueFrom:
          fieldRef:
            fieldPath: metadata.namespace

      # implicit notation (the "name" field is implied)
      PREFIXED_CHP_NAMESPACE:
        value: "my-prefix-$(CHP_NAMESPACE)"
      SECRET_VALUE:
        valueFrom:
          secretKeyRef:
            name: my-k8s-secret
            key: password
A 32-byte cryptographically secure randomly generated string used to secure communications between the hub and the configurable-http-proxy.
Changing this value will cause the proxy and hub pods to restart. It is good security practice to rotate these values over time. If this secret leaks, immediately change it to something else, or user data can be compromised.
Object to configure the service the JupyterHub’s proxy will be exposed on by the Kubernetes server.
Defaults to LoadBalancer. See hub.service.type for supported values.
Extra labels to add to the proxy service.
Annotations to apply to the service that is exposing the proxy.
Object to set NodePorts to expose the service on for http and https.
See the Kubernetes documentation for more details about NodePorts.
The HTTP port the proxy-public service should be exposed on.
The HTTPS port the proxy-public service should be exposed on.
The public IP address the proxy-public Kubernetes service should be exposed on. This entry will end up at the configurable proxy server that JupyterHub manages, which will direct traffic to user pods at the /user path and the hub pod at the /hub path.
Set this if you want to use a fixed external IP address instead of a dynamically acquired one. This is relevant if you have a domain name that you want to point to a specific IP and want to ensure it doesn’t change.
A list of IP CIDR ranges that are allowed to access the load balancer service. Defaults to allowing everyone to access it.
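The filtering itself is done by the load balancer, not by Python, but a short sketch may help when checking that a list of CIDR ranges admits the clients you expect. The ranges and addresses below are hypothetical.

```python
import ipaddress

# Hypothetical ranges, as they might be listed for the load balancer.
allowed_ranges = ["10.0.0.0/8", "192.168.1.0/24"]


def is_allowed(client_ip, cidr_ranges):
    """Return True if client_ip falls inside at least one CIDR range."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in cidr_ranges)


print(is_allowed("10.4.2.1", allowed_ranges))     # True
print(is_allowed("192.168.2.7", allowed_ranges))  # False
```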
Object for customizing the settings for HTTPS used by the JupyterHub’s proxy. For more information on configuring HTTPS for your JupyterHub, see the HTTPS section in our security guide
Indicator to set whether HTTPS should be enabled or not on the proxy. Defaults to true if the https object is provided.
The type of HTTPS encryption that is used. Decides on which ports and network policies are used for communication via HTTPS. Setting this to secret sets the type to manual HTTPS with a secret that has to be provided in the https.secret object. Defaults to letsencrypt.
The contact email to be used for automatically provisioned HTTPS certificates by Let’s Encrypt. For more information see Set up automatic HTTPS. Required for automatic HTTPS.
Object for providing your own certificates for manual HTTPS configuration. To be provided when setting https.type to manual. See Set up manual HTTPS.
The RSA private key to be used for HTTPS. To be provided in the form of
key: |
  -----BEGIN RSA PRIVATE KEY-----
  ...
  -----END RSA PRIVATE KEY-----
The certificate to be used for HTTPS. To be provided in the form of
cert: |
  -----BEGIN CERTIFICATE-----
  ...
  -----END CERTIFICATE-----
Secret to be provided when setting https.type to secret.
Name of the secret
Path to the private key to be used for HTTPS. Example: 'tls.key'
Path to the certificate to be used for HTTPS. Example: 'tls.crt'
Your domain in list form. Required for automatic HTTPS. See Set up automatic HTTPS. To be provided like:
hosts:
  - <your-domain-name>
Set the Pod Disruption Budget for the proxy pod.
Whether PodDisruptionBudget is enabled for the proxy pod.
Configure the traefik proxy used to terminate TLS when ‘autohttps’ is enabled
Extra environment variables that should be set for the traefik pod.
Environment Variables here may be used to configure traefik.
proxy:
  traefik:
    extraEnv:
      # basic notation (for literal values only)
      MY_ENV_VARS_NAME1: "my env var value 1"

      # explicit notation (the "name" field takes precedence)
      TRAEFIK_NAMESPACE:
        name: TRAEFIK_NAMESPACE
        valueFrom:
          fieldRef:
            fieldPath: metadata.namespace

      # implicit notation (the "name" field is implied)
      PREFIXED_TRAEFIK_NAMESPACE:
        value: "my-prefix-$(TRAEFIK_NAMESPACE)"
      SECRET_VALUE:
        valueFrom:
          secretKeyRef:
            name: my-k8s-secret
            key: password
Enable persisting auth_state (if available). See: the documentation on authenticators
auth_state will be encrypted and stored in the Hub’s database. This can include things like authentication tokens, etc. to be passed to Spawners as environment variables. Encrypting auth_state requires the cryptography package. The key setting must contain one (or more, separated by ;) 32-byte encryption keys. These can be either base64 or hex-encoded. The JUPYTERHUB_CRYPT_KEY environment variable for the hub pod is set using this entry.
If encryption is unavailable, auth_state cannot be persisted.
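A value in the described shape (one or more hex-encoded 32-byte keys joined by ;) could be generated as sketched below. The choice of two keys is only an illustration, for example to keep an older key available while introducing a new one.

```python
import secrets

# Two independent 32-byte keys, each hex-encoded to 64 characters.
keys = [secrets.token_hex(32) for _ in range(2)]

# Multiple keys are joined with ';' as described above.
crypt_key_value = ";".join(keys)
print(crypt_key_value)
```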
Options for customizing the environment that is provided to the users after they log in.
Template for the pod name of each user, such as jupyter-{username}{servername}.
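As a rough illustration of how such a template is filled in, the sketch below uses plain string formatting. It assumes that the server name is empty for a user's default server, and it ignores the escaping a real spawner applies to characters that are not valid in pod names.

```python
template = "jupyter-{username}{servername}"

# Assumption: servername is empty for the user's default server.
pod_name = template.format(username="alice", servername="")
print(pod_name)  # jupyter-alice
```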
Set CPU limits & guarantees that are enforced for each user. See: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
Set Memory limits & guarantees that are enforced for each user.
See the Kubernetes docs for more info.
Note that this field is referred to as requests by the Kubernetes API.
Creates an image pull secret for you and makes the user pods utilize it, allowing them to pull images from private image registries.
# you won't need to run this manually...
kubectl create secret docker-registry singleuser-image-credentials \
  --docker-server=<REGISTRY> \
  --docker-username=<USERNAME> \
  --docker-email=<EMAIL> \
  --docker-password=<PASSWORD>

# you won't need to specify this manually...
spec:
  imagePullSecrets:
    - name: singleuser-image-credentials
Enable the creation of a Kubernetes Secret containing credentials to access an image registry. By enabling this, user pods and image puller pods will also be configured to use these credentials when they pull their container images.
Set custom image name / tag used for spawned users.
This image is used to launch the pod for each user.
yuvipanda/wikimedia-hub-user
gcr.io/my-project/my-user-image
The tag of the image to use.
Set the imagePullPolicy on the singleuser pods that are spun up by the hub.
# example existing pull secret
singleuser:
  image:
    pullSecrets:
      - gcr-pull
List of initContainers to be run with every singleuser pod. See the Kubernetes docs.
singleuser:
  initContainers:
    - name: init-myservice
      image: busybox:1.28
      command: ['sh', '-c', 'command1']
    - name: init-mydb
      image: busybox:1.28
      command: ['sh', '-c', 'command2']
For more information about the profile list, see KubeSpawner’s documentation as this is simply a passthrough to that configuration.
NOTE: The image-pullers are aware of the overrides of images in singleuser.profileList, but they won’t be if you configure the profile list in JupyterHub’s configuration via c.KubeSpawner.profile_list.
singleuser:
  profileList:
    - display_name: "Default: Shared, 8 CPU cores"
      description: "Your code will run on a shared machine with CPU only."
      default: True
    - display_name: "Personal, 4 CPU cores & 26GB RAM, 1 NVIDIA Tesla K80 GPU"
      description: "Your code will run on a personal machine with a GPU."
      kubespawner_override:
        extra_resource_limits:
          nvidia.com/gpu: "1"
Deprecated and no longer does anything. Use the user-scheduler instead in order to accomplish a good packing of the user pods.
Extra environment variables that should be set for the user pods.
String literals with $(ENV_VAR_NAME) will be expanded by Kubelet which is a part of Kubernetes. Note that the user pods will already have access to a set of environment variables that you can use, like JUPYTERHUB_USER and JUPYTERHUB_HOST. For more information about these inspect this source code.
singleuser:
  extraEnv:
    # basic notation (for literal values only)
    MY_ENV_VARS_NAME1: "my env var value 1"

    # explicit notation (the "name" field takes precedence)
    USER_NAMESPACE:
      name: USER_NAMESPACE
      valueFrom:
        fieldRef:
          fieldPath: metadata.namespace

    # implicit notation (the "name" field is implied)
    PREFIXED_USER_NAMESPACE:
      value: "my-prefix-$(USER_NAMESPACE)"
    SECRET_VALUE:
      valueFrom:
        secretKeyRef:
          name: my-k8s-secret
          key: password
Tolerations allow a pod to be scheduled on nodes with taints. These are additional tolerations on top of the default ones for user pods and core pods: hub.jupyter.org/dedicated=user:NoSchedule and hub.jupyter.org/dedicated=core:NoSchedule. Note that a duplicate set of tolerations exists where / is replaced with _, as Google Cloud does not yet support the character / in tolerations.
Pass this field an array of Toleration objects.
Affinities describe where pods prefer or require to be scheduled. They may prefer or require that the node they are scheduled on has a certain label (node affinity). They may also require to be scheduled close to, or away from, another pod (pod affinity and pod anti-affinity).
Pass this field an array of NodeSelectorTerm objects.
Pass this field an array of PreferredSchedulingTerm objects.
See the description of singleuser.extraNodeAffinity.
Pass this field an array of PodAffinityTerm objects.
Pass this field an array of WeightedPodAffinityTerm objects.
Objects for customizing the scheduling of various pods on the nodes and related labels.
The user scheduler makes sure that user pods are packed tightly onto nodes, which is useful for autoscaling of user node pools.
Enables the user scheduler.
You can have multiple schedulers to share the workload or improve availability on node failure.
The image containing the kube-scheduler binary.
Set the Pod Disruption Budget for the user scheduler.
Whether PodDisruptionBudget is enabled for the user scheduler.
Pod Priority is used to allow real users to evict placeholder pods, which in turn triggers a scale up by a cluster autoscaler. So, enabling this option will only make sense if the following conditions are met:
Your Kubernetes cluster has at least version 1.11
A cluster autoscaler is installed
user-placeholder pods are configured to get a priority equal to or higher than the cluster autoscaler’s priority cutoff
Normal user pods have a higher priority than the user-placeholder pods
Note that if the default priority cutoff is not configured on the cluster autoscaler, it will currently default to 0, and that in the future this is meant to be lowered. If your cloud provider is installing the cluster autoscaler for you, they may also configure this specifically.
Recommended settings for a cluster autoscaler…
… with a priority cutoff of -10 (GKE):
podPriority:
  enabled: true
  globalDefault: false
  defaultPriority: 0
  userPlaceholderPriority: -10
… with a priority cutoff of 0:
podPriority:
  enabled: true
  globalDefault: true
  defaultPriority: 10
  userPlaceholderPriority: 0
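The two priority-related conditions listed above can be checked mechanically before applying a configuration. The function below is a hypothetical helper, not part of the chart, and it assumes that normal user pods receive the default priority.

```python
def check_pod_priority(default_priority, user_placeholder_priority,
                       autoscaler_cutoff):
    """Return a list of problems; an empty list means both checks pass.

    Hypothetical helper. Assumes normal user pods get default_priority.
    """
    problems = []
    if user_placeholder_priority < autoscaler_cutoff:
        problems.append("placeholder priority is below the autoscaler cutoff")
    if default_priority <= user_placeholder_priority:
        problems.append("user pods do not outrank placeholder pods")
    return problems


# The two recommended settings above, with cutoffs of -10 and 0.
print(check_pod_priority(0, -10, autoscaler_cutoff=-10))  # []
print(check_pod_priority(10, 0, autoscaler_cutoff=0))     # []
```

A combination such as a placeholder priority of -10 with a cutoff of 0 would be reported, since the autoscaler would then ignore pending placeholder pods.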
Warning! This will influence all pods in the cluster.
The priority a pod usually gets is 0. But this can be overridden with a PriorityClass resource if it is declared to be the global default. This configuration option allows for the creation of such a global default.
The actual value for the default pod priority.
The actual value for the user-placeholder pods’ priority.
User placeholders simulate users but, thanks to PodPriority, will be evicted by the cluster autoscaler if a real user shows up. In this way, placeholders allow you to create headroom for real users and reduce the risk of a user having to wait for a node to be added. Be sure to use the continuous image puller along with placeholders, so the images are also available when real users arrive.
To test your setup efficiently, you can adjust the amount of user placeholders with the following command:
# Configure to have 3 user placeholders
kubectl scale sts/user-placeholder --replicas=3
How many placeholder pods would you like to have?
Unless specified here, the placeholder pods will request the same resources specified for the real singleuser pods.
These settings influence the core pods like the hub, proxy and user-scheduler pods.
Where should pods be scheduled? Perhaps scheduling on nodes with a certain label is preferred or even required?
Decide if core pods ignore, prefer or require to schedule on nodes with this label:
hub.jupyter.org/node-purpose=core
These settings influence the user pods like the user-placeholder, user-dummy and actual user pods named like jupyter-someusername.
Decide if user pods ignore, prefer or require to schedule on nodes with this label:
hub.jupyter.org/node-purpose=user
Enable the creation of a Kubernetes Ingress to proxy-public service.
See Advanced Topics — Zero to JupyterHub with Kubernetes 0.7.0 documentation for more details.
Annotations to apply to the Ingress.
List of hosts to route requests to the proxy.
Suffix added to Ingress’s routing path pattern.
Specify * if your ingress matches path by glob pattern.
TLS configurations for Ingress.
Annotations to apply to the hook and continuous image puller pods. One example use case is to disable istio sidecars which could interfere with the image pulling.
These are standard Kubernetes resources with requests and limits for cpu and memory. They will be used on the containers in the pods pulling images. These should be set extremely low, as the containers either shut down directly or are pause containers that just idle.
They were made configurable as usage of ResourceQuota may require containers in the namespace to have explicit resources set.
See the optimization section for more details.
The hook-image-awaiter has a criterion to await all the hook-image-puller DaemonSet’s pods to both schedule and finish their image pulling. This flag can be used to relax this criterion to instead, after a certain duration, only await the pods that have already been scheduled to finish their image pulling.
The value of this is that sometimes the newly created hook-image-puller pods cannot be scheduled because nodes are full, and then it probably won’t make sense to block a helm upgrade.
An infinite duration to wait for pods to schedule can be represented by -1. This was the default behavior of version 0.9.0 and earlier.
NOTE: If used with a Cluster Autoscaler (an autoscaling node pool), also add user-placeholders and enable pod priority.
prePuller:
  extraImages:
    myExtraImageIWantPulled:
      name: jupyter/all-spark-notebook
      tag: 2343e33dec46
Additional values to pass to the Hub. JupyterHub will not itself look at these, but you can read values in your own custom config via hub.extraConfig. For example:
custom:
  myHost: "https://example.horse"
hub:
  extraConfig:
    myConfig.py: |
      c.MyAuthenticator.host = get_config("custom.myHost")
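The dotted key passed to get_config in the example above addresses a nested value. The sketch below shows that idea with a hypothetical stand-in function. It is not the chart's own get_config, only an illustration of how a key like custom.myHost maps onto nested configuration.

```python
def lookup_dotted(values, dotted_key, default=None):
    """Walk nested dicts along a dotted key; return default if any part is missing.

    Hypothetical stand-in used only for illustration.
    """
    node = values
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node


# Mirrors the `custom` block from the example above.
values = {"custom": {"myHost": "https://example.horse"}}
print(lookup_dotted(values, "custom.myHost"))           # https://example.horse
print(lookup_dotted(values, "custom.missing", "n/a"))   # n/a
```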