Wednesday, January 31, 2018

Auto starting Jupyter Notebook on AWS Deep Learning server

Cloud and on-demand computing are an increasingly powerful and cost-effective combination of enabling technologies for data scientists. Machine learning servers, such as those based on the AWS Deep Learning AMIs, can make a full suite of machine learning tools available in a matter of minutes.

Jupyter Notebook is a popular development interface for data analysis and model training. Currently, AWS has a published procedure for configuring, starting, and connecting to a notebook server:
https://docs.aws.amazon.com/dlami/latest/devguide/setup-jupyter.html

However, the setup can be challenging, and repeating the start and connect steps each time an instance restarts is not ideal, especially when the server is offered to a broad data science community.

Here is an alternative that enhances the AWS procedure so that the notebook server starts automatically.

Adapt the steps for your specific environment. Here we assume the AWS Deep Learning conda image (Ubuntu). Specifically, we install into the "python3" environment (source activate python3).

Configure Jupyter Notebook

Similar to the steps outlined in the AWS guide linked above, configuring Jupyter Notebook consists of:

Create a key and certificate. For example, in the ~/.jupyter/ directory (a 2048-bit RSA key is used here, since 1024-bit keys are considered weak):
openssl req -x509 -nodes -days 11499 -newkey rsa:2048 -keyout "jupytercert.key" -out "jupytercert.pem" -batch

Create a notebook password, then copy the generated hash string from the .json file that the command writes (~/.jupyter/jupyter_notebook_config.json):
jupyter notebook password

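Update ~/.jupyter/jupyter_notebook_config.py (the password hash must be a quoted string):
c.NotebookApp.open_browser = False   # headless server, no local browser
c.NotebookApp.ip = '*'               # listen on all network interfaces
c.NotebookApp.port = 8888
c.NotebookApp.password = 'sha1:xxx'  # hash string copied from the .json file
c.NotebookApp.certfile = '/home/ubuntu/.jupyter/jupytercert.pem'
c.NotebookApp.keyfile = '/home/ubuntu/.jupyter/jupytercert.key'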
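
The copy step for the password hash can also be scripted. Below is a minimal sketch, assuming "jupyter notebook password" wrote the hash to ~/.jupyter/jupyter_notebook_config.json under the NotebookApp / password keys. The helper name read_password_hash is hypothetical and not part of Jupyter, and the hash prefix may differ from sha1 depending on the notebook version.

```python
import json
from pathlib import Path

# Assumed location of the file written by `jupyter notebook password`.
DEFAULT_JSON = Path.home() / ".jupyter" / "jupyter_notebook_config.json"


def read_password_hash(json_path=DEFAULT_JSON):
    """Return the hashed password stored in the JSON config, or None if absent."""
    path = Path(json_path)
    if not path.exists():
        return None
    with path.open() as f:
        config = json.load(f)
    # Assumed layout: {"NotebookApp": {"password": "<hash string>"}}
    return config.get("NotebookApp", {}).get("password")


if __name__ == "__main__":
    hashed = read_password_hash()
    if hashed is None:
        print("No password hash found; run `jupyter notebook password` first.")
    else:
        # Prints a line that can be pasted into jupyter_notebook_config.py.
        print(f"c.NotebookApp.password = {hashed!r}")
```

Run it as the ubuntu user so that the home directory resolves to /home/ubuntu.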

Set up Auto Start Jupyter Notebook (virtualenv)

Setting up auto start is usually straightforward (for example, using /etc/rc.local). In this case, however, the target is a virtual environment (the conda "python3" environment), and we don't want to auto start in the default Python environment or as the root user. We still want to use rc.local, so use the following two-step process.

Create a script /home/ubuntu/.jupyter/start_notebook.sh and make it executable with chmod +x (note the use of an absolute path to invoke the jupyter executable):
#!/bin/bash
source /home/ubuntu/anaconda3/bin/activate python3
/home/ubuntu/anaconda3/envs/python3/bin/jupyter notebook &


Edit /etc/rc.local and add the following (before the final "exit 0" line, if there is one). Note that we switch to the ubuntu user and invoke the startup script:
cd /home/ubuntu
su ubuntu -c "nohup /home/ubuntu/.jupyter/start_notebook.sh >/dev/null 2>&1 &"

The reason for this two-step process is to be able to execute multiple commands as the ubuntu user (I didn't find an effective way to do that easily in rc.local itself).
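
After a reboot, it helps to confirm that the server actually came up. Here is a minimal sketch that only checks whether something accepts TCP connections on the configured port (8888 above). The function name is_port_open is hypothetical, and the check does not prove that the listener is Jupyter.

```python
import socket


def is_port_open(host="127.0.0.1", port=8888, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Something is listening on port 8888.")
    else:
        print("Nothing is listening on port 8888; check /etc/rc.local and start_notebook.sh.")
```

Run it on the instance itself shortly after boot. If it reports nothing listening, temporarily replacing >/dev/null in the rc.local line with a log file path is one way to see why the script failed.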

User Access to Jupyter Notebook

Jupyter Notebook will now start automatically whenever the instance starts. Without any additional setup, users can conveniently access the Jupyter server at
"https://(server IP):8888"

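The same URL can be checked from a script. Because the certificate created earlier is self-signed, a default HTTPS client will refuse it, so this sketch disables certificate verification. That is acceptable for a quick reachability test, not for regular use. The function names are hypothetical, and the server address is passed on the command line.

```python
import ssl
import sys
import urllib.request


def notebook_url(host, port=8888):
    """Build the HTTPS URL for the notebook server."""
    return f"https://{host}:{port}"


def self_signed_context():
    """SSL context that skips verification, for a self-signed certificate."""
    context = ssl.create_default_context()
    # check_hostname must be turned off before verify_mode can be relaxed.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context


def fetch_status(host, port=8888, timeout=5.0):
    """Return the HTTP status code of the notebook landing page."""
    url = notebook_url(host, port)
    with urllib.request.urlopen(url, context=self_signed_context(), timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Usage: python check_notebook.py <server IP>
        print(fetch_status(sys.argv[1]))
    else:
        print("usage: python check_notebook.py <server IP>")
```

The script name check_notebook.py is only an example. A browser will show a similar certificate warning the first time you open the URL.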