Sunday, March 18, 2018

SageMaker model deployment - 3 simple steps

AWS SageMaker is a platform designed to support the full lifecycle of the data science process, from data preparation to model training to deployment. A clean separation, yet easy pipelining, between model training and deployment is one of its greatest strengths. A model can be developed on training instances and saved as files. The deployment process retrieves model artifacts saved in S3 and deploys a runtime environment as HTTP endpoints. Finally, any application can send REST queries and receive prediction results back from the deployed endpoints.

While simple in concept, information on the practical implementation of SageMaker model deployment and prediction queries is currently scarce and scattered. The process is easier to grasp as the three simple steps contained in a notebook.

1. create deployment model

We assume a model has been built (trained), with its results saved in S3. A deployment model is defined with both the model artifacts and the algorithm container.
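
As a rough sketch (not the exact notebook code), this step can be expressed with the boto3 SageMaker API; every name, ARN, and URI below is a placeholder to substitute for your own:

```python
# Sketch of defining a deployment model from training artifacts (step 1).
# All names, ARNs, and URIs are placeholders.
create_model_params = {
    "ModelName": "demo-model",
    "ExecutionRoleArn": "arn:aws:iam::<account-id>:role/<sagemaker-role>",
    "PrimaryContainer": {
        # algorithm container image (e.g. a built-in algorithm image URI)
        "Image": "<algorithm-container-image-uri>",
        # model artifacts saved by the training job
        "ModelDataUrl": "s3://<bucket>/<prefix>/model.tar.gz",
    },
}
# With boto3 available:
#   import boto3
#   sm = boto3.client("sagemaker")
#   sm.create_model(**create_model_params)
```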

2. configure deployment instances

Next, define the size and number of deployment instances, which will host the runtime for the deployment model's service endpoints.
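
A sketch of this step in boto3 (again with placeholder names; the instance type and count are illustrative):

```python
# Sketch of configuring deployment instances (step 2); names are placeholders.
create_endpoint_config_params = {
    "EndpointConfigName": "demo-endpoint-config",
    "ProductionVariants": [
        {
            "VariantName": "AllTraffic",
            "ModelName": "demo-model",       # must match the model from step 1
            "InstanceType": "ml.m4.xlarge",  # size of deployment instances
            "InitialInstanceCount": 1,       # number of deployment instances
        }
    ],
}
# With boto3 available:
#   sm.create_endpoint_config(**create_endpoint_config_params)
```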

3. deploy to service endpoints

Finally, create the service endpoints and wait for completion; model deployment is then finished, ready to serve prediction requests.
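
A sketch of the final step in boto3, including how an application might then query the endpoint (names, content type, and payload are all placeholder assumptions):

```python
# Sketch of creating the endpoint and waiting for it (step 3); names are placeholders.
create_endpoint_params = {
    "EndpointName": "demo-endpoint",
    "EndpointConfigName": "demo-endpoint-config",  # from step 2
}
# With boto3 available:
#   sm.create_endpoint(**create_endpoint_params)
#   sm.get_waiter("endpoint_in_service").wait(EndpointName="demo-endpoint")
#   # Once InService, any application can query the endpoint for predictions:
#   runtime = boto3.client("sagemaker-runtime")
#   response = runtime.invoke_endpoint(
#       EndpointName="demo-endpoint",
#       ContentType="text/csv",   # depends on the algorithm container
#       Body=b"1.0,2.0,3.0",      # placeholder payload
#   )
```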

The complete deployment process can be visualized as follows:

The complete sample notebook can be seen here:

Sunday, February 4, 2018

Auto starting R Studio on AWS Deep Learning server

As an enhancement to machine learning servers built on AWS or Azure, it is often necessary to set up an R development environment to meet the needs of the data science community.

Adapt for your specific environment. Here we assume the AWS Deep Learning conda image (Ubuntu). Specifically, we use the "python3" virtual environment (source activate python3). One reason to use this environment is that it is already set up to run Jupyter Notebook (see auto starting Jupyter); we can therefore add an additional R kernel to it. The result is a consolidated image that can be offered to both Python and R users.

The easiest method to install R is using conda:
conda install r r-essentials

RStudio is a popular R development environment. Follow its installation instructions, for example:
sudo apt-get install gdebi-core
sudo gdebi rstudio-server-1.1.419-i386.deb

The above procedure also sets up auto start of RStudio Server by adding /etc/systemd/system/rstudio-server.service. However, because the procedure installs RStudio with "sudo" into the default system environment, it cannot find R, which has been installed into a different (conda) environment. As a result, RStudio fails to start, with errors indicating it cannot find R.
rstudio-server verify-installation
Unable to find an installation of R on the system (which R didn't return valid output); Unable to locate R binary by scanning standard locations

This can be easily fixed by specifying the exact path to R for RStudio; replace the path with your installation of R:
sudo sh -c 'echo "rsession-which-r=/home/ubuntu/anaconda3/envs/python3/bin/R" >> /etc/rstudio/rserver.conf'

Restart the instance, and RStudio Server now starts successfully. Log in with Linux credentials at:
http://<server IP>:8787

Wednesday, January 31, 2018

Auto starting Jupyter Notebook on AWS Deep Learning server

Cloud and computing on demand is an increasingly powerful and cost effective combination of enabling technologies for data scientists. Further, utilizing machine learning servers such as those based on AWS deep learning AMIs can make a full suite of machine learning tools available in a matter of minutes.

Jupyter Notebook is a popular development interface for data analysis and model training. Currently, AWS has a published procedure for configuring, starting, and connecting to a notebook server.

However, the setup can be challenging, and repeating the steps each time an instance restarts is not ideal, especially when the server is offered to a broad data science community.

Here is an alternative that enhances the procedure by auto starting the notebook server.

Adapt for your specific environment. Here we assume the AWS Deep Learning conda image (Ubuntu). Specifically, we install into the "python3" environment (source activate python3).

Configure Jupyter Notebook

Similar to the steps outlined here, configure Jupyter Notebook, which consists of the following:

Create a key and cert. For example, in the ~/.jupyter/ directory:
openssl req -x509 -nodes -days 11499 -newkey rsa:1024 -keyout "jupytercert.key" -out "jupytercert.pem" -batch

Create a notebook password, then copy the generated hash string from the .json file:
jupyter notebook password

Update the notebook configuration file in ~/.jupyter/ (typically jupyter_notebook_config.py):
c.NotebookApp.open_browser = False
c.NotebookApp.ip = '*'
c.NotebookApp.port = 8888
c.NotebookApp.password = 'sha1:xxx'
c.NotebookApp.certfile = '/home/ubuntu/.jupyter/jupytercert.pem'
c.NotebookApp.keyfile = '/home/ubuntu/.jupyter/jupytercert.key'

Set up Auto Start Jupyter Notebook (virtualenv)

Setting up auto start is usually straightforward (for example, using /etc/rc.local). In this case, because the target environment is a virtualenv, we don't want to auto start in the default Python environment or as the root user, but we still want to use rc.local. Use the following two step process.

Create a script in /home/ubuntu/.jupyter/ and make it executable (note the use of absolute paths to invoke the executables):
#!/bin/bash
source /home/ubuntu/anaconda3/bin/activate python3
/home/ubuntu/anaconda3/envs/python3/bin/jupyter notebook &

Edit /etc/rc.local and add the following; note that we switch to the ubuntu user and invoke the startup script:
cd /home/ubuntu
su ubuntu -c "nohup /home/ubuntu/.jupyter/ >/dev/null 2>&1 &"

The reason for the two step process is to be able to execute multiple commands (I did not find an effective way to do that easily in rc.local).

User Access to Jupyter Notebook

Jupyter Notebook will now always start automatically with the instance. Without any additional setup, users can conveniently access the Jupyter server at:
https://<server IP>:8888

Sunday, January 14, 2018

Azure automation with Logic App - passing variable in workflow

Similar to AWS Lambda, Azure Logic App can be used for automated workflows. However, clear documentation is harder to come by, there are fewer working examples, and effective technical support is often lacking.

In a workflow, passing the output of one step to another should be a common requirement. The motivation for posting this working solution is that there is no clear example illustrating exactly how that is done; it should take a few minutes to learn, rather than hours of trial and error.

output from step 1

Using a simple two step workflow to illustrate: in step 1, we use an Azure Function App with a PowerShell script. We can obtain a user email dynamically from the Azure VM's user defined tag field:
$user_email = (Get-AzureRmVM -ResourceGroupName $resourceGroupName -Name $resourceName -ErrorAction $ErrorActionPreference -WarningAction $WarningPreference).Tags["user_email"]

More importantly, the obtained result needs to be written to this rather odd "Out-File" structure. This is how a variable can be passed along the workflow:
$result = $user_email | ConvertTo-Json
Out-File -Encoding Ascii -FilePath $res -inputObject $result

input to step 2

In a subsequent step, we can use the output of the previous step, in this case to send an email to the VM's user per the tag. This is best illustrated using the graphical interface of the Logic App Designer:

Azure recognizes that a step generates an output and makes it available for use in subsequent steps. The particular handle is shown as the "Body" of the Step 1 Function App, again a rather odd representation.

But it does work, and this simple mechanism is a much needed building block for constructing complex features in a workflow.

Saturday, March 18, 2017

Three Networking features AWS should support

AWS is continuously enhancing its platform and adding new features. However, a number of fundamental networking features have been discussed for a while and, based on recent interactions with the AWS team, are still not on the roadmap.

Here are three of those features high on my list, and why.

1. Multi-Path Routing (ECMP)
Currently, an AWS route table does not allow multiple routes to the same destination. For example, I can only define the default route in a private route table to point to a single destination (which can be a single point of failure).
If ECMP were supported, users would have many load sharing and resiliency options. For example, I could define multiple default routes pointing to redundant load sharing gateways in multiple zones.

However, users still need to keep those routes up to date if the target instances change. This can be done by keeping the ENI persistent and reattaching it to new instances, or by triggering a Lambda function to update routes when an instance is refreshed.
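
As a minimal sketch of the Lambda idea (all resource IDs below are hypothetical placeholders), the route update itself is a single EC2 API call:

```python
# Sketch of the route refresh a Lambda function could perform when a gateway
# instance is replaced; all resource IDs are hypothetical placeholders.
replace_route_params = {
    "RouteTableId": "rtb-0123456789abcdef0",
    "DestinationCidrBlock": "0.0.0.0/0",            # the default route
    "NetworkInterfaceId": "eni-0123456789abcdef0",  # ENI of the new gateway instance
}
# With boto3 available (e.g. inside the Lambda handler):
#   import boto3
#   boto3.client("ec2").replace_route(**replace_route_params)
```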

2. ELB as Route Table target
Supporting a load balancer as a routing target may not seem natural as a network solution; there would need to be an internal implementation that forwards traffic to the resolved load balancer and the instances behind it.
This capability would allow users to fully benefit from the scalability and resiliency of the load balancer, and to have "native" high availability without a self-maintained layer of Lambda checks and actions.

An example that this can be done can be found in Azure: a User Defined Route (UDR) can point to an Azure Load Balancer (ALB). This enables a route table to send traffic to a cluster of gateway nodes behind the load balancer, which leads to simple and elegant resiliency.

3. Native Transit VPC
In large scale enterprise use of AWS, as the number of VPCs goes up, a transit VPC can really help to scale by consolidating connectivity. Currently, there is a Cisco CSR based solution, but any third party appliance would require maintenance overhead and introduce bottlenecks.

The ideal solution would be an AWS enabled transit that users can define themselves, much like peering connections.

I hope these requirements are echoed by the user community.