Autorun Python Notebooks in AWS Sagemaker

Erima Goyal
4 min read · Oct 23, 2020

Problem Statement: Deploy machine learning models on AWS without getting into Docker, containers, and other advanced deployment tools.

I created a few Python notebooks, each containing one machine learning model prefaced by some feature engineering and data manipulation steps. Without getting into advanced deployment options, I just wanted to run all the Jupyter notebooks, within the SageMaker notebook instance, on a weekly basis. Of course, I did not want to babysit and manually run each notebook either.

Solution: As part of a bigger solution,
1. A Lambda function can auto-start the SageMaker notebook instance at a certain time
2. A lifecycle configuration can execute the notebooks on start and then stop the SageMaker instance once they are done executing
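For the scheduling half, here is a minimal sketch of the call a scheduled Lambda (or any cron job with the AWS CLI installed) would make. The instance name is a hypothetical placeholder, and the command is printed as a dry run rather than executed:

```shell
#!/bin/bash
# Hypothetical notebook instance name -- replace with your own
NOTEBOOK_NAME="my-weekly-notebook"
# The AWS CLI equivalent of what a scheduled Lambda would invoke.
# Printed here as a dry run; drop the echo to actually start the instance.
CMD="aws sagemaker start-notebook-instance --notebook-instance-name $NOTEBOOK_NAME"
echo "$CMD"
```

Once the instance starts, the lifecycle configuration below takes over.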

In this blog post, I am capturing how to use Lifecycle configuration.

There are two parts to the Lifecycle configuration:
1. Run the notebooks on start: Use nbconvert to run multiple Python notebooks within the SageMaker instance
2. Stop the SageMaker notebook: Use the auto-stop-idle configuration that is widely available on GitHub and in AWS blogs. However, I intend to capture some nuances.

Part 1: Run the notebook on start

Some housekeeping edits are needed in the notebooks, as these features are not supported by nbconvert and lifecycle configurations:
1. Take out all package installations from your notebook cells
2. Remove matplotlib and all related visualizations

Lifecycle configuration script

#!/bin/bash
set -e
ENVIRONMENT=python3
# Declare all the Jupyter notebooks that need to run within the SageMaker instance
FILE1="/home/ec2-user/SageMaker/Notebook1.ipynb"
FILE2="/home/ec2-user/SageMaker/Notebook2.ipynb"
FILE3="/home/ec2-user/SageMaker/Notebook3.ipynb"
# Activate the Python environment; the lifecycle configuration cannot autodetect it
source /home/ec2-user/anaconda3/bin/activate "$ENVIRONMENT"
# Install packages here instead of putting them inside the notebooks
pip install --upgrade pip
pip install PyAthena
# Execute the notebooks in the background
nohup jupyter nbconvert "$FILE1" "$FILE2" "$FILE3" --ExecutePreprocessor.kernel_name=python3 --to notebook --inplace --ExecutePreprocessor.timeout=7200 --execute &
# Deactivate the Python environment
source /home/ec2-user/anaconda3/bin/deactivate

Decoding the script

nohup

nohup jupyter nbconvert "$FILE1" "$FILE2" "$FILE3" --ExecutePreprocessor.kernel_name=python3 --to notebook --inplace  --ExecutePreprocessor.timeout=7200 --execute &

A lifecycle configuration script has 5 minutes to finish; if it does not, the notebook instance fails to start. Using “nohup” makes the notebooks run in the background so the script can exit while the instance continues starting up. If your notebooks take less than 5 minutes to run, you can ignore this.
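The backgrounding pattern itself can be sketched with a stand-in command (sleep here is just a placeholder for the long-running nbconvert call, and the log path is illustrative):

```shell
#!/bin/bash
# nohup detaches the command from the terminal; & runs it in the background,
# so the enclosing script can exit within the 5-minute lifecycle limit.
nohup sleep 2 > /tmp/nbconvert_demo.log 2>&1 &
BG_PID=$!
echo "lifecycle script can exit now; background job $BG_PID keeps running"
```

Redirecting output to a log file, as shown, also gives you something to inspect later if the run fails.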

nbconvert

nohup jupyter nbconvert "$FILE1" "$FILE2" "$FILE3" --ExecutePreprocessor.kernel_name=python3 --to notebook --inplace  --ExecutePreprocessor.timeout=7200 --execute &

Executes the notebooks. nbconvert’s default output format is HTML, but the flags here write the executed result back into the notebook itself. To run the notebooks sequentially, club them all into one nbconvert call; to run them in parallel, use multiple nbconvert statements.
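The sequential-versus-parallel distinction can be sketched with stand-in tasks (in the real script each `task` call would be a separate nbconvert invocation):

```shell
#!/bin/bash
# Stand-in for a single nbconvert call on one notebook
task() { sleep 1; echo "finished $1"; }

# Parallel: launch each notebook as its own background job, then wait for all
task Notebook1 &   # in the real script: jupyter nbconvert "$FILE1" ... &
task Notebook2 &   # jupyter nbconvert "$FILE2" ... &
wait               # block until every background job completes
echo "all notebooks done"
```

Parallel runs finish faster but share the instance’s memory and CPU, so sequential is the safer default for heavy notebooks.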

to notebook and inplace

nohup jupyter nbconvert "$FILE1" "$FILE2" "$FILE3" --ExecutePreprocessor.kernel_name=python3 --to notebook --inplace  --ExecutePreprocessor.timeout=7200 --execute &

Updates the same notebook in place once execution finishes. Without “inplace”, nbconvert saves the result to another copy of the .ipynb notebook.

timeout

nohup jupyter nbconvert "$FILE1" "$FILE2" "$FILE3" --ExecutePreprocessor.kernel_name=python3 --to notebook --inplace  --ExecutePreprocessor.timeout=7200 --execute &

The arguments passed to ExecutePreprocessor are configuration options called traitlets. The timeout traitlet defines the maximum time (in seconds) each notebook cell is allowed to run; if execution takes longer, an exception is raised. The default is 30 seconds, so for long-running cells you may want to specify a higher value. The timeout option can also be set to None or -1 to remove any restriction on execution time.

For further customization, detailed documentation on nbconvert is available online.

Debugging errors

1. Track the logs in CloudWatch by clicking “View logs” on the link below the Lifecycle configuration

2. Run the commands in the following order one by one in the terminal

$ cd SageMaker/YourNotebook
$ source /home/ec2-user/anaconda3/bin/activate python3
$ pip install <packages>
$ jupyter nbconvert TestNotebook.ipynb --ExecutePreprocessor.kernel_name=python3 --to notebook --inplace --execute
$ source /home/ec2-user/anaconda3/bin/deactivate

Part 2: Stop the Sagemaker notebook

Appending the following to the lifecycle configuration will stop the notebook instance after it has been idle for the time specified in IDLE_TIME.

IDLE_TIME=7200  # 2 hrs
echo "Fetching the autostop script"
wget https://raw.githubusercontent.com/aws-samples/amazon-sagemaker-notebook-instance-lifecycle-config-samples/master/scripts/auto-stop-idle/autostop.py
echo "Starting the SageMaker autostop script in cron"
(crontab -l 2>/dev/null; echo "*/1 * * * * /usr/bin/python $PWD/autostop.py --time $IDLE_TIME --ignore-connections") | crontab

The key here is to set the idle time to at least the total runtime of your notebooks. In my case, the notebooks take ~2 hours to run, so I set IDLE_TIME to 7200 seconds. The cron entry runs autostop.py every minute; the script stops the instance once the idle threshold is crossed.

Bad character error

You must be wondering where that ^M and \r showing up in the logs comes from. It is not your code; it is a Windows/Unix line-ending mismatch: Windows ends each line with a carriage return plus newline (\r\n), and Unix shells do not interpret the carriage return well. There are a few ways to deal with this: convert the line endings using Visual Studio Code or Notepad++, or copy the code directly from GitHub.
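The fix can also be scripted on the instance itself. A minimal demo that simulates a file saved with Windows line endings and strips the trailing carriage returns with sed (dos2unix works too, if installed):

```shell
#!/bin/bash
# Simulate a script saved on Windows: each line ends with \r\n
printf 'echo "hello"\r\n' > /tmp/crlf_demo.sh
# Strip the trailing carriage return from every line
sed -i 's/\r$//' /tmp/crlf_demo.sh
# The script now runs cleanly under bash
bash /tmp/crlf_demo.sh
```

Running the same sed command over your lifecycle configuration script before pasting it into the console avoids the error entirely.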


Erima Goyal

Erima has 10+ years of experience in the data science space. She currently leads the data science team at Parkland Corporation.