We recently started learning about cloud computing and are looking into both AWS and GCP. There are many great tutorials, but we noticed that most of them have one thing in common: they use great software! By that we mean that the software they deploy is easy to set up and run. But what if you need to deploy code that is messy? Code that has weird dependencies and can't just be installed with apt-get? To find out, we wrote a messy application and deployed it.
First, let us introduce maap, the messy application. What makes maap messy? It is written in C++, so it has to be compiled, and some of the code it uses only gets generated during compilation! Maap reads its input data from .mf files (messy files), a custom file format that only maap understands. It also depends on Google's protobuf library, which we installed using conda, so maap needs to know where conda put it. Here is a sketch of what maap does, how it relates to other tools and how to set it up:
Let's assume we are not allowed to change anything about maap and our task is to deploy it to a cloud. Since we can't just install it and its dependencies are weird, we decide to put maap in a Docker image. Everything that is needed to make maap work, like setting up conda, installing dependencies, declaring environment variables and compiling maap, goes into the Dockerfile:
# Create an image with conda already installed
FROM continuumio/anaconda3
# Declare an environment variable that points to the conda installation
ENV CONDA=/opt/conda
# Install dependencies (-y answers conda's confirmation prompt during the build)
RUN conda install -y -c anaconda protobuf=3.5.1
# Add maap code to the image
WORKDIR /root
COPY . /root
# Compile maap using its Makefile
RUN make
# When the container starts, maap starts
CMD ["make", "run"]
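Before pushing anything, it is worth checking that the image builds and starts on your own machine. Here is a minimal sketch, assuming you run it from the directory that contains the Dockerfile. The tag name maap and the two helper function names are our own placeholders, not something the setup requires:

```shell
#!/bin/sh
# Hypothetical helpers for building and smoke-testing the image locally.
# The default tag "maap" is a placeholder; pick any name you like.

# Build the image from the Dockerfile in the current directory.
build_maap() {
    docker build -t "${1:-maap}" .
}

# Start a throwaway container; --rm deletes it again once maap exits.
run_maap() {
    docker run --rm "${1:-maap}"
}
```

After sourcing this file, calling build_maap and then run_maap should compile maap and start it through the image's default command. The image ID needed for tagging shows up in the output of docker images.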
Now that we have the Docker image, maap is easy to set up anywhere. But before we can deploy it, we have to make the image available. We do that by pushing it to our Docker Hub repository:
$ docker tag [IMAGE_ID] [REPO_NAME]/maap:latest
$ docker push [REPO_NAME]/maap:latest
Now the image can be used from anywhere! For this example we want to deploy maap to GCP. (We assume you already have a GCP account and the gcloud command-line tool installed.) GCP's Compute Engine allows us to create a new virtual machine instance that has our Docker image already loaded. As we are writing this, the feature is still in beta, though. Here is how you can set up an instance with the image loaded, where [image] is the image we just pushed:
$ gcloud beta compute instances create-with-container maap --container-image [image]
Once the instance is up, you can connect to it via SSH:
$ gcloud compute --project "your project name" ssh --zone "your vm zone" "maap"
If something failed, you can check the log using this command:
$ sudo journalctl -u konlet-startup
You can upload data to the instance using scp and process it by running the container with the folder that holds your input data mounted into it:
$ docker run --rm -v /path/to/input:/root/input [REPO_NAME]/maap
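The upload and processing steps can be sketched as follows. This assumes the instance is called maap as above and uses gcloud compute scp instead of plain scp, so that gcloud takes care of the SSH keys. The local folder ./input, the remote folder ~/input and the function names are placeholders we made up for illustration:

```shell
#!/bin/sh
# Hypothetical helpers; folder names and function names are placeholders.

# Run on your own machine: copy a local folder of .mf files to the VM.
# $1 = local folder (default ./input), $2 = zone of the VM.
# Note: if ~/input already exists on the VM, scp places the folder inside it.
upload_input() {
    gcloud compute scp --recurse "${1:-./input}" "maap:~/input" --zone "$2"
}

# Run on the VM: mount the uploaded folder at /root/input and start maap.
# $1 = image name, e.g. [REPO_NAME]/maap
process_input() {
    docker run --rm -v "$HOME/input:/root/input" "$1"
}
```

Mounting the folder with -v rather than baking the data into the image means the same image can process new input without being rebuilt.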
The code for maap is on our GitHub, where we also explain in a bit more detail how the compilation works.
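For readers who do not want to open the repository, here is a rough idea of what a build like this can involve. This is not maap's actual Makefile but a sketch under assumptions: the file names messy.proto and main.cpp and the function name are invented, we assume the generated code comes from protoc, and we assume CONDA points at the conda installation as in the Dockerfile:

```shell
#!/bin/sh
# Sketch only: messy.proto and main.cpp are made-up file names,
# and this is not the Makefile that ships with maap.
CONDA="${CONDA:-/opt/conda}"   # conda prefix, as declared in the Dockerfile
CXX="${CXX:-g++}"              # C++ compiler to use

compile_sketch() {
    # Step 1: generate messy.pb.h and messy.pb.cc from the message definition.
    protoc --cpp_out=. messy.proto || return 1
    # Step 2: compile and link against the protobuf headers and library
    # that conda installed; -rpath lets the binary find the library at run time.
    "$CXX" -I"$CONDA/include" main.cpp messy.pb.cc \
        -L"$CONDA/lib" -Wl,-rpath,"$CONDA/lib" -lprotobuf -o maap
}
```

The point of the sketch is the order of the steps: the generated sources do not exist until protoc has run, and the compiler has to be told where conda keeps protobuf.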
