Running SQLFlow server as Workflow Mode

This tutorial explains how to configure the SQLFLow server to translate SQL programs into workflows. For what is a workflow and why we need it, please refer to here. For the workflow package of SQLFLow that interact with the workflow execution engine, please refer to here

For a typical workload, users usually type in a SQL program in a GUI system like Jupyter Notebook. The SQLFlow magic command would send the the SQL program to SQLFlow server via gRPC. The SQLFlow gRPC server accepts a SQL program as the request and translate into a workflow. Some steps of the workflow send normal SQL statement to the database system, and some others might submit an AI job to train a model, to predict using a trained model or to visually explain a trained model.

A Step-by-step Guide

As the following step-by-step guide, you can setup the requirements components e.g. a mini Kubernetes cluster(minikube), SQLFlow server and Jupyter Notebook on you laptop. For the last step, you can run a SQL program in a Jupyter Notebook.

Setup Kubernetes and Argo

  1. Install Minikube
  2. Setup Minikube

    minikube start --cpus 2 --memory 4000
  3. Install Argo controller and Argo UI.

     kubectl crete namespace argo
     kubectl apply -n argo -f
  4. Forward the port of Argo UI server to the host

     kubectl -n argo port-forward deployment/argo-server 2746:2746

Launch SQLFlow server

docker run --rm -it --name sqlflow sqlflow/sqlflow bash
SQLFLOW_WORKFLOW_LOGVIEW_ENDPOINT=http://localhost:8001 sqlflowserver --argo-mode=true

SQLFLOW_WORKFLOW_LOGVIEW_ENDPOINT specify the Argo UI endpoint which can be used to assemble the step log url.

Launch a MySQL server on Kubernetes

The SQLFLow all-in-one Docker image packaged the MySQL server and sample datasets, you can launch a MySQL server on Kubernetes with populating the sample datasets as the following command:

kubectl run mysql --port 3306 --env="SQLFLOW_MYSQL_HOST=" --env="SQLFLOW_MYSQL_PORT=3306" --image=sqlflow/sqlflow --command -- bash / mysql

Launch Jupyter Notebook Server and Run SQLFlow SQL

Run docker exec to exec into the sqlflow container and launch Jupyter Notebook server:

docker exec -it sqlflow bash
SQLFLOW_MYSQL_HOST= SQLFLOW_MYSQL_PORT=3306 bash / sqlflow-notebook

SQLFLOW_MYSQL_HOST is the Pod IP of MySQL, you can check it out by kubectl get pod -l run=mysql -o jsonpath="{.items[0]}"

Open browser and goto http://localhost:8888, create a new Python3 Kerenel and type in the following SQL program:

SELECT * FROM iris.train
LABEL class
INTO sqlflow_models.my_dnn_model;

SELECT * FROM iris.test
TO PREDICT iris.predict.class
USING sqlflow_models.my_dnn_model;

Then You would see the step logs like the following screen-cap:

If you want to take a look at the detailed logs of step, you can click the Step Log link to jump to the log viewer webpage: