The source code of SQLFlow is written in Go, Java, protobuf, yacc, and Python. To build from source, we need the toolchains of all these languages. In addition, we need to install MySQL, Hive, and the MaxCompute client for unit tests. To ease software installation and configuration, we provide a Dockerfile that contains all the software required for building and testing.
On the host computer, we need:
- Git for checking out the source code.
- Docker CE >= 18.x for building the Docker image of development tools.
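The two prerequisites above can be checked with a small shell sketch like the following. The function names docker_version_ok and check_prerequisites are hypothetical helpers, not part of SQLFlow, and the check only compares the major version of the Docker client against 18.

```shell
# Hypothetical helper: succeeds if the major version in $1 (e.g. "18.09.1") is >= 18.
docker_version_ok() {
    major=${1%%.*}                 # keep everything before the first dot
    case "$major" in
        ''|*[!0-9]*) return 1 ;;   # empty or not a number
    esac
    [ "$major" -ge 18 ]
}

# Hypothetical helper: reports whether Git and a recent enough Docker are installed.
check_prerequisites() {
    command -v git >/dev/null 2>&1 || { echo "git not found"; return 1; }
    command -v docker >/dev/null 2>&1 || { echo "docker not found"; return 1; }
    version=$(docker version --format '{{.Client.Version}}') || return 1
    docker_version_ok "$version" || { echo "Docker $version is older than 18.x"; return 1; }
    echo "prerequisites look fine"
}

# Example: check_prerequisites
```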
We can clone the source code to any working directory, say, the home directory:
    cd ~
    git clone https://github.com/sql-machine-learning/sqlflow
We can build the Docker image from the Dockerfile in the source directory:
    cd sqlflow
    docker build -t sqlflow .
Or, we can pull the Docker image pre-built by the CI system from DockerHub.
    docker pull sqlflow/sqlflow
    docker tag sqlflow/sqlflow:latest sqlflow:latest
Let us start a container running the development Docker image.
    docker run --rm -it -v $HOME/sqlflow:/sqlflow -w /sqlflow sqlflow bash
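The docker run command above assumes that the source code was cloned to $HOME/sqlflow. If it lives elsewhere, only the host side of the -v mapping changes. The sketch below prints the command for a given checkout directory instead of running it; the function name dev_container_cmd is hypothetical, and paths containing spaces are not handled.

```shell
# Hypothetical helper: print the docker run command for a checkout at $1
# (defaults to $HOME/sqlflow, the location used in this guide).
dev_container_cmd() {
    src=${1:-$HOME/sqlflow}
    # --rm    remove the container when the shell exits
    # -it     attach an interactive terminal
    # -v A:B  bind-mount host directory A at B inside the container
    # -w DIR  start in DIR inside the container
    printf '%s\n' "docker run --rm -it -v $src:/sqlflow -w /sqlflow sqlflow bash"
}

# Example: dev_container_cmd "$HOME/work/sqlflow"
```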
In the Docker container, we need to start a MySQL server for testing.
    service mysql start
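If the tests are started from a script right after this command, the server may not be accepting connections yet. One way to guard against that is a generic retry helper such as the sketch below. The function name wait_for is hypothetical, and the usage line assumes that mysqladmin is available in the container.

```shell
# Hypothetical helper: run a command up to $1 times, one second apart,
# and succeed as soon as the command succeeds.
wait_for() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep 1
        fi
    done
    return 1
}

# Example (assumes mysqladmin is installed): wait_for 30 mysqladmin ping
```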
Then, we can build and run tests.
    go generate ./...
    PYTHONPATH=/sqlflow/python SQLFLOW_TEST_DB=mysql gotest -v -p 1 ./...
- go generate is necessary to call protoc for translating the gRPC interface and to call goyacc for generating the parser.
- The environment variable PYTHONPATH=/sqlflow/python points Python at the source tree mounted into the container, which ensures that the Python part of SQLFlow used in the tests is up to date.
- The environment variable SQLFLOW_TEST_DB=mysql specifies MySQL as the SQL engine during testing. You can also choose hive for Apache Hive and maxcompute for Alibaba MaxCompute.
- The -p 1 argument, which makes go test run the packages one at a time, is necessary to run all tests; otherwise you will encounter the same problem as this issue.
- Please feel free to use go test instead of gotest. We use the latter for colorized output.
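Since the test command differs between engines only in the value of SQLFLOW_TEST_DB, switching engines can be wrapped in a small function. The sketch below assumes that the three values named above are the only valid ones. The function name test_cmd is hypothetical, and it prints the command rather than running it.

```shell
# Hypothetical helper: print the test command for the SQL engine named in $1
# (defaults to mysql).
test_cmd() {
    db=${1:-mysql}
    case "$db" in
        mysql|hive|maxcompute) ;;
        *) echo "unknown SQLFLOW_TEST_DB: $db" >&2; return 1 ;;
    esac
    # -p 1 runs the test binaries of the packages one at a time.
    printf '%s\n' "PYTHONPATH=/sqlflow/python SQLFLOW_TEST_DB=$db go test -v -p 1 ./..."
}

# Example: test_cmd hive
```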
As the above docker run command binds the source code directory on the host computer to the container, we can edit the source code on the host using any editor, such as VS Code or Emacs.
After editing and before committing with Git, please install the pre-commit tool. SQLFlow needs it to run pre-commit checks.
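A minimal sketch of that setup follows, assuming pip is available wherever you run git commit. The function name setup_pre_commit is hypothetical; pip install pre-commit and pre-commit install are the standard commands of the pre-commit tool.

```shell
# Hypothetical helper: install the pre-commit tool if it is missing, then
# register its Git hook in the current repository.
setup_pre_commit() {
    if ! command -v pre-commit >/dev/null 2>&1; then
        pip install pre-commit || return 1
    fi
    # Installs the hook script so that the checks run on every git commit.
    pre-commit install
}

# Example, run from the root of the sqlflow checkout: setup_pre_commit
```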
SQLFlow provides a command-line tool, repl, for evaluating SQL statements. This tool makes debugging easy. To build and start it, run the following commands.
    cd cmd/repl
    go install
    ~/go/bin/repl --datasource="mysql://root:root@tcp(localhost:3306)/?maxAllowedPacket=0"
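The --datasource value above consists of a mysql:// prefix followed by what looks like a DSN in the style of the Go MySQL driver: user:password@tcp(host:port)/ and then optional parameters. The sketch below builds the same string from its parts. The function name mysql_datasource is hypothetical, and it does not escape special characters in the user name or password.

```shell
# Hypothetical helper: build a MySQL datasource string for repl.
# Arguments: user, password, host, port.
mysql_datasource() {
    printf 'mysql://%s:%s@tcp(%s:%s)/?maxAllowedPacket=0\n' "$1" "$2" "$3" "$4"
}

# Example: ~/go/bin/repl --datasource="$(mysql_datasource root root localhost 3306)"
```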
Please follow the REPL tutorial to understand what we can do with the REPL.