Developer Guidelines
The repository contains a devcontainer configuration compatible with VsCode. This configuration will create a docker container with all the necessary libraries and configurations to develop and execute the source code.
๐ ๏ธ Development Environment Setup
This repository provides a docker dev-container with a system completely configured to execute the source code as well as useful libraries and tools for the development of the multimno software. Thanks to the dev-container, it is guaranteed that all developers are developing/executing code in the same environment.
The dev-container specification is all stored inside the .devcontainer
directory.
Configuring the docker container
Configuration parameters for building the docker image and for creating the docker container are specified in
the configuration file .devcontainer/.env
file. This file
contains user specific container configuration (like the path in the host machine to the data). As this file will change
for each developer it is ignored for the git version control and must be created after cloning the repository.
A template file
is stored in this repostiory. You can use this
file as a baseline copying it to the .devcontainer
directory.
cp resources/templates/dev_container_template.env .devcontainer/.env
Please edit the .devcontainer/.env
file Docker run parameters
section
with your preferences:
# ------------------- Docker Build parameters -------------------
PYTHON_VERSION=3.11 # Python version.
JDK_VERSION=17 # Java version.
SPARK_VERSION=3.4.1 # Spark/Pyspark version.
SCALA_VERSION=2.12 # Spark dependency.
SEDONA_VERSION=1.5.1 # Sedona
GEOTOOLS_WRAPPER_VERSION=28.2 # Sedona dependency
# ------------------- Docker run parameters -------------------
CONTAINER_NAME=multimno_dev_container # Container name.
DATA_DIR=../sample_data # Path of the host machine to the data to be used within the container.
SPARK_LOGS_DIR=../sample_data/logs # Path of the host machine to where the spark logs will be stored.
JL_PORT=8888 # Port of the host machine to deploy a jupyterlab.
JL_CPU=4 # CPU cores of the container.
JL_MEM=16g # RAM of the container.
Starting the dev environment
VsCode
Prerequisite: Dev-Container/Docker extension
In VsCode: F1 -> Dev Containers: Rebuild and Reopen in container
Manual
Docker image creation
Execute the following command:
docker compose -f .devcontainer/docker-compose.yml --env-file=.devcontainer/.env build
Docker container creation
Create a container and start a shell session in it with the commands:
docker compose -f .devcontainer/docker-compose.yml --env-file=.devcontainer/.env up -d
docker exec -it multimno_dev_container bash
Stopping the dev environment
VsCode
Closing VsCode will automatically stop the devcontainer.
Manual
Exit the terminal with:
Ctrl+D or writing exit
Deleting the dev environment
Delete the container created with:
docker compose -f .devcontainer/docker-compose.yml --env-file=.devcontainer/.env down
๐ Execution
Launching a single component
In a terminal execute the command:
spark-submit multimno/main.py <component_id> <path_to_general_config> <path_to_component_config>
Example:
spark-submit multimno/main.py SyntheticEvents pipe_configs/configurations/general_config.ini pipe_configs/configurations/synthetic_events/synth_config.ini
Launching a pipeline
In a terminal execute the command:
python multimno/orchestrator.py <pipeline_json_path>
Example:
python multimno/orchestrator.py pipe_configs/pipelines/pipeline.json
๐ Monitoring/Debug
Launching a spark history server
The history server will access SparkUI logs stored at the path ${SPARK_LOGS_DIR} defined in the .devcontainer/.env
file.
Starting the history server
start-history-server.sh
๐ชถ Style
Coding style
The code shall follow the standard PEP 8 which is the coding style proposed for writing clean, readable, and maintainable Python code. It was created to promote consistency in Python code and make it easier for developers to collaborate on projects.
Docstring style
The docstrings written in the code shall follow the Google Docstrings style. Adhering to a unique docstring style guarantees consistency within software development in a project. Google Docstrings are the most popular convention for docstrings which facilitates readability and collaboration in open-source projects.
Google Docstrings official guide
๐งผ Code cleaning
Developed code shall be formatted and jupyter notebooks shall be cleaned of outputs to guarantee consistency and reduce unnecessary differences between commits.
Code Linting
The python code generated shall be formatted with black. For formatting all source code execute the following command:
black -l 120 multimno tests/test_code/
Clean jupyter notebooks
find ./notebooks/ -type f -name \*.ipynb | xargs jupyter nbconvert --clear-output --inplace
Code Security Scan
bandit -r multimno
๐งช Testing
Launch Tests
Manual
pytest tests/test_code/multimno
VsCode
1) Open the test view at the left panel. 2) Launch tests.
Generate test & coverage reports
Launch the command
pytest --cov-config=tests/.coveragerc \
--cov-report="html:docs/autodoc/coverage" \
--cov=multimno --html=docs/autodoc/test_report.md \
--self-contained-html tests/test_code/multimno
Test reports will be stored in the dir: docs/autodoc
See coverage in IDE (VsCode extension)
1) Launch tests with coverage to generate the coverage report (xml)
pytest --cov-report="xml" --cov=multimno tests/test_code/multimno
Note: You can see the coverage percentage at the bottom bar
๐ Code Documentation
Documentation server Debug(Mkdocs)
A code documentation can be deployed using mkdocs backend.
1) Create documentation (This will launch all tests)
./resources/scripts/generate_docs.sh
mkdocs serve
Documentation deploy (mike)
Set latest
as default version
mike set-default --push latest
Deploy a version of the documentation with:
mike deploy --push --update-aliases <version> latest
Example:
mike deploy --push --update-aliases 0.2 latest