---
title: LexImpact Prepare Data
keywords: fastai
sidebar: home_sidebar
nb_path: "notebooks/contributing.ipynb"
---

Prerequisites

On Ubuntu, install python3.8-venv before installing Poetry:

sudo apt-get install python3.8-venv

Poetry

curl -sSL https://install.python-poetry.org | python3 -

Add the following line to your .bashrc (is this still necessary?):

export PATH="$HOME/.local/bin:$PATH"
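
Whether the export is still needed depends on whether ~/.local/bin is already on the PATH. A minimal sketch of such a check, assuming a POSIX shell; the helper name path_contains is hypothetical and not part of the project:

```shell
# Succeeds if directory $1 is one of the entries of PATH
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if path_contains "$HOME/.local/bin"; then
    echo "$HOME/.local/bin is already on PATH; the export is not needed"
else
    echo "$HOME/.local/bin is missing from PATH; keep the export in .bashrc"
fi
```

Wrapping PATH in colons lets the pattern match the first and last entries as well as the middle ones.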

Debug Poetry

To remove an environment (see https://python-poetry.org/docs/managing-environments/):

poetry env list
poetry env remove 3.7

To display the dependency tree:

poetry show --tree

Activate Python 3.8

poetry env use 3.8

Installing dependencies

poetry install

Developing the pipeline requires additional packages:

poetry install --extras "pipeline"

Specify the Python version for Poetry

poetry env use /usr/bin/python3.8
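
The interpreter is not always installed at /usr/bin/python3.8. A possible sketch that resolves the path first and only then calls Poetry; the helper names find_interpreter and use_python38 are hypothetical:

```shell
# Print the full path of command $1, or fail if it is not installed
find_interpreter() {
    command -v "$1"
}

# Point Poetry at python3.8 only if the interpreter can be found
use_python38() {
    if py=$(find_interpreter python3.8); then
        poetry env use "$py"
    else
        echo "python3.8 not found; install it first" >&2
        return 1
    fi
}

# Usage: use_python38
```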

How to develop

{% raw %}
!ln -s ../leximpact_prepare_data
!cd analyses && ln -s ../../leximpact_prepare_data
!cd extractions_base_des_impots && ln -s ../../leximpact_prepare_data
!cd retraitement_erfs-fpr && ln -s ../../leximpact_prepare_data
{% endraw %}
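
The ln -s calls above fail if the link already exists. A minimal sketch of a variant that can be re-run, assuming it is started from the notebooks directory; the helper name link_package is hypothetical:

```shell
# Create (or refresh) the package symlink inside a notebook directory.
# $1: directory that should contain the link
# $2: relative path from that directory to the package
link_package() {
    ( cd "$1" && ln -sfn "$2" leximpact_prepare_data )
}

# Usage, mirroring the cells above:
# link_package . ../leximpact_prepare_data
# for d in analyses extractions_base_des_impots retraitement_erfs-fpr; do
#     link_package "$d" ../../leximpact_prepare_data
# done
```

The -f option replaces an existing link, and -n prevents ln from following an existing link and creating a nested one inside the package directory.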

Update packages to their latest versions

poetry update

Jupyter

Run this the first time, and again after adding a library:

{% raw %}
!~/.local/bin/poetry run python -m ipykernel install --name leximpact-prepare-data-kernel --user
{% endraw %}
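
To confirm that the kernel was registered, the output of jupyter kernelspec list can be inspected. A minimal sketch; the helper name kernel_registered is hypothetical, and the match assumes the usual listing format of an indented kernel name followed by its path:

```shell
# Reads `jupyter kernelspec list` output on stdin;
# succeeds if a kernel named exactly $1 appears in the listing
kernel_registered() {
    grep -q "^[[:space:]]*$1[[:space:]]"
}

# Usage:
# poetry run jupyter kernelspec list | kernel_registered leximpact-prepare-data-kernel \
#     && echo "kernel is registered"
```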

Launch Jupyter

poetry run jupyter lab

Jupyter Plotly

To display Plotly graphs:

poetry run jupyter labextension install jupyterlab-plotly

We do not use it (yet?).

Check style

poetry run pre-commit run --all-files

NBDev

# Run pre-commit before converting notebooks
poetry run pre-commit run --all-files
# Build lib from notebook
poetry run nbdev_build_lib
# Build docs from notebook
poetry run nbdev_build_docs
# Re-run pre-commit
poetry run pre-commit run --all-files
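
The four steps above only make sense in order, so it can help to stop at the first one that fails. A minimal sketch of a wrapper, assuming a POSIX shell; the function name run_in_order is hypothetical:

```shell
# Run each command string in order; stop and report at the first failure
run_in_order() {
    for step in "$@"; do
        echo ">> $step"
        sh -c "$step" || { echo "step failed: $step" >&2; return 1; }
    done
}

# Usage with the steps listed above:
# run_in_order \
#     "poetry run pre-commit run --all-files" \
#     "poetry run nbdev_build_lib" \
#     "poetry run nbdev_build_docs" \
#     "poetry run pre-commit run --all-files"
```

Note that pre-commit exits with a non-zero status when a hook modifies files, so the first step may need to be run a second time before the sequence goes through.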
{% raw %}
!make precommit
{% endraw %}
{% raw %}
#!poetry run nbdev_build_docs
!cd .. && make docs
{% endraw %}

Secure link to the ERFS-FPR

sudo mkdir -p /mnt/data-in /mnt/data-out
sudo chown $USER:$USER /mnt/data-*
sshfs dc5:/rpool/private-data/input /mnt/data-in
sshfs dc5:/rpool/private-data/output /mnt/data-out
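
Running sshfs on a directory that is already mounted fails. A minimal sketch that checks first, assuming the mountpoint utility from util-linux is available (as on Ubuntu); the helper names is_mounted and mount_once are hypothetical:

```shell
# Succeeds if $1 is currently a mount point
is_mounted() {
    mountpoint -q "$1"
}

# Mount a remote directory with sshfs unless something is already mounted there
# $1: remote location (host:path), $2: local mount point
mount_once() {
    if is_mounted "$2"; then
        echo "$2 is already mounted"
    else
        sshfs "$1" "$2"
    fi
}

# Usage:
# mount_once dc5:/rpool/private-data/input /mnt/data-in
# mount_once dc5:/rpool/private-data/output /mnt/data-out
# To unmount later: fusermount -u /mnt/data-in
```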

How we build the docs

The documentation is available at https://documentation.leximpact.dev/leximpact_prepare_data/

It is built with NBDev in the GitLab CI.

Due to dependency conflicts, we build it as follows:

  • Use the Poetry environment as the default environment.
  • Use a separate venv to strip notebook outputs, because --clear-output does not work with nbconvert < 6, which nbdev requires. We strip outputs to avoid publishing sensitive data. We still need to find a better way to publish outputs without sensitive data.
  • Use Docker for nbdev_build_docs, because it does not work in our environment for an unknown reason.
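
Since the outputs are stripped to keep sensitive data out of the published docs, a quick check before publishing may be useful. A rough sketch that lists notebooks still containing cell outputs; the helper name notebooks_with_outputs is hypothetical, and the grep relies on the "output_type" key that the notebook JSON format stores for every cell output:

```shell
# Print the notebooks in directory $1 that still contain cell outputs.
# Intended to be run after clearing, e.g. with nbconvert >= 6:
#   jupyter nbconvert --clear-output --inplace notebooks/*.ipynb
notebooks_with_outputs() {
    grep -l '"output_type"' "$1"/*.ipynb
}

# Usage: an empty result means that no outputs remain
# notebooks_with_outputs notebooks
```

This is only a text search, not a JSON parse, so it should be treated as a safety net rather than a guarantee.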

Then we copy the docs to our server via scp and build the final static docs there with Jekyll.

NBDev builds the docs with Jekyll because GitHub supports it for free hosting.

Testing the docs locally

To convert the notebooks to Jekyll:

docker run -v $PWD:/project -w /project -v /media/data-in:/mnt/data-in -v /media/data-out:/mnt/data-out fastai/jekyll sh deploy/build_docs.sh

To render the Jekyll site:

make docs_serve

Then open http://127.0.0.1:4000/leximpact_prepare_data/.