Before performing a commit to our local repository, there are a lot of items on our mental todo list, such as styling, formatting and testing. It's very easy to forget some of these steps, especially when we just want to "push a quick fix". To help us manage all these important steps, we can use pre-commit hooks, which are triggered automatically when we try to perform a commit. These hooks can ensure that certain rules are followed or that specific actions execute successfully; if any of them fail, the commit is aborted.
We'll be using the Pre-commit framework to help us automatically perform important checks via hooks when we make a commit.
We'll start by installing and autoupdating pre-commit (we only have to do this once).
We define our pre-commit hooks via a .pre-commit-config.yaml configuration file. We can either create our YAML configuration from scratch or use the pre-commit CLI (pre-commit sample-config > .pre-commit-config.yaml) to create a sample configuration that we can add to.
When it comes to creating and using hooks, we have several options to choose from.
Inside the sample configuration, we can see that pre-commit has added some default hooks from its repository. It specifies the location of the repository, the version, and the specific hook ids to use. We can read about the function of these hooks and add even more by exploring pre-commit's built-in hooks. Many of them also accept additional arguments that we can configure to customize the hook.
# Inside .pre-commit-config.yaml
...
-   id: check-added-large-files
    args: ['--maxkb=1000']
    exclude: "notebooks"
...
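To make the args above concrete, here is a rough Python sketch of the kind of size check that --maxkb=1000 asks for. This is our own illustration and not the hook's actual source: the function name find_large_files and the throwaway files are hypothetical, and the real check-added-large-files hook by default only looks at files that are being newly added to the repository.

```python
import math
import os
import tempfile
from typing import Iterable, List


def find_large_files(filenames: Iterable[str], maxkb: int) -> List[str]:
    """Return the files whose size, rounded up to whole KB, exceeds maxkb."""
    large = []
    for filename in filenames:
        size_kb = math.ceil(os.path.getsize(filename) / 1024)
        if size_kb > maxkb:
            large.append(filename)
    return large


# Demo with two throwaway files: one of 10 bytes and one of 2 MB.
with tempfile.TemporaryDirectory() as tmp:
    small = os.path.join(tmp, "small.txt")
    big = os.path.join(tmp, "big.bin")
    with open(small, "wb") as f:
        f.write(b"x" * 10)
    with open(big, "wb") as f:
        f.write(b"x" * (2 * 1024 * 1024))
    flagged = find_large_files([small, big], maxkb=1000)
    print([os.path.basename(path) for path in flagged])  # ['big.bin']
```

A hook built on a check like this would fail (and abort the commit) whenever the returned list is non-empty.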
Be sure to explore the many other built-in hooks because there are some really useful ones that we use in our project. For example, check-merge-conflict checks for any lingering merge conflict strings and detect-aws-credentials catches credentials that we accidentally left exposed in a file.
We can also exclude certain files from being processed by a hook by using the optional exclude key, which takes a regular expression that is matched against file paths. There are many other optional keys we can configure for each hook ID.
# Inside .pre-commit-config.yaml
...
-   id: check-yaml
    exclude: "mkdocs.yml"
...
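The exclude values above are patterns rather than literal file names. pre-commit documents its files and exclude keys as Python regular expressions matched with re.search, so the filtering can be sketched roughly as follows. The helper filter_paths and the example paths are hypothetical, and the sketch ignores the additional file-type filtering that hooks such as check-yaml apply.

```python
import re
from typing import Iterable, List


def filter_paths(paths: Iterable[str], files: str = "", exclude: str = "^$") -> List[str]:
    """Keep paths that match `files` and do not match `exclude` (both via re.search)."""
    files_re = re.compile(files)
    exclude_re = re.compile(exclude)
    return [
        path
        for path in paths
        if files_re.search(path) and not exclude_re.search(path)
    ]


# Hypothetical repository contents, for illustration only.
paths = ["mkdocs.yml", "config/args.yml", "notebooks/explore.ipynb", "src/main.py"]

# exclude: "notebooks" drops every path that contains "notebooks".
print(filter_paths(paths, exclude="notebooks"))
# ['mkdocs.yml', 'config/args.yml', 'src/main.py']

# Restricting to YAML files by name and excluding mkdocs.yml.
print(filter_paths(paths, files=r"\.yml$", exclude="mkdocs.yml"))
# ['config/args.yml']
```

Because an unescaped dot matches any character and re.search matches anywhere in the path, exclude: "mkdocs.yml" would also skip a file such as docs/mkdocs.yml. Anchoring and escaping the pattern (for example ^mkdocs\.yml$) makes it exact.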
Besides pre-commit's built-in hooks, there are also many popular third-party hooks that we can choose from. For example, if we want to apply formatting checks with Black as a hook, we can leverage Black's pre-commit hook.
# Inside .pre-commit-config.yaml
...
-   repo: https://github.com/psf/black
    rev: 20.8b1
    hooks:
    -   id: black
        args: []
        files: .
...
This specific hook is defined under a .pre-commit-hooks.yaml inside Black's repository, as are other custom hooks under their respective package repositories.
We can also create our own local hooks without configuring a separate .pre-commit-hooks.yaml. Here we're defining a local hook, clean, that runs a command we've defined in our Makefile (a second hook, test-non-training, can be defined in the same way). Similarly, we can run any entry command with arguments to create hooks very quickly.
# Inside .pre-commit-config.yaml
...
-   repo: local
    hooks:
    -   id: clean
        name: clean
        entry: make
        args: ["clean"]
        language: system
        pass_filenames: false
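The clean hook above delegates to make and sets pass_filenames: false, so it never sees the files being committed. A local hook can also point its entry at a small script of our own. pre-commit passes the matching staged file names as positional arguments, and a non-zero exit code marks the hook as failed. The sketch below is a hypothetical example of such a script: the marker string and the check itself are assumptions chosen purely for illustration.

```python
import sys
from pathlib import Path
from typing import List, Optional

MARKER = "DO NOT COMMIT"  # hypothetical marker, chosen only for illustration


def main(argv: Optional[List[str]] = None) -> int:
    """Return 1 if any of the given files contains MARKER, otherwise 0."""
    filenames = sys.argv[1:] if argv is None else argv
    failed = False
    for filename in filenames:
        path = Path(filename)
        if not path.is_file():
            continue  # skip anything that is not a regular file
        text = path.read_text(encoding="utf-8", errors="ignore")
        if MARKER in text:
            print(f"{filename}: contains '{MARKER}'")
            failed = True
    return 1 if failed else 0


if __name__ == "__main__":
    exit_code = main()
    if exit_code:
        sys.exit(exit_code)  # a non-zero exit code makes the hook fail
```

To wire a script like this up, we would reference it from a repo: local entry much like the clean hook, but leave pass_filenames at its default (true) so that the script receives the file names.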
Our pre-commit hooks will automatically execute when we try to make a commit. We'll be able to see if each hook passed or failed and make any changes. If any of the hooks fail, we have to fix the errors ourselves or, in many instances, reformatting will occur automatically.
check yaml..............................................PASSED
clean...................................................FAILED
In the event that any of the hooks fail, we need to fix the problem (or accept the automatic changes), add the files and commit again until all hooks pass.
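Though pre-commit hooks are meant to run before (pre) a commit, we can also trigger them manually: all hooks or an individual hook, on all files or on a set of files (for example, pre-commit run --all-files, pre-commit run <HOOK_ID> --all-files or pre-commit run <HOOK_ID> --files <PATH>).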
Skipping the pre-commit hooks is strongly discouraged because they are there for a reason. But for some highly urgent, world-saving commits, we can use git's --no-verify flag (git commit -m "<MESSAGE>" --no-verify).
We highly recommend not doing this, because no commit deserves to bypass these checks no matter how "small" the change was. If we accidentally did this and want to clear the cache, we can run pre-commit run --all-files and execute the commit message operation again.
In our .pre-commit-config.yaml configuration file, we've had to specify a version (rev) for each of the repositories so we can use their hooks. Pre-commit has an autoupdate CLI command (pre-commit autoupdate) which will update these versions as newer ones become available.
To cite this content, please use:
@article{madewithml,
    author       = {Goku Mohandas},
    title        = { Pre-commit - Made With ML },
    howpublished = {\url{https://madewithml.com/}},
    year         = {2023}
}