Typechecking Python for fun (and profit?)
May 30, 2020
I'm assuming you agree (or will consider) that adding some type-checking to your Python code can help you find bugs or otherwise improve your software. You've definitely heard of mypy, and possibly one or more of pytype, pyre, and pyright.
That's a lot of options! What should you use?
tl;dr conclusions
- Use pytype, in your testing / continuous integration step (you do have one, I hope?)
  - Unless you're using Python 3.8, in which case you can't yet, and you should use pyright instead
  - Update, from the day after I wrote this: partial support is now here for Python 3.8 in pytype
- If you want more constant assurance, use pyright in a commit hook.
- If your editor has good support (e.g. PyCharm) that might suffice for you, but it's nice to have tools that work for all your collaborators who might not use the same editor
preamble & motivation
This is not a discussion of the whys and wherefores of type-checking Python code. Nor am I going to write a basic tutorial for getting started with it, or a detailed guide to its ins and outs (there's more than one). I'm also not going to be telling the world about how my large team of engineers implemented it in our million-line codebase for fun and profit.
I'm going to compare these four tools, for my use case, as of May 2020, in hopes that this will help someone else dip their toes in the water for their project. In that sense, this is more along the lines of a follow-up (or perhaps even a response) to this "field test" post from a few months ago -- my opinions have been shaped by using and re-evaluating these tools since early 2019 for production Python web and data science applications.
With that out of the way... here we go!
setup
So that you have a sense of what I'm working with (and how that may compare to what you're using):
A fresh Python 3.7.7 Conda environment on x86_64 GNU/Linux
A fresh checkout of dask/zict
- it's a small but not trivial project, with test subdirectories that need to be excluded, optional dependencies, a little bit of clever code, and other real-world things
- about 650 lines of code according to tokei
- I'd hoped to be able to find a nice example of a type bug that was lurking undiscovered here, but I haven't found one, so this will have to serve as a sample "clean" codebase
- (I've run pip install -e . in the checkout, to make sure my Python environment is set up)
A sample script with a number of errors I'd expect a typechecker to find:
- In obvious_annotated_error, we expect an int argument that's then concatenated to a string
- In unannotated_error, we call _innocuous_helper with an int when it's expecting a string
- In less_obvious_error, we assume that the result from _ambiguous_returning_helper will be a string, but if the words we've passed in don't contain "hi", it'll be None
- Conversely, there's also the AllKindsOfDynamic class which I hope the typechecker will leave undisturbed
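The script itself isn't reproduced here, so the following is only a minimal sketch of what a file matching those descriptions could look like. The function and class names come from the list above; the bodies, argument values, and the _collect method are assumptions for illustration, and the line numbers will not match the tool output quoted later.

```python
def obvious_annotated_error(count: int) -> str:
    # Bug: concatenating an int to a str raises TypeError at runtime.
    return "Hello, " + count


def _innocuous_helper(sentence):
    # Unannotated, but clearly written with a str in mind.
    return sentence.split()


def unannotated_error():
    # Bug: an int has no .split() method.
    return _innocuous_helper(42)


def _ambiguous_returning_helper(words):
    # Returns a str if "hi" is among the words, otherwise None.
    for word in words:
        if word == "hi":
            return word
    return None


def less_obvious_error(words):
    # Bug: .upper() fails whenever the helper returned None.
    return _ambiguous_returning_helper(words).upper()


class AllKindsOfDynamic:
    """Dynamic but sound: callbacks starts as None and is always replaced."""

    def __init__(self, **kwargs):
        self.callbacks = None
        self._collect(kwargs)

    def _collect(self, kwargs):
        # Always replaces the None placeholder with a list.
        self.callbacks = [value for value in kwargs.values() if callable(value)]

    def run(self):
        # Safe at runtime: self.callbacks is a list by the time we iterate.
        return [callback() for callback in self.callbacks]
```

Each of the three buggy functions fails at runtime when called, while AllKindsOfDynamic runs cleanly even though a checker has to follow the assignment in _collect to see that.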
so, what are my options?
mypy
mypy is probably the first project that comes to mind when you think of typing and Python -- with good reason, since the development of mypy helped to drive a lot of the discussions and PEPs around typing in Python. It's actively developed, and there are lots of conference talks about it. It also benefits from the halo effect of being associated with Python's creator.
Mypy is sponsored by Dropbox.
configuration
(I'm using mypy==0.770)
No configuration seems to be necessary -- running mypy is as simple as pip install mypy followed by mypy zict/ or mypy sample.py. A setup.cfg or mypy.ini can be added, but I couldn't find a way to ignore the tests/ subdirectory within zict/. So you might end up running it over your tests, too. You'll probably also want --ignore-missing-imports to get started with, since otherwise mypy will complain about not having type information for all the libraries you use.
speed
At well under a second, this is fast enough for me on a small project, and there is a long-running daemon available for larger projects.
$ time mypy -p zict --ignore-missing-imports
Success: no issues found in 18 source files
real 0m0.818s
user 0m0.754s
sys 0m0.063s
accuracy
For me, mypy falls down on accuracy and useful errors -- there's no happy medium. Without any options, mypy only finds 1 out of 3 expected errors in sample.py:
$ mypy -m sample
sample.py:6: error: Unsupported operand types for + ("str" and "int")
Found 1 error in 1 file (checked 1 source file)
With the --strict option, it demands that type annotations be added, but doesn't catch any more errors:
$ mypy -m sample --strict
sample.py:4: error: Function is missing a return type annotation
sample.py:6: error: Unsupported operand types for + ("str" and "int")
sample.py:9: error: Function is missing a type annotation
sample.py:14: error: Function is missing a type annotation
sample.py:18: error: Call to untyped function "_innocuous_helper" in typed context
sample.py:21: error: Function is missing a return type annotation
sample.py:33: error: Function is missing a type annotation
sample.py:49: error: Function is missing a type annotation
sample.py:54: error: Function is missing a type annotation
sample.py:61: error: Call to untyped function "AllKindsOfDynamic" in typed context
Found 10 errors in 1 file (checked 1 source file)
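One likely reason for the gap: by default mypy does not check the bodies of functions that have no annotations, and it treats their parameters as dynamically typed. Here is a minimal sketch of how annotations would expose the other two errors to an annotation-driven checker. It reuses the helper names from the sample script, but the signatures and the less_obvious_fixed function are assumptions.

```python
from typing import List, Optional


def _innocuous_helper(sentence: str) -> List[str]:
    return sentence.split()


def unannotated_error() -> List[str]:
    # With both signatures annotated, a checker can compare int against str here.
    # The bug is left in on purpose; it still fails at runtime.
    return _innocuous_helper(42)


def _ambiguous_returning_helper(words: List[str]) -> Optional[str]:
    # The Optional return type makes the None case visible to callers.
    return "hi" if "hi" in words else None


def less_obvious_fixed(words: List[str]) -> str:
    result = _ambiguous_returning_helper(words)
    # Narrow Optional[str] to str before calling a str method.
    if result is None:
        return ""
    return result.upper()
```

The trade-off is that you write the annotations yourself, whereas an inference-based tool works from the unannotated code.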
pytype
There's been less noise made about pytype, but I've seen some mentions at conferences and the occasional tutorial.
Pytype is sponsored by Google.
configuration
(I'm using pytype==2020.5.13)
pytype will work if you just point it at your source directory, but in order to get it to ignore your test files you need a configuration file -- this is pytype.cfg by default (and there's a handy --generate-config option to create one) but it'll read setup.cfg too. If you use a configuration file, though, you have to configure everything within it -- including where to look for code to typecheck.
If you don't mind running it over your test files, I recommend the --keep-going command-line option so it reports all errors rather than stopping at the first one.
My trimmed-down pytype.cfg:
[pytype]
# Space-separated files / directories to exclude.
exclude =
    **/versioneer.py
    **/tests/**
    **/test_*.py
# Space-separated files / directories to process.
inputs =
    zict/
# Keep going past errors, analyze as many files as possible.
keep_going = True
speed
The biggest issue I have with pytype is that it's slow. Even on this small project it's slow enough that I would be irked by running it afresh every commit:
$ time pytype --config pytype.cfg
Computing dependencies
Analyzing 11 sources with 0 local dependencies
ninja: Entering directory `/home/<...>/.pytype'
[11/11] check conf
Success: no errors found
real 0m5.651s
user 0m13.495s
sys 0m0.339s
It does have nice incremental checks based on ninja, so subsequent runs are certainly fast enough:
$ time pytype --config pytype.cfg
Computing dependencies
Analyzing 11 sources with 0 local dependencies
ninja: Entering directory `/home/<...>/.pytype'
ninja: no work to do.
Success: no errors found
real 0m0.556s
user 0m0.480s
sys 0m0.076s
However, I've found that occasionally pytype generates flaky results and clearing out the .pytype/ cache directory and re-running it fixes things. So I can't entirely shake a mistrust of the incremental builds, and the slowness irks me all the more.
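If you want to script that cache reset, for example ahead of a CI run, a minimal sketch could look like this. The function name is hypothetical, and it assumes the .pytype/ directory sits in the project root, as it does in the output above.

```python
import shutil
from pathlib import Path


def clear_pytype_cache(project_root: str = ".") -> bool:
    """Delete the .pytype/ cache directory; return True if one was removed."""
    cache_dir = Path(project_root) / ".pytype"
    if cache_dir.is_dir():
        shutil.rmtree(cache_dir)
        return True
    return False
```

Calling clear_pytype_cache() before pytype forces a full, slow run, which trades time for not having to wonder about stale incremental state.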
accuracy
This is where pytype shines for me -- it caught all three of the real errors in sample.py, with partial tracebacks pointing out the error, and didn't touch the perfectly sound AllKindsOfDynamic class:
$ pytype sample.py
Computing dependencies
Analyzing 1 sources with 0 local dependencies
ninja: Entering directory `/home/<...>/.pytype'
[1/1] check sample
FAILED: /home/<...>/.pytype/pyi/sample.pyi
/home/<...>/bin/python -m pytype.single --imports_info \
/home/<...>/.pytype/imports/sample.imports --module-name sample -V 3.7 -o /home/<...>/.pytype/pyi/sample.pyi --analyze-annotated --nofail --quick /home/<...>/sample.py
File "/home/<...>/sample.py", line 6, in obvious_annotated_error: unsupported operand type(s) for +: 'str' and 'int' [unsupported-operands]
Function __add__ on str expects str
File "/home/<...>/sample.py", line 11, in _innocuous_helper: No attribute 'split' on int [attribute-error]
Called from (traceback):
line 18, in unannotated_error
File "/home/<...>/sample.py", line 39, in less_obvious_error: No attribute 'upper' on None [attribute-error]
In Optional[str]
For more details, see https://google.github.io/pytype/errors.html.
ninja: build stopped: subcommand failed.
extras
The merge-pyi script that comes with pytype is interesting -- it can take the inferred .pyi type stub file generated by pytype and merge the annotations back into your code! In my experience, pyre's infer subcommand (see below) does this just as well, and saves the step of hunting down the generated .pyi file, but it's very impressive nonetheless.
pyre
Not to be outdone by other companies, Facebook sponsors pyre, which has even fewer conference talks about it (just the one hit that I could find) and has the temerity not even to be written in Python! Performance is the claim all over its website.
configuration
(I'm using pyre-check==0.0.46)
Unlike mypy or pytype, pyre won't run without any kind of configuration -- it complains until pyre init is run to generate a JSON-formatted .pyre_configuration file. It took me a little fiddling with the "source_directories" setting to get pyre to run without throwing up a lot of spurious "Undefined import" errors -- and it looks like the only way to silence them is to add #pyre-ignore-all-errors[21] to all the files affected.
speed
For an initial run, this is barely fast enough for me to want to run for each commit -- it's more than twice as fast as pytype, but much slower than mypy.
$ time pyre check
ƛ No type errors found
real 0m1.720s
user 0m0.480s
sys 0m0.099s
It's worth noting there's an incremental command that spins up a server in the background -- this will work with LSP for VS Code & Nuclide to incrementally run additional checks as code changes. I don't use either VS Code or Nuclide and wasn't able to get it to work, unfortunately.
accuracy
Unfortunately, pyre only found 1 of 3 expected errors in sample.py:
$ pyre check
ƛ Found 1 type error!
foo/sample.py:6:11 Incompatible parameter type [6]: Expected `int` for 1st positional only parameter to call `int.__radd__` but got `str`.
extras
I like the variety and personality of subcommands available -- they had me at rage for verbose debugging. But I'm also very impressed by the infer subcommand that can, with the right flags, add annotations to your code in-place! Unlike the merge-pyi helper supplied by pytype, this doesn't require you to generate and specify a separate .pyi file -- which I for one think is very handy.
pyright
Another entry that isn't even written in Python is pyright, sponsored by Microsoft, and probably your best option if you're a VS Code user because it's readily available as a VS Code extension.
Since it's written in TypeScript, you'll need to install it through npm: npm install pyright@1.1.38
configuration
pyright is similar to pytype in that it works fairly seamlessly without a configuration file, but if you want more tuning (e.g. excluding test files) you need to add a pyrightconfig.json file. I used this configuration:
{
  "include": [
    "zict"
  ],
  "exclude": [
    "**/node_modules",
    "**/__pycache__",
    "**/tests/**"
  ],
  "reportMissingImports": false
}
I should add that pyrightconfig.json exposes a lot of options for warnings to tune, and I found the documentation quite comprehensive and helpful.
A warning! Folks at a previous company found that by default pyright also wanted to check their entire virtual environment, if it happened to be next to their code. I suggest you use a pyrightconfig.json and be sure to explicitly exclude your virtual environment / pyenv directories.
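As a rough sketch of that advice, the snippet below builds on my configuration above and adds exclude patterns for environment directories. The directory names (.venv, venv, env) and the build_pyright_config helper are assumptions; substitute whatever your environments are actually called.

```python
import json
from typing import Dict, List


def build_pyright_config(source_dir: str, env_dirs: List[str]) -> Dict[str, object]:
    """Return a pyrightconfig.json-style dict that skips environment directories."""
    exclude = ["**/node_modules", "**/__pycache__", "**/tests/**"]
    # One glob per environment directory name (names are assumptions).
    exclude.extend(f"**/{name}" for name in env_dirs)
    return {
        "include": [source_dir],
        "exclude": exclude,
        "reportMissingImports": False,
    }


config_text = json.dumps(build_pyright_config("zict", [".venv", "venv", "env"]), indent=2)
print(config_text)
```

Writing config_text to pyrightconfig.json in the project root gives the same settings as before, plus the extra excludes.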
speed
No complaints as far as speed goes -- it's barely over a second for a fresh run, and there's a --watch option for a long-running process:
$ pyright
Loading configuration file at /home/<...>/pyrightconfig.json
stubPath /home/<...>/typings is not a valid directory.
Searching for source files
Found 11 source files
/home/<...>/zict/doc/source/conf.py
69:16 - error: "__version__" is not a known member of module (reportGeneralTypeIssues)
/home/<...>/zict/zict/buffer.py
61:19 - error: "Unknown" is not iterable
70:19 - error: "Unknown" is not iterable
/home/<...>/zict/zict/lru.py
66:23 - error: "Unknown" is not iterable
88:19 - error: "Unknown" is not iterable
5 errors, 0 warnings
Completed in 1.027sec
accuracy
Here's where we find my biggest gripe with pyright -- it's limited in working with more dynamic Python. As with the spurious errors when checking zict, it's also complaining about self.callbacks in our sample script -- even though careful examination of the code will show self.callbacks should always be a list.
It also only found one of the three expected errors in our file:
$ pyright sample.py
stubPath /home/<...>/typings is not a valid directory.
Searching for source files
Found 1 source file
/home/<...>/sample.py
6:12 - error: Operator "+" not supported for types "Literal['Hello, ']" and "int" (reportGeneralTypeIssues)
56:19 - error: "Unknown" is not iterable
2 errors, 0 warnings
Completed in 0.619sec
conclusions
So, to sum things up, and expand on the reasoning for my tl;dr section above:
- mypy is fast and the reference implementation, but doesn't catch all the errors we'd want
- pytype is the most accurate, but slowest, and doesn't support Python 3.8
- pyre is slower and no more accurate than mypy, but does have helpful tools like the infer subcommand
- pyright is almost as fast as mypy, and the most configurable, but reports unnecessary errors
To echo what I said above, then: I've found pytype really helpful in a continuous integration step, where I'm expecting my tests to take a few seconds to run anyway, so another 10-15 seconds are much less painful. (One work project regularly took over a minute.) For constant checking, because of its configurability, I like pyright.
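For the commit-hook setup, a minimal sketch of a Python hook follows. It assumes pyright is on your PATH and exits with a non-zero status when it reports errors; the run_checker and main names are hypothetical.

```python
#!/usr/bin/env python3
import subprocess
import sys
from typing import List


def run_checker(command: List[str]) -> int:
    """Run a type checker command and return its exit code."""
    try:
        return subprocess.run(command).returncode
    except FileNotFoundError:
        # Treat a missing tool as a failure so the problem is noticed.
        print(f"{command[0]} not found; is it installed?", file=sys.stderr)
        return 1


def main() -> int:
    # Assumes pyright picks up pyrightconfig.json from the repository root.
    return run_checker(["pyright"])
```

To use it, save the file as .git/hooks/pre-commit, make it executable, and end it with sys.exit(main()); git aborts the commit when a hook exits with a non-zero status.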
I should add that mypy continues to improve rapidly. And while pyre doesn't serve my day-to-day needs as well, I've used the infer subcommand effectively before and liked it.
Any additional typechecking you add to your project will, in my opinion, probably help -- but these are the tools I recommend.