Let's consider the following two syntax variations:
class Foo:
    x: int

    def __init__(self, an_int: int):
        self.x = an_int
And
class Foo:
    def __init__(self, an_int: int):
        self.x = an_int
Apparently the following code raises a mypy error in both cases (which is expected):
obj = Foo(3)
obj.x.title() # this is a str operation
But I really want to enforce the contract: I want to make it clear that x is an instance variable of every Foo object. So which syntax should be preferred, and why?
This is ultimately a matter of personal preference. To use the example from the other answer, doing this:
from typing import Union

class Foo:
    x: Union[int, str]

    def __init__(self, an_int: int) -> None:
        self.x = an_int
...and doing:
class Foo:
    def __init__(self, an_int: int) -> None:
        self.x: Union[int, str] = an_int
...will be treated in the exact same way by type checkers.
The main advantage of doing the former is that it makes the types of your attributes more obvious in the cases where your constructor is complex to the point where it's difficult to trace what type inference is being performed.
This style is also consistent with how you declare and use things like dataclasses:
from dataclasses import dataclass
from typing import Union

@dataclass
class Foo:
    x: int
    y: Union[int, str]
    z: str

# You get an `__init__` for free. Mypy will check to make sure the types match.
# So this type checks:
a = Foo(1, "b", "c")

# ...but this doesn't:
b = Foo("bad", 3.14, 0)
This isn't really a pro or a con, just more of an observation that the standard library has, in some specific cases, embraced the former style.
The main disadvantage is that this style is somewhat verbose: you're forced to repeat the variable name twice (three times, if you count the __init__ parameter), and often to repeat the type hint twice (once in the variable annotation and once in the __init__ signature).
It also opens up a possible correctness issue in your code: mypy will never actually check that you've assigned anything to your attribute! For example, the following code will happily type check even though it crashes at runtime:
class Foo:
    x: int

    def __init__(self, x: int) -> None:
        # Whoops, I forgot to do 'self.x = x'
        pass

f = Foo(1)

# Type checks, but crashes at runtime!
print(f.x)
The latter style dodges these issues: if you forget to assign an attribute, mypy will complain that it doesn't exist when you try using it later.
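To make the difference concrete, here is a minimal sketch that puts the two styles side by side. The class names `Declared` and `Inferred` are made up for illustration. Both classes forget the assignment and behave identically at runtime; the only difference is what mypy reports.

```python
class Declared:
    x: int  # annotation only; nothing is assigned here

    def __init__(self, x: int) -> None:
        pass  # forgot 'self.x = x'


class Inferred:
    def __init__(self, x: int) -> None:
        pass  # forgot 'self.x = x' here too


# At runtime, neither instance has the attribute.
declared_has_x = hasattr(Declared(1), "x")
inferred_has_x = hasattr(Inferred(1), "x")

# Under mypy, 'Declared(1).x' is accepted because of the class-level
# annotation, while 'Inferred(1).x' is rejected because mypy never saw
# any declaration of 'x' (an attr-defined error).
```

So the bug is the same in both classes, but only the second one gets flagged before you run the code.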
The other main advantage of the latter style is that you can also get away with not adding an explicit type hint a lot of the time, especially if you're just assigning a parameter directly to a field. The type checker will infer the exact same type in those cases.
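As a rough sketch of that inference, reusing the Foo from the question (the `reveal_type` line is left commented out because it is only meaningful when running mypy, not at runtime):

```python
class Foo:
    def __init__(self, an_int: int) -> None:
        # No annotation on the attribute itself: the type checker
        # infers 'int' from the annotated parameter.
        self.x = an_int


obj = Foo(3)
incremented = obj.x + 1  # fine: 'x' is inferred to be an int

# reveal_type(obj.x)  # under mypy this should report builtins.int
```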
So given these factors, my personal preference is to:
- Use dataclasses (and by proxy, the former style) if I just want a simple, record-like object with an automatically generated __init__.
- Use the latter style if I either feel dataclasses are overkill or need to write a custom __init__, to decrease both verbosity and the odds of running into the "forgot-to-assign-an-attribute" bug.
- Switch back to the former style if I have a sufficiently large and complex __init__ that's somewhat difficult to read. (Or better yet, just refactor my code so I can keep the __init__ simple!)
You may end up weighing these factors differently and come up with a different set of tradeoffs, of course.
One final tangent -- when you do:
class Foo:
    x: int
...you are not actually annotating a class variable. At this point, x has no value, so it doesn't actually exist as a variable.
The only thing you're creating is an annotation, which is just pure metadata and distinct from the variable itself.
But if you do:
class Foo:
    x: int = 3
...then you are creating both a class variable and an annotation. Somewhat confusingly, while you may be creating a class variable/attribute (as opposed to an instance variable/attribute), mypy and other type checkers will continue assuming that the type annotation is meant to annotate specifically an instance attribute.
This inconsistency usually doesn't matter in practice, especially if you follow the general best practice of avoiding mutable default values for anything. But this may cause some surprises if you're trying to do something fancy.
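You can check the runtime side of this yourself. The sketch below uses two made-up classes, `Bare` and `WithDefault`, to contrast an annotation on its own with an annotation that also assigns a value:

```python
class Bare:
    x: int  # annotation only: metadata, no attribute is created


class WithDefault:
    x: int = 3  # annotation plus an actual class attribute


bare_is_annotated = "x" in Bare.__annotations__
bare_has_attr = hasattr(Bare, "x")
default_has_attr = hasattr(WithDefault, "x")

obj = WithDefault()
# Type checkers allow this because they treat the annotation as describing
# an instance attribute. At runtime it creates an instance attribute that
# shadows the class attribute without changing it.
obj.x = 5
```

After this runs, `obj.x` is 5 while `WithDefault.x` is still 3.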
If you want mypy/other type checkers to understand your annotation is a class variable annotation, you need to use the ClassVar type:
# ClassVar has been available in 'typing' since Python 3.5.3
from typing import ClassVar

class Foo:
    x: ClassVar[int] = 3
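Here is a small sketch of what ClassVar buys you in practice. The `Counter` class is a hypothetical example, not something from the question:

```python
from typing import ClassVar


class Counter:
    count: ClassVar[int] = 0  # shared by every instance

    def __init__(self) -> None:
        # Update through the class, not through 'self'.
        Counter.count += 1


a = Counter()
b = Counter()
total = Counter.count  # reading through the class (or an instance) is fine

# a.count = 10  # mypy rejects this: a ClassVar cannot be assigned via an instance
```

Without ClassVar, that commented-out assignment would type check and silently create an instance attribute that shadows the shared one.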