
Effective Python

Python supports closures: functions that refer to variables from the scope in which they were defined.
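For example, a sorting helper can read a variable from the function that defined it. This is a minimal sketch; sort_priority and its arguments are illustrative:

```python
def sort_priority(values, group):
    def helper(x):
        # The closure reads `group` from the enclosing sort_priority scope.
        if x in group:
            return (0, x)
        return (1, x)
    values.sort(key=helper)

numbers = [8, 3, 1, 2, 5, 4, 7, 6]
sort_priority(numbers, {2, 3, 5, 7})
print(numbers)  # [2, 3, 5, 7, 1, 4, 6, 8]
```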

However, much like the anti-pattern of global variables, I’d caution against using nonlocal for anything beyond simple functions. The side effects of nonlocal can be hard to follow. It’s especially hard to understand in long functions where the nonlocal statements and assignments to associated variables are far apart.

Closure functions can refer to variables from any of the scopes in which they were defined.

By default, closures can’t affect enclosing scopes by assigning variables.

In Python 3, use the nonlocal statement to indicate when a closure can modify a variable in its enclosing scopes.

In Python 2, use a mutable value (like a single-item list) to work around the lack of the nonlocal statement.
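A minimal sketch of these three points; count_matches is a hypothetical helper:

```python
def count_matches(values, group):
    found = 0
    def helper(x):
        nonlocal found  # without this, `found += 1` would raise UnboundLocalError
        if x in group:
            found += 1
    for v in values:
        helper(v)
    return found

def count_matches_py2_style(values, group):
    found = [0]  # Python 2 workaround: mutate a single-item list instead
    def helper(x):
        if x in group:
            found[0] += 1
    for v in values:
        helper(v)
    return found[0]

print(count_matches([1, 2, 3, 4], {2, 4}))            # 2
print(count_matches_py2_style([1, 2, 3, 4], {2, 4}))  # 2
```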

Avoid using nonlocal statements for anything beyond simple functions.

Generators can produce a sequence of outputs for arbitrarily large inputs because their working memory doesn’t include all inputs and outputs.

Beware of functions that iterate over input arguments multiple times. If these arguments are iterators, you may see strange behavior and missing values.
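For instance, a normalization function that sums its input and then iterates it a second time works on a list but silently fails on a generator. A sketch with hypothetical names:

```python
def normalize(numbers):
    total = sum(numbers)  # first pass: exhausts an iterator argument
    return [100 * value / total for value in numbers]  # second pass sees nothing

def read_visits():
    for value in [15, 35, 80]:
        yield value

print(normalize([15, 35, 80]))   # [11.538..., 26.923..., 61.538...]
print(normalize(read_visits()))  # [] — the generator was already exhausted by sum()
```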

Python’s iterator protocol defines how containers and iterators interact with the iter and next built-in functions, for loops, and related expressions.

You can easily define your own iterable container type by implementing the __iter__ method as a generator.


You can detect that a value is an iterator (instead of a container) if calling iter on it twice produces the same result, which can then be progressed with the next built-in function.
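A sketch of both ideas: a container whose __iter__ is a generator, and the iter-twice test. Class and function names are illustrative:

```python
class ReadVisits:
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        # Each call returns a fresh generator, so the container is reusable.
        for value in self.data:
            yield value

visits = ReadVisits([15, 35, 80])
print(sum(visits), list(visits))  # 130 [15, 35, 80]

def is_iterator(obj):
    # Iterators return themselves from iter(); containers return new iterators.
    return iter(obj) is iter(obj)

print(is_iterator(visits))        # False — it's a container
print(is_iterator(iter(visits)))  # True
```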

The first issue is that the variable arguments are always turned into a tuple before they are passed to your function. This means that if the caller of your function uses the * operator on a generator, it will be iterated until it’s exhausted. The resulting tuple will include every value from the generator, which could consume a lot of memory and cause your program to crash.

Functions that accept *args are best for situations where you know the number of inputs in the argument list will be reasonably small.

The second issue with *args is that you can’t add new positional arguments to your function in the future without migrating every caller. If you try to add a positional argument in the front of the argument list, existing callers will subtly break if they aren’t updated.
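Both issues in one short sketch; log is a hypothetical logging helper:

```python
def log(message, *values):
    if not values:
        print(message)
    else:
        print(f"{message}: {', '.join(str(x) for x in values)}")

log('My numbers are', 1, 2)          # My numbers are: 1, 2
log('Hi there')                      # Hi there

favorites = [7, 33, 99]
log('Favorite numbers', *favorites)  # the * operator unpacks the sequence

def my_generator():
    for i in range(3):
        yield i

# The generator is fully materialized into a tuple before the call.
log('Generator values', *my_generator())
```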

Functions can accept a variable number of positional arguments by using *args in the def statement.

You can use the items from a sequence as the positional arguments for a function with the * operator.

Using the * operator with a generator may cause your program to run out of memory and crash.

Adding new positional parameters to functions that accept *args can introduce hard-to-find bugs.

Function arguments can be specified by position or by keyword.

Keywords make it clear what the purpose of each argument is when it would be confusing with only positional arguments.

Keyword arguments with default values make it easy to add new behaviors to a function, especially when the function has existing callers.

Optional keyword arguments should always be passed by keyword instead of by position.
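For example, all of these calls compute the same thing, but the keyword forms make the purpose of each argument explicit:

```python
def remainder(number, divisor):
    return number % divisor

assert remainder(20, 7) == 6
assert remainder(20, divisor=7) == 6
assert remainder(number=20, divisor=7) == 6
assert remainder(divisor=7, number=20) == 6
```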

Default arguments are only evaluated once: during function definition at module load time. This can cause odd behaviors for dynamic values (like {} or []).

Use None as the default value for keyword arguments that have a dynamic value. Document the actual default behavior in the function’s docstring.
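A sketch of the pitfall and the None-default idiom; decode is a hypothetical JSON helper:

```python
import json

def decode_broken(data, default={}):
    # This {} is created once, at definition time, and shared by every call.
    try:
        return json.loads(data)
    except ValueError:
        return default

foo = decode_broken('bad data')
foo['stuff'] = 5
bar = decode_broken('also bad')
print(foo is bar)  # True — both callers modified the same shared dict

def decode(data, default=None):
    """Load JSON data from a string.

    Args:
        data: JSON string to decode.
        default: Value to return if decoding fails. Defaults to an empty dict.
    """
    if default is None:
        default = {}
    try:
        return json.loads(data)
    except ValueError:
        return default
```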

In Python 3, you can demand clarity by defining your functions with keyword-only arguments. These arguments can only be supplied by keyword, never by position.

The * symbol in the argument list indicates the end of positional arguments and the beginning of keyword-only arguments.
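A short sketch of the Python 3 syntax; safe_division is an illustrative name:

```python
def safe_division(number, divisor, *, ignore_zero_division=False):
    # Everything after the bare * can only be supplied by keyword.
    try:
        return number / divisor
    except ZeroDivisionError:
        if ignore_zero_division:
            return float('inf')
        raise

print(safe_division(1.0, 0, ignore_zero_division=True))  # inf
# safe_division(1.0, 0, True)  # TypeError: too many positional arguments
```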

Unfortunately, Python 2 doesn’t have explicit syntax for specifying keyword-only arguments like Python 3. But you can achieve the same behavior of raising TypeErrors for invalid function calls by using the ** operator in argument lists.
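A sketch of that Python 2-era pattern, written here in Python 3-compatible syntax so it runs as-is:

```python
def safe_division_compat(number, divisor, **kwargs):
    ignore_zero_division = kwargs.pop('ignore_zero_division', False)
    if kwargs:
        # Reject any keyword arguments that weren't expected.
        raise TypeError('Unexpected **kwargs: %r' % kwargs)
    try:
        return number / divisor
    except ZeroDivisionError:
        if ignore_zero_division:
            return float('inf')
        raise

print(safe_division_compat(1.0, 0, ignore_zero_division=True))  # inf
```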

Keyword arguments make the intention of a function call more clear.

Use keyword-only arguments to force callers to supply keyword arguments for potentially confusing functions, especially those that accept multiple Boolean flags.

Python 3 supports explicit syntax for keyword-only arguments in functions.

Python 2 can emulate keyword-only arguments for functions by using **kwargs and manually raising TypeError exceptions.

You can’t specify default argument values for namedtuple classes. This makes them unwieldy when your data may have many optional properties. If you find yourself using more than a handful of attributes, defining your own class may be a better choice.

The attribute values of namedtuple instances are still accessible using numerical indexes and iteration. Especially in externalized APIs, this can lead to unintentional usage that makes it harder to move to a real class later. If you’re not in control of all of the usage of your namedtuple instances, it’s better to define your own class.

Avoid making dictionaries with values that are other dictionaries or long tuples.

Use namedtuple for lightweight, immutable data containers before you need the flexibility of a full class.

Move your bookkeeping code to use multiple helper classes when your internal state dictionaries get complicated.
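A sketch of the progression: a namedtuple for the simple record, plus a small helper class around it. Names are taken loosely from a grade-bookkeeping example:

```python
import collections

# Lightweight, immutable container for one weighted grade.
Grade = collections.namedtuple('Grade', ('score', 'weight'))

class Subject:
    """Helper class that keeps the grade bookkeeping readable."""
    def __init__(self):
        self._grades = []

    def report_grade(self, score, weight):
        self._grades.append(Grade(score, weight))

    def average_grade(self):
        total = sum(g.score * g.weight for g in self._grades)
        total_weight = sum(g.weight for g in self._grades)
        return total / total_weight

math = Subject()
math.report_grade(75, 0.5)
math.report_grade(85, 0.5)
print(math.average_grade())  # 80.0
```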

In other languages, you might expect hooks to be defined by an abstract class. In Python, many hooks are just stateless functions with well-defined arguments and return values. Functions are ideal for hooks because they are easier to describe and simpler to define than classes.

__call__ allows an object to be called just like a function. It also causes the callable built-in function to return True for such an instance.

Instead of defining and instantiating classes, functions are often all you need for simple interfaces between components in Python.

References to functions and methods in Python are first class, meaning they can be used in expressions like any other type.

The __call__ special method enables instances of a class to be called like plain Python functions.

When you need a function to maintain state, consider defining a class that provides the __call__ method instead of defining a stateful closure.
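For instance, a counting hook for collections.defaultdict can be written as a callable class rather than a closure; CountMissing is an illustrative name:

```python
from collections import defaultdict

class CountMissing:
    """Stateful hook: counts how many missing keys were requested."""
    def __init__(self):
        self.added = 0

    def __call__(self):
        self.added += 1
        return 0

counter = CountMissing()
print(callable(counter))  # True

result = defaultdict(counter, {'green': 12, 'blue': 3})  # the instance is the hook
for key, amount in [('red', 5), ('blue', 17), ('orange', 9)]:
    result[key] += amount

print(counter.added)  # 2 — only 'red' and 'orange' were missing
```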

Python only supports a single constructor per class, the __init__ method.

Use @classmethod to define alternative constructors for your classes.

Use class method polymorphism to provide generic ways to build and connect concrete subclasses.
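A minimal sketch of an alternative constructor via @classmethod; the class and config format are illustrative:

```python
class PathInputData:
    def __init__(self, path):
        self.path = path

    def read(self):
        with open(self.path) as f:
            return f.read()

    @classmethod
    def from_config(cls, config):
        # Alternative constructor; subclasses inherit or override it,
        # so generic code can build any concrete subclass the same way.
        return cls(config['data_path'])

data = PathInputData.from_config({'data_path': 'input.txt'})
print(data.path)  # input.txt
```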

To solve these problems, Python 2.2 added the super built-in function and defined the method resolution order (MRO). The MRO standardizes which superclasses are initialized before others (e.g., depth-first, left-to-right). It also ensures that common superclasses in diamond hierarchies are only run once.

Python’s standard method resolution order (MRO) solves the problems of superclass initialization order and diamond inheritance.

Always use the super built-in function to initialize parent classes.
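A small diamond-hierarchy sketch showing super and the MRO in action:

```python
class MyBaseClass:
    def __init__(self, value):
        self.value = value

class TimesSeven(MyBaseClass):
    def __init__(self, value):
        super().__init__(value)
        self.value *= 7

class PlusNine(MyBaseClass):
    def __init__(self, value):
        super().__init__(value)
        self.value += 9

class GoodWay(TimesSeven, PlusNine):
    def __init__(self, value):
        # MyBaseClass.__init__ runs exactly once, in MRO order.
        super().__init__(value)

print(GoodWay(5).value)  # 7 * (5 + 9) = 98
print([cls.__name__ for cls in GoodWay.mro()])
# ['GoodWay', 'TimesSeven', 'PlusNine', 'MyBaseClass', 'object']
```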

Python is an object-oriented language with built-in facilities for making multiple inheritance tractable (see Item 25: “Initialize Parent Classes with super”). However, it’s better to avoid multiple inheritance altogether.

A mix-in is a small class that only defines a set of additional methods that a class should provide. Mix-in classes don’t define their own instance attributes, nor do they require their __init__ constructor to be called.

Writing mix-ins is easy because Python makes it trivial to inspect the current state of any object regardless of its type.

Dynamic inspection lets you write generic functionality a single time, in a mix-in, that can be applied to many other classes. Mix-ins can be composed and layered to minimize repetitive code and maximize reuse.

The best part about mix-ins is that you can make their generic functionality pluggable so behaviors can be overridden when required.

Mix-ins can also be composed together. For example, say you want a mix-in that provides generic JSON serialization for any class.

Avoid using multiple inheritance if mix-in classes can achieve the same outcome.

Use pluggable behaviors at the instance level to provide per-class customization when mix-in classes may require it.

Compose mix-ins to create complex functionality from simple behaviors.
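A compact sketch of two composable mix-ins, one for dict conversion and one for JSON layered on top of it; the names are illustrative:

```python
import json

class ToDictMixin:
    """Generic behavior via dynamic inspection of instance attributes."""
    def to_dict(self):
        return {
            key: value.to_dict() if isinstance(value, ToDictMixin) else value
            for key, value in self.__dict__.items()
        }

class JsonMixin:
    """Layers on anything that provides to_dict."""
    def to_json(self):
        return json.dumps(self.to_dict())

class Point(ToDictMixin, JsonMixin):
    def __init__(self, x, y):
        self.x = x
        self.y = y

print(Point(1, 2).to_json())  # {"x": 1, "y": 2}
```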

In Python, there are only two types of visibility for a class’s attributes: public and private.

Why doesn’t the syntax for private attributes actually enforce strict visibility? The simplest answer is one often-quoted motto of Python: “We are all consenting adults here.” Python programmers believe that the benefits of being open outweigh the downsides of being closed.

Private attributes aren’t rigorously enforced by the Python compiler.

Plan from the beginning to allow subclasses to do more with your internal APIs and attributes instead of locking them out by default.

Use documentation of protected fields to guide subclasses instead of trying to force access control with private attributes.

Only consider using private attributes to avoid naming conflicts with subclasses that are out of your control.
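A quick sketch of what private really means here, name mangling included:

```python
class MyObject:
    def __init__(self):
        self.public_field = 5
        self.__private_field = 10  # stored as _MyObject__private_field

obj = MyObject()
print(obj.public_field)              # 5
print(obj._MyObject__private_field)  # 10 — nothing rigorously stops this
# print(obj.__private_field)         # AttributeError
```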

Inherit directly from Python’s container types (like list or dict) for simple use cases.

Beware of the large number of methods required to implement custom container types correctly.

Have your custom container types inherit from the interfaces defined in collections.abc to ensure that your classes match required interfaces and behaviors.
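For example, subclassing list works for the simple case, while collections.abc.Sequence supplies index, count, and __contains__ once __getitem__ and __len__ are defined. A sketch with illustrative names:

```python
from collections.abc import Sequence

class FrequencyList(list):
    """Simple case: inherit from list directly and add one method."""
    def frequency(self):
        counts = {}
        for item in self:
            counts[item] = counts.get(item, 0) + 1
        return counts

class SortedItems(Sequence):
    """collections.abc fills in index, count, __contains__, iteration, etc."""
    def __init__(self, items):
        self._items = sorted(items)

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)

print(FrequencyList(['a', 'b', 'a']).frequency())  # {'a': 2, 'b': 1}
items = SortedItems([3, 1, 2])
print(items.index(2), 3 in items)                  # 1 True
```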

Simply put, metaclasses let you intercept Python’s class statement and provide special behavior each time a class is defined.
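A tiny sketch of that interception; the metaclass here just prints what it sees:

```python
class Meta(type):
    def __new__(meta, name, bases, class_dict):
        # Runs once for every class defined with this metaclass.
        print(f'Defining {name} with attributes {sorted(class_dict)}')
        return super().__new__(meta, name, bases, class_dict)

class MyClass(metaclass=Meta):
    stuff = 123
# Prints something like:
# Defining MyClass with attributes ['__module__', '__qualname__', 'stuff']
```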

Define new class interfaces using simple public attributes, and avoid set and get methods.

Use @property to define special behavior when attributes are accessed on your objects, if necessary.

Ensure that @property methods are fast; do slow or complex work using normal methods.
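A brief sketch of @property with a setter that does extra work on assignment; Resistor and VoltageResistance are illustrative names:

```python
class Resistor:
    def __init__(self, ohms):
        self.ohms = ohms  # plain public attribute, no getters or setters
        self.voltage = 0
        self.current = 0

class VoltageResistance(Resistor):
    @property
    def voltage(self):
        return self._voltage

    @voltage.setter
    def voltage(self, voltage):
        # Special behavior on assignment: keep current in sync.
        self._voltage = voltage
        self.current = voltage / self.ohms

r = VoltageResistance(1e3)
r.voltage = 10
print(r.current)  # 0.01
```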


Date
June 21, 2022