Python - 2

Learn about Python Anti-Patterns and how they help you write better code and avoid common pitfalls.

The pandas.DataFrame.to_numpy() method should be preferred to the pandas.DataFrame.values attribute

The values attribute and the to_numpy() method in pandas both provide a way to return a NumPy representation of a DataFrame. However, there are several reasons why the to_numpy() method is recommended over the values attribute:

  • Future compatibility: The values attribute is considered a legacy feature, while to_numpy() is the recommended method to extract data and is considered more future-proof.

  • Data type consistency: If the DataFrame has columns with different data types, NumPy will choose a common data type that can hold all the data. This may lead to loss of information, unexpected type conversions, or increased memory usage. to_numpy() allows you to choose the common type explicitly by passing the dtype argument.

  • View vs copy: The values attribute can return a view or a copy of the data depending on whether the data needs to be transposed. This can lead to confusion when modifying the extracted data. On the other hand, to_numpy() has a copy argument that can force it to always return a new NumPy array, ensuring that any changes you make won't affect the original DataFrame.

  • Missing values control: to_numpy() allows you to specify the default value used for missing values in the DataFrame, while values will always use numpy.nan for missing values.

import pandas as pd

df = pd.DataFrame({
    'X': ['A', 'B', 'A', 'C'],
    'Y': [10, 7, 12, 5]
})

arr = df.values # Noncompliant: using the 'values' attribute is not recommended
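
A minimal compliant sketch of the same extraction with to_numpy(); the dtype and copy arguments shown here are optional and illustrative:

arr = df.to_numpy()                          # Compliant
arr = df.to_numpy(dtype=object, copy=True)   # Compliant: explicit dtype and a guaranteed copy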

Operators should be used on compatible types

For a specific operator, two types are considered incompatible if no built-in operations between those types exist and none of the operands has implemented the operator’s corresponding special methods. Performing such an operation on incompatible types will raise a TypeError.

Calling an operator in Python is equivalent to calling a special method (except for the identity operator is). Python provides a set of built-in operations. For example, to add two integers, 1 + 2, calling the built-in operator + is equivalent to calling the special method __add__ on the type int.

Python allows developers to define how an operator will behave with a custom class by implementing the corresponding special method. When defining such methods for symmetrical binary operators, developers need to define two methods so that the order of operands doesn't matter, e.g. __add__ and __radd__.

For a complete list of operators and their methods see the Python documentation: arithmetic and bitwise operators, comparison operators.

class Empty:
    pass

class Add:
    def __add__(self, other):
        return 42

Empty() + 1  # Noncompliant: no __add__ method is defined on the Empty class
Add() + 1
1 + Add()  # Noncompliant: no __radd__ method is defined on the Add class
Add() + Empty()
Empty() + Add()  # Noncompliant: no __radd__ method is defined on the Add class

Exceptions __cause__ should be either an Exception or None

Exception chaining enables users to see if an exception is the direct consequence of another exception (see PEP-3134). This is useful to propagate the original context of the error.

Exceptions are chained using either of the following syntaxes:

  • With the from keyword

  • By setting the exception's __cause__ attribute (see the sketch after the example below)

try:
    ...
except OSError as e:
    raise RuntimeError("Something went wrong") from e
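
A minimal sketch of the second form, assigning __cause__ directly, which chains the exceptions just like the from syntax above:

try:
    ...
except OSError as e:
    new_exception = RuntimeError("Something went wrong")
    new_exception.__cause__ = e  # explicit chaining, equivalent to "raise ... from e"
    raise new_exception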

Loops with at most one iteration should be refactored

A loop with at most one iteration is equivalent to the use of an if statement to conditionally execute one piece of code. No developer expects to find such a use of a loop statement. If the initial intention of the author was really to conditionally execute one piece of code, an if statement should be used instead.

At worst, that was not the initial intention of the author, and so the body of the loop should be fixed to use the nested return, break or raise statements in a more appropriate way.

while node is not None:
    node = node.parent()
    print(node)
    break
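
If the intent really is to execute the body at most once, an if statement expresses it directly; a minimal sketch of the refactoring:

if node is not None:
    node = node.parent()
    print(node)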

Generic classes should be defined using the type parameter syntax

Through PEP 695, Python 3.12 introduces the type parameter syntax to allow for a more compact and explicit way to define generic classes and functions.

Prior to Python 3.12, defining a generic class would be done through the following syntax:

from typing import Generic, TypeVar

_T_co = TypeVar("_T_co", covariant=True, bound=str)

class ClassA(Generic[_T_co]):
    def method1(self) -> _T_co:
        ...
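
For comparison, a minimal sketch of the same class written with the Python 3.12 type parameter syntax (variance is inferred automatically):

class ClassA[T: str]:
    def method1(self) -> T:
        ...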

Some special methods should return NotImplemented instead of raising NotImplementedError

In Python, special methods corresponding to numeric operators and rich comparison operators should return NotImplemented when the operation is not supported.

For example A + B is equivalent to calling A.__add__(B). If this binary operation is not supported by class A, A.__add__(B) should return NotImplemented. The interpreter will then try the reverse operation, i.e. B.__radd__(A). If these special methods were to raise NotImplementedError, the callers would not catch the exception and the reverse operation would not be called.

Below is the list of special methods this rule applies to:

  • __lt__(self, other)
  • __le__(self, other)
  • __eq__(self, other)
  • __ne__(self, other)
  • __gt__(self, other)
  • __ge__(self, other)
  • __add__(self, other)
  • __sub__(self, other)
  • __mul__(self, other)
  • __matmul__(self, other)
  • __truediv__(self, other)
  • __floordiv__(self, other)
  • __mod__(self, other)
  • __divmod__(self, other)
  • __pow__(self, other[, modulo])
  • __lshift__(self, other)
  • __rshift__(self, other)
  • __and__(self, other)
  • __xor__(self, other)
  • __or__(self, other)
  • __radd__(self, other)
  • __rsub__(self, other)
  • __rmul__(self, other)
  • __rmatmul__(self, other)
  • __rtruediv__(self, other)
  • __rfloordiv__(self, other)
  • __rmod__(self, other)
  • __rdivmod__(self, other)
  • __rpow__(self, other[, modulo])
  • __rlshift__(self, other)
  • __rrshift__(self, other)
  • __rand__(self, other)
  • __rxor__(self, other)
  • __ror__(self, other)
  • __iadd__(self, other)
  • __isub__(self, other)
  • __imul__(self, other)
  • __imatmul__(self, other)
  • __itruediv__(self, other)
  • __ifloordiv__(self, other)
  • __imod__(self, other)
  • __ipow__(self, other[, modulo])
  • __ilshift__(self, other)
  • __irshift__(self, other)
  • __iand__(self, other)
  • __ixor__(self, other)
  • __ior__(self, other)
  • __length_hint__(self)

class MyClass:
    def __add__(self, other):
        raise NotImplementedError()  # Noncompliant: the exception will be propagated
    def __radd__(self, other):
        raise NotImplementedError()  # Noncompliant: the exception will be propagated

class MyOtherClass:
    def __add__(self, other):
        return 42
    def __radd__(self, other):
        return 42

MyClass() + MyOtherClass()  # This will raise NotImplementedError
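
A minimal compliant sketch, returning NotImplemented so the interpreter can try the reflected operation (the classes here mirror the example above):

class MyClass:
    def __add__(self, other):
        return NotImplemented  # Compliant: lets Python try the reflected operation
    def __radd__(self, other):
        return NotImplemented

class MyOtherClass:
    def __radd__(self, other):
        return 42

MyClass() + MyOtherClass()  # 42: falls back to MyOtherClass.__radd__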

Identity operators should not be used with dissimilar types

Operators is and is not check if their operands point to the same instance, thus they will always return respectively False and True when they are used to compare objects of different types.

a = 1
b = "1"
value = a is b  # Noncompliant. Always False
value = a is not b  # Noncompliant. Always True

Tests should be skipped explicitly

Test frameworks provide a mechanism to skip tests if their prerequisites are not met, by either calling dedicated methods (e.g. unittest.TestCase.skipTest, pytest.skip, etc.) or using decorators (e.g. unittest.skip, pytest.mark.skip, etc.).

Using a return statement instead will make the test succeed, even though no assertion has been performed. It is therefore better to flag the test as skipped in such a situation.

This rule raises an issue when a return is performed conditionally at the beginning of a test method.

No issue will be raised if the return is unconditional, as S1763 already raises an issue in that case.

The supported frameworks are Pytest and Unittest.

import unittest

class MyTest(unittest.TestCase):

    def test_something(self):
        if not external_resource_available():
            return  # Noncompliant
        self.assertEqual(foo(), 42)
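
A minimal compliant sketch using unittest's skipTest; external_resource_available() and foo() are the same placeholder helpers as in the example above:

import unittest

class MyTest(unittest.TestCase):

    def test_something(self):
        if not external_resource_available():
            self.skipTest("external resource not available")  # Compliant: the test is reported as skipped
        self.assertEqual(foo(), 42)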

The first argument to class methods should follow the naming convention

By convention, the first argument in a class method, i.e. methods decorated with @classmethod, is named cls as a representation and a reminder that the argument is the class itself. If you were to name the argument something else, you would stand a good chance of confusing both users and maintainers of the code. It might also indicate that the cls parameter was forgotten, in which case calling the method will most probably fail. This rule also applies to the methods __init_subclass__, __class_getitem__ and __new__, as their first argument is always the class instead of "self".

By default this rule accepts cls and mcs, which is sometimes used in metaclasses, as valid names for class parameters. You can set your own list of accepted names via the parameter classParameterNames.

class Rectangle(object):

    @classmethod
    def area(bob, height, width):  # Noncompliant
        return height * width

Union type expressions should be preferred over typing.Union in type hints

Python 3.10 introduced a specific syntax using the “or” operator (X | Y) to represent a union of types. This syntax has the same functionality as typing.Union, but it is more concise and easier to read.

Using typing.Union is more verbose and less convenient. It can also create inconsistencies when different parts of the codebase use different syntaxes for the same type.

from typing import Union

def foo(arg: Union[int, str]) -> Union[int, str]:
    if isinstance(arg, int):
        return arg + 1
    else:
        return arg.upper()
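
A minimal sketch of the same function using the union type expression available from Python 3.10:

def foo(arg: int | str) -> int | str:
    if isinstance(arg, int):
        return arg + 1
    else:
        return arg.upper()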

Dictionary's setdefault should be used instead of checking key existence

A common anti-pattern is to check that a key does not exist in a dictionary before adding it with a corresponding value. This pattern works but is less readable than the equivalent call to the built-in dictionary method “setdefault()”.

Note that if a default value is set for every key of the dictionary it is possible to use python’s defaultdict instead.

This rule raises an issue when a key presence is checked before being set. It only raises an issue when the value is a hard-coded string, number, list, dictionary or tuple. Computed values will not raise an issue as they can have side-effects.

if "key" not in my_dictionary:
    my_dictionary["key"] = ["a", "b", "c"]  # Noncompliant

if "key" not in my_dictionary:
    my_dictionary["key"] = generate_value()  # Compliant. No issue is raised as generate_value() might have some side-effect.
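
A minimal compliant sketch of the first check rewritten with setdefault (my_dictionary as above):

my_dictionary.setdefault("key", ["a", "b", "c"])  # Compliant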

Functions and lambdas should not reference variables defined in enclosing loops

Nested functions and lambdas can reference variables defined in enclosing scopes. This can create tricky bugs when the variable and the function are defined in a loop. If the function is called in another iteration or after the loop finishes, it will see the variables’ last value instead of seeing the values corresponding to the iteration where the function was defined.

Capturing loop variables might work for some time but:

  • it makes the code difficult to understand.

  • it increases the risk of introducing a bug when the code is refactored or when dependencies are updated. See an example with the builtin “map” below.

One solution is to add a parameter to the function/lambda and use the previously captured variable as its default value. Default values are only executed once, when the function is defined, which means that the parameter’s value will remain the same even when the variable is reassigned in following iterations.

Another solution is to pass the variable as an argument to the function/lambda when it is called.

This rule raises an issue when a function or lambda references a variable defined in an enclosing loop.

def run():
    mylist = []
    for i in range(5):
        mylist.append(lambda: i)  # Noncompliant

        def func():
            return i  # Noncompliant
        mylist.append(func)

def example_of_api_change():
    """
    Passing loop variables as default values also makes sure that the code is future-proof.
    For example the following code will work as intended with Python 2 but not Python 3.
    Why? Because "map" behavior changed. It now returns an iterator and only executes
    the lambda when required. The same is true for other functions such as "filter".
    """
    lst = []
    for i in range(5):
        lst.append(map(lambda x: x + i, range(3)))  # Noncompliant
    for sublist in lst:
        # prints [4, 5, 6] x 4 with Python 3, with Python 2 it prints [0, 1, 2], [1, 2, 3], ...
        print(list(sublist))
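
A minimal sketch of the default-value workaround described above; each function captures the value of i at the iteration where it is defined:

def run_fixed():
    mylist = []
    for i in range(5):
        mylist.append(lambda i=i: i)  # Compliant: i is bound when the lambda is defined

        def func(i=i):
            return i  # Compliant
        mylist.append(func)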

Type hints of generic types should specify their type parameters

Generic types, such as list or dict accept type arguments to specify the type of elements contained in the list or the keys and values in the dictionary.

If a generic type is used without a type argument, the type arguments are implicitly assumed to be Any. This makes the type hint less informative and makes the contract of the function or variable annotated with the type hint more difficult to understand.

Furthermore, incomplete type hints can hinder IDE autocompletion and the code insight capabilities of static analysis tools.

def print_list(numbers: list) -> None:
    for n in numbers:
        print(n)
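
A minimal sketch with the type parameter spelled out (the builtin generic syntax requires Python 3.9+):

def print_list(numbers: list[int]) -> None:
    for n in numbers:
        print(n)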

Set members and dictionary keys should be hashable

The hash value of an object is an integer returned by its __hash__ method. Objects that are considered equal to each other (as per the __eq__ method) should have the same hash value.

Whenever using an object as a dictionary key or inserting it into a set, the hash value of that object will be used to derive a bucket in which the object will be inserted.

When attempting to insert an unhashable object into a set, a TypeError will be raised instead.

If an object defines a __hash__ method derived from mutable properties, no TypeError will be raised. However, a mutable hash value should never be used, as this would prevent dictionaries and sets from retrieving the object.

def foo():
    my_list = [1, 2, 3]
    my_set = {my_list}  # Noncompliant: list is not hashable.

np.nonzero should be preferred over np.where when only the condition parameter is set

The NumPy function np.where provides a way to execute operations on an array under a certain condition:

import numpy as np

arr = np.array([1,2,3,4])

result = np.where(arr > 3, arr * 2, arr)
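
When only the condition is passed, the call is really an index lookup; a minimal sketch of the preferred spelling (arr as defined above):

indices = np.where(arr > 3)    # Noncompliant: only the condition parameter is set
indices = np.nonzero(arr > 3)  # Compliant: the intent of retrieving indices is explicit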

Methods and field names should not differ only by capitalization

Looking at the set of methods and fields in a class and finding two that differ only by capitalization is confusing to users of the class.

This situation may simply indicate poor naming. Method names should be action-oriented, and thus contain a verb, which is unlikely in the case where both a method and a field have the same name (with or without capitalization differences). However, renaming a public method could be disruptive to callers, so renaming the field is the recommended action.

class SomeClass:
    lookUp = False
    def lookup():       # Noncompliant; method name differs from field name only by capitalization
        pass

await should be used on awaitable objects

The await keyword can only be used on “Awaitable” objects. Python has three types of awaitables: Future, Task and Coroutines. Calling await on any other object will raise a TypeError.

import asyncio

def myfunction():
    print("myfunction")

async def otherfunction():
    await myfunction()  # Noncompliant. myfunction is not marked as "async"

asyncio.run(otherfunction())
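
A minimal compliant sketch where the awaited function is itself a coroutine:

import asyncio

async def myfunction():
    print("myfunction")

async def otherfunction():
    await myfunction()  # Compliant: myfunction() now returns an awaitable coroutine

asyncio.run(otherfunction())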

Dates should be formatted correctly when using pandas.to_datetime with dayfirst or yearfirst arguments

The pandas.to_datetime function transforms a string to a date object. The string representation of the date can take multiple formats. To correctly parse these strings, pandas.to_datetime provides several arguments to set up the parsing, such as dayfirst or yearfirst. For example, setting dayfirst to True indicates to pandas.to_datetime that the date and time will be represented as a string with the shape day month year time. Similarly with yearfirst, the string should have the shape year month day time.

These two arguments are not strict, meaning if the shape of the string is not the one expected by pandas.to_datetime, the function will not fail and try to figure out which part of the string is the day, month or year.

In the following example the dayfirst argument is set to True but we can clearly see that the month part of the date would be incorrect. In this case pandas.to_datetime will ignore the dayfirst argument, and parse the date as the 22nd of January.

import pandas as pd

pd.to_datetime(["01-22-2000 10:00"], dayfirst=True)

Static methods should not have self or cls arguments

Unlike class and instance methods, static methods don't receive an implicit first argument. Nonetheless, naming the first argument self or cls guarantees confusion - either on the part of the original author, who may never understand why the arguments don't hold the values they expected, or on that of future maintainers.

class MyClass:
    @staticmethod
    def s_meth(self, arg1, arg2):  # Noncompliant
        ...

Exceptions should not be created without being raised

Creating a new Exception without actually raising it has no effect and is probably due to a mistake.

def func(x):
    if not isinstance(x, int):
        TypeError("Wrong type for parameter 'x'. func expects an integer")  # Noncompliant
    if x < 0:
        ValueError  # Noncompliant
    return x + 42

isinstance() should be preferred to direct type comparisons

In Python, using the isinstance() function is generally preferred over direct type comparison for several reasons:

  1. Compatibility with inheritance: isinstance() considers inheritance hierarchy, whereas direct type comparison does not. This means that isinstance() can handle cases where an object belongs to a subclass of the specified type, making your code more flexible and robust. It allows you to write code that can work with objects of different but related types.

  2. Support for duck typing: Python follows the principle of “duck typing,” which focuses on an object’s behavior rather than its actual type. isinstance() enables you to check if an object has certain behavior (by checking if it belongs to a particular class or subclass) rather than strictly requiring a specific type. This promotes code reusability and enhances the flexibility of your programs.

  3. Code maintainability and extensibility: By using isinstance(), your code becomes more maintainable and extensible. If you directly compare types, you would need to modify your code whenever a new subtype is introduced or the inheritance hierarchy is changed. On the other hand, isinstance() allows your code to accommodate new types without requiring any modifications, as long as they exhibit the desired behavior.

  4. Polymorphism and interface-based programming: isinstance() supports polymorphism, which is the ability of different objects to respond to the same method calls. It allows you to design code that interacts with objects based on their shared interface rather than their specific types. This promotes code reuse and modularity, as you can write functions and methods that operate on a range of compatible objects.

  5. Third-party library compatibility: Many third-party libraries and frameworks in Python rely on isinstance() for type checking and handling different types of objects. By using isinstance(), your code becomes more compatible with these libraries and frameworks, making it easier to integrate your code into larger projects or collaborate with other developers.

In summary, using isinstance() over direct type comparison in Python promotes flexibility, code reusability, maintainability, extensibility, and compatibility with the wider Python ecosystem. It aligns with the principles of object-oriented programming and supports the dynamic nature of Python. It is also recommended by the PEP8 style guide.

class MyClass:
    ...

def foo(a):
    if type(a) == MyClass: # Noncompliant
        ...
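
A minimal compliant sketch using isinstance, which also accepts subclasses of MyClass:

def foo(a):
    if isinstance(a, MyClass): # Compliant
        ...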

Boolean expressions of exceptions should not be used in except statements

The only two possible types for an except statement's expression are a class deriving from BaseException, or a tuple composed of such classes.

Trying to catch multiple exceptions in the same except clause with a boolean expression of exceptions may not work as intended. The result of a boolean expression of exceptions is a single exception class, thus using a boolean expression in an except block will result in catching only one kind of exception.

error = ValueError or TypeError  
error is ValueError # True
error is TypeError # False

error = ValueError and TypeError  
error is ValueError # False
error is TypeError # True

Identity comparisons should not be used with cached types

The identity operators is and is not check whether the same object is on both sides, i.e. a is b returns True only when id(a) == id(b). Some values may be cached by the interpreter (CPython caches small integers and interned strings, for example), so whether two equal values are the same object is an implementation detail. Identity comparisons should therefore not be relied upon with such cached types; use equality operators instead.

my_int = 1
other_int = 1

id(my_int) == id(other_int) # True

self should be the first argument to instance methods

Instance methods, i.e. methods not annotated with @classmethod or @staticmethod, are expected to have at least one parameter. This parameter will reference the object instance on which the method is called. By convention, this first parameter is named "self".

Naming the first parameter something different from "self" is not recommended as it could lead to confusion. It might indicate that the "self" parameter was forgotten, in which case calling the method will most probably fail.

Note also that creating methods which are used as static methods without the @staticmethod decorator is a bad practice. Calling these methods on an instance will raise a TypeError. Either move the method out of the class or decorate it with @staticmethod.

class MyClass(ABC):
    @myDecorator
    def method(arg):  # No issue will be raised.
        pass

New objects should not be created only to check their identity

Identity operators is and is not check if the same object is on both sides, i.e. a is b returns True only when id(a) == id(b). Comparing a variable with a newly created object will therefore always return False (or always True with is not), because the new object is by definition a distinct instance.

def func(param):
    param is {1: 2}  # Noncompliant: always False
    param is not {1, 2, 3}  # Noncompliant: always True
    param is [1, 2, 3]  # Noncompliant: always False

    param is dict(a=1)  # Noncompliant: always False

    mylist = []  # mylist is assigned a new object
    param is mylist  # Noncompliant: always False

startswith or endswith methods should be used instead of string slicing in condition expressions

Using the startswith and endswith methods in Python instead of string slicing offers several advantages:

  1. Readability and Intent: Using startswith and endswith methods provides code that is more readable and self-explanatory. It clearly communicates your intention to check if a string starts or ends with a specific pattern. This makes the code more maintainable and easier to understand for other developers.

  2. Flexibility: The startswith and endswith methods allow you to check for patterns of varying lengths. With string slicing, you would need to specify the exact length of the substring to compare. However, with the methods, you can pass in a pattern of any length, making your code more flexible and adaptable.

  3. Error Handling: The methods handle edge cases for you. For example, if the pattern is longer than the string itself, they simply return False, so you don't have to keep the slice bounds consistent with the pattern length yourself.

  4. Performance Optimization: In some cases, using startswith and endswith methods can provide better performance. These methods are optimized and implemented in C, which can make them faster than manually slicing the string in Python. Although the performance gain might be negligible for small strings, it can be significant when working with large strings or processing them in a loop.

Overall, using startswith and endswith methods provides a cleaner, more readable, and error-resistant approach for checking if a string starts or ends with a specific pattern. It promotes code clarity, flexibility, and can potentially improve performance. This is also recommended by the PEP8 style guide.

message = "Hello, world!"

if message[:5] == "Hello":
    ...

if message[-6:] == "world!":
    ...
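
A minimal compliant sketch of the same checks with the dedicated methods:

if message.startswith("Hello"):
    ...

if message.endswith("world!"):
    ...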

Increment and decrement operators should not be used

Python has no pre/post increment/decrement operator. For instance, x++ and x-- will fail to parse. More importantly, ++x and --x will do nothing. To increment a number, simply write x += 1.

++x # Noncompliant: pre and post increment operators do not exist in Python.

x-- # Noncompliant: pre and post decrement operators do not exist in Python.

Bare raise statements should not be used in finally blocks

A bare raise statement, i.e. a raise with no exception provided, will re-raise the last active exception in the current scope. If no exception is active, a RuntimeError is raised instead.

If the bare raise statement is in a finally block, it will only have an active exception to re-raise when an exception from the try block is not caught or when an exception is raised by an except or else block. Thus bare raise statements should not be relied upon in finally blocks. It is simpler to let the exception raise automatically.

def foo(param):
    result = 0
    try:
        print("foo")
    except ValueError as e:
        pass
    else:
        if param:
            raise ValueError()
    finally:
        if param:
            raise  # Noncompliant: This will fail in some context.
        else:
            result = 1
    return result

__init__ should not return a value

By contract, every Python function returns something, even if it is the None value, which can be returned implicitly by omitting the return statement, or explicitly.

The __init__ method is required to return None. A TypeError will be raised if the __init__ method either yields or returns any expression other than None. While explicitly returning an expression that evaluates to None will not raise an error, it is considered bad practice.

To fix this issue, make sure that the __init__ method does not contain any return statement.

class MyClass(object):
    def __init__(self):
        self.message = 'Hello'
        return self  # Noncompliant: a TypeError will be raised

Redundant pairs of parentheses should be removed

Parentheses can disambiguate the order of operations in complex expressions and make the code easier to understand.

a = (b * c) + (d * e) # Compliant: the intent is clear.

`str.replace` should be preferred to `re.sub`

An re.sub call always performs an evaluation of the first argument as a regular expression, even if no regular expression features were used. This has a significant performance cost and therefore should be used with care.

When re.sub is used, the first argument should be a real regular expression. If it’s not the case, str.replace does exactly the same thing as re.sub without the performance drawback of the regex.

This rule raises an issue for each re.sub call whose first argument is a simple string that doesn't contain any special regex characters or patterns.

import re

init = "Bob is a Bird... Bob is a Plane... Bob is Superman!"
changed = re.sub(r"Bob is", "It's", init) # Noncompliant
changed = re.sub(r"\.\.\.", ";", changed) # Noncompliant
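
A minimal compliant sketch using str.replace, which avoids the regex engine entirely (init as above):

changed = init.replace("Bob is", "It's")
changed = changed.replace("...", ";")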

Function return types should be consistent with their type hint

Developers can use type hints to specify which type a function is expected to return. Doing so improves maintainability since it clarifies the contract of the function, making it easier to use and understand.

When annotating a function with a specific type hint, it is expected that the returned value matches the type specified in the hint.

If the type hint specifies a class or a named type, then the value returned should be an instance of that class or type. If the type hint specifies a structural type, then the value returned should have the same structure as the type hint.

In the following example, while Bucket does not directly inherit from Iterable, it does implement the Iterable protocol thanks to its __iter__ method and can therefore be used as a valid Iterable return type.

from collections.abc import Iterator, Iterable

class Bucket:  # Note: no base classes
    ...
    def __len__(self) -> int: ...
    def __iter__(self) -> Iterator[int]: ...


def collect() -> Iterable: return Bucket()

The abs_tol parameter should be provided when using math.isclose to compare values to 0

Comparing float values for equality directly is not reliable and should be avoided, due to the inherent imprecision in the binary representation of floating point numbers. Such comparison is reported by S1244.

One common solution to this problem is to use the math.isclose function to perform the comparison. Behind the scenes, the math.isclose function uses a tolerance value (also called epsilon) to define an acceptable range of difference between two floats. A tolerance value may be relative (based on the magnitude of the numbers being compared) or absolute.

Using a relative tolerance would be equivalent to:

def isclose_relative(a, b, rel_tol=1e-09):
    diff = abs(a - b)
    max_diff = rel_tol * max(abs(a), abs(b))
    return diff <= max_diff
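
The rule's point is that when one operand is 0, the relative tolerance collapses to rel_tol * 0 = 0 and the comparison can never succeed; a minimal sketch (the abs_tol value is illustrative):

import math

math.isclose(0.1 + 0.2 - 0.3, 0)                 # False: the relative tolerance collapses to 0
math.isclose(0.1 + 0.2 - 0.3, 0, abs_tol=1e-09)  # True: the absolute tolerance makes the comparison meaningful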

Identical expressions should not be used on both sides of a binary operator

Using the same value on either side of a binary operator is almost always a mistake. In the case of logical operators, it is either a copy/paste error and therefore a bug, or it is simply wasted code, and should be simplified. In the case of bitwise operators and most binary mathematical operators, having the same value on both sides of an operator yields predictable results, and should be simplified.

Note that this rule also raises issues on comparisons such as a == a and a != a, where the same variable appears on both sides of the operator.

if a == a: # Noncompliant
    work()

if a != a: # Noncompliant
    work()

if a == b and a == b: # Noncompliant
    work()

if a == b or a == b: # Noncompliant
    work()

j = 5 / 5 # Noncompliant
k = 5 - 5 # Noncompliant

Floating point numbers should not be tested for equality

Floating point math is imprecise because of the challenges of storing such values in a binary representation.

In base 10, the fraction 1/3 is represented as 0.333…, which, for a given number of significant digits, will never be exactly 1/3. The same problem happens when trying to represent 1/10 in base 2, which leads to the infinitely repeating fraction 0.0001100110011…. This makes floating point representations inherently imprecise.

Even worse, floating point math is not associative; push a float through a series of simple mathematical operations and the answer will be different based on the order of those operations because of the rounding that takes place at each step.

Even simple floating point assignments are not simple, as can be visualized using the format function to check for significant digits:

>>> format(0.1, ".17g")
'0.10000000000000001'

Only existing object members should be accessed

Accessing a non-existing member on an object will, in most cases, raise an AttributeError exception.

This rule raises an issue when a non-existing member is accessed on a class instance and nothing indicates that this was expected.

def access_attribute():
    x = 42
    return x.isnumeric()  # Noncompliant

Recursion should not be infinite

Recursion happens when control enters a loop that has no exit. This can happen when a method invokes itself or when a pair of methods invoke each other. It can be a useful tool, but unless the method includes a provision to break out of the recursion and return, the recursion will continue until the stack overflows and the program crashes.

def my_pow(num, exponent):  # Noncompliant
    num = num * my_pow(num, exponent - 1)
    return num  # this is never reached

Custom Exception classes should inherit from Exception or one of its subclasses

SystemExit is raised when sys.exit() is called. KeyboardInterrupt is raised when the user asks the program to stop by pressing interrupt keys. Both exceptions are expected to propagate up until the application stops.

In order to avoid catching SystemExit and KeyboardInterrupt by mistake, PEP-352 created the root class BaseException from which SystemExit, KeyboardInterrupt and Exception derive. Thus developers can use except Exception: without preventing the software from stopping.

The GeneratorExit class also derives from BaseException as it is not really an error and is not supposed to be caught by user code.

As said in Python's documentation, user-defined exceptions are not supposed to inherit directly from BaseException. They should instead inherit from Exception or one of its subclasses.

class MyException(BaseException):  # Noncompliant
    pass
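
A minimal compliant sketch deriving from Exception instead:

class MyException(Exception):  # Compliant
    pass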

Comparison to None should not be constant

Checking if a variable or parameter is None should only be done when you expect that it can be None. Doing so when the variable is always None or never None is confusing at best. At worst, there is a bug and the variable is not updated properly.

This rule raises an issue when the expressions X is None, X is not None, X == None and X != None are constant, i.e. when X is always None or can never be None.

def foo():
    my_var = None
    if my_var == None:  # Noncompliant: always True.
        ...

Iterator classes should have a valid __iter__ method

The iterator protocol specifies that an iterator object should have

  • a __next__ method retrieving the next value or raising StopIteration when there are no more values left.

  • an __iter__ method which should always return self. This enables iterators to be used as sequences in for-loops and other places.

This rule raises an issue when a class has a __next__ method and either:

  • it doesn't have an __iter__ method.

  • or its __iter__ method does not return "self".

class MyIterator:  # Noncompliant. Class has a __next__ method but no __iter__ method

    def __init__(self, values):
        self._values = values
        self._index = 0

    def __next__(self):
        if self._index >= len(self._values):
            raise StopIteration()
        value = self._values[self._index]
        self._index += 1
        return value


class MyIterator:
    def __init__(self, values):
        self._values = values
        self._index = 0

    def __next__(self):
        if self._index >= len(self._values):
            raise StopIteration()
        value = self._values[self._index]
        self._index += 1
        return value

    def __iter__(self):
        return 42  # Noncompliant. This __iter__ method does not return self


class MyIterable:  # Ok. This is an iterable, not an iterator, i.e. it has an __iter__ method but no __next__ method, so __iter__ doesn't have to return "self"

    def __init__(self, values):
        self._values = values

    def __iter__(self):
        return MyIterator(self._values)

Dictionary's get(..., default) should be used instead of checking key existence

A common anti-pattern is to check that a key exists in a dictionary before retrieving its corresponding value and providing a default value otherwise. This pattern works but is less readable than the equivalent call to the built-in dictionary method “get()” with a default value.

Note that if a default value is set for every key of the dictionary it is possible to use python’s defaultdict instead.

This rule raises an issue when a key presence is checked before retrieving its value or providing a default value. It only raises an issue when the default value is a hard-coded string, number, list, dictionary or tuple. Computed values will not raise an issue as they can have side-effects.

result = "default"
if "missing" in mydict:
    result = mydict["missing"]  # Noncompliant

if "missing" in mydict:
    result = mydict["missing"]  # Noncompliant
else:
    result = "default"

if "missing" in mydict:
    result = mydict["missing"]  # Compliant. No issue is raised as generate_value() might have some side-effect.
else:
    result = generate_value()
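
A minimal compliant sketch of the same lookup with get and a default value (mydict as above):

result = mydict.get("missing", "default")  # Compliant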

Equality checks should not be made against numpy.nan

numpy.nan is a floating point representation of Not a Number (NaN), used as a placeholder for undefined or missing values in numerical computations.

Equality checks of variables against numpy.nan in NumPy will always be False due to the special nature of numpy.nan. This can lead to unexpected and incorrect results.

Instead of a standard comparison, the numpy.isnan() function should be used.

import numpy as np

x = np.nan

if x == np.nan: # Noncompliant: always False
    ...
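
A minimal compliant sketch using numpy.isnan:

if np.isnan(x): # Compliant
    ...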

NotImplemented should not be raised

NotImplemented is a constant intended to be returned by comparison methods such as __lt__ when an operation is not supported; it is not an exception. Attempting to raise it will itself fail with a TypeError, and callers will have a hard time using your code. When a method is genuinely not implemented, raise NotImplementedError instead.

class MyClass:
    def do_something(self):
        raise NotImplemented("Haven't gotten this far yet.")  # Noncompliant

Back references in regular expressions should only refer to capturing groups that are matched before the reference

When a back reference in a regex refers to a capturing group that hasn't been defined yet (or at all), it can never be matched and will fail with an re.error exception.

import re
pattern1 = re.compile(r"\1(.)") # Noncompliant, group 1 is defined after the back reference
pattern2 = re.compile(r"(.)\2") # Noncompliant, group 2 isn't defined at all
pattern3 = re.compile(r"(.)|\1") # Noncompliant, group 1 and the back reference are in different branches
pattern4 = re.compile(r"(?P<x>.)|(?P=x)") # Noncompliant, group x and the back reference are in different branches

Non-existent operators like =+ should not be used

Using operator pairs (=+ or =-) that look like reversed single operators (+= or -=) is confusing. They compile and run but do not produce the same result as their mirrored counterpart.

target = -5
num = 3

target =- num  # Noncompliant: target = -3. Is that really what's meant?
target =+ num # Noncompliant: target = 3

Unused private nested classes should be removed

“Private” nested classes that are never used inside the enclosing class are usually dead code: unnecessary, inoperative code that should be removed. Cleaning out dead code decreases the size of the maintained codebase, making it easier to understand the program and preventing bugs from being introduced.

Python has no real private classes. Every class is accessible. There are however two conventions indicating that a class is not meant to be “public”:

  • classes with a name starting with a single underscore (ex: _MyClass) should be seen as non-public and might change without prior notice. They should not be used by third-party libraries or software. It is ok to use those classes inside the library defining them but it should be done with caution.

  • “class-private” classes are defined inside another class, and have a name starting with at least two underscores and ending with at most one underscore. These classes' names will be automatically mangled to avoid collision with subclasses' nested classes. For example __MyClass will be renamed as _classname__MyClass, where classname is the enclosing class's name without its leading underscore(s). Class-private classes shouldn't be used outside of their enclosing class.

This rule raises an issue when a private nested class (either with one or two leading underscores) is never used inside its parent class.

class TopLevel:
    class __Nested():  # Noncompliant: __Nested is never used
        pass

When using pandas.merge or pandas.join, the parameters on, how and validate should be provided

The Pandas library provides a user-friendly API to concatenate two data frames together with the methods merge and join.

When using these methods, it is possible to specify how the merge will be performed:

  • The parameter how specifies the type of merge (left, inner, outer, etc.).

  • The parameter on specifies the column(s) on which the merge will be performed.

  • The parameter validate specifies a way to verify that the merge result is what was expected.

import pandas as pd

age_df = pd.DataFrame({"user_id":[1,2,4], "age":[42,45, 35]})
name_df = pd.DataFrame({"user_id":[1,2,3,4], "name":["a","b","c","d"]})

result = age_df.merge(name_df, on="user_id", how="right", validate="1:1")

Numpy weekmask should have a valid value

To allow a datetime to be used in contexts where only certain days of the week are valid, NumPy includes a set of business day functions. Weekmask is used to customize valid business days.

Weekmask can be specified in several formats:

  1. As an array of 7 1 or 0 values, e.g. [1, 1, 1, 1, 1, 0, 0]

  2. As a string of 7 1 or 0 characters, e.g. “1111100”

  3. As a string with abbreviations of valid days from this list: Mon Tue Wed Thu Fri Sat Sun, e.g. “Mon Tue Wed Thu Fri”

Setting an incorrect weekmask leads to ValueError.

import numpy as np

offset = np.busday_offset('2012-05', 1, roll='forward', weekmask='01') # Noncompliant: ValueError

ExceptionGroup and BaseExceptionGroup should not be caught with except*

Python 3.11 introduced except* and ExceptionGroup, making it possible to handle and raise multiple unrelated exceptions simultaneously.

In the example below, we gather multiple exceptions in an ExceptionGroup. This ExceptionGroup is then caught by a single except block:

try:
    exception_group = ExceptionGroup("Files not found", [FileNotFoundError("file1.py"), FileNotFoundError("file2.py")])

    raise exception_group

except ExceptionGroup as exceptions:
    # Do something with all the exceptions
    pass

All code should be reachable

Jump statements (return, break, continue, and raise) move control flow out of the current code block. So any statements that come after a jump are dead code.

def fun(a):
    i = 10
    return i + a       # Noncompliant
    i += 1             # this is never executed

Fields of a Django ModelForm should be defined explicitly

In Django, when creating a ModelForm, it is common to use exclude to remove fields from the form. It is also possible to set the fields value to '__all__' to conveniently indicate that all the model fields should be included in the form. However, this can lead to security issues when new fields are added to the model, as they will automatically be included in the form, which may not be intended. Additionally, exclude or '__all__' can make it harder to maintain the codebase by hiding the dependencies between the model and the form.

from django import forms

class MyForm(forms.ModelForm):
    class Meta:
        model = MyModel
        exclude = ['field1', 'field2']  # Noncompliant


class MyOtherForm(forms.ModelForm):
    class Meta:
        model = Post
        fields = '__all__'  # Noncompliant

Conditionally executed code should be reachable

Unreachable code is never executed, so it has no effect on the behaviour of the program. If it is not executed because it no longer serves a purpose, then it adds unnecessary complexity. Otherwise, it indicates that there is a logical error in the condition.

def foo(a, b):
    flag = True

    if (a and not a):  # Noncompliant
        doSomething()  # Never executed

    if (flag): # Noncompliant
        return "Result 1"
    return "Result 2" # Never executed

Functions, methods and lambdas should not have too many mandatory parameters

Functions, methods and lambdas should not have too many mandatory parameters, i.e. parameters with no default value. Calling them will require code that is difficult to read and maintain. To solve this problem you could wrap some parameters in an object, split the function into simpler functions with fewer parameters, or provide default values for some parameters.

def do_something(param1, param2, param3, param4, param5):  # Noncompliant
    ...

List comprehensions should be used

There are several ways to create a new list based on the elements of some other collection, but the use of a list comprehension has multiple benefits. First, it is both concise and readable, and second, it yields a fully-formed object without requiring a mutable object as input that must be updated multiple times in the course of the list creation.

squares = []
for x in range(10):
    squares.append(x**2)  # Noncompliant

squares = map(lambda x: x**2, range(10))  # Noncompliant
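
A minimal compliant sketch producing the same list with a comprehension:

squares = [x**2 for x in range(10)]  # Compliant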

zoneinfo should be preferred to pytz when using Python 3.9 and later

In Python 3.9 and later, the zoneinfo module is the recommended tool for handling timezones, replacing the pytz library. This recommendation is based on several key advantages.

First, zoneinfo is part of Python’s standard library, making it readily available without needing additional installation, unlike pytz.

Second, zoneinfo integrates seamlessly with Python’s datetime module. You can directly use zoneinfo timezone objects when creating datetime objects, making it more intuitive and less error-prone than pytz, which requires a separate localize method for this purpose.

Third, zoneinfo handles historical timezone changes more accurately than pytz. When a pytz timezone object is used, it defaults to the earliest known offset, which can lead to unexpected results. zoneinfo does not have this issue.

Lastly, zoneinfo uses the system’s IANA time zone database when available, ensuring it works with the most up-to-date timezone data. In contrast, pytz includes its own copy of the IANA database, which may not be as current.

In summary, zoneinfo offers a more modern, intuitive, and reliable approach to handling timezones in Python 3.9 and later, making it the preferred choice over pytz.

from datetime import datetime
import pytz

dt = pytz.timezone('America/New_York').localize(datetime(2022, 1, 1))  # Noncompliant: the localize method is needed to avoid bugs (see S6887)
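
A minimal compliant sketch using zoneinfo from the standard library (Python 3.9+):

from datetime import datetime
from zoneinfo import ZoneInfo

dt = datetime(2022, 1, 1, tzinfo=ZoneInfo('America/New_York'))  # Compliant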

Collection sizes and array length comparisons should make sense

The length of a collection is always greater than or equal to zero. Testing it doesn’t make sense, since the result is always true.

mylist = []
if len(mylist) >= 0:  # Noncompliant: always true
    pass

SystemExit should be re-raised

A SystemExit exception is raised when sys.exit() is called. This exception is used to signal the interpreter to exit. The exception is expected to propagate up until the program stops. It is possible to catch this exception in order to perform, for example, clean-up tasks. It should, however, be raised again to allow the interpreter to exit as expected. Not re-raising such an exception could lead to undesired behaviour.

A bare except: statement, i.e. an except block without any exception class, is equivalent to except BaseException. Both statements will catch every exception, including SystemExit. It is recommended to catch a more specific exception instead. If that is not possible, the exception should be raised again.

It is also a good idea to re-raise the KeyboardInterrupt exception. Similarly to SystemExit, KeyboardInterrupt is used to signal the interpreter to exit. Not re-raising such an exception could also lead to undesired behaviour.

try:
    ...
except SystemExit:  # Noncompliant: the SystemExit exception is not re-raised.
    pass

try:
    ...
except BaseException:  # Noncompliant: BaseExceptions encompass SystemExit exceptions and should be re-raised.
    pass

try:
    ...
except:  # Noncompliant: exceptions caught by this statement should be re-raised or a more specific exception should be caught.
    pass

inplace=True should not be used when modifying a Pandas DataFrame

Using inplace=True when modifying a Pandas DataFrame means that the method will modify the DataFrame in place, rather than returning a new object:

df.an_operation(inplace=True)

Silly equality checks should not be made

In some cases a comparison with the operators == or != will always return True or always return False. When this happens, the comparison and all its dependent code can simply be removed. This includes:

  • comparing unrelated builtin types such as string and integer.

  • comparing class instances which do not implement __eq__ or __ne__ to an object of a different type (builtin or from an unrelated class which also doesn't implement __eq__ or __ne__).

foo = 1 == "1"  # Noncompliant. Always False.

foo = 1 != "1"  # Noncompliant. Always True.

class A:
    pass

myvar = A() == 1  # Noncompliant. Always False.
myvar = A() != 1  # Noncompliant. Always True.

Assertions should not fail or succeed unconditionally

Assertions are meant to detect when code behaves as expected. An assertion which fails or succeeds all the time does not achieve this. Either it is redundant and should be removed to improve readability, or it is a mistake and the assertion should be corrected.

This rule raises an issue when an assertion method is given parameters which will make it succeed or fail all the time. It covers three cases:

  • an assert statement or a unittest's assertTrue or assertFalse method is called with a value which will be always True or always False.

  • a unittest's assertIsNotNone or assertIsNone method is called with a value which will always be None or never be None.

  • a unittest's assertIsNot or assertIs method is called with a literal expression creating a new object every time (ex: [1, 2, 3]).

import unittest

class MyTestCase(unittest.TestCase):
    def expect_not_none(self):
        self.assertIsNotNone(round(1.5))  # Noncompliant: This assertion always succeeds because "round" returns a number, not None.

    def helper_compare(self, param):
        self.assertIs(param, [1, 2, 3])  # Noncompliant: This assertion always fails because [1, 2, 3] creates a new object.

Function parameters initial values should not be ignored

While it is technically correct to assign to parameters from within function bodies, doing so before the parameter value is read is likely a bug. Instead, initial values of parameters should be, if not treated as read-only, then at least read before reassignment.

def foo(strings, param):
    param = 1  # Noncompliant

The method __ne__ should not be implemented without also implementing __eq__

Implementing the special method __ne__ is not equivalent to implementing the special method __eq__. By default, __ne__ will call __eq__ and negate the result, but the default implementation of __eq__ does not call __ne__.

This rule raises an issue when the special method __ne__ is implemented but the __eq__ method is not.

class Ne:
    def __ne__(self, other):   # Noncompliant.
        return False

myvar = Ne() == 1  # False. __ne__ is not called
myvar = 1 == Ne()  # False. __ne__ is not called
myvar = Ne() != 1  # False
myvar = 1 != Ne()  # False

__iter__ should return an iterator

An iterable object is an object capable of returning its members one at a time. To do so, it must define an __iter__ method that returns an iterator.

The iterator protocol specifies that, in order to be a valid iterator, an object must define a __next__ and an __iter__ method (because iterators are also iterable).

Defining an __iter__ method that returns anything other than an iterator will raise a TypeError as soon as the iteration begins.

Note that generators and generator expressions have both __next__ and __iter__ methods generated automatically.

class MyIterable:
    def __init__(self, values):
        self._values = values

    def __iter__(self):
        return None  # Noncompliant: Not a valid iterator

\ should only be used as an escape character outside of raw strings

Typically, backslashes are seen only as part of escape sequences. Therefore, the use of a backslash outside of a raw string or escape sequence looks suspiciously like a broken escape sequence.

The characters recognized as escapable are: \a, \b, \f, \n, \r, \t, \v, \o (octal), \x (hex), \' and \".

s = "Hello \world."
t = "Nice to \ meet you"
u = "Let's have \ lunch"

Unread private attributes should be removed

Python has no real private attribute. Every attribute is accessible. There are however two conventions indicating that an attribute is not meant to be “public”:

  • attributes with a name starting with a single underscore (ex: _myattribute) should be seen as non-public and might change without prior notice. They should not be used by third-party libraries or software. It is ok to use those attributes inside the library defining them but it should be done with caution.

  • “class-private” attributes have a name starting with at least two underscores and ending with at most one underscore. These attributes' names will be automatically mangled to avoid collision with subclasses' attributes. For example __myattribute will be renamed as _classname__myattribute, where classname is the attribute's class name without its leading underscore(s). They shouldn't be used outside of the class defining the attribute.

This rule raises an issue when a class-private attribute (two leading underscores, max one underscore at the end) is never read inside the class. It optionally raises an issue on unread attributes prefixed with a single underscore. Both class attributes and instance attributes will raise an issue.

class Noncompliant:
    _class_attr = 0  # Noncompliant if enable_single_underscore_issues is enabled
    __mangled_class_attr = 1  # Noncompliant

    def __init__(self, value):
        self._attr = 0  # Noncompliant if enable_single_underscore_issues is enabled
        self.__mangled_attr = 1  # Noncompliant

    def compute(self, x):
        return x * x

Unpacking should be done with the same number of variables as there are elements in the iterable

In Python, the unpacking assignment is a powerful feature that allows you to assign multiple values to multiple variables in a single statement.

The basic rule for the unpacking assignment is that the number of variables on the left-hand side must be equal to the number of elements in the iterable. If this is not respected, a ValueError will be produced at runtime.

def foo(param):
    ls = [1, 2, 3]
    x, y = ls # Noncompliant: 'ls' contains more elements than there are variables on the left-hand side
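
A minimal compliant sketch where the number of variables matches the number of elements:

def foo(param):
    ls = [1, 2, 3]
    x, y, z = ls  # Compliant: three variables for three elements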

The axis argument should be specified when using TensorFlow's reduction operations

The result of TensorFlow's reduction operations (e.g. tf.math.reduce_sum, tf.math.reduce_std) highly depends on the shape of the Tensor provided.

import tensorflow as tf

x = tf.constant([[1, 1, 1], [1, 1, 1]])
tf.math.reduce_sum(x)

Caught Exceptions must derive from BaseException

In Python 3’s except statement, attempting to catch an object that does not derive from BaseException will raise a TypeError.

In order to catch multiple exceptions in an except statement, a tuple of exception classes should be provided. Trying to catch multiple exceptions with a list or a set will raise a TypeError.

If you are about to create a custom exception class, note that custom exceptions should inherit from Exception, rather than BaseException.

BaseException is the base class for all built-in exceptions in Python, including system-exiting exceptions like SystemExit or KeyboardInterrupt, which are typically not meant to be caught. On the other hand, Exception is intended for exceptions that are expected to be caught, which is generally the case for user-defined exceptions. See PEP 352 for more information.

To fix this issue, make sure the expression used in an except statement is an exception which derives from BaseException/Exception, or a tuple of such exceptions.

class CustomException(object):
    """An Invalid exception class."""
    pass

try:
    ...
except CustomException:  # Noncompliant: this custom exception does not derive from BaseException or Exception.
    print("exception")

try:
    ...
except [TypeError, ValueError]:  # Noncompliant: list of exceptions, only tuples are valid.
    print("exception")

Only strings should be listed in __all__

The __all__ property of a module is used to define the list of names that will be imported when performing a wildcard import of this module, i.e. when from mymodule import * is used.

In the following example:

# mymodule.py
def foo(): ...
def bar(): ...
__all__ = ["foo"]

__slots__ should not be used in old-style classes

A class without an explicit extension of object (class ClassName(object)) is considered an old-style class, and __slots__ declarations are ignored in old-style classes. Having such a declaration in an old-style class could be confusing for maintainers and lead them to make false assumptions about the class.

Copy
class A:
    __slots__ = ["id"]  # Noncompliant; this is ignored

    def __init__(self):
        self.id = id
        self.name = "name"  # name wasn't declared in __slots__ but there's no error

a = A()

Function returns should have type hints

Being a dynamically typed language, the Python interpreter only does type checking during runtime. Getting the typing right is important as certain operations may result in a TypeError.

Type hints can be used to clarify the expected return type of a function, enabling developers to better document its contract. Applying them consistently makes the code easier to read and understand.

In addition, type hints allow some development environments to offer better autocompletion and improve the precision of static analysis tools.

Copy
def hello(name):
    return 'Hello ' + name
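A possible compliant version (sketch) adds a return type hint:

Copy
def hello(name) -> str:
    return 'Hello ' + name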

Method names should comply with a naming convention

Shared naming conventions allow teams to collaborate efficiently.

This rule raises an issue when a method name does not match a provided regular expression.

For example, with the default provided regular expression ^[a-z_][a-z0-9_]*$, the method:

Copy
class MyClass:
    def MyMethod(a, b):  # Noncompliant
        ...

Django models should define a __str__ method

The __str__ method in Django models is used to represent the model instance as a string. For example, the return value of this method will be inserted in a template when displaying an object in the Django admin site. Without this method, the model instance will be represented by its object identifier, which is not meaningful to end-users. This can result in confusion and make debugging more difficult.

Copy
from django.db import models

class MyModel(models.Model):
    name = models.CharField(max_length=100)
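A possible compliant version (sketch) defines __str__ so instances render meaningfully:

Copy
from django.db import models

class MyModel(models.Model):
    name = models.CharField(max_length=100)

    def __str__(self):
        return self.name  # a human-readable representation of the instance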

Doubled prefix operators not and ~ should not be used

The repetition of a prefix operator (not or ~) is usually a typo. The second operator invalidates the first one:

Copy
a = False
b = ~~a # Noncompliant: equivalent to "a"

Type checks shouldn't be confusing

Checking that variable X has type T with type annotations implies that X’s value is of type T or a subtype of T. After such a check, it is a good practice to limit actions on X to those allowed by type T, even if a subclass of T allows different actions. Doing otherwise will confuse your fellow developers.

Just to be clear, it is common in python to perform an action without checking first if it is possible (see “Easier to ask for forgiveness than permission.”). However when type checks are performed, they should not contradict the following actions.

This rule raises an issue when an action performed on a variable might be possible, but it contradicts a previous type check. The list of checked actions corresponds to rules S2159, S3403, S5607, S5756, S5644, S3862, S5797, S5795 and S5632. These other rules only detect cases where the type of a variable is certain, i.e. it cannot be a subclass.

Copy
def add_the_answer(param: str):
    return param + 42  # Noncompliant. Fix this "+" operation; the type annotation on "param" suggests that operands have incompatible types.

# Note: In practice it is possible to create a class inheriting from both "str" and "int", but this would be a very confusing design.

Names of regular expressions named groups should be used

Why use named groups only to never use any of them later on in the code?

This rule raises issues every time named groups are:

  • referenced while not defined;

  • defined but called elsewhere in the code by their number instead.

Copy
import re

def foo():
    pattern = re.compile(r"(?P<a>.)")
    matches = pattern.match("abc")
    g1 = matches.group("b")  # Noncompliant - group "b" is not defined
    g2 = matches.group(1)    # Noncompliant - Directly use 'a' instead of its group number.
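A possible compliant version (sketch) references the group by the name it was given:

Copy
import re

def foo():
    pattern = re.compile(r"(?P<a>.)")
    matches = pattern.match("abc")
    g1 = matches.group("a")  # Compliant: the named group is used by its name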

Control flow statements if, for, while, try and with should not be nested too deeply

Nested control flow statements if, for, while, try, and with are often key ingredients in creating what’s known as “Spaghetti code”. This code smell can make your program difficult to understand and maintain.

When numerous control structures are placed inside one another, the code becomes a tangled, complex web. This significantly reduces the code’s readability and maintainability, and it also complicates the testing process.

Copy
if condition1:                       # Compliant - depth = 1
    # ...
    if condition2:                   # Compliant - depth = 2
        # ...
        for i in range(10):          # Compliant - depth = 3
            # ...
            if condition3:           # Compliant - depth = 4
                if condition4:       # Non-Compliant - depth = 5, which exceeds the limit
                    if condition5:   # Depth = 6, exceeding the limit, but issues are only reported on depth = 5
                        # ...

Conditional expressions should not be nested

Nested conditionals are hard to read and can make the order of operations complex to understand.

Copy
class Job:
    @property
    def readable_status(self):
        return "Running" if self.is_running else "Failed" if self.errors else "Succeeded"  # Noncompliant

A subclass should not be in the same except statement as a parent class

In Python it is possible to catch multiple types of exception in a single except statement using a tuple of the exceptions.

Repeating an exception class in a single except statement will not fail but it does not have any effect. Either the exception class is not the one which should be caught, or it is duplicated code which should be removed.

Having a subclass and a parent class in the same except statement does not provide any benefit either. It is enough to keep only the parent class.

Copy
try:
    ...
except (TypeError, TypeError):  # Noncompliant: duplicated code or incorrect exception class.
    print("Foo")

try:
    ...
except (NotImplementedError, RuntimeError):  # Noncompliant: NotImplementedError inherits from RuntimeError.
    print("Foo")

tf.Variable objects should be singletons when created inside of a tf.function

A tensorflow.function only supports singleton tensorflow.Variable objects. This means the variable must be created on the first call of the tensorflow.function and reused across subsequent calls. Creating a new tensorflow.Variable on every call will raise a ValueError.

Copy
import tensorflow as tf

@tf.function
def f(x):
    v = tf.Variable(1.0)
    return v
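A possible compliant pattern (sketch) creates the variable once, outside the traced function, and reuses it:

Copy
import tensorflow as tf

v = tf.Variable(1.0)  # created once, outside the tf.function

@tf.function
def f(x):
    return v * x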

__future__ imports should be the first statements in a module

Importing a feature from the future module turns on that feature from a future version of Python in your module. The purpose is to allow you to gradually transition to the new features or incompatible changes in future language versions, rather than having to make the entire jump at once.

Because such changes must be applied to the entirety of a module to work, putting such imports anywhere but in the beginning of the module doesn’t make sense. It would mean applying those restrictions to only part of your code. Because that would lead to inconsistencies and massive confusion, it’s not allowed.

Copy
name = "John"

from __future__ import division # Noncompliant

except clauses should do more than raise the same issue

An except clause that only rethrows the caught exception has the same effect as omitting the except altogether and letting it bubble up automatically.

Copy
a = {}
try:
    a[5]
except KeyError:
    raise  # Noncompliant
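A possible compliant version (sketch) either drops the try/except entirely or does meaningful work, such as logging, before re-raising:

Copy
import logging

a = {}
try:
    a[5]
except KeyError:
    logging.exception("Key 5 is missing")  # the handler adds value before re-raising
    raise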

numpy.random.Generator should be preferred to numpy.random.RandomState

Using a predictable seed is a common best practice when using NumPy to create reproducible results. To that end, using np.random.seed(number) to set the seed of the global numpy.random.RandomState has long been the most common solution.

numpy.random.RandomState and its associated methods rely on a global state, which may be problematic when threads or other forms of concurrency are involved. The global state may be altered and the global seed may be reset at various points in the program (for instance, through an imported package or script), which would lead to irreproducible results.

Instead, the preferred best practice to generate reproducible pseudorandom numbers is to instantiate a numpy.random.Generator object with a seed and reuse it in different parts of the code. This avoids the reliance on a global state. Whenever a new seed is needed, a new generator may be created instead of mutating a global state.

Below is the list of legacy functions and their alternatives:

| Legacy function name | numpy.random.Generator alternative |
| --- | --- |
| numpy.random.RandomState.seed | numpy.random.default_rng |
| numpy.random.RandomState.rand | numpy.random.Generator.random |
| numpy.random.RandomState.randn | numpy.random.Generator.standard_normal |
| numpy.random.RandomState.randint | numpy.random.Generator.integers |
| numpy.random.RandomState.random_integers | numpy.random.Generator.integers |
| numpy.random.RandomState.random_sample | numpy.random.Generator.random |
| numpy.random.RandomState.choice | numpy.random.Generator.choice |
| numpy.random.RandomState.bytes | numpy.random.Generator.bytes |
| numpy.random.RandomState.shuffle | numpy.random.Generator.shuffle |
| numpy.random.RandomState.permutation | numpy.random.Generator.permutation |
| numpy.random.RandomState.beta | numpy.random.Generator.beta |
| numpy.random.RandomState.binomial | numpy.random.Generator.binomial |
| numpy.random.RandomState.chisquare | numpy.random.Generator.chisquare |
| numpy.random.RandomState.dirichlet | numpy.random.Generator.dirichlet |
| numpy.random.RandomState.exponential | numpy.random.Generator.exponential |
| numpy.random.RandomState.f | numpy.random.Generator.f |
| numpy.random.RandomState.gamma | numpy.random.Generator.gamma |
| numpy.random.RandomState.geometric | numpy.random.Generator.geometric |
| numpy.random.RandomState.gumbel | numpy.random.Generator.gumbel |
| numpy.random.RandomState.hypergeometric | numpy.random.Generator.hypergeometric |
| numpy.random.RandomState.laplace | numpy.random.Generator.laplace |
| numpy.random.RandomState.logistic | numpy.random.Generator.logistic |
| numpy.random.RandomState.lognormal | numpy.random.Generator.lognormal |
| numpy.random.RandomState.logseries | numpy.random.Generator.logseries |
| numpy.random.RandomState.multinomial | numpy.random.Generator.multinomial |
| numpy.random.RandomState.multivariate_normal | numpy.random.Generator.multivariate_normal |
| numpy.random.RandomState.negative_binomial | numpy.random.Generator.negative_binomial |
| numpy.random.RandomState.noncentral_chisquare | numpy.random.Generator.noncentral_chisquare |
| numpy.random.RandomState.noncentral_f | numpy.random.Generator.noncentral_f |
| numpy.random.RandomState.normal | numpy.random.Generator.normal |
| numpy.random.RandomState.pareto | numpy.random.Generator.pareto |
| numpy.random.RandomState.poisson | numpy.random.Generator.poisson |
| numpy.random.RandomState.power | numpy.random.Generator.power |
| numpy.random.RandomState.rayleigh | numpy.random.Generator.rayleigh |
| numpy.random.RandomState.standard_cauchy | numpy.random.Generator.standard_cauchy |
| numpy.random.RandomState.standard_exponential | numpy.random.Generator.standard_exponential |
| numpy.random.RandomState.standard_gamma | numpy.random.Generator.standard_gamma |
| numpy.random.RandomState.standard_normal | numpy.random.Generator.standard_normal |
| numpy.random.RandomState.standard_t | numpy.random.Generator.standard_t |
| numpy.random.RandomState.triangular | numpy.random.Generator.triangular |
| numpy.random.RandomState.uniform | numpy.random.Generator.uniform |
| numpy.random.RandomState.vonmises | numpy.random.Generator.vonmises |
| numpy.random.RandomState.wald | numpy.random.Generator.wald |
| numpy.random.RandomState.weibull | numpy.random.Generator.weibull |
| numpy.random.RandomState.zipf | numpy.random.Generator.zipf |
| numpy.random.beta | numpy.random.Generator.beta |
| numpy.random.binomial | numpy.random.Generator.binomial |
| numpy.random.bytes | numpy.random.Generator.bytes |
| numpy.random.chisquare | numpy.random.Generator.chisquare |
| numpy.random.choice | numpy.random.Generator.choice |
| numpy.random.dirichlet | numpy.random.Generator.dirichlet |
| numpy.random.exponential | numpy.random.Generator.exponential |
| numpy.random.f | numpy.random.Generator.f |
| numpy.random.gamma | numpy.random.Generator.gamma |
| numpy.random.geometric | numpy.random.Generator.geometric |
| numpy.random.gumbel | numpy.random.Generator.gumbel |
| numpy.random.hypergeometric | numpy.random.Generator.hypergeometric |
| numpy.random.laplace | numpy.random.Generator.laplace |
| numpy.random.logistic | numpy.random.Generator.logistic |
| numpy.random.lognormal | numpy.random.Generator.lognormal |
| numpy.random.logseries | numpy.random.Generator.logseries |
| numpy.random.multinomial | numpy.random.Generator.multinomial |
| numpy.random.multivariate_normal | numpy.random.Generator.multivariate_normal |
| numpy.random.negative_binomial | numpy.random.Generator.negative_binomial |
| numpy.random.noncentral_chisquare | numpy.random.Generator.noncentral_chisquare |
| numpy.random.noncentral_f | numpy.random.Generator.noncentral_f |
| numpy.random.normal | numpy.random.Generator.normal |
| numpy.random.pareto | numpy.random.Generator.pareto |
| numpy.random.permutation | numpy.random.Generator.permutation |
| numpy.random.poisson | numpy.random.Generator.poisson |
| numpy.random.power | numpy.random.Generator.power |
| numpy.random.rand | numpy.random.Generator.random |
| numpy.random.randint | numpy.random.Generator.integers |
| numpy.random.randn | numpy.random.Generator.standard_normal |
| numpy.random.random | numpy.random.Generator.random |
| numpy.random.random_integers | numpy.random.Generator.integers |
| numpy.random.random_sample | numpy.random.Generator.random |
| numpy.random.ranf | numpy.random.Generator.random |
| numpy.random.rayleigh | numpy.random.Generator.rayleigh |
| numpy.random.sample | numpy.random.Generator.random |
| numpy.random.seed | numpy.random.default_rng |
| numpy.random.shuffle | numpy.random.Generator.shuffle |
| numpy.random.standard_cauchy | numpy.random.Generator.standard_cauchy |
| numpy.random.standard_exponential | numpy.random.Generator.standard_exponential |
| numpy.random.standard_gamma | numpy.random.Generator.standard_gamma |
| numpy.random.standard_normal | numpy.random.Generator.standard_normal |
| numpy.random.standard_t | numpy.random.Generator.standard_t |
| numpy.random.triangular | numpy.random.Generator.triangular |
| numpy.random.uniform | numpy.random.Generator.uniform |
| numpy.random.vonmises | numpy.random.Generator.vonmises |
| numpy.random.wald | numpy.random.Generator.wald |
| numpy.random.weibull | numpy.random.Generator.weibull |
| numpy.random.zipf | numpy.random.Generator.zipf |

Copy
import numpy as np

def foo():
    np.random.seed(42)
    x = np.random.randn()  # Noncompliant: this relies on numpy.random.RandomState, which is deprecated
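A possible compliant version (sketch) instantiates a seeded Generator and draws from it:

Copy
import numpy as np

def foo():
    rng = np.random.default_rng(42)
    x = rng.standard_normal()  # Compliant: no reliance on the global RandomState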

Docstrings should be defined

A string literal that is the first statement in a module, function, class, or method is a docstring. A docstring should document what a caller needs to know about the code. Information about what it does, what it returns, and what it requires are all valid candidates for documentation. Well written docstrings allow callers to use your code without having to first read it and understand its logic.

By convention, docstrings are enclosed in three sets of double-quotes.

Copy
def my_function(a, b):  # Noncompliant: the function has no docstring
    ...
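A possible compliant version (a sketch; the docstring text and body are illustrative) documents the function:

Copy
def my_function(a, b):
    """Return the sum of a and b (illustrative docstring)."""
    return a + b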

tensorflow.function should not be recursive

When defining a tensorflow.function it is generally a bad practice to make this function recursive. TensorFlow does not support recursive tensorflow.function definitions and will, in the majority of cases, throw an exception. It is also possible that the execution of such a function succeeds, but with multiple tracings, which has strong performance implications. When executing a tensorflow.function, the code is split into two distinct stages. The first stage, called tracing, creates a new tensorflow.Graph and runs the Python code normally, but defers the execution of TensorFlow operations (e.g. adding two Tensors). These operations are added to the graph without being run. The second stage, which is much faster than the first, runs everything that was deferred previously. Depending on the input of the tensorflow.function, the first stage may not be needed (see the rules of tracing). Skipping this first stage is what gives TensorFlow its high performance.

Having a recursive tensorflow.function prevents the user from benefiting from TensorFlow’s capabilities.

Copy
import tensorflow as tf

@tf.function
def factorial(n):
    if n == 1:
        return 1
    else:
        return n * factorial(n - 1)  # Noncompliant: the function is recursive

Default parameter values should be immutable

While the assignment of default parameter values is typically a good thing, it can go very wrong very quickly when mutable objects are used. That’s because a new instance of the object is not created for each function invocation. Instead, all invocations share the same instance, and the changes made for one caller are made for all!

Copy
def get_attr_array(obj, arr=[]):  # Noncompliant
    props = (name for name in dir(obj) if not name.startswith('_'))
    arr.extend(props)  # after only a few calls, this is a big array!
    return arr
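A possible compliant pattern (sketch) uses None as the default and creates a fresh list inside the function:

Copy
def get_attr_array(obj, arr=None):
    if arr is None:
        arr = []  # a new list is created for each call
    props = (name for name in dir(obj) if not name.startswith('_'))
    arr.extend(props)
    return arr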

Functions returns should not be invariant

When a function is designed to return an invariant value, it may be poor design, but it shouldn’t adversely affect the outcome of your program. However, when it happens on all paths through the logic, it is surely a bug.

This rule raises an issue when a function contains several return statements that all return the same value.

Copy
def foo(a):  # Noncompliant
    b = 12
    if a == 1:
        return b
    return b

Exception and BaseException should not be raised

Raising instances of Exception and BaseException will have a negative impact on any code trying to catch these exceptions.

From a consumer perspective, it is generally a best practice to only catch exceptions you intend to handle. Other exceptions should ideally not be caught and let to propagate up the stack trace so that they can be dealt with appropriately. When a generic exception is thrown, it forces consumers to catch exceptions they do not intend to handle, which they then have to re-raise.

Besides, when working with a generic type of exception, the only way to distinguish between multiple exceptions is to check their message, which is error-prone and difficult to maintain. Legitimate exceptions may be unintentionally silenced and errors may be hidden.

For instance, if an exception such as SystemExit is caught and not re-raised, it will prevent the program from stopping.

When raising an exception, it is therefore recommended to raise the most specific exception possible so that it can be handled intentionally by consumers.

Copy
def check_value(value):
    if value < 0:
        raise BaseException("Value cannot be negative")  # Noncompliant: this will be difficult for consumers to handle
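A possible compliant version (sketch) raises a more specific built-in exception:

Copy
def check_value(value):
    if value < 0:
        raise ValueError("Value cannot be negative")  # Compliant: consumers can catch ValueError intentionally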

The input_shape parameter should not be specified for tf.keras.Model subclasses

Keras provides a full-featured model class called tensorflow.keras.Model. It inherits from tensorflow.keras.layers.Layer, so a Keras model can be used and nested in the same way as Keras layers. Keras models come with extra functionality that makes them easy to train, evaluate, load, save, and even train on multiple machines.

As the tensorflow.keras.Model class inherits from tensorflow.keras.layers.Layer, you do not need to specify input_shape in a subclassed model; this argument will be ignored.

Copy
import tensorflow as tf

class MyModel(tf.keras.Model):
    def __init__(self):
        super(MyModel, self).__init__(input_shape=...)  # Noncompliant: this parameter will be ignored

Function parameters should have type hints

Being a dynamically typed language, the Python interpreter only does type checking during runtime. Getting the typing right is important as certain operations may result in a TypeError.

Type hints can be used to clarify the expected parameters of a function, enabling developers to better document its contract. Applying them consistently makes the code easier to read and understand.

In addition, type hints allow some development environments to offer better autocompletion and improve the precision of static analysis tools.

Copy
def hello(name) -> str:
    return 'Hello ' + name
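A possible compliant version (sketch) annotates the parameter as well:

Copy
def hello(name: str) -> str:
    return 'Hello ' + name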

Generic functions should be defined using the type parameter syntax

Prior to Python 3.12 functions using generic types were created as follows:

Copy
from typing import TypeVar

_T = TypeVar("_T")

def func(a: _T, b: _T) -> _T:
    ...
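With Python 3.12's type parameter syntax (PEP 695), a possible compliant version (sketch) is:

Copy
def func[T](a: T, b: T) -> T:
    ...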

Function parameters default values should not be modified or assigned

In Python, function parameters can have default values.

These default values are expressions which are evaluated when the function is defined, i.e. only once. The same default value will be used every time the function is called. Therefore, modifying it will have an effect on every subsequent call. This can lead to confusing bugs.

Copy
def myfunction(param=foo()):  # foo is called only once, when the function is defined.
    ...

New-style classes should be used

The new style of class creation, with the declaration of a parent class, created a unified object model in Python, so that the type of an instantiated class is equal to its class. In Python 2.2-2.7, this is not the case for old-style classes. In Python 3+ all classes are new-style classes. However, since the behavior can differ from 2.2+ to 3+, explicitly inheriting from object (if there is no better candidate) is recommended.

Copy
class MyClass():
    pass

String formatting should not lead to runtime errors

Formatting strings, either with the % operator or str.format method, requires a valid string and arguments matching this string’s replacement fields.

This rule raises an issue when formatting a string will raise an exception because the input string or arguments are invalid. Rule S3457 covers cases where no exception is raised and the resulting string is simply not formatted properly.

Copy
print('Error code %d' % '42')  # Noncompliant. Replace this value with a number as %d requires.

print('User {1} is not allowed to perform this action'.format('Bob'))  # Noncompliant. Replacement field numbering should start at 0.

print('User {0} has not been able to access {}'.format('Alice', 'MyFile'))  # Noncompliant. Use only manual or only automatic field numbering, don't mix them.

print('User {a} has not been able to access {b}'.format(a='Alice'))  # Noncompliant. Provide a value for field "b".

Using timezone-aware datetime objects should be preferred over using datetime.datetime.utcnow and datetime.datetime.utcfromtimestamp

Python’s datetime API provides several different ways to create datetime objects. One possibility is to use the datetime.datetime.utcnow or datetime.datetime.utcfromtimestamp functions. The issue with these two functions is that they are not time zone aware, even though their names would suggest otherwise.

Using these functions could cause issues, as they may not behave as expected. For example, the following assertion fails whenever the local time zone is not UTC:

Copy
from datetime import datetime
timestamp = 1571595618.0
date = datetime.utcfromtimestamp(timestamp)
date_timestamp = date.timestamp()

assert timestamp == date_timestamp
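A possible compliant version (sketch) builds a timezone-aware datetime, for which the round trip holds regardless of the local time zone:

Copy
from datetime import datetime, timezone

timestamp = 1571595618.0
date = datetime.fromtimestamp(timestamp, tz=timezone.utc)
assert timestamp == date.timestamp()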

The exec statement should not be used

Use of the exec statement can be dangerous and should be avoided. Moreover, the exec statement was removed in Python 3.0. Instead, the built-in exec() function can be used.

Use of the exec statement is strongly discouraged for several reasons such as:

  • Security Risks: Executing code from a string opens up the possibility of code injection attacks.

  • Readability and Maintainability: Code executed with the exec statement is often harder to read and understand since it is not explicitly written in the source code.

  • Performance Implications: The use of the exec statement can have performance implications since the code is compiled and executed at runtime.

  • Limited Static Analysis: Since the code executed with the exec statement is only known at runtime, static code analysis tools may not be able to catch certain errors or issues, leading to potential bugs.

Copy
exec 'print 1' # Noncompliant

Collections should not be modified while they are iterated

Iterating over a collection using a for loop in Python relies on iterators.

An iterator is an object that allows you to traverse a collection of elements, such as a list or a dictionary. Iterators are used in for loops to iterate over the elements of a collection one at a time.

When you create an iterator, it keeps track of the current position in the collection and provides a way to access the next element. The next() function is used to retrieve the next element from the iterator. When there are no more elements to iterate over, the next() function raises a StopIteration exception and the iteration stops.

It is important to note that iterators are designed to be read-only. Modifying a collection while iterating over it can cause unexpected behavior, as the iterator may skip over or repeat elements. A RuntimeError may also be raised in this situation, with the message changed size during iteration. Therefore, it is important to avoid modifying a collection while iterating over it to ensure that your code behaves as expected.

If you still want to modify the collection, it is best to use a second collection or to iterate over a copy of the original collection instead.

Copy
def my_fun(my_dict):
    for key in my_dict:
        if my_dict[key] == 'foo':
            my_dict.pop(key)  # Noncompliant: this will make the iteration unreliable
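A possible compliant version (sketch) iterates over a copy of the keys so the original dictionary can be modified safely:

Copy
def my_fun(my_dict):
    for key in list(my_dict):  # iterate over a snapshot of the keys
        if my_dict[key] == 'foo':
            my_dict.pop(key)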

Parentheses should not be used after certain keywords

Parentheses are not required after the assert, del, elif, except, for, if, in, not, raise, return, while, and yield keywords, and using them unnecessarily impairs readability. They should therefore be omitted.

Copy
x = 1
while (x < 10):
    print "x is now %d" % (x)
    x += 1

Static methods should not have self or cls arguments

Unlike class and instance methods, static methods don’t receive an implicit first argument. Nonetheless, naming the first argument self or cls guarantees confusion - either on the part of the original author, who may never understand why the arguments don’t hold the values they expected, or on that of future maintainers.

Copy
class MyClass:
    @staticmethod
    def s_meth(self, arg1, arg2):  # Noncompliant
        # ...

Return values from functions without side effects should not be ignored

When the call to a function doesn’t have any side effects, what is the point of making the call if the results are ignored? In such cases, either the function call is useless and should be dropped, or the source code doesn’t behave as expected.

This rule raises an issue when a builtin function or method that has no side effects is called and its result is not used.

Copy
myvar = "this is a multiline"
"message from {}".format(sender)  # Noncompliant. The formatted string is not used because the concatenation is not done properly.

The yield keyword should only be used in generators

As soon as the yield keyword is used, the enclosing method or function becomes a generator. Thus yield should never be used in a function or method which is not intended to be a generator.

This rule raises an issue when yield or yield from are used in a function or method which is not a generator because:

  • the function/method’s return type annotation is not [typing.Generator](https://docs.python.org/3/library/typing.html#typing.Generator)

  • it is a special method which can never be a generator (e.g. __init__).

Copy
from typing import Generator, List

class A:
    def __init__(self, value):
        self.value = value
        yield value  # Noncompliant

def mylist2() -> List[str]:
    yield ['string']  # Noncompliant. Return should be used instead of yield

def generator_ok() -> Generator[int, float, str]:
    sent = yield 42
    return '42'

Builtins should not be shadowed by local variables

Defining a variable with the same name as a built-in symbol will “shadow” it. That means that the builtin will no longer be accessible through its original name, having locally been replaced by the variable.

Shadowing a builtin makes the code more difficult to read and maintain. It may also be a source of bugs as you can reference the builtin by mistake.

It is sometimes acceptable to shadow a builtin to improve the readability of a public API or to support multiple versions of a library. In these cases, benefits are greater than the maintainability cost. This should, however, be done with care.

It is generally not a good practice to shadow builtins with variables which are local to a function or method. These variables are not public and can easily be renamed, thus reducing the confusion and making the code less error-prone.

Copy
def a_function():
    int = 42  # Noncompliant; int is a builtin

Calls should not be made to non-callable values

In order to be callable, a Python class should implement the __call__ method. Thanks to this method, an instance of this class will be callable as a function.

However, when making a call to a non-callable object, a TypeError will be raised.

In order to fix this issue, make sure that the object you are trying to call has a __call__ method.

Copy
class MyClass:
    pass

myvar = MyClass()
myvar()  # Noncompliant

none_var = None
none_var()  # Noncompliant
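A possible compliant version (sketch) makes the class callable by implementing __call__:

Copy
class MyClass:
    def __call__(self):
        print("called")

myvar = MyClass()
myvar()  # Compliant: prints "called"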

Asserts should not be used to check the parameters of a public method

An assert is inappropriate for parameter validation because assertions are disabled globally at the interpreter level when the application runs with optimized bytecode (-O and -OO command line switches). It means that the optimized version of the application would completely eliminate the intended checks.

This rule raises an issue when a public method uses one or more of its parameters with asserts.

Copy
class Shop:

    def setPrice(self, price):
        assert(price >= 0 and price <= MAX_PRICE)  # Noncompliant
        # Set the price
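A possible compliant version (a sketch, reusing the MAX_PRICE constant from the snippet above) validates the parameter with an explicit exception that survives optimized runs:

Copy
class Shop:

    def setPrice(self, price):
        if price < 0 or price > MAX_PRICE:
            raise ValueError("price must be between 0 and MAX_PRICE")
        # Set the price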

Function arguments should be passed only once

When a function is called, it accepts only one value per parameter. The Python interpreter will raise a SyntaxError when the same parameter is provided more than once, i.e. myfunction(a=1, a=2).

Other less obvious cases will also fail at runtime by raising a TypeError, when:

  • An argument is provided by value and position at the same time.

  • An argument is provided twice, once via unpacking and once by value or position.

Copy
def func(a, b, c):
    return a * b * c

func(6, 93, 31, c=62) # Noncompliant: argument "c" is duplicated

params = {'c':31}
func(6, 93, 31, **params) # Noncompliant: argument "c" is duplicated
func(6, 93, c=62, **params) # Noncompliant: argument "c" is duplicated

with statements should be used with context managers

The with statement is used to wrap the execution of a block with methods defined by a context manager. The context manager handles the entry into, and the exit from, the desired runtime context for the execution of the block of code. To do so, a context manager should have an __enter__ and an __exit__ method.

Executing the following block of code will print "Entering", then "Executing body", then "Exiting":

Copy
class MyContextManager:
    def __enter__(self):
        print("Entering")

    def __exit__(self, exc_type, exc_val, exc_tb):
        print("Exiting")


with MyContextManager():
    print("Executing body")

Wildcard imports should not be used

Importing every public name from a module using a wildcard (from mymodule import *) is a bad idea because:

  • It could lead to conflicts between names defined locally and the ones imported.

  • It reduces code readability as developers will have a hard time knowing where names come from.

  • It clutters the local namespace, which makes debugging more difficult.

Remember that imported names can change when you update your dependencies. A wildcard import that works today might be broken tomorrow.

Copy
# file: mylibrary/pyplot.py
try:
    from guiqwt.pyplot import *  # Ok
except Exception:
    from matplotlib.pyplot import *  # Ok

__exit__ should accept type, value, and traceback arguments

The __exit__ method is invoked with four arguments: self, type, value and traceback. Leave one of these out of the method declaration and the result will be a TypeError at runtime.

Copy
class MyClass:
    def __enter__(self):
        pass

    def __exit__(self, exc_type, exc_val):  # Noncompliant
        pass

The number and name of arguments passed to a function should match its parameters

Calling a function or a method with fewer or more arguments than expected will raise a TypeError. This is usually a bug and should be fixed.

Copy
######################
# Positional Arguments
######################

param_args = [1, 2, 3]
param_kwargs = {'x': 1, 'y': 2}

def func(a, b=1):
    print(a, b)

def positional_unlimited(a, b=1, *args):
    print(a, b, *args)

func(1)
func(1, 42)
func(1, 2, 3)  # Noncompliant. Too many positional arguments
func()  # Noncompliant. Missing positional argument for "a"

positional_unlimited(1, 2, 3, 4, 5)

def positional_limited(a, *, b=2):
    print(a, b)

positional_limited(1, 2)  # Noncompliant. Too many positional arguments


#############################
# Unexpected Keyword argument
#############################

def keywords(a=1, b=2, *, c=3):
    print(a, b, c)

keywords(1)
keywords(1, z=42)  # Noncompliant. Unexpected keyword argument "z"

def keywords_unlimited(a=1, b=2, *, c=3, **kwargs):
    print(a, b, kwargs)

keywords_unlimited(a=1, b=2, z=42)

#################################
# Mandatory Keyword argument only
#################################

def mandatory_keyword(a, *, b):
    print(a, b)

mandatory_keyword(1, b=2)
mandatory_keyword(1)  # Noncompliant. Missing keyword argument "b"

break and continue should not be used outside a loop

break and continue are control flow statements used inside of loops. break is used to break out of its innermost enclosing loop and continue will continue with the next iteration.

The example below illustrates the use of break in a while loop:

Copy
n = 1
while n < 10:
    if n % 3 == 0:
        print("Found a number divisible by 3", n)
        break
    n = n + 1

Functions and methods should only return expected values

Python allows developers to customize how code is interpreted by defining special methods (also called magic methods). For example, it is possible to define an object's own truthiness or falsiness by overriding the __bool__ method. It is invoked when the built-in bool() function is called on an object. The bool() function returns True or False based on the truth value of the object.

The Python interpreter will call these methods when performing the operation they’re associated with. Each special method expects a specific return type. Calls to a special method will throw a TypeError if its return type is incorrect.

An issue will be raised when one of the following methods doesn’t return the indicated type:

  • __bool__ method should return bool

  • __index__ method should return integer

  • __repr__ method should return string

  • __str__ method should return string

  • __bytes__ method should return bytes

  • __hash__ method should return integer

  • __format__ method should return string

  • __getnewargs__ method should return tuple

  • __getnewargs_ex__ method should return something which is of the form tuple(tuple, dict)

Copy
class MyClass:
    def __bool__(self):
        return 0  # Noncompliant: Return value of type bool here.

obj1 = MyClass()
print(bool(obj1))  # TypeError: __bool__ should return bool, returned int
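A possible compliant version (sketch) returns an actual bool:

Copy
class MyClass:
    def __bool__(self):
        return False  # Compliant: a bool is returned

obj1 = MyClass()
print(bool(obj1))  # prints "False"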

Item operations should be done on objects supporting them

Getting, setting and deleting items using square brackets requires the accessed object to have special methods:

  • Getting items such as my_variable[key] requires my_variable to have the __getitem__ method, or the __class_getitem__ method if my_variable is a class.

  • Setting items such as my_variable[key] = 42 requires my_variable to have the __setitem__ method.

  • Deleting items such as del my_variable[key] requires my_variable to have the __delitem__ method.

Performing these operations on an object that doesn’t have the corresponding method will result in a TypeError.

To fix this issue, make sure that the class for which you are trying to perform item operations implements the required methods.

Copy
del (1, 2)[0]  # Noncompliant: tuples are immutable
(1, 2)[0] = 42  # Noncompliant
(1, 2)[0]

class A:
    def __init__(self, values):
        self._values = values

a = A([0, 1, 2])

a[0]  # Noncompliant
del a[0]  # Noncompliant
a[0] = 42  # Noncompliant

class B:
    pass

B[0]  # Noncompliant

Replacement strings should reference existing regular expression groups

The regex function re.sub can be used to perform a search and replace based on regular expression matches. The repl parameter can contain references to capturing groups used in the pattern parameter. This can be achieved with \n to reference the n’th group.

When referencing a nonexistent group an error will be thrown for Python < 3.5 or replaced by an empty string for Python >= 3.5.

Copy
re.sub(r"(a)(b)(c)", r"\1, \9, \3", "abc") # Noncompliant - result is an re.error: invalid group reference

The first argument to a super call should be the name of the calling class

The first argument to super must be the name of the class making the call. If it’s not, the result will be a runtime error.

Copy
class Person(object):
    ...

class PoliceOfficer(Person):
    def __init__(self, name):
        super(Person, self).__init__(name)  # Noncompliant: the first argument should be "PoliceOfficer"

null=True should not be used on string-based fields in Django models

Using “null=True” on string-based fields can lead to inconsistent and unexpected behavior. In Django, “null=True” allows the field to have a NULL value in the database. However, the Django convention to represent the absence of data for a string is an empty string. Having two ways to represent the absence of data can cause problems when querying and filtering on the field. For example, if a CharField with “null=True” has a value of NULL in the database, querying for an empty string will not return that object.

Copy
from django.db import models

class ExampleModel(models.Model):
    name = models.CharField(max_length=50, null=True)
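A possible compliant version (sketch) drops null=True so that the empty string is the single representation of missing data:

Copy
from django.db import models

class ExampleModel(models.Model):
    name = models.CharField(max_length=50)  # an empty string, not NULL, represents "no data"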

GraphQL introspection should be disabled in production

GraphQL introspection is a feature that allows client applications to query the schema of a GraphQL API at runtime. It provides a way for developers to explore and understand the available data and operations supported by the API.

This feature is a diagnostic tool that should only be used in the development phase as its presence also creates risks.

Clear documentation and API references should be considered better discoverability tools for a public GraphQL API.

Copy
from graphql_server.flask import GraphQLView

app.add_url_rule(
    "/api",
    view_func=GraphQLView.as_view(  # Noncompliant
        name="api",
        schema=schema,
    ),
)

Functions, methods and lambdas should not have too many parameters

Functions, methods, or lambdas with a long parameter list are difficult to use, as maintainers must figure out the role of each parameter and keep track of their position.

Copy
def set_coordinates(x1, y1, z1, x2, y2, z2):  # Noncompliant
    # ...

The safe flag should be set to False when serializing non-dictionary objects in Django JSON-encoded responses.

By default, only dictionary objects can be serialized in a Django JSON-encoded response. Before ECMAScript 5, serializing non-dictionary objects could lead to security vulnerabilities. Since most modern browsers implement ECMAScript 5, this vector of attack is no longer a threat and it is possible to serialize non-dictionary objects by setting the safe flag to False. However, if this flag is not set, a TypeError will be thrown by the serializer.

Despite this possibility, it is still recommended to serialize dictionary objects, as an API based on dict is generally more extensible and easier to maintain.

Copy
from django.http import JsonResponse
response = JsonResponse([1, 2, 3])

Walrus operator should not make code confusing

The walrus operator := (also known as “assignment expression”) should be used with caution as it can easily make code more difficult to understand and thus maintain. In such cases, it is advised to refactor the code and use an assignment statement (i.e. =) instead.

Reasons why it is better to avoid using the walrus operator in Python:

  • Readability: The walrus operator can lead to more complex and nested expressions, which might reduce the readability of the code, especially for developers who are not familiar with this feature.

  • Compatibility: If you are working on projects that need to be compatible with older versions of Python (before 3.8), you should avoid using the walrus operator, as it won’t be available in those versions.

Copy
v0 = (v1 := f(p))  # Noncompliant: Use an assignment statement ("=") instead; ":=" operator is confusing in this context
f'{(x:=10)}' # Noncompliant: Move this assignment out of the f-string; ":=" operator is confusing in this context

Loops without break should not have else clauses

The else clause of a loop is skipped when a break is executed in this loop. In other words, a loop with an else but no break statement will always execute the else part (unless of course an exception is raised or return is used). If this is what the developer intended, it would be much simpler to have the else statement removed and its body unindented. Thus having a loop with an else and no break is most likely an error.

Copy
from typing import List

def foo(elements: List[str]):
    for elt in elements:
        if elt.isnumeric():
            return elt
    else:  # Noncompliant: no break in the loop
        raise ValueError("List does not contain any number")

def bar(elements: List[str]):
    for elt in elements:
        if elt.isnumeric():
            return elt
    else:  # Noncompliant: no break in the loop
        raise ValueError("List does not contain any number")

Methods and properties that don't access instance data should be static

Class methods that don’t access instance data can and should be static because they yield more performant code.

To implement a static method in Python one should use either @classmethod or @staticmethod. A class method receives the class as implicit first argument, just like an instance method receives the instance. A static method does not receive an implicit first argument.

Copy
class Utilities:
    def do_the_thing(self, arg1, arg2, ...):  # Noncompliant
        # ...

Method overrides should not change contracts

Because a subclass instance may be used as an instance of the superclass, overriding methods should uphold the aspects of the superclass contract that relate to the Liskov Substitution Principle. Specifically, an overriding method should be callable with the same parameters as the overridden one.

The following modifications are OK:

  • Adding an optional parameter, i.e. with a default value, as long as they don’t change the order of positional parameters.

  • Renaming a positional-only parameter.

  • Reordering keyword-only parameters.

  • Adding a default value to an existing parameter.

  • Changing the default value of an existing parameter.

  • Extending the ways a parameter can be provided, i.e. changing a keyword-only or positional-only parameter to a keyword-or-positional parameter. This is only true if the order of positional parameters doesn’t change. New positional parameters should be placed at the end.

  • Adding a vararg parameter (*args).

  • Adding a keywords parameter (**kwargs).

The following modifications are not OK:

  • Removing parameters, even when they have default values.

  • Adding mandatory parameters, i.e. without a default value.

  • Removing the default value of a parameter.

  • Reordering parameters, except when they are keyword-only parameters.

  • Removing some ways of providing a parameter. If a parameter could be passed as keyword it should still be possible to pass it as keyword, and the same is true for positional parameters.

  • Removing a vararg parameter (*args).

  • Removing a keywords parameter (**kwargs).

This rule raises an issue when the signature of an overriding method does not accept the same parameters as the overridden one. Only instance methods are considered; class methods and static methods are ignored.

Copy
class ParentClass(object):
    def mymethod(self, param1):
        pass

class ChildClassRenamed(ParentClass):
    def mymethod(self, renamed):  # No issue but this is suspicious. Rename this parameter as "param1" or use positional-only arguments if possible.
        pass

All branches in a conditional structure should not have exactly the same implementation

Having all branches of an if chain with the same implementation indicates a problem.

In the following code:

Copy
if b == 0:  # Noncompliant
    do_one_more_thing()
elif b == 1:
    do_one_more_thing()
else:
    do_one_more_thing()

b = 4 if a > 12 else 4  # Noncompliant

Instance and class methods should have at least one positional parameter

Every instance method is expected to have at least one positional parameter. This parameter will reference the object instance on which the method is called. Calling an instance method which doesn’t have at least one parameter will raise a TypeError. By convention, this first parameter is usually named self.

Class methods, i.e. methods annotated with @classmethod, also require at least one parameter. The only difference is that they will receive the class itself instead of a class instance. By convention, this first parameter is usually named cls.

Copy
class MyClass:
    def instance_method():  # Noncompliant: "self" parameter is missing.
        print("instance_method")

    @classmethod
    def class_method():  # Noncompliant: "cls" parameter is missing.
        print("class_method")

Assert should not be called on a tuple literal

When tested for truthiness, a sequence or collection will evaluate to False if it is empty (its __len__ method returns 0) and to True if it contains at least one element.

Using the assert statement on a tuple literal will therefore always fail if the tuple is empty, and always succeed otherwise.

The assert statement does not take parentheses around its parameters. Calling assert(x, y) will test if the tuple (x, y) is True, which is always the case.

There are two possible fixes:

  • If your intention is to test the first value of the tuple and use the second value as a message, simply remove the parentheses.

  • If your intention is to check that every element of the tuple is True, test each value separately.

Copy
def test_values(a, b):
    assert (a, b)  # Noncompliant: will always be True
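A possible compliant version (a sketch; pick the variant that matches the intent described above):

Copy
def test_values(a, b):
    assert a, b  # test "a" and use "b" as the failure message
    assert a     # or test each value separately
    assert b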

Long suffix L should be upper case

The long suffix should always be written in uppercase, i.e. ‘L’, as the lowercase ‘l’ can easily be confused with the digit one ‘1’.

Copy
return 10l  # Noncompliant; easily confused with one zero one

The validate_indices argument should not be set for tf.gather function call

The tf.gather function allows you to gather slices from a tensor along a specified axis according to the indices provided. The validate_indices argument is deprecated and setting its value has no effect. Indices are always validated on CPU and never validated on GPU.

Copy
import tensorflow as tf

x = tf.constant([[1, 2], [3, 4]])
y = tf.gather(x, [1], validate_indices=True)  # Noncompliant: validate_indices is deprecated
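A possible compliant version (sketch) simply omits the deprecated argument:

Copy
import tensorflow as tf

x = tf.constant([[1, 2], [3, 4]])
y = tf.gather(x, [1])  # Compliant: validate_indices is not passed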

Special methods should have an expected number of parameters

Python allows developers to customize how code is interpreted by defining special methods (also called magic methods). For example, it is possible to override how the multiplication operator (a * b) will apply to instances of a class by defining in this class the __mul__ and __rmul__ methods. Whenever a multiplication operation is performed with this class, the Python interpreter will call one of these methods instead of performing the default multiplication.

Each special method expects a specific number of parameters. The Python interpreter will call these methods with those parameters. Calls to a special method will throw a TypeError if it is defined with an incorrect number of parameters.

Copy
class A:
    def __mul__(self, other, unexpected):  # Noncompliant: too many parameters
        return 42

    def __add__(self):  # Noncompliant: missing one parameter
        return 42

A() * 3  # TypeError: __mul__() missing 1 required positional argument: 'unexpected'
A() + 3  # TypeError: __add__() takes 1 positional argument but 2 were given

Octal escape sequences should not be used in regular expressions

Using octal escapes in regular expressions can create confusion with backreferences. Octal escapes are sequences of digits that represent a character in the ASCII table, and they are sometimes used to represent special characters in regular expressions. However, they can be easily mistaken for backreferences, which are also sequences of digits that represent previously captured groups. This confusion can lead to unexpected results or errors in the regular expression.

Copy
import re

match = re.match(r"\101", "A")
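A possible compliant version (sketch) uses an unambiguous escape for the same character:

Copy
import re

match = re.match(r"\x41", "A")  # Compliant: hexadecimal escape, no confusion with a backreference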

Unused class-private methods should be removed

A method that is never called is dead code, and should be removed. Cleaning out dead code decreases the size of the maintained codebase, making it easier to understand the program and preventing bugs from being introduced.

Python has no real private methods. Every method is accessible. There are however two conventions indicating that a method is not meant to be “public”:

  • methods with a name starting with a single underscore (e.g. _mymethod) should be seen as non-public and might change without prior notice. They should not be used by third-party libraries or software. It is ok to use those methods inside the library defining them, but it should be done with caution.

  • “class-private” methods have a name which starts with at least two underscores and ends with at most one underscore. These methods’ names will be automatically mangled to avoid collision with subclasses’ methods. For example __mymethod will be renamed as _classname__mymethod, where classname is the method’s class name without its leading underscore(s). These methods shouldn’t be used outside of their enclosing class.

This rule raises an issue when a class-private method (two leading underscores, max one underscore at the end) is never called inside the class. Class methods, static methods and instance methods will all raise an issue.

Copy
class Noncompliant:

    @classmethod
    def __mangled_class_method(cls):  # Noncompliant
        print("__mangled_class_method")

    @staticmethod
    def __mangled_static_method():  # Noncompliant
        print("__mangled_static_method")

    def __mangled_instance_method(self):  # Noncompliant
        print("__mangled_instance_method")