Python - 2
Learn about Python anti-patterns and how they help you write better code and avoid common pitfalls.
The `values` attribute and the `to_numpy()` method in pandas both provide a way to return a NumPy representation of the DataFrame. However, there are several reasons why the `to_numpy()` method is recommended over the `values` attribute:
Future compatibility: The `values` attribute is considered a legacy feature, while `to_numpy()` is the recommended method to extract data and is considered more future-proof.
Data type consistency: If the DataFrame has columns with different data types, NumPy will choose a common data type that can hold all the data. This may lead to loss of information, unexpected type conversions, or increased memory usage. `to_numpy()` allows you to select the common type manually by passing the `dtype` argument.
View vs copy: The `values` attribute can return either a view or a copy of the data depending on whether the data needs to be transposed. This can lead to confusion when modifying the extracted data. On the other hand, `to_numpy()` has a `copy` argument that allows you to force it to always return a new NumPy array, ensuring that any changes you make won't affect the original DataFrame.
Missing values control: `to_numpy()` allows you to specify the default value used for missing values in the DataFrame, while `values` will always use `numpy.nan` for missing values.
import pandas as pd
df = pd.DataFrame({
'X': ['A', 'B', 'A', 'C'],
'Y': [10, 7, 12, 5]
})
arr = df.values # Noncompliant: using the 'values' attribute is not recommended
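A compliant version (a sketch, not part of the original rule text) converts the same DataFrame with `to_numpy()`, where the `dtype` and `copy` arguments make the conversion explicit:

```python
import pandas as pd

df = pd.DataFrame({
    'X': ['A', 'B', 'A', 'C'],
    'Y': [10, 7, 12, 5]
})

# Compliant: to_numpy() makes the conversion explicit and configurable
arr = df.to_numpy()

# dtype picks the common type manually; copy=True always returns a fresh array
arr_copy = df.to_numpy(dtype=object, copy=True)
```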
For a specific operator, two types are considered incompatible if no built-in operations between those types exist and none of the operands has implemented the operator’s corresponding special methods. Performing such an operation on incompatible types will raise a TypeError.
Calling an operator in Python is equivalent to calling a special method (except for the identity operator `is`). Python provides a set of built-in operations. For example, to add two integers: `1 + 2`, calling the built-in operator `+` is equivalent to calling the special method `__add__` on the type `int`.
Python allows developers to define how an operator will behave with a custom class by implementing the corresponding special method. When defining such methods for symmetrical binary operators, developers need to define two methods so that the order of operands doesn't matter, e.g. `__add__` and `__radd__`.
For a complete list of operators and their methods see the Python documentation: arithmetic and bitwise operators, comparison operators.
class Empty:
pass
class Add:
def __add__(self, other):
return 42
Empty() + 1 # Noncompliant: no __add__ method is defined on the Empty class
Add() + 1
1 + Add() # Noncompliant: no __radd__ method is defined on the Add class
Add() + Empty()
Empty() + Add() # Noncompliant: no __radd__ method is defined on the Add class
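A compliant version of the `Add` class (a sketch based on the examples above) defines both special methods so the operation works regardless of operand order:

```python
class Add:
    def __add__(self, other):
        return 42

    def __radd__(self, other):
        # called when the left operand does not implement the operation
        return 42

print(Add() + 1)  # uses __add__
print(1 + Add())  # uses __radd__, so no TypeError is raised
```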
Exception chaining enables users to see if an exception is the direct consequence of another exception (see PEP-3134). This is useful to propagate the original context of the error.
Exceptions are chained using the following syntax:
With the `from` keyword:
try:
...
except OSError as e:
raise RuntimeError("Something went wrong") from e
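The effect of chaining can be observed on the raised exception's `__cause__` attribute; the sketch below (not part of the original example) shows that the original `OSError` context is preserved:

```python
try:
    try:
        raise OSError("disk error")
    except OSError as e:
        raise RuntimeError("Something went wrong") from e
except RuntimeError as err:
    # the original exception is kept on __cause__ thanks to "from e"
    cause = err.__cause__

print(type(cause).__name__)
```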
A loop with at most one iteration is equivalent to the use of an `if` statement to conditionally execute one piece of code. No developer expects to find such a use of a loop statement. If the initial intention of the author was really to conditionally execute one piece of code, an `if` statement should be used instead.
At worst that was not the initial intention of the author, and so the body of the loop should be fixed to use the nested `return`, `break` or `raise` statements in a more appropriate way.
while node is not None:
node = node.parent()
print(node)
break
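A compliant rewrite of the loop above (a sketch; the `Node` class is a hypothetical stand-in for whatever type `node` has) states the run-at-most-once intent directly with `if`:

```python
class Node:
    # hypothetical node type with a parent() accessor
    def __init__(self, parent=None):
        self._parent = parent

    def parent(self):
        return self._parent

node = Node(parent=Node())

# Compliant: an if statement makes the at-most-one-iteration intent explicit
if node is not None:
    node = node.parent()
    print(node)
```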
Through PEP 695, Python 3.12 introduces the type parameter syntax to allow for a more compact and explicit way to define generic classes and functions.
Prior to Python 3.12, defining a generic class would be done through the following syntax:
from typing import Generic, TypeVar
_T_co = TypeVar("_T_co", covariant=True, bound=str)
class ClassA(Generic[_T_co]):
def method1(self) -> _T_co:
...
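With the PEP 695 type parameter syntax, the same class can be written more compactly (a sketch; the snippet is kept in a string here only so it also parses on interpreters older than Python 3.12):

```python
import sys

# Python 3.12+ type parameter syntax: the bound is declared inline and
# variance is inferred, so no explicit TypeVar is needed
new_syntax = '''
class ClassA[T: str]:
    def method1(self) -> T:
        ...
'''

if sys.version_info >= (3, 12):
    exec(new_syntax)
```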
In Python, special methods corresponding to numeric operators and rich comparison operators should return `NotImplemented` when the operation is not supported.
For example `A + B` is equivalent to calling `A.__add__(B)`. If this binary operation is not supported by class A, `A.__add__(B)` should return `NotImplemented`. The interpreter will then try the reverse operation, i.e. `B.__radd__(A)`. If these special methods were to raise `NotImplementedError` instead, callers would not catch the exception and the reverse operation would not be called.
Below is the list of special methods this rule applies to:
__lt__(self, other)
__le__(self, other)
__eq__(self, other)
__ne__(self, other)
__gt__(self, other)
__ge__(self, other)
__add__(self, other)
__sub__(self, other)
__mul__(self, other)
__matmul__(self, other)
__truediv__(self, other)
__floordiv__(self, other)
__mod__(self, other)
__divmod__(self, other)
__pow__(self, other[, modulo])
__lshift__(self, other)
__rshift__(self, other)
__and__(self, other)
__xor__(self, other)
__or__(self, other)
__radd__(self, other)
__rsub__(self, other)
__rmul__(self, other)
__rmatmul__(self, other)
__rtruediv__(self, other)
__rfloordiv__(self, other)
__rmod__(self, other)
__rdivmod__(self, other)
__rpow__(self, other[, modulo])
__rlshift__(self, other)
__rrshift__(self, other)
__rand__(self, other)
__rxor__(self, other)
__ror__(self, other)
__iadd__(self, other)
__isub__(self, other)
__imul__(self, other)
__imatmul__(self, other)
__itruediv__(self, other)
__ifloordiv__(self, other)
__imod__(self, other)
__ipow__(self, other[, modulo])
__ilshift__(self, other)
__irshift__(self, other)
__iand__(self, other)
__ixor__(self, other)
__ior__(self, other)
__length_hint__(self)
class MyClass:
def __add__(self, other):
raise NotImplementedError() # Noncompliant: the exception will be propagated
def __radd__(self, other):
raise NotImplementedError() # Noncompliant: the exception will be propagated
class MyOtherClass:
def __add__(self, other):
return 42
def __radd__(self, other):
return 42
MyClass() + MyOtherClass() # This will raise NotImplementedError
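A compliant version (a sketch using a hypothetical `Currency` class, not from the original rule) returns `NotImplemented` so the interpreter can fall back to the reflected operation or raise a clean `TypeError`:

```python
class Currency:
    def __init__(self, amount):
        self.amount = amount

    def __add__(self, other):
        if not isinstance(other, Currency):
            return NotImplemented  # lets Python try other.__radd__ instead
        return Currency(self.amount + other.amount)

total = Currency(5) + Currency(7)
print(total.amount)  # 12
```

Adding a `Currency` to an unrelated type now raises an ordinary `TypeError` that callers can anticipate, instead of an unexpected `NotImplementedError`.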
Test frameworks provide a mechanism to skip tests if their prerequisites are not met, by either calling dedicated methods (e.g. `unittest.TestCase.skipTest`, `pytest.skip`, …) or using decorators (e.g. `unittest.skip`, `pytest.mark.skip`, …).
Using a return statement instead will make the test succeed, even though no assertion has been performed. It is therefore better to flag the test as skipped in such a situation.
This rule raises an issue when a return is performed conditionally at the beginning of a test method.
No issue will be raised if the return is unconditional, as S1763 already raises an issue in such a case.
The supported frameworks are `pytest` and `unittest`.
import unittest
class MyTest(unittest.TestCase):
def test_something(self):
if not external_resource_available():
return # Noncompliant
self.assertEqual(foo(), 42)
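A compliant version (a sketch; `external_resource_available` and `foo` are hypothetical stand-ins as in the original example) flags the test as skipped instead of silently returning:

```python
import unittest

def external_resource_available():
    # hypothetical prerequisite check, assumed unavailable here
    return False

class MyTest(unittest.TestCase):
    def test_something(self):
        if not external_resource_available():
            self.skipTest("external resource is not available")  # Compliant
        self.assertEqual(foo(), 42)  # foo() is hypothetical, never reached here
```

When run, the test is reported as skipped rather than as a misleading pass.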
By convention, the first argument in a class method, i.e. methods decorated with `@classmethod`, is named `cls` as a representation and a reminder that the argument is the class itself. If you were to name the argument something else, you would stand a good chance of confusing both users and maintainers of the code. It might also indicate that the `cls` parameter was forgotten, in which case calling the method will most probably fail. This rule also applies to the methods `__init_subclass__`, `__class_getitem__` and `__new__`, as their first argument is always the class instead of "self".
By default this rule accepts `cls` and `mcs`, which is sometimes used in metaclasses, as valid names for class parameters. You can set your own list of accepted names via the parameter `classParameterNames`.
class Rectangle(object):
@classmethod
def area(bob, height, width): #Noncompliant
return height * width
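A compliant version (a sketch; the `square` factory method is a hypothetical illustration) names the first parameter `cls`:

```python
class Rectangle:
    def __init__(self, height=0, width=0):
        self.height = height
        self.width = width

    @classmethod
    def square(cls, side):  # Compliant: first parameter is named cls
        # cls refers to the class itself, so subclasses get instances
        # of their own type from this factory
        return cls(side, side)

sq = Rectangle.square(3)
print(sq.height, sq.width)  # 3 3
```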
Python 3.10 introduced a specific syntax using the “or” operator (X | Y) to represent a union of types. This syntax has the same functionality as typing.Union, but it is more concise and easier to read.
Using typing.Union is more verbose and less convenient. It can also create inconsistencies when different parts of the codebase use different syntaxes for the same type.
from typing import Union
def foo(arg: Union[int, str]) -> Union[int, str]:
if isinstance(arg, int):
return arg + 1
else:
return arg.upper()
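A compliant version using the PEP 604 union syntax could look like the sketch below; the `__future__` import makes the annotation syntax safe to parse on interpreters older than Python 3.10:

```python
from __future__ import annotations  # lazily evaluated annotations

def foo(arg: int | str) -> int | str:  # Compliant: concise union syntax
    if isinstance(arg, int):
        return arg + 1
    else:
        return arg.upper()

print(foo(1))      # 2
print(foo("abc"))  # ABC
```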
A common anti-pattern is to check that a key does not exist in a dictionary before adding it with a corresponding value. This pattern works but is less readable than the equivalent call to the built-in dictionary method “setdefault()”.
Note that if a default value is set for every key of the dictionary it is possible to use python’s defaultdict
instead.
This rule raises an issue when a key presence is checked before being set. It only raises an issue when the value is a hard-coded string, number, list, dictionary or tuple. Computed values will not raise an issue as they can have side-effects.
if "key" not in my_dictionary:
my_dictionary["key"] = ["a", "b", "c"] # Noncompliant
if "key" not in my_dictionary:
my_dictionary["key"] = generate_value() # Compliant. No issue is raised as generate_value() might have some side-effect.
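A compliant rewrite of the first case (a sketch) replaces the membership check with `setdefault()`:

```python
my_dictionary = {}

# Compliant: setdefault only inserts the value when the key is absent
my_dictionary.setdefault("key", ["a", "b", "c"])
my_dictionary.setdefault("key", ["x"])  # no effect, "key" is already present

print(my_dictionary["key"])  # ['a', 'b', 'c']
```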
Nested functions and lambdas can reference variables defined in enclosing scopes. This can create tricky bugs when the variable and the function are defined in a loop. If the function is called in another iteration or after the loop finishes, it will see the variables’ last value instead of seeing the values corresponding to the iteration where the function was defined.
Capturing loop variables might work for some time but:
it makes the code difficult to understand.
it increases the risk of introducing a bug when the code is refactored or when dependencies are updated. See an example with the builtin “map” below.
One solution is to add a parameter to the function/lambda and use the previously captured variable as its default value. Default values are only executed once, when the function is defined, which means that the parameter’s value will remain the same even when the variable is reassigned in following iterations.
Another solution is to pass the variable as an argument to the function/lambda when it is called.
This rule raises an issue when a function or lambda references a variable defined in an enclosing loop.
def run():
mylist = []
for i in range(5):
mylist.append(lambda: i) # Noncompliant
def func():
return i # Noncompliant
mylist.append(func)
def example_of_api_change():
"""
Passing loop variable as default values also makes sure that the code is future-proof.
For example the following code will work as intended with python 2 but not python 3.
Why? because "map" behavior changed. It now returns an iterator and only executes
the lambda when required. The same is true for other functions such as "filter".
"""
lst = []
for i in range(5):
lst.append(map(lambda x: x + i, range(3))) # Noncompliant
for sublist in lst:
# Python 3 prints [4, 5, 6] five times; Python 2 prints [0, 1, 2], [1, 2, 3], ...
print(list(sublist))
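A compliant version of the first example (a sketch) uses the default-value technique described above, so each lambda keeps the value of `i` from its own iteration:

```python
def run():
    mylist = []
    for i in range(5):
        # Compliant: the default value binds i at definition time,
        # so each lambda remembers its own iteration's value
        mylist.append(lambda i=i: i)
    return [f() for f in mylist]

print(run())  # [0, 1, 2, 3, 4]
```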
Generic types, such as list or dict accept type arguments to specify the type of elements contained in the list or the keys and values in the dictionary.
If a generic type is used without a type argument, the type arguments will implicitly be assumed to be `Any`. This makes the type hint less informative and makes the contract of the function or variable annotated with the type hint more difficult to understand.
Furthermore, incomplete type hints can hinder IDE autocompletion and the code insight capabilities of static analysis tools.
def print_list(numbers: list) -> None:
for n in numbers:
print(n)
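A compliant version (a sketch) specifies the element type, which works with the built-in generics available since Python 3.9:

```python
def print_list(numbers: list[int]) -> None:  # Compliant: element type is explicit
    for n in numbers:
        print(n)

print_list([1, 2, 3])
```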
The hash value of an object is an integer returned by its `__hash__` method. Objects that are considered equal to each other (as per the `__eq__` method) should have the same hash value.
Whenever using an object as a dictionary key or inserting it into a set, the hash value of that object will be used to derive a bucket in which the object will be inserted.
When attempting to insert an unhashable object into a set, a `TypeError` will be raised instead.
If an object defines a `__hash__` method derived from mutable properties, no `TypeError` will be raised. However, a hash value should never be derived from mutable state, as this would prevent dictionaries and sets from retrieving the object.
def foo():
my_list = [1,2,3]
my_set = {my_list} # Noncompliant: list is not hashable.
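A compliant version (a sketch) uses an immutable, hashable container such as a tuple instead of a list:

```python
def foo():
    my_tuple = (1, 2, 3)   # Compliant: tuples are immutable and hashable
    my_set = {my_tuple}
    return my_set

print(foo())  # {(1, 2, 3)}
```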
The NumPy function np.where
provides a way to execute operations on an array under a certain condition:
import numpy as np
arr = np.array([1,2,3,4])
result = np.where(arr > 3, arr * 2, arr)
Looking at the set of methods and fields in a class and finding two that differ only by capitalization is confusing to users of the class.
This situation may simply indicate poor naming. Method names should be action-oriented, and thus contain a verb, which is unlikely in the case where both a method and a field have the same name (with or without capitalization differences). However, renaming a public method could be disruptive to callers. Therefore renaming the member is the recommended action.
class SomeClass:
lookUp = False
def lookup(self): # Non-compliant; method name differs from field name only by capitalization
pass
The `await` keyword can only be used on "Awaitable" objects. Python has three types of awaitables: Future, Task and Coroutines. Calling `await` on any other object will raise a `TypeError`.
import asyncio
def myfunction():
print("myfunction")
async def otherfunction():
await myfunction() # Noncompliant. myfunction is not marked as "async"
asyncio.run(otherfunction())
The `pandas.to_datetime` function transforms a string to a date object. The string representation of the date can take multiple formats. To correctly parse these strings, `pandas.to_datetime` provides several arguments to set up the parsing, such as `dayfirst` or `yearfirst`. For example, setting `dayfirst` to `True` indicates to `pandas.to_datetime` that the date and time will be represented as a string with the shape `day month year time`. Similarly with `yearfirst`, the string should have the shape `year month day time`.
These two arguments are not strict: if the shape of the string is not the one expected by `pandas.to_datetime`, the function will not fail and will try to figure out which part of the string is the day, month or year.
In the following example the dayfirst argument is set to True but we can clearly see that the month part of the date would be incorrect. In this case pandas.to_datetime will ignore the dayfirst argument, and parse the date as the 22nd of January.
import pandas as pd
pd.to_datetime(["01-22-2000 10:00"], dayfirst=True)
Unlike class and instance methods, static methods don't receive an implicit first argument. Nonetheless, naming the first argument `self` or `cls` guarantees confusion - either on the part of the original author, who may never understand why the arguments don't hold the values they expected, or on that of future maintainers.
class MyClass:
@staticmethod
def s_meth(self, arg1, arg2): #Noncompliant
# ...
Creating a new exception without actually raising it has no effect and is probably due to a mistake.
def func(x):
if not isinstance(x, int):
TypeError("Wrong type for parameter 'x'. func expects an integer") # Noncompliant
if x < 0:
ValueError # Noncompliant
return x + 42
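A compliant version of the function above (a sketch) actually raises the exceptions it creates:

```python
def func(x):
    if not isinstance(x, int):
        # Compliant: the exception is raised, not just instantiated
        raise TypeError("Wrong type for parameter 'x'. func expects an integer")
    if x < 0:
        raise ValueError("'x' must not be negative")
    return x + 42

print(func(0))  # 42
```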
In Python, using the isinstance() function is generally preferred over direct type comparison for several reasons:
Compatibility with inheritance: isinstance() considers inheritance hierarchy, whereas direct type comparison does not. This means that isinstance() can handle cases where an object belongs to a subclass of the specified type, making your code more flexible and robust. It allows you to write code that can work with objects of different but related types.
Support for duck typing: Python follows the principle of “duck typing,” which focuses on an object’s behavior rather than its actual type. isinstance() enables you to check if an object has certain behavior (by checking if it belongs to a particular class or subclass) rather than strictly requiring a specific type. This promotes code reusability and enhances the flexibility of your programs.
Code maintainability and extensibility: By using isinstance(), your code becomes more maintainable and extensible. If you directly compare types, you would need to modify your code whenever a new subtype is introduced or the inheritance hierarchy is changed. On the other hand, isinstance() allows your code to accommodate new types without requiring any modifications, as long as they exhibit the desired behavior.
Polymorphism and interface-based programming: isinstance() supports polymorphism, which is the ability of different objects to respond to the same method calls. It allows you to design code that interacts with objects based on their shared interface rather than their specific types. This promotes code reuse and modularity, as you can write functions and methods that operate on a range of compatible objects.
Third-party library compatibility: Many third-party libraries and frameworks in Python rely on isinstance() for type checking and handling different types of objects. By using isinstance(), your code becomes more compatible with these libraries and frameworks, making it easier to integrate your code into larger projects or collaborate with other developers.
In summary, using isinstance() over direct type comparison in Python promotes flexibility, code reusability, maintainability, extensibility, and compatibility with the wider Python ecosystem. It aligns with the principles of object-oriented programming and supports the dynamic nature of Python. It is also recommended by the PEP8 style guide.
class MyClass:
...
def foo(a):
if type(a) == MyClass: # Noncompliant
...
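A compliant version (a sketch; the `MySubclass` class is added here to illustrate the inheritance point) uses `isinstance()`, which also accepts subclasses where a direct type comparison would not:

```python
class MyClass:
    ...

class MySubclass(MyClass):
    ...

def foo(a):
    return isinstance(a, MyClass)  # Compliant: also matches subclasses

print(foo(MySubclass()))               # True
print(type(MySubclass()) == MyClass)   # False: direct comparison misses subclasses
```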
The only two possible types for an `except` clause's expression are a class deriving from `BaseException`, or a tuple composed of such classes.
Trying to catch multiple exceptions in the same `except` with a boolean expression of exceptions may not work as intended. The result of a boolean expression of exceptions is a single exception class, so using a boolean expression in an `except` block will result in catching only one kind of exception.
error = ValueError or TypeError
error is ValueError # True
error is TypeError # False
error = ValueError and TypeError
error is ValueError # False
error is TypeError # True
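A compliant way to catch multiple exception kinds (a sketch; the `parse` helper is a hypothetical illustration) is to use a tuple of exception classes:

```python
def parse(value):
    try:
        return int(value)
    except (ValueError, TypeError):  # Compliant: a tuple catches both kinds
        return None

print(parse("42"))   # 42
print(parse("abc"))  # None: int("abc") raises ValueError
print(parse(None))   # None: int(None) raises TypeError
```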
The identity operators `is` and `is not` check if the same object is on both sides, i.e. `a is b` returns `True` if `id(a) == id(b)`.
my_int = 1
other_int = 1
id(my_int) == id(other_int) # True
Instance methods, i.e. methods not annotated with `@classmethod` or `@staticmethod`, are expected to have at least one parameter. This parameter will reference the object instance on which the method is called. By convention, this first parameter is named "self".
Naming the first parameter something different from "self" is not recommended, as it could lead to confusion. It might indicate that the "self" parameter was forgotten, in which case calling the method will most probably fail.
Note also that creating methods which are used as static methods without the `@staticmethod` decorator is a bad practice. Calling these methods on an instance will raise a `TypeError`. Either move the method out of the class or decorate it with `@staticmethod`.
class MyClass(ABC):
@myDecorator
def method(arg): # No issue will be raised.
pass
Identity operators `is` and `is not` check if the same object is on both sides, i.e. `a is b` returns `True` if `id(a) == id(b)`.
def func(param):
param is {1: 2} # Noncompliant: always False
param is not {1, 2, 3} # Noncompliant: always True
param is [1, 2, 3] # Noncompliant: always False
param is dict(a=1) # Noncompliant: always False
mylist = [] # mylist is assigned a new object
param is mylist # Noncompliant: always False
Using the startswith and endswith methods in Python instead of string slicing offers several advantages:
Readability and Intent: Using startswith and endswith methods provides code that is more readable and self-explanatory. It clearly communicates your intention to check if a string starts or ends with a specific pattern. This makes the code more maintainable and easier to understand for other developers.
Flexibility: The startswith and endswith methods allow you to check for patterns of varying lengths. With string slicing, you would need to specify the exact length of the substring to compare. However, with the methods, you can pass in a pattern of any length, making your code more flexible and adaptable.
Error Handling: The methods handle edge cases automatically. If you pass a substring length that exceeds the length of the original string, slicing would raise an IndexError exception. On the other hand, the methods gracefully handle such cases and return False, avoiding any potential errors.
Performance Optimization: In some cases, using startswith and endswith methods can provide better performance. These methods are optimized and implemented in C, which can make them faster than manually slicing the string in Python. Although the performance gain might be negligible for small strings, it can be significant when working with large strings or processing them in a loop.
Overall, using startswith and endswith methods provides a cleaner, more readable, and error-resistant approach for checking if a string starts or ends with a specific pattern. It promotes code clarity, flexibility, and can potentially improve performance. This is also recommended by the PEP8 style guide.
message = "Hello, world!"
if message[:5] == "Hello":
...
if message[-6:] == "world!":
...
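A compliant rewrite of the slicing checks above (a sketch) uses the dedicated string methods:

```python
message = "Hello, world!"

if message.startswith("Hello"):   # Compliant: intent is explicit
    print("greeting detected")

if message.endswith("world!"):    # Compliant: no manual index arithmetic
    print("addressed to the world")
```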
Python has no pre/post increment/decrement operators. For instance, `x++` and `x--` will fail to parse. More importantly, `++x` and `--x` will do nothing, as they parse as doubled unary operators. To increment a number, simply write `x += 1`.
++x # Noncompliant: pre and post increment operators do not exist in Python.
x-- # Noncompliant: pre and post decrement operators do not exist in Python.
A bare `raise` statement, i.e. a `raise` with no exception provided, will re-raise the last active exception in the current scope. If no exception is active, a `RuntimeError` is raised instead.
If the bare `raise` statement is in a `finally` block, it will only have an active exception to re-raise when an exception from the `try` block is not caught or when an exception is raised by an `except` or `else` block. Thus bare `raise` statements should not be relied upon in `finally` blocks; it is simpler to let the exception propagate automatically.
def foo(param):
result = 0
try:
print("foo")
except ValueError as e:
pass
else:
if param:
raise ValueError()
finally:
if param:
raise # Noncompliant: This will fail in some context.
else:
result = 1
return result
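A compliant rewrite (a sketch) drops the bare `raise` from the `finally` block and lets the `ValueError` from the `else` block propagate by itself:

```python
def foo(param):
    result = 0
    try:
        print("foo")
    except ValueError:
        pass
    else:
        if param:
            raise ValueError()  # propagates on its own through finally
    finally:
        # Compliant: no bare raise here; only unconditional cleanup work
        if not param:
            result = 1
    return result

print(foo(False))  # 1
```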
By contract, every Python function returns something, even if it is the `None` value, which can be returned implicitly by omitting the `return` statement, or explicitly.
The `__init__` method is required to return `None`. A `TypeError` will be raised if the `__init__` method either yields or returns any expression other than `None`. While explicitly returning an expression that evaluates to `None` will not raise an error, it is considered bad practice.
To fix this issue, make sure that the `__init__` method does not contain any `return` statement.
class MyClass(object):
def __init__(self):
self.message = 'Hello'
return self # Noncompliant: a TypeError will be raised
Parentheses can disambiguate the order of operations in complex expressions and make the code easier to understand.
a = (b * c) + (d * e) # Compliant: the intent is clear.
A `re.sub` call always evaluates its first argument as a regular expression, even if no regular expression features were used. This has a significant performance cost and should therefore be used with care.
When `re.sub` is used, the first argument should be a real regular expression. If that is not the case, `str.replace` does exactly the same thing as `re.sub` without the performance drawback of the regex.
This rule raises an issue for each `re.sub` call whose first argument is a simple string that doesn't contain any special regex character or pattern.
init = "Bob is a Bird... Bob is a Plane... Bob is Superman!"
changed = re.sub(r"Bob is", "It's", init) # Noncompliant
changed = re.sub(r"\.\.\.", ";", changed) # Noncompliant
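A compliant rewrite of both substitutions (a sketch) uses plain `str.replace`, since neither pattern needs regex machinery:

```python
init = "Bob is a Bird... Bob is a Plane... Bob is Superman!"

# Compliant: plain substring replacement avoids regex compilation entirely
changed = init.replace("Bob is", "It's")
changed = changed.replace("...", ";")

print(changed)  # It's a Bird; It's a Plane; It's Superman!
```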
Developers can use type hints to specify which type a function is expected to return. Doing so improves maintainability since it clarifies the contract of the function, making it easier to use and understand.
When annotating a function with a specific type hint, it is expected that the returned value matches the type specified in the hint.
If the type hint specifies a class or a named type, then the value returned should be an instance of that class or type. If the type hint specifies a structural type, then the value returned should have the same structure as the type hint.
In the following example, while `Bucket` does not directly inherit from `Iterable`, it does implement the `Iterable` protocol thanks to its `__iter__` method and can therefore be used as a valid `Iterable` return type.
from collections.abc import Iterator, Iterable
class Bucket: # Note: no base classes
...
def __len__(self) -> int: ...
def __iter__(self) -> Iterator[int]: ...
def collect() -> Iterable: return Bucket()
Comparing float values for equality directly is not reliable and should be avoided, due to the inherent imprecision in the binary representation of floating point numbers. Such comparison is reported by S1244.
One common solution to this problem is to use the math.isclose function to perform the comparison. Behind the scenes, the math.isclose function uses a tolerance value (also called epsilon) to define an acceptable range of difference between two floats. A tolerance value may be relative (based on the magnitude of the numbers being compared) or absolute.
Using a relative tolerance would be equivalent to:
def isclose_relative(a, b, rel_tol=1e-09):
diff = abs(a - b)
max_diff = rel_tol * max(abs(a), abs(b))
return diff <= max_diff
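A short demonstration of why this matters (a sketch) using the standard-library `math.isclose`:

```python
import math

# 0.1 + 0.2 is not exactly 0.3 in binary floating point
print(0.1 + 0.2)                     # 0.30000000000000004
print(0.1 + 0.2 == 0.3)              # False: direct comparison fails
print(math.isclose(0.1 + 0.2, 0.3))  # True: tolerance-based comparison works
```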
Using the same value on either side of a binary operator is almost always a mistake. In the case of logical operators, it is either a copy/paste error and therefore a bug, or it is simply wasted code, and should be simplified. In the case of bitwise operators and most binary mathematical operators, having the same value on both sides of an operator yields predictable results, and should be simplified.
Note that this rule also raises issues on comparisons such as `a == a`, as shown below.
if a == a: # Noncompliant
work()
if a != a: # Noncompliant
work()
if a == b and a == b: # Noncompliant
work()
if a == b or a == b: # Noncompliant
work()
j = 5 / 5 # Noncompliant
k = 5 - 5 # Noncompliant
Floating point math is imprecise because of the challenges of storing such values in a binary representation.
In base 10, the fraction 1/3 is represented as 0.333…, which, for a given number of significant digits, will never be exactly 1/3. The same problem happens when trying to represent 1/10 in base 2, which leads to the infinitely repeating fraction 0.0001100110011…. This makes floating point representations inherently imprecise.
Even worse, floating point math is not associative; push a `float` through a series of simple mathematical operations and the answer will differ based on the order of those operations, because of the rounding that takes place at each step.
Even simple floating point assignments are not simple, as can be visualized using the `format` function to check for significant digits:
>>> format(0.1, ".17g")
'0.10000000000000001'
Accessing a non-existing member on an object will, in most cases, raise an `AttributeError` exception.
This rule raises an issue when a non-existing member is accessed on a class instance and nothing indicates that this was expected.
def access_attribute():
x = 42
return x.isnumeric() # Noncompliant
Recursion happens when control enters a loop that has no exit. This can happen when a method invokes itself or when a pair of methods invoke each other. It can be a useful tool, but unless the method includes a provision to break out of the recursion and return, the recursion will continue until the stack overflows and the program crashes.
def my_pow(num, exponent): # Noncompliant
num = num * my_pow(num, exponent - 1)
return num # this is never reached
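A compliant version (a sketch) adds a base case so the recursion terminates:

```python
def my_pow(num, exponent):
    if exponent <= 0:
        return 1  # Compliant: the base case stops the recursion
    return num * my_pow(num, exponent - 1)

print(my_pow(2, 3))  # 8
```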
`SystemExit` is raised when `sys.exit()` is called. `KeyboardInterrupt` is raised when the user asks the program to stop by pressing interrupt keys. Both exceptions are expected to propagate up until the application stops.
In order to avoid catching `SystemExit` and `KeyboardInterrupt` by mistake, PEP-352 created the root class `BaseException`, from which `SystemExit`, `KeyboardInterrupt` and `Exception` derive. Thus developers can use `except Exception:` without preventing the software from stopping.
The `GeneratorExit` class also derives from `BaseException`, as it is not really an error and is not supposed to be caught by user code.
As stated in Python's documentation, user-defined exceptions are not supposed to inherit directly from `BaseException`. They should instead inherit from `Exception` or one of its subclasses.
class MyException(BaseException): # Noncompliant
pass
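A compliant version (a sketch) derives from `Exception`, so `except Exception:` handlers still catch it without swallowing `SystemExit` or `KeyboardInterrupt`:

```python
class MyException(Exception):  # Compliant: inherit from Exception
    pass

try:
    raise MyException("boom")
except Exception as e:  # catches MyException but not SystemExit/KeyboardInterrupt
    caught = e

print(type(caught).__name__)  # MyException
```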
Checking if a variable or parameter is `None` should only be done when you expect that it can be `None`. Doing so when the variable is always `None` or never `None` is confusing at best. At worst, there is a bug and the variable is not updated properly.
This rule raises an issue when expressions such as `X is None`, `X is not None` or `X == None` always evaluate to the same result.
def foo():
my_var = None
if my_var == None: # Noncompliant: always True.
...
The iterator protocol specifies that an iterator object should have:
a `__next__` method retrieving the next value or raising `StopIteration` when there are no more values left.
an `__iter__` method which should always return `self`. This enables iterators to be used as sequences in for-loops and other places.
This rule raises an issue when a class has a `__next__` method and either:
it doesn't have an `__iter__` method,
or its `__iter__` method does not return "self".
class MyIterator: # Noncompliant. Class has a __next__ method but no __iter__ method
def __init__(self, values):
self._values = values
self._index = 0
def __next__(self):
if self._index >= len(self._values):
raise StopIteration()
value = self._values[self._index]
self._index += 1
return value
class MyIterator:
def __init__(self, values):
self._values = values
self._index = 0
def __next__(self):
if self._index >= len(self._values):
raise StopIteration()
value = self._values[self._index]
self._index += 1
return value
def __iter__(self):
return 42 # Noncompliant. This __iter__ method does not return self
class MyIterable: # Ok. This is an iterable, not an iterator, i.e. it has an __iter__ method but no __next__ method. Thus __iter__ doesn't have to return "self"
def __init__(self, values):
self._values = values
def __iter__(self):
return MyIterator(self._values)
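A compliant iterator (a sketch based on the classes above) implements both methods, with `__iter__` returning `self`:

```python
class MyIterator:
    def __init__(self, values):
        self._values = values
        self._index = 0

    def __next__(self):
        if self._index >= len(self._values):
            raise StopIteration()
        value = self._values[self._index]
        self._index += 1
        return value

    def __iter__(self):
        return self  # Compliant: __iter__ returns the iterator itself

# the iterator can now be used directly in a for-loop or with list()
print(list(MyIterator([1, 2, 3])))  # [1, 2, 3]
```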
A common anti-pattern is to check that a key exists in a dictionary before retrieving its corresponding value and providing a default value otherwise. This pattern works but is less readable than the equivalent call to the built-in dictionary method “get()” with a default value.
Note that if a default value is set for every key of the dictionary it is possible to use python’s defaultdict instead.
This rule raises an issue when a key presence is checked before retrieving its value or providing a default value. It only raises an issue when the default value is a hard-coded string, number, list, dictionary or tuple. Computed values will not raise an issue as they can have side-effects.
result = "default"
if "missing" in mydict:
result = mydict["missing"] # Noncompliant
if "missing" in mydict:
result = mydict["missing"] # Noncompliant
else:
result = "default"
if "missing" in mydict:
result = mydict["missing"] # Compliant. No issue is raised as generate_value() might have some side-effect.
else:
result = generate_value()
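A compliant rewrite of the first two cases (a sketch) uses `dict.get()` with a default value:

```python
mydict = {"present": 1}

# Compliant: get() combines the membership test and the fallback in one call
result = mydict.get("missing", "default")
print(result)  # default

print(mydict.get("present", "default"))  # 1: existing values are returned as-is
```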
`numpy.nan` is a floating point representation of Not a Number (NaN), used as a placeholder for undefined or missing values in numerical computations.
Equality checks of variables against `numpy.nan` will always be `False` due to the special nature of `numpy.nan`. This can lead to unexpected and incorrect results.
Instead of a standard comparison, the `numpy.isnan()` function should be used.
import numpy as np
x = np.nan
if x == np.nan: # Noncompliant: always False
...
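The same pitfall exists for plain Python floats; `math.isnan` plays the role of `numpy.isnan` in this stdlib-only sketch:

```python
import math

x = float("nan")
print(x == float("nan"))  # False: NaN never compares equal, even to itself
print(math.isnan(x))      # True: the correct way to test for NaN
```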
NotImplemented is a constant intended to be returned by comparison special methods such as `__lt__`. It is not an exception and cannot be raised: doing so fails with a TypeError. To signal unimplemented functionality, raise `NotImplementedError` instead; otherwise callers will have a hard time using your code.
class MyClass:
def do_something(self):
raise NotImplemented("Haven't gotten this far yet.") #Noncompliant
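The compliant fix raises `NotImplementedError`, which is an actual exception class:

```python
class MyClass:
    def do_something(self):
        # Compliant: NotImplementedError derives from Exception and can be raised
        raise NotImplementedError("Haven't gotten this far yet.")

try:
    MyClass().do_something()
except NotImplementedError as err:
    message = str(err)
    print(message)  # Haven't gotten this far yet.
```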
When a back reference in a regex refers to a capturing group that hasn’t been defined yet (or at all), it can never be matched and will fail with an `re.error` exception.
import re
pattern1 = re.compile(r"\1(.)") # Noncompliant, group 1 is defined after the back reference
pattern2 = re.compile(r"(.)\2") # Noncompliant, group 2 isn't defined at all
pattern3 = re.compile(r"(.)|\1") # Noncompliant, group 1 and the back reference are in different branches
pattern4 = re.compile(r"(?P<x>.)|(?P=x)") # Noncompliant, group x and the back reference are in different branches
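A back reference is valid once it appears after the group it refers to, in the same branch:

```python
import re

# Compliant: group 1 is defined before the back reference
pattern = re.compile(r"(.)\1")
print(bool(pattern.match("aa")))  # True: any character followed by itself
print(pattern.match("ab"))        # None: the second character differs
```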
Using operator pairs (`=+` or `=-`) that look like reversed single operators (`+=` or `-=`) is confusing. They compile and run but do not produce the same result as their mirrored counterparts.
target = -5
num = 3
target =- num # Noncompliant: target = -3. Is that really what's meant?
target =+ num # Noncompliant: target = 3
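The intended compound assignments use the mirrored operators:

```python
target = -5
num = 3
target -= num  # Compliant: target is now -8
print(target)
target += num  # Compliant: target is back to -5
print(target)
```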
“Private” nested classes that are never used inside the enclosing class are usually dead code: unnecessary, inoperative code that should be removed. Cleaning out dead code decreases the size of the maintained codebase, making it easier to understand the program and preventing bugs from being introduced.
Python has no real private classes. Every class is accessible. There are however two conventions indicating that a class is not meant to be “public”:
classes with a name starting with a single underscore (ex: `_MyClass`) should be seen as non-public and might change without prior notice. They should not be used by third-party libraries or software. It is ok to use those classes inside the library defining them, but it should be done with caution.
“class-private” classes are defined inside another class and have a name starting with at least two underscores and ending with at most one underscore. These classes’ names are automatically mangled to avoid collisions with subclasses’ nested classes. For example, `__MyClass` will be renamed as `_classname__MyClass`, where `classname` is the enclosing class’s name without its leading underscore(s). Class-private classes shouldn’t be used outside of their enclosing class.
This rule raises an issue when a private nested class (either with one or two leading underscores) is never used inside its parent class.
class TopLevel:
class __Nested(): # Noncompliant: __Nested is never used
pass
The pandas library provides a user-friendly API to combine two data frames with the methods `merge` and `join`.
When using these methods, it is possible to specify how the merge will be performed:
The parameter `how` specifies the type of merge (left, inner, outer, etc.).
The parameter `on` specifies the column(s) on which the merge will be performed.
The parameter `validate` specifies a way to verify that the merge result is what was expected.
import pandas as pd
age_df = pd.DataFrame({"user_id":[1,2,4], "age":[42,45, 35]})
name_df = pd.DataFrame({"user_id":[1,2,3,4], "name":["a","b","c","d"]})
result = age_df.merge(name_df, on="user_id", how="right", validate="1:1")
To allow a datetime to be used in contexts where only certain days of the week are valid, NumPy includes a set of business day functions. A weekmask is used to customize the set of valid business days.
A weekmask can be specified in several formats:
As an array of 7 ones and zeros, e.g. `[1, 1, 1, 1, 1, 0, 0]`
As a string of 7 `1` or `0` characters, e.g. `"1111100"`
As a string of abbreviated valid-day names from the list Mon Tue Wed Thu Fri Sat Sun, e.g. `"Mon Tue Wed Thu Fri"`
Setting an incorrect weekmask leads to ValueError.
import numpy as np
offset = np.busday_offset('2012-05', 1, roll='forward', weekmask='01') # Noncompliant: ValueError
Python 3.11 introduced `except*` and `ExceptionGroup`, making it possible to handle and raise multiple unrelated exceptions simultaneously.
In the example below, we gather multiple exceptions in an `ExceptionGroup`. This `ExceptionGroup` is then caught by a single except block:
try:
exception_group = ExceptionGroup("Files not found", [FileNotFoundError("file1.py"), FileNotFoundError("file2.py")])
raise exception_group
except ExceptionGroup as exceptions:
# Do something with all the exceptions
pass
Jump statements (return, break, continue, and raise
) move control flow out of the current code block. So any statements that come after a jump are dead code.
def fun(a):
i = 10
return i + a # Noncompliant
i += 1 # this is never executed
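Moving the statement before the `return` (or deleting it, if it no longer serves a purpose) fixes the issue:

```python
def fun(a):
    i = 10
    i += 1        # executed before returning
    return i + a  # Compliant: no statements follow the return

print(fun(5))  # 16
```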
In Django, when creating a ModelForm, it is common to use `exclude` to remove fields from the form. It is also possible to set the `fields` value to `'__all__'` to conveniently indicate that all the model fields should be included in the form. However, this can lead to security issues when new fields are added to the model, as they will automatically be included in the form, which may not be intended. Additionally, `exclude` or `'__all__'` can make it harder to maintain the codebase by hiding the dependencies between the model and the form.
from django import forms
class MyForm(forms.ModelForm):
class Meta:
model = MyModel
exclude = ['field1', 'field2'] # Noncompliant
class MyOtherForm(forms.ModelForm):
class Meta:
model = Post
fields = '__all__' # Noncompliant
Unreachable code is never executed, so it has no effect on the behaviour of the program. If it is not executed because it no longer serves a purpose, then it adds unnecessary complexity. Otherwise, it indicates that there is a logical error in the condition.
def foo(a, b):
flag = True
if (a and not a): # Noncompliant
doSomething() # Never executed
if (flag): # Noncompliant
return "Result 1"
return "Result 2" # Never executed
Functions, methods and lambdas should not have too many mandatory parameters, i.e. parameters with no default value. Calling them requires code that is difficult to read and maintain. To solve this problem you could wrap some parameters in an object, split the function into simpler functions with fewer parameters, or provide default values for some parameters.
def do_something(param1, param2, param3, param4, param5): # Noncompliant
...
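One common fix is to group related parameters into a single object. A sketch using a dataclass; the `Options` class and field names are illustrative:

```python
from dataclasses import dataclass

@dataclass
class Options:
    """Hypothetical grouping of the formerly-mandatory parameters."""
    param3: int = 0
    param4: int = 0
    param5: int = 0

def do_something(param1, param2, options=None):  # Compliant: two mandatory parameters
    options = options or Options()
    return param1 + param2 + options.param3

print(do_something(1, 2))                        # 3
print(do_something(1, 2, Options(param3=10)))    # 13
```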
There are several ways to create a new list based on the elements of some other collection, but the use of a list comprehension has multiple benefits. First, it is both concise and readable, and second, it yields a fully-formed object without requiring a mutable object as input that must be updated multiple times in the course of the list creation.
squares = []
for x in range(10):
squares.append(x**2) # Noncompliant
squares = map(lambda x: x**2, range(10)) #Noncompliant
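A list comprehension produces the fully-formed list directly:

```python
# Compliant: concise, readable, and yields a list in one expression
squares = [x**2 for x in range(10)]
print(squares)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```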
In Python 3.9 and later, the zoneinfo module is the recommended tool for handling timezones, replacing the pytz library. This recommendation is based on several key advantages.
First, zoneinfo is part of Python’s standard library, making it readily available without needing additional installation, unlike pytz.
Second, zoneinfo integrates seamlessly with Python’s datetime module. You can directly use zoneinfo timezone objects when creating datetime objects, making it more intuitive and less error-prone than pytz, which requires a separate localize method for this purpose.
Third, zoneinfo handles historical timezone changes more accurately than pytz. When a pytz timezone object is used, it defaults to the earliest known offset, which can lead to unexpected results. zoneinfo does not have this issue.
Lastly, zoneinfo uses the system’s IANA time zone database when available, ensuring it works with the most up-to-date timezone data. In contrast, pytz includes its own copy of the IANA database, which may not be as current.
In summary, zoneinfo offers a more modern, intuitive, and reliable approach to handling timezones in Python 3.9 and later, making it the preferred choice over pytz.
from datetime import datetime
import pytz
dt = pytz.timezone('America/New_York').localize(datetime(2022, 1, 1)) # Noncompliant: the localize method is needed to avoid bugs (see S6887)
The length of a collection is always greater than or equal to zero. Testing it doesn’t make sense, since the result is always true.
mylist = []
if len(mylist) >= 0: # Noncompliant: always true
pass
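To test for emptiness, rely on the collection’s truthiness, or compare the length with `> 0`:

```python
mylist = []
is_empty = not mylist        # Compliant: idiomatic emptiness check
has_items = len(mylist) > 0  # Compliant: explicit non-emptiness check
print(is_empty, has_items)   # True False
```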
A `SystemExit` exception is raised when `sys.exit()` is called. This exception signals the interpreter to exit and is expected to propagate up until the program stops. It is possible to catch it in order to perform, for example, clean-up tasks, but it should then be raised again to allow the interpreter to exit as expected. Not re-raising such an exception could lead to undesired behaviour.
A bare `except:` statement, i.e. an except block without any exception class, is equivalent to `except BaseException`. Both statements catch every exception, including `SystemExit`. It is recommended to catch a more specific exception instead; if that is not possible, the exception should be raised again.
It is also a good idea to re-raise the `KeyboardInterrupt` exception. Similarly to `SystemExit`, `KeyboardInterrupt` is used to signal the interpreter to exit, and not re-raising it could also lead to undesired behaviour.
try:
...
except SystemExit: # Noncompliant: the SystemExit exception is not re-raised.
pass
try:
...
except BaseException: # Noncompliant: BaseExceptions encompass SystemExit exceptions and should be re-raised.
pass
try:
...
except: # Noncompliant: exceptions caught by this statement should be re-raised or a more specific exception should be caught.
pass
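A compliant handler performs its clean-up and then re-raises, so the interpreter can still exit. A sketch; the outer try/except only exists here to observe the re-raised exception:

```python
import sys

def main():
    try:
        sys.exit(1)
    except SystemExit:
        print("cleaning up")  # clean-up work is still possible
        raise                 # Compliant: re-raise so the exit still happens

try:
    main()
except SystemExit as e:
    exit_code = e.code
print(exit_code)  # 1
```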
Using `inplace=True` when modifying a pandas DataFrame means that the method will modify the DataFrame in place, rather than returning a new object:
df.an_operation(inplace=True)
In some cases a comparison with the operators `==` or `!=` will always return True or always return False. When this happens, the comparison and all its dependent code can simply be removed. This includes:
comparing unrelated builtin types such as string and integer.
comparing class instances which do not implement `__eq__` or `__ne__` to an object of a different type (builtin or from an unrelated class which also doesn’t implement `__eq__` or `__ne__`).
foo = 1 == "1" # Noncompliant. Always False.
foo = 1 != "1" # Noncompliant. Always True.
class A:
pass
myvar = A() == 1 # Noncompliant. Always False.
myvar = A() != 1 # Noncompliant. Always True.
Assertions are meant to detect when code does not behave as expected. An assertion which fails or succeeds all the time does not achieve this. Either it is redundant and should be removed to improve readability, or it is a mistake and the assertion should be corrected.
This rule raises an issue when an assertion method is given parameters which will make it succeed or fail all the time. It covers three cases:
an `assert` statement or a unittest `assertTrue` or `assertFalse` method is called with a value which will always be True or always be False.
a unittest `assertIsNotNone` or `assertIsNone` method is called with a value which will always be None or never be None.
a unittest `assertIsNot` or `assertIs` method is called with a literal expression creating a new object every time (ex: `[1, 2, 3]`).
import unittest
class MyTestCase(unittest.TestCase):
def expect_not_none(self):
self.assertIsNotNone(round(1.5)) # Noncompliant: This assertion always succeeds because "round" returns a number, not None.
def helper_compare(self, param):
self.assertIs(param, [1, 2, 3]) # Noncompliant: This assertion always fails because [1, 2, 3] creates a new object.
While it is technically correct to assign to parameters from within function bodies, doing so before the parameter value is read is likely a bug. Instead, initial values of parameters should be, if not treated as read-only, then at least read before reassignment.
def foo(strings, param):
param = 1 # NonCompliant
Implementing the special method `__ne__` is not equivalent to implementing the special method `__eq__`. By default `__ne__` will call `__eq__`, but the default implementation of `__eq__` does not call `__ne__`.
This rule raises an issue when the special method `__ne__` is implemented but the `__eq__` method is not.
class Ne:
def __ne__(self, other): # Noncompliant.
return False
myvar = Ne() == 1 # False. __ne__ is not called
myvar = 1 == Ne() # False. __ne__ is not called
myvar = Ne() != 1 # False
myvar = 1 != Ne() # False
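Implementing `__eq__` makes the comparisons consistent; the default `__ne__` then negates it automatically (defining `__ne__` explicitly, as below, is optional):

```python
class Eq:
    def __eq__(self, other):
        # Compliant: equality is defined, so != behaves consistently
        return isinstance(other, Eq)

    def __ne__(self, other):  # optional: the default __ne__ already negates __eq__
        return not self.__eq__(other)

print(Eq() == Eq())  # True
print(Eq() != Eq())  # False
print(Eq() == 1)     # False
```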
An iterable object is an object capable of returning its members one at a time. To do so, it must define an `__iter__` method that returns an iterator.
The iterator protocol specifies that, in order to be a valid iterator, an object must define a `__next__` and an `__iter__` method (because iterators are also iterable).
Defining an `__iter__` method that returns anything other than an iterator will raise a `TypeError` as soon as the iteration begins.
Note that generators and generator expressions have both `__next__` and `__iter__` methods generated automatically.
class MyIterable:
def __init__(self, values):
self._values = values
def __iter__(self):
return None # Noncompliant: Not a valid iterator
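A compliant `__iter__` returns an iterator, for instance by delegating to the built-in `iter`:

```python
class MyIterable:
    def __init__(self, values):
        self._values = values

    def __iter__(self):
        return iter(self._values)  # Compliant: a valid iterator

print(list(MyIterable([1, 2, 3])))  # [1, 2, 3]
```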
Typically, backslashes are seen only as part of escape sequences. Therefore, the use of a backslash outside of a raw string or escape sequence looks suspiciously like a broken escape sequence.
Characters recognized as escapable are: `a`, `b`, `f`, `n`, `r`, `t`, `v`, `o`, `x`, `'` and `"`.
s = "Hello \world." # Noncompliant: "\w" is not a recognized escape sequence
t = "Nice to \ meet you" # Noncompliant: lone backslash
u = "Let's have \ lunch" # Noncompliant: lone backslash
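Escaping the backslash, or using a raw string, makes the intent explicit:

```python
s = "Hello \\world."       # Compliant: escaped backslash
t = r"Nice to \ meet you"  # Compliant: raw string keeps the backslash literally
print(s)  # Hello \world.
print(t)  # Nice to \ meet you
```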
Python has no real private attributes. Every attribute is accessible. There are however two conventions indicating that an attribute is not meant to be “public”:
attributes with a name starting with a single underscore (ex: `_myattribute`) should be seen as non-public and might change without prior notice. They should not be used by third-party libraries or software. It is ok to use those attributes inside the library defining them, but it should be done with caution.
“class-private” attributes have a name starting with at least two underscores and ending with at most one underscore. These attributes’ names are automatically mangled to avoid collisions with subclasses’ attributes. For example, `__myattribute` will be renamed as `_classname__myattribute`, where `classname` is the attribute’s class name without its leading underscore(s). They shouldn’t be used outside of the class defining the attribute.
This rule raises an issue when a class-private attribute (two leading underscores, max one underscore at the end) is never read inside the class. It optionally raises an issue on unread attributes prefixed with a single underscore. Both class attributes and instance attributes will raise an issue.
class Noncompliant:
_class_attr = 0 # Noncompliant if enable_single_underscore_issues is enabled
__mangled_class_attr = 1 # Noncompliant
def __init__(self, value):
self._attr = 0 # Noncompliant if enable_single_underscore_issues is enabled
self.__mangled_attr = 1 # Noncompliant
def compute(self, x):
return x * x
In Python, the unpacking assignment is a powerful feature that allows you to assign multiple values to multiple variables in a single statement.
The basic rule for the unpacking assignment is that the number of variables on the left-hand side must be equal to the number of elements in the iterable. If this is not respected, a ValueError will be raised at runtime.
def foo(param):
ls = [1, 2, 3]
x, y = ls # Noncompliant: 'ls' contains more elements than there are variables on the left-hand side
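Matching the number of targets, or using a starred target to absorb the extras, fixes the error:

```python
ls = [1, 2, 3]
x, y, z = ls   # Compliant: one target per element
a, *rest = ls  # Compliant: 'rest' absorbs the remaining elements
print(x, y, z)  # 1 2 3
print(a, rest)  # 1 [2, 3]
```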
The result of TensorFlow’s reduction operations (e.g. `tf.math.reduce_sum`, `tf.math.reduce_std`) highly depends on the shape of the Tensor provided.
import tensorflow as tf
x = tf.constant([[1, 1, 1], [1, 1, 1]])
tf.math.reduce_sum(x) # reduces across all axes by default, yielding the scalar 6
In Python 3’s except statement, attempting to catch an object that does not derive from BaseException will raise a TypeError.
In order to catch multiple exceptions in an except statement, a tuple of exception classes should be provided. Trying to catch multiple exceptions with a list or a set will raise a TypeError.
If you are about to create a custom exception class, note that custom exceptions should inherit from `Exception` rather than `BaseException`.
`BaseException` is the base class for all built-in exceptions in Python, including system-exiting exceptions like `SystemExit` or `KeyboardInterrupt`, which are typically not meant to be caught. On the other hand, `Exception` is intended for exceptions that are expected to be caught, which is generally the case for user-defined exceptions. See PEP 352 for more information.
To fix this issue, make sure the expression used in an except statement is an exception class which derives from `BaseException`/`Exception`, or a tuple of such exceptions.
class CustomException(object):
"""An Invalid exception class."""
pass
try:
...
except CustomException: # Noncompliant: this custom exception does not derive from BaseException or Exception.
print("exception")
try:
...
except [TypeError, ValueError]: # Noncompliant: list of exceptions, only tuples are valid.
print("exception")
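A compliant version derives the custom exception from `Exception` and lists multiple exceptions as a tuple:

```python
class CustomException(Exception):
    """Compliant: derives from Exception."""

caught = None
try:
    raise CustomException("boom")
except (TypeError, CustomException) as e:  # Compliant: a tuple of exception classes
    caught = e
print(type(caught).__name__)  # CustomException
```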
The `__all__` property of a module is used to define the list of names that will be imported when performing a wildcard import of this module, i.e. when `from mymodule import *` is used.
In the following example:
# mymodule.py
def foo(): ...
def bar(): ...
__all__ = ["foo"] # only 'foo' is exported by 'from mymodule import *'
A class without an explicit extension of object (`class ClassName(object)`) is considered an old-style class in Python 2, and `__slots__` declarations are ignored in old-style classes. Having such a declaration in an old-style class could be confusing for maintainers and lead them to make false assumptions about the class.
class A:
__slots__ = ["id"] # Noncompliant; this is ignored
def __init__(self):
self.id = id
self.name = "name" # name wasn't declared in __slots__ but there's no error
a = A()
Being a dynamically typed language, the Python interpreter only does type checking during runtime. Getting the typing right is important as certain operations may result in a TypeError.
Type hints can be used to clarify the expected return type of a function, enabling developers to better document its contract. Applying them consistently makes the code easier to read and understand.
In addition, type hints allow some development environments to offer better autocompletion and improve the precision of static analysis tools.
def hello(name): # Noncompliant: the parameter and return types are not annotated
return 'Hello ' + name
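With type hints, the contract is explicit and tooling can check the call sites:

```python
def hello(name: str) -> str:  # Compliant: parameter and return types documented
    return 'Hello ' + name

print(hello('world'))  # Hello world
```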
Shared naming conventions allow teams to collaborate efficiently.
This rule raises an issue when a method name does not match a provided regular expression.
For example, with the default provided regular expression ^[a-z_][a-z0-9_]*$
, the method:
class MyClass:
def MyMethod(a,b): # Noncompliant
...
The `__str__` method in Django models is used to represent the model instance as a string. For example, the return value of this method will be inserted in a template when displaying an object in the Django admin site. Without this method, the model instance is represented by its object identifier, which is not meaningful to end-users. This can result in confusion and make debugging more difficult.
from django.db import models
class MyModel(models.Model): # Noncompliant: no __str__ method is defined
name = models.CharField(max_length=100)
The repetition of a prefix operator (`not` or `~`) is usually a typo. The second operator invalidates the first one:
a = False
b = ~~a # Noncompliant: equivalent to "a"
Checking that variable X has type T with type annotations implies that X’s value is of type T or a subtype of T. After such a check, it is a good practice to limit actions on X to those allowed by type T, even if a subclass of T allows different actions. Doing otherwise will confuse your fellow developers.
Just to be clear, it is common in python to perform an action without checking first if it is possible (see “Easier to ask for forgiveness than permission.”). However when type checks are performed, they should not contradict the following actions.
This rule raises an issue when an action performed on a variable might be possible, but it contradicts a previous type check. The list of checked actions corresponds to rules S2159, S3403, S5607, S5756, S5644, S3862, S5797, S5795 and S5632. These other rules only detect cases where the type of a variable is certain, i.e. it cannot be a subclass.
def add_the_answer(param: str):
return param + 42 # Noncompliant. Fix this "+" operation; the type annotation on "param" suggests that the operands have incompatible types.
# Note: In practice it is possible to create a class inheriting from both "str" and "int", but this would be a very confusing design.
Why use named groups only to never use any of them later on in the code?
This rule raises issues every time named groups are:
referenced while not defined;
defined but called elsewhere in the code by their number instead.
import re
def foo():
pattern = re.compile(r"(?P<a>.)")
matches = pattern.match("abc")
g1 = matches.group("b") # Noncompliant - group "b" is not defined
g2 = matches.group(1) # Noncompliant - Directly use 'a' instead of its group number.
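Compliant code references only the groups it defines, and does so by name:

```python
import re

pattern = re.compile(r"(?P<a>.)")
matches = pattern.match("abc")
g1 = matches.group("a")  # Compliant: the named group is referenced by its name
print(g1)  # a
```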
Nested control flow statements (`if`, `for`, `while`, `try`, and `with`) are often key ingredients in creating what’s known as “spaghetti code”. This code smell can make your program difficult to understand and maintain.
When numerous control structures are placed inside one another, the code becomes a tangled, complex web. This significantly reduces the code’s readability and maintainability, and it also complicates the testing process.
if condition1: # Compliant - depth = 1
# ...
if condition2: # Compliant - depth = 2
# ...
for i in range(10): # Compliant - depth = 3
# ...
if condition3: # Compliant - depth = 4
if condition4: # Non-Compliant - depth = 5, which exceeds the limit
if condition5: # Depth = 6, exceeding the limit, but issues are only reported on depth = 5
# ...
Nested conditionals are hard to read and can make the order of operations complex to understand.
class Job:
@property
def readable_status(self):
return "Running" if self.is_running else "Failed" if self.errors else "Succeeded" # Noncompliant
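An if/elif chain expresses the same logic more readably. A sketch; the constructor below is added only so the class is runnable on its own:

```python
class Job:
    def __init__(self, is_running=False, errors=None):
        self.is_running = is_running
        self.errors = errors or []

    @property
    def readable_status(self):
        # Compliant: one condition per branch
        if self.is_running:
            return "Running"
        elif self.errors:
            return "Failed"
        return "Succeeded"

print(Job(is_running=True).readable_status)  # Running
print(Job(errors=["x"]).readable_status)     # Failed
print(Job().readable_status)                 # Succeeded
```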
In Python it is possible to catch multiple types of exception in a single except statement using a tuple of the exceptions.
Repeating an exception class in a single except statement will not fail but it does not have any effect. Either the exception class is not the one which should be caught, or it is duplicated code which should be removed.
Having a subclass and a parent class in the same except statement does not provide any benefit either. It is enough to keep only the parent class.
try:
...
except (TypeError, TypeError): # Noncompliant: duplicated code or incorrect exception class.
print("Foo")
try:
...
except (NotImplementedError, RuntimeError): # Noncompliant: NotImplementedError inherits from RuntimeError.
print("Foo")
A `tensorflow.function` only supports singleton `tensorflow.Variable`s. This means the variable will be created on the first call of the `tensorflow.function` and will be reused across the subsequent calls. Creating a `tensorflow.Variable` that is not a singleton will raise a `ValueError`.
import tensorflow as tf
@tf.function
def f(x):
v = tf.Variable(1.0) # Noncompliant: a new tf.Variable is created on each call
return v
Importing a feature from the `__future__` module turns on that feature from a future version of Python in your module. The purpose is to allow you to gradually transition to new features or incompatible changes in future language versions, rather than having to make the entire jump at once.
Because such changes must be applied to the entirety of a module to work, putting such imports anywhere but at the beginning of the module doesn’t make sense. It would mean applying those restrictions to only part of your code. Because that would lead to inconsistencies and massive confusion, it’s not allowed.
name = "John"
from __future__ import division # Noncompliant
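Compliant code places the `__future__` import before any other statement. The original example used `division`, a Python-2-era feature; `annotations` serves as a present-day stand-in:

```python
from __future__ import annotations  # Compliant: first statement in the module

name = "John"
print(name)  # John
```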
An except clause that only rethrows the caught exception has the same effect as omitting the except altogether and letting it bubble up automatically.
a = {}
try:
a[5]
except KeyError:
raise # Noncompliant
Using a predictable seed is a common best practice when using NumPy to create reproducible results. To that end, using np.random.seed(number) to set the seed of the global numpy.random.RandomState has been the privileged solution for a long time.
numpy.random.RandomState and its associated methods rely on a global state, which may be problematic when threads or other forms of concurrency are involved. The global state may be altered and the global seed may be reset at various points in the program (for instance, through an imported package or script), which would lead to irreproducible results.
Instead, the preferred best practice to generate reproducible pseudorandom numbers is to instantiate a numpy.random.Generator object with a seed and reuse it in different parts of the code. This avoids the reliance on a global state. Whenever a new seed is needed, a new generator may be created instead of mutating a global state.
Below is the list of legacy functions and their alternatives:
| Legacy function name | numpy.random.Generator alternative |
| --- | --- |
numpy.random.RandomState.seed | numpy.random.default_rng |
numpy.random.RandomState.rand | numpy.random.Generator.random |
numpy.random.RandomState.randn | numpy.random.Generator.standard_normal |
numpy.random.RandomState.randint | numpy.random.Generator.integers |
numpy.random.RandomState.random_integers | numpy.random.Generator.integers |
numpy.random.RandomState.random_sample | numpy.random.Generator.random |
numpy.random.RandomState.choice | numpy.random.Generator.choice |
numpy.random.RandomState.bytes | numpy.random.Generator.bytes |
numpy.random.RandomState.shuffle | numpy.random.Generator.shuffle |
numpy.random.RandomState.permutation | numpy.random.Generator.permutation |
numpy.random.RandomState.beta | numpy.random.Generator.beta |
numpy.random.RandomState.binomial | numpy.random.Generator.binomial |
numpy.random.RandomState.chisquare | numpy.random.Generator.chisquare |
numpy.random.RandomState.dirichlet | numpy.random.Generator.dirichlet |
numpy.random.RandomState.exponential | numpy.random.Generator.exponential |
numpy.random.RandomState.f | numpy.random.Generator.f |
numpy.random.RandomState.gamma | numpy.random.Generator.gamma |
numpy.random.RandomState.geometric | numpy.random.Generator.geometric |
numpy.random.RandomState.gumbel | numpy.random.Generator.gumbel |
numpy.random.RandomState.hypergeometric | numpy.random.Generator.hypergeometric |
numpy.random.RandomState.laplace | numpy.random.Generator.laplace |
numpy.random.RandomState.logistic | numpy.random.Generator.logistic |
numpy.random.RandomState.lognormal | numpy.random.Generator.lognormal |
numpy.random.RandomState.logseries | numpy.random.Generator.logseries |
numpy.random.RandomState.multinomial | numpy.random.Generator.multinomial |
numpy.random.RandomState.multivariate_normal | numpy.random.Generator.multivariate_normal |
numpy.random.RandomState.negative_binomial | numpy.random.Generator.negative_binomial |
numpy.random.RandomState.noncentral_chisquare | numpy.random.Generator.noncentral_chisquare |
numpy.random.RandomState.noncentral_f | numpy.random.Generator.noncentral_f |
numpy.random.RandomState.normal | numpy.random.Generator.normal |
numpy.random.RandomState.pareto | numpy.random.Generator.pareto |
numpy.random.RandomState.poisson | numpy.random.Generator.poisson |
numpy.random.RandomState.power | numpy.random.Generator.power |
numpy.random.RandomState.rayleigh | numpy.random.Generator.rayleigh |
numpy.random.RandomState.standard_cauchy | numpy.random.Generator.standard_cauchy |
numpy.random.RandomState.standard_exponential | numpy.random.Generator.standard_exponential |
numpy.random.RandomState.standard_gamma | numpy.random.Generator.standard_gamma |
numpy.random.RandomState.standard_normal | numpy.random.Generator.standard_normal |
numpy.random.RandomState.standard_t | numpy.random.Generator.standard_t |
numpy.random.RandomState.triangular | numpy.random.Generator.triangular |
numpy.random.RandomState.uniform | numpy.random.Generator.uniform |
numpy.random.RandomState.vonmises | numpy.random.Generator.vonmises |
numpy.random.RandomState.wald | numpy.random.Generator.wald |
numpy.random.RandomState.weibull | numpy.random.Generator.weibull |
numpy.random.RandomState.zipf | numpy.random.Generator.zipf |
numpy.random.beta | numpy.random.Generator.beta |
numpy.random.binomial | numpy.random.Generator.binomial |
numpy.random.bytes | numpy.random.Generator.bytes |
numpy.random.chisquare | numpy.random.Generator.chisquare |
numpy.random.choice | numpy.random.Generator.choice |
numpy.random.dirichlet | numpy.random.Generator.dirichlet |
numpy.random.exponential | numpy.random.Generator.exponential |
numpy.random.f | numpy.random.Generator.f |
numpy.random.gamma | numpy.random.Generator.gamma |
numpy.random.geometric | numpy.random.Generator.geometric |
numpy.random.gumbel | numpy.random.Generator.gumbel |
numpy.random.hypergeometric | numpy.random.Generator.hypergeometric |
numpy.random.laplace | numpy.random.Generator.laplace |
numpy.random.logistic | numpy.random.Generator.logistic |
numpy.random.lognormal | numpy.random.Generator.lognormal |
numpy.random.logseries | numpy.random.Generator.logseries |
numpy.random.multinomial | numpy.random.Generator.multinomial |
numpy.random.multivariate_normal | numpy.random.Generator.multivariate_normal |
numpy.random.negative_binomial | numpy.random.Generator.negative_binomial |
numpy.random.noncentral_chisquare | numpy.random.Generator.noncentral_chisquare |
numpy.random.noncentral_f | numpy.random.Generator.noncentral_f |
numpy.random.normal | numpy.random.Generator.normal |
numpy.random.pareto | numpy.random.Generator.pareto |
numpy.random.permutation | numpy.random.Generator.permutation |
numpy.random.poisson | numpy.random.Generator.poisson |
numpy.random.power | numpy.random.Generator.power |
numpy.random.rand | numpy.random.Generator.random |
numpy.random.randint | numpy.random.Generator.integers |
numpy.random.randn | numpy.random.Generator.standard_normal |
numpy.random.random | numpy.random.Generator.random |
numpy.random.random_integers | numpy.random.Generator.integers |
numpy.random.random_sample | numpy.random.Generator.random |
numpy.random.ranf | numpy.random.Generator.random |
numpy.random.rayleigh | numpy.random.Generator.rayleigh |
numpy.random.sample | numpy.random.Generator.random |
numpy.random.seed | numpy.random.default_rng |
numpy.random.shuffle | numpy.random.Generator.shuffle |
numpy.random.standard_cauchy | numpy.random.Generator.standard_cauchy |
numpy.random.standard_exponential | numpy.random.Generator.standard_exponential |
numpy.random.standard_gamma | numpy.random.Generator.standard_gamma |
numpy.random.standard_normal | numpy.random.Generator.standard_normal |
numpy.random.standard_t | numpy.random.Generator.standard_t |
numpy.random.triangular | numpy.random.Generator.triangular |
numpy.random.uniform | numpy.random.Generator.uniform |
numpy.random.vonmises | numpy.random.Generator.vonmises |
numpy.random.wald | numpy.random.Generator.wald |
numpy.random.weibull | numpy.random.Generator.weibull |
numpy.random.zipf | numpy.random.Generator.zipf |
import numpy as np
def foo():
np.random.seed(42)
x = np.random.randn() # Noncompliant: this relies on numpy.random.RandomState, which is deprecated
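The preferred pattern instantiates a seeded `numpy.random.Generator` locally instead of mutating the global state:

```python
import numpy as np

def foo():
    rng = np.random.default_rng(42)  # Compliant: local, seeded generator
    return rng.standard_normal()

# Reproducible: the same seed yields the same value on every call
print(foo() == foo())  # True
```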
A string literal that is the first statement in a module, function, class, or method is a docstring. A docstring should document what a caller needs to know about the code. Information about what it does, what it returns, and what it requires are all valid candidates for documentation. Well written docstrings allow callers to use your code without having to first read it and understand its logic.
By convention, docstrings are enclosed in three sets of double-quotes.
def my_function(a, b): # Noncompliant: no docstring
    pass
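A compliant version documents the function with a triple-quoted docstring; the behaviour shown here is hypothetical, for illustration only:

```python
def my_function(a, b):
    """Return the sum of a and b."""  # Compliant: the contract is documented
    return a + b

print(my_function.__doc__)  # Return the sum of a and b.
```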
When defining a tensorflow.function, it is generally a bad practice to make the function recursive. TensorFlow does not support recursive tensorflow.function and will in the majority of cases throw an exception. It is also possible that the execution of such a function succeeds, but with multiple tracings, which has strong performance implications. When executing a tensorflow.function, the code is split into two distinct stages. The first stage, called tracing, creates a new tensorflow.Graph and runs the Python code normally, but defers the execution of TensorFlow operations (i.e. adding two Tensors). These operations are added to the graph without being run. The second stage, which is much faster than the first, runs everything that was deferred previously. Depending on the input of the tensorflow.function, the first stage may not be needed; see: Rules of tracing. Skipping this first stage is what gives TensorFlow its high performance.
Having a recursive tensorflow.function prevents the user from benefiting from TensorFlow's capabilities.
import tensorflow as tf
@tf.function
def factorial(n):
    if n == 1:
        return 1
    else:
        return n * factorial(n - 1) # Noncompliant: the function is recursive
While the assignment of default parameter values is typically a good thing, it can go very wrong very quickly when mutable objects are used. That’s because a new instance of the object is not created for each function invocation. Instead, all invocations share the same instance, and the changes made for one caller are made for all!
def get_attr_array(obj, arr=[]): # Noncompliant
    props = (name for name in dir(obj) if not name.startswith('_'))
    arr.extend(props) # after only a few calls, this is a big array!
    return arr
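A common fix is to use None as a sentinel default and create the list inside the function, so each call gets a fresh instance:

```python
def get_attr_array(obj, arr=None):  # Compliant: None is immutable
    if arr is None:
        arr = []  # a new list is created on every invocation
    props = (name for name in dir(obj) if not name.startswith('_'))
    arr.extend(props)
    return arr
```

With this version, two calls with the default argument return independent lists instead of sharing and growing one.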
When a function is designed to return an invariant value, it may be poor design, but it shouldn’t adversely affect the outcome of your program. However, when it happens on all paths through the logic, it is surely a bug.
This rule raises an issue when a function contains several return statements that all return the same value.
def foo(a): # Noncompliant
    b = 12
    if a == 1:
        return b
    return b
Raising instances of Exception and BaseException will have a negative impact on any code trying to catch these exceptions.
From a consumer perspective, it is generally a best practice to only catch exceptions you intend to handle. Other exceptions should ideally not be caught but left to propagate up the stack trace so that they can be dealt with appropriately. When a generic exception is thrown, it forces consumers to catch exceptions they do not intend to handle, which they then have to re-raise.
Besides, when working with a generic type of exception, the only way to distinguish between multiple exceptions is to check their message, which is error-prone and difficult to maintain. Legitimate exceptions may be unintentionally silenced and errors may be hidden.
For instance, if an exception such as SystemExit is caught and not re-raised, it will prevent the program from stopping.
When raising an exception, it is therefore recommended to raise the most specific exception possible so that it can be handled intentionally by consumers.
def check_value(value):
    if value < 0:
        raise BaseException("Value cannot be negative") # Noncompliant: this will be difficult for consumers to handle
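A compliant sketch raising a specific exception instead. The `NegativeValueError` name is hypothetical, introduced here for illustration:

```python
class NegativeValueError(ValueError):
    """Raised when a value that must be non-negative is negative (hypothetical)."""

def check_value(value):
    if value < 0:
        # Compliant: a specific exception lets callers handle exactly this case
        raise NegativeValueError("Value cannot be negative")
    return value
```

Callers can now catch `NegativeValueError` (or `ValueError`) without accidentally swallowing unrelated failures such as `KeyboardInterrupt` or `SystemExit`.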
Keras provides a full-featured model class called tensorflow.keras.Model. It inherits from tensorflow.keras.layers.Layer, so a Keras model can be used and nested in the same way as Keras layers. Keras models come with extra functionality that makes them easy to train, evaluate, load, save, and even train on multiple machines.
As the tensorflow.keras.Model class inherits from tensorflow.keras.layers.Layer, you do not need to specify input_shape in a subclassed model; this argument will be ignored.
import tensorflow as tf
class MyModel(tf.keras.Model):
    def __init__(self):
        super(MyModel, self).__init__(input_shape=...) # Noncompliant: this parameter will be ignored
Being a dynamically typed language, the Python interpreter only does type checking during runtime. Getting the typing right is important as certain operations may result in a TypeError.
Type hints can be used to clarify the expected parameters of a function, enabling developers to better document its contract. Applying them consistently makes the code easier to read and understand.
In addition, type hints allow some development environments to offer better autocompletion and improve the precision of static analysis tools.
def hello(name) -> str: # Noncompliant: the "name" parameter lacks a type hint
    return 'Hello ' + name
Prior to Python 3.12 functions using generic types were created as follows:
from typing import TypeVar
_T = TypeVar("_T")
def func(a: _T, b: _T) -> _T:
    ...
In Python, function parameters can have default values.
These default values are expressions which are evaluated when the function is defined, i.e. only once. The same default value will be used every time the function is called. Therefore, modifying it will have an effect on every subsequent call. This can lead to confusing bugs.
def myfunction(param=foo()): # foo is called only once, when the function is defined.
    ...
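The single-evaluation behavior can be demonstrated directly. The `foo` helper below is a hypothetical stand-in that records how often it runs:

```python
calls = []

def foo():
    # record every invocation so we can observe how often this runs
    calls.append("called")
    return len(calls)

def myfunction(param=foo()):  # foo() runs exactly once, at definition time
    return param

# Calling myfunction repeatedly does not re-evaluate the default
myfunction()
myfunction()
```

After both calls, `calls` still holds a single entry: the default was computed when `def` was executed, not on each call.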
The new style of class creation, with the declaration of a parent class, created a unified object model in Python, so that the type of an instantiated class is equal to its class. In Python 2.2-2.7, this is not the case for old-style classes. In Python 3+ all classes are new-style classes. However, since the behavior can differ from 2.2+ to 3+, explicitly inheriting from object (if there is no better candidate) is recommended.
class MyClass(): # Noncompliant: inherit explicitly from "object"
    pass
Formatting strings, either with the % operator or the str.format method, requires a valid string and arguments matching this string's replacement fields.
This rule raises an issue when formatting a string will raise an exception because the input string or arguments are invalid. Rule S3457 covers cases where no exception is raised and the resulting string is simply not formatted properly.
print('Error code %d' % '42') # Noncompliant. Replace this value with a number as %d requires.
print('User {1} is not allowed to perform this action'.format('Bob')) # Noncompliant. Replacement field numbering should start at 0.
print('User {0} has not been able to access {}'.format('Alice', 'MyFile')) # Noncompliant. Use only manual or only automatic field numbering, don't mix them.
print('User {a} has not been able to access {b}'.format(a='Alice')) # Noncompliant. Provide a value for field "b".
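A sketch of corrected versions of the four noncompliant calls above:

```python
# Compliant counterparts: numeric value for %d, zero-based or fully
# automatic field numbering, and a value for every named field
print('Error code %d' % 42)
print('User {0} is not allowed to perform this action'.format('Bob'))
print('User {0} has not been able to access {1}'.format('Alice', 'MyFile'))
print('User {a} has not been able to access {b}'.format(a='Alice', b='MyFile'))
```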
Python's datetime API provides several different ways to create datetime objects. One possibility is to use the datetime.datetime.utcnow or datetime.datetime.utcfromtimestamp functions. The issue with these two functions is that they are not time zone aware, even if their names suggest otherwise.
Using these functions can cause issues, as they may not behave as expected, for example:
from datetime import datetime
timestamp = 1571595618.0
date = datetime.utcfromtimestamp(timestamp)
date_timestamp = date.timestamp()
assert timestamp == date_timestamp # Noncompliant: fails unless the local time zone is UTC
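The round-trip works reliably once the datetime is made time zone aware, as in this sketch:

```python
from datetime import datetime, timezone

timestamp = 1571595618.0
# Compliant: passing an explicit tz returns an aware datetime,
# so .timestamp() no longer guesses based on the local time zone
date = datetime.fromtimestamp(timestamp, tz=timezone.utc)
date_timestamp = date.timestamp()
assert timestamp == date_timestamp  # holds in every time zone
```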
Use of the exec statement could be dangerous and should be avoided. Moreover, the exec statement was removed in Python 3.0. Instead, the built-in exec() function can be used.
Use of the exec statement is strongly discouraged for several reasons such as:
Security Risks: Executing code from a string opens up the possibility of code injection attacks.
Readability and Maintainability: Code executed with the exec statement is often harder to read and understand since it is not explicitly written in the source code.
Performance Implications: The use of the exec statement can have performance implications since the code is compiled and executed at runtime.
Limited Static Analysis: Since the code executed with the exec statement is only known at runtime, static code analysis tools may not be able to catch certain errors or issues, leading to potential bugs.
exec 'print 1' # Noncompliant
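If dynamic execution is genuinely required, the built-in exec() function is the Python 3 replacement. A minimal sketch, passing an explicit namespace to contain the effect:

```python
# Compliant: exec() is a function in Python 3; an explicit namespace
# keeps the executed code from polluting the caller's globals
namespace = {}
exec("result = 1 + 1", namespace)
```

Even so, the readability, performance, and security caveats listed above still apply to exec() itself.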
Iterating over a collection using a for loop in Python relies on iterators.
An iterator is an object that allows you to traverse a collection of elements, such as a list or a dictionary. Iterators are used in for loops to iterate over the elements of a collection one at a time.
When you create an iterator, it keeps track of the current position in the collection and provides a way to access the next element. The next() function is used to retrieve the next element from the iterator. When there are no more elements to iterate over, the next() function raises a StopIteration exception and the iteration stops.
It is important to note that iterators are designed to be read-only. Modifying a collection while iterating over it can cause unexpected behavior, as the iterator may skip over or repeat elements. A RuntimeError may also be raised in this situation, with the message changed size during iteration. Therefore, it is important to avoid modifying a collection while iterating over it to ensure that your code behaves as expected.
If you still want to modify the collection, it is best to use a second collection or to iterate over a copy of the original collection instead.
def my_fun(my_dict):
    for key in my_dict:
        if my_dict[key] == 'foo':
            my_dict.pop(key) # Noncompliant: this will make the iteration unreliable
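A compliant sketch that iterates over a snapshot of the keys, so the original dictionary can be mutated safely:

```python
def my_fun(my_dict):
    # Compliant: list(my_dict) copies the keys before iteration starts,
    # so removing entries from my_dict cannot confuse the iterator
    for key in list(my_dict):
        if my_dict[key] == 'foo':
            my_dict.pop(key)
    return my_dict
```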
Parentheses are not required after the assert, del, elif, except, for, if, in, not, raise, return, while, and yield
keywords, and using them unnecessarily impairs readability. They should therefore be omitted.
x = 1
while (x < 10): # Noncompliant: parentheses after "while" are not needed
    print("x is now %d" % x)
    x += 1
Unlike class and instance methods, static methods don't receive an implicit first argument. Nonetheless, naming the first argument self or cls guarantees confusion, either on the part of the original author, who may never understand why the arguments don't hold the values they expected, or on that of future maintainers.
class MyClass:
    @staticmethod
    def s_meth(self, arg1, arg2): # Noncompliant
        # ...
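A compliant sketch: a static method takes no self or cls, while a class method names its implicit first argument cls:

```python
class MyClass:
    @staticmethod
    def s_meth(arg1, arg2):
        # Compliant: no implicit first argument for a static method
        return arg1 + arg2

    @classmethod
    def c_meth(cls, arg1):
        # Compliant: class methods receive the class itself as "cls"
        return (cls.__name__, arg1)
```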
When the call to a function doesn't have any side effects, what is the point of making the call if the results are ignored? In such cases, either the function call is useless and should be dropped, or the source code doesn't behave as expected.
This rule raises an issue when a built-in function or method which has no side effects is called and its result is not used.
myvar = "this is a multiline"
"message from {}".format(sender) # Noncompliant. The formatted string is not used because the concatenation is not done properly.
As soon as the yield keyword is used, the enclosing method or function becomes a generator. Thus yield should never be used in a function or method which is not intended to be a generator.
This rule raises an issue when yield or yield from are used in a function or method which is not a generator because:
the function/method's return type annotation is not typing.Generator[...] (https://docs.python.org/3/library/typing.html#typing.Generator)
it is a special method which can never be a generator (ex: __init__).
class A:
    def __init__(self, value):
        self.value = value
        yield value # Noncompliant

def mylist2() -> List[str]:
    yield ['string'] # Noncompliant. Return should be used instead of yield

def generator_ok() -> Generator[int, float, str]:
    sent = yield 42
    return '42'
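A compliant sketch of the two problematic functions: the list is returned rather than yielded, and a genuine generator is annotated as such (the send type is simplified to None here):

```python
from typing import Generator, List

def mylist2() -> List[str]:
    # Compliant: a List[str] annotation means the value must be returned
    return ['string']

def generator_ok() -> Generator[int, None, str]:
    # Compliant: the Generator annotation matches the use of yield
    yield 42
    return '42'
```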
Defining a variable with the same name as a built-in symbol will “shadow” it. That means that the builtin will no longer be accessible through its original name, having locally been replaced by the variable.
Shadowing a builtin makes the code more difficult to read and maintain. It may also be a source of bugs as you can reference the builtin by mistake.
It is sometimes acceptable to shadow a builtin to improve the readability of a public API or to support multiple versions of a library. In these cases, benefits are greater than the maintainability cost. This should, however, be done with care.
It is generally not a good practice to shadow builtins with variables which are local to a function or method. These variables are not public and can easily be renamed, thus reducing the confusion and making the code less error-prone.
def a_function():
    int = 42 # Noncompliant; int is a builtin
In order to be callable, a Python class should implement the __call__ method. Thanks to this method, an instance of this class will be callable as a function.
However, when making a call to a non-callable object, a TypeError will be raised.
In order to fix this issue, make sure that the object you are trying to call has a __call__ method.
class MyClass:
pass
myvar = MyClass()
myvar() # Noncompliant
none_var = None
none_var() # Noncompliant
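A compliant sketch: defining __call__ makes instances of the class callable:

```python
class MyCallable:
    def __call__(self):
        # Compliant: instances of this class can now be called like functions
        return "called"

myvar = MyCallable()
result = myvar()  # no TypeError: __call__ is invoked
```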
An assert is inappropriate for parameter validation because assertions are disabled globally at the interpreter level when the application runs with optimized bytecode (-O and -OO command-line switches). This means that the optimized version of the application would completely eliminate the intended checks.
This rule raises an issue when a public method validates one or more of its parameters with asserts.
class Shop:
    def setPrice(self, price):
        assert(price >= 0 and price <= MAX_PRICE) # Noncompliant
        # Set the price
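A compliant sketch using an explicit check and exception, which survives the -O and -OO switches (MAX_PRICE is assumed to be defined elsewhere; a placeholder value is used here):

```python
MAX_PRICE = 10_000  # placeholder; assumed to be defined elsewhere in the codebase

class Shop:
    def setPrice(self, price):
        # Compliant: explicit validation is not stripped by bytecode optimization
        if not 0 <= price <= MAX_PRICE:
            raise ValueError("price must be between 0 and %d" % MAX_PRICE)
        self.price = price
```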
When a function is called, it accepts only one value per parameter. The Python interpreter will raise a SyntaxError when the same parameter is provided more than once, i.e. myfunction(a=1, a=2).
Other less obvious cases will also fail at runtime by raising a TypeError, when:
An argument is provided by value and position at the same time.
An argument is provided twice, once via unpacking and once by value or position.
def func(a, b, c):
return a * b * c
func(6, 93, 31, c=62) # Noncompliant: argument "c" is duplicated
params = {'c':31}
func(6, 93, 31, **params) # Noncompliant: argument "c" is duplicated
func(6, 93, c=62, **params) # Noncompliant: argument "c" is duplicated
The with statement is used to wrap the execution of a block with methods defined by a context manager. The context manager handles the entry into, and the exit from, the desired runtime context for the execution of the block of code. To do so, a context manager should have an __enter__ and an __exit__ method.
Executing the following block of code:
class MyContextManager:
    def __enter__(self):
        print("Entering")
    def __exit__(self, exc_type, exc_val, exc_tb):
        print("Exiting")

with MyContextManager():
    print("Executing body")
will print "Entering", then "Executing body", then "Exiting".
Importing every public name from a module using a wildcard (from mymodule import *) is a bad idea because:
It could lead to conflicts between names defined locally and the ones imported.
It reduces code readability as developers will have a hard time knowing where names come from.
It clutters the local namespace, which makes debugging more difficult.
Remember that imported names can change when you update your dependencies. A wildcard import that works today might be broken tomorrow.
# file: mylibrary/pyplot.py
try:
from guiqwt.pyplot import * # Ok
except Exception:
from matplotlib.pyplot import * # Ok
The __exit__ method is invoked with four arguments: self, type, value and traceback. Leave one of these out of the method declaration and the result will be a TypeError at runtime.
class MyClass:
    def __enter__(self):
        pass
    def __exit__(self, exc_type, exc_val): # Noncompliant
        pass
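A compliant sketch with the full four-argument signature:

```python
class MyClass:
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):  # Compliant: all four parameters
        return False  # returning a falsy value means exceptions are not suppressed

with MyClass() as cm:
    entered = cm is not None  # __enter__ returned self, so cm is the instance
```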
Calling a function or a method with fewer or more arguments than expected will raise a TypeError. This is usually a bug and should be fixed.
######################
# Positional Arguments
######################
param_args = [1, 2, 3]
param_kwargs = {'x': 1, 'y': 2}
def func(a, b=1):
    print(a, b)

def positional_unlimited(a, b=1, *args):
    print(a, b, *args)
func(1)
func(1, 42)
func(1, 2, 3) # Noncompliant. Too many positional arguments
func() # Noncompliant. Missing positional argument for "a"
positional_unlimited(1, 2, 3, 4, 5)
def positional_limited(a, *, b=2):
    print(a, b)
positional_limited(1, 2) # Noncompliant. Too many positional arguments
#############################
# Unexpected Keyword argument
#############################
def keywords(a=1, b=2, *, c=3):
    print(a, b, c)
keywords(1)
keywords(1, z=42) # Noncompliant. Unexpected keyword argument "z"
def keywords_unlimited(a=1, b=2, *, c=3, **kwargs):
    print(a, b, kwargs)
keywords_unlimited(a=1, b=2, z=42)
#################################
# Mandatory Keyword argument only
#################################
def mandatory_keyword(a, *, b):
    print(a, b)
mandatory_keyword(1, b=2)
mandatory_keyword(1) # Noncompliant. Missing keyword argument "b"
break and continue are control flow statements used inside of loops. break is used to break out of its innermost enclosing loop and continue will continue with the next iteration.
The example below illustrates the use of break in a while loop:
n = 1
while n < 10:
    if n % 3 == 0:
        print("Found a number divisible by 3", n)
        break
    n = n + 1
Python allows developers to customize how code is interpreted by defining special methods (also called magic methods). For example, it is possible to define an object's own truthiness or falsiness by overriding the __bool__ method. It is invoked when the built-in bool() function is called on an object. The bool() function returns True or False based on the truth value of the object.
The Python interpreter will call these methods when performing the operation they're associated with. Each special method expects a specific return type. Calls to a special method will throw a TypeError if its return type is incorrect.
An issue will be raised when one of the following methods doesn't return the indicated type:
__bool__ method should return a bool
__index__ method should return an integer
__repr__ method should return a string
__str__ method should return a string
__bytes__ method should return bytes
__hash__ method should return an integer
__format__ method should return a string
__getnewargs__ method should return a tuple
__getnewargs_ex__ method should return something of the form tuple(tuple, dict)
class MyClass:
    def __bool__(self):
        return 0 # Noncompliant: return a value of type bool here
obj1 = MyClass()
print(bool(obj1)) # TypeError: __bool__ should return bool, returned int
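A compliant sketch where __bool__ returns an actual bool:

```python
class MyClass:
    def __bool__(self):
        return False  # Compliant: the return type is bool

obj1 = MyClass()  # bool(obj1) now evaluates without raising TypeError
```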
Getting, setting and deleting items using square brackets requires the accessed object to have special methods:
Getting items such as my_variable[key] requires my_variable to have the __getitem__ method, or the __class_getitem__ method if my_variable is a class.
Setting items such as my_variable[key] = 42 requires my_variable to have the __setitem__ method.
Deleting items such as del my_variable[key] requires my_variable to have the __delitem__ method.
Performing these operations on an object that doesn't have the corresponding method will result in a TypeError.
To fix this issue, make sure that the class for which you are trying to perform item operations implements the required methods.
del (1, 2)[0] # Noncompliant: tuples are immutable
(1, 2)[0] = 42 # Noncompliant
(1, 2)[0]
class A:
    def __init__(self, values):
        self._values = values
a = A([0,1,2])
a[0] # Noncompliant
del a[0] # Noncompliant
a[0] = 42 # Noncompliant
class B:
    pass
B[0] # Noncompliant
The regex function re.sub can be used to perform a search and replace based on regular expression matches. The repl parameter can contain references to capturing groups used in the pattern parameter. This can be achieved with \n to reference the n’th group.
When referencing a nonexistent group an error will be thrown for Python < 3.5 or replaced by an empty string for Python >= 3.5.
import re
re.sub(r"(a)(b)(c)", r"\1, \9, \3", "abc") # Noncompliant - raises re.error: invalid group reference
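A compliant sketch where the replacement only references groups that exist in the pattern (groups 1 through 3 here):

```python
import re

# Compliant: the pattern defines exactly three groups, and the
# replacement references only \1, \2 and \3
result = re.sub(r"(a)(b)(c)", r"\1, \2, \3", "abc")
```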
The first argument to super must be the name of the class making the call. If it's not, the result will be a runtime error.
class Person(object):
    # ...
    pass

class PoliceOfficer(Person):
    def __init__(self, name):
        super(Person, self).__init__(name) # Noncompliant: the first argument should be "PoliceOfficer"
Using “null=True” on string-based fields can lead to inconsistent and unexpected behavior. In Django, “null=True” allows the field to have a NULL value in the database. However, the Django convention to represent the absence of data for a string is an empty string. Having two ways to represent the absence of data can cause problems when querying and filtering on the field. For example, if a CharField with “null=True” has a value of NULL in the database, querying for an empty string will not return that object.
class ExampleModel(models.Model):
    name = models.CharField(max_length=50, null=True) # Noncompliant
GraphQL introspection is a feature that allows client applications to query the schema of a GraphQL API at runtime. It provides a way for developers to explore and understand the available data and operations supported by the API.
This feature is a diagnostic tool that should only be used in the development phase as its presence also creates risks.
Clear documentation and API references should be considered better discoverability tools for a public GraphQL API.
from graphql_server.flask import GraphQLView
app.add_url_rule("/api",
    view_func=GraphQLView.as_view( # Noncompliant
        name="api",
        schema=schema,
    )
)
Functions, methods, or lambdas with a long parameter list are difficult to use, as maintainers must figure out the role of each parameter and keep track of their position.
def set_coordinates(x1, y1, z1, x2, y2, z2): # Noncompliant
    # ...
By default, only dictionary objects can be serialized in a Django JSON-encoded response. Before ECMAScript 5, serializing non-dictionary objects could lead to security vulnerabilities. Since most modern browsers implement ECMAScript 5, this vector of attack is no longer a threat and it is possible to serialize non-dictionary objects by setting the safe flag to False. However, if this flag is not set, a TypeError will be thrown by the serializer.
Despite this possibility, it is still recommended to serialize dictionary objects, as an API based on dict is generally more extensible and easier to maintain.
from django.http import JsonResponse
response = JsonResponse([1, 2, 3]) # Noncompliant: raises TypeError unless "safe" is set to False
The walrus operator := (also known as "assignment expression") should be used with caution as it can easily make code more difficult to understand and thus maintain. In such cases it is advised to refactor the code and use an assignment statement (i.e. =) instead.
Reasons why it is better to avoid using the walrus operator in Python:
Readability: The walrus operator can lead to more complex and nested expressions, which might reduce the readability of the code, especially for developers who are not familiar with this feature.
Compatibility: If you are working on projects that need to be compatible with older versions of Python (before 3.8), you should avoid using the walrus operator, as it won’t be available in those versions.
v0 = (v1 := f(p)) # Noncompliant: Use an assignment statement ("=") instead; ":=" operator is confusing in this context
f'{(x:=10)}' # Noncompliant: Move this assignment out of the f-string; ":=" operator is confusing in this context
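A compliant sketch of the first case, using a plain assignment statement. The `f` helper is hypothetical, standing in for whatever function the original code calls:

```python
def f(p):
    # hypothetical helper, standing in for the f() in the example above
    return p * 2

p = 21
v1 = f(p)  # Compliant: an ordinary assignment statement is easier to read
v0 = v1
```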
The else clause of a loop is skipped when a break is executed in this loop. In other words, a loop with an else but no break statement will always execute the else part (unless of course an exception is raised or return is used). If this is what the developer intended, it would be much simpler to have the else statement removed and its body unindented. Thus having a loop with an else and no break is most likely an error.
from typing import List

def foo(elements: List[str]):
    for elt in elements:
        if elt.isnumeric():
            return elt
    else: # Noncompliant: no break in the loop
        raise ValueError("List does not contain any number")

def bar(elements: List[str]):
    index = 0
    while index < len(elements):
        if elements[index].isnumeric():
            return elements[index]
        index += 1
    else: # Noncompliant: no break in the loop
        raise ValueError("List does not contain any number")
Class methods that don't access instance data can and should be static because they yield more performant code.
To implement a static method in Python, use either @classmethod or @staticmethod. A class method receives the class as an implicit first argument, just like an instance method receives the instance. A static method does not receive an implicit first argument.
class Utilities:
    def do_the_thing(self, arg1, arg2): # Noncompliant: neither "self" nor instance data is used
        # ...
Because a subclass instance may be used as an instance of the superclass, overriding methods should uphold the aspects of the superclass contract that relate to the Liskov Substitution Principle. Specifically, an overriding method should be callable with the same parameters as the overridden one.
The following modifications are OK:
Adding an optional parameter, i.e. with a default value, as long as they don’t change the order of positional parameters.
Renaming a positional-only parameter.
Reordering keyword-only parameters.
Adding a default value to an existing parameter.
Changing the default value of an existing parameter.
Extending the ways a parameter can be provided, i.e. changing a keyword-only or positional-only parameter to a keyword-or-positional parameter. This is only true if the order of positional parameters doesn't change. New positional parameters should be placed at the end.
Adding a vararg parameter (*args).
Adding a keywords parameter (**kwargs).
The following modifications are not OK:
Removing parameters, even when they have default values.
Adding mandatory parameters, i.e. without a default value.
Removing the default value of a parameter.
Reordering parameters, except when they are keyword-only parameters.
Removing some ways of providing a parameter. If a parameter could be passed as keyword it should still be possible to pass it as keyword, and the same is true for positional parameters.
Removing a vararg parameter (*args).
Removing a keywords parameter (**kwargs).
This rule raises an issue when the signature of an overriding method does not accept the same parameters as the overridden one. Only instance methods are considered; class methods and static methods are ignored.
class ParentClass(object):
    def mymethod(self, param1):
        pass

class ChildClassRenamed(ParentClass):
    def mymethod(self, renamed): # No issue, but this is suspicious. Rename this parameter "param1" or use positional-only arguments if possible.
        pass
Having all branches of an if chain with the same implementation indicates a problem.
In the following code:
if b == 0: # Noncompliant
    do_one_more_thing()
elif b == 1:
    do_one_more_thing()
else:
    do_one_more_thing()
b = 4 if a > 12 else 4 # Noncompliant
Every instance method is expected to have at least one positional parameter. This parameter will reference the object instance on which the method is called. Calling an instance method which doesn’t have at least one parameter will raise a TypeError. By convention, this first parameter is usually named self.
Class methods, i.e. methods annotated with @classmethod, also require at least one parameter. The only difference is that they will receive the class itself instead of a class instance. By convention, this first parameter is usually named cls.
class MyClass:
    def instance_method(): # Noncompliant: "self" parameter is missing.
        print("instance_method")

    @classmethod
    def class_method(): # Noncompliant: "cls" parameter is missing.
        print("class_method")
When tested for truthiness, a sequence or collection will evaluate to False if it is empty (its __len__ method returns 0) and to True if it contains at least one element.
Using the assert statement on a tuple literal will therefore always fail if the tuple is empty, and always succeed otherwise.
The assert statement does not take parentheses around its parameters. Calling assert(x, y) will test if the tuple (x, y) is True, which is always the case.
There are two possible fixes:
If your intention is to test the first value of the tuple and use the second value as a message, simply remove the parentheses.
If your intention is to check that every element of the tuple is True, test each value separately.
def test_values(a, b):
    assert (a, b) # Noncompliant: will always be True
The long suffix should always be written in uppercase, i.e. ‘L’, as the lowercase ‘l’ can easily be confused with the digit one ‘1’.
return 10l # Noncompliant; easily confused with one zero one
The tf.gather function allows you to gather slices from a tensor along a specified axis according to the indices provided. The validate_indices argument is deprecated and setting its value has no effect. Indices are always validated on CPU and never validated on GPU.
import tensorflow as tf
x = tf.constant([[1, 2], [3, 4]])
y = tf.gather(x, [1], validate_indices=True) # Noncompliant: validate_indices is deprecated
Python allows developers to customize how code is interpreted by defining special methods (also called magic methods). For example, it is possible to override how the multiplication operator (a * b) applies to instances of a class by defining the __mul__ and __rmul__ methods in that class. Whenever a multiplication operation is performed with this class, the Python interpreter will call one of these methods instead of performing the default multiplication.
Each special method expects a specific number of parameters. The Python interpreter will call these methods with those parameters. Calls to a special method will throw a TypeError if it is defined with an incorrect number of parameters.
class A:
    def __mul__(self, other, unexpected): # Noncompliant: too many parameters
        return 42

    def __add__(self): # Noncompliant: missing one parameter
        return 42
A() * 3 # TypeError: __mul__() missing 1 required positional argument: 'unexpected'
A() + 3 # TypeError: __add__() takes 1 positional argument but 2 were given
Using octal escapes in regular expressions can create confusion with backreferences. Octal escapes are sequences of digits that represent a character in the ASCII table, and they are sometimes used to represent special characters in regular expressions. However, they can be easily mistaken for backreferences, which are also sequences of digits that represent previously captured groups. This confusion can lead to unexpected results or errors in the regular expression.
import re
match = re.match(r"\101", "A") # Noncompliant: "\101" is the octal escape for "A" but reads like a group reference
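A compliant sketch using the unambiguous hexadecimal escape instead, which cannot be mistaken for a backreference:

```python
import re

# Compliant: \x41 is the hexadecimal escape for "A" and cannot be
# confused with a backreference to a capturing group
match = re.match(r"\x41", "A")
```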
A method that is never called is dead code, and should be removed. Cleaning out dead code decreases the size of the maintained codebase, making it easier to understand the program and preventing bugs from being introduced.
Python has no real private methods. Every method is accessible. There are, however, two conventions indicating that a method is not meant to be "public":
methods with a name starting with a single underscore (ex: _mymethod) should be seen as non-public and might change without prior notice. They should not be used by third-party libraries or software. It is ok to use those methods inside the library defining them, but it should be done with caution.
"class-private" methods have a name which starts with at least two underscores and ends with at most one underscore. These methods' names will be automatically mangled to avoid collision with subclasses' methods. For example, __mymethod will be renamed as _classname__mymethod, where classname is the method's class name without its leading underscore(s). These methods shouldn't be used outside of their enclosing class.
This rule raises an issue when a class-private method (two leading underscores, at most one underscore at the end) is never called inside the class. Class methods, static methods and instance methods will all raise an issue.
class Noncompliant:
    @classmethod
    def __mangled_class_method(cls): # Noncompliant
        print("__mangled_class_method")

    @staticmethod
    def __mangled_static_method(): # Noncompliant
        print("__mangled_static_method")

    def __mangled_instance_method(self): # Noncompliant
        print("__mangled_instance_method")