great way of making sure your application gets security updates as soon as they are available. Once a vendor releases a security update, it is crucial to apply it in a timely manner before malicious actors exploit the vulnerability. Relying on manual updates is usually too late, especially if the application is publicly accessible on the internet.
Get Started
- CodeAnt AI
- Control Center
- Pull Request Review
- IDE
- Compliance
- Anti-Patterns
- Code Governance
- Infrastructure Security Database
- Application Security Database
Php
Overly complicated regular expressions are hard to read and to maintain and can easily cause hard-to-find bugs. If a regex is too complicated, you should consider replacing it or parts of it with regular code or splitting it apart into multiple patterns at least.
The complexity of a regular expression is determined as follows:
Each of the following operators increases the complexity by an amount equal to the current nesting level and also increases the current nesting level by one for its arguments:
`| - when multiple | operators are used together, the subsequent ones only increase the complexity by 1
Quantifiers (*, +, ?, {n,m}, {n,} or {n})
Non-capturing groups that set flags (such as (?i:some_pattern) or (?i)some_pattern`)
Lookahead and lookbehind assertions
Additionally, each use of the following features increase the complexity by 1 regardless of nesting:
character classes
back references
Jump statements, such as return, goto, and continue
let you change the default flow of program execution, but jump statements that direct the control flow to the original direction are just a waste of keystrokes.
WordPress relies a lot on the configuration located in a file named wp-config.php. This file contains mostly define statements and each of them creates a constant for a given WordPress option. However, no warning appears if an option is misspelled: the statement simply defines a constant which is never used.
This rule raises an issue when a file named wp-config.php defines a constant whose name is slightly different from a known WordPress option.
The PHPUnit test runner does execute public methods defined within test classes that have a name starting with “test” or the @test annotation. Methods that do not convey to this will not get executed.
This rule raises an issue on methods marked as test methods (by name or annotation) but do not have a public visibility. An issue is also raised on public methods that are not marked as tests, do contain assertions, and are not called from within another discoverable test method within the class. No issues are raised in abstract classes, or on public static method which provides data.
WordPress makes it possible to define options using define statements inside a configuration file named wp-config.php. However, if the statements are located after the settings are loaded at the end of this file, they are not taken into account by WordPress. This rule raises an issue when a define statement appears after wp-settings.php is loaded.
This rule allows banning certain PHP functions.
PHPUnit assertions do throw a PHPUnit\Framework\ExpectationFailedException exception when they fail. This is how PHPUnit internally notices when assertions within testcases fail. However, if such an exception type or one of its parent types is captured within a try-catch block and not rethrown, PHPUnit does not notice the assertion failure.
This check raises an issue on assertions within the try body of a try-catch block that do catch exceptions of the type PHPUnit\Framework\ExpectationFailedException, PHPUnit\Framework\AssertionFailedError, or Exception, and do not handle the variable holding the exception.
The elseif keyword is preferred over the else if keywords in PHP due to its consistency with other programming languages.
elseif is a single token, making it more efficient for the PHP parser to process. Additionally, using elseif encourages developers to follow a unified coding style and maintain a consistent syntax throughout their PHP codebase.
Overall, the use of elseif improves code clarity, reduces potential errors, and enhances the overall maintainability of PHP projects.
Instances of classes that do not derive from the “Throwable” interface cannot be used in a PHP “throw” statement.
Many built-in exceptions such as “Exception” and the SPL exception classes do implement the “Throwable” interface and can be extended when creating custom exceptions.
This rule raises an issue when an instance of a class that does not implement the “Throwable” interface is used in a “throw” statement .
Boolean operators are used to combine conditional statements based on their value. In PHP, there are two different sets of operators to use for AND and OR:
`&& / ||
and / or
The difference between these sets is the precedence, which specifies how “tightly” two expressions are bound together. Because and / or have a lower precedence than almost any other operator, using them instead of && / ||` may not have the result you expect.
When verifying that code raises an exception, a good practice is to avoid having multiple method calls inside the tested code, to be explicit about what is exactly tested.
When two of the methods can raise the same exception, not respecting this good practice is a bug, since it is not possible to know what is really tested.
Without OAEP in RSA encryption, it takes less work for an attacker to decrypt the data or infer patterns from the ciphertext. This rule logs an issue when openssl_public_encrypt is used with one the following padding constants: OPENSSL_NO_PADDING or OPENSSL_PKCS1_PADDING or OPENSSL_SSLV23_PADDING
.
e repair and optimization mode that can be activated by setting WP_ALLOW_REPAIR to true in the configuration.
If activated, the repair page can be accessed by any user, authenticated or not. This makes sense because if the database is corrupted, the authentication mechanism might not work.
Malicious users could trigger this potentially costly operation repeatadly slowing down the website, and making it unavailable.
Regular expressions are patterns used to match and manipulate strings based on specific rules.
The pattern provided to PHP regular expression functions is required to be enclosed in valid delimiters to be working correctly.
The PSR2 standard recommends listing modifiers in the following order to improve the readability of PHP source code:
final or abstract
public or protected or private
static
When present, the `abstract and final declarations should precede the visibility declaration.
When present, the static` declaration should come after the visibility declaration.
The <?php tag is used to explicitly mark the start of PHP code in a file. It is considered the recommended and portable way to open PHP blocks. On the other hand, the <?= tag is a shorthand for <?php echo and is specifically used to output variables or expressions directly without using the echo statement. Not using these tags can make the code less readable and harder to maintain, as it deviates from the standard conventions followed by most PHP developers.
In PHP both the way to declare a constructor and the way to make a call to a parent constructor have evolved.
When declaring constructors with the __construct name, nested calls to parent constructors should also use the new __constructor
name.
There’s no point in having a PHPUnit test case without any test methods. Similarly, you shouldn’t have a file in the tests directory which extends PHPUnit\Framework\TestCase but no tests in the file. Doing either of these things may lead someone to think that uncovered classes have been tested. Add some test method or make the class abstract if it is used by a real test case class.
In PHP, nested functions are functions, which are defined inside other functions. They have access to the variables of the enclosing functions. Once the parent function declares the nested function, the nested function becomes available to the global scope.
Code that uses nested functions can become difficult to read, refactor and to maintain and should therefore be avoided.
include_once and include
only generate a warning if an error occurs during the operation. Because of this they should only be used after possible error conditions have been checked.
Explicitly declaring method visibility provides clarity and improves code readability. This leads to:
Simplifying access to and use of the methods for other developers, enhancing the maintainability of the codebase.
Promoting the principle of encapsulation, allowing you to control the accessibility of methods and prevent unintended access or modifications from external code.
By explicitly stating the visibility, you establish a clear contract for how the methods should be interacted with, making it easier to understand and reason about the behavior of the code.
Available access modifiers are:
`private - access allowed only within the same class
protected - access allowed to the class and its child classes
public` - unfettered access by all (default)
PHP lets you make static calls to non-static methods, ex: A::f();. But the fact that you can doesn’t mean you should. While such calls will work when there’s no $this
reference in the called method, a fatal error will occur when there is. Furthermore, such calls have been deprecated in PHP 7, and their support may be removed from future releases.
sleep
is sometimes used in a mistaken attempt to prevent Denial of Service (DoS) attacks by throttling response rate. But because it ties up a thread, each request takes longer to serve that it otherwise would, making the application more vulnerable to DoS attacks, rather than less.
Shared coding conventions allow teams to collaborate effectively. Writing colors in upper case makes them stand out at such, thereby making the code easier to read.
This rule checks that hexadecimal color definitions are written in upper case.
Creating a new Exception
without actually throwing it is useless and is probably due to a mistake.
The `exit(…) and die(…) statements should absolutely not be used in Web PHP pages as this might lead to a very bad user experience. In such case, the end user might have the feeling that the web site is down or has encountered a fatal error.
But of course PHP can also be used to develop command line application and in such case use of exit(…) or die(…) statement can be justified but must remain limited and not spread all over the application. We expect exceptions to be used to handle errors and those exceptions should be caught just before leaving the application to specify the exit code with help of exit(…) or die(…)` statements.
ible to install and update themes and plugins in the Administration Screens. This is a convenient feature but it’s also risky. If an attacker manages to get access to the WordPress administrative area, this person can then take advantage of the theme/plugin management feature to upload PHP code on the server and to execute that code. Defining the DISALLOW_FILE_MODS option to true in wp-config.php disables this risky feature. The default value is false.
Superglobal variables are predefined variables available in all scopes throughout a script. However, accessing them directly is considered bad practice. Instead, they should be accessed through an object or framework that handles sanitation and validation.
An preg_replace call always performs an evaluation of the first argument as a regular expression, even if no regular expression features were used. This has a significant performance cost and therefore should be used with care.
When preg_replace is used, the first argument should be a real regular expression. If it’s not the case, str_replace does exactly the same thing as preg_replace without the performance drawback of the regex.
This rule raises an issue for each preg_replace used with a simple string as first argument which doesn’t contains special regex character or pattern.
`ini_set changes the value of the given configuration option for the duration of the script’s execution. While there may be a reason to do this, you should make sure that it’s a very good reason indeed, because this is the sort of “magic” change which can cause severe teeth-gnashing and hair tearing when the script needs to be debugged.
For instance, if the user explicitly turns logging on for a script, but then the script itself uses ini_set(‘display_errors’, 0);` to turn logging back off, it is likely that every other aspect of the environment will be examined before, in desperation, the script is read to figure out where the logging is going.
A `switch and a chain of if/else if statements is evaluated from top to bottom. At most, only one branch will be executed: the first one with a condition that evaluates to true.
Therefore, duplicating a condition automatically leads to dead code. Usually, this is due to a copy/paste error. At best, it’s simply dead code and at worst, it’s a bug that is likely to induce further bugs as the code is maintained, and obviously it could lead to unexpected behavior.
For a switch, if the first case ends with a break, the second case will never be executed, rendering it dead code. Worse there is the risk in this situation that future maintenance will be done on the dead case, rather than on the one that’s actually used.
On the other hand, if the first case does not end with a break`, both cases will be executed, but future maintainers may not notice that.
Shared coding conventions allow teams to collaborate effectively.
This rule checks that when
an assignment is too long to fit on one line, the line break is inserted before the
=
rather than after, and the second line of the statement is indented from the first.an object operator is the first thing on the line, it is indented from the previous line.
In PHP 5.4, break and continue no longer accept arguments that require computation. Static arguments are still okay except for zero (0
).
The PHP runtime will allow the application to access all files underneath the configured set of directories. If no value is set, the application may access any file on the filesystem.
Instead of using boolean literals or null in an equality with assertSame() or assertEquals(), it is recommended to rely on the alternative functions assertTrue(), assertFalse(), assertNull(), and assertNotNull().
When a back reference in a regex refers to a capturing group that hasn’t been defined yet (or at all), it can never be matched. Named back references throw a warning in that case; numeric back references fail silently when they can’t match, simply making the match fail.
When the group is defined before the back reference but on a different control path (like in (.)|\1 for example), this also leads to a situation where the back reference can never match.
ated by a WordPress server should be considered as security-sensitive. They may contain sensitive data which is stored in the files or in the database of the server. It’s important for the administrator of a WordPress server to understand what they contain and to which server they are sent.
WordPress makes it possible to block external requests by setting the WP_HTTP_BLOCK_EXTERNAL option to true. It’s then possible to authorize requests to only a few servers using another option named WP_ACCESSIBLE_HOSTS.
Most applications do not require or expect the file access functions to download remotely accessible files. However, attackers can abuse these remote file access features while exploiting other vulnerabilities, such as path traversal issues.
Files that define symbols such as classes and variables may be included into many files. Simply performing that inclusion should have no effect on those files other than declaring new symbols. For instance, a file containing a class definition should not also contain side-effects such as `print statements that will be evaluated automatically on inclusion. Logic should be segregated into symbol-only files and side-effect-only files. The type of operation which is not allowed in a symbol-definition file includes but is not limited to:
generating output
modifying ini` settings
emitting errors or exceptions
modifying global or static variables
reading/writing files
Variable variables in PHP allow you to use the value of a variable as the name of another variable. This feature can be useful in dynamic programming scenarios where variable names need to be dynamically determined and manipulated.
In PHP, keywords and constants are case-insensitive, meaning they can be written in either lower case or upper case without affecting their functionality. This allows for more flexibility and ease of use when writing code.
However, it is generally recommended to follow a consistent casing convention for readability and maintainability purposes. Relevant constants are true, false and null.
The count of elements from an array or Countable object is always greater than or equal to zero. Testing it doesn’t make sense, since the result is always true
.
Having characters before <?php
can cause “Cannot modify header information” errors and similar problems with Ajax requests.
Namespaces should be preferred over the include or require functions in PHP because they provide a cleaner and more organized approach for managing code dependencies. Namespaces allow you to logically group related classes, functions, and constants, preventing naming conflicts and improving code readability. They also promote code reusability by enabling modularization and encapsulation. On the other hand, using include or require functions directly can lead to a cluttered global namespace and make it harder to track and manage dependencies, especially in larger projects.
While it is technically correct to assign to parameters from within function bodies, doing so before the parameter value is read is likely a bug. Instead, initial values of parameters should be, if not treated as read-only, then at least read before reassignment.
GET URL parameter can be disclosed in a variety of ways:
Directly in a web browser address bar.
In navigation history.
In web servers or intermediate proxies log files.
Function arguments should all have different names to prevent any ambiguity. Indeed, if arguments have the same name, the last duplicated argument hides all the previous arguments with the same name. This hiding makes no sense, reduces understandability and maintainability, and obviously can be error prone.
Include injections occur in an application when the application retrieves data from a user or a third-party service and inserts it into an include expression without sanitizing it first.
If an application contains an include expression that is vulnerable to injections, it is exposed to attacks that target the underlying server.
Instances of classes that do not derive from the “Throwable” interface cannot be used in a PHP “throw” statement. Thus, it does not make sense to try to catch such objects within a “try-catch” block.
Many built-in exceptions such as “Exception” and the SPL exception classes do implement the “Throwable” interface and can be extended when creating custom exceptions.
This rule raises an issue when the classes used to specify the type of objects to be caught in a “try-catch” block do not derive from “Throwable” .
When a function parameter has a nullable type, e.g., parameter param in f(?int param)
, it must be explicitly provided in every function call. A nullable-type parameter has no default value.
While PHP variables obligingly spring into existence the first time you use them, relying on this behavior is a bad idea for two reasons. First, relying on the default value of an uninitialized variable can cause problems in some cases. Second, and more importantly, it can pose a security risk when register_globals is enabled. (Note that register_globals
is deprecated in PHP 5.3 and removed in PHP 5.4.)
For example, with the default provided regular expression ^[a-z][a-zA-Z0-9]*$
, the function:
The repetition of a prefix operator (!, or ~
) is usually a typo. The second operator invalidates the first one:
Repeating an exception class in a single `catch clause will not fail but it is not what the developer intended. Either the class is not the one which should be caught, or this is dead code.
Having a subclass and a parent class in the same catch clause is also useless. It is enough to keep only the parent class.
This rule raises an issue when an exception class is duplicated in a catch clause, or when an exception class has a parent class in the same catch` clause.
Directives in the php.ini files can be of type `boolean, integer or string.
For boolean acceptable values are 0, 1, true, false, yes, no, on and off.
For integers they can be qualified with k, m or g`. E.g. 16k means 16000.
The complete list of directive is at http://php.net/manual/en/ini.list.php
In PHP, the define function allows you to define a named constant with a specific value, which cannot be changed later in the code. Once a constant has been defined, it can be used throughout the entire script, including in function and class definitions.
Assigning a value to the same constant name using two or more define statements does not cause PHP to fail. In such a case, PHP only issues a warning and ignores the second and further define.
When dynamic loading is enabled, PHP code can load arbitrary PHP extensions by calling the dl function. This can be used to bypass restrictions set with the open_basedir configuration.
PHP defaults to allowing dynamic loading.
Throwing generic exceptions such as Error, RuntimeException, Throwable, and Exception
will have a negative impact on any code trying to catch these exceptions.
From a consumer perspective, it is generally a best practice to only catch exceptions you intend to handle. Other exceptions should ideally be let to propagate up the stack trace so that they can be dealt with appropriately. When a generic exception is thrown, it forces consumers to catch exceptions they do not intend to handle, which they then have to re-throw.
Besides, when working with a generic type of exception, the only way to distinguish between multiple exceptions is to check their message, which is error-prone and difficult to maintain. Legitimate exceptions may be unintentionally silenced and errors may be hidden.
When throwing an exception, it is therefore recommended to throw the most specific exception possible so that it can be handled intentionally by consumers.
Well-named functions can allow the users of your code to understand at a glance what to expect from the function - even before reading the documentation. Toward that end, methods returning a boolean property should have names that start with “is” or “has” rather than with “get”.
Note that this rule will only apply to functions that are documented to return a boolean.
When using metacharacters like \w without the Unicode flag or when using hard-coded character classes like [a-zA-Z], letters outside of the ASCII range, such as umlauts, accented letters, or letters from non-Latin languages, won’t be matched. This may cause code to incorrectly handle input containing such letters.
To correctly handle non-ASCII input, it is recommended to use the Unicode flag \u or Unicode character properties like \p{L}.
Testing equality or nullness with PHPUnit’s assertTrue() or assertFalse()
should be simplified to the corresponding dedicated assertion.
echo
can be called with or without parentheses, but it is best practice to leave parentheses off the call because using parentheses with multiple arguments will result in a parse error.
In PHP it is not required to initialize variables before their usage. However, using uninitialized variables is considered bad practice and should be avoided because of the following reasons:
The value and type of uninitialized variables depend on the context of their first usage. It is better to be explicit about those to avoid confusion.
The interpreter raises a warning or a notice in many cases.
ss administrator and editor roles can add unfiltered HTML content in various places, such as post content. This includes the capability to add JavaScript code.
If an account with such a role gets hijacked, this capability can be used to plant malicious JavaScript code that gets executed whenever somebody visits the website.
In a file mixing PHP and HTML, all PHP tags should be closed. Failure to close a tag in this type of file could lead to unexpected output when the file is processed.
In PHP, references allow you to create multiple names for the same variable, enabling you to access and manipulate its value through different identifiers. They are denoted by the ampersand symbol & placed before the variable name during declaration or assignment.
Any modification a method makes to a function parameter passed by reference will also be made to the original value.
This feature can be difficult to use correctly, particularly if the callee is not expecting a reference.
The improper use of references in function calls can make code less efficient rather than more efficient.
Pre-processing on the server side is often required to check users authentication when working in CGI mode. Those preliminary actions can also position diverse configuration parameters necessary for the CGI script to work correctly.
When the call to a function doesn’t have any side effect, what is the point of making the call if the results are ignored? In such cases, either the function call is useless and should be dropped, or the source code doesn’t behave as expected.
For better organization and clarity in test suites, test classes should end with `*Test.php. This naming convention helps to easily identify and distinguish test classes from other classes in the codebase. It allows for automated test runners or frameworks to locate and execute the tests systematically.
The PHPUnit command-line test runner will look for *Test.php` files. In that case, test files without this pattern are ignored and not executed without warning.
Shared coding conventions allow teams to collaborate effectively. This rule flags all Perl-style comments.
Shared coding conventions make it possible for a team to collaborate efficiently. This rule raises issues for failures to comply with formatting standard.
By default, this rule conforms to the PER (PHP Evolving Recommendation) standard.
Calling a function or a method with fewer or more arguments than expected will raise a TypeError. This is usually a bug and should be fixed.
Provide missing arguments to the call, or define default values if there are fewer arguments.
Reduce the number of arguments provided by the function call, or add more parameter if there are more arguments than expected.
Assertions comparing an object to itself are more likely to be bugs due to developer’s carelessness.
This rule raises an issue when the actual expression matches the expected expression.
The PHP function preg_replace can be used to perform a search and replace based on regular expression matches. The replacementparametercancontainreferencestocapturinggroupsusedinthepattern parameter. This can be achieved with \n or $n to reference the n’th group.
When referencing a nonexistent group, it will be substituted with an empty string. This is in general not the intended behavior, and might stay unnoticed since no warning is raised.
The empty() function in PHP is specifically designed to check if a variable is empty, and it returns true if the variable is empty or evaluates to false. It handles various types of values, such as empty strings, zero, null, and arrays with no elements. This makes the code more readable because it clearly expresses the intention of checking for emptiness.
On the other hand, the count() function in PHP is primarily used to count the number of elements in an array or the properties of an object. It’s less explicit and may be less intuitive for someone reading the code.
Exception handling provides a way to help developers handle and recover from runtime errors and unexpected events during program execution. When an error occurs, an exception is thrown and can be caught by an appropriate try-catch block, allowing the program to handle the exception and prevent the program from crashing.
A problem arises, when an exception or error class — an instance of Throwable — in a catch clause does not exist, as PHP won’t generate an error.
This typically occurs when being in a namespace trying to catch PHP built-in exception classes without escaping to the global namespace or importing the classes.
In summary, this rule raises an issue when, being in a namespace, an undefined class belonging to that namespace is caught.
Included variables may have been set by user input could contain unexpected, and potentially dangerous values.
guessed (not generated with a secure pseudo random generator, or with insufficient length …) an attacker may be able to hijack another user’s session.
Certain functions exist in PHP only as aliases of other functions. These aliases have been made available for backward compatibility, but should really be removed from code.
This rule looks for uses of the following aliases:
Alias | Replacement |
---|---|
`chop | rtrim |
close | closedir |
doubleval | floatval |
fputs | fwrite |
ini_alter | ini_set |
is_double | is_float |
is_integer | is_int |
is_long | is_int |
is_real | is_float |
is_writeable | is_writable |
join | implode |
key_exists | array_key_exists |
magic_quotes_runtime | set_magic_quotes_runtime |
pos | current |
show_source | highlight_file |
sizeof | count |
strchr | strstr` |
The ability to define default values for method arguments can make a method easier to use. Default argument values allow callers to specify as many or as few arguments as they want while getting the same functionality and minimizing boilerplate, wrapper code.
But all method arguments with default values should be declared after the method arguments without default values. Otherwise, it makes it impossible for callers to take advantage of defaults; they must re-specify the defaulted values in order to “get to” the non-default arguments.
`file_uploads is an on-by-default PHP configuration that allows files to be uploaded to your site. Since accepting candy files from strangers is inherently dangerous, this feature should be disabled unless it is absolutely necessary for your site.
This rule raises an issue when file_uploads` is not explicitly disabled.
It is recommended not to declare more than one property per statement for the sake of code readability and maintainability. Declaring multiple properties in a single statement can make the code harder to understand and debug. It also increases the risk of introducing errors or overlooking specific property assignments.
Shared coding conventions allow teams to collaborate efficiently. To avoid the confusion that can be caused by tangling two coding languages in the same file, inline HTML should be avoided.
References in a class to static class members (fields or methods) can be made using either self::varorstatic::var (introduced in 5.3). The difference between the two is one of scope. Confusingly, in subclasses, the use of self:: references the original definition of the member, i.e. the superclass version, rather than any override at the subclass level. static::
, on the other hand, references the class that was called at runtime.
If a function does not return anything, it makes no sense to use its output. Specifically, passing it to another function, or assigning its “result” to a variable is probably a bug because such functions return nothing, which is probably not what was intended.
Using constructs with parentheses can be misleading as it will produce a syntax that looks like a normal function call. However those constructs have lower precedence in the evaluation order than some others operators, this can lead to a behavior completely different from what the function syntax would hint. Also, some of the constructs have optional parameters, while modifying the code we may remove a parameter while keeping the parentheses, resulting in invalid code.
In PHPUnit, to test that an exception is thrown in a given piece of code, the expectException*() methods or the @expectedException* annotations can be used. For such a test to succeed, something in the test method has to throw an exception with the awaited properties. Having an assertion at the end of such a test method, means that, if the test succeeds, that assertion was never evaluated because an exception was thrown before getting to that point.
ible to edit theme and plugin files directly in the Administration Screens. While it may look like an easy way to customize a theme or do a quick change, it’s a dangerous feature. When visiting the theme or plugin editor for the first time, WordPress displays a warning to make it clear that using such a feature may break the web site by mistake. More importantly, users who have access to this feature can trigger the execution of any PHP code and may therefore take full control of the WordPress instance. This security risk could be exploited by an attacker who manages to get access to one of the authorized users. Setting the DISALLOW_FILE_EDIT option to true in wp-config.php disables this risky feature. The default value is false.
To reference the current class instance the keyword $this can be used. Through the use of this keyword you have access to class properties and methods.
Static methods can be accessed without instantiating the class, so thisisnotavailableforthem.Usingthis in a static context will result in a runtime error.
sizeof() is an alias of count() so the two offer the same results, but sizeof()
has a different meaning in other languages and may confuse developers who are new to PHP.
Exceptions handlers (`catch) are evaluated in the order they are written. Once a match is found, the evaluation stops.
In some contexts a catch block is dead code as it will never catch any exception:
If there is a handler for a base class followed by a handler for class derived from that base class, the second handler will never trigger: the handler for the base class will match the derived class, and will be the only executed handler.
When multiple catch blocks try to catch the same exception class, only the first one will be executed.
This rule raises an issue when a catch block catches every exception before a later catch` block could catch it.
Secret keys are used in combination with an algorithm to encrypt data. A typical use case is an authentication system. For such a system to be secure, the secret key should have a value which cannot be guessed and which is long enough to not be vulnerable to brute-force attacks.
A “salt” is an extra piece of data which is included when hashing data such as a password. Its value should have the same properties as a secret key.
This rule raises an issue when it detects that a secret key or a salt has a predictable value or that it’s not long enough.
At root, `require, require_once, include, and include_once all perform the same task of including one file in another. However, the way they perform that task differs, and they should not be used interchangeably.
require includes a file but generates a fatal error if an error occurs in the process.
include also includes a file, but generates only a warning if an error occurs.
Predictably, the difference between require and require_once is the same as the difference between include and include_once. The _once` versions ensure that the specified file is only included once.
The var keyword in PHP was historically used to declare class properties with default public visibility. However, its usage is discouraged as it lacks clarity and explicit visibility declaration.
Instead, PHP introduced the keywords public, protected, and private to clearly define the visibility of class properties. This enhances code readability and maintainability, as it becomes easier to understand and control access to class members. Additionally, using the keywords for explicit visibility helps prevent unintended modifications or security vulnerabilities that may arise from the ambiguity of the var keyword.
There is no reason to concatenate literal strings. Doing so is an exercise in reducing code readability. Instead, the strings should be combined.
Global variables are a useful construct, but they should not be abused. Functions can access the global scope either through the global keyword or though the $GLOBALS
array, but these practices considerably reduce the function’s readability and reusability. Instead, the global variable should be passed as a parameter to the function.
Both php_sapi_name() and the PHP_SAPI
constant give the same value. But calling the method is less efficient that simply referencing the constant.
PHPUnit provides helper functions and annotations to verify that a given block of code throws an exception and to assert different properties of that exception. The provided helper functions are:
`expectException()
expectExceptionCode()
expectExceptionMessage()
expectExceptionMessageRegExp()`
This check raises an issue when the throw of an exception is verified using a try-catch approach instead of relying on the provided helper functions.
Just as pain is your body’s way of telling you something is wrong, errors are PHP’s way of telling you there’s something you need to fix. Neither pain, nor PHP errors should be ignored.
In PHP, PCRE functions, such as preg_match, are often used to perform regular expression operations. It is necessary to enclose the actual pattern with delimiters. The delimiters are usually /. Every occurrence of the delimiter character in the pattern has to be escaped with a backslash.
However, delimiters may also be any non-alphanumeric, non-backslash or non-whitespace character. In addition to two identical delimiters, it is also allowed to use bracket style delimiters with the opening and closing brackets. Using different delimiters out of these possibilities allows avoiding unnecessary escaping, and increases the readability.
In PHP, references allow you to create multiple names for the same variable, enabling you to access and manipulate its value through different identifiers. They are denoted by the ampersand symbol & placed before the variable name during declaration or assignment.
When a reference is used in a foreach loop instead of using a simple variable, the reference remains assigned and keeps its “value” which is a reference, even after the foreach
execution.
The requirement for a final case default clause is defensive programming. The clause should either take appropriate action, or contain a suitable comment as to why no action is taken. Even when the switch covers all current values of an enum, a default case should still be used because there is no guarantee that the enum
won’t be extended.
Nested `switch structures are difficult to understand because you can easily confuse the cases of an inner switch as belonging to an outer statement. Therefore nested switch statements should be avoided.
Specifically, you should structure your code to avoid the need for nested switch statements, but if you cannot, then consider moving the inner switch` to another function.
Because it is dynamically typed, PHP does not enforce a return type on a function. This means that different paths through a function can return different types of values, which can be very confusing to the user and significantly harder to maintain.
In particular, it is consequently also possible to mix empty return statements (implicitly returning null) with some returning an expression. This rule verifies that all the return
statements from a function are consistent.
When a test fails due, for example, to infrastructure issues, you might want to ignore it temporarily. But without some kind of notation about why the test is being ignored, it may never be reactivated. Such tests are difficult to address without comprehensive knowledge of the project, and end up polluting their projects.
This rule raises an issue on each test that is marked as incomplete or skipped without a message explaining the reasoning behind it.
A test case without assertions ensures only that no exceptions are thrown. Beyond basic runnability, it ensures nothing about the behavior of the code under test.
This rule raised an issue when no assertions are found within a PHPUnit test method.
Using a function in PHP with the same name as the nesting class was historically used to declare a class constructor. However, as of PHP 8.0.0, this declaration is discouraged and will provoke an E_DEPRECATED error, albeit it functions as a constructor.
Instead, users should explicitly define the constructor by declaring a __construct(…) function.
However, if both styles are present in the same class, PHP will treat the __construct
function as the class constructor, which can cause unintended behavior.
Adhering to this convention improves readability and maintainability by ensuring that the constructor declaration is named uniformly throughout the codebase.
This rule will check that:
the sql query is not built using a concatenation
there is at least a call to bindParm between the call to prepare and fetch on the PDO connection object
It is needlessly complex to invert the result of a boolean comparison. The opposite comparison should be made instead.
Regular expressions have their own syntax that is understood by regular expression engines. Those engines will throw an exception at runtime if they are given a regular expression that does not conform to that syntax.
To avoid syntax errors, special characters should be escaped with backslashes when they are intended to be matched literally and references to capturing groups should use the correctly spelled name or number of the group.
Sub-patterns can be wrapped by parentheses to build a group. This enables to restrict alternations, back reference the group or apply quantifier to the sub-pattern.
If this group should not be part of the match result or if no reference to this group is required, a non-capturing group can be created by adding ?: behind the opening parenthesis.
However, if this non-capturing group does not have a quantifier, or does not wrap an alternation, then imaging this group is redundant.
Curly brace quantifiers in regular expressions can be used to have a more fine-grained control over how many times the character or the sub-expression preceeding them should occur. They can be used to match an expression exactly n times with `{n}, between n and m times with {n,m}, or at least n times with {n,}. In some cases, using such a quantifier is superfluous for the semantic of the regular expression, and it can be removed to improve readability. This rule raises an issue when one of the following quantifiers is encountered:
{1,1} or {1}: they match the expression exactly once. The same behavior can be achieved without the quantifier.
{0,0} or {0}`: they match the expression zero times. The same behavior can be achieved by removing the expression.
Putting multiple statements on a single line lowers the code readability and makes debugging the code more complex.
Unresolved directive in <stdin> - include::{noncompliant}[]
Write one statement per line to improve readability.
Unresolved directive in <stdin> - include::{compliant}[]
The complexity of an expression is defined by the number of &&, || and condition ? ifTrue : ifFalse
operators it contains.
A single expression’s complexity should not become too high to keep the code readable.
A cross-site request forgery (CSRF) attack occurs when a trusted user of a web application can be forced, by an attacker, to perform sensitive actions that he didn’t intend, such as updating his profile or sending a message, more generally anything that can change the state of the application.
The attacker can trick the user/victim to click on a link, corresponding to the privileged action, or to visit a malicious web site that embeds a hidden web request and as web browsers automatically include cookies, the actions can be authenticated and sensitive.
Alternation is used to match a single regular expression out of several possible regular expressions. If one of the alternatives is empty it would match any input, which is most probably a mistake.
When executing an OS command and unless you specify the full path to the executable, then the locations in your application’s PATH environment variable will be searched for the executable. That search could leave an opening for an attacker if one of the elements in PATH
is a directory under his control.
Having a permissive Cross-Origin Resource Sharing policy is security-sensitive. It has led in the past to the following vulnerabilities:
Same origin policy in browsers prevents, by default and for security-reasons, a javascript frontend to perform a cross-origin HTTP request to a resource that has a different origin (domain, protocol, or port) from its own. The requested target can append additional HTTP headers in response, called CORS, that act like directives for the browser and change the access control policy / relax the same origin policy.
Character classes in regular expressions are a convenient way to match one of several possible characters by listing the allowed characters or ranges of characters. If a character class contains only one character, the effect is the same as just writing the character without a character class.
Thus, having only one character in a character class is usually a simple oversight that remained after removing other characters of the class.
Declaring a variable only to immediately return or throw it is considered a bad practice because it adds unnecessary complexity to the code. This practice can make the code harder to read and understand, as it introduces an extra step that doesn’t add any value. Instead of declaring a variable and then immediately returning or throwing it, it is generally better to return or throw the value directly. This makes the code cleaner, simpler, and easier to understand.
A loop with at most one iteration is equivalent to the use of an `if statement to conditionally execute one piece of code. No developer expects to find such a use of a loop statement. If the initial intention of the author was really to conditionally execute one piece of code, an if statement should be used instead.
At worst that was not the initial intention of the author and so the body of the loop should be fixed to use the nested return, break or throw` statements in a more appropriate way.
Development tools and frameworks usually have options to make debugging easier for developers. Although these features are useful during development, they should never be enabled for applications deployed in production. Debug instructions or error messages can leak detailed information about the system, like the application’s path or file names.
A boolean literal can be represented in two different ways: {true} or {false}. They can be combined with logical operators ({ops}) to produce logical expressions that represent truth values. However, comparing a boolean literal to a variable or expression that evaluates to a boolean value is unnecessary and can make the code harder to read and understand. The more complex a boolean expression is, the harder it will be for developers to understand its meaning and expected behavior, and it will favour the introduction of new bugs.
Integer literals starting with a zero are octal rather than decimal values. While using octal values is fully supported, most developers do not have experience with them. They may not recognize octal values as such, mistaking them instead for decimal values.
Using pseudorandom number generators (PRNGs) is security-sensitive. For example, it has led in the past to the following vulnerabilities:
When software generates predictable values in a context requiring unpredictability, it may be possible for an attacker to guess the next value that will be generated, and use this guess to impersonate another user or access sensitive information.
Successful Zip Bomb attacks occur when an application expands untrusted archive files without controlling the size of the expanded data, which can lead to denial of service. A Zip bomb is usually a malicious archive file of a few kilobytes of compressed data but turned into gigabytes of uncompressed data. To achieve this extreme compression ratio, attackers will compress irrelevant data (eg: a long string of repeated bytes).
Nested code - blocks of code inside blocks of code - is eventually necessary, but increases complexity. This is why keeping the code as flat as possible, by avoiding unnecessary nesting, is considered a good practice.
Merging if statements when possible will decrease the nesting of the code and improve its readability.
An empty {operationName} is generally considered bad practice and can lead to confusion, readability, and maintenance issues. Empty {operationName}s bring no functionality and are misleading to others as they might think the {operationName} implementation fulfills a specific and identified requirement.
There are several reasons for a {operationName} not to have a body:
It is an unintentional omission, and should be fixed to prevent an unexpected behavior in production.
It is not yet, or never will be, supported. In this case an exception should be thrown.
The method is an intentionally-blank override. In this case a nested comment should explain the reason for the blank override.
Lookahead assertions are a regex feature that makes it possible to look ahead in the input without consuming it. It is often used at the end of regular expressions to make sure that substrings only match when they are followed by a specific pattern.
For example, the following pattern will match an “a” only if it is directly followed by a “b”. This does not consume the “b” in the process:
Unresolved directive in <stdin> - include::{lookahead}[]
However, lookaheads can also be used in the middle (or at the beginning) of a regex. In that case there is the possibility that what comes after the lookahead contradicts the pattern inside the lookahead. Since the lookahead does not consume input, this makes the lookahead impossible to match and is a sign that there’s a mistake in the regular expression that should be fixed.
A for loop with a counter that moves in the wrong direction, away from the stop condition, is not an infinite loop. Because of wraparound, the loop will eventually reach its stop condition, but in doing so, it will probably run more times than anticipated, potentially causing unexpected behavior.
Clear-text protocols such as `ftp, telnet, or http lack encryption of transported data, as well as the capability to build an authenticated connection. It means that an attacker able to sniff traffic from the network can read, modify, or corrupt the transported content. These protocols are not secure as they expose applications to an extensive range of risks:
sensitive data exposure
traffic redirected to a malicious endpoint
malware-infected software update or installer
execution of client-side code
corruption of critical information
Even in the context of isolated networks like offline environments or segmented cloud environments, the insider threat exists. Thus, attacks involving communications being sniffed or tampered with can still happen.
For example, attackers could successfully compromise prior security layers by:
bypassing isolation mechanisms
compromising a component of the network
getting the credentials of an internal IAM account (either from a service account or an actual person)
In such cases, encrypting communications would decrease the chances of attackers to successfully leak data or steal credentials from other network components. By layering various security practices (segmentation and encryption, for example), the application will follow the defense-in-depth principle.
Note that using the http` protocol is being deprecated by major web browsers.
In the past, it has led to the following vulnerabilities:
`switch can contain a default clause for various reasons: to handle unexpected values, to show that all the cases were properly considered, etc.
For readability purposes, to help a developer quickly spot the default behavior of a switch statement, it is recommended to put the default clause at the beginning or the end of the switch statement.
This rule raises an issue if the default clause is not the first or the last one of the switch’s cases.
When a cookie is configured with the HttpOnly attribute set to true, the browser guaranties that no client-side script will be able to read it. In most cases, when a cookie is created, the default value of HttpOnly is false and it’s up to the developer to decide whether or not the content of the cookie can be read by the client-side script. As a majority of Cross-Site Scripting (XSS) attacks target the theft of session-cookies, the HttpOnly
attribute can help to reduce their impact as it won’t be possible to exploit the XSS vulnerability to steal session-cookies.
When a reluctant (or lazy) quantifier is followed by a pattern that can match the empty string or directly by the end of the regex, it will always match zero times for *? or one time for +?. If a reluctant quantifier is followed directly by the end anchor ($), it behaves indistinguishably from a greedy quantifier while being less efficient.
This is likely a sign that the regex does not work as intended.
The use of parentheses, even those not required to enforce a desired order of operations, can clarify the intent behind a piece of code. However, redundant pairs of parentheses could be misleading and should be removed.
If a private field is declared but not used locally, its limited visibility makes it dead code.
This is either a sign that some logic is missing or that the code should be cleaned.
Cleaning out dead code decreases the size of the maintained codebase, making it easier to understand and preventing bugs from being introduced.
A regular expression is a sequence of characters that specifies a match pattern in text. Among the most important concepts are:
Character classes: defines a set of characters, any one of which can occur in an input string for a match to succeed.
Quantifiers: used to specify how many instances of a character, group, or character class must be present in the input for a match.
Wildcard (.): matches all characters except line terminators (also matches them if the s flag is set).
Many of these features include shortcuts of widely used expressions, so there is more than one way to construct a regular expression to achieve the same results. For example, to match a two-digit number, one could write [0-9]{2,2} or \d{2}. The latter is not only shorter but easier to read and thus to maintain.
This rule recommends replacing some quantifiers and character classes with more concise equivalents:
\d for [0-9] and \D for [^0-9]
\w for [A-Za-z0-9_] and \W for `[^A-Za-z0-9_]
. for character classes matching everything (e.g. [\w\W], [\d\D], or [\s\S] with s flag)
x? for x{0,1}, x* for x{0,}, x+ for x{1,}, x{N} for x{N,N}`
Using the same value on both sides of a binary operator is a code defect. In the case of logical operators, it is either a copy/paste error and, therefore, a bug, or it is simply duplicated code and should be simplified. In the case of bitwise operators and most binary mathematical operators, having the same value on both sides of an operator yields predictable results and should be simplified as well.
When either the equality operator in a null test or the logical operator that follows it is reversed, the code has the appearance of safely null-testing the object before dereferencing it. Unfortunately the effect is just the opposite - the object is null-tested and then dereferenced only if it is null, leading to a guaranteed null pointer dereference.
Rejecting requests with significant content length is a good practice to control the network traffic intensity and thus resource consumption in order to prevent DoS attacks.
A common code smell that can hinder the clarity of source code is making assignments within sub-expressions. This practice involves assigning a value to a variable inside a larger expression, such as within a loop or a conditional statement.
This practice essentially gives a side-effect to a larger expression, thus making it less readable. This often leads to confusion and potential errors.
In regular expressions the boundaries `^ and \A can only match at the beginning of the input (or, in case of ^ in combination with the MULTILINE flag, the beginning of the line) and $, \Z and \z only at the end.
These patterns can be misused, by accidentally switching ^ and $` for example, to create a pattern that can never match.
In regular expressions, anchors (^, ,\A,Zand\z)havehigherprecedencethanthe∣operator.Soinaregularexpressionlikealt1∣alt2∣alt3, alt1 would be anchored to the beginning, alt3 to the end and alt2
wouldn’t be anchored at all. Usually the intended behavior is that all alternatives are anchored at both ends. To achieve this, a non-capturing group should be used around the alternatives.
In cases where it is intended that the anchors only apply to one alternative each, adding (non-capturing) groups around the anchors and the parts that they apply to will make it explicit which parts are anchored and avoid readers misunderstanding the precedence or changing it because they mistakenly assume the precedence was not intended.
Formatted SQL queries can be difficult to maintain, debug and can increase the risk of SQL injection when concatenating untrusted values into the query. However, this rule doesn’t detect SQL injections (unlike rule S3649), the goal is only to highlight complex/formatted queries.
A typical code smell known as unused function parameters refers to parameters declared in a function but not used anywhere within the function’s body. While this might seem harmless at first glance, it can lead to confusion and potential errors in your code. Disregarding the values passed to such parameters, the function’s behavior will be the same, but the programmer’s intention won’t be clearly expressed anymore. Therefore, removing function parameters that are not being utilized is considered best practice.
When placing Unicode Grapheme Clusters (characters which require to be encoded in multiple Code Points) inside a character class of a regular expression, this will likely lead to unintended behavior.
For instance, the grapheme cluster c̈ requires two code points: one for ‘c’, followed by one for the umlaut modifier ‘\u{0308}‘. If placed within a character class, such as [c̈], the regex will consider the character class being the enumeration [c\u{0308}] instead. It will, therefore, match every ‘c’
and every umlaut that isn’t expressed as a single codepoint, which is extremely unlikely to be the intended behavior.
This rule raises an issue every time Unicode Grapheme Clusters are used within a character class of a regular expression.
switch statements are useful when there are many different cases depending on the value of the same expression.
For just one or two cases, however, the code will be more readable with if statements.
A regex should never include a repetitive pattern whose body would match the empty string. This is usually a sign that a part of the regex is redundant or does not do what the author intended.
Some statements (return, break, continue, goto, switch) and throw
expressions move control flow out of the current code block. So any unlabeled statements that come after such a jump are unreachable, and either this dead code should be removed, or the logic should be corrected.
Constructing arguments of system commands from user input is security-sensitive. It has led in the past to the following vulnerabilities:
Arguments of system commands are processed by the executed program. The arguments are usually used to configure and influence the behavior of the programs. Control over a single argument might be enough for an attacker to trigger dangerous features like executing arbitrary commands or writing files into specific directories.
Signaling processes or process groups can seriously affect the stability of this application or other applications on the same system.
Accidentally setting an incorrect PID or signal or allowing untrusted sources to assign arbitrary values to these parameters may result in a denial of service.
Also, the system treats the signal differently if the destination PID is less than or equal to 0. This different behavior may affect multiple processes with the same (E)UID simultaneously if the call is left uncontrolled.
There is no reason to re-assign a variable to itself. Either this statement is redundant and should be removed, or the re-assignment is a mistake and some other value or variable was intended for the assignment instead.
Executing code dynamically is security-sensitive. It has led in the past to the following vulnerabilities:
Some APIs enable the execution of dynamic code by providing it as strings at runtime. These APIs might be useful in some very specific meta-programming use-cases. However most of the time their use is frowned upon because they also increase the risk of maliciously Injected Code. Such attacks can either run on the server or in the client (example: XSS attack) and have a huge impact on an application’s security.
This rule marks for review each occurrence of such dynamic code execution. This rule does not detect code injections. It only highlights the use of APIs which should be used sparingly and very carefully.
This rule applies whenever an `if statement is followed by one or more else if statements; the final else if should be followed by an else statement.
The requirement for a final else statement is defensive programming.
The else statement should either take appropriate action or contain a suitable comment as to why no action is taken. This is consistent with the requirement to have a final default clause in a switch` statement.
To reduce the risk of cross-site scripting attacks, templating systems, such as `Twig, Django, Smarty, Groovy’s template engine, allow configuration of automatic variable escaping before rendering templates. When escape occurs, characters that make sense to the browser (eg: <a>) will be transformed/replaced with escaped/sanitized values (eg: & lt;a& gt; ).
Auto-escaping is not a magic feature to annihilate all cross-site scripting attacks, it depends on the strategy applied and the context, for example a “html auto-escaping” strategy (which only transforms html characters into html entities) will not be relevant when variables are used in a html attribute because ’:’ character is not escaped and thus an attack as below is possible:
<a href=”{{ myLink }}“>link</a> // myLink = javascript:alert(document.cookie) <a href=“javascript:alert(document.cookie)“>link</a> // JS injection (XSS attack)
A `for loop stop condition should test the loop counter against an invariant value (i.e. one that is true at both the beginning and ending of every loop iteration). Ideally, this means that the stop condition is set to a local variable just before the loop begins.
Stop conditions that are not invariant are slightly less efficient, as well as being difficult to understand and maintain, and likely lead to the introduction of errors in the future.
This rule tracks three types of non-invariant stop conditions:
When the loop counters are updated in the body of the for` loop
When the stop condition depend upon a method call
When the stop condition depends on an object property, since such properties could change during the execution of the loop.
Multiple spaces in a regular expression can make it hard to tell how many spaces should be matched. It’s more readable to use only one space and then indicate with a quantifier how many spaces are expected.
User-provided data, such as URL parameters, POST data payloads, or cookies, should always be considered untrusted and tainted. Applications constructing HTTP response headers based on tainted data could allow attackers to change security sensitive headers like Cross-Origin Resource Sharing headers.
Web application frameworks and servers might also allow attackers to inject new line characters in headers to craft malformed HTTP response. In this case the application would be vulnerable to a larger range of attacks like HTTP Response Splitting/Smuggling. Most of the time this type of attack is mitigated by default modern web application frameworks but there might be rare cases where older versions are still vulnerable.
As a best practice, applications that use user-provided data to construct the response header should always validate the data first. Validation should be based on a whitelist.
When a cookie is protected with the secure
attribute set to true it will not be send by the browser over an unencrypted HTTP request and thus cannot be observed by an unauthorized person during a man-in-the-middle attack.
Having too many return statements in a function increases the function’s essential complexity because the flow of execution is broken each time a return statement is encountered. This makes it harder to read and understand the logic of the function.
According to the Single Responsibility Principle, introduced by Robert C. Martin in his book “Principles of Object Oriented Design”, a class should have only one responsibility:
Classes which rely on many other classes tend to aggregate too many responsibilities and should be split into several smaller ones.
Nested classes dependencies are not counted as dependencies of the outer class.
Nested control flow statements such as if, for, while, switch, and try
are often key ingredients in creating
what’s known as “Spaghetti code”. This code smell can make your program difficult to understand and maintain.
When numerous control structures are placed inside one another, the code becomes a tangled, complex web. This significantly reduces the code’s readability and maintainability, and it also complicates the testing process.
Nested ternaries are hard to read and can make the order of operations complex to understand.
Unresolved directive in <stdin> - include::{noncompliant}[]
Instead, use another line to express the nested operation in a separate statement.
Unresolved directive in <stdin> - include::{compliant}[]
There is no good reason to create a new object to not do anything with it. Most of the time, this is due to a missing piece of code and so could lead to an unexpected behavior in production.
If it was done on purpose because the constructor has side-effects, then that side-effect code should be moved into a separate method and called directly.
Defining and using global variables and global functions, when the convention dictates OOP can be confusing and difficult to use properly for multiple reasons:
You run the risk of name clashes.
Global functions must be stateless, or they can cause difficult-to-track bugs.
Global variables can be updated from anywhere and may no longer hold the value you expect.
It is difficult to properly test classes that use global functions.
Instead of being declared globally, such variables and functions should be moved into a class, potentially marked static
, so they can be used without a class instance.
This rule checks that only object-oriented programming is used and that no functions or procedures are declared outside of a class.
Clear, communicative naming is important in code. It helps maintainers and API users understand the intentions for and uses of a unit of code. Using “exception” in the name of a class that does not extend Exception
or one of its subclasses is a clear violation of the expectation that a class’ name will indicate what it is and/or does.
When writing code, it is important to ensure that each statement serves a purpose and contributes to the overall functionality of the program. When they have no side effects or do not change the control flow, they can either indicate a programming error or be redundant:
The code does not behave as intended: The statements are expected to have an effect but they do not. This can be caused by mistyping, copy-and-paste errors, etc.
The statements are residual after a refactoring.
The use of increment and decrement operators in method calls or in combination with other arithmetic operators is not recommended, because:
It can significantly impair the readability of the code.
It introduces additional side effects into a statement, with the potential for undefined behavior.
It is safer to use these operators in isolation from any other arithmetic operators.
Overriding a method just to call the same method from the super class without performing any other actions is useless and misleading. The only time this is justified is in final overriding methods, where the effect is to lock in the parent class behavior. This rule ignores such overrides of equals, hashCode and toString
.
Duplicated string literals make the process of refactoring complex and error-prone, as any change would need to be propagated on all occurrences.
Altering or bypassing the accessibility of classes, methods, or fields through reflection violates the encapsulation principle. This can break the internal contracts of the accessed target and lead to maintainability issues and runtime errors.
This rule raises an issue when reflection is used to change the visibility of a class, method or field, and when it is used to directly update a field value.
According to the US National Institute of Standards and Technology (NIST), the Data Encryption Standard (DES) is no longer considered secure:
For similar reasons, RC2 should also be avoided.
goto is an unstructured control flow statement. It makes code less readable and maintainable. Structured control flow statements such as if, for, while, continue or break
should be used instead.
The MD5 algorithm and its successor, SHA-1, are no longer considered secure, because it is too easy to create hash collisions with them. That is, it takes too little computational effort to come up with a different input that produces the same MD5 or SHA-1 hash, and using the new, same-hash value gives an attacker the same access as if he had the originally-hashed value. This applies as well to the other Message-Digest algorithms: MD2, MD4, MD6, HAVAL-128, HMAC-MD5, DSA (which uses SHA-1), RIPEMD, RIPEMD-128, RIPEMD-160, HMACRIPEMD160.
Consider using safer alternatives, such as SHA-256, SHA-512 or SHA-3.
A value that is incremented or decremented and then not stored is at best wasted code and at worst a bug.
The switch statement should be used only to clearly define some new branches in the control flow. As soon as a case clause contains too many statements this highly decreases the readability of the overall control flow statement. In such case, the content of the case
clause should be extracted into a dedicated method.
Shared coding conventions allow teams to collaborate efficiently. This rule checks that all function names match a provided regular expression.
Two {func_name}s having the same implementation are suspicious. It might be that something else was intended. Or the duplication is intentional, which becomes a maintenance burden.
Hardcoding IP addresses is security-sensitive. It has led in the past to the following vulnerabilities:
Today’s services have an ever-changing architecture due to their scaling and redundancy needs. It is a mistake to think that a service will always have the same IP address. When it does change, the hardcoded IP will have to be modified too. This will have an impact on the product development, delivery, and deployment:
The developers will have to do a rapid fix every time this happens, instead of having an operation team change a configuration file.
It misleads to use the same address in every environment (dev, sys, qa, prod).
Last but not least it has an effect on application security. Attackers might be able to decompile the code and thereby discover a potentially sensitive address. They can perform a Denial of Service attack on the service, try to get access to the system, or try to spoof the IP address to bypass security checks. Such attacks can always be possible, but in the case of a hardcoded IP address solving the issue will take more time, which will increase an attack’s impact.
A cookie’s domain specifies which websites should be able to read it. Left blank, browsers are supposed to only send the cookie to sites that exactly match the sending domain. For example, if a cookie was set by lovely.dream.com, it should only be readable by that domain, and not by nightmare.com or even strange.dream.com. If you want to allow sub-domain access for a cookie, you can specify it by adding a dot in front of the cookie’s domain, like so: .dream.com. But cookie domains should always use at least two levels.
Cookie domains can be set either programmatically or via configuration. This rule raises an issue when any cookie domain is set with a single level, as in .com.
Developers often use TODO tags to mark areas in the code where additional work or improvements are needed but are not implemented immediately. However, these TODO tags sometimes get overlooked or forgotten, leading to incomplete or unfinished code. This rule aims to identify and address unattended TODO tags to ensure a clean and maintainable codebase. This description explores why this is a problem and how it can be fixed to improve the overall code quality.
Having all branches of a switch or if chain with the same implementation indicates a problem.
In the following code:
Unresolved directive in <stdin> - include::{example}[]
Either there is a copy-paste error that needs fixing or an unnecessary switch or if chain that should be removed.
Character classes in regular expressions are a convenient way to match one of several possible characters by listing the allowed characters or ranges of characters. If the same character is listed twice in the same character class or if the character class contains overlapping ranges, this has no effect.
Thus duplicate characters in a character class are either a simple oversight or a sign that a range in the character class matches more than is intended or that the author misunderstood how character classes work and wanted to match more than one character. A common example of the latter mistake is trying to use a range like [0-99] to match numbers of up to two digits, when in fact it is equivalent to [0-9]. Another common cause is forgetting to escape the - character, creating an unintended range that overlaps with other characters in the character class.
Sharing some naming conventions is a key point to make it possible for a team to efficiently collaborate. This rule allows to check that all interface names match a provided regular expression.
If an alternative in a regular expression only matches things that are already matched by another alternative, that alternative is redundant and serves no purpose.
In the best case this means that the offending subpattern is merely redundant and should be removed. In the worst case it’s a sign that this regex does not match what it was intended to match and should be reworked.
A naming convention in software development is a set of guidelines for naming code elements like variables, functions, and classes.
The goal of a naming convention is to make the code more readable and understandable, which makes it easier to maintain and debug. It also ensures consistency in the code, especially when multiple developers are working on the same project.
This rule checks that field names match a provided regular expression.
This rule raises an issue when a {visibility} {operationName} is never referenced in the code.
In Unix file system permissions, the “others
” category refers to all
users except the owner of the file system resource and the members of the group
assigned to this resource.
Granting permissions to this category can lead to unintended access to files or directories that could allow attackers to obtain sensitive information, disrupt services or elevate privileges.
Using reluctant quantifiers (also known as lazy or non-greedy quantifiers) in patterns can often lead to needless backtracking, making the regex needlessly inefficient and potentially vulnerable to catastrophic backtracking. Particularly when using .*? or .+? to match anything up to some terminating character, it is usually a better idea to instead use a greedily or possessively quantified negated character class containing the terminating character. For example <.+?> should be replaced with <[^>]++>
.
`if statements with conditions that are always false have the effect of making blocks of code non-functional. if statements with conditions that are always true are completely redundant, and make the code less readable.
There are three possible causes for the presence of such code:
An if statement was changed during debugging and that debug code has been committed.
Some value was left unset.
Some logic is not doing what the programmer thought it did.
In any of these cases, unconditional if` statements should be removed.
Empty statements represented by a semicolon ; are statements that do not perform any operation. They are often the result of a typo or a misunderstanding of the language syntax. It is a good practice to remove empty statements since they don’t add value and lead to confusion and errors.
When the execution is not explicitly terminated at the end of a switch case, it continues to execute the statements of the following case. While this is sometimes intentional, it often is a mistake which leads to unexpected behavior.
When the names of parameters in a method call match the names of the method arguments, it contributes to clearer, more readable code. However, when the names match, but are passed in a different order than the method arguments, it indicates a mistake in the parameter order which will likely lead to unexpected results.
Cryptographic hash algorithms such as MD2, MD4, MD5, MD6, HAVAL-128, HMAC-MD5, DSA (which uses SHA-1), RIPEMD, RIPEMD-128, RIPEMD-160, HMACRIPEMD160 and SHA-1 are no longer considered secure, because it is possible to have collisions
(little computational effort is enough to find two or more different inputs that produce the same hash).
Possessive quantifiers in Regex patterns like below improve performance by eliminating needless backtracking:
?+ , *+ , ++ , {n}+ , {n,}+ , {n,m}+
But because possessive quantifiers do not keep backtracking positions and never give back, the following sub-patterns should not match only similar characters. Otherwise, possessive quantifiers consume all characters that could have matched the following sub-patterns and nothing remains for the following sub-patterns.
Shared coding conventions make it possible to collaborate efficiently. This rule makes it mandatory to place the open curly brace at the beginning of a line.
There are several reasons to use a group in a regular expression:
to change the precedence (e.g. do(g|or) will match ‘dog’ and ‘door’)
to remember parenthesised part of the match in the case of capturing group
to improve readability
In any case, having an empty group is most probably a mistake. Either it is a leftover after refactoring and should be removed, or the actual parentheses were intended and were not escaped.
Because it is easy to extract strings from an application source code or binary, credentials should not be hard-coded. This is particularly true for applications that are distributed or that are open-source.
In the past, it has led to the following vulnerabilities:
Credentials should be stored outside of the code in a configuration file, a database, or a management service for secrets.
This rule flags instances of hard-coded credentials used in database and LDAP connections. It looks for hard-coded credentials in connection strings, and for variable names that match any of the patterns from the provided list.
It’s recommended to customize the configuration of this rule with additional credential words such as “oauthToken”, “secret”, …
When the line immediately after a conditional has neither curly braces nor indentation, the intent of the code is unclear and perhaps not what is executed. Additionally, such code is confusing to maintainers.
There are valid cases for passing a variable multiple times into the same method call, but usually doing so is a mistake, and something else was intended for one of the arguments.
An unused local variable is a variable that has been declared but is not used anywhere in the block of code where it is defined. It is dead code, contributing to unnecessary complexity and leading to confusion when reading the code. Therefore, it should be removed from your code to maintain clarity and efficiency.
Dead stores refer to assignments made to local variables that are subsequently never used or immediately overwritten. Such assignments are unnecessary and don’t contribute to the functionality or clarity of the code. They may even negatively impact performance. Removing them enhances code cleanliness and readability. Even if the unnecessary operations do not do any harm in terms of the program’s correctness, they are - at best - a waste of computing resources.
A naming convention in software development is a set of guidelines for naming code elements like variables, functions, and classes. {identifier_capital_plural} hold the meaning of the written code. Their names should be meaningful and follow a consistent and easily recognizable pattern. Adhering to a consistent naming convention helps to make the code more readable and understandable, which makes it easier to maintain and debug. It also ensures consistency in the code, especially when multiple developers are working on the same project.
This rule checks that {identifier} names match a provided regular expression.
Undocumented APIs pose significant challenges in software development for several reasons:
Lack of Clarity: developers struggling to understand how to use the API correctly. This can lead to misuse and unexpected results.
Increased Development Time: developers spending extra time reading and understanding the source code, which slows down the development process.
Error Prone: developers are more likely to make mistakes that lead to bugs or system crashes when the intent or the error handling of an API is not clear.
Difficult Maintenance and Updates: developers may not understand the existing functionality well enough to add new features without breaking the existing ones.
Poor Collaboration: collaboration, when there is lack of documentation, leads to confusion and inconsistencies.
Local variables should not have the same name as class fields
This rule verifies that single-line comments are not located at the ends of lines of code. The main idea behind this rule is that in order to be really readable, trailing comments would have to be properly written and formatted (correct alignment, no interference with the visual structure of the code, not too long to be visible) but most often, automatic code formatters would not handle this correctly: the code would end up less readable. Comments are far better placed on the previous empty line of code, where they will always be visible and properly formatted.