1357 246-
Get Started
- CodeAnt AI
- Control Center
- Pull Request Review
- IDE
- Compliance
- Anti-Patterns
- Code Governance
- Infrastructure Security Database
- Application Security Database
Rpg
IMPORT
ed values should be treated as read-only. Doing otherwise could result in unpleasant surprises down the line.
The DEBUG(*YES) and DUMP
statements are useful during development and debugging, but could expose sensitive information to attackers and should not be included in production code.
Even if there’s no data to be returned from a procedure, each procedure should at least return a boolean to indicate when procedure execution failed, so callers can respond accordingly.
While `END will adequately close a statement, it is less clear than the use of the relevant, statement-specific ENDxx. Thus, the statement-specific version is preferred to facilitate code maintenance and enhance clarity.
This rule is applied to the following operations: CASxx, DO, DOU, DOUxx, DOW, DOWxx, FOR, IF, IFxx and SELECT` groups.
The complexity of an expression is defined by the number of AND, OR
.
A single expression’s complexity should not become too high to keep the code readable.
When multiple, adjacent MONITOR statements have duplicate ON-ERROR blocks, they should be merged to consolidate the ON-ERROR logic for cleaner, more readable code. Note that this applies even when there is intervening code outside any MONITOR
block.
Shared coding conventions allow teams to collaborate effectively. This rule checks that labels are on lines by themselves, which makes them more immediately visible as such.
The format used to write a record to a file should be cleared before each use. Otherwise stale data left in the format from previous records may be saved into the current record if it does not have data for all the fields in the format.
Leaving an ON-ERROR
block empty means that the exception in question is neither handled nor passed forward to callers for handling at a higher level. Suppressing errors rather than handling them could lead to unpredictable system behavior and should be avoided.
The use of UNDERLINE makes normal text printed to the screen difficult to read, and prevents titles from being read at all. For these reasons, UNDERLINE
should not be used.
Using a unique symbolic `PLIST name provides a single definition point for the parameters used by any external routine being called. This ensures that the same information is passed every time the external routine is called, reducing the potential for mistakes in operation.
This rule flags non-compliant instances of CALL and CALLB`.
Each program should contain an initial section of H*
comments lines, which includes a copyright and/or license statement. This rule must be fed with the header text that is expected in the header.
Historically /COPY procedures contained all of the specifications required for each procedure, so for example xxxxF /COPY would be used to copy the F lines from another F spec. However, this practice results in code that is difficult to understand and maintain. Instead, the use of /COPY
statements should be replaced with explicit declarations.
Turning on the `*srcstmt header option means that any line numbers cited in error statements will actually correspond correctly to the code. Otherwise, object line numbers will be shown, making errors difficult to debug.
The *srcstmt option is also recommended for accurate line numbers during debugging, particularly in combination with *nodebugio, (H option(*srcstmt:*nodebugio)`) which prevents debug operations from stopping on I/O.
If a file is defined without the USROPN statement then the natural RPG logic will deal with opening and closing it. However, files defined with USROPN, must be both explicitly OPENed and CLOSE
d.
Using both SPACEA (space-after) and SPACEB
(space-before) in the same printer file can make it difficult to understand how many blank lines will be printed where. Therefore either one command or the other should be used but not both.
A Select or a chain of If/Elseif statements is evaluated from top to bottom. At most, only one branch will be executed: the first one with a condition that evaluates to true
.
Therefore, duplicating a condition automatically leads to dead code. Usually, this is due to a copy/paste error. At best, it’s simply dead code and at worst, it’s a bug that is likely to induce further bugs as the code is maintained, and obviously it could lead to unexpected behavior.
Using a named constant to refer to an indicator makes the content of the field clearer, and therefore makes the code easier to read and maintain.
Having all the pieces in front of you makes it easier to understand and debug or maintain a piece of code. Unfortunately, that’s often not the case with compile-time arrays, since they are often separated from the code that uses them by many, many lines. Instead, data structures are often a better option.
Externally described files standardize file access, and result in simpler, easier to read code.
Testing %PARMS before the use of one or more parameters is a clear indication in the code that at least some parameters are treated as optional. However, failure to actually mark those parameters as such with options( *nopass ) denies callers the ability to treat them as optional. Since all parameters after the first one explicitly declared options( *nopass )
are also implicitly optional, this marking is only required once. However, for clarity you should consider explicitly marking all optional parameters.
Eval and Callp
are the only two exceptions to the rule that each free-format line must start with an operation code. Since you can leave these two opcodes out, you should for cleaner, more readable code.
This rule applies whenever an `IF statement is followed by one or more ELSEIF statements; the final ELSEIF should be followed by an ELSE statement.
The requirement for a final ELSE statement is defensive programming.
The ELSE statement should either take appropriate action or contain a suitable comment as to why no action is taken. This is consistent with the requirement to have a final OTHER clause in a SELECT` statement.
Using a unique `KLIST to access files ensures that the correct key is used at all times and creates a standard method of accessing the data throughout the code, simplifying maintenance and improving readability.
This rule flags non-compliant instances of CHAIN, DELETE, READE, READPE, SETGT, and SETLL` operations.
Shared naming conventions allow teams to collaborate efficiently.
This rule raises an issue when a subroutine name does not match a provided regular expression.
For example, with the default provided regular expression ^SR[a-zA-Z0-9]*$
, the following subroutines:
Nested control flow statements CASxx, DO, DOU, DOW, DOUxx, DOWxx, FOR, IF, IFxx, MONITOR and SELECT
are often key ingredients in creating
what’s known as “Spaghetti code”. This code smell can make your program difficult to understand and maintain.
When numerous control structures are placed inside one another, the code becomes a tangled, complex web. This significantly reduces the code’s readability and maintainability, and it also complicates the testing process.
When storing numeric fields, using an odd number of digits allows the sign to be included in the storage area without wasting any space.
For example the number -1234567, when packed as 7 digits is stored as follows.
thus taking up only 4 data slots. but when packed as 8 characters it uses 5 data slots
02467 0135-
or when 7 digits, non-packed, it will be stored as 8 data slots, including the sign.
FFFFFFF6 12345670
Subprocedures and subroutines are both mechanisms to segregate logic, but subprocedures are preferred for three reasons:
their local files and variables make maintenance faster and cleaner. They allow you to create variables without worrying about name clashes, and to change fields without worrying about negatively impacting other parts of the program.
their local files and variables make code reuse easy.
they can be called with parameters as functions, yielding clearer more readable code.
Shared coding conventions allow teams to collaborate effectively. While it is possible to omit the specification letter before a /COPY
statement, it is advisable not to do so because including a specification on the line indicates to other developers what type of statements will be added to the program by the copybook. Further, using a specification letter on the line makes it clearer to those who might be skimming the file that the line is not just a comment, but does contain code.
Every subroutine should be commented to explain its goal and how it works. This non-empty comment must be located before or after the subroutine definition.
Global variables can seem like a handy way to pass state information around in a program, but the use of global variables only works well in very small programs. As the code base grows, you’ll need to understand every subprocedure’s impact on the global state in order to understand how the program works. This is a task that quickly becomes impossible.
To control the situation, only the main procedure should be allowed access to global variables; it can then pass that state information to subprocedures as parameters.
Error handling should be defined on file operations to ensure correct processing of program failures rather than defaulting to an abnormal program end. This rule checks two things:
INFSR
error handling is defined on file operations.error handling is delegated to an appropriately named routine.
GOTO is an unstructured control flow statement. It makes code less readable and maintainable, and should only be used to branch to defined ENDSR points within the source. For normal loop functions use ITER or LEAVE
instead.
Both the name and the field definitions of a parameter 2 data area used in an IN should be coded in the D
specification. This ensures that all definition data is kept together, and the specific data is not manually extracted in the code, thus making the code easier to read and maintain.
The use of positional notation to describe the overlapping of one field on another can quickly become confusing, particularly in large, or complicated data structures. Instead, the OVERLAY
keyword should be used to make such overlaps more readily identifiable.
Free-format syntax is preferred because it is clearer, easier to write, and easier for newcomers to the language than fixed-format syntax.
There’s no need to repeat the program name in the free-format definition of an external program.
Because library names could change from environment to environment, they should never be hard-coded in a program. Instead, you should use a variable to specify the library name and set that variable with data external to the program.
The SELECT statement should be used only to clearly define some new branches in the control flow. As soon as a WHEN clause contains too many statements this highly decreases the readability of the overall control flow statement. In such case, the content of WHEN
clause should be extracted in a dedicated subroutine.
When using a PRINTER defined file, an INFDS keyword should be used to read the file’s line count from *PSSR
positions 367-368.
When accessing a non-input-only file with a CHAIN/READx operation, the record is locked and becomes inaccessible to the system, and unreadable other programs. This is fine if you are about to update the record immediately, but if you have data processing to do following the read, it can leave the record unavailable for too long. Therefore, non-input files using the CHAIN, READ, READE, READP, and READPE operation codes should always be accessed with the no lock option, (N)
.
While the OCCURS keyword still works, it is slower than its more modern replacement: DIM
, which should be used instead.
Initializing an alphanumeric field with the literal character ’ ’, ‘1’, or ‘0’ has the same effect as using the equivalent figurative constant *BLANK, *ON, or *OFF
, but the constant version removes ambiguity and makes the code more readable.
The use of options( *nopass )
on a parameter makes it optional, so it should always be checked before use.
The use of positional notation can quickly become confusing, especially in a large or complex data structure. To keep the code clean and understandable, use length notation instead.
Prototypes should be used for program calls because their use allows compile-time checking of the number of parameters and the type and size of those parameters. Without the use of a prototype, that checking happens only at run-time.
A secondary benefit of the use of prototypes and CALLP
, is that prototypes allow the caller to gain a clearer understanding of the data expected for each parameter because parameter names are purely descriptive.
Shared coding conventions allow teams to collaborate effectively. To improve readability, blank lines in a program should not have end-of-line comments.
The /EJECT compiler directive adds a page break when the code is printed, and should be used at the end of each F, D, and C
specification section to make the structure of the code clearer and to enhance overall readability.
SELECT and OMIT allow you to choose records from a logical file based on the values of specific fields. SQL views allow you to make the same distinctions, but their use is more efficient, and easier to read and understand. Therefore views are preferred over SELECT and OMIT
statements.
Specifying the parameters to a procedure with a PLIST makes that procedure unusable from free-format code. Instead, prototypes should be used - both when defining a procedure and when calling it. They have the additional benefit of allowing you to use keywords such as Const to better-specify how parameters are passed to a procedure. Further, the use of a prototype instead of a PLIST
is cleaner and more consistent with the code required for subprocedures.
The use of a result data structure with file I/O improves performance because it moves the data in one large block from file to data structure (or vice versa) rather than field by field.
Additionally using a data structure can limit the problems caused by having bad data in a file. Without a data structure, the entire READ
operation will fail at the first bad value. With one, the error comes only when the bad field is used.
Indicators should not be used on “IF” statements because they foster poor logic flow and result in code that is difficult to read and maintain.
A results indicator should be used on a CHAIN statement because it leads to cleaner, more readable, and more correct code. Using an indicator results in the conditions record not found, and end of file
being automatically handled correctly, without the need for further error handling.
Non-static variables initialized with INZ() are only reliably initialized on the first run through the code in a “terminate and stay resident” (TSR) program. Subsequent calls to the program within the same job do not re-initialize the variable with the value from INZ()
unless the last record indicator is set at the end of the program.
Without setting the last record indicator, the assumptions the code makes about the variable’s initial value will be wrong every time but one, potentially leading to bad program behavior.
QUALIFIED
data structures result in cleaner code because you can’t reference the fields without using the qualifying name. They also allow you to have multiple sub-fields with the same name, meaning subfield names don’t have to be convoluted for uniqueness, and can be expressive instead.
While a file I/O call will compile without a file name, such code is almost always an error. The omission of the file name, can cause unexpected results in multi-file programs.
Use of the LIKE statement for fields with the same specification that are being used together in the same statements makes the relevance of individual fields more understandable, and gives easy reference information, and makes relationships between those fields more obvious.
The CONST keyword on a subprocedure’s parameter indicates that the parameter value will not be changed by the subprocedure. This is not just a nice way to communicate with the programmers who will call the procedure. It also offers performance benefits, because it allows the compiler to produce more optimized code. Further, using CONST
means that a field of a similar data type will automatically be converted to the correct type and size for the parameter.
Development tools and frameworks usually have options to make debugging easier for developers. Although these features are useful during development, they should never be enabled for applications deployed in production. Debug instructions or error messages can leak detailed information about the system, like the application’s path or file names.
If a private field is declared but not used locally, its limited visibility makes it dead code.
This is either a sign that some logic is missing or that the code should be cleaned.
Cleaning out dead code decreases the size of the maintained codebase, making it easier to understand and preventing bugs from being introduced.
Using the same value on both sides of a binary operator is a code defect. In the case of logical operators, it is either a copy/paste error and, therefore, a bug, or it is simply duplicated code and should be simplified. In the case of bitwise operators and most binary mathematical operators, having the same value on both sides of an operator yields predictable results and should be simplified as well.
There is no reason to re-assign a variable to itself. Either this statement is redundant and should be removed, or the re-assignment is a mistake and some other value or variable was intended for the assignment instead.
Duplicated string literals make the process of refactoring complex and error-prone, as any change would need to be propagated on all occurrences.
SELECT *
should be avoided because it releases control of the returned columns and could therefore lead to errors and potentially to performance issues.
Having two WHEN in the same SELECT statement or branches in the same IF structure with the same implementation is at best duplicate code, and at worst a coding error.