Code Search and Replacement Guide
This guide demonstrates how to efficiently search and replace code patterns across multiple files within a project. The process involves identifying the pattern, creating match and rewrite templates, and applying these to transform the code.
Getting Started
Step 1: Identify the Pattern to Match
First, identify the instances of code that need replacement. For example, consider a Python logging statement that needs to be updated:
Step 2: Write a Match Template Using Recognized Syntax
Create a match template to capture the relevant code segment. Use a hole (:[hole_name]) to represent the dynamic part of the code you wish to match:
print(:[message])
In this template, :[message] captures the argument passed to the print function. The name message is arbitrary and can be replaced with any valid identifier.
Step 3: Create a Rewrite Template
Construct a rewrite template to specify how the matched code should be transformed. For example, to replace a print statement with a structured logging call:
In the rewrite template, :[message] is replaced by the content that the match template captured.
Step 4: Test the Rule
Apply the match and rewrite templates to transform the code.
This demonstrates the process of refactoring code using match and rewrite templates.
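Steps 1 through 4 can be approximated in plain Python. The sketch below is not the matching tool itself: it assumes the match template print(:[message]), a hypothetical rewrite template logging.info(:[message]), and that every matched statement ends at the end of its line. The helper names template_to_regex and rewrite are illustrative only.

```python
import re

# A hole such as :[message]; the hole name becomes a regex group name.
HOLE = re.compile(r":\[(\w+)\]")


def template_to_regex(template: str) -> re.Pattern:
    """Turn a match template into a line-anchored regex (a rough approximation)."""
    parts = []
    pos = 0
    for hole in HOLE.finditer(template):
        parts.append(re.escape(template[pos:hole.start()]))  # literal text
        parts.append(f"(?P<{hole.group(1)}>.*?)")            # lazy hole
        pos = hole.end()
    parts.append(re.escape(template[pos:]))
    # Assumption: the matched statement ends where the line ends.
    return re.compile("".join(parts) + "$", re.MULTILINE)


def rewrite(source: str, match_template: str, rewrite_template: str) -> str:
    """Replace every match with the rewrite template, filling in the holes."""
    pattern = template_to_regex(match_template)

    def fill(match: re.Match) -> str:
        return HOLE.sub(lambda hole: match.group(hole.group(1)), rewrite_template)

    return pattern.sub(fill, source)


source = 'print("Starting job")\nx = 1\nprint("Loaded " + str(count) + " rows")\n'
result = rewrite(source, "print(:[message])", "logging.info(:[message])")
```

Here result holds the source with both print calls rewritten and the assignment left untouched. A plain regex like this does not understand balanced delimiters, which is the main thing structural matching adds.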
Go Pro with Matching
Combining Regular Expressions with Structural Matching
Embed a regular expression in a hole with :[hole~regex] to constrain what the hole may match while keeping the benefits of structural matching.
For example, to match a function call with a numeric argument, the template :[fn~\w+](:[arg~\d+]) matches foo(404) but not bar(not_a_number), because :[arg~\d+] requires a sequence of digits.
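The effect of the two regex holes can be checked with an ordinary regular expression. This is a sketch using Python's re module, whose syntax is close to but not identical to PCRE; the function name match_numeric_call is made up for illustration.

```python
import re

# Rough equivalent of the template :[fn~\w+](:[arg~\d+])
CALL_WITH_NUMERIC_ARG = re.compile(r"(?P<fn>\w+)\((?P<arg>\d+)\)")


def match_numeric_call(code: str):
    """Return the captured holes as a dict, or None when the call does not match."""
    match = CALL_WITH_NUMERIC_ARG.fullmatch(code)
    return match.groupdict() if match else None


matched = match_numeric_call("foo(404)")                # fn="foo", arg="404"
not_matched = match_numeric_call("bar(not_a_number)")   # None: argument is not digits
```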
Advanced Syntax Reference
The syntax below has special meaning for matching. Bind match contents to identifiers like hole using the Named Match syntax. Names are useful when replacing contents or writing rules. To match a pattern without giving it a meaningful name, use any of the Just Match forms.
Named Match | Just Match | Description |
---|---|---|
:[var] | ... :[_] | Match zero or more characters in a lazy fashion. When used inside delimiters, as in {:[v1], :[v2]} or (:[v]) , holes match within that group or code block, including newlines. Holes outside of delimiters stop matching at a newline, or the start of a code block, whichever comes first. |
:[var~regex] | :[~regex] | Match an arbitrary PCRE regular expression regex . Avoid regular expressions that match special syntax like ) or .* , otherwise your pattern may fail to match balanced blocks. |
:[[var]] | :[~\w+] :[[_]] | Match one or more alphanumeric characters and _ . |
:[var:e] | :[_:e] | Expression-like syntax matches contiguous non-whitespace characters like foo or foo.bar, as well as contiguous character sequences that include valid code block structures like balanced parentheses in function(foo, bar) (notice how whitespace is allowed inside the parentheses). Language-dependent. |
:[var.] | :[_.] | Match one or more alphanumeric characters and punctuation like . , ; , and - that do not affect balanced syntax. Language dependent. |
:[var\n] | :[~.*\n] :[_\n] | Match zero or more characters up to a newline, including the newline. |
:[ var] | :[var~[ \t]+] :[ ] | Match only whitespace characters, excluding newlines. |
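Several rows of the table give a regex form in the Just Match column, and those equivalences can be tried out directly. The snippet below is a sketch with Python's re module (an assumption: the tool itself uses PCRE, and Python's \w also accepts non-ASCII letters); the dictionary and helper names are illustrative.

```python
import re

# Regex equivalents taken from the Just Match column of the table.
HOLE_REGEX_EQUIVALENTS = {
    ":[[var]]": r"\w+",      # one or more alphanumeric characters and _
    ":[var\\n]": r".*\n",    # zero or more characters up to and including a newline
    ":[ var]": r"[ \t]+",    # whitespace only, excluding newlines
}


def hole_accepts(hole: str, text: str) -> bool:
    """Check whether the regex equivalent of a hole matches the whole text."""
    return re.fullmatch(HOLE_REGEX_EQUIVALENTS[hole], text) is not None


identifier_ok = hole_accepts(":[[var]]", "foo_bar1")      # True
dotted_rejected = hole_accepts(":[[var]]", "foo.bar")     # False: "." is not allowed
line_ok = hole_accepts(":[var\\n]", "rest of line\n")     # True
newline_rejected = hole_accepts(":[ var]", " \n")         # False: newline excluded
```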
Rewrite properties
Let’s Go!
We have convenient built-in properties to transform and substitute matched values for certain use cases that commonly crop up when rewriting code.
For example, :[hole].Capitalize capitalizes the string matched by hole.
Match Rule: :[[x]]
Rewrite Rule: :[[x]].Capitalize
Test Code: these are words 123
Result: These Are Words 123
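The same transformation can be reproduced in a few lines of Python. This is only an illustration of the behavior shown above, assuming \w+ as the stand-in for :[[x]]; capitalize_words is a hypothetical helper name.

```python
import re


def capitalize(text: str) -> str:
    """Uppercase the first character; digits and symbols are left unchanged."""
    return text[:1].upper() + text[1:]


def capitalize_words(source: str) -> str:
    """Apply capitalize to every run of word characters, like :[[x]].Capitalize."""
    return re.sub(r"\w+", lambda word: capitalize(word.group(0)), source)


result = capitalize_words("these are words 123")
```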
Properties are recognized in the rewrite template and substituted according to the predefined behavior. Property accesses cannot be chained. Below are the current built-in properties.
Built-in Properties
String converters
Property | Behavior |
---|---|
.lowercase | Convert letters to lowercase. |
.UPPERCASE | Convert letters to uppercase. |
.Capitalize | Capitalize the first character if it is a letter. |
.uncapitalize | Lowercase the first character if it is a letter. |
.UPPER_SNAKE_CASE | Convert camelCase to snake_case (each capital letter in camelCase gets a _ prepended). Then uppercase letters. |
.lower_snake_case | Convert camelCase to snake_case (each capital letter in camelCase gets a _ prepended). Then lowercase letters. |
.UpperCamelCase | Convert snake_case to CamelCase (each letter after _ in snake_case is capitalized, and the _ removed). Then capitalize the first character. |
.lowerCamelCase | Convert snake_case to CamelCase (each letter after _ in snake_case is capitalized, and the _ removed). Then lowercase the first character. |
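The converter descriptions above translate almost literally into code. The functions below are a sketch that follows the wording of the table, not the tool's actual implementation, and all function names are illustrative.

```python
import re


def capitalize(text: str) -> str:
    return text[:1].upper() + text[1:]


def uncapitalize(text: str) -> str:
    return text[:1].lower() + text[1:]


def to_snake(text: str) -> str:
    """Prepend _ to each capital letter, as the table describes."""
    return re.sub(r"([A-Z])", r"_\1", text)


def to_camel(text: str) -> str:
    """Capitalize each letter that follows _ and drop the _."""
    return re.sub(r"_([A-Za-z])", lambda m: m.group(1).upper(), text)


def upper_snake_case(text: str) -> str:
    return to_snake(text).upper()


def lower_snake_case(text: str) -> str:
    return to_snake(text).lower()


def upper_camel_case(text: str) -> str:
    return capitalize(to_camel(text))


def lower_camel_case(text: str) -> str:
    return uncapitalize(to_camel(text))
```

For instance, upper_snake_case("maxRetryCount") gives MAX_RETRY_COUNT, and upper_camel_case("max_retry_count") gives MaxRetryCount.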
Sizes
Property | Behavior |
---|---|
.length | Substitute the number of characters of the hole value. |
.lines | Substitute the number of lines of the hole value. |
Positions
Property | Behavior |
---|---|
.line | Substitute the starting line number of this hole. |
.line.start | Alias of .line . |
.line.end | Substitute the ending line number of this hole. |
.column | Substitute the starting column number of this hole (also known as character). |
.column.start | Alias of .column . |
.column.end | Substitute the ending column number of this hole. |
.offset | Substitute the starting byte offset of this hole in the file. |
.offset.start | Alias of .offset . |
.offset.end | Substitute the ending byte offset of this hole in the file. |
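To make the size and position properties concrete, the sketch below computes comparable values for a regex match in Python. It assumes that line and column numbers start at 1 and that offsets are zero-based UTF-8 byte offsets; the tables above do not state this, so treat those choices as assumptions. HolePosition and describe_match are illustrative names.

```python
import re
from dataclasses import dataclass


@dataclass
class HolePosition:
    length: int   # number of characters in the matched text
    line: int     # starting line (assumed 1-based)
    column: int   # starting column (assumed 1-based)
    offset: int   # starting byte offset (assumed 0-based, UTF-8)


def describe_match(text: str, match: re.Match) -> HolePosition:
    start = match.start()
    line_start = text.rfind("\n", 0, start) + 1  # index of the first char on this line
    return HolePosition(
        length=len(match.group(0)),
        line=text.count("\n", 0, start) + 1,
        column=start - line_start + 1,
        offset=len(text[:start].encode("utf-8")),
    )


text = "alpha\nbeta gamma\n"
position = describe_match(text, re.search("gamma", text))
```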
File context
Property | Behavior |
---|---|
.file | Substitute the absolute file path of the file where this hole matched. |
.file.path | Alias of .file . |
.file.name | Substitute the file name of the file where this hole matched (basename). |
.file.directory | Substitute the file directory of the file where the hole matched (dirname). |
Identity (for escaping property names)
Property | Behavior |
---|---|
.value | Substitute the text value of this hole (for escaping, see below). |
Resolving clashes with property names
Let’s say you want to literally insert the text .length after a hole. We can’t use :[hole].length because that reserved syntax substitutes the length of the match instead of inserting the text .length. To resolve a clash like this, use :[hole].value instead of :[hole] to substitute the value of the hole, then append .length to the template. The appended .length is then interpreted literally:
Match Rule: :[[x]]
Rewrite Rule: :[x].value.length is :[x].length
Test Code: a word
Result: a.length is 1 word.length is 4
The way this works is that :[hole].value acts as an escape sequence, so that any conflicting .<property> can be interpreted literally by simply appending it to :[hole].value.
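The escaping rule is easy to see in a toy template filler. The sketch below supports only the .value and .length properties and uses \w+ in place of :[[x]]; it illustrates the rule under those assumptions and is not the tool's real substitution logic. fill_template and rewrite_words are hypothetical names.

```python
import re

PROPERTIES = {
    "value": lambda text: text,
    "length": lambda text: str(len(text)),
}

# A hole followed by at most one property access; properties cannot be chained,
# so anything after the first property is ordinary template text.
HOLE_WITH_PROPERTY = re.compile(r":\[(\w+)\](?:\.(value|length)\b)?")


def fill_template(template: str, bindings: dict) -> str:
    def substitute(token: re.Match) -> str:
        value = bindings[token.group(1)]
        prop = token.group(2)
        return PROPERTIES[prop](value) if prop else value

    return HOLE_WITH_PROPERTY.sub(substitute, template)


def rewrite_words(source: str, template: str) -> str:
    """Bind each word to x and replace it with the filled-in template."""
    return re.sub(
        r"\w+",
        lambda word: fill_template(template, {"x": word.group(0)}),
        source,
    )


result = rewrite_words("a word", ":[x].value.length is :[x].length")
```

Because only the first property after a hole is consumed, the .length that follows :[x].value stays in the output as plain text.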
FAQs
What does lazy evaluation in hole matching mean?
A: Lazy evaluation means that a hole (:[hole_name], or ... for an unnamed hole) captures the shortest string that lets the rest of the template match, which keeps matches precise.
Example:
In if (width <= 1280 && height <= 800) { return 1;}, with the template if (:[var] <= :[rest]), :[var] matches only until it reaches the first <= that follows it, so it captures width. :[rest] captures the rest of the condition: 1280 && height <= 800
One way to refine matching is to add concrete context around holes, based on what we care about. For example, we could bind height to :[height] with either of these templates:
if (... && :[height] ...)
or
if (... :[height] <= 800)
In if (x < 10 && y > 20) {...}, the template if (:[condition] && ...) {...} matches x < 10 as :[condition].
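Lazy holes behave much like non-greedy regex quantifiers, which makes the difference easy to demonstrate. The comparison below is a sketch in Python's re module; the pattern names are made up, and the regexes only approximate the template if (:[var] <= :[rest]).

```python
import re

code = "if (width <= 1280 && height <= 800) { return 1;}"

# Lazy groups (.*?) stop as early as possible, like holes do.
LAZY = re.compile(r"if \((?P<var>.*?) <= (?P<rest>.*?)\)")
# Greedy groups (.*) grab as much as possible, shown here for contrast.
GREEDY = re.compile(r"if \((?P<var>.*) <= (?P<rest>.*)\)")

lazy_match = LAZY.search(code).groupdict()
greedy_match = GREEDY.search(code).groupdict()
# lazy_match:   var = "width", rest = "1280 && height <= 800"
# greedy_match: var = "width <= 1280 && height", rest = "800"
```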
What is structural matching in the context of code patterns?
A: Structural matching accurately handles code by recognizing balanced delimiters ((), [], {}), enabling precise manipulation of nested structures.
Example: For calculate(sum(add(2, 3), multiply(4, 5))), the template add(:[args]) matches add(2, 3).
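The core idea of balanced matching can be sketched with a depth counter. The function below handles only parentheses and ignores strings and comments, so it is a simplified illustration rather than how the tool works internally; capture_balanced_args is a hypothetical name.

```python
def capture_balanced_args(code: str, function_name: str) -> list:
    """Return the argument text of each function_name(...) call, respecting nesting."""
    results = []
    needle = function_name + "("
    search_from = 0
    while (start := code.find(needle, search_from)) != -1:
        depth = 0
        # Start scanning at the opening parenthesis of the call.
        for i in range(start + len(function_name), len(code)):
            if code[i] == "(":
                depth += 1
            elif code[i] == ")":
                depth -= 1
                if depth == 0:  # found the parenthesis that closes the call
                    results.append(code[start + len(needle):i])
                    break
        search_from = start + len(needle)
    return results


code = "calculate(sum(add(2, 3), multiply(4, 5)))"
add_args = capture_balanced_args(code, "add")
sum_args = capture_balanced_args(code, "sum")
```

Here add_args is ["2, 3"], while sum_args contains the full nested argument list add(2, 3), multiply(4, 5), which a pattern that stops at the first closing parenthesis would cut short.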
How does understanding language constructs affect code matching?
A: It’s essential for accurately matching patterns within complex language constructs like comments and string literals.
Example: foo(bar(5 /* includes ) tax */)) matched with foo(bar(:[arg])) captures 5 /* includes ) tax */ as :[arg]. The ) inside the comment /* includes ) tax */ is not treated as a closing parenthesis.
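A small scanner shows why comment awareness matters for balancing. This sketch understands only /* ... */ block comments and parentheses, which is an assumption made for brevity; the function name is illustrative.

```python
def capture_arg_skipping_comments(code: str, prefix: str):
    """Capture text after prefix up to its balancing ')', ignoring block comments.

    prefix is expected to end with '(' (for example "foo(bar(").
    """
    start = code.find(prefix)
    if start == -1:
        return None
    begin = start + len(prefix)
    i = begin
    depth = 1  # the '(' that ends the prefix is already open
    while i < len(code):
        if code.startswith("/*", i):
            end = code.find("*/", i + 2)
            if end == -1:
                return None  # unterminated comment
            i = end + 2      # jump past the whole comment
            continue
        if code[i] == "(":
            depth += 1
        elif code[i] == ")":
            depth -= 1
            if depth == 0:
                return code[begin:i]
        i += 1
    return None


arg = capture_arg_skipping_comments("foo(bar(5 /* includes ) tax */))", "foo(bar(")
```

Without the comment check, the scan would stop at the ) inside the comment and capture only the text before it.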
Does whitespace affect the matching process?
A: No, variations in whitespace do not impact matching, allowing templates to adapt to different code formatting styles effectively.
Examples:
The match template if (... && :[height] ...) works with:
- Single-line:
if (width <= 1280 && height <= 800) { return 1; }
- Multi-line with standard spacing:
- Multi-line with extra spacing, which demonstrates that matching is unaffected by whitespace differences.
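Whitespace-insensitive matching can be imitated by letting every gap in a pattern accept any run of whitespace. The sketch below does this for a literal condition with Python's re module; the multi-line inputs are made-up illustrations, and whitespace_insensitive is a hypothetical helper.

```python
import re


def whitespace_insensitive(pattern_text: str) -> re.Pattern:
    """Build a regex in which each gap between tokens matches any whitespace run."""
    tokens = pattern_text.split()
    return re.compile(r"\s+".join(re.escape(token) for token in tokens))


pattern = whitespace_insensitive("width <= 1280 && height <= 800")

single_line = "if (width <= 1280 && height <= 800) { return 1; }"
multi_line = "if (width <= 1280\n    && height <= 800) {\n  return 1;\n}"
extra_spacing = "if (width   <=   1280\n\n      &&   height   <=   800)\n{\n  return 1;\n}"

matches = [bool(pattern.search(code)) for code in (single_line, multi_line, extra_spacing)]
```

All three layouts match the same pattern, because only the sequence of tokens matters and not how they are spaced.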