- Quescol 1-Minute Read
- Let’s Understand in Depth
- What is Finite Automata ?
- Algorithm: String Matching with Finite Automata
- Step 1: Preprocessing
- Step 2: Searching
- Step 3: Continue until the text ends.
- Examples of String Matching with Finite Automata
- Example 1:
- Example 2:
- Java Program: String Matching with Finite Automata
- Output
- Explanation
- Time Complexity of String Matching
- Space Complexity of String Matching
- Conclusion
String matching is the problem of finding occurrences of a pattern string P in a larger text string T. There are several algorithms to solve this problem efficiently. One powerful method is using finite automata, which comes from automata theory in computer science.
Finite automata are abstract machines with a finite number of states, transitions, and a starting and accepting state. They process input strings one character at a time and change their state according to a transition function.
Quescol 1-Minute Read
Finite Automata is a method for efficiently finding a pattern string inside a larger text string. It works by converting the pattern into a state machine with a starting state, intermediate states, and an accepting state. As the text is read one character at a time, the automaton moves from state to state based on the current character.
Reaching the accepting state means the pattern has been found. This approach avoids unnecessary character comparisons and scans the text in linear time after preprocessing. Although building the transition table takes some memory and preprocessing time, it is very effective for repeated or fast pattern searches.
Key Points:
- States (Q): Represent how many characters of the pattern have matched.
- Alphabet (Σ): All possible input characters.
- Transition Function (δ): Determines next state based on current state and character.
- Start State (q0): Initial state where no characters have matched.
- Accepting State (qm): Final state indicating full pattern match.
- Examples:
- Pattern “AAA” in text “AAAAA” → matches at indices 0, 1, 2.
- Pattern “ABC” in text “ABXABCAB” → matches at index 3.
Let’s Understand in Depth
What is Finite Automata ?
A finite automaton (FA) for string matching is defined as a finite set of states, including a starting state and an accepting state, with transitions based on input characters of the pattern. The automaton moves from state to state as it reads text, and reaching the accepting state indicates a match of the pattern.
Important components:
- States (Q): Each state represents how many characters of the pattern have been matched so far.
- Alphabet (Σ): Set of possible input characters.
- Transition function (δ): Determines the next state based on the current state and input character.
- Start state (q0): State where no character has matched yet.
- Accepting state (qm): Final state indicating the pattern has been matched.
Example

The above image explains how a smaller string, called a pattern, is found inside a larger string, called a text. In the example, the text is “AAABCXRHBABC” and the pattern is “ABC”. By checking each part of the text, we see that the pattern “ABC” appears twice — once starting at index 2 and again at index 9. This means that if we count the letters from 0, the first match begins at the third letter, and the second match starts at the tenth letter.
Algorithm: String Matching with Finite Automata
Step 1: Preprocessing
- Construct a transition table for the pattern P.
- Each entry δ(q, a) represents the next state after reading character
ain stateq. - For a pattern of length m and alphabet size |Σ|, preprocessing takes O(m|Σ|).
Step 2: Searching
- Start from the initial state q0.
- For each character of the text T:
- Move to the next state using δ(current_state, character).
- If the accepting state is reached, report a pattern occurrence.
Step 3: Continue until the text ends.
Examples of String Matching with Finite Automata
Example 1:
Pattern: P = "AAA"
Text: T = "AAAAA"
States: 0 (no match), 1 (matched "A"), 2 (matched "AA"), 3 (matched "AAA" — full match)
Transition table (alphabet = {A}):
| State | A |
|---|---|
| 0 | 1 |
| 1 | 2 |
| 2 | 3 |
| 3 | 3 |
- Start at state
0. - Read
A(index 0):0 → 1 - Read
A(index 1):1 → 2 - Read
A(index 2):2 → 3→ match found ending at index 2 → start index =2 - 3 + 1 = 0 - Continue: Read
A(index 3): from state3onA→3→ match found ending at index 3 → start index =1 - Continue: Read
A(index 4): from state3onA→3→ match found ending at index 4 → start index =2
Matches found: indices 0, 1, 2 (overlapping matches because the pattern repeats inside itself).
Example 2:
Pattern: P = "ABC"
Text: T = "ABXABCAB"
States: 0 (no match), 1 (matched "A"), 2 (matched "AB"), 3 (matched "ABC" — full match)
Transition table (alphabet = {A, B, C, X}):
| State | A | B | C | X |
|---|---|---|---|---|
| 0 | 1 | 0 | 0 | 0 |
| 1 | 1 | 2 | 0 | 0 |
| 2 | 1 | 0 | 3 | 0 |
| 3 | 1 | 0 | 0 | 0 |
- Start at state
0. - Read
A(index 0):0 → 1 - Read
B(index 1):1 → 2 - Read
X(index 2):2 → 0(no continuation becauseXbreaks the prefix) - Read
A(index 3):0 → 1 - Read
B(index 4):1 → 2 - Read
C(index 5):2 → 3→ match found ending at index 5 → start index =5 - 3 + 1 = 3 - Continue after match: Read
A(index 6):3 → 1 - Read
B(index 7):1 → 2(end of text, no full match)
Matches found: index 3 only.
Java Program: String Matching with Finite Automata
public class FiniteAutomataStringMatch {
static final int NO_OF_CHARS = 256;
public static void search(String pat, String txt) {
int M = pat.length();
int N = txt.length();
int[][] TF = new int[M + 1][NO_OF_CHARS];
computeTransitionFunction(pat, M, TF);
int state = 0;
for (int i = 0; i < N; i++) {
state = TF[state][txt.charAt(i)];
if (state == M)
System.out.println("Pattern found at index " + (i - M + 1));
}
}
private static void computeTransitionFunction(String pat, int M, int[][] TF) {
for (int state = 0; state <= M; state++) {
for (int x = 0; x < NO_OF_CHARS; x++) {
TF[state][x] = getNextState(pat, M, state, (char)x);
}
}
}
private static int getNextState(String pat, int M, int state, char x) {
if (state < M && x == pat.charAt(state))
return state + 1;
for (int ns = state; ns > 0; ns--) {
if (pat.charAt(ns - 1) == x) {
int i;
for (i = 0; i < ns - 1; i++)
if (pat.charAt(i) != pat.charAt(state - ns + 1 + i))
break;
if (i == ns - 1)
return ns;
}
}
return 0;
}
public static void main(String[] args) {
String txt = "ABABABAB";
String pat = "ABAB";
search(pat, txt);
}
}
Output
Pattern found at index 0
Pattern found at index 2
Pattern found at index 4
Explanation
The program starts with a constant NO_OF_CHARS = 256, representing the total number of possible characters (ASCII). The main function search() takes two inputs — the pattern (pat) and the text (txt). It first finds the length of the pattern (M) and the text (N) and then creates a transition table (TF), which stores state changes for every possible character.
The function computeTransitionFunction() fills this transition table. For each state (representing how many characters of the pattern have been matched), and for each possible character (x), it calls the helper function getNextState() to determine what the next state should be if that character appears.
The getNextState() function is the most important part. It checks whether the next character in the text matches the next character of the pattern. If it matches, it moves to the next state (state + 1). Otherwise, it tries to find the longest prefix of the pattern that matches the current suffix — this ensures no character comparisons are wasted.
After building the transition table, the search() function scans the text one character at a time. The current state changes based on the transition table entries. Whenever the state reaches M (the length of the pattern), it means the pattern is found in the text — so it prints the starting index using (i - M + 1).
Time Complexity of String Matching
| Case | Time Complexity |
|---|---|
| Best Case | O(n) |
| Average Case | O(n) |
| Worst Case | O(m × Σ + n) |
Space Complexity of String Matching
| Case | Space Complexity |
|---|---|
| Best Case | O(m × Σ) |
| Average Case | O(m × Σ) |
| Worst Case | O(m × Σ) |
Conclusion
String matching with finite automata is a powerful and deterministic method for finding patterns in text efficiently. By preprocessing the pattern into a state machine, the algorithm can scan the text in linear time while avoiding unnecessary character comparisons. Although it requires memory for the transition table and preprocessing time, it is highly effective for applications requiring repeated or fast pattern matching. This method also introduces beginners to automata theory, prefix-suffix analysis, and practical graph-like transitions for problem-solving.