The Text Search algorithm is a fundamental method in text processing and information retrieval. This algorithm allows us to locate and identify occurrences of a substring (or pattern) within a larger piece of text.
How It Works
- Start with a larger piece of text (or document) and a substring to search for.
- Iterate through each character of the text sequentially.
- Compare the substring with a portion of the text that has the same length as the substring. If a match is found, record the current position.
- Move to the next position and continue the comparison.
Example
Consider the text: "Let's search for a substring within this text to see how the algorithm works."
And the substring to search for: "substring"
- Start at position 0. Compare
Let'
with "substring". No match. - Move to position 1. Compare
et's
with "substring". No match. - Move to position 2. Compare
t's
" with "substring". No match. - Move to position 3. Compare
's s
with "substring". No match. - Move to position 4. Compare
se
with "substring". No match. - Move to position 5. Compare
sea
with "substring". No match. - Move to position 6. Compare
earc
with "substring". No match. - Move to position 7. Compare
arch
with "substring". No match. - Move to position 8. Compare
rch
" with "substring". No match. - Move to position 9. Compare
ch w
with "substring". No match. - Move to position 10. Compare
h wi
with "substring". No match. - Move to position 11. Compare "
wit
with "substring". No match. - Move to position 12. Compare
with
with "substring". No match. - Move to position 13. Compare
ithi
with "substring". No match. - Move to position 14. Compare
thin
with "substring". No match. - Move to position 15. Compare
hinn
with "substring". No match. - Move to position 16. Compare
in t
with "substring". No match. - Move to position 17. Compare
n th
with "substring". No match. - Move to position 18. Compare "
thi
with "substring". No match. - Move to position 19. Compare
this
with "substring". No match. - Move to position 20. Compare
his
" with "substring". No match. - Move to position 21. Compare
is t
with "substring". No match. - Move to position 22. Compare
s te
with "substring". No match. - Move to position 23. Compare
ubst
with "substring". No match. - Move to position 24. Compare
bstr
with "substring". No match. - Move to position 25. Compare
stre
with "substring". No match. - Move to position 26. Compare
trin
with "substring". No match. - Move to position 27. Compare
ring
with "substring". Match found, record position 27.
The substring "substring" is found at position 27 within the text.
Example Code in C++
#include <iostream>
#include <string>
#include <vector>
std::vector<int> textSearch(const std::string& text, const std::string& query) {
std::vector<int> positions;
for (int i = 0; i <= text.length() - query.length(); ++i) {
int j = 0;
while (j < query.length() && text[i + j] == query[j]) {
++j;
}
if (j == query.length()) {
positions.push_back(i);
}
}
return positions;
}
int main() {
std::string text = "Let's search for a substring within this text to see how the algorithm works.";
std::string query = "substring";
std::vector<int> result = textSearch(text, query);
std::cout << "Substring found at positions: ";
for (int pos : result) {
std::cout << pos << " ";
}
std::cout << std::endl;
return 0;
}
In this example, the textSearch
function is used to find the positions of the substring "substring" within the text. The result will be a vector containing the starting positions of the matches.