Full text search
Full-text search is a technique for searching words or phrases across a large set of textual data. Unlike traditional queries that require exact matches, full-text search analyzes the words in your query and ranks documents by how well they match. It is widely used in applications like search engines, e-commerce platforms, documentation searches, and content management systems.
Understanding text indexes
A full-text index is fundamentally different from a regular database index. Instead of simply mapping fields to values, it:
- Tokenizes text (splits it into words or phrases).
- Removes stop words (such as "is", "the", "and").
- Applies stemming (so "running" and "run" are treated as the same).
- Assigns weights based on frequency, importance, or custom ranking logic.
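The first three steps above can be sketched in a few lines of JavaScript. This is a toy illustration only, not how FerretDB analyzes text: the stop word list, the `naiveStem` helper, and the `analyze` function are all hypothetical, and a real engine uses a language-aware stemmer that handles far more cases.

```javascript
// Toy analyzer: hypothetical helpers, not FerretDB's implementation.
const STOP_WORDS = new Set(['is', 'the', 'and', 'a', 'of']);

// Very naive suffix stripping; real stemmers cover many more rules.
function naiveStem(word) {
  let stem = word;
  if (stem.length > 5 && stem.endsWith('ing')) {
    stem = stem.slice(0, -3);
    // Collapse a doubled final consonant: "runn" becomes "run".
    if (stem.length > 2 && stem[stem.length - 1] === stem[stem.length - 2]) {
      stem = stem.slice(0, -1);
    }
  } else if (stem.length > 3 && stem.endsWith('s') && !stem.endsWith('ss')) {
    stem = stem.slice(0, -1);
  }
  return stem;
}

// Tokenize, drop stop words, then stem each remaining token.
function analyze(text) {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0 && !STOP_WORDS.has(token))
    .map(naiveStem);
}

console.log(analyze('Running is the best'));
console.log(analyze('hunting whales'));
```

Because the query and the stored text pass through the same analysis, a query word such as "running" and a stored word such as "run" end up as the same token.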
FerretDB supports full-text search capabilities.
Creating a full-text index takes the following parameters:
| Field | Description |
|---|---|
| name | A custom name for the index, useful for reference. |
| weights | Assigns weighting to fields (higher values mean more relevance in search). Default is 1. |
| default_language | Specifies the language used for stemming (default: "english"). |
| caseSensitive | Enables case-sensitive search. |
FerretDB only supports one text index per collection.
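As a rough sketch of how these parameters fit together, the command below builds a text index definition that sets `name`, `weights`, and `default_language`. The collection name, index name, and weight values are assumptions chosen for illustration, and the option placement follows the standard `createIndexes` index specification. Since only one text index is allowed per collection, treat this as an alternative to the other index examples rather than something to run alongside them.

```javascript
// Illustrative values only: the index name and weights are hypothetical.
const createWeightedTextIndex = {
  createIndexes: 'books',
  indexes: [
    {
      key: { title: 'text', summary: 'text' },
      name: 'title_summary_weighted_text_index',
      // Matches in title count more than matches in summary.
      weights: { title: 10, summary: 5 },
      default_language: 'english'
    }
  ]
};

// `db` only exists inside a shell session such as mongosh.
if (typeof db !== 'undefined') {
  db.runCommand(createWeightedTextIndex);
}
```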
Single full text index
A single-field text index is created on one field in a collection.
Creating a text index
To create a text index, use the createIndexes command with the field you want to index and the type set to 'text'.
db.runCommand({
  createIndexes: 'books',
  indexes: [
    {
      key: { summary: 'text' },
      name: 'summary_text_index'
    }
  ]
})
This command creates a full-text index on the summary field in a books collection.
Insert the following documents into the books collection:
db.runCommand({
  insert: 'books',
  documents: [
    {
      _id: 'pride_prejudice_1813',
      title: 'Pride and Prejudice',
      author: 'Jane Austen',
      summary:
        'The novel follows the story of Elizabeth Bennet, a spirited young woman navigating love, ' +
        'societal expectations, and family drama in 19th-century England.'
    },
    {
      _id: 'moby_dick_1851',
      title: 'Moby Dick',
      author: 'Herman Melville',
      summary:
        'The narrative follows Ishmael and his voyage aboard the whaling ship Pequod, commanded by Captain Ahab, ' +
        'who is obsessed with hunting the elusive white whale, Moby Dick.'
    },
    {
      _id: 'frankenstein_1818',
      title: 'Frankenstein',
      author: 'Mary Shelley',
      summary:
        'Victor Frankenstein, driven by an unquenchable thirst for knowledge, creates a living being, ' +
        'only to face tragic consequences as his creation turns monstrous.'
    }
  ]
})
Performing a full text search
Let's run a basic full-text search query to find all documents that contain the word "drama" in the summary field.
db.runCommand({
  find: 'books',
  filter: {
    $text: {
      $search: 'drama'
    }
  }
})
This query returns all documents where the summary field contains the word "drama".
response = {
  cursor: {
    id: Long('0'),
    ns: 'new.books',
    firstBatch: [
      {
        _id: 'pride_prejudice_1813',
        title: 'Pride and Prejudice',
        author: 'Jane Austen',
        summary:
          'The novel follows the story of Elizabeth Bennet, a spirited young woman navigating love, societal expectations, and family drama in 19th-century England.'
      }
    ]
  },
  ok: 1
}
Compound text index
A compound text index covers multiple fields. Because a collection can have only one text index, drop the existing text index before creating a new one.
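One way to drop the earlier index is the `dropIndexes` command, shown here as a minimal sketch. It assumes the single-field index was created with the name `summary_text_index`, as in the example above; adjust the name if yours differs.

```javascript
// Assumes the index name used earlier in this guide.
const dropSummaryTextIndex = {
  dropIndexes: 'books',
  index: 'summary_text_index'
};

// `db` only exists inside a shell session such as mongosh.
if (typeof db !== 'undefined') {
  db.runCommand(dropSummaryTextIndex);
}
```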
Let's create a compound text index on the title and summary fields.
db.runCommand({
  createIndexes: 'books',
  indexes: [
    {
      key: { title: 'text', summary: 'text' },
      name: 'title_summary_text_index'
    }
  ]
})
Relevance score
When you perform a full-text search, a relevance score is assigned to each document based on how well it matches the search query. Relevance scores are calculated based on factors like word frequency, proximity, and custom weights. Higher scores indicate better relevance.
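To build intuition for how frequency and weights can combine, here is a deliberately simplified scoring sketch. It is not the formula FerretDB uses: `toyScore`, the sample document, and the weight values are all hypothetical, and the function ignores stemming and proximity entirely. It simply counts exact term occurrences per field and multiplies each count by that field's weight.

```javascript
// Toy scoring only: NOT the ranking formula FerretDB uses.
function toyScore(doc, query, weights) {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  let score = 0;
  for (const [field, weight] of Object.entries(weights)) {
    const words = String(doc[field] || '')
      .toLowerCase()
      .split(/[^a-z0-9]+/)
      .filter(Boolean);
    for (const term of terms) {
      // Each exact occurrence of a term adds the field's weight.
      score += weight * words.filter((word) => word === term).length;
    }
  }
  return score;
}

// Hypothetical sample document, made up for this sketch.
const sampleBook = {
  title: 'Moby Dick',
  summary: 'A captain hunts the white whale Moby Dick.'
};

console.log(toyScore(sampleBook, 'moby whale', { title: 1, summary: 1 }));
console.log(toyScore(sampleBook, 'moby whale', { title: 2, summary: 1 }));
```

Raising the weight of title increases the contribution of the "moby" match in the title, so the second call yields a higher score than the first for the same document and query.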
Let's search for books whose summary matches "hunt whales" and return the relevance score for each result.
db.runCommand({
  find: 'books',
  filter: { $text: { $search: 'hunt whales' } },
  projection: { title: 1, author: 1, summary: 1, score: { $meta: 'textScore' } },
  sort: { score: { $meta: 'textScore' } }
})
Although the summary contains "hunting" and "whale" rather than the exact query words, stemming reduces them to the same roots, so the document still matches.
response = {
  cursor: {
    id: Long('0'),
    ns: 'new.books',
    firstBatch: [
      {
        _id: 'moby_dick_1851',
        title: 'Moby Dick',
        author: 'Herman Melville',
        summary:
          'The narrative follows Ishmael and his voyage aboard the whaling ship Pequod, commanded by Captain Ahab, who is obsessed with hunting the elusive white whale, Moby Dick.',
        score: 3
      }
    ]
  },
  ok: 1
}