Version: v2.0

Full text search

Full-text search is a technique for finding words or phrases across a large body of text. Unlike traditional queries that require exact matches, full-text search matches on normalized word forms and ranks the results by relevance. It is widely used in search engines, e-commerce platforms, documentation search, and content management systems.

Understanding text indexes

A full-text index is fundamentally different from a regular database index. Instead of simply mapping fields to values, it:

Tokenizes text (splits it into words or phrases).
Removes stop words (such as "is", "the", "and").
Applies stemming (so "running" and "run" are treated as the same).
Assigns weights based on frequency, importance, or custom ranking logic.
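The first three steps above can be sketched in a few lines of JavaScript. This is a toy illustration only: the stop-word list, the `naiveStem` helper, and the `analyze` function are hypothetical names made up for this example, and a real text index uses a language-aware analyzer rather than simple suffix stripping.

```javascript
// Toy illustration only; not how FerretDB analyzes text internally.
// The stop-word list below is a small, assumed sample.
const STOP_WORDS = new Set(['is', 'the', 'and', 'a', 'of', 'in', 'with', 'who'])

// Crude suffix stripping as a stand-in for a real stemmer.
// A real stemmer also handles cases this one gets wrong, such as "running".
function naiveStem(word) {
  for (const suffix of ['ing', 's']) {
    if (word.length > suffix.length + 2 && word.endsWith(suffix)) {
      return word.slice(0, -suffix.length)
    }
  }
  return word
}

function analyze(text) {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/) // tokenize on anything that is not a letter or digit
    .filter((token) => token.length > 0 && !STOP_WORDS.has(token)) // drop stop words
    .map(naiveStem) // reduce each remaining word to a rough root form
}

const indexedTerms = analyze('Hunting the elusive white whale')
const queryTerms = analyze('hunt whales')

// Every query term has a counterpart among the indexed terms,
// even though neither "hunt" nor "whales" appears verbatim in the text.
const allTermsMatch = queryTerms.every((term) => indexedTerms.includes(term))
```

Because both the indexed text and the query pass through the same analysis, differing word forms can still meet on a common root.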

FerretDB supports full-text search capabilities.

Creating a full-text index accepts the following parameters:

Field	Description
name	A custom name for the index, useful for reference.
weights	Assigns weighting to fields (higher values mean more relevance in search). Default is `1`.
default_language	Specifies the language used for stemming (default: "english").
caseSensitive	Enables case-sensitive search.
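As a rough sketch of how these parameters fit together, the command document below sets `weights` and `default_language` next to the index key. The index name and the weight values are assumptions chosen for illustration, and the layout follows the usual `createIndexes` index specification. Because only one text index is allowed per collection, treat this as an alternative to the other index examples on this page, not something to run alongside them.

```javascript
// Assumed example: the index name and the weight values are illustrative only.
const createWeightedTextIndex = {
  createIndexes: 'books',
  indexes: [
    {
      key: { title: 'text', summary: 'text' },
      name: 'title_summary_weighted_text_index',
      // Matches in `title` count more towards relevance than matches in `summary`.
      weights: { title: 10, summary: 5 },
      // Language used for stemming.
      default_language: 'english'
    }
  ]
}

// `db` exists only inside a mongosh session; the guard keeps the
// snippet harmless when it is run anywhere else.
if (typeof db !== 'undefined') {
  db.runCommand(createWeightedTextIndex)
}
```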

Note

FerretDB supports only one text index per collection.

Single full text index

A single text index covers one field in a collection.

Creating a text index

To create a text index, use the createIndexes command, specifying the field you want to index with its type set to 'text'.

db.runCommand({
  createIndexes: 'books',
  indexes: [
    {
      key: { summary: 'text' },
      name: 'summary_text_index'
    }
  ]
})

This command creates a full-text index on the summary field in the books collection.

Insert the following documents into the books collection:

db.runCommand({
  insert: 'books',
  documents: [
    {
      _id: 'pride_prejudice_1813',
      title: 'Pride and Prejudice',
      author: 'Jane Austen',
      summary:
        'The novel follows the story of Elizabeth Bennet, a spirited young woman navigating love, ' +
        'societal expectations, and family drama in 19th-century England.'
    },
    {
      _id: 'moby_dick_1851',
      title: 'Moby Dick',
      author: 'Herman Melville',
      summary:
        'The narrative follows Ishmael and his voyage aboard the whaling ship Pequod, commanded by Captain Ahab, ' +
        'who is obsessed with hunting the elusive white whale, Moby Dick.'
    },
    {
      _id: 'frankenstein_1818',
      title: 'Frankenstein',
      author: 'Mary Shelley',
      summary:
        'Victor Frankenstein, driven by an unquenchable thirst for knowledge, creates a living being, ' +
        'only to face tragic consequences as his creation turns monstrous.'
    }
  ]
})

Performing a full text search

Let's run a basic full text search query to find all documents that contain the word "drama" in the summary field.

db.runCommand({
  find: 'books',
  filter: {
    $text: {
      $search: 'drama'
    }
  }
})

This query returns all documents where the summary field contains the word "drama".

response = {
  cursor: {
    id: Long('0'),
    ns: 'new.books',
    firstBatch: [
      {
        _id: 'pride_prejudice_1813',
        title: 'Pride and Prejudice',
        author: 'Jane Austen',
        summary:
          'The novel follows the story of Elizabeth Bennet, a spirited young woman navigating love, societal expectations, and family drama in 19th-century England.'
      }
    ]
  },
  ok: 1
}

Compound text index

A compound text index covers multiple fields. Because FerretDB supports only one text index per collection, drop the existing text index before creating a new one.
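One way to drop the index created earlier is the `dropIndexes` command, passing the index name. This is a minimal sketch that assumes the index is still named 'summary_text_index'; adjust the name if yours differs.

```javascript
// Assumes the text index created earlier is named 'summary_text_index'.
const dropSummaryTextIndex = {
  dropIndexes: 'books',
  index: 'summary_text_index'
}

// `db` exists only inside a mongosh session; the guard keeps the
// snippet harmless when it is run anywhere else.
if (typeof db !== 'undefined') {
  db.runCommand(dropSummaryTextIndex)
}
```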

Let's create a compound text index on the title and summary fields.

db.runCommand({
  createIndexes: 'books',
  indexes: [
    {
      key: { title: 'text', summary: 'text' },
      name: 'title_summary_text_index'
    }
  ]
})

Relevance score

When you perform a full-text search, a relevance score is assigned to each document based on how well it matches the search query. Relevance scores are calculated based on factors like word frequency, proximity, and custom weights. Higher scores indicate better relevance.
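To build intuition for how word frequency and field weights can combine, here is a deliberately simplified scoring sketch. It is not the formula FerretDB uses: the `toyScore` function, the sample documents, and the weights are all hypothetical, and proximity and stemming are ignored.

```javascript
// Toy scoring only; not the formula FerretDB uses.
// Score = sum over fields of (field weight * number of exact term occurrences).
function toyScore(doc, queryTerms, weights) {
  let score = 0
  for (const [field, weight] of Object.entries(weights)) {
    const tokens = String(doc[field] || '').toLowerCase().split(/[^a-z0-9]+/)
    for (const term of queryTerms) {
      score += weight * tokens.filter((token) => token === term).length
    }
  }
  return score
}

// Hypothetical sample documents and weights, made up for this illustration.
const sampleDocs = [
  { title: 'Moby Dick', summary: 'A captain hunts a white whale.' },
  { title: 'The Whale', summary: 'A story about the sea.' }
]
const sampleWeights = { title: 10, summary: 1 }

// Score each document for the term "whale" and sort by descending score.
const ranked = sampleDocs
  .map((doc) => ({ title: doc.title, score: toyScore(doc, ['whale'], sampleWeights) }))
  .sort((a, b) => b.score - a.score)
```

In this sketch a single match in the heavily weighted title outranks a single match in the summary, which is the kind of effect custom weights are meant to produce.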

Let's search the indexed fields for books that match the terms "hunt whales" and return the relevance score for each result.

db.runCommand({
  find: 'books',
  filter: { $text: { $search: 'hunt whales' } },
  projection: { title: 1, author: 1, summary: 1, score: { $meta: 'textScore' } },
  sort: { score: { $meta: 'textScore' } }
})

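Neither "hunt" nor "whales" appears verbatim in any of the documents, but stemming reduces the query terms and indexed words such as "hunting" and "whale" to common roots, so the search still matches Moby Dick.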

response = {
  cursor: {
    id: Long('0'),
    ns: 'new.books',
    firstBatch: [
      {
        _id: 'moby_dick_1851',
        title: 'Moby Dick',
        author: 'Herman Melville',
        summary:
          'The narrative follows Ishmael and his voyage aboard the whaling ship Pequod, commanded by Captain Ahab, who is obsessed with hunting the elusive white whale, Moby Dick.',
        score: 3
      }
    ]
  },
  ok: 1
}
