Threshold

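The threshold property controls how many results a search returns: it decides how many partially matching documents are included alongside the documents that match every keyword in the search term.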

The problem

Let's consider the following example:

javascript
import { create, insert, search } from '@orama/orama'

const db = await create({
  schema: {
    title: 'string',
  }
})

await insert(db, { title: 'Blue t-shirt, slim fit' })
await insert(db, { title: 'Blue t-shirt, regular fit' })
await insert(db, { title: 'Red t-shirt, slim fit' })
await insert(db, { title: 'Red t-shirt, oversize fit' })

As you can see, we're inserting 4 documents with a lot of common keywords.

What happens if I search for "t-shirt"?

javascript
const results = await search(db, {
  term: 't-shirt',
})

// results.count = 4

In that case, every single document will be returned, as they all contain the "t-shirt" keyword.

Now, what happens if I search for "regular fit"?

javascript
const results = await search(db, {
  term: 'regular fit',
})

// results.count = 4

What! Why do I get 4 results? I only have 1 document that contains the "regular fit" keyword!

Well, Orama will position the document containing the "regular fit" keyword at the top of the results, but it will also return the other 3 documents, as they also contain the "fit" keyword.
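To make the overlap concrete, here is a minimal sketch in plain JavaScript that lists which keywords of the query each title contains. The `tokenize` and `matchedKeywords` helpers are hypothetical and written only for this illustration; they are a simplified stand-in, not Orama's tokenizer or scoring logic.

```javascript
// Simplified stand-in for tokenization: lowercase the text and split on
// anything that is not a letter or digit. This is NOT Orama's tokenizer.
const tokenize = (text) => text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean)

const titles = [
  'Blue t-shirt, slim fit',
  'Blue t-shirt, regular fit',
  'Red t-shirt, slim fit',
  'Red t-shirt, oversize fit',
]

// Hypothetical helper: returns the query keywords that also appear in the title.
function matchedKeywords(term, title) {
  const titleTokens = new Set(tokenize(title))
  return tokenize(term).filter((keyword) => titleTokens.has(keyword))
}

const overlap = titles.map((title) => ({
  title,
  matched: matchedKeywords('regular fit', title),
}))

for (const { title, matched } of overlap) {
  console.log(`${title} -> ${matched.join(', ')}`)
}
```

Every title shares "fit" with the query, but only one title contains both "regular" and "fit". That is the document ranked first, with the other three following as partial matches.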

With very long search queries, this can lead to a lot of results, which, depending on your index size, might not be what you want.

Imagine you have a database with 1 million documents, and you want to search for "red t-shirt with long sleeves and a motorbike printed on the front". That's a pretty broad search, right? In a case like this, it probably makes sense to limit the results a bit.

Using the threshold property

The threshold property solves this problem by limiting (or maximizing) the number of results returned by a search operation. It must be a number between 0 and 1, and it represents the fraction of partially matching documents to return in addition to the documents that match every keyword.

Setting the threshold to 1 (default)

By default, Orama sets the threshold to 1. This means that all the results will be returned.

javascript
const results = await search(db, {
  term: 'slim fit',
})

This will return all the documents containing either the "slim" keyword or the "fit" keyword. In our case, considering the example above, all the documents will be returned.

Setting the threshold to 0

With the same example data, a threshold of 0 works as follows:

javascript
const results = await search(db, {
  term: 'slim fit',
  threshold: 0,
})

In this case, the threshold property is set to 0, which means that only the documents containing both the "slim" and "fit" keywords will be returned (in our example, the two slim-fit t-shirts). The keywords do not have to appear in the same property: if one keyword is found in one property and another keyword is found in a different property, the document will still be returned.

You can boost the results depending on the property in which a keyword is found using the field boosting API.

Setting the threshold to a value between 0 and 1

javascript
const results = await search(db, {
  term: 'slim fit',
  threshold: 0.6,
})

In this case, the threshold property is set to 0.6, which means that the documents containing both the "slim" and "fit" keywords will be returned, plus 60% of the remaining documents that contain only one of the two keywords.
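The three settings can be compared with a small, self-contained sketch of the rule described on this page. The `applyThreshold` function is hypothetical and not part of Orama's API. It assumes the results are already split into full and partial matches and sorted by relevance, and rounding up to a whole document is an assumption made for this illustration.

```javascript
// Hypothetical helper, not part of Orama's API.
// fullMatches: documents containing every keyword, sorted by relevance.
// partialMatches: documents containing only some keywords, sorted by relevance.
function applyThreshold(fullMatches, partialMatches, threshold = 1) {
  if (typeof threshold !== 'number' || threshold < 0 || threshold > 1) {
    throw new RangeError('threshold must be a number between 0 and 1')
  }
  // Assumption: round up when the fraction is not a whole number of documents.
  const partialCount = Math.ceil(partialMatches.length * threshold)
  return [...fullMatches, ...partialMatches.slice(0, partialCount)]
}

// The 'slim fit' query against the four example documents.
const fullMatches = ['Blue t-shirt, slim fit', 'Red t-shirt, slim fit']
const partialMatches = ['Blue t-shirt, regular fit', 'Red t-shirt, oversize fit']

const strict = applyThreshold(fullMatches, partialMatches, 0)
const mixed = applyThreshold(fullMatches, partialMatches, 0.6)
const everything = applyThreshold(fullMatches, partialMatches, 1)

console.log(strict.length, mixed.length, everything.length)
```

With only two partial matches, 60% of them is 1.2 documents, which this sketch rounds up to 2. On such a small dataset, a threshold of 0.6 therefore behaves like a threshold of 1. The difference becomes visible on larger indexes, where the partial matches far outnumber the full ones.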