Automatic embeddings generation

With Orama Cloud, you can automatically generate embeddings from your data at deployment.

This guide will walk you through the process of creating embeddings using the Orama Native Embeddings or Open AI model. We plan to support additional models in the near future.

What are text embeddings?

Text embeddings are numerical representations of text that allow computers to understand the meaning and relationships between words, enabling applications like semantic search, machine translation, and sentiment analysis.

In recent months, embeddings have gained popularity as they form the foundation of semantic search, which is crucial in developing generative AI experiences like ChatGPT, Google Bard, and others.

Orama is a hybrid database capable of storing various types of data. It specializes in search capabilities, and its support for vector search enables semantic search among large sets of embeddings, which are presented as vectors.

For a deeper understanding of text embeddings, the Tensorflow website offers a fantastic explanation. It can be accessed here.

Automatic embeddings generation with Orama Cloud

Creating text embeddings from a given text or set of texts can be complex. However, Orama Cloud simplifies this process by enabling automatic generation of these embeddings each time you deploy a new index. This makes it remarkably easy to conduct semantic searches through your data at a remarkable speed.

Using the Orama Native Embeddings

To perform vector and hybrid searches on your data using the Orama Native Embeddings, no configuration is necessary.

When creating or editing an index, simply select an Orama embedding model. This will automatically generate embeddings from your data:

Automatic embeddings generation via Orama Native Embeddings

Indexes that have automatic embeddings generation enabled will display the icon and the name of the used model in the index list:

Automatic embeddings generation via Orama Native Embeddings, list view

Using OpenAI

If you prefer to use OpenAI's models, the following section will guide you through the necessary configuration.

Connecting to OpenAI

INFO

To use this feature, you will need an OpenAI account and an OpenAI API Key. We will support more providers soon.

Before you can start generating embeddings, you need to connect to OpenAI. This requires adding an OpenAI API Key to Orama Cloud.

We will encrypt this API Key and store it securely. For safety reasons, we recommend creating a new API Key specifically for Orama Cloud from the OpenAI dashboard.

As soon as you have your OpenAI API Key ready, you can add it to Orama Cloud by going to https://cloud.oramasearch.com/developer-tools, and selecting "OpenAI API key" from the left menu.

After adding your API key, you won't be able to view it again for security reasons. While you can delete it, this is not recommended as all operations dependent on vector search will cease to function. Alternatively, you may choose to replace it with a new key.

Creating the embeddings

You can now create a new index by going to https://cloud.oramasearch.com/indexes/new. For this guide, we will use a simple JSON file as a data source.

You can download the same JSON file here: download dataset.

Once you have your dataset ready, you can create a new Index:

After uploading the file, you can enable the Automatic Embeddings Generation feature. This feature will scan the "string" properties in your schema and automatically select them to generate embeddings. You can always select or deselect different properties as needed.

Once you have deployed your index containing embeddings, you can locate it in the indexes list where the "Embeddings" column is marked with the icon of the chosen model, such as OpenAI.

Congratulations! You've just deployed your first automatically-generated embeddings on Orama Cloud!

Querying the embeddings

Now that you have your embeddings distributed on Orama Cloud, you can use the official JavaScript client to perform vector search on them.

Read more about performing vector search on Orama Cloud here.

Pricing

Orama includes embedding generation in its "AI interactions" count. The Free Unlimited plan offers a limit of 1,000,000 (one million) AI interactions per month. The Pro Unlimited plan includes 1,000,000 AI interactions per month (with no extra charge for exceeding this quota in free months). The Enterprise plan provides unlimited AI interactions.

For more precise pricing, please refer to the official pricing page.

Automatic embeddings generation ​

What are text embeddings? ​

Automatic embeddings generation with Orama Cloud ​

Using the Orama Native Embeddings ​

Using OpenAI ​

Connecting to OpenAI ​

Creating the embeddings ​

Querying the embeddings ​

Pricing ​