Insert Data

Whenever we create a database with Orama, we must specify a schema, which describes the shape of the documents we are going to index.

Let's say our database and schema look like this:

javascript
import { create, insert } from '@orama/orama'

const movieDB = await create({
  schema: {
    title: 'string',
    director: 'string',
    plot: 'string',
    year: 'number',
    isFavorite: 'boolean',
  },
})

(Read more about database creation on the create page)

Insert

Data insertion in Orama is quick and intuitive:

javascript
const thePrestigeId = await insert(movieDB, {
  title: 'The Prestige',
  director: 'Christopher Nolan',
  plot: 'Two friends and fellow magicians become bitter enemies after a sudden tragedy. As they devote themselves to this rivalry, they make sacrifices that bring them fame but with terrible consequences.',
  year: 2006,
  isFavorite: true,
});

const bigFishId = await insert(movieDB, {
  title: 'Big Fish',
  director: 'Tim Burton',
  plot: 'Will Bloom returns home to care for his dying father, who had a penchant for telling unbelievable stories. After he passes away, Will tries to find out if his tales were really true.',
  year: 2004,
  isFavorite: true,
});

const harryPotterId = await insert(movieDB, {
  title: 'Harry Potter and the Philosopher\'s Stone',
  director: 'Chris Columbus',
  plot: 'Harry Potter, an eleven-year-old orphan, discovers that he is a wizard and is invited to study at Hogwarts. Even as he escapes a dreary life and enters a world of magic, he finds trouble awaiting him.',
  year: 2001,
  isFavorite: false,
});

If you have a lot of records, we suggest using the insertMultiple function as follows:

javascript
import { insertMultiple } from '@orama/orama'

const docs = [
  {
    title: 'The Prestige',
    director: 'Christopher Nolan',
    plot: 'Two friends and fellow magicians become bitter enemies after a sudden tragedy. As they devote themselves to this rivalry, they make sacrifices that bring them fame but with terrible consequences.',
    year: 2006,
    isFavorite: true,
  },
  {
    title: 'Big Fish',
    director: 'Tim Burton',
    plot: 'Will Bloom returns home to care for his dying father, who had a penchant for telling unbelievable stories. After he passes away, Will tries to find out if his tales were really true.',
    year: 2004,
    isFavorite: true,
  },
  {
    title: 'Harry Potter and the Philosopher\'s Stone',
    director: 'Chris Columbus',
    plot: 'Harry Potter, an eleven-year-old orphan, discovers that he is a wizard and is invited to study at Hogwarts. Even as he escapes a dreary life and enters a world of magic, he finds trouble awaiting him.',
    year: 2001,
    isFavorite: false,
  },
];

await insertMultiple(movieDB, docs);

Inserting a large number of documents one by one in a loop can block the event loop. insertMultiple handles this case better because it inserts the documents in batches.

You can pass an optional third parameter to change the batch size (default: 1000). The batchSize is the maximum number of insert operations to perform before yielding to the event loop. We recommend keeping this number as low as possible to avoid blocking the event loop.

javascript
await insertMultiple(movieDB, docs, 500);
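
To make the role of batchSize more concrete, here is a minimal sketch of batched insertion with a pause between batches. It is an illustration under assumptions, not Orama's actual implementation: insertInBatches, yieldToEventLoop, and the insertOne callback are hypothetical names that do not exist in the library.

```javascript
// Hypothetical helper (not part of Orama): resolves on a later turn of the
// event loop, giving timers and I/O callbacks a chance to run.
function yieldToEventLoop() {
  return new Promise(resolve => setTimeout(resolve, 0))
}

// Illustrative only: inserts `docs` in chunks of `batchSize`, pausing between
// chunks. `insertOne` is any async function that inserts a single document.
// Returns how many times the loop yielded.
async function insertInBatches(docs, batchSize, insertOne) {
  let yields = 0

  for (let start = 0; start < docs.length; start += batchSize) {
    for (const doc of docs.slice(start, start + batchSize)) {
      await insertOne(doc)
    }

    // Yield only if another batch is still pending.
    if (start + batchSize < docs.length) {
      await yieldToEventLoop()
      yields++
    }
  }

  return yields
}
```

With 2,500 documents and a batchSize of 1000, this sketch would process three batches and yield twice. A smaller batchSize yields more often, which keeps each blocking stretch shorter.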

Validation rules

Because the schema is defined at database creation time, Orama validates every inserted document according to these rules:

  • throw an error if a field has an unexpected type
  • allow missing fields or fields set to undefined
  • allow extra fields, which are ignored (not indexed)

So the following document will be accepted:

javascript
import { create, insert } from '@orama/orama'

const movieDB = await create({
  schema: {
    title: 'string',
    year: 'number',
  },
})

await insert(movieDB, {
  title: 'The Prestige',
  // `year` field is missing but it's ok
  // year: 2006,
  // Extra fields `director` and `isFavorite` will not be indexed
  director: 'Christopher Nolan',
  isFavorite: true,
});
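
The three rules above can be sketched as a small standalone function. This is not Orama's validator: validateDocument is a hypothetical helper that only covers flat schemas with the 'string', 'number', and 'boolean' types used on this page.

```javascript
// Hypothetical helper (not Orama's validator): checks a flat document against
// a flat schema whose values are 'string', 'number', or 'boolean'.
function validateDocument(schema, doc) {
  for (const [field, expectedType] of Object.entries(schema)) {
    const value = doc[field]

    // Missing fields and fields set to undefined are allowed.
    if (value === undefined) continue

    // A field with an unexpected type is an error.
    if (typeof value !== expectedType) {
      throw new TypeError(
        `Field "${field}" should be a ${expectedType}, got ${typeof value}`,
      )
    }
  }

  // Fields that are not in the schema are never inspected, so extra fields
  // are accepted.
  return true
}

const schema = { title: 'string', year: 'number' }

// Accepted: `year` is missing and `director` is an extra field.
validateDocument(schema, { title: 'The Prestige', director: 'Christopher Nolan' })

// Rejected: `year` is a string instead of a number.
try {
  validateDocument(schema, { title: 'The Prestige', year: '2006' })
} catch (error) {
  console.log(error.message)
}
```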

Custom document IDs

Orama automatically uses the id field of the document, if found.

That means that given the following document and schema:

javascript
import { create, insert } from '@orama/orama'

const db = await create({
  schema: {
    id: 'string',
    author: 'string',
    quote: 'string',
  },
})

await insert(db, {
  id: '73cbcc79-2203-49b8-bb52-60d8e9a66c5f',
  author: 'Fernando Pessoa',
  quote: "I wasn't meant for reality, but life came and found me",
})

the document will be indexed with the following id: 73cbcc79-2203-49b8-bb52-60d8e9a66c5f.

TIP

If the id field is not found, Orama will generate a random id for the document.

To provide a custom ID for a document, see the components page.

DANGER

If you try to insert two documents with the same ID, Orama will throw an error.
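
The ID rules described in this section can be summarized with a small standalone sketch. TinyStore is a hypothetical class written for illustration. It does not reflect Orama's internals, and its generateId method is only a stand-in for Orama's own random ID generation.

```javascript
// Hypothetical sketch (not Orama's internals): a tiny store that mirrors the
// ID rules described above.
class TinyStore {
  constructor() {
    this.docs = new Map()
    this.counter = 0
  }

  // Stand-in for random ID generation; Orama's real generator differs.
  generateId() {
    this.counter++
    return `generated-${this.counter}-${Math.random().toString(36).slice(2, 10)}`
  }

  insert(doc) {
    // Use the document's own `id` when present, otherwise generate one.
    const id = typeof doc.id === 'string' ? doc.id : this.generateId()

    // Inserting a second document with the same ID is an error.
    if (this.docs.has(id)) {
      throw new Error(`Document with id "${id}" already exists`)
    }

    this.docs.set(id, doc)
    return id
  }
}

const store = new TinyStore()
const pessoaId = store.insert({
  id: '73cbcc79-2203-49b8-bb52-60d8e9a66c5f',
  author: 'Fernando Pessoa',
})
const generatedId = store.insert({ author: 'Anonymous' })

// The second insert with the same id fails.
try {
  store.insert({ id: pessoaId, author: 'Someone else' })
} catch (error) {
  console.log(error.message)
}
```

In application code, the same idea means wrapping an insert in try/catch whenever the documents carry IDs that may already be present in the database.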

Remote document storing

By default, Orama keeps a copy of the inserted document in memory (and in the serialized data) to speed up search performance.

If this is not acceptable, you can provide a custom documentsStore component, which will be responsible for storing and fetching documents from another location (local or remote).

The example below implements a proxy: when a document is requested, the code reads it from a location on the filesystem. We assume each document has an id field, which disables Orama's random ID generation.

You can replace the file related operations with your custom code.

javascript
import { readFile, readdir, writeFile, rm } from 'node:fs/promises'
import { resolve } from 'node:path'
import { create } from '@orama/orama'

const ROOT_LOCATION = '/var/db/orama-example'

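// Read and parse the JSON file that holds the document with the given id
async function getDocument(id) {
  return JSON.parse(await readFile(resolve(ROOT_LOCATION, `${id}.json`), 'utf-8'))
}

// List the ids of all stored documents (file names without the .json extension)
async function listDocuments() {
  const allFiles = await readdir(ROOT_LOCATION)

  return allFiles
    .filter(name => name.endsWith('.json'))
    .map(name => name.slice(0, -'.json'.length))
}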

const database = await create({
  schema: {
    title: 'string',
    director: 'string',
  },
  components: {
    // override partially the default documents store
    documentsStore: {
      create() {
        return {}
      },
      load(raw) {
        return {}
      },
      save(store) {
        return {}
      },
      get(_, id) {
        return getDocument(id)
      },
      getMultiple(_, ids) {
        // getDocument already returns the parsed document
        return Promise.all(ids.map(id => getDocument(id)))
      },
      async getAll() {
        const docs = await listDocuments()

        return Promise.all(docs.map(id => getDocument(id)))
      },
      store() {
        // No-op
      },
      remove() {
        // No-op
      },
      async count() {
        const docs = await listDocuments()

        return docs.length
      },
    },
  },
})
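
As one possible replacement for the file-related operations, the sketch below backs the same two helpers with an in-memory Map that stands in for a remote backend. The remoteDocuments map and its contents are hypothetical; in a real setup these functions would call your own storage or HTTP client instead.

```javascript
// Hypothetical stand-in for a remote backend: a Map keyed by document id.
const remoteDocuments = new Map([
  ['1', { id: '1', title: 'Big Fish', director: 'Tim Burton' }],
  ['2', { id: '2', title: 'The Prestige', director: 'Christopher Nolan' }],
])

// Resolves to the document with the given id, or rejects if it is unknown.
async function getDocument(id) {
  if (!remoteDocuments.has(id)) {
    throw new Error(`Document "${id}" not found`)
  }

  return remoteDocuments.get(id)
}

// Resolves to the list of known document ids.
async function listDocuments() {
  return [...remoteDocuments.keys()]
}
```

Because the documentsStore methods only call getDocument and listDocuments, swapping these two helpers is enough to point the store at a different location.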