Document-based Databases

A document-based database is a type of NoSQL database that does not use the relational model; instead, it stores data as documents.

Figure: relational database model compared with a document-based database

A document database stores data in JSON, BSON, or XML documents.

Documents are stored and retrieved in a form that closely resembles a dictionary of keys and values.

Fields can be indexed, so that documents are located through the index value instead of by scanning every document, which makes queries faster.
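The dictionary-like shape of a document and the role of an index can be sketched in plain Python. This is only an illustration, not how any particular document database is implemented: the collection, the field names, and the helper build_index are all made up for this example.

```python
# A "collection" of documents; each document is a plain dictionary.
# All field names and values are hypothetical.
collection = [
    {"_id": 1, "name": "Asha", "city": "Pune"},
    {"_id": 2, "name": "Ben", "city": "Oslo", "tags": ["admin"]},  # extra field is allowed
    {"_id": 3, "name": "Chen", "city": "Pune"},
]


def build_index(docs, field):
    """Map each value of `field` to the list of documents holding that value."""
    index = {}
    for doc in docs:
        if field in doc:
            index.setdefault(doc[field], []).append(doc)
    return index


city_index = build_index(collection, "city")

# Access a value by key inside one document.
first_name = collection[0]["name"]

# Use the index to find matching documents without scanning the collection.
pune_ids = [doc["_id"] for doc in city_index.get("Pune", [])]

print(first_name, pune_ids)
```

Note that the second document carries a tags field the others lack, which is the kind of variation a flexible schema permits.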

Differences between the relational database model and the document-based database:

Relational database model | Document database
1. Data is stored in tables made up of rows and columns. | 1. Data is stored in documents.
2. Queried with SQL. | 2. Typically queried through a database-specific query language or API rather than standard SQL.
3. Relational databases have a structured schema that defines the tables, columns, data types, constraints, and relationships between tables. | 3. Document databases have a flexible schema, allowing documents within a collection to have varying structures.
4. Relationships between tables are expressed with foreign keys. | 4. Relationships between documents are not enforced by the database, so documents can be independent of one another.

Key-Value Databases

Features of key-value databases

Simplicity

Scalability

Speed

Graph Database

Use cases of graph databases:

  1. Generative AI
  2. Telecommunications
  3. Healthcare and Life Science

Vector Databases

Example: In image processing, an image can be represented as a high-dimensional vector of pixel values, with each dimension corresponding to one pixel in the image.
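As a rough illustration of that idea, the sketch below flattens a tiny grayscale image into a single vector. The pixel values and the helper name flatten are invented for this example; real images are far larger, so the resulting vectors have many more dimensions.

```python
# A tiny 2x3 grayscale "image": each number is a pixel intensity (0-255).
# The values are made up for illustration.
image = [
    [0, 128, 255],
    [64, 32, 16],
]


def flatten(img):
    """Turn a 2-D grid of pixels into one vector, reading row by row."""
    return [pixel for row in img for pixel in row]


vector = flatten(image)

# One dimension per pixel: 2 rows x 3 columns gives a 6-dimensional vector.
print(len(vector), vector)
```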

Figure: how pixels are stored in a vector

How do vector databases work?

Vector databases are used for:

  1. Similarity search and semantic search
  2. Machine learning and deep learning
  3. Large language models (LLMs) and generative AI like ChatGPT
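Similarity search, the first use above, can be sketched as a brute-force comparison using cosine similarity. This is a simplified illustration with invented three-dimensional vectors and hypothetical helper names (cosine_similarity, nearest); production vector databases generally rely on specialized indexes rather than comparing the query against every stored vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, vectors, k=1):
    """Return the names of the k stored vectors most similar to the query."""
    ranked = sorted(
        vectors,
        key=lambda name: cosine_similarity(query, vectors[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical stored vectors; the labels and numbers are illustrative only.
stored = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.3, 0.1],
    "car": [0.0, 0.2, 0.9],
}

result = nearest([1.0, 0.2, 0.0], stored, k=2)
print(result)
```

With these made-up values, the two vectors pointing in roughly the same direction as the query rank ahead of the unrelated one.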

Time-series Databases

A time series database (TSDB) is a specialized database designed to efficiently store, retrieve, and analyze time-stamped (time series) data.

Time series data is simply measurements or events that are tracked, monitored, downsampled, and aggregated over time.

Each data point in a time series database is tagged with a timestamp indicating when the data was recorded or observed.
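Timestamped data points and the downsampling mentioned above can be illustrated with a short sketch. The readings, the one-minute bucket size, and the helper downsample_per_minute are assumptions made for this example and do not reflect any specific time series database.

```python
from collections import defaultdict
from datetime import datetime

# Each data point is (timestamp, value); the readings are made up.
readings = [
    (datetime(2024, 1, 1, 12, 0, 5), 20.0),
    (datetime(2024, 1, 1, 12, 0, 35), 22.0),
    (datetime(2024, 1, 1, 12, 1, 10), 25.0),
    (datetime(2024, 1, 1, 12, 1, 50), 27.0),
]


def downsample_per_minute(points):
    """Group points into one-minute buckets and average each bucket."""
    buckets = defaultdict(list)
    for timestamp, value in points:
        minute = timestamp.replace(second=0, microsecond=0)
        buckets[minute].append(value)
    return {
        minute: sum(values) / len(values)
        for minute, values in sorted(buckets.items())
    }


per_minute = downsample_per_minute(readings)
for minute, average in per_minute.items():
    print(minute.isoformat(), average)
```

Four raw points become two aggregated points, one per minute, which is the basic trade of resolution for storage and query speed that downsampling makes.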

Time series databases were originally used mainly for financial data, such as tracking the volatility of stock trading, and in systems built to support trading.

When is a time series database used?

Column-oriented Databases

  1. Faster Access: Because data is stored column by column, the database only needs to read the columns a query actually uses. It does not spend time scanning entire rows, which speeds up the query.

  2. Aggregation Optimization: These databases are good at operations that add up or summarize data across a column, such as averages or totals. Since each column is stored separately, these operations touch only the relevant data, which makes them faster.

  3. Parallel Processing: Columnar databases can spread work across multiple processors or machines, which helps them handle large amounts of data quickly.
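The difference between row-wise and column-wise layout can be sketched with plain Python structures. This is a conceptual illustration only: the records and the helper to_columns are invented, and real columnar engines add compression and on-disk formats that are not shown here.

```python
# Row-oriented layout: one record per row (values are made up).
rows = [
    {"id": 1, "region": "north", "amount": 120.0},
    {"id": 2, "region": "south", "amount": 80.0},
    {"id": 3, "region": "north", "amount": 100.0},
]


def to_columns(records):
    """Re-arrange row records into one list per column."""
    columns = {}
    for record in records:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)
    return columns


columns = to_columns(rows)

# An aggregation reads only the "amount" column; "id" and "region"
# are never touched, unlike a scan over full rows.
total = sum(columns["amount"])
average = total / len(columns["amount"])

print(total, average)
```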

Benefits of columnar databases by situation:

  Data Filtering
    - Faster access: Only necessary columns are accessed.
    - Efficient processing: Queries focus on relevant data.
    - Compression: Data is stored efficiently, saving space.
  Data Aggregation
    - Group operations: Ideal for summarizing data.
    - Speed: Aggregations are faster due to column storage.
    - Parallel processing: Can handle large datasets easily.
  Overall
    - Improved performance for analytics and data insights.
    - Scalability: Suitable for handling large amounts of data.

When to use a columnar database?

Use columnar databases when… | Use row-oriented databases when…
Your primary use case is analytics | Your primary use case is transactions
You need low query latency | You don’t need low query latency (for analytics)
You have large amounts of data | You’re working with smaller data (for analytics)
You don’t need strict ACID compliance | You need strict ACID compliance
You’re using event sourcing principles | You need to do frequent, small replaces and deletes
You need to store and analyze lots of time series data | You need to store and access records with unique IDs