NoSQL Tutorial: Types of NoSQL Databases & Example

Smart Summary

NoSQL is a non-relational database management system that does not require a fixed schema, avoids joins, and scales easily. This resource explains what NoSQL is, why it exists, its history, features, the four database types, the CAP theorem, eventual consistency, and its advantages and disadvantages.

  • Definition: A non-relational, schema-free store built for huge, distributed datasets.
  • Motivation: Scaling out across many hosts handles big data faster than scaling up.
  • Four Types: Key-value, column-oriented, graph-based, and document-oriented.
  • CAP Theorem: A distributed store can guarantee only two of consistency, availability, and partition tolerance.
  • BASE: Basically Available, Soft state, Eventual consistency across replicas.


What is NoSQL?

A NoSQL database is a non-relational data management system that does not require a fixed schema. It avoids joins and is easy to scale. The major purpose of a NoSQL database is to serve as a distributed data store for very large storage needs. NoSQL is used for big data and real-time web apps. For example, companies like Twitter, Facebook, and Google collect terabytes of user data every day.

NoSQL stands for "Not Only SQL" or "Not SQL". Though a better term would be "NoREL", NoSQL caught on. Carlo Strozzi introduced the NoSQL concept in 1998.

A traditional RDBMS uses SQL syntax to store and retrieve data for further insights. A NoSQL database system, by contrast, encompasses a wide range of database technologies that can store structured, semi-structured, unstructured, and polymorphic data. Let us understand NoSQL with a diagram:

NoSQL database

Why NoSQL?

The concept of NoSQL databases became popular with Internet giants like Google, Facebook, and Amazon, which deal with huge volumes of data. System response time becomes slow when you use an RDBMS for massive volumes of data.

To resolve this problem, we could "scale up" our systems by upgrading our existing hardware. This process is expensive.

The alternative is to distribute the database load across multiple hosts whenever the load increases. This method is known as "scaling out".
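As a rough illustration of scaling out, the sketch below spreads keys over several hosts by hashing each key. The host names and the `host_for_key` and `distribute` helpers are hypothetical and exist only for this example. Production systems often use more elaborate schemes, such as consistent hashing, so that adding a host moves fewer keys.

```python
import hashlib

# Hypothetical host names, used only for illustration.
HOSTS = ["db-host-0", "db-host-1", "db-host-2"]


def host_for_key(key: str, hosts: list[str]) -> str:
    """Pick a host by hashing the key, so load spreads across all hosts."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(hosts)
    return hosts[index]


def distribute(keys, hosts):
    """Group keys by the host that would store them."""
    placement = {host: [] for host in hosts}
    for key in keys:
        placement[host_for_key(key, hosts)].append(key)
    return placement


placement = distribute([f"user:{i}" for i in range(1000)], HOSTS)
for host, keys in placement.items():
    print(host, len(keys))  # each host holds only a share of the keys
```

Adding a host to `HOSTS` gives every host a smaller share of the data, which is the idea behind scaling out instead of upgrading a single machine.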


NoSQL databases are non-relational and designed with web applications in mind, so they scale out better than relational databases.

Brief History of NoSQL Databases

  • 1998 - Carlo Strozzi uses the term NoSQL for his lightweight, open-source relational database.
  • 2000 - Graph database Neo4j is launched.
  • 2004 - Google BigTable is launched.
  • 2005 - CouchDB is launched.
  • 2007 - The research paper on Amazon Dynamo is released.
  • 2008 - Facebook open sources the Cassandra project.
  • 2009 - The term NoSQL is reintroduced.

Features of NoSQL

Non-relational

  • NoSQL databases never follow the relational model.
  • Never provide tables with flat fixed-column records.
  • Work with self-contained aggregates or BLOBs.
  • Do not require object-relational mapping and data normalization.
  • No complex features like query languages, query planners, referential integrity, joins, or ACID.

Schema-free

  • NoSQL databases are either schema-free or have relaxed schemas.
  • Do not require any sort of definition of the schema of the data.
  • Offer heterogeneous structures of data in the same domain.

Simple API

  • Offers easy-to-use interfaces for storing and querying data.
  • APIs allow low-level data manipulation and selection methods.
  • Mostly uses text-based protocols, typically HTTP REST with JSON.
  • Mostly no standards-based NoSQL query language is used.
  • Web-enabled databases running as internet-facing services.

Distributed

  • Multiple NoSQL databases can be executed in a distributed fashion.
  • Offers auto-scaling and fail-over capabilities.
  • Often the ACID concept can be sacrificed for scalability and throughput.
  • Mostly no synchronous replication between distributed nodes; instead, asynchronous multi-master replication, peer-to-peer replication, or HDFS replication.
  • Often provides only eventual consistency.
  • Shared-nothing architecture. This enables less coordination and higher distribution.

Types of NoSQL Databases

NoSQL databases are mainly categorized into four types: key-value pair, column-oriented, graph-based, and document-oriented. Every category has its unique attributes and limitations. No single type is best at solving every problem, so users should select a database based on their product needs.

Types of NoSQL databases:

  • Key-value pair based
  • Column-oriented
  • Graph-based
  • Document-oriented


Key-Value Pair Based

Data is stored in key/value pairs. This type is designed to handle large volumes of data and heavy load. Key-value stores keep data as a hash table where each key is unique, and the value can be JSON, a BLOB (Binary Large Object), a string, etc.

For example, a key-value pair may contain a key like "Website" associated with a value like "Guru99".


It is one of the most basic types of NoSQL database. This kind of NoSQL database is used for collections, dictionaries, associative arrays, etc. Key-value stores help the developer store schema-less data. They work best for shopping cart contents.

Redis, Dynamo, and Riak are some examples of key-value store databases. Dynamo and Riak follow the design described in Amazon's Dynamo paper.
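The key-value model can be sketched with an in-memory hash table. The `KeyValueStore` class below is a hypothetical toy built on a Python dict, not the API of Redis, Dynamo, or Riak. It only shows that keys are unique and that values are opaque to the store.

```python
import json


class KeyValueStore:
    """Toy in-memory key-value store backed by a hash table (a dict)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Keys are unique: writing to an existing key overwrites its value.
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        self._data.pop(key, None)


store = KeyValueStore()
store.put("Website", "Guru99")                                # string value
store.put("cart:42", json.dumps({"items": ["book", "pen"]}))  # JSON value
store.put("avatar:42", b"\x89PNG")                            # binary (BLOB) value

print(store.get("Website"))
```

The store never inspects the values, so any lookup other than by key has to be done by the application.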

Column-based

Column-oriented databases work on columns and are based on the BigTable paper by Google. Every column is treated separately. The values of a single column are stored contiguously.


They deliver high performance on aggregation queries like SUM, COUNT, AVG, MIN, etc., as the data is readily available in a column. Column-based NoSQL databases are widely used to manage data warehouses, business intelligence, CRM, and library card catalogs.

HBase, Cassandra, and Hypertable are examples of column-based NoSQL databases.
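To see why aggregations benefit from a column layout, the sketch below converts a few row records into one list per column and aggregates over a single list. The sample data and the `to_columns` helper are made up for illustration, and the layout is far simpler than what HBase or Cassandra actually store on disk.

```python
# Row-oriented view: one record per row (sample data, for illustration only).
rows = [
    {"id": 1, "region": "EU", "amount": 120.0},
    {"id": 2, "region": "US", "amount": 80.0},
    {"id": 3, "region": "EU", "amount": 200.0},
]


def to_columns(rows):
    """Store the values of each column contiguously in its own list.

    Assumes every row has the same fields.
    """
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# An aggregate touches only the one column it needs.
amounts = columns["amount"]
total = sum(amounts)
count = len(amounts)
average = total / count
smallest = min(amounts)

print(total, count, average, smallest)
```

With the row layout, the same SUM would have to read every field of every record just to reach the `amount` values.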

Document-Oriented

Document-Oriented NoSQL DB stores and retrieves data as a key-value pair, but the value part is stored as a document. The document is stored in JSON or XML formats. The value is understood by the DB and can be queried.

Relational vs. Document

In the diagram, the left side shows a relational table with rows and columns, and the right side shows a document database whose structure resembles JSON. With a relational database, you have to know in advance which columns you have. A document database instead stores each record as a JSON-like object whose structure you do not need to define beforehand, which makes it flexible.

The document type is mostly used for CMS systems, blogging platforms, real-time analytics, and e-commerce applications. It should not be used for complex transactions which require multiple operations or queries against varying aggregate structures.

Amazon SimpleDB, CouchDB, MongoDB, Riak, and Lotus Notes are popular document-oriented DBMS systems.
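A minimal sketch of the document model follows, assuming documents are plain Python dicts keyed by ID. The `documents` data and the `find` helper are hypothetical and do not reflect the query API of MongoDB or CouchDB. The point is that documents in the same store may have different fields, and the store can query inside the value.

```python
# Documents with different shapes live side by side; no schema is declared.
documents = {
    "post:1": {"type": "post", "title": "Intro to NoSQL", "tags": ["nosql", "db"]},
    "post:2": {"type": "post", "title": "CAP explained", "author": {"name": "Ana"}},
    "user:1": {"type": "user", "name": "Ana", "email": "ana@example.com"},
}


def find(docs, **criteria):
    """Return the IDs of documents whose top-level fields match all criteria."""
    return [
        doc_id
        for doc_id, doc in docs.items()
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


post_ids = find(documents, type="post")
ana_ids = find(documents, name="Ana")

print(post_ids)
print(ana_ids)
```

Unlike a pure key-value store, the lookup here inspects fields inside the value. Only top-level fields are matched in this sketch, so the nested author name in `post:2` is not found by `name="Ana"`.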

Graph-Based

A graph type database stores entities as well as the relations amongst those entities. The entity is stored as a node with the relationship as edges. An edge gives a relationship between nodes. Every node and edge has a unique identifier.


Compared to a relational database where tables are loosely connected, a graph database is multi-relational in nature. Traversing relationships is fast, as they are already captured in the DB, and there is no need to calculate them. Graph-based databases are mostly used for social networks, logistics, and spatial data.

Neo4j, InfiniteGraph, OrientDB, and FlockDB are some popular graph-based databases.
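The node-and-edge model can be sketched with an adjacency map and a breadth-first traversal. The names, the `FRIEND` label, and the helper functions below are hypothetical. Real graph databases add identifiers, properties, and a query language on top of this idea.

```python
from collections import deque

# Each edge is (source node, relationship label, target node); sample data only.
edges = [
    ("alice", "FRIEND", "bob"),
    ("bob", "FRIEND", "carol"),
    ("carol", "FRIEND", "dave"),
    ("alice", "FRIEND", "erin"),
]


def build_adjacency(edges):
    """Store relationships directly, so traversal needs no join-like lookup."""
    adjacency = {}
    for source, _label, target in edges:
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set()).add(source)  # treat edges as undirected
    return adjacency


def within_hops(adjacency, start, max_hops):
    """Breadth-first search: map each reachable node to its hop distance."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    del distances[start]
    return distances


adjacency = build_adjacency(edges)
nearby = within_hops(adjacency, "alice", 2)
print(nearby)
```

A "friends of friends" query is a two-hop traversal here, which a relational schema would typically express as a self-join on a friendship table.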

Query Mechanism Tools for NoSQL

The most common data retrieval mechanism is the REST-based retrieval of a value based on its key/ID with a GET resource.

Document store databases offer more complex queries, as they understand the value in a key-value pair. For example, CouchDB allows defining views with MapReduce.
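The MapReduce idea behind such views can be sketched in plain Python. This is not CouchDB's API (CouchDB views are typically written as JavaScript functions). The documents and the `map_order`, `reduce_sum`, and `run_view` names are assumptions made for the example.

```python
from collections import defaultdict

# Sample order documents, for illustration only.
docs = [
    {"_id": "o1", "customer": "ana", "total": 30},
    {"_id": "o2", "customer": "ben", "total": 20},
    {"_id": "o3", "customer": "ana", "total": 50},
]


def map_order(doc):
    """Map step: emit (key, value) pairs for one document."""
    if "customer" in doc and "total" in doc:
        yield doc["customer"], doc["total"]


def reduce_sum(values):
    """Reduce step: combine all values emitted under the same key."""
    return sum(values)


def run_view(docs, map_fn, reduce_fn):
    """Apply the map step to every document, group by key, then reduce."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in grouped.items()}


totals = run_view(docs, map_order, reduce_sum)
print(totals)
```

Because the map step checks for the fields it needs, documents with a different shape are skipped rather than causing an error, which suits schema-free data.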

What is the CAP Theorem?

The CAP theorem is also called Brewer's theorem. It states that it is impossible for a distributed data store to offer more than two out of three guarantees:

  1. Consistency
  2. Availability
  3. Partition Tolerance

Consistency: The data should remain consistent even after the execution of an operation. This means that once data is written, any future read request should contain that data. For example, after updating the order status, all clients should be able to see the same data.

Availability: The database should always be available and responsive. It should not have any downtime.

Partition Tolerance: Partition tolerance means that the system should continue to function even if communication among the servers is not stable. For example, the servers can be partitioned into multiple groups that may not communicate with each other. Here, if part of the database is unavailable, the other parts remain unaffected.

Eventual Consistency

The term "eventual consistency" refers to keeping copies of data on multiple machines to get high availability and scalability. Thus, changes made to any data item on one machine have to be propagated to the other replicas.

Data replication may not be instantaneous, as some copies will be updated immediately while others in due course of time. These copies may be mutually inconsistent, but in due course of time, they become consistent. Hence, the name eventual consistency.
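The behaviour described above can be simulated with a toy cluster in which a write lands on one replica immediately and reaches the others only when queued updates are delivered. The `Replica` and `Cluster` classes are hypothetical and ignore conflicts, failures, and networking. They only show the window in which replicas disagree.

```python
from collections import deque


class Replica:
    """One copy of the data, held on one machine."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class Cluster:
    """Toy cluster with asynchronous replication from the first replica."""

    def __init__(self, names):
        self.replicas = [Replica(name) for name in names]
        self.pending = deque()  # (replica, key, value) updates not yet delivered

    def write(self, key, value):
        primary, *others = self.replicas
        primary.data[key] = value  # applied immediately on one replica
        for replica in others:
            self.pending.append((replica, key, value))  # delivered later

    def read(self, index, key):
        return self.replicas[index].data.get(key)

    def propagate(self):
        """Deliver all queued updates, bringing the replicas back in sync."""
        while self.pending:
            replica, key, value = self.pending.popleft()
            replica.data[key] = value


cluster = Cluster(["a", "b", "c"])
cluster.write("order:1", "shipped")

stale = cluster.read(1, "order:1")  # replica "b" has not seen the write yet
cluster.propagate()
fresh = cluster.read(1, "order:1")  # after propagation, all replicas agree

print(stale, fresh)
```

Between `write` and `propagate`, a client reading from the second replica sees old data, which is the temporary inconsistency the term describes.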

BASE: Basically Available, Soft state, Eventual consistency

  • Basically available means the DB is available all the time as per the CAP theorem.
  • Soft state means even without an input, the system state may change.
  • Eventual consistency means that the system will become consistent over time.


Advantages of NoSQL

  • Can be used as a primary or analytic data source.
  • Big data capability.
  • No single point of failure.
  • Easy replication.
  • No need for a separate caching layer.
  • Provides fast performance and horizontal scalability.
  • Can handle structured, semi-structured, and unstructured data with equal effect.
  • Supports object-oriented programming that is easy to use and flexible.
  • NoSQL databases do not need a dedicated high-performance server.
  • Support key developer languages and platforms.
  • Simpler to implement than using RDBMS.
  • Can serve as the primary data source for online applications.
  • Handles big data which manages data velocity, variety, volume, and complexity.
  • Excels at distributed database and multi-data center operations.
  • Eliminates the need for a specific caching layer to store data.
  • Offers a flexible schema design which can easily be altered without downtime or service disruption.

Disadvantages of NoSQL

  • No standardization rules.
  • Limited query capabilities.
  • RDBMS databases and tools are comparatively mature.
  • Does not offer traditional database capabilities, such as consistency when multiple transactions are performed simultaneously.
  • As the volume of data grows, maintaining unique keys becomes difficult.
  • Does not work as well with relational data.
  • The learning curve is steep for new developers.
  • Open source options are not so popular for enterprises.

FAQs

NoSQL databases handle the large, varied, and fast-changing data that AI and big data pipelines produce. Their flexible schema and horizontal scaling suit storing training data, logs, and real-time features across distributed clusters.

Yes. Several NoSQL databases, such as MongoDB and Elasticsearch, now support vector fields and similarity search. This lets AI applications store embeddings next to documents for semantic search and recommendation features.

SQL databases are relational with a fixed schema and use tables, rows, and joins. NoSQL databases are non-relational, use flexible schemas, and scale horizontally, storing data as documents, key-values, columns, or graphs.

Avoid NoSQL when you need strong ACID transactions, complex joins, or strict data integrity, such as in banking. Mature relational databases handle multi-record consistency and standardized queries better in those cases.
