Data Modernization Using MongoDB (Part 1)

Evolution in technology, social media, and digital business has resulted in an explosion of data that can be mined by organizations to improve their business. The digital universe is growing: more than half of the data generated today is unstructured data from social networks, mobile/smart devices, web applications, and the like.

Unfortunately, traditional RDBMSs were designed for structured data. Although these databases have been the foundation of data management technology for more than three decades, the way applications are built has changed to the point where relying solely on an RDBMS can restrict business agility, limit scalability, and strain budgets.

As a result, NoSQL databases, which are designed for web applications, metadata storage, and social media analytics, have gained traction over the last few years. Let’s take a look at how NoSQL databases like MongoDB compare with an RDBMS.

Meeting demands

Today’s digital businesses demand:

  • Fast, agile development
  • Flexible data architecture (often called schema-less DB design)
  • Scalability for big data
  • Real-time performance (consistent UX across all interfaces)
  • The ability to handle complex data types (text, media files, etc.)
  • The ability to handle new computing environments (e.g., cloud)

Because traditional RDBMSs were hard to scale, IT experts turned to RAC and clusters, but that did not solve the underlying problem: SQL operations and transactions that span multiple nodes do not scale well. In contrast, an open source, document-oriented database like MongoDB handles such requirements efficiently. MongoDB provides:

  • A document data model (data is stored in a structure that maps to objects in modern programming languages)
  • Rich queries (indexes, queries, aggregation framework, native MapReduce, text search indexes)
  • Drivers (applications interact with the database using native libraries)
  • Horizontal scalability (commodity hardware and cloud infrastructure are used to handle growth in data volume and throughput)
  • High availability (support for native replication, automatic failover to secondary nodes)
  • In-memory performance (frequently accessed data is read from and written to RAM)
  • Flexibility (flexible architecture supports changes to data models)
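To make the document data model and the flexibility point more concrete, here is a minimal sketch in plain Python. It does not talk to a MongoDB server; the order record, its field names, and the `order_total` helper are all hypothetical and exist only to show how one nested document can hold what an RDBMS would typically split across several tables.

```python
import json

# Hypothetical order record shaped like a document: the customer and the
# line items are nested inside the order instead of living in separate tables.
order_document = {
    "_id": 1001,
    "customer": {"name": "Ada Lopez", "email": "ada@example.com"},
    "items": [
        {"sku": "A-100", "qty": 2, "price": 9.50},
        {"sku": "B-200", "qty": 1, "price": 24.00},
    ],
}


def order_total(doc):
    """Sum quantity * price over the embedded line items."""
    return sum(item["qty"] * item["price"] for item in doc["items"])


# Flexible schema: a new field is added to this one document only,
# with no table-wide schema change.
order_document["gift_note"] = "Happy birthday"

# The document maps directly to JSON-like text and back to a native object.
serialized = json.dumps(order_document)
restored = json.loads(serialized)

print(order_total(restored))
```

The design choice illustrated here is embedding: data that is read together is stored together, so the application object and the stored record have the same shape.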

Performance and scalability

Database performance is one of the most important factors in successful application development. Unlike a traditional RDBMS, which demands a high level of performance-engineering knowledge, NoSQL databases like MongoDB make extensive use of RAM to optimize database performance, because reading data from memory is faster than reading it from disk. MongoDB's MMAPv1 storage engine uses memory-mapped files to optimize read operations. If the volume of frequently accessed data exceeds the capacity of a single machine, MongoDB scales horizontally across multiple servers through automatic sharding. This ability to add servers as data grows, rather than move to a larger machine, is the key to its scalability.
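The idea behind sharding can be sketched in a few lines of Python. This is a deliberately simplified illustration, not MongoDB's actual algorithm: MongoDB divides data into chunks by shard-key range and rebalances those chunks across shards, whereas the sketch below just hashes a key and takes a modulo. The shard names, the `pick_shard` and `distribute` helpers, and the sample documents are all assumptions made for the example.

```python
import hashlib

# Hypothetical shard names; a real cluster would be configured separately.
SHARDS = ["shard-a", "shard-b", "shard-c"]


def pick_shard(shard_key, shards=SHARDS):
    """Map a shard-key value to one shard by hashing it (toy routing rule)."""
    digest = hashlib.md5(str(shard_key).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


def distribute(documents, key_field, shards=SHARDS):
    """Group documents by the shard their key hashes to."""
    placement = {name: [] for name in shards}
    for doc in documents:
        placement[pick_shard(doc[key_field], shards)].append(doc)
    return placement


# Sample data: 1,000 hypothetical user documents keyed by user_id.
users = [{"user_id": i, "name": f"user-{i}"} for i in range(1000)]
placement = distribute(users, "user_id")

for shard, docs in placement.items():
    print(shard, len(docs))
```

Because the same key always hashes to the same shard, a read for one `user_id` only needs to visit one server, which is what lets throughput grow as servers are added.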

Feature comparison

| Feature/Technology | RDBMS | MongoDB |
|---|---|---|
| Schema | Has defined DB schema structure | Schema-less DB design |
| Atomicity | Guarantees atomic updates through ACID compliance | Provides document-level atomicity |
| Consistency | Allows table- and row-level locking and multi-version consistency | Early releases used a single global lock; since version 3.0, the WiredTiger storage engine offers document-level concurrency control |
| Indexing | Yes | Yes |
| Querying | Complies with ANSI standard | Language is specific to MongoDB |
| Maintainability | Provides mature administrative tools | Immature as compared to traditional RDBMS |
| Cost | Most popular RDBMSs are not open source and require licensing | Free (open source) |
| Aggregation framework | Yes | Yes |
| Data model | Follows normalization/de-normalization | Document store (JSON-like documents) |
| API (native drivers for most programming languages) | Yes | Yes |
| Partitioning and scaling | Implements distributed, multi-master database | Designed to scale horizontally using shards |
| Performance | Optimized for normalized structure | Optimized for read-intensive workloads |
| Security | Yes | Limited: authorization at database level |
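The "Querying" row is easier to picture with an example. In MongoDB, a query is itself a document: a filter such as `{"city": "Austin", "age": {"$gt": 30}}` plays the role of `WHERE city = 'Austin' AND age > 30` in SQL. The sketch below is a toy evaluator written in plain Python purely to show the shape of such filters. It covers only equality, `$gt`, and `$lt`; the `matches` function and the sample data are hypothetical, and real queries are evaluated by the database server through a driver.

```python
def matches(document, query):
    """Return True if `document` satisfies every condition in `query`.

    Supports only equality and the $gt / $lt operators (toy subset).
    """
    for field, condition in query.items():
        value = document.get(field)
        if isinstance(condition, dict):
            for operator, operand in condition.items():
                if operator == "$gt":
                    ok = value is not None and value > operand
                elif operator == "$lt":
                    ok = value is not None and value < operand
                else:
                    raise ValueError(f"operator not covered by this sketch: {operator}")
                if not ok:
                    return False
        elif value != condition:
            return False
    return True


# Hypothetical sample documents.
people = [
    {"name": "Ana", "age": 34, "city": "Austin"},
    {"name": "Ben", "age": 28, "city": "Austin"},
    {"name": "Cy", "age": 41, "city": "Boston"},
]

# Rough SQL equivalent: SELECT name FROM people WHERE city = 'Austin' AND age > 30
query = {"city": "Austin", "age": {"$gt": 30}}
result = [p["name"] for p in people if matches(p, query)]
print(result)
```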

In part 2 of this post, we’ll look at business cases that demonstrate the best use for NoSQL databases.

Post Date: 3/18/2016

Prakash Mishra, NTT DATA

About the author

Prakash Mishra leads NTT DATA’s Data Architecture and Management Practice. A solutions-driven, results-oriented, self-motivated leader, Prakash has a proven record of extensive data architecture leadership in complex environments. Prakash has been involved in developing and leading the implementation of traditional and innovative big data strategies and solutions, as well as data modernization and master data management solutions, for small to large organizations. Prakash is skilled at building and motivating high-performance teams, cultivating a positive work environment, and promoting a spirit of teamwork and idea-sharing to maximize individual contributions. Prakash holds a master’s degree in computer science and has two decades of experience specializing in enterprise data architecture and management.
