Course Hive

Database Systems - Replicated Databases and Catalogs

Database Systems with SQL - Full Course - Database Systems - Replicated Databases and Catalogs

Full Transcript

Replicas

A replica is a copy of a database, a table, or data in a table. A replicated database is a database that has multiple replicas on different storage devices. Replication provides the following advantages:

  • High availability: if one storage device fails, a replica on a different device can be accessed instead.
  • Fast reads: reads run concurrently on separate replicas without interfering with each other.
  • Local reads: in a distributed database, a read can be served by a local replica, which avoids network delays.
  • Backups: one replica can be backed up while transactions execute on another.
  • Security: updates can be restricted to one replica with a specific user role.

Despite these advantages, updates to a replicated database can be slow or inconsistent, because each update must be applied to every replica. Replication also makes server administration more complex, because DBAs must work out how to manage all the replicas. For these reasons, replication is typically used in parallel and distributed databases where reads are frequent, updates are infrequent, and some inconsistency is acceptable.

Updating replicas

A storage array can update data in a database replica without interfering with the database, which keeps things simple. In a distributed database it is not so simple. If the replicas are updated in a single distributed transaction, they stay consistent, but the update is slow, and it can fail if any replica is unavailable. There are two commonly used techniques for updating replicas in distributed databases:

  • Primary/secondary: one node is designated as the primary. Updates are applied to the primary first and to all secondary nodes afterward, so this method is eventually consistent.
  • Group replication: updates are applied to a group of nodes. The group is told that an update is about to happen and makes sure it does not conflict with any concurrent transaction. Once one node in the group commits the update, the rest follow soon after, so this method is also eventually consistent.
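As a rough illustration of the two update techniques above, here is a minimal Python sketch. It is a toy model under stated assumptions, not how any particular database implements replication. The names Node, PrimarySecondary, Group, pending, and in_flight are hypothetical and do not come from the lesson, and in-memory dictionaries stand in for replicas. For brevity, Group.commit applies an update to every node in one step, whereas the lesson describes the other nodes following shortly after the first commit.

```python
class Node:
    """One replica: a plain key-value store (hypothetical stand-in for a table)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class PrimarySecondary:
    """Writes go to the primary first; secondaries catch up later."""

    def __init__(self, primary, secondaries):
        self.primary = primary
        self.secondaries = secondaries
        self.pending = []  # updates applied on the primary but not yet shipped

    def write(self, key, value):
        self.primary.data[key] = value
        self.pending.append((key, value))

    def propagate(self):
        """Apply queued updates to every secondary, in the order they were written."""
        for key, value in self.pending:
            for node in self.secondaries:
                node.data[key] = value
        self.pending.clear()


class Group:
    """Group replication: announce the write set, commit only if nothing conflicts."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.in_flight = set()  # keys being written by unfinished transactions

    def begin(self, keys):
        """Announce the keys a transaction will write; False means conflict."""
        keys = set(keys)
        if keys & self.in_flight:
            return False  # a concurrent transaction is writing the same key
        self.in_flight |= keys
        return True

    def commit(self, updates):
        """Apply the updates on every node in the group and release the keys."""
        for node in self.nodes:
            node.data.update(updates)
        self.in_flight -= set(updates)


# Primary/secondary: a read on the secondary is stale until propagation runs.
primary, secondary = Node("primary"), Node("secondary")
ps = PrimarySecondary(primary, [secondary])
ps.write("balance", 100)
stale = secondary.data.get("balance")  # None: the secondary has not caught up
ps.propagate()
fresh = secondary.data.get("balance")  # 100: the replicas have converged

# Group replication: the second writer of the same key is rejected.
group = Group([Node("a"), Node("b")])
first = group.begin(["balance"])   # True: no concurrent writer
second = group.begin(["balance"])  # False: conflict with the first transaction
group.commit({"balance": 50})
```

The gap between stale and fresh is the window of inconsistency that "eventually consistent" refers to. The rejected second transaction is the conflict case.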
If group replication detects a conflict, the update is rolled back.

Replicated catalogs

A catalog contains information about the database objects, including all the tables, columns, keys, and indexes. It is used to access data and process queries, so the nodes of a distributed database need access to it. Within a distributed database, the catalog can be:

  • Central catalog: the catalog is located entirely on a single node. This makes it easy to manage but slow for queries from remote locations. That node can also become a bottleneck, because every other node has to access it.
  • Replicated catalog: each node has a copy of the catalog, which makes catalog access fast and reliable. Updating the catalog can be complex, since every replica must also be updated. This increases network traffic and, in a distributed transaction, can cause failures if a replica is unavailable.

Many distributed databases use a replicated catalog, since updates to the catalog don't happen too often. There are also some variations of these two structures.

Subscribe to Appficial for more programming videos coming soon. Also, don't forget to click LIKE and comment on the video if it helped you out!
