Definition 1: Amazon DynamoDB is a fully managed proprietary NoSQL database service that supports key-value and document data structures and is offered by Amazon.com as part of the Amazon Web Services portfolio. DynamoDB exposes a similar data model to, and derives its name from, Dynamo, but has a different underlying implementation. Dynamo had a multi-master design requiring the client to resolve version conflicts, whereas DynamoDB uses synchronous replication across multiple data centers for high durability and availability.
Definition 2: DynamoDB is a fast and flexible non-relational database service for any scale. DynamoDB enables customers to offload the administrative burdens of operating and scaling distributed databases to AWS so that they don’t have to worry about hardware provisioning, setup and configuration, throughput capacity planning, replication, software patching, or cluster scaling.
Amazon DynamoDB explained
- Fully Managed
- Fast, consistent performance
- Fine-grained access control
AWS DynamoDB Facts and Summaries
- Amazon DynamoDB is a low-latency NoSQL database.
- DynamoDB consists of Tables, Items, and Attributes
- DynamoDB supports both document and key-value data models
- DynamoDB supported document formats are JSON, HTML, and XML
- DynamoDB has 2 types of Primary Keys: Partition Key and combination of Partition Key + Sort Key (Composite Key)
- DynamoDB has 2 consistency models: Strongly Consistent / Eventually Consistent
- DynamoDB Access is controlled using IAM policies.
- DynamoDB offers fine-grained access control using the IAM condition key dynamodb:LeadingKeys to allow users to access only the items where the partition key value matches their user ID.
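A minimal sketch of what such a policy document could look like, written as a Python dict. The table name, account ID, and the Cognito identity variable are illustrative assumptions, not values from the source:

```python
# Hypothetical IAM policy restricting each user to items whose partition
# key equals their Cognito identity ID, via dynamodb:LeadingKeys.
leading_keys_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["dynamodb:GetItem", "dynamodb:Query"],
            # Example ARN; replace region/account/table with your own.
            "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/UserData",
            "Condition": {
                "ForAllValues:StringEquals": {
                    "dynamodb:LeadingKeys": [
                        "${cognito-identity.amazonaws.com:sub}"
                    ]
                }
            },
        }
    ],
}
```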
- DynamoDB indexes enable fast queries on specific attributes
- DynamoDB indexes give you a different view of your data based on alternative Partition / Sort Keys.
- DynamoDB Local Secondary Indexes must be created when you create your table; they have the same partition key as your table and a different sort key.
- DynamoDB Global Secondary Indexes can be created at any time: at table creation or after. They can have a different partition key and a different sort key from your table.
- A DynamoDB Query operation finds items in a table using only primary key attribute values: you provide the partition key name and a distinct value to search for.
- A DynamoDB Scan operation examines every item in the table. By default, it returns all data attributes for every item.
- A DynamoDB Query operation is generally more efficient than a Scan.
- With DynamoDB, you can reduce the impact of a query or scan by setting a smaller page size, so that each request uses fewer read operations.
- To optimize DynamoDB performance, isolate scan operations to specific tables and segregate them from your mission-critical traffic.
- To optimize DynamoDB performance, try Parallel scans rather than the default sequential scan.
- To optimize DynamoDB performance: Avoid using scan operations if you can: design tables in a way that you can use Query, Get, or BatchGetItems APIs.
- When you scan your table in Amazon DynamoDB, you should follow the DynamoDB best practices for avoiding sudden bursts of read activity.
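The page-size advice above can be sketched as a small pagination loop. This is a simulation, not a real AWS call: `scan_fn` stands in for a boto3-style `table.scan`, and the integer `ExclusiveStartKey` is a simplification of the real key dict:

```python
def paginated_scan(scan_fn, page_size=25):
    """Iterate over all items using small pages to soften read bursts.

    scan_fn is a stand-in for a DynamoDB Scan call: it accepts Limit and
    (optionally) ExclusiveStartKey, and returns a dict with "Items" plus a
    "LastEvaluatedKey" while more pages remain.
    """
    kwargs = {"Limit": page_size}
    while True:
        page = scan_fn(**kwargs)
        yield from page.get("Items", [])
        last = page.get("LastEvaluatedKey")
        if last is None:
            return  # no more pages
        kwargs["ExclusiveStartKey"] = last


def fake_scan(Limit, ExclusiveStartKey=0):
    """Toy backing 'table' of 7 items, served in pages of size Limit."""
    items = list(range(ExclusiveStartKey, min(ExclusiveStartKey + Limit, 7)))
    page = {"Items": items}
    if items and items[-1] < 6:
        page["LastEvaluatedKey"] = items[-1] + 1
    return page
```

With `page_size=3`, the loop issues three small requests instead of one large burst, which is the point of the best practice.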
- DynamoDB provisioned throughput is measured in capacity units.
- 1 Write Capacity Unit = 1 write of up to 1 KB per second.
- 1 Read Capacity Unit = 1 strongly consistent read of up to 4 KB per second, or 2 eventually consistent reads of up to 4 KB per second. Eventually consistent reads give the maximum read throughput.
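The capacity-unit rules above can be expressed as two small helper functions (a sketch of the sizing arithmetic, not an official AWS formula; item sizes are rounded up to the unit boundaries described above):

```python
import math

def write_capacity_units(item_size_kb, writes_per_second):
    """1 WCU = one write of up to 1 KB per second (size rounded up)."""
    return math.ceil(item_size_kb) * writes_per_second

def read_capacity_units(item_size_kb, reads_per_second, strongly_consistent=True):
    """1 RCU = one strongly consistent read of up to 4 KB per second;
    eventually consistent reads need half as many units."""
    units = math.ceil(item_size_kb / 4) * reads_per_second
    return units if strongly_consistent else math.ceil(units / 2)
```

For example, 10 strongly consistent reads per second of 4 KB items need 10 RCU, but only 5 RCU if eventual consistency is acceptable.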
- What is the maximum throughput that can be provisioned for a single DynamoDB table?
DynamoDB is designed to scale without limits. However, if you want to exceed throughput rates of 10,000 write capacity units or 10,000 read capacity units for an individual table, you must contact AWS to increase them.
If you want to provision more than 20,000 write capacity units or 20,000 read capacity units from a single subscriber account, you must first contact AWS to request a limit increase.
- DynamoDB Performance: DAX is a DynamoDB-compatible caching service that enables you to benefit from fast in-memory performance for demanding applications.
- As an in-memory cache, DAX reduces the response times of eventually consistent read workloads by an order of magnitude, from single-digit milliseconds to microseconds.
- DAX improves response times for Eventually Consistent reads only.
- With DAX, you point your API calls to the DAX cluster instead of your table.
- If the item you are querying is in the cache, DAX will return it; otherwise, it will perform an eventually consistent GetItem operation against your DynamoDB table.
- DAX reduces operational and application complexity by providing a managed service that is API compatible with Amazon DynamoDB, and thus requires only minimal functional changes to use with an existing application.
- DAX is not suitable for write-intensive applications or applications that require Strongly Consistent reads.
- For read-heavy or bursty workloads, DAX provides increased throughput and potential operational cost savings by reducing the need to over-provision read capacity units. This is especially beneficial for applications that require repeated reads for individual keys.
- DynamoDB Performance: ElastiCache
- An in-memory cache that sits between your application and your database.
- 2 different caching strategies: Lazy Loading and Write-Through. Lazy loading only caches data when it is requested.
- ElastiCache node failures are not fatal; they just result in lots of cache misses.
- Avoid stale data by implementing a TTL.
- The Write-Through strategy writes data into the cache whenever there is a change to the database, so data is never stale.
- Write-Through penalty: every write involves a write to the cache, and an ElastiCache node failure means data is missing from the cache until it is added or updated in the database again.
- ElastiCache wastes resources if most of the data is never read.
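The two strategies can be illustrated with a toy cache class; the `db` dict here stands in for the backing database, and this is only a sketch of the pattern, not ElastiCache itself:

```python
class WriteThroughCache:
    """Toy illustration of the two caching strategies above.

    get() uses lazy loading: the cache is populated only on a miss,
    so it can serve stale data if the database changes behind it.
    put() is write-through: every database write also updates the
    cache, so subsequent reads are never stale.
    """

    def __init__(self, db):
        self.db = db      # stand-in for the backing database
        self.cache = {}   # the in-memory cache layer

    def get(self, key):
        if key not in self.cache:          # cache miss: lazy-load from DB
            self.cache[key] = self.db[key]
        return self.cache[key]

    def put(self, key, value):
        self.db[key] = value               # write to the database...
        self.cache[key] = value            # ...and through to the cache
```

Note how a direct database update (bypassing `put`) leaves the cached copy stale, which is exactly why the notes recommend a TTL with lazy loading.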
- Time To Live (TTL) for DynamoDB allows you to define when items in a table expire so that they can be automatically deleted from the database. TTL is provided at no extra cost as a way to reduce storage usage and reduce the cost of storing irrelevant data without using provisioned throughput. With TTL enabled on a table, you can set a timestamp for deletion on a per-item basis, allowing you to limit storage usage to only those records that are relevant.
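DynamoDB TTL expects a Number attribute holding a Unix-epoch timestamp in seconds; items past that time become eligible for background deletion. A minimal helper for computing that value (the attribute name `expires_at` is just an example):

```python
import time

def ttl_epoch(seconds_from_now):
    """Return the epoch timestamp (seconds) to store in a TTL attribute."""
    return int(time.time()) + seconds_from_now

# Example session item that expires one hour from now:
session_item = {
    "session_id": "abc123",          # illustrative partition key
    "expires_at": ttl_epoch(3600),   # TTL attribute (epoch seconds)
}
```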
- DynamoDB Security: DynamoDB uses a CMK to generate and encrypt a unique data key for the table, known as the table key. With DynamoDB, an AWS owned or AWS managed CMK can be used to generate and encrypt keys. The AWS owned CMK is free of charge, while the AWS managed CMK is chargeable. Customer managed CMKs are not supported with encryption at rest.
- Amazon DynamoDB offers fully managed encryption at rest. DynamoDB encryption at rest provides enhanced security by encrypting your data at rest using an AWS Key Management Service (AWS KMS) managed encryption key for DynamoDB. This functionality eliminates the operational burden and complexity involved in protecting sensitive data.
- DynamoDB is an alternative solution that can be used for session management. Its low data-access latency makes it a good data store for session state.
- DynamoDB Streams Use Cases and Design Patterns:
How do you set up a relationship across multiple tables in which, based on the value of an item from one table, you update the item in a second table?
How do you trigger an event based on a particular transaction?
How do you audit or archive transactions?
How do you replicate data across multiple tables (similar to that of materialized views/streams/replication in relational data stores)?
As a NoSQL database, DynamoDB is not designed to support transactions. Although client-side libraries are available to mimic transaction capabilities, they are neither scalable nor cost-effective. For example, the Java Transaction Library for DynamoDB creates 7N+4 additional writes for every write operation. This is partly because the library holds metadata to manage the transactions, to ensure that they are consistent and can be rolled back before commit.
You can use DynamoDB Streams to address all these use cases. DynamoDB Streams is a powerful service that you can combine with other AWS services to solve many similar problems. When enabled, DynamoDB Streams captures a time-ordered sequence of item-level modifications in a DynamoDB table and durably stores the information for up to 24 hours. Applications can access a series of stream records, which contain an item change, from a DynamoDB stream in near real time.
AWS maintains separate endpoints for DynamoDB and DynamoDB Streams. To work with database tables and indexes, your application must access a DynamoDB endpoint. To read and process DynamoDB Streams records, your application must access a DynamoDB Streams endpoint in the same Region.
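A stream consumer typically inspects each record's `eventName` (INSERT, MODIFY, or REMOVE) and the `NewImage`/`OldImage` item snapshots (present depending on the stream view type). A minimal, self-contained sketch of such a handler, run here against a hand-built sample record rather than a live stream:

```python
def summarize_stream_record(record):
    """Extract the useful parts of one DynamoDB Streams record (sketch)."""
    body = record["dynamodb"]
    return {
        "event": record["eventName"],        # INSERT | MODIFY | REMOVE
        "keys": body.get("Keys", {}),        # primary key of the item
        "new_image": body.get("NewImage"),   # item after the change, if any
        "old_image": body.get("OldImage"),   # item before the change, if any
    }

# Hand-built sample record mimicking the documented stream record shape:
sample_record = {
    "eventName": "MODIFY",
    "dynamodb": {
        "Keys": {"Id": {"N": "101"}},
        "NewImage": {"Id": {"N": "101"}, "Status": {"S": "shipped"}},
        "OldImage": {"Id": {"N": "101"}, "Status": {"S": "pending"}},
    },
}
```

In practice such a handler would run inside an AWS Lambda function triggered by the stream, which is how the cross-table update, audit, and replication patterns above are usually wired up.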
- How many global secondary indexes are allowed per table (by default)? 20
- What is one key difference between a global secondary index and a local secondary index?
A local secondary index must have the same partition key as the main table
- How many tables can an AWS account have per region? 256
- How many secondary indexes (global and local combined) are allowed per table? (by default): 25
You can define up to 5 local secondary indexes and 20 global secondary indexes per table (by default) – for a total of 25.
- How can you increase your DynamoDB table limit in a region?
By contacting AWS and requesting a limit increase
- For any AWS account, there is an initial limit of 256 tables per region.
- The minimum length of a partition key value is 1 byte. The maximum length is 2048 bytes.
- The minimum length of a sort key value is 1 byte. The maximum length is 1024 bytes.
- For tables with local secondary indexes, there is a 10 GB size limit per partition key value. A table with local secondary indexes can store any number of items, as long as the total size for any one partition key value does not exceed 10 GB.
- Example: a local secondary index named LastPostIndex shares the partition key of the Thread table but uses LastPostDateTime as its sort key.
- Relational vs Non Relational (SQL vs NoSQL)
Reference: AWS DynamoDB
AWS DynamoDB Questions and Answers Dumps
Q0: What should the Developer enable on the DynamoDB table to optimize performance and minimize costs?
- A. Amazon DynamoDB auto scaling
- B. Amazon DynamoDB cross-region replication
- C. Amazon DynamoDB Streams
- D. Amazon DynamoDB Accelerator
Q2: A security system monitors 600 cameras, saving image metadata every 1 minute to an Amazon DynamoDB table. Each sample involves 1 KB of data, and the data writes are evenly distributed over time. How much write throughput is required for the target table?
- A. 6000
- B. 10
- C. 3600
- D. 600
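The arithmetic behind questions like Q2 follows directly from the capacity-unit rules summarized earlier (600 writes per minute of 1 KB items, evenly distributed):

```python
# 600 cameras each write one 1 KB item per minute, evenly distributed:
writes_per_second = 600 / 60              # = 10 writes per second
wcu_needed = int(writes_per_second * 1)   # 1 WCU per 1 KB write per second
```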
Q3: You are developing an application that will interact with a DynamoDB table. The table is going to take in a lot of read and write operations. Which of the following would be the ideal partition key for the DynamoDB table to ensure ideal performance?
- A. CustomerID
- B. CustomerName
- C. Location
- D. Age
Q4: A DynamoDB table is set with a read throughput capacity of 5 RCU. Which of the following read configurations will provide us with the maximum read throughput?
- A. Read capacity set to 5 for 4KB reads of data at strong consistency
- B. Read capacity set to 5 for 4KB reads of data at eventual consistency
- C. Read capacity set to 15 for 1KB reads of data at strong consistency
- D. Read capacity set to 5 for 1KB reads of data at eventual consistency
Q5: Your team is developing a solution that will make use of DynamoDB tables. Due to the nature of the application, the data is needed across a couple of regions across the world. Which of the following would help reduce the latency of requests to DynamoDB from different regions?
- A. Enable Multi-AZ for the DynamoDB table
- B. Enable global tables for DynamoDB
- C. Enable Indexes for the table
- D. Increase the read and write throughput for the table
Q6: An application is currently accessing a DynamoDB table, and its queries are performing well. Changes have been made to the application, and now its performance is starting to degrade. After looking at the changes, you see that the queries are making use of an attribute which is not the partition key. Which of the following would be the adequate change to make to resolve the issue?
- A. Add an index for the DynamoDB table
- B. Change all the queries to ensure they use the partition key
- C. Enable global tables for DynamoDB
- D. Change the read capacity on the table
Q7: Company B has created an e-commerce site using DynamoDB and is designing a products table that includes items purchased and the users who purchased the item.
When creating a primary key on a table which of the following would be the best attribute for the partition key? Select the BEST possible answer.
- A. None of these are correct.
- B. user_id where there are many users to few products
- C. category_id where there are few categories to many products
- D. product_id where there are few products to many users
Q8: Which API call can be used to retrieve up to 100 items at a time or 16 MB of data from a DynamoDB table?
- A. BatchItem
- B. GetItem
- C. BatchGetItem
- D. ChunkGetItem
Q9: Which DynamoDB limits can be raised by contacting AWS support?
- A. The number of hash keys per account
- B. The maximum storage used per account
- C. The number of tables per account
- D. The number of local secondary indexes per account
- E. The number of provisioned throughput units per account
Q10: Which approach below provides the least impact to provisioned throughput on the “Product” table?
- A. Create an “Images” DynamoDB table to store the Image with a foreign key constraint to the “Product” table
- B. Add an image data type to the “Product” table to store the images in binary format
- C. Serialize the image and store it in multiple DynamoDB tables
- D. Store the images in Amazon S3 and add an S3 URL pointer to the “Product” table item for each image
Q11: You’re creating a DynamoDB database for hosting forums. Your “thread” table contains the forum name, and each “forum name” can have one or more “subjects”. What primary key type would you give the thread table in order to allow more than one subject to be tied to the forum primary key name?
- A. Hash
- B. Range and Hash
- C. Primary and Range
- D. Hash and Range
Amazon Aurora explained:
- High scalability
- High availability and durability
- High Performance
- Multi Region
Amazon ElastiCache Explained
- In-Memory data store
- High availability and reliability
- Fully managed
- Supports two popular open-source engines: Memcached and Redis
Amazon Redshift explained
- Fast, fully managed, petabyte-scale data warehouse
- Supports wide range of open data formats
- Allows you to run SQL queries against large unstructured data in Amazon Simple Storage Service
- Integrates with popular Business Intelligence (BI) and Extract, Transform, Load (ETL) solutions.
Amazon Neptune Explained
- Fully managed graph database
- Supports open graph APIs
- Used in Social Networking
Other AWS Facts and Summaries and Questions/Answers Dump
- AWS S3 facts and summaries and Q&A Dump
- AWS DynamoDB facts and summaries and Questions and Answers Dump
- AWS EC2 facts and summaries and Questions and Answers Dump
- AWS Serverless facts and summaries and Questions and Answers Dump
- AWS Developer and Deployment Theory facts and summaries and Questions and Answers Dump
- AWS IAM facts and summaries and Questions and Answers Dump
- AWS Lambda facts and summaries and Questions and Answers Dump
- AWS SQS facts and summaries and Questions and Answers Dump
- AWS RDS facts and summaries and Questions and Answers Dump
- AWS ECS facts and summaries and Questions and Answers Dump
- AWS CloudWatch facts and summaries and Questions and Answers Dump
- AWS SES facts and summaries and Questions and Answers Dump
- AWS EBS facts and summaries and Questions and Answers Dump
- AWS ELB facts and summaries and Questions and Answers Dump
- AWS Autoscaling facts and summaries and Questions and Answers Dump
- AWS VPC facts and summaries and Questions and Answers Dump
- AWS KMS facts and summaries and Questions and Answers Dump
- AWS Elastic Beanstalk facts and summaries and Questions and Answers Dump
- AWS CodeBuild facts and summaries and Questions and Answers Dump
- AWS CodeDeploy facts and summaries and Questions and Answers Dump
- AWS CodePipeline facts and summaries and Questions and Answers Dump