Big Data and Data Analytics 101 – Top 100 AWS Certified Data Analytics Specialty Certification Questions and Answers Dumps

Top 100 AWS Certified Data Analytics Specialty Certification Questions and Answers Dumps

If you’re looking to take your data analytics career to the next level, then this AWS Data Analytics Specialty Certification Exam Preparation blog is a must-read! With over 100 exam questions and answers, plus data science and data analytics interview questions, cheat sheets and more, you’ll be fully prepared to ace the DAS-C01 exam.

In this blog, we talk about big data and data analytics; we also give you the last updated top 100 AWS Certified Data Analytics – Specialty Questions and Answers Dumps

The AWS Certified Data Analytics – Specialty (DAS-C01) examination is intended for individuals who perform in a data analytics-focused role. This exam validates an examinee’s comprehensive understanding of using AWS services to design, build, secure, and maintain analytics solutions that provide insight from data.

Download the App for an interactive experience:

AWS DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

Get the Read Aloud Wonderland Bedtime Adventure iBook here

The AWS Certified Data Analytics – Specialty (DAS-C01) covers the following domains:

Domain 1: Collection 18%

Domain 2: Storage and Data Management 22%

Domain 3: Processing 24%

Domain 4: Analysis and Visualization 18%

Domain 5: Security 18%

Below are the Top 100 AWS Certified Data Analytics – Specialty Questions and Answers Dumps and References –

https://enoumen.com/2021/11/07/top-100-data-science-and-data-analytics-interview-questions-and-answers/

Question1: What combination of services do you need for the following requirements: accelerate petabyte-scale data transfers, load streaming data, and the ability to create scalable, private connections. Select the correct answer order.

A) Snowball, Kinesis Firehose, Direct Connect

Advertise with us - Post Your Good Content Here
We are ranked in the Top 20 on Google

AI Dashboard is available on the Web, Apple, Google, and Microsoft, PRO version

B) Data Migration Services, Kinesis Firehose, Direct Connect

C) Snowball, Data Migration Services, Direct Connect

Invest in your future today by enrolling in this Azure Fundamentals - Pass the Azure Fundamentals Exam with Ease: Master the AZ-900 Certification with the Comprehensive Exam Preparation Guide!

Microsoft Azure AZ900 Certification and Training

D) Snowball, Direct Connection, Kinesis Firehose

ANSWER1:

Notes/Hint1:

AWS has many options to help get data into the cloud, including secure devices like AWS Import/Export Snowball to accelerate petabyte-scale data transfers, Amazon Kinesis Firehose to load streaming data, and scalable private connections through AWS Direct Connect.

AI Unraveled: Demystifying Frequently Asked Questions on Artificial Intelligence (OpenAI, ChatGPT, Google Gemini, Generative AI, Discriminative AI, xAI, LLMs, GPUs, Machine Learning, NLP, Promp Engineering)

Reference1: Big Data Analytics Options

AWS Data Analytics Specialty Certification Exam Preparation App is a great way to prepare for your upcoming AWS Data Analytics Specialty Certification Exam. The app provides you with over 300 questions and answers, detailed explanations of each answer, a scorecard to track your progress, and a countdown timer to help keep you on track. You can also find data science and data analytics interview questions and detailed answers, cheat sheets, and flashcards to help you study. The app is very similar to the real exam, so you will be well-prepared when it comes time to take the test.

AWS DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

"Pass the AWS Cloud Practitioner Certification with flying colors: Master the Exam with 300+ Quizzes, Cheat Sheets, Flashcards, and Illustrated Study Guides - 2024 Edition"

Question2: A company ingests a large set of clickstream data in nested JSON format from different sources and stores it in Amazon S3. Data analysts need to analyze this data in combination with data stored in an Amazon Redshift cluster. Data analysts want to build a cost-effective and automated solution for this need.
Which solution meets these requirements?

A) Use Apache Spark SQL on Amazon EMR to convert the clickstream data to a tabular format. Use the Amazon Redshift COPY command to load the data into the Amazon Redshift cluster.

B) Use AWS Lambda to convert the data to a tabular format and write it to Amazon S3. Use the Amazon Redshift COPY command to load the data into the Amazon Redshift cluster.

C) Use the Relationalize class in an AWS Glue ETL job to transform the data and write the data back to Amazon S3. Use Amazon Redshift Spectrum to create external tables and join with the internal tables.

"Become a Canada Expert: Ace the Citizenship Test and Impress Everyone with Your Knowledge of Canadian History, Geography, Government, Culture, People, Languages, Travel, Wildlife, Hockey, Tourism, Sceneries, Arts, and Data Visualization. Get the Top 1000 Canada Quiz Now!"

D) Use the Amazon Redshift COPY command to move the clickstream data directly into new tables in the Amazon Redshift cluster.

ANSWER2:

Notes/Hint2:

The Relationalize PySpark transform can be used to flatten the nested data into a structured format. Amazon Redshift Spectrum can join the external tables and query the transformed clickstream data in place rather than needing to scale the cluster to accommodate the large dataset.

Reference1: Relationalize PySpark

AWS DAS-C01 Exam Prep on iOS

Unlock the Secrets of Africa: Master African History, Geography, Culture, People, Cuisine, Economics, Languages, Music, Wildlife, Football, Politics, Animals, Tourism, Science and Environment with the Top 1000 Africa Quiz and Trivia. Get Yours Now!

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

Question 3: There is a five-day car rally race across Europe. The race coordinators are using a Kinesis stream and IoT sensors to monitor the movement of the cars. Each car has a sensor and data is getting back to the stream with the default stream settings. On the last day of the rally, data is sent to S3. When you go to interpret the data in S3, there is only data for the last day and nothing for the first 4 days. Which of the following is the most probable cause of this?

A) You did not have versioning enabled and would need to create individual buckets to prevent the data from being overwritten.

B) Data records are only accessible for a default of 24 hours from the time they are added to a stream.

C) One of the sensors failed, so there was no data to record.

Dive into a comprehensive AWS Cloud Practitioner CLF-C02 Certification guide, masterfully weaving insights from Tutorials Dojo, Adrian Cantrill, Stephane Maarek, and AWS Skills Builder into one unified resource.

D) You needed to use EMR to send the data to S3; Kinesis Streams are only compatible with DynamoDB.

ANSWER3:

Notes/Hint3:

Streams support changes to the data record retention period of your stream. An Amazon Kinesis stream is an ordered sequence of data records, meant to be written to and read from in real-time. Data records are therefore stored in shards in your stream temporarily. The period from when a record is added to when it is no longer accessible is called the retention period. An Amazon Kinesis stream stores records for 24 hours by default, up to 168 hours.

Reference3: Kinesis Extended Reading

AWS DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

Ace the Microsoft Azure Fundamentals AZ-900 Certification Exam: Pass the Azure Fundamentals Exam with Ease

AWS DAS-C01 Exam Prep on Windows

Question 4: A publisher website captures user activity and sends clickstream data to Amazon Kinesis Data Streams. The publisher wants to design a cost-effective solution to process the data to create a timeline of user activity within a session. The solution must be able to scale depending on the number of active sessions.
Which solution meets these requirements?

A) Include a variable in the clickstream data from the publisher website to maintain a counter for the number of active user sessions. Use a timestamp for the partition key for the stream. Configure the consumer application to read the data from the stream and change the number of processor threads based upon the counter. Deploy the consumer application on Amazon EC2 instances in an EC2 Auto Scaling group.

B) Include a variable in the clickstream to maintain a counter for each user action during their session. Use the action type as the partition key for the stream. Use the Kinesis Client Library (KCL) in the consumer application to retrieve the data from the stream and perform the processing. Configure the consumer application to read the data from the stream and change the number of processor threads based upon the
counter. Deploy the consumer application on AWS Lambda.

C) Include a session identifier in the clickstream data from the publisher website and use as the partition key for the stream. Use the Kinesis Client Library (KCL) in the consumer application to retrieve the data from the stream and perform the processing. Deploy the consumer application on Amazon EC2 instances in an
EC2 Auto Scaling group. Use an AWS Lambda function to reshard the stream based upon Amazon CloudWatch alarms.

D) Include a variable in the clickstream data from the publisher website to maintain a counter for the number of active user sessions. Use a timestamp for the partition key for the stream. Configure the consumer application to read the data from the stream and change the number of processor threads based upon the counter. Deploy the consumer application on AWS Lambda.

ANSWER4:

Cloud Certification made simple. Ace your exams with Djamgatech.

Notes/Hint4:

Partitioning by the session ID will allow a single processor to process all the actions for a user session in order. An AWS Lambda function can call the UpdateShardCount API action to change the number of shards in the stream. The KCL will automatically manage the number of processors to match the number of shards. Amazon EC2 Auto Scaling will assure the correct number of instances are running to meet the processing load.

Reference4: UpdateShardCount API

AWS DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

Question 5: Your company has two batch processing applications that consume financial data about the day’s stock transactions. Each transaction needs to be stored durably and guarantee that a record of each application is delivered so the audit and billing batch processing applications can process the data. However, the two applications run separately and several hours apart and need access to the same transaction information. After reviewing the transaction information for the day, the information no longer needs to be stored. What is the best way to architect this application?

A) Use SQS for storing the transaction messages; when the billing batch process performs first and consumes the message, write the code in a way that does not remove the message after consumed, so it is available for the audit application several hours later. The audit application can consume the SQS message and remove it from the queue when completed.

B) Use Kinesis to store the transaction information. The billing application will consume data from the stream and the audit application can consume the same data several hours later.

C) Store the transaction information in a DynamoDB table. The billing application can read the rows while the audit application will read the rows then remove the data.

D) Use SQS for storing the transaction messages. When the billing batch process consumes each message, have the application create an identical message and place it in a different SQS for the audit application to use several hours later.

SQS would make this more difficult because the data does not need to persist after a full day.

ANSWER5:

Notes/Hint5:

Kinesis appears to be the best solution that allows multiple consumers to easily interact with the records.

Reference5: Amazon Kinesis

Get mobile friendly version of the quiz @ the App Store

AWS DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

Question 6: A company is currently using Amazon DynamoDB as the database for a user support application. The company is developing a new version of the application that will store a PDF file for each support case ranging in size from 1–10 MB. The file should be retrievable whenever the case is accessed in the application.
How can the company store the file in the MOST cost-effective manner?

A) Store the file in Amazon DocumentDB and the document ID as an attribute in the DynamoDB table.

B) Store the file in Amazon S3 and the object key as an attribute in the DynamoDB table.

C) Split the file into smaller parts and store the parts as multiple items in a separate DynamoDB table.

D) Store the file as an attribute in the DynamoDB table using Base64 encoding.

ANSWER6:

Notes/Hint6:

Use Amazon S3 to store large attribute values that cannot fit in an Amazon DynamoDB item. Store each file as an object in Amazon S3 and then store the object path in the DynamoDB item.

Reference6: S3 Storage Cost – DynamODB Storage Cost

AWS DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

Question 7: Your client has a web app that emits multiple events to Amazon Kinesis Streams for reporting purposes. Critical events need to be immediately captured before processing can continue, but informational events do not need to delay processing. What solution should your client use to record these types of events without unnecessarily slowing the application?

A) Log all events using the Kinesis Producer Library.

B) Log critical events using the Kinesis Producer Library, and log informational events using the PutRecords API method.

C) Log critical events using the PutRecords API method, and log informational events using the Kinesis Producer Library.

D) Log all events using the PutRecords API method.

ANSWER2:

Notes/Hint7:

The PutRecords API can be used in code to be synchronous; it will wait for the API request to complete before the application continues. This means you can use it when you need to wait for the critical events to finish logging before continuing. The Kinesis Producer Library is asynchronous and can send many messages without needing to slow down your application. This makes the KPL ideal for the sending of many non-critical alerts asynchronously.

Reference7: PutRecords API

AWS DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

Question 8: You work for a start-up that tracks commercial delivery trucks via GPS. You receive coordinates that are transmitted from each delivery truck once every 6 seconds. You need to process these coordinates in near real-time from multiple sources and load them into Elasticsearch without significant technical overhead to maintain. Which tool should you use to digest the data?

A) Amazon SQS

B) Amazon EMR

C) AWS Data Pipeline

D) Amazon Kinesis Firehose

ANSWER8:

Notes/Hint8:

Amazon Kinesis Firehose is the easiest way to load streaming data into AWS. It can capture, transform, and load streaming data into Amazon S3, Amazon Redshift, and Amazon Elasticsearch Service, enabling near real-time analytics with existing business intelligence tools and dashboards.

Reference8: Amazon Kinesis Firehose

AWS DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

Question 9: A company needs to implement a near-real-time fraud prevention feature for its ecommerce site. User and order details need to be delivered to an Amazon SageMaker endpoint to flag suspected fraud. The amount of input data needed for the inference could be as much as 1.5 MB.
Which solution meets the requirements with the LOWEST overall latency?

A) Create an Amazon Managed Streaming for Kafka cluster and ingest the data for each order into a topic. Use a Kafka consumer running on Amazon EC2 instances to read these messages and invoke the Amazon SageMaker endpoint.

B) Create an Amazon Kinesis Data Streams stream and ingest the data for each order into the stream. Create an AWS Lambda function to read these messages and invoke the Amazon SageMaker endpoint.

C) Create an Amazon Kinesis Data Firehose delivery stream and ingest the data for each order into the stream. Configure Kinesis Data Firehose to deliver the data to an Amazon S3 bucket. Trigger an AWS Lambda function with an S3 event notification to read the data and invoke the Amazon SageMaker endpoint.

D) Create an Amazon SNS topic and publish the data for each order to the topic. Subscribe the Amazon SageMaker endpoint to the SNS topic.

ANSWER9:

Notes/Hint9:

An Amazon Managed Streaming for Kafka cluster can be used to deliver the messages with very low latency. It has a configurable message size that can handle the 1.5 MB payload.

Reference9: Amazon Managed Streaming for Kafka cluster

AWS DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

Question 10: You need to filter and transform incoming messages coming from a smart sensor you have connected with AWS. Once messages are received, you need to store them as time series data in DynamoDB. Which AWS service can you use?

A) IoT Device Shadow Service

B) Redshift

C) Kinesis

D) IoT Rules Engine

ANSWER10:

Notes/Hint10:

The IoT rules engine will allow you to send sensor data over to AWS services like DynamoDB

Reference10: The IoT rules engine

Get mobile friendly version of the quiz @ the App Store

Question 11: A media company is migrating its on-premises legacy Hadoop cluster with its associated data processing scripts and workflow to an Amazon EMR environment running the latest Hadoop release. The developers want to reuse the Java code that was written for data processing jobs for the on-premises cluster.
Which approach meets these requirements?

A) Deploy the existing Oracle Java Archive as a custom bootstrap action and run the job on the EMR cluster.

B) Compile the Java program for the desired Hadoop version and run it using a CUSTOM_JAR step on the EMR cluster.

C) Submit the Java program as an Apache Hive or Apache Spark step for the EMR cluster.

D) Use SSH to connect the master node of the EMR cluster and submit the Java program using the AWS CLI.

ANSWER11:

Notes/Hint11:

A CUSTOM JAR step can be configured to download a JAR file from an Amazon S3 bucket and execute it. Since the Hadoop versions are different, the Java application has to be recompiled.

Reference11: Automating analytics workflows on EMR

AWS DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

Question 12: You currently have databases running on-site and in another data center off-site. What service allows you to consolidate to one database in Amazon?

A) AWS Kinesis

B) AWS Database Migration Service

C) AWS Data Pipeline

D) AWS RDS Aurora

ANSWER12:

Notes/Hint12:

AWS Database Migration Service can migrate your data to and from most of the widely used commercial and open source databases. It supports homogeneous migrations such as Oracle to Oracle, as well as heterogeneous migrations between different database platforms, such as Oracle to Amazon Aurora. Migrations can be from on-premises databases to Amazon RDS or Amazon EC2, databases running on EC2 to RDS, or vice versa, as well as from one RDS database to another RDS database.

Reference12: DMS

AWS DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

Question 13: An online retail company wants to perform analytics on data in large Amazon S3 objects using Amazon EMR. An Apache Spark job repeatedly queries the same data to populate an analytics dashboard. The analytics team wants to minimize the time to load the data and create the dashboard.
Which approaches could improve the performance? (Select TWO.)

A) Copy the source data into Amazon Redshift and rewrite the Apache Spark code to create analytical reports by querying Amazon Redshift.

B) Copy the source data from Amazon S3 into Hadoop Distributed File System (HDFS) using s3distcp.

C) Load the data into Spark DataFrames.

D) Stream the data into Amazon Kinesis and use the Kinesis Connector Library (KCL) in multiple Spark jobs to perform analytical jobs.

E) Use Amazon S3 Select to retrieve the data necessary for the dashboards from the S3 objects.

ANSWER13:

C and E

Notes/Hint13:

One of the speed advantages of Apache Spark comes from loading data into immutable dataframes, which can be accessed repeatedly in memory. Spark DataFrames organizes distributed data into columns. This makes summaries and aggregates much quicker to calculate. Also, instead of loading an entire large Amazon S3 object, load only what is needed using Amazon S3 Select. Keeping the data in Amazon S3 avoids loading the large dataset into HDFS.

Reference13: Spark DataFrames

Question 14: You have been hired as a consultant to provide a solution to integrate a client’s on-premises data center to AWS. The customer requires a 300 Mbps dedicated, private connection to their VPC. Which AWS tool do you need?

A) VPC peering

B) Data Pipeline

C) Direct Connect

D) EMR

ANSWER14:

Notes/Hint14:

Direct Connect will provide a dedicated and private connection to an AWS VPC.

Reference14: Direct Connect

AWS DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

Question 15: Your organization has a variety of different services deployed on EC2 and needs to efficiently send application logs over to a central system for processing and analysis. They’ve determined it is best to use a managed AWS service to transfer their data from the EC2 instances into Amazon S3 and they’ve decided to use a solution that will do what?

A) Installs the AWS Direct Connect client on all EC2 instances and uses it to stream the data directly to S3.

B) Leverages the Kinesis Agent to send data to Kinesis Data Streams and output that data in S3.

C) Ingests the data directly from S3 by configuring regular Amazon Snowball transactions.

D) Leverages the Kinesis Agent to send data to Kinesis Firehose and output that data in S3.

ANSWER15:

Notes/Hint15:

Kinesis Firehose is a managed solution, and log files can be sent from EC2 to Firehose to S3 using the Kinesis agent.

Reference15: Kinesis Firehose

Question 16: A data engineer needs to create a dashboard to display social media trends during the last hour of a large company event. The dashboard needs to display the associated metrics with a latency of less than 1 minute.
Which solution meets these requirements?

A) Publish the raw social media data to an Amazon Kinesis Data Firehose delivery stream. Use Kinesis Data Analytics for SQL Applications to perform a sliding window analysis to compute the metrics and output the results to a Kinesis Data Streams data stream. Configure an AWS Lambda function to save the stream data to an Amazon DynamoDB table. Deploy a real-time dashboard hosted in an Amazon S3 bucket to read and display the metrics data stored in the DynamoDB table.

B) Publish the raw social media data to an Amazon Kinesis Data Firehose delivery stream. Configure the stream to deliver the data to an Amazon Elasticsearch Service cluster with a buffer interval of 0 seconds. Use Kibana to perform the analysis and display the results.

C) Publish the raw social media data to an Amazon Kinesis Data Streams data stream. Configure an AWS Lambda function to compute the metrics on the stream data and save the results in an Amazon S3 bucket. Configure a dashboard in Amazon QuickSight to query the data using Amazon Athena and display the results.

D) Publish the raw social media data to an Amazon SNS topic. Subscribe an Amazon SQS queue to the topic. Configure Amazon EC2 instances as workers to poll the queue, compute the metrics, and save the results to an Amazon Aurora MySQL database. Configure a dashboard in Amazon QuickSight to query the data in Aurora and display the results.

ANSWER16:

Notes/Hint16:

Amazon Kinesis Data Analytics can query data in a Kinesis Data Firehose delivery stream in near-real time using SQL. A sliding window analysis is appropriate for determining trends in the stream. Amazon S3 can host a static webpage that includes JavaScript that reads the data in Amazon DynamoDB and refreshes the dashboard.

Reference16: Amazon Kinesis Data Analytics can query data in a Kinesis Data Firehose delivery stream in near-real time using SQL

Question 17: A real estate company is receiving new property listing data from its agents through .csv files every day and storing these files in Amazon S3. The data analytics team created an Amazon QuickSight visualization report that uses a dataset imported from the S3 files. The data analytics team wants the visualization report to reflect the current data up to the previous day. How can a data analyst meet these requirements?

A) Schedule an AWS Lambda function to drop and re-create the dataset daily.

B) Configure the visualization to query the data in Amazon S3 directly without loading the data into SPICE.

C) Schedule the dataset to refresh daily.

D) Close and open the Amazon QuickSight visualization.

ANSWER17:

Notes/Hint17:

Datasets created using Amazon S3 as the data source are automatically imported into SPICE. The Amazon QuickSight console allows for the refresh of SPICE data on a schedule.

Reference17: Amazon QuickSight and SPICE

AWS DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

Question 18: You need to migrate data to AWS. It is estimated that the data transfer will take over a month via the current AWS Direct Connect connection your company has set up. Which AWS tool should you use?

A) Establish additional Direct Connect connections.

B) Use Data Pipeline to migrate the data in bulk to S3.

C) Use Kinesis Firehose to stream all new and existing data into S3.

D) Snowball

ANSWER18:

Notes/Hint18:

As a general rule, if it takes more than one week to upload your data to AWS using the spare capacity of your existing Internet connection, then you should consider using Snowball. For example, if you have a 100 Mb connection that you can solely dedicate to transferring your data and need to transfer 100 TB of data, it takes more than 100 days to complete a data transfer over that connection. You can make the same transfer by using multiple Snowballs in about a week.

Reference18: Snowball

Question 19: You currently have an on-premises Oracle database and have decided to leverage AWS and use Aurora. You need to do this as quickly as possible. How do you achieve this?

A) It is not possible to migrate an on-premises database to AWS at this time.

B) Use AWS Data Pipeline to create a target database, migrate the database schema, set up the data replication process, initiate the full load and a subsequent change data capture and apply, and conclude with a switchover of your production environment to the new database once the target database is caught up with the source database.

C) Use AWS Database Migration Services and create a target database, migrate the database schema, set up the data replication process, initiate the full load and a subsequent change data capture and apply, and conclude with a switch-over of your production environment to the new database once the target database is caught up with the source database.

D) Use AWS Glue to crawl the on-premises database schemas and then migrate them into AWS with Data Pipeline jobs.

https://aws.amazon.com/dms/faqs/

ANSWER19:

Notes/Hint19:

DMS can efficiently support this sort of migration using the steps outlined. While AWS Glue can help you crawl schemas and store metadata on them inside of Glue for later use, it isn’t the best tool for actually transitioning a database over to AWS itself. Similarly, while Data Pipeline is great for ETL and ELT jobs, it isn’t the best option to migrate a database over to AWS.

Reference19: DMS

AWS DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

Question 20: A financial company uses Amazon EMR for its analytics workloads. During the company’s annual security audit, the security team determined that none of the EMR clusters’ root volumes are encrypted. The security team recommends the company encrypt its EMR clusters’ root volume as soon as possible.
Which solution would meet these requirements?

A) Enable at-rest encryption for EMR File System (EMRFS) data in Amazon S3 in a security configuration. Re-create the cluster using the newly created security configuration.

B) Specify local disk encryption in a security configuration. Re-create the cluster using the newly created security configuration.

C) Detach the Amazon EBS volumes from the master node. Encrypt the EBS volume and attach it back to the master node.

D) Re-create the EMR cluster with LZO encryption enabled on all volumes.

ANSWER20:

Notes/Hint20:

Local disk encryption can be enabled as part of a security configuration to encrypt root and storage volumes.

Reference20: EMR Cluster Local disk encryption

Question 21: A company has a clickstream analytics solution using Amazon Elasticsearch Service. The solution ingests 2 TB of data from Amazon Kinesis Data Firehose and stores the latest data collected within 24 hours in an Amazon ES cluster. The cluster is running on a single index that has 12 data nodes and 3 dedicated master nodes. The cluster is configured with 3,000 shards and each node has 3 TB of EBS storage attached. The Data Analyst noticed that the query performance of Elasticsearch is sluggish, and some intermittent errors are produced by the Kinesis Data Firehose when it tries to write to the index. Upon further investigation, there were occasional JVMMemoryPressure errors found in Amazon ES logs.

What should be done to improve the performance of the Amazon Elasticsearch Service cluster?

A) Improve the cluster performance by increasing the number of master nodes of Amazon Elasticsearch.

B) Improve the cluster performance by increasing the number of shards of the Amazon Elasticsearch index.

C) Improve the cluster performance by decreasing the number of data nodes of Amazon Elasticsearch.

D) Improve the cluster performance by decreasing the number of shards of the Amazon Elasticsearch index.

ANSWER21:
D

Notes/Hint21:

“Amazon Elasticsearch Service (Amazon ES) is a managed service that makes it easy to deploy, operate, and scale Elasticsearch clusters in AWS Cloud. Elasticsearch is a popular open-source search and analytics engine for use cases such as log analytics, real-time application monitoring, and clickstream analysis. With Amazon ES, you get direct access to the Elasticsearch APIs; existing code and applications work seamlessly with the service.

Each Elasticsearch index is split into some number of shards. You should decide the shard count before indexing your first document. The overarching goal of choosing a number of shards is to distribute an index evenly across all data nodes in the cluster. However, these shards shouldn’t be too large or too numerous.

A good rule of thumb is to try to keep a shard size between 10 – 50 GiB. Large shards can make it difficult for Elasticsearch to recover from failure, but because each shard uses some amount of CPU and memory, having too many small shards can cause performance issues and out of memory errors. In other words, shards should be small enough that the underlying Amazon ES instance can handle them, but not so small that they place needless strain on the hardware. Therefore the correct answer is: Improve the cluster performance by decreasing the number of shards of Amazon Elasticsearch index.

Reference: ElasticsSearch

Question 22: A data lake is a central repository that enables which operation?

A) Store unstructured data from a single data source

B) Store structured data from any data source

C) Store structure and unstructured data from any source

D) Store structured and unstructured data from a single source

ANSWER22:
C

Notes/Hint22:

Data lake is a centralized repository for large amounts of structured and unstructured data to enable direct analytics.

Reference: Data Lakes

Question 23: What is the most cost-effective storage option for your data lake?

A) Amazon EBS

B) Amazon S3

C) Amazon RDS

D) Amazon Redshift

ANSWER23:
B

Notes/Hint23:

Amazon S3

Reference: Data Lakes – S3

Question 24: Which services are used in the processing layer of a data lake architecture? (SELECT TWO)

A. AWS Snowball

B. AWS Glue

C. Amazon EMR

D. Amazon QuickSight

ANSWER24:
B and C

Notes/Hint24:

Amazon Glue and Amazon EMR

Reference: Data Lakes – Glue and EMR

Question 25: Which services can be used for data ingestion into your data lake? (SELECT TWO)

A) Amazon Kinesis Data Firehose

B) Amazon QuickSight

C) Amazon Athena

D) AWS Storage Gateway

ANSWER25:
A and D

Notes/Hint25:

Amazon Kinesis Data Firehose and and Amazon Storage Gateway

Reference: Data Lakes

Question 26: Which service uses continuous data replication with high availability to consolidate databases into a petabyte-scale data warehouse by streaming data to amazon Redshift and Amazon S3?

A) AWS Storage Gateway

B) AWS Schema Conversion Tool

C) AWS Database Migration Service

D) Amazon Kinesis Data Firehose

ANSWER26:
C

Notes/Hint26:

AWS Database Migration Service

Reference: Data Lakes

Question 27: What is the AWS Glue Data Catalog?

A) A fully managed ETL (extract, transform, and load) pipeline service

B) A service to schedule jobs

C) A visual data preparation tool

D) An index to the location, schema, and runtime metrics of your data

ANSWER27:
D

Notes/Hint27:

An index to the location, schema, and runtime metrics of your data

Reference: Data Lakes

Questions 28: What AWS Glue feature “catalogs” your data?

A) AWS Glue crawler

B) AWS Glue DataBrew

C) AWS Glue Studio

D) AWS Glue Elastic Views

ANSWER28:
A

Notes/Hint28:

AWS Glue crawler

Reference: Data Lakes

Question 29: During your data preparation stage, the raw data has been enriched to support additional insights. You need to improve query performance and reduce costs of the final analytics solution.

Which data formats meet these requirements (SELECT TWO)

ANSWER29:
C and D

Notes/Hint29:

Apache Parquet and Apache ORC

Reference: Data Lakes

Question 30: Your small start-uo company is developing a data analytics solution. You need to clean and normalize large datasets, but you do not have developers with the skill set to write custom scripts. Which tool will help efficiently design and run the data preparation activities?

ANSWER30:
B

Notes/Hint30:

AWS Glue DataBrew

To be able to run analytics, build reports, or apply machine learning, you need to be sure the data you’re using is clean and in the right format. This data preparation step requires data analysts and data scientists to write custom code and perform many manual activities. When cleaning and normalizing data, it is helpful to first review the dataset to understand which possible values are present. Simple visualizations are helpful for determining whether correlations exist between the columns.

AWS Glue DataBrew is a visual data preparation tool that helps you clean and normalize data up to 80% faster so you can focus more on the business value you can get. DataBrew provides a visual interface that quickly connects to your data stored in Amazon S3, Amazon Redshift, Amazon Relational Database Service (RDS), any JDBC-accessible data store, or data indexed by the AWS Glue Data Catalog. You can then explore the data, look for patterns, and apply transformations. For example, you can apply joins and pivots, merge different datasets, or use functions to manipulate data.

Reference: Data Lakes

Question 30: In which scenario would you use AWS Glue jobs?

A) Analyze data in real-time as data comes into the data lake

B) Transform data in real-time as data comes into the data lake

C) Analyze data in batches on schedule or on demand

D) Transform data in batches on schedule or on demand.

ANSWER30:
D

Notes/Hint30:

An AWS Glue job encapsulates a script that connects to your source data, processes it, and then writes it out to your data target. Typically, a job runs extract, transform, and load (ETL) scripts. Jobs can also run general-purpose Python scripts (Python shell jobs.) AWS Glue triggers can start jobs based on a schedule or event, or on demand. You can monitor job runs to understand runtime metrics such as completion status, duration, and start tim

Question 31: Your data resides in multiple data stores, including Amazon S3, Amazon RDS, and Amazon DynamoDB. You need to efficiently query the combined datasets.

Which tool can achieve this, using a single query, without moving data?

A) Amazon Athena Federated Query

B) Amazon Redshift Query Editor

C) SQl Workbench

D) AWS Glue DataBrew

ANSWER31:
A

Notes/Hint31:

With Amazon Athena Federated Query, you can run SQL queries across a variety of relational, non-relational, and custom data sources. You get a unified way to run SQL queries across various data stores.

Athena uses data source connectors that run on AWS Lambda to run federated queries. A data source connector is a piece of code that can translate between your target data source and Athena. You can think of a connector as an extension of Athena’s query engine. Pre-built Athena data source connectors exist for data sources like Amazon CloudWatch Logs, Amazon DynamoDB, Amazon DocumentDB, Amazon RDS, and JDBC-compliant relational data sources such MySQL and PostgreSQL under the Apache 2.0 license. You can also use the Athena Query Federation SDK to write custom connectors. To choose, configure, and deploy a data source connector to your account, you can use the Athena and Lambda consoles or the AWS Serverless Application Repository. After you deploy data source connectors, the connector is associated with a catalog that you can specify in SQL queries. You can combine SQL statements from multiple catalogs and span multiple data sources with a single query.

Question 32: Which benefit do you achieve by using AWS Lake Formation to build data lakes?

A) Build data lakes quickly

B) Simplify security management

C) Provide self-service access to data

D) All of the above

ANSWER32:
D

Notes/Hint32:

Build data lakes quickly

With Lake Formation, you can move, store, catalog, and clean your data faster. You simply point Lake Formation at your data sources, and Lake Formation crawls those sources and moves the data into your new Amazon S3 data lake. Lake Formation organizes data in S3 around frequently used query terms and into right-sized chunks to increase efficiency. Lake Formation also changes data into formats like Apache Parquet and ORC for faster analytics. In addition, Lake Formation has built-in machine learning to deduplicate and find matching records (two entries that refer to the same thing) to increase data quality.

Simplify security management

You can use Lake Formation to centrally define security, governance, and auditing policies in one place, versus doing these tasks per service. You can then enforce those policies for your users across their analytics applications. Your policies are consistently implemented, eliminating the need to manually configure them across security services like AWS Identity and Access Management (AWS IAM) and AWS Key Management Service (AWS KMS), storage services like Amazon S3, and analytics and machine learning services like Amazon Redshift, Amazon Athena, and (in beta) Amazon EMR for Apache Spark. This reduces the effort in configuring policies across services and provides consistent enforcement and compliance.

Provide self-service access to data

With Lake Formation, you build a data catalog that describes the different available datasets along with which groups of users have access to each. This makes your users more productive by helping them find the right dataset to analyze. By providing a catalog of your data with consistent security enforcement, Lake Formation makes it easier for your analysts and data scientists to use their preferred analytics service. They can use Amazon EMR for Apache Spark (in beta), Amazon Redshift, or Amazon Athena on diverse datasets that are now housed in a single data lake. Users can also combine these services without having to move data between silos.

Question 33: What are the three stages to set up a data lake using AWS Lake Formation? (SELECT THREE)

A) Register the storage location

B) Create a database

C) Populate the database

D) Grant permissions

ANSWER33:
A B and D

Notes/Hint33:

Lake Formation manages access to designated storage locations within Amazon S3. Register the storage locations that you want to be part of the data lake.

Create a database

Lake Formation organizes data into a catalog of logical databases and tables. Create one or more databases and then automatically generate tables during data ingestion for common workflows.

Grant permissions

Lake Formation manages access for IAM users, roles, and Active Directory users and groups via flexible database, table, and column permissions. Grant permissions to one or more resources for your selected users.

Question 34: Which of the following AWS Lake Formation tasks are performed by the AWS Glue service? (SELECT THREE)

A) ETL code creation and job monitoring

B) Blueprints to create workflows

C) Data catalog and serverless architecture

D) Simplify securty management

ANSWER34:
A B and C

Notes/Hint34:

Lake Formation leverages a shared infrastructure with AWS Glue, including console controls, ETL code creation and job monitoring, blueprints to create workflows for data ingest, the same data catalog, and a serverless architecture. While AWS Glue focuses on these types of functions, Lake Formation encompasses all AWS Glue features AND provides additional capabilities designed to help build, secure, and manage a data lake. See the AWS Glue features page for more de

Question 35: A digital media customer needs to quickly build a data lake solution for the data housed in a PostgreSQL database. As a solutions architect, what service and feature would meet this requirement?

A) Copy PostgreSQL data to an Amazon S3 bucket and build a data lake using AWS Lake Formation

B) Use AWS Lake Formation blueprints

C) Build a data lake manually

D) Build an analytics solution by directly accessing the database.

ANSWER35:
B

Notes/Hint35:

A blueprint is a data management template that enables you to easily ingest data into a data lake. Lake Formation provides several blueprints, each for a predefined source type, such as a relational database or AWS CloudTrail logs. From a blueprint, you can create a workflow. Workflows consist of AWS Glue crawlers, jobs, and triggers that are generated to orchestrate the loading and update of data. Blueprints take the data source, data target, and schedule as input to configure the workflow.

Question 36: AWS Lake Formation has a set of suggested personas and IAM permissions. Which is a required persona?

A) Data lake administrator

B) Data engineer

C) Data analyst

D) Business analyst

ANSWER36:
A

Notes/Hint36:

Data lake administrator (Required)

A user who can register Amazon S3 locations, access the Data Catalog, create databases, create and run workflows, grant Lake Formation permissions to other users, and view AWS CloudTrail logs. The user has fewer IAM permissions than the IAM administrator but enough to administer the data lake. Cannot add other data lake administrators.

Data engineer (Optional) A user who can create and run crawlers and workflows and grant Lake Formation permissions on the Data Catalog tables that the crawlers and workflows create.

Data analyst (Optional) A user who can run queries against the data lake using, for example, Amazon Athena. The user has only enough permissions to run queries.

Business analyst (Optional) Generally, an end-user application specific persona that would query data and resource using a workflow role.

Question 37: Which three types of blueprints does AWS Lake Formation support? (SELECT THREE)

A) ETL code creation and job monitoring

B) Database snapshot

C) Incremental database

D) Log file sources (AWS CloudTrail, ELB/ALB logs)

ANSWER37:
B C and D

Notes/Hint37:

AWS Lake Formation blueprints simplify and automate creating workflows. Lake Formation provides the following types of blueprints:

• Database snapshot – Loads or reloads data from all tables into the data lake from a JDBC source. You can exclude some data from the source based on an exclude pattern.

• Incremental database – Loads only new data into the data lake from a JDBC source, based on previously set bookmarks. You specify the individual tables in the JDBC source database to include. For each table, you choose the bookmark columns and bookmark sort order to keep track of data that has previously been loaded. The first time that you run an incremental database blueprint against a set of tables, the workflow loads all data from the tables and sets bookmarks for the next incremental database blueprint run. You can therefore use an incremental database blueprint instead of the database snapshot blueprint to load all data, provided that you specify each table in the data source as a paramete

• Log file – Bulk loads data from log file sources, including AWS CloudTrail, Elastic Load Balancing logs, and Application Load Balancer logs.

Question 38: Which one of the following is the best description of the capabilities of Amazon QuickSight?

A) Automated configuration service build on AWS Glue

B) Fast, serverless, business intelligence service

C) Fast, simple, cost-effective data warehousing

D) Simple, scalable, and serverless data integration

ANSWER38:
B C and D

Notes/Hint38:

B. Scalable, serverless business intelligence service is the correct choice.

See the brief descriptions of several AWS Analytics services below:

AWS Lake Formation Build a secure data lake in days using Glue blueprints and workflows

Amazon QuickSight Scalable, serverless, embeddable, ML-powered BI Service built for the cloud

Amazon Redshift Analyze all of your data with the fastest and most widely used cloud data warehouse

AWS Glue Simple, scalable, and serverless data integration

Question 39: Which benefits are provided by Amazon Redshift? (Select TWO)

A) Analyze Data stored in your data lake

B) Maintain performance at scale

C) Focus effort on Data warehouse administration

D) Store all the data to meet analytics need

E) Amazon Redshift includes enterprise-level security and compliance features.

ANSWER38:
A and B

Notes/Hint38:

A is correct – With Amazon Redshift, you can analyze all your data, including exabytes of data stored in your Amazon S3 data lake.

B is correct – Amazon Redshift provides consistent performance at scale.

• C is incorrect – Amazon Redshift is a fully managed data warehouse solution. It includes automations to reduce the administrative overhead traditionally associated with data warehouses. When using Amazon Redshift, you can focus your development effort on strategic data analytics solutions.

• D is incorrect – With Amazon Redshift features—such as Amazon Redshift Spectrum, materialized views, and federated query—you can analyze data where it is stored in your data lake or AWS databases. This capability provides flexibility to meet new analytics requirements without the cost, time, or complexity of moving large volumes of data between solutions.

• Answer E is incorrect – Amazon Redshift includes enterprise-level security and compliance features.

Djamga Data Sciences Big Data – Data Analytics Youtube Playlist

2- Prepare for Your AWS Certification Exam

3- LinuxAcademy

Big Data – Data Analytics Jobs:

Big Data – Data Analytics – Data Sciences Latest News:

AWS DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

DATA ANALYTICS Q&A:

Can I know what is big data?

(MySQL) When I save an invoice, the records save in more than 5 tables (ex: invoice, stock, etc). Save, if only, all the tables received all the data successfully otherwise reject that invoice which should not impact any table?

How can I become a data scientist?

Data analysis made easy: Text2Code for Jupyter notebook

What should be the learning sequence (among the following skills) to become a Data scientist – Tableau, Data Analysis, SQL, Data Science and Python (please indicate if any skill(s) is redundant?

What’s the best thing a data scientist can do for his career?

What is the best way to learn SQL for data science?

Learn Deep Learning With Python Tutorials Complete Guide…

How can I become a data scientist?

What are some struggles for people who are trying to break into the Data Science field?

Why should a data scientist work in a startup?

Why does a Data scientist needs to learn SQL when all the data exploration and data munging can be done in R or Python?

As data science changes over the next decade, what are the most crucial skills to learn to stay relevant?

Is Machine Learning a good choice for a career?

How do I learn more advanced SQL?

How can I become a data scientist?

Where can I intuitively learn SQL Joins specifically?

What is the most overrated thing about data science?

Is SQL an easy programming language?

Can a position as a data analyst lead to a position as a data scientist?

In layman’s terms (you naughty experts), can you explain how huge corporations like Google, Microsoft, Facebook, etc. upgrade their data centres and servers?

What are some of the natural careers people transition into after 5-7 years in data science/analytics?

Why would people, especially data scientists (maybe those in the industry) use SQL, rather than Pandas?

How much level of SQL knowledge is required for data analytics?

How do I learn data science (step-by-step)?

How do I get data science, computer vision, and machine learning tutorial apps?

Roughly speaking, how long should it take a person with no programming background (who already knows probability & statistics) to learn how to use R and Python for data analytics well enough to land a decent job as a data analyst?

How can I become a data scientist or a data analyst?

Does data science have good potential, or is it only hype?

What are some wrong ways to learn data science?

Is Data Science actually in demand? What skills can make me launch a fruitful career in Data Science?

Which are the best analytics and data visualization companies who implement Tableau?

Will the salary of data scientists decrease a lot in 2025? Is it worth starting a degree in it?

Will there be more data science and machine learning jobs in the future? I’m wondering if it is a good career choice.

Here’s a all list of the popular Data Science Tools used by Data Scientists !!!!…

How should you answer the question “How’s your SQL skills?” when asked by the data scientist hiring manager?

Are CS questions part of a data scientist interview at Facebook and are there interview questions position specific?

Can I work in data science if I don’t like programming?

How to organize data cleaning scripts

[/bg_collapse]

Clever Questions, Answers, Resources about:

Data Sciences
Big Data
Data Analytics
Data Sciences
Databases
Data Streams
Large DataSets

What Is a Data Scientist?

Data Scientist (n.): Person who is better at statistics than any software engineer and better at software engineering than any statistician. – Josh Wills
Data scientists apply sophisticated quantitative and computer science skills to both structure and analyze massive stores or continuous streams of unstructured data, with the intent to derive insights and prescribe action. – Burtch Works Data Science Salary Survey, May 2018
More than anything, what data scientists do is make discoveries while swimming in data… In a competitive landscape where challenges keep changing and data never stop flowing, data scientists help decision makers shift from ad hoc analysis to an ongoing conversation with data. – Data Scientist: The Sexiest Job of the 21st Century, Harvard Business Review

Do All Data Scientists Hold Graduate Degrees?

Data scientists are highly educated. With exceedingly rare exception, every data scientist holds at least an undergraduate degree. 91% of data scientists in 2018 held advanced degrees. The remaining 9% all held undergraduate degrees. Furthermore,

25% of data scientists hold a degree in statistics or mathematics,
20% have a computer science degree,
an additional 20% hold a degree in the natural sciences, and
18% hold an engineering degree.

The remaining 17% of surveyed data scientists held degrees in business, social science, or economics.

How Are Data Scientists Different From Data Analysts?

Broadly speaking, the roles differ in scope: data analysts build reports with narrow, well-defined KPIs. Data scientists often to work on broader business problems without clear solutions. Data scientists live on the edge of the known and unknown.

We’ll leave you with a concrete example: A data analyst cares about profit margins. A data scientist at the same company cares about market share.

How Is Data Science Used in Medicine?

Data science in healthcare best translates to biostatistics. It can be quite different from data science in other industries as it usually focuses on small samples with several confounding variables.

How Is Data Science Used in Manufacturing?

Data science in manufacturing is vast; it includes everything from supply chain optimization to the assembly line.

What are data scientists paid?

Most people are attracted to data science for the salary. It’s true that data scientists garner high salaries compares to their peers. There is data to support this: The May 2018 edition of the BurtchWorks Data Science Salary Survey, annual salary statistics were

Note the above numbers do not reflect total compensation which often includes standard benefits and may include company ownership at high levels.

How will data science evolve in the next 5 years?

Will AI replace data scientists?

What is the workday like for a data scientist?

It’s common for data scientists across the US to work 40 hours weekly. While company culture does dictate different levels of work life balance, it’s rare to see data scientists who work more than they want. That’s the virtue of being an expensive resource in a competitive job market.

How do I become a Data Scientist?

The roadmap given to aspiring data scientists can be boiled down to three steps:

Earning an undergraduate and/or advanced degree in computer science, statistics, or mathematics,
Building their portfolio of SQL, Python, and R skills, and
Getting related work experience through technical internships.

All three require a significant time and financial commitment.

There used to be a saying around datascience: The road into a data science starts with two years of university-level math.

What Should I Learn? What Order Do I Learn Them?

This answer assumes your academic background ends with a HS diploma in the US.

Python
Differential Calculus
Integral Calculus
Multivariable Calculus
Linear Algebra
Probability
Statistics

Some follow up questions and answers:

Why Python first?

Python is a general purpose language. R is used primarily by statisticians. In the likely scenario that you decide data science requires too much time, effort, and money, Python will be more valuable than your R skills. It’s preparing you to fail, sure, but in the same way a savings account is preparing you to fail.

When do I start working with data?

You’ll start working with data when you’ve learned enough Python to do so. Whether you’ll have the tools to have any fun is a much more open-ended question.

How long will this take me?

Assuming self-study and average intelligence, 3-5 years from start to finish.

How Do I Learn Python?

If you don’t know the first thing about programming, start with MIT’s course in the curated list.

These modules are the standard tools for data analysis in Python:

pandas (and by extension, numpy)Check out Minimally Sufficient Pandas for style guides and best practices.
matplotlib and seaborn See /u/rhiever’s response to How do you decide between the plotting libraries: Matplotlib, Seaborn, Bokeh?Don’t worry about bokeh or dash unless you have a personal interest in interactive visualizations.
scipy and scikit-learnInternalize the .fit() and .predict() pattern.

Curated Threads & Resources

MIT’s Introduction to Computer Science and Programming in Python A free, archived course taught at MIT in the fall 2016 semester.
Data Scientist with Python Career Track | DataCamp The first courses are free, but unlimited access costs $29/month. Users usually report a positive experience, and it’s one of the better hands-on ways to learn Python.
Sentdex’s (Harrison Kinsley) Youtube Channel Related to Python Programming Tutorials
/r/learnpython is an active sub and very useful for learning the basics.

How Do I Learn R?

If you don’t know the first thing about programming, start with R for Data Science in the curated list.

These modules are the standard tools for data analysis in Python:

Curated Threads & Resources

R for Data Science by Hadley WickhamA free ebook full of succinct code examples. Terrific for learning tidyverse syntax.Folks with some math background may prefer the free alternative, Introduction to Statistical Learning.
Data Scientist with R Career Track | DataCamp The first courses are free, but unlimited access costs $29/month. Users usually report a positive experience, and it’s one of the few hands-on ways to learn R.
R Inferno Learners with a CS background will appreciate this free handbook explaining how and why R behaves the way that it does.

How Do I Learn SQL?

Prioritize the basics of SQL. i.e. when to use functions like POW, SUM, RANK; the computational complexity of the different kinds of joins.

Concepts like relational algebra, when to use clustered/non-clustered indexes, etc. are useful, but (almost) never come up in interviews.

You absolutely do not need to understand administrative concepts like managing permissions.

Finally, there are numerous query engines and therefore numerous dialects of SQL. Use whichever dialect is supported in your chosen resource. There’s not much difference between them, so it’s easy to learn another dialect after you’ve learned one.

Curated Threads & Resources

The SQL Tutorial for Data Analysis | Mode.com
Introduction to Databases A Free MOOC supported by Stanford University.
SQL Queries for Mere MortalsA $30 book highly recommended by /u/karmanujan

How Do I Learn Calculus?

Fortunately (or unfortunately), calculus is the lament of many students, and so resources for it are plentiful. Khan Academy mimics lectures very well, and Paul’s Online Math Notes are a terrific reference full of practice problems and solutions.

Calculus, however, is not just calculus. For those unfamiliar with US terminology,

Calculus I is differential calculus.
Calculus II is integral calculus.
Calculus III is multivariable calculus.
Calculus IV is differential equations.

Differential and integral calculus are both necessary for probability and statistics, and should be completed first.

Multivariable calculus can be paired with linear algebra, but is also required.

Differential equations is where consensus falls apart. The short it is, they’re all but necessary for mathematical modeling, but not everyone does mathematical modeling. It’s another tool in the toolbox.

Curated Threads & Resources about Data Science and Data Analytics

Khan AcademyDifferential Calculus Integral Calculus Multivariable Calculus Differential Equations
Paul’s Online Math NotesDifferential Calculus Integral Calculus Multivariable Calculus

How Do I Learn Probability?

Probability is not friendly to beginners. Definitions are rooted in higher mathematics, notation varies from source to source, and solutions are frequently unintuitive. Probability may present the biggest barrier to entry in data science.

It’s best to pick a single primary source and a community for help. If you can spend the money, register for a university or community college course and attend in person.

The best free resource is MIT’s 18.05 Introduction to Probability and Statistics (Spring 2014). Leverage /r/learnmath, /r/learnmachinelearning, and /r/AskStatistics when you get inevitably stuck.

How Do I Learn Linear Algebra?

Curated Threads & Resources https://www.youtube.com/watch?v=fNk_zzaMoSs&index=1&list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab

What does the typical data science interview process look like?

For general advice, Mastering the DS Interview Loop is a terrific article. The community discussed the article here.

Briefly summarized, most companies follow a five stage process:

Coding Challenge: Most common at software companies and roles contributing to a digital product.
HR Screen
Technical Screen: Often in the form of a project. Less frequently, it takes the form of a whiteboarding session at the onsite.
Onsite: Usually the project from the technical screen is presented here, followed by a meeting with the director overseeing the team you’ll join.
Negotiation & Offer

Mastering the DS Interview Loop

Preparation:

Practice questions on Leetcode which has both SQL and traditional data structures/algorithm questions
Review Brilliant for math and statistics questions.
SQL Zoo and Mode Analytics both offer various SQL exercises you can solve in your browser.

Tips:

Before you start coding, read through all the questions. This allows your unconscious mind to start working on problems in the background.
Start with the hardest problem first, when you hit a snag, move to the simpler problem before returning to the harder one.
Focus on passing all the test cases first, then worry about improving complexity and readability.
If you’re done and have a few minutes left, go get a drink and try to clear your head. Read through your solutions one last time, then submit.
It’s okay to not finish a coding challenge. Sometimes companies will create unreasonably tedious coding challenges with one-week time limits that require 5–10 hours to complete. Unless you’re desperate, you can always walk away and spend your time preparing for the next interview.

Remember, interviewing is a skill that can be learned, just like anything else. Hopefully, this article has given you some insight on what to expect in a data science interview loop.

The process also isn’t perfect and there will be times that you fail to impress an interviewer because you don’t possess some obscure piece of knowledge. However, with repeated persistence and adequate preparation, you’ll be able to land a data science job in no time!

What does the Airbnb data science interview process look like? [Coming soon]

What does the Facebook data science interview process look like? [Coming soon]

What does the Uber data science interview process look like? [Coming soon]

What does the Microsoft data science interview process look like? [Coming soon]

What does the Google data science interview process look like? [Coming soon]

What does the Netflix data science interview process look like? [Coming soon]

What does the Apple data science interview process look like? [Coming soon]

Question: How is SQL used in real data science jobs?

Real life enterprise databases are orders of magnitude more complex than the “customers, products, orders” examples used as teaching tools. SQL as a language is actually, IMO, a relatively simple language (the db administration component can get complex, but mostly data scientists aren’t doing that anyways). SQL is an incredibly important skill though for any DS role. I think when people emphasize SQL, what they really are talking about is the ability to write queries that interrogate the data and discover the nuances behind how it is collected and/or manipulated by an application before it is written to the dB. For example, is the employee’s phone number their current phone number or does the database store a history of all previous phone numbers? Critically important questions for understanding the nature of your data, and it doesn’t necessarily deal with statistics! The level of syntax required to do this is not that sophisticated, you can get pretty damn far with knowledge of all the joins, group by/analytical functions, filtering and nesting queries. In many cases, the data is too large to just select * and dump into a csv to load into pandas, so you start with SQL against the source. In my mind it’s more important for “SQL skills” to know how to generate hypotheses (that will build up to answering your business question) that can be investigated via a query than it is to be a master of SQL’s syntax. Just my two cents though!

AWS DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

Data Visualization example: 12000 Years of Human Population Dynamic

[OC] 12,000 years of human population dynamics from dataisbeautiful

Human population density estimates based on the Hyde 3.2 model.

Data visualization example: AirPods Revenue vs. Top Tech Companies

[OC] AirPods Revenue vs. Top Tech Companies from dataisbeautiful

Source: 24/7 Kevin Rooke, Google Search, SEC Edgar

Data visualization example: Crypto race: DOGE (red) versus BTC (blue), 5/6/2020 – 5/5/2021

[OC] Crypto race: DOGE (red) versus BTC (blue), 5/6/2020 – 5/5/2021 from dataisbeautiful

Data sources: Coindesk-Bitcoin ; Coindesk-Dodgecoin

Data visualization example: How have cryptocurrencies done during the Pandemic?

[OC] How have cryptocurrencies done during the Pandemic? from dataisbeautiful

Data source: Performance data on these cryptocurrencies from Investing.com which provides free historic data

Data visualization example: Countries with the Most Nuclear Warheads

[OC] Countries with the Most Nuclear Warheads from dataisbeautiful

Data Source: Here

AWS Data Analytics Specialty Certification DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

Capitol insurrection arrests per million people by state

[OC] Capitol insurrection arrests per million people by state from dataisbeautiful

Data Source: Made in Google Sheets using data from this USA Today article (for the number of arrests by arrestee’s home state) and this spreadsheet of the results of the 2020 Census (for the population of each state and DC in 2020, which was used as the denominator in calculating arrests/million people).

For more information about analytics architecture, visit the AWS Big Data Blog: AWS serverless data analytics pipeline reference architecture here

Basic Data Lake Architecture

Data Analytics Architecture on AWS

Data Analytics Process

AWS Data Analytics Specialty Certification DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

Data Lake Storage:

AWS DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

Event Driven Data Analytics Workflow on AWS

What is a Data Lake?

What is a Data Warehouse?

What are benefits of a data warehouse?

• Informed decision making

• Consolidated data from many sources

• Historical data analysis

• Data quality, consistency, and accuracy

• Separation of analytics processing from transactional databases

AWS Data Analytics Specialty Certification DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

Data Lake vs Data Warehouse – Comparison

A data warehouse is specially designed for data analytics, which identifies relationships and trends across large amounts of data. A database is used to capture and store data, such as the details of a transaction. Unlike a data warehouse, a data lake is a centralized repository for structured, semi-structured, and unstructured data. A data warehouse organizes data in a tabular format (or schema) that enables SQL queries on the data. But not all applications require data to be in tabular format. Some applications can access data in the data lake even if it is “semi-structured” or unstructured. These include big data analytics, full-text search, and machine learning.

An AWS data lake only has a storage charge for the data. No servers are necessary for the data to be stored and accessed. In the case of Amazon Athena, also, there are no additional charges for processing. Data warehouse enable fast queries of structured data from transactional systems for batch reports, business intelligence, and visualization use cases. A data lake stores data without regard to its structure. Data scientists, data analysts, and business analysts use the data lake. They support use cases such as machine learning, predictive analytics, and data discovery and profiling.

Transactional Data Ingestion

AWS Data Analytics Specialty Certification DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

Structured Query Language (SQL)

Data definition language (DDL) refers to the subset of SQL commands that define data structures and objects such as databases, tables, and views. DDL commands include the following:

• CREATE: used to create a new object.

• DROP: used to delete an object.

• ALTER: used to modify an object.

• RENAME: used to rename an object.

• TRUNCATE: used to remove all rows from a table without deleting the table itself.

AWS Data Analytics Specialty Certification DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

Data manipulation language (DML) refers to the subset of SQL commands that are used to work with data. DML commands include the following:

• SELECT: used to request records from one or more tables.

• INSERT: used to insert one or more records into a table.

• UPDATE: used to modify the data of one or more records in a table.

• DELETE: used to delete one or more records from a table.

• EXPLAIN: used to analyze and display the expected execution plan of a SQL statement.

• LOCK: used to lock a table from write operations (INSERT, UPDATE, DELETE) and prevent concurrent operations from conflicting with one another.

Data control language (DCL) refers to the subset of SQL commands that are used to configure permissions to objects. DCL commands include:

• GRANT: used to grant access and permissions to a database or object in a database, such as a schema or table.

• REVOKE: used to remove access and permissions from a database or objects in a database.

AWS Data Analytics Specialty Certification DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

Comparison of OLTP and OLAP

AWS Data Analytics Specialty Certification DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

What is Amazon Macie?

Businesses are responsible to identify and limit disclosure of sensitive data such as personally identifiable information (PII) or proprietary information. Identifying and masking sensitive information is time consuming, and becomes more complex in data lakes with various data sources and formats and broad user access to published data sets.

Amazon Macie is a fully managed data security and privacy service that uses machine learning and pattern matching to discover sensitive data in AWS. Macie includes a set of managed data identifiers which automatically detect common types of sensitive data. Examples of managed data identifiers include keywords, credentials, financial information, health information, and PII. You can also configure custom data identifiers using keywords or regular expressions to highlight organizational proprietary data, intellectual property, and other specific scenarios. You can develop security controls that operate at scale to monitor and remediate risk automatically when Macie detects sensitive data. You can use AWS Lambda functions to automatically turn on encryption for an Amazon S3 bucket where Macie detects sensitive data. Or automatically tag datasets containing sensitive data, for inclusion in orchestrated data transformations or audit reports.

Amazon Macie can be integrated into the data ingestion and processing steps of your data pipeline. This approach avoids inadvertent disclosures in published data sets by detecting and addressing the sensitive data as it is ingested and processed. Building the automated detection and processing of sensitive data into your ETL pipelines simplifies and standardizes handling of sensitive data at scale.

AWS Data Analytics Specialty Certification DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

What is AWS Glue DataBrew?

AWS Glue DataBrew is a visual data preparation tool that simplifies cleaning and normalizing datasets in preparation for use in analytics and machine learning.

• Profile data quality, identifying patterns and automatically detecting anomalies.

• Clean and normalize data using over 250 pre-built transformations, without writing code.

• Visually map the lineage of your data to understand data sources and transformation history.

• Save data cleaning and normalization workflows for automatic application to new data.

Data processed in AWS Glue DataBrew is immediately available for use in analytics and machine learning projects.

Learn more about the built-in transformations available in AWS Glue DataBrew in the Recipe actions reference: https://docs.aws.amazon.com/databrew/latest/dg/recipe-actions-reference.html

AWS Data Analytics Specialty Certification DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

What is AWS Glue?

AWS Glue is a fully managed ETL (extract, transform, and load) service that makes it simple and cost-effective to categorize your data, clean it, enrich it, and move it reliably between various data stores and data streams. AWS Glue consists of a central metadata repository known as the AWS Glue Data Catalog, an ETL engine that automatically generates Python or Scala code, and a flexible scheduler that handles dependency resolution, job monitoring, and retries. AWS Glue can run your ETL jobs as new data arrives. For example, you can use an AWS Lambda function to trigger your ETL jobs to run as soon as new data becomes available in Amazon S3. You can also register this new dataset in the

AWS Glue Data Catalog as part of your ETL jobs.

AWS Glue is serverless, so there’s no infrastructure to set up or manage.

AWS Data Analytics Specialty Certification DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

AWS Glue Data Catalog The AWS Glue Data Catalog provides a uniform repository where disparate systems can store and find metadata to keep track of data in data silos, and use that metadata to query and transform the data. Once the data is cataloged, it is immediately available for search and query using Amazon Athena, Amazon EMR, and Amazon Redshift Spectrum.

You can use AWS Identity and Access Management (IAM) policies to control access to the data sources managed by the AWS Glue Data Catalog. The Data Catalog also provides comprehensive audit and governance capabilities, with schema-change tracking and data access controls.

AWS Glue crawler

AWS Glue crawlers can scan data in all kinds of repositories, classify it, extract schema information from it, and store the metadata automatically in the AWS Glue Data Catalog.

AWS Glue ETL

AWS Glue can run your ETL jobs as new data arrives. For example, you can use an AWS Lambda function to trigger your ETL jobs to run as soon as new data becomes available in Amazon S3. You can also register this new dataset in the AWS Glue Data Catalog as part of your ETL jobs.

AWS Glue Studio

AWS Glue Studio provides a graphical interface to create, run, and monitor extract, transform, and load (ETL) jobs in AWS Glue. You can visually compose data transformation workflows and seamlessly run them on AWS Glue’s Apache Spark-based serverless ETL engine. AWS Glue Studio also offers tools to monitor ETL workflows and validate that they are operating as intended.

AWS Data Analytics Specialty Certification DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

What is Amazon Athena?

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to set up or manage, and you can start analyzing data immediately. You don’t even need to load your data into Athena, it works directly with data stored in S3. To get started, just log into the Amazon Athena console, define your schema, and start querying. Athena uses Presto with full standard SQL support. It works with a variety of standard data formats, including CSV, JSON, ORC, Apache Parquet and Avro. While Athena is ideal for quick, ad-hoc querying, it can also handle complex analysis, including large joins, window functions, and arrays.

Amazon Athena helps you analyze data stored in Amazon S3. You can use Athena to run ad-hoc queries using ANSI SQL, without the need to aggregate or load the data into Athena. It can process unstructured, semi-structured, and structured datasets. Examples include CSV, JSON, Avro or columnar data formats such as Apache Parquet and Apache ORC. Athena integrates with Amazon QuickSight for easy visualization. You can also use Athena to generate reports or to explore data with business intelligence tools or SQL clients, connected via an ODBC or JDBC driver.

The tables and databases that you work with in Athena to run queries are based on metadata. Metadata is data about the underlying data in your dataset. How that metadata describes your dataset is called the schema. For example, a table name, the column names in the table, and the data type of each column are schema, saved as metadata, that describe an underlying dataset. In Athena, we call a system for organizing metadata a data catalog or a metastore. The combination of a dataset and the data catalog that describes it is called a data source.

The relationship of metadata to an underlying dataset depends on the type of data source that you work with. Relational data sources like MySQL, PostgreSQL, and SQL Server tightly integrate the metadata with the dataset. In these systems, the metadata is most often written when the data is written. Other data sources, like those built using Hive, allow you to define metadata on-the-fly when you read the dataset. The dataset can be in a variety of formats; for example, CSV, JSON, Parquet, or Avro.

AWS Data Analytics Specialty Certification DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

What is AWS Lake Formation?

Lake Formation is a fully managed service that enables data engineers, security officers, and data analysts to build, secure, manage, and use your data lake

To build your data lake in AWS Lake Formation, you must register an Amazon S3 location as a data lake. The Lake Formation service must have permission to write to the AWS Glue Data Catalog and to Amazon S3 locations in the data lake.

Next, identify the data sources to be ingested. AWS Lake formation can move data into your data lake from existing Amazon S3 data stores. Lake Formation can collect and organize datasets, such as logs from AWS CloudTrail, AWS CloudFront, detailed billing reports, or Elastic Load Balancing. You can ingest bulk or incremental datasets from relational, NoSQL, or non-relational databases. Lake Formation can ingest data from databases running in Amazon RDS or hosted in Amazon EC2. You can also ingest data from on-premises databases using Java Database Connectivity JDBC connectors. You can use custom AWS Glue jobs to load data from other databases or to ingest streaming data using Amazon Kinesis or Amazon DynamoDB.

AWS Lake Formation manages AWS Glue crawlers, AWS Glue ETL jobs, the AWS Glue Data Catalog, security settings, and access control:

• Lake Formation is an automated build environment based on AWS Glue.

• Lake Formation coordinates AWS Glue crawlers to identify datasets within the specified data stores and collect metadata for each dataset

• Lake Formation can perform transformations on your data, such as rewriting and organizing data into a consistent, analytics-friendly format. Lake Formation creates transformation templates and schedules AWS Glue jobs to prepare and optimize your data for analytics. Lake Formation also helps clean your data using FindMatches, an ML-based deduplication transform. AWS Glue jobs encapsulate scripts, such as ETL scripts, which connect to source data, process it, and write it out to a data target. AWS Glue triggers can start jobs based on a schedule or event, or on demand. AWS Glue workflows orchestrate AWS ETL jobs, crawlers, and triggers. You can define a workflow manually or use a blueprint based on commonly ingested data source types.

• The AWS Glue Data Catalog within the data lake persistently stores the metadata from raw and processed datasets. Metadata about data sources and targets is in the form of databases and tables. Tables store information about the underlying data, including schema information, partition information, and data location. Databases are collections of tables. Each AWS account has one data catalog per AWS Region.

• Lake Formation provides centralized access controls for your data lake, including security policy-based rules for users and applications by role. You can authenticate the users and roles using AWS IAM. Once the rules are defined, Lake Formation enforces them with table-and column-level granularity for users of Amazon Redshift Spectrum and Amazon Athena. Rules are enforced at the table-level in AWS Glue, which is normally accessed for administrators.

• Lake Formation leverages the encryption capabilities of Amazon S3 for data in the data lake. This approach provides automatic server-side encryption with keys managed by the AWS Key Management Service (KMS). S3 encrypts data in transit when replicating across Regions. You can separate accounts for source and destination Regions to further protect your data

AWS Data Analytics Specialty Certification DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

What is Amazon Quicksight?

Amazon QuickSight is a cloud-scale business intelligence (BI) service. In a single data dashboard, QuickSight gives decision-makers the opportunity to explore and interpret information in an interactive visual environment. QuickSight can include AWS data, third-party data, big data, spreadsheet data, SaaS data, B2B data, and more. QuickSight delivers fast and responsive query performance by using a robust in-memory engine (SPICE).

Scale from tens to tens of thousands of users

Amazon QuickSight has a serverless architecture that automatically scales to tens of thousands of users without the need to setup, configure, or manage your own servers.

Embed BI dashboards in your applications

With QuickSight, you can quickly embed interactive dashboards into your applications, websites, and portals.

Access deeper insights with Machine Learning

QuickSight leverages the proven machine learning (ML) capabilities of AWS. BI teams can perform advanced analytics without prior data science experience.

Ask questions of your data, receive answers

With QuickSight, you can quickly get answers to business questions asked in natural language with QuickSight’s new ML-powered natural language query capability, Q.

AWS Data Analytics Specialty Certification DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

What is SPICE?

SPICE is the Super-fast, Parallel, In-memory Calculation Engine in QuickSight. SPICE is engineered to rapidly perform advanced calculations and serve data. The storage and processing capacity available in SPICE speeds up the analytical queries that you run against your imported data. By using SPICE, you save time because you don’t need to retrieve the data every time that you change an analysis or update a visual.

When you import data into a dataset rather than using a direct SQL query, it becomes SPICE data because of how it’s stored. SPICE is the Amazon QuickSight Super-fast, Parallel, In-memory Calculation Engine. It’s engineered to rapidly perform advanced calculations and serve data. In Enterprise edition, data stored in SPICE is encrypted at rest.

When you create or edit a dataset, you choose to use either SPICE or a direct query, unless the dataset contains uploaded files. Importing (also called ingesting) your data into SPICE can save time and money:

• Your analytical queries process faster.

• You don’t need to wait for a direct query to process.

• Data stored in SPICE can be reused multiple times without incurring additional costs. If you use a data source that charges per query, you’re charged for querying the data when you first create the dataset and later when you refresh the dataset.

AWS Data Analytics Specialty Certification DAS-C01 Exam Prep on iOS

AWS DAS-C01 Exam Prep on android

AWS DAS-C01 Exam Prep on Windows

Serverless data lake reference architecture:

You can use AWS services as building blocks to build serverless data lakes and analytics pipelines. You can apply best practices on how to ingest, store, transform, and analyze structured and unstructured data at scale. Achieve the scale without needing to manage any storage or compute infrastructure. A decoupled, component-driven architecture allows you to start small and scale out slowly. You can quickly add new purpose-built components to one of six architecture layers to address new requirements and data sources.

This data lake-centric architecture can support business intelligence (BI) dashboarding, interactive SQL queries, big data processing, predictive analytics, and machine learning use cases.

• The ingestion layer includes protocols to support ingestion of structured, unstructured, or streaming data from a variety of sources.

• The storage layer provides durable, scalable, secure, and cost-effective storage of datasets across ingestion and processing.

• The landing zone stores data as ingested.

• Data engineers run initial quality checks to validate and cleanse data in the landing zone, producing the raw dataset.

• The processing layer creates curated datasets by further cleansing, normalizing, standardizing, and enriching data from the raw zone. The curated dataset is typically stored in formats that support performant and cost-effective access by the consumption layer.

• The catalog layer stores business and technical metadata about the datasets hosted in the storage layer.

• The consumption layer contains functionality for Search, Analytics, and Visualization. It integrates with the data lake storage, cataloging, and security layers. This integration supports analysis methods such as SQL, batch analytics, BI dashboards, reporting, and ML.

• The security and monitoring layer protects data within the storage layer and other resources in the data lake. This layer includes access control, encryption, network protection, usage monitoring, and auditing.

You can learn more about this reference architecture at AWS Big Data Blog: AWS serverless data analytics pipeline reference architecture: https://aws.amazon.com/blogs/big-data/aws-serverless-data-analytics-pipeline-reference-architecture/

What are Data Lakes Best Practices?

The main challenge with a data lake architecture is that raw data is stored with no oversight of the contents. To make the data usable, you must have defined mechanisms to catalog and secure the data. Without these mechanisms, data cannot be found or trusted, resulting in a “data swamp.” Meeting the needs of diverse stakeholders requires data lakes to have governance, semantic consistency, and access controls.

The Analytics Lens for the AWS Well-Architected Framework covers common analytics applications scenarios, including data lakes. It identifies key elements to help you architect your data lake according to best practices, including the following configuration notes:

• Decide on a location for data lake ingestion (that is, an S3 bucket). Select a frequency and isolation mechanism that meets your business needs.

• For Tier 2 Data, partition the data with keys that align to common query filter

. This enables pruning by common analytics tools that work on raw data files and increases performance

• Choose optimal file sizes to reduce Amazon S3 round trips during compute environment ingestion. Recommended: 512 MB – 1 GB in a columnar format (ORC/Parquet) per partition.

• Perform frequent scheduled compactions that align to the optimal file sizes noted previously. For example, compact into daily partitions if hourly files are too small.

• For data with frequent updates or deletes (that is, mutable data), either: o Temporarily store replicated data to a database like Amazon Redshift, Apache Hive, or Amazon RDS. Once the data becomes static, and then offload it to Amazon S3. Or, o Append the data to delta files per partition and compact it on a scheduled basis. You can use AWS Glue or Apache Spark on Amazon EMR for this processing.

With Tier 2 and Tier 3 Data being stored in Amazon S3, partition data using a high cardinality key. This is honored by Presto, Apache Hive, and Apache Spark and improves the query filter performance on that key

• Sort data in each partition with a secondary key that aligns to common filter queries. This allows query engines to skip files and get to requested data faster. For more information on the Analytics Lens for the AWS Well-Architected Framework, visit https://docs.aws.amazon.com/wellarchitected/latest/analytics-lens/data-lake.html

References:

For additional information on AWS data lakes and data analytics architectures, visit:

• AWS Well-Architected: Learn, measure, and build using architectural best practices: https://aws.amazon.com/architecture/well-architected

• AWS Lake Formation: Build a secure data lake in days: https://aws.amazon.com/lake-formation

• Getting Started with Amazon S3: https://aws.amazon.com/s3/getting-started

• Security in AWS Lake Formation: https://docs.aws.amazon.com/lake-formation/latest/dg/security.html

AWS Lake Formation: How It Works: https://docs.aws.amazon.com/lake-formation/latest/dg/how-it-works.html

• AWS Lake Formation Dashboard: https://us-west-2.console.aws.amazon.com/lakeformation

• Data Lake Storage on AWS: https://aws.amazon.com/products/storage/data-lake-storage/

• Building Big Data Storage Solutions (Data Lakes) for Maximum Flexibility: https://docs.aws.amazon.com/whitepapers/latest/building-data-lakes/building-data-lake-aws.html

• Data Ingestion Methods: https://docs.aws.amazon.com/whitepapers/latest/building-data-lakes/data-ingestion-methods.html

May 24, 2020September 23, 2022

Top 30 AWS Certified Developer Associate Exam Tips

AWS Certified Developer Associate Exam Prep Urls

Get the free app at: android: https://play.google.com/store/apps/details?id=com.awscertdevassociateexampreppro.enoumen

iOs: https://apps.apple.com/ca/app/aws-certified-developer-assoc/id1511211095

PRO version with mock exam android: https://play.google.com/store/apps/details?id=com.awscertdevassociateexampreppro.enoumen

PRO version with mock exam ios: https://apps.apple.com/ca/app/aws-certified-dev-ass-dva-c01/id1506519319t

Top 30 AWS Certified Developer Associate Exam Tips

Get the Read Aloud Wonderland Bedtime Adventure iBook here

Understand some basic code samples such as JSON code, config files, and IAM policy documents: Amazon EC2, Amazon Elastic Load Balancing, Amazon EC2 Auto Scaling, Amazon Simple Notification Service (SNS), AWS KMS, Amazon CloudTrail, AWS Organizations, Amazon Simple Workflow Service (SWF), and Amazon Virtual Private Cloud (VPC), Load Balancing, DynamoDB, EBS, Multi-AZ RDS, Aurora, EFS, DynamoDB, NLB, ALB, Aurora, Auto Scalling, DynamoDB(latency), Aurora(performance), Multi-AZ RDS(high availability), Throughput Optimized EBS (highly sequential), Read the quizlet note cards here
AWS topics for DVA-C01

What to study: LAMBDA [10-15% of Exam]Invocation types, Using notifications and event source mappings, Concurrency and throttling, X-Ray and Amazon SQS DLQs, Versions and aliases, Blue/green deployment, Packaging and deployment, VPC connections (with Internet/NAT GW), Lambda as ELB target, Dependencies, Environment variables (inc. encrypting them)
AWS topics for DVA-C01

What to study: DYNAMODB [10-12% of Exam]Scans vs queries (and the APIs, parameters you can use), Local and Global Secondary indexes, Calculating Read Capacity Units (RCUs) and Write, Capacity Units (WCUs), Performance / optimization best practices, Use cases (e.g. session state, key/value data store, scalability), DynamoDB Streams, Use in serverless app with Lambda and API Gateway, DynamoDB Accelerator (DAX) use cases
AWS topics for DVA-C01: DynamoDB

What to study: API Gateway [8-10% of Exam] Lambda / IAM / Cognito authorizers, Invalidation of cache, Integration types: proxy vs custom / AWS vs HTTP, Caching, Import / export OpenAPI Swagger specifications, Stage variables, Performance metrics,
AWS topics for DVA-C01: API Gateway

Advertise with us - Post Your Good Content Here
We are ranked in the Top 20 on Google

AI Dashboard is available on the Web, Apple, Google, and Microsoft, PRO version

What to study: COGNITO [7-8% of Exam] User pools vs Identity pools, Unauthenticated identities, AWS Cognito Sync, Using MFA with Cognito, Web identity federation,
AWS topics for DVA-C01: COGNITO

Invest in your future today by enrolling in this Azure Fundamentals - Pass the Azure Fundamentals Exam with Ease: Master the AZ-900 Certification with the Comprehensive Exam Preparation Guide!

What to study: S3 [7-8% of Exam]Encryption – make sure you understand S3 encryption very well for the exam, S3 Transfer Acceleration, Versioning, Copying data, Lifecycle rules,
AWS topics for DVA-C01

What to study: IAM IAM policies and roles, Cross account access, Multi-factor authentication (MFA), API calls, IAM Roles with EC2 (instance profiles), Access keys vs roles, IAM best practices, Federation,
AWS topics for DVA-C01: IAM

What to study: ECS Shared storage between containers, Single vs multi-docker environments, Uploading / downloading images with ECR, Placement strategies (e.g. spread, binpack, random etc.), Port mappings, Defining task definitions, IAM Roles for Tasks,
AWS topics for DVA-C01: ECS

What to study: ELASTIC BEANSTALK Deployment policies and blue/green, .ebextensions and config file usage, Updating deployments, Worker vs web tier, Deployment, packaging and files, code, commands used, Use cases,
AWS topics for DVA-C01: AMAZON ELASTIC BEANSTALK

What to study: CLOUDFORMATIONS CloudFormation template anatomy (e.g. mappings, outputs, parameters, etc.), Packaging and deployment including commands used, AWS Serverless Application Model (SAM),
AWS topics for DVA-C01

"Pass the AWS Cloud Practitioner Certification with flying colors: Master the Exam with 300+ Quizzes, Cheat Sheets, Flashcards, and Illustrated Study Guides - 2024 Edition"

What to study: AMAZON CLOUDWATCH Monitoring application logs, Trigger scheduled Lambda invocation, Custom metrics, Metric resolution,
AWS topics for DVA-C01: AMAZON CLOUDWATCH

What to study: CODECOMMIT, CODEBUILD, CODEDEPLOY, CODEPIPELINE, CODESTAR Know how each tool fits into the CI/CD pipeline, Various files used such as appspec.yml, buildspec.yml etc., Process for packaging and deployment, Deployment types with CodeDeploy including different , destination services (e.g. Lambda, ECS, EC2), Manual approvals with CodePipeline,
AWS topics for DVA-C01: CODECOMMIT, CODEBUILD, CODEDEPLOY, CODEPIPELINE, CODESTAR

"Become a Canada Expert: Ace the Citizenship Test and Impress Everyone with Your Knowledge of Canadian History, Geography, Government, Culture, People, Languages, Travel, Wildlife, Hockey, Tourism, Sceneries, Arts, and Data Visualization. Get the Top 1000 Canada Quiz Now!"

What to study: AMAZON CLOUDFRONT
AWS topics for DVA-C01: AMAZON CLOUDFRONT

What to study: X-RAYS X-Ray daemon, installing and configuring, Lambda with X-Ray, Use cases / benefits, Inclusion in Elastic Beanstalk environment, Annotations vs segments vs subsegments vs metadata, API calls, Port used (UDP 2000),
AWS topics for DVA-C01: X-RAYS

What to study: SQS Standard queues, FIFO, DLQ, delay queue, Decoupling applications use cases, Event source mapping to Lambda, Visibility timeout, Short polling vs long polling,
AWS topics for DVA-C01: SQS

Unlock the Secrets of Africa: Master African History, Geography, Culture, People, Cuisine, Economics, Languages, Music, Wildlife, Football, Politics, Animals, Tourism, Science and Environment with the Top 1000 Africa Quiz and Trivia. Get Yours Now!

What to study: ELASTICACHE Use cases (caching and session state), In-memory data store, Services it sits in front of (e.g. Amazon RDS), Comparison against DynamoDB DAX, Lazy loading vs Write Through Caching, Memcached vs Redis,
AWS topics for DVA-C01: ELASTICACHE

What to study: STEP FUNCTIONS Step Functions state machines, Using to coordinate multiple Lambda function invocations
AWS topics for DVA-C01: STEP FUNCTIONS

What to study: SSM PARAMETER STORE Storing credentials, Rotation (application needs to do it) – comparison with secrets manager (which handles rotation automatically),
SSM PARAMETER STORE

Dive into a comprehensive AWS Cloud Practitioner CLF-C02 Certification guide, masterfully weaving insights from Tutorials Dojo, Adrian Cantrill, Stephane Maarek, and AWS Skills Builder into one unified resource.

Know what instance types can be launched from which types of AMIs, and which instance types require an HVM AMI
AWS HVM AMI

Have a good understanding of how Route53 supports all of the different DNS record types, and when you would use certain ones over others.
Route 53 supports all of the different DNS record types

Know which services have native encryption at rest within the region, and which do not.
AWS Services with native Encryption at rest

Kinesis Sharding:
#AWS Kinesis Sharding

Handling SSL Certificates in ELB ( Wildcard certificate vs SNI )
#AWS Handling SSL Certificates in ELB ( Wildcard certificate vs SNI )

Different types of Aurora Endpoints
#AWS Different types of Aurora Endpoints

The Default Termination Policy for Auto Scaling Group (Oldest launch configuration vs Instance Protection)
#AWS Default Termination Policy for Auto Scaling Group

Cloud Certification made simple. Ace your exams with Djamgatech.

Use AWS Cheatsheets – I also found the cheatsheets provided by Tutorials Dojo very helpful. In my opinion, it is better than Jayendrapatil Patil’s blog since it contains more updated information that complements your review notes.
#AWS Cheat Sheet

Watch this exam readiness 3hr video, it very recent webinar this provides what is expected in the exam.
#AWS Exam Prep Video

Start off watching Ryan’s videos. Try and completely focus on the hands on. Take your time to understand what you are trying to learn and achieve in those LAB Sessions.
#AWS Exam Prep Video

Do not rush into completing the videos. Take your time and hone the basics. Focus and spend a lot of time for the back bone of AWS infrastructure – Compute/EC2 section, Storage (S3/EBS/EFS), Networking (Route 53/Load Balancers), RDS, VPC, Route 3. These sections are vast, with lot of concepts to go over and have loads to learn. Trust me you will need to thoroughly understand each one of them to ensure you pass the certification comfortably.
#AWS Exam Prep Video

Make sure you go through resources section and also AWS documentation for each components. Go over FAQs. If you have a question, please post it in the community. Trust me, each answer here helps you understand more about AWS.
#AWS Faqs

Like any other product/service, each AWS offering has a different flavor. I will take an example of EC2 (Spot/Reserved/Dedicated/On Demand etc.). Make sure you understand what they are, what are the pros/cons of each of these flavors. Applies for all other offerings too.
#AWS Services

Follow Neal K Davis on Linkedin and Read his updates about DVA-C01
#AWS Services

What is the AWS Certified Developer Associate Exam?

The AWS Certified Developer – Associate examination is intended for individuals who perform a development role and have one or more years of hands-on experience developing and maintaining an AWS-based application. It validates an examinee’s ability to:

Demonstrate an understanding of core AWS services, uses, and basic AWS architecture best practices
Demonstrate proficiency in developing, deploying, and debugging cloud-based applications using AWS

There are two types of questions on the examination:

Multiple-choice: Has one correct response and three incorrect responses (distractors).
Provide implementation guidance based on best practices to the organization throughout the lifecycle of the project.

Select one or more responses that best complete the statement or answer the question. Distractors, or incorrect answers, are response options that an examinee with incomplete knowledge or skill would likely choose. However, they are generally plausible responses that fit in the content area defined by the test objective. Unanswered questions are scored as incorrect; there is no penalty for guessing.

To succeed with the real exam, do not memorize the answers below. It is very important that you understand why a question is right or wrong and the concepts behind it by carefully reading the reference documents in the answers.

Top

AWS Certified Developer Associate info and details

The AWS Certified Developer Associate Exam is a multiple choice, multiple answer exam. Here is the Exam Overview:

Certification Name: AWS Certified Developer Associate.
Prerequisites for the Exam: None.
Exam Pattern: Multiple Choice Questions
Number of Questions: 65
Duration: 130 mins
Exam fees: US $150
Exam Guide on AWS Website
Available languages for tests: English, Japanese, Korean, Simplified Chinese
Read AWS whitepapers
Register for certification account here.
Prepare for Certification Here

Top

Additional Information for reference

Below are some useful reference links that would help you to learn about AWS Practitioner Exam.

Other Relevant and Recommended AWS Certifications

AWS Certification Exams Roadmap[/caption]

AWS Developer Associate Exam Whitepapers:

AWS has provided whitepapers to help you understand the technical concepts. Below are the recommended whitepapers.

AWS Well-Architected webpage (various whitepapers linked)

Top

Online Training and Labs for AWS Certified Developer Associate Exam

Top

AWS Certified Developer Associate Jobs

May 21, 2020January 24, 2023

Get the Read Aloud Wonderland Bedtime Adventure iBook here

Understand bastion hosts, and which subnet one might live on. Bastion hosts are instances that sit within your public subnet and are typically accessed using SSH or RDP. Once remote connectivity has been established with the bastion host, it then acts as a ‘jump’ server, allowing you to use SSH or RDP to login to other instances (within private subnets) deeper within your network. When properly configured through the use of security groups and Network ACLs, the bastion essentially acts as a bridge to your private instances via the Internet.”
Bastion Hosts

Know the difference between Directory Service’s AD Connector and Simple AD. Use Simple AD if you need an inexpensive Active Directory–compatible service with the common directory features. AD Connector lets you simply connect your existing on-premises Active Directory to AWS.
AD Connector and Simple AD

Know how to enable cross-account access with IAM: To delegate permission to access a resource, you create an IAM role that has two policies attached. The permissions policy grants the user of the role the needed permissions to carry out the desired tasks on the resource. The trust policy specifies which trusted accounts are allowed to grant its users permissions to assume the role. The trust policy on the role in the trusting account is one-half of the permissions. The other half is a permissions policy attached to the user in the trusted account that allows that user to switch to, or assume the role.
Enable cross-account access with IAM

Have a good understanding of how Route53 supports all of the different DNS record types, and when you would use certain ones over others.
Route 53 supports all of the different DNS record types

Advertise with us - Post Your Good Content Here
We are ranked in the Top 20 on Google

AI Dashboard is available on the Web, Apple, Google, and Microsoft, PRO version

Know which services have native encryption at rest within the region, and which do not.
AWS Services with native Encryption at rest

Invest in your future today by enrolling in this Azure Fundamentals - Pass the Azure Fundamentals Exam with Ease: Master the AZ-900 Certification with the Comprehensive Exam Preparation Guide!

Know which services allow you to retain full admin privileges of the underlying EC2 instances
EC2 Full admin privilege

Know When Elastic IPs are free or not: If you associate additional EIPs with that instance, you will be charged for each additional EIP associated with that instance per hour on a pro rata basis. Additional EIPs are only available in Amazon VPC. To ensure efficient use of Elastic IP addresses, we impose a small hourly charge when these IP addresses are not associated with a running instance or when they are associated with a stopped instance or unattached network interface.
When are AWS Elastic IPs Free or not?

Know what are the four high level categories of information Trusted Advisor supplies.
#AWS Trusted advisor

Know how to troubleshoot a connection time out error when trying to connect to an instance in your VPC. You need a security group rule that allows inbound traffic from your public IP address on the proper port, you need a route that sends all traffic destined outside the VPC (0.0.0.0/0) to the Internet gateway for the VPC, the network ACLs must allow inbound and outbound traffic from your public IP address on the proper port, etc.
#AWS Connection time out error

Be able to identify multiple possible use cases and eliminate non-use cases for SWF.
#AWS

"Pass the AWS Cloud Practitioner Certification with flying colors: Master the Exam with 300+ Quizzes, Cheat Sheets, Flashcards, and Illustrated Study Guides - 2024 Edition"

Understand how you might set up consolidated billing and cross-account access such that individual divisions resources are isolated from each other, but corporate IT can oversee all of it.
#AWS Set up consolidated billing

"Become a Canada Expert: Ace the Citizenship Test and Impress Everyone with Your Knowledge of Canadian History, Geography, Government, Culture, People, Languages, Travel, Wildlife, Hockey, Tourism, Sceneries, Arts, and Data Visualization. Get the Top 1000 Canada Quiz Now!"

Know how you would go about making changes to an Auto Scaling group, fully understanding what you can and can’t change. “You can only specify one launch configuration for an Auto Scaling group at a time, and you can’t modify a launch configuration after you’ve created it. Therefore, if you want to change the launch configuration for your Auto Scaling group, you must create a launch configuration and then update your Auto Scaling group with the new launch configuration. When you change the launch configuration for your Auto Scaling group, any new instances are launched using the new configuration parameters, but existing instances are not affected.
#AWS Make Change to Auto Scaling group

Know which field you use to run a script upon launching your instance.
#AWS User data script

Know how DynamoDB (durable, and you can pay for strong consistency), Elasticache (great for speed, not so durable), and S3 (eventual consistency results in lower latency) compare to each other in terms of durability and low latency.
#AWS DynamoDB consistency

Unlock the Secrets of Africa: Master African History, Geography, Culture, People, Cuisine, Economics, Languages, Music, Wildlife, Football, Politics, Animals, Tourism, Science and Environment with the Top 1000 Africa Quiz and Trivia. Get Yours Now!

Know the difference between bucket policies, IAM policies, and ACLs for use with S3, and examples of when you would use each. “With IAM policies, companies can grant IAM users fine-grained control to their Amazon S3 bucket or objects while also retaining full control over everything the users do. With bucket policies, companies can define rules which apply broadly across all requests to their Amazon S3 resources, such as granting write privileges to a subset of Amazon S3 resources. Customers can also restrict access based on an aspect of the request, such as HTTP referrer and IP address. With ACLs, customers can grant specific permissions (i.e. READ, WRITE, FULL_CONTROL) to specific users for an individual bucket or object.
#AWS Difference between bucket policies

Know when and how you can encrypt snapshots.
#AWS EBS Encryption

Dive into a comprehensive AWS Cloud Practitioner CLF-C02 Certification guide, masterfully weaving insights from Tutorials Dojo, Adrian Cantrill, Stephane Maarek, and AWS Skills Builder into one unified resource.

Understand how you can use ELB cross-zone load balancing to ensure even distribution of traffic to EC2 instances in multiple AZs registered with a load balancer.
#AWS ELB cross-zone load balancing

How would you allow users to log into the AWS console using active directory integration. Here is a link to some good reference material.
#AWS og into the AWS console using active directory integration

Spot instances are good for cost optimization, even if it seems you might need to fall back to On-Demand instances if you wind up getting kicked off them and the timeline grows tighter. The primary (but still not only) factor seems to be whether you can gracefully handle instances that die on you–which is pretty much how you should always design everything, anyway!
#AWS Spot instances

The term “use case” is not the same as “function” or “capability”. A use case is something that your app/system will need to accomplish, not just behaviour that you will get from that service. In particular, a use case doesn’t require that the service be a 100% turnkey solution for that situation, just that the service plays a valuable role in enabling it.
#AWS use case

There might be extra, unnecessary information in some of the questions (red herrings), so try not to get thrown off by them. Understand what services can and can’t do, but don’t ignore “obvious”-but-still-correct answers in favour of super-tricky ones.
#AWS Exam Answers: Distractors

If you don’t know what they’re trying to ask, in a question, just move on and come back to it later (by using the helpful “mark this question” feature in the exam tool). You could easily spend way more time than you should on a single confusing question if you don’t triage and move on.
#AWS Exa: Skip Questions that are vague and come back to them later

Some exam questions required you to understand features and use cases of: VPC peering, cross-account access, DirectConnect, snapshotting EBS RAID arrays, DynamoDB, spot instances, Glacier, AWS/user security responsibilities, etc.
#AWS

The 30 Day constraint in the S3 Lifecycle Policy before transitioning to S3-IA and S3-One Zone IA storage classes
#AWS S3 lifecycle policy

Cloud Certification made simple. Ace your exams with Djamgatech.

Enabling Cross-region snapshot copy for an AWS KMS-encrypted cluster
Redis Auth / Amazon MQ / IAM DB Authentication
#AWS Cross-region snapshot copy for an AWS KMS-encrypted cluster

Know that FTP is using TCP and not UDP (Helpful for questions where you are asked to troubleshoot the network flow)
TCP and UDP

Know the Difference between S3, EBS and EFS
#AWS Difference between S3, EBS and EFS

Kinesis Sharding:
#AWS Kinesis Sharding

Handling SSL Certificates in ELB ( Wildcard certificate vs SNI )
#AWS Handling SSL Certificates in ELB ( Wildcard certificate vs SNI )

Difference between OAI, Signed URL (CloudFront) and Pre-signed URL (S3)
#AWS Difference between OAI, Signed URL (CloudFront) and Pre-signed URL (S3)

Different types of Aurora Endpoints
#AWS Different types of Aurora Endpoints

The Default Termination Policy for Auto Scaling Group (Oldest launch configuration vs Instance Protection)
#AWS Default Termination Policy for Auto Scaling Group

Watch Acloud Guru Videos Lectures while commuting / lunch break – Reschedule the exam if you are not yet ready
#AWS ACloud Guru

Watch Linux Academy Videos Lectures while commuting / lunch break – Reschedule the exam if you are not yet ready
#AWS Linux Academy

Watch Udemy Videos Lectures while commuting / lunch break – Reschedule the exam if you are not yet ready
#AWS Linux Academy

The Udemy practice test interface is good that it pinpoints your weak areas, so what I did was to re-watch all the videos that I got the wrong answers. Since I was able to gauge my exam readiness, I decided to reschedule my exam for 2 more weeks, to help me focus on completing the practice tests.
#AWS Udemy

Watch this exam readiness 3hr video, it very recent webinar this provides what is expected in the exam.
#AWS Exam Prep Video

Start off watching Ryan’s videos. Try and completely focus on the hands on. Take your time to understand what you are trying to learn and achieve in those LAB Sessions.
#AWS Exam Prep Video

Ensure to attend all quizzes after each section. Please do not treat these quizzes as your practice exams. These quizzes are designed to mostly test your knowledge on the section you just finished. The exam itself is designed to test you with scenarios and questions, where in you will need to recall and apply your knowledge of different AWS technologies/services you learn over multiple lectures.
#AWS Services

I, personally, do not recommend to attempt a practice exam or simulator exam until you have done all of the above. It was a little overwhelming for me. I had thoroughly gone over the videos. And understood the concepts pretty well, but once I opened exam simulator I felt the questions were pretty difficult. I also had a feeling that videos do not cover lot of topics. But later I realized, given the vastness of AWS Services and offerings it is really difficult to encompass all these services and their details in the course content. The fact that these services keep changing so often, does not help
#AWS Services

Go back and make a note of all topics, that you felt were unfamiliar for you. Go through the resources section and fiund links to AWS documentation. After going over them, you shoud gain at least 5-10% more knowledge on AWS. Have expectations from the online courses as a way to get thorough understanding of basics and strong foundations for your AWS knowledge. But once you are done with videos. Make sure you spend a lot of time on AWS documentation and FAQs. There are many many topics/sub topics which may not be covered in the course and you would need to know, atleast their basic functionalities, to do well in the exam.
#AWS Services

Once you start taking practice exams, it may seem really difficult at the beginning. So, please do not panic if you find the questions complicated or difficult. IMO they are designed or put in a way to sound complicated but they are not. Be calm and read questions very carefully. In my observation, many questions have lot of information which sometimes is not relevant to the solution you are expected to provide. Read the question slowly and read it again until you understand what is expected out of it.
#AWS Services

With each practice exam you will come across topics that you may need to scale your knowledge on or learn them from scratch.
#AWS Services

With each test and the subsequent revision, you will surely feel more confident.
There are 130 mins for questions. 2 mins for each question which is plenty of time.
At least take 8-10 practice tests. The ones on udemy/tutorialdojo are really good. If you are a acloudguru member. The exam simulator is really good.
Manage your time well. Keep patience. I saw someone mention in one of the discussions that do not under estimate the mental focus/strength needed to sit through 130 mins solving these questions. And it is really true.
Do not give away or waste any of those precious 130 mins. While answering flag/mark questions you think you are not completely sure. My advice is, even if you finish early, spend your time reviewing the answers. I could review 40 of my answers at the end of test. And I at least rectified 3 of them (which is 4-5% of total score, I think)
So in short – Put a lot of focus on making your foundations strong. Make sure you go through AWS Documentation and FAQs. Try and envision how all of the AWS components can fit together and provide an optimal solution. Keep calm.
This video gives outline about exam, must watch before or after Ryan’s course. #AWS Services

Walking you through how to best prepare for the AWS Certified Solutions Architect Associate SAA-C02 exam in 5 steps:
1. Understand the exam blueprint
2. Learn about the new topics included in the SAA-C02 version of the exam
3. Use the many FREE resources available to gain and deepen your knowledge
4. Enroll in our hands-on video course to learn AWS in depth
5. Use practice tests to fully prepare yourself for the exam and assess your exam readiness
AWS CERTIFIED SOLUTIONS ARCHITECT SAA-C02 : HOW TO BEST PREPARE IN 5 STEPS

Storage:
1. Know your different Amazon S3 storage tiers! You need to know the use cases, features and limitations, and relative costs; e.g. retrieval costs.
2. Amazon S3 lifecycle policies is also required knowledge — there are minimum storage times in certain tiers that you need to know.
3. For Glacier, you need to understand what it is, what it’s used for, and what the options are for retrieval times and fees.
4. For the Amazon Elastic File System (EFS), make sure you’re clear which operating systems you can use with it (just Linux).
5. For the Amazon Elastic Block Store (EBS), make sure you know when to use the different tiers including instance stores; e.g. what would you use for a datastore that requires the highest IO and the data is distributed across multiple instances? (Good instance store use case)
6. Learn about Amazon FSx. You’ll need to know about FSx for Windows and Lustre.
7. Know how to improve Amazon S3 performance including using CloudFront, and byte-range fetches — check out this whitepaper.
8. Make sure you understand about Amazon S3 object deletion protection options including versioning and MFA delete.
AWS CERTIFIED SOLUTIONS ARCHITECT SAA-C02 : HOW TO BEST PREPARE IN 5 STEPS

Compute:
1. You need to have a good understanding of the options for how to scale an Auto Scaling Group using metrics such as SQS queue depth, or numbers of SNS messages.
2. Know your different Auto Scaling policies including Target Tracking Policies.
3. Read up on High Performance Computing (HPC) with AWS. You’ll need to know about Amazon FSx with HPC use cases.
4. Know your placement groups. Make sure you can differentiate between spread, cluster and partition; e.g. what would you use for lowest latency? What about if you need to support an app that’s tightly coupled? Within an AZ or cross AZ?
5. Make sure you know the difference between Elastic Network Adapters (ENAs), Elastic Network Interfaces (ENIs) and Elastic Fabric Adapters (EFAs).
6. For the Amazon Elastic Container Service (ECS), make sure you understand how to assign IAM policies to ECS for providing S3 access. How can you decouple an ECS data processing process — Kinesis Firehose or SQS?
7. Make sure you’re clear on the different EC2 pricing models including Reserved Instances (RI) and the different RI options such as scheduled RIs.
8. Make sure you know the maximum execution time for AWS Lambda (it’s currently 900 seconds or 15 minutes).
AWS CERTIFIED SOLUTIONS ARCHITECT SAA-C02 : HOW TO BEST PREPARE IN 5 STEPS

Network
1. Understand what AWS Global Accelerator is and its use cases.
2. Understand when to use CloudFront and when to use AWS Global Accelerator.
3. Make sure you understand the different types of VPC endpoint and which require an Elastic Network Interface (ENI) and which require a route table entry.
4. You need to know how to connect multiple accounts; e.g. should you use VPC peering or a VPC endpoint?
5. Know the difference between PrivateLink and ClassicLink.
6. Know the patterns for extending a secure on-premises environment into AWS.
7. Know how to encrypt AWS Direct Connect (you can use a Virtual Private Gateway / AWS VPN).
8. Understand when to use Direct Connect vs Snowball to migrate data — lead time can be an issue with Direct Connect if you’re in a hurry.
9. Know how to prevent circumvention of Amazon CloudFront; e.g. Origin Access Identity (OAI) or signed URLs / signed cookies.
AWS CERTIFIED SOLUTIONS ARCHITECT SAA-C02 : HOW TO BEST PREPARE IN 5 STEPS

Databases
1. Make sure you understand Amazon Aurora and Amazon Aurora Serverless.
2. Know which RDS databases can have Read Replicas and whether you can read from a Multi-AZ standby.
3. Know the options for encrypting an existing RDS database; e.g. only at creation time otherwise you must encrypt a snapshot and create a new instance from the snapshot.
4. Know which databases are key-value stores; e.g. Amazon DynamoDB.
AWS CERTIFIED SOLUTIONS ARCHITECT SAA-C02 : HOW TO BEST PREPARE IN 5 STEPS

Application Integration
1. Make sure you know the use cases for the Amazon Simple Queue Service (SQS), and Simple Notification Service (SNS).
2. Understand the differences between Amazon Kinesis Firehose and SQS and when you would use each service.
3. Know how to use Amazon S3 event notifications to publish events to SQS — here’s a good “How To” article.
AWS CERTIFIED SOLUTIONS ARCHITECT SAA-C02 : HOW TO BEST PREPARE IN 5 STEPS

Management and Governance
1. You’ll need to know about AWS Organizations; e.g. how to migrate an account between organizations.
2. For AWS Organizations, you also need to know how to restrict actions using service control policies attached to OUs.
3. Understand what AWS Resource Access Manager is.
AWS CERTIFIED SOLUTIONS ARCHITECT SAA-C02 : HOW TO BEST PREPARE IN 5 STEPS

Jon Bonso list of helpful exam prep materials that you can use.
1. The official AWS SAA-C02 Certification Exam page.
2. The official AWS Exam Guide.
3. The official AWS Sample Questions
4. The official AWS Ramp-Up Guide: Architect PDF
5. Tutorials Dojo SAA-C02 Study Guide
6. Udemy Practice Exams
7. New AWS Services to prepare for:AWS Global Accelerator
8. New AWS Services to prepare for: Elastic Fabric Adapter — Amazon Web Services
9. New AWS Services to prepare for: AWS ParallelCluster – Amazon Web Services
10. New AWS Services to prepare for: Amazon FSx File Storage
Pass your SAA-C02 (AWS Solutions Architect Associate) exam with these Top 5 Resources

About this App

The AWS Certified Solution Architect Associate Examination reparation and Readiness Quiz App (SAA-C01, SAA-C01, SAA) Prep App helps you prepare and train for the AWS Certification Solution Architect Associate Exam with various questions and answers dumps.

This App provide updated Questions and Answers, an Intuitive Responsive Interface allowing to browse questions horizontally and browse tips and resources vertically after completing a quiz.

Features:

100+ Questions and Answers updated frequently to get you AWS certified.
Quiz with score tracker, countdown timer, highest score saving. Vie Answers after completing the quiz for each category.
Can only see answers after completing the quiz.
Show/Hide button option for answers. Link to PRO Version to see all answers for each category
Ability to navigate through questions for each category using next and previous button.
Resource info page about the answer for each category and Top 60 Tips to succeed in the exam.
Prominent Cloud Evangelist latest tweets and Technology Latest News Feed
The app helps you study and practice from your mobile device with an intuitive interface.
SAA-C01 and SAA-C02 compatible
Resource info page about the answer for each category.
Helps you study and practice from your mobile device with an intuitive interface.

The questions and Answers are divided in 4 categories:

Design High Performing Architectures,
Design Cost Optimized Architectures,
Design Secure Applications And Architectures,
Design Resilient Architecture,

The questions and answers cover the following topics: AWS VPC, S3, DynamoDB, EC2, ECS, Lambda, API Gateway, CloudWatch, CloudTrail, Code Pipeline, Code Deploy, TCO Calculator, AWS S3, AWS DynamoDB, CloudWatch , AWS SES, Amazon Lex, AWS EBS, AWS ELB, AWS Autoscaling , RDS, Aurora, Route 53, Amazon CodeGuru, Amazon Bracket, AWS Billing and Pricing, AWS Simply Monthly Calculator, AWS cost calculator, Ec2 pricing on-demand, AWS Pricing, AWS Pay As You Go, AWS No Upfront Cost, Cost Explorer, AWS Organizations, Consolidated billing, Instance Scheduler, on-demand instances, Reserved instances, Spot Instances, CloudFront, Web hosting on S3, S3 storage classes, AWS Regions, AWS Availability Zones, Trusted Advisor, Various architectural Questions and Answers about AWS, AWS SDK, AWS EBS Volumes, EC2, S3, Containers, KMS, AWS read replicas, Cloudfront, API Gateway, AWS Snapshots, Auto shutdown Ec2 instances, High Availability, RDS, DynamoDB, Elasticity, AWS Virtual Machines, AWS Caching, AWS Containers, AWS Architecture, AWS Ec2, AWS S3, AWS Security, AWS Lambda, Bastion Hosts, S3 lifecycle policy, kinesis sharing, AWS KMS, Design High Performing Architectures, Design Cost Optimized Architectures, Design Secure Applications And Architectures, Design Resilient Architecture, AWS vs Azure vs Google Cloud, Resources, Questions, AWS, AWS SDK, AWS EBS Volumes, AWS read replicas, Cloudfront, API Gateway, AWS Snapshots, Auto shutdown Ec2 instances, High Availability, RDS, DynamoDB, Elasticity, AWS Virtual Machines, AWS Caching, AWS Containers, AWS Architecture, AWS Ec2, AWS S3, AWS Security, AWS Lambda, Load Balancing, DynamoDB, EBS, Multi-AZ RDS, Aurora, EFS, DynamoDB, NLB, ALB, Aurora, Auto Scaling, DynamoDB(latency), Aurora(performance), Multi-AZ RDS(high availability), Throughput Optimized EBS (highly sequential), SAA-CO1, SAA-CO2, Cloudwatch, CloudTrail, KMS, ElasticBeanstalk, OpsWorks, RPO vs RTO, HA vs FT, Undifferentiated Heavy Lifting, Access Management Basics, Shared Responsibility Model, Cloud Service Models, etc…

The resources sections cover the following areas: Certification, AWS training, Mock Exam Preparation Tips, Cloud Architect Training, Cloud Architect Knowledge, Cloud Technology, cloud certification, cloud exam preparation tips, cloud solution architect associate exam, certification practice exam, learn aws free, amazon cloud solution architect, question dumps, acloud guru links, tutorial dojo links, linuxacademy links, latest aws certification tweets, and post from reddit, quota, linkedin, medium, cloud exam preparation tips, aws cloud solution architect associate exam, aws certification practice exam, cloud exam questions, learn aws free, amazon cloud solution architect, amazon cloud certified solution architect associate exam questions, as certification dumps, google cloud, azure cloud, acloud, learn google cloud, learn azure cloud, cloud comparison, etc.

Abilities Validated by the Certification:

Effectively demonstrate knowledge of how to architect and deploy secure and robust applications on AWS technologies
Define a solution using architectural design principles based on customer requirements
Provide implementation guidance based on best practices to the organization throughout the life cycle of the project

Recommended Knowledge for the Certification:

One year of hands-on experience designing available, cost-effective, fault-tolerant, and scalable distributed systems on AWS.
Hands-on experience using compute, networking, storage, and database AWS services.
Hands-on experience with AWS deployment and management services.
Ability to identify and define technical requirements for an AWS-based application.
bility to identify which AWS services meet a given technical requirement.
Knowledge of recommended best practices for building secure and reliable applications on the AWS platform.
An understanding of the basic architectural principles of building in the AWS Cloud.
An understanding of the AWS global infrastructure.
An understanding of network technologies as they relate to AWS.
An understanding of security features and tools that AWS provides and how they relate to traditional services.

Note and disclaimer: We are not affiliated with AWS or Amazon or Microsoft or Google. The questions are put together based on the certification study guide and materials available online. We also receive questions and answers from anonymous users and we vet to make sure they are legitimate. The questions in this app should help you pass the exam but it is not guaranteed. We are not responsible for any exam you did not pass.

Important: To succeed with the real exam, do not memorize the answers in this app. It is very important that you understand why a question is right or wrong and the concepts behind it by carefully reading the reference documents in the answers.

Top

What is the AWS Certified Solution Architect Associate Exam?

This exam validates an examinee’s ability to effectively demonstrate knowledge of how to architect and deploy secure and robust applications on AWS technologies. It validates an examinee’s ability to:

Define a solution using architectural design principles based on customer requirements.
Multiple-response: Has two correct responses out of five options.

There are two types of questions on the examination:

Multiple-choice: Has one correct response and three incorrect responses (distractors).
Provide implementation guidance based on best practices to the organization throughout the lifecycle of the project.

Select one or more responses that best complete the statement or answer the question. Distractors, or incorrect answers, are response options that an examinee with incomplete knowledge or skill would likely choose. However, they are generally plausible responses that fit in the content area defined by the test objective. Unanswered questions are scored as incorrect; there is no penalty for guessing.

Top

AWS Certified Solution Architect Associate info and details

The AWS Certified Solution Architect Associate Exam is a multiple choice, multiple answer exam. Here is the Exam Overview:

Certification Name: AWS Certified Solution Architect Associate.
Prerequisites for the Exam: None.
Exam Pattern: Multiple Choice Questions
Duration: 130 mins
Exam fees: US $150
Exam Guide on AWS Website
Available languages for tests: English, Japanese, Korean, Simplified Chinese
Read AWS whitepapers
Register for certification account here.
Prepare for Certification Here

Top

Additional Information for reference

Below are some useful reference links that would help you to learn about AWS Practitioner Exam.

Other Relevant and Recommended AWS Certifications

Top 60 AWS Solution Architect Associate Exam Tips — AWS Certification Exams Roadmap

AWS Certification Exams Roadmap[/caption]

AWS Solution Architect Associate Exam Whitepapers:

AWS has provided whitepapers to help you understand the technical concepts. Below are the recommended whitepapers.

AWS Well-Architected webpage (various whitepapers linked)

Top

Online Training and Labs for AWS Certified Solution Architect Associate Exam

Top

AWS Certified Solution Architect Associate Jobs

AWS Certification and Training Apps for all platforms:

AWS Cloud practitioner FREE version:

AWS Certified Cloud practitioner for the web:pwa

AWS Certified Cloud practitioner Exam Prep App for iOS

AWS Certified Cloud practitioner Exam Prep App for Microsoft/Windows10

AWS Certified Cloud practitioner Exam Prep App for Android (Google Play Store)

AWS Certified Cloud practitioner Exam Prep App for Android (Amazon App Store)

AWS Certified Cloud practitioner Exam Prep App for Android (Huawei App Gallery)

AWS Solution Architect FREE version:

AWS Certified Solution Architect Associate Exam Prep App for iOS: https://apps.apple.com/ca/app/solution-architect-assoc-quiz/id1501225766

Solution Architect Associate for Android Google Play

AWS Certified Solution Architect Associate Exam Prep App for the eb: Pwa

AWS Certified Solution Architect Associate Exam Prep App for Amazon android

‪‬

AWS Certified Cloud practitioner Exam Prep App for Microsoft/Windows10

AWS Certified Cloud practitioner Exam Prep App for Huawei App Gallery

AWS Cloud Practitioner PRO Versions:

AWS Certified Cloud practitioner PRO Exam Prep App for iOS

AWS Certified Cloud Practitioner PRO Associate Exam Prep App for android google

AWS Certified Cloud practitioner Exam Prep App for Amazon android

AWS Certified Cloud practitioner Exam Prep App for Windows 10

AWS Certified Cloud practitioner Exam Prep PRO App for Android (Huawei App Gallery) Coming soon

AWS Solution Architect PRO

AWS Certified Solution Architect Associate PRO versions for iOS

AWS Certified Solution Architect Associate PRO Exam Prep App for Android google

AWS Certified Solution Architect Associate PRO Exam Prep App for Windows10

AWS Certified Solution Architect Associate PRO Exam Prep App for Amazon android

Huawei App Gallery: Coming soon

AWS Certified Developer Associates Free version:

AWS Certified Developer Associates for Android (Google Play)

AWS Certified Developer Associates Web/PWA

AWS Certified Developer Associates for iOs

AWS Certified Developer Associates for Android (Huawei App Gallery)

AWS Certified Developer Associates for windows 10 (Microsoft App store)

Amazon App Store: Coming soon

AWS Developer Associates PRO version

PRO version with mock exam for android (Google Play)

PRO version with mock exam ios

AWS Certified Developer Associates PRO for Android (Amazon App Store): Coming Soon

AWS Certified Developer Associates PRO for Android (Huawei App Gallery): Coming soon

June 7, 2019January 24, 2023

AWS Certification Exams Prep: Serverless Facts and Summaries and Question and Answers

AWS Serverless – Facts and summaries, Top 20 AWS Serverless Questions and Answers Dump

Definition 1: Serverless computing is a cloud-computing execution model in which the cloud provider runs the server, and dynamically manages the allocation of machine resources. Pricing is based on the actual amount of resources consumed by an application, rather than on pre-purchased units of capacity. It can be a form of utility computing.
Definition 2: AWS Serverless is the native architecture of the cloud that enables you to shift more of your operational responsibilities to AWS, increasing your agility and innovation. Serverless allows you to build and run applications and services without thinking about servers. It eliminates infrastructure management tasks such as server or cluster provisioning, patching, operating system maintenance, and capacity provisioning.

AWS Serverless Facts and summaries

Get the Read Aloud Wonderland Bedtime Adventure iBook here

The AWS Serverless Application Model (AWS SAM) is a model to define serverless applications. AWS SAM is natively supported by AWS CloudFormation and provides a simplified way of defining the Amazon API Gateway APIs, AWS Lambda functions, and Amazon DynamoDB tables needed by your serverless application.
You can use AWS CodePipeline with the AWS Serverless Application Model to automate building, testing, and deploying serverless applications. AWS CodeBuild integrates with CodePipeline to provide automated builds. You can use AWS CodeDeploy to gradually roll out and test new Lambda function versions.
You can monitor and troubleshoot the performance of your serverless applications and AWS Lambda functions with AWS services and third-party tools. Amazon CloudWatch helps you see real-time reporting metrics and logs for your serverless applications. You can use AWS X-Ray to debug and trace your serverless applications and AWS Lambda.
The AWS Serverless Application Repository is a managed repository for serverless applications. It enables teams, organizations, and individual developers to store and share reusable applications, and easily assemble and deploy serverless architectures in powerful new ways. Using the Serverless Application Repository, you don’t need to clone, build, package, or publish source code to AWS before deploying it. Instead, you can use pre-built applications from the Serverless Application Repository in your serverless architectures, helping you and your teams reduce duplicated work, ensure organizational best practices, and get to market faster.
Anyone with an AWS account can publish a serverless application to the Serverless Application Repository. Applications can be privately shared with specific AWS accounts. Applications that are shared publicly include a link to the application’s source code so others can view what the application does and how it works.
What kinds of applications are available in the AWS Serverless Application Repository? The AWS Serverless Application Repository includes applications for Alexa Skills, chatbots, data processing, IoT, real time stream processing, web and mobile back-ends, social media trend analysis, image resizing, and more from publishers on AWS.
The AWS Serverless Application Repository enables developers to publish serverless applications developed in a GitHub repository. Using AWS CodePipeline to link a GitHub source with the AWS Serverless Application Repository can make the publishing process even easier, and the process can be set up in minutes.
What two arguments does a Python Lambda handler function require?
Event, Context
A Lambda deployment package contains Function code and libraries not included within the runtime environment
When referencing the remaining time left for a Lambda function to run within the function’s code you would use The context object.
Long-running memory-intensive workloads is LEAST suited to AWS Lambda
The maximum execution duration of your Lambda functions is Fifteen Minutes
Logs for Lambda functions are Stored in AWS CloudWatch
Docker Container Images are constructed using instructions in a file called Dockerfile
The ECS Task Agent Is responsible for starting and stopping tasks. It runs inside the EC2 instance and reports on information like running tasks and resource utilization
AWS ECR Stores Container Images.
Elastic Beanstalk is used to Deploy and scale web applications and services developed with a supported platform
When deploying a simple Python web application with Elastic Beanstalk which of the following AWS resources will be created and managed for you by Elastic Beanstalk?
An Elastic Load Balancer, an S3 Bucket, an Ec2 instance.
When using Elastic Beanstalk you can deploy your web applications by:
- Configuring a git repository with Elastic Beanstalk so that changes will be detected and your application will be updated.
- Uploading code files to the Elastic Beanstalk service

Top
Reference: AWS Serverless

AWS LAMBDA EXPLAINED GRAPHICALLY:

Advertise with us - Post Your Good Content Here
We are ranked in the Top 20 on Google

AI Dashboard is available on the Web, Apple, Google, and Microsoft, PRO version

Invest in your future today by enrolling in this Azure Fundamentals - Pass the Azure Fundamentals Exam with Ease: Master the AZ-900 Certification with the Comprehensive Exam Preparation Guide!

AWS Serverless: Top 20 Questions and Answers Dump

Q00: You have created a serverless application which converts text in to speech using a combination of S3, API Gateway, Lambda, Polly, DynamoDB and SNS. Your users complain that only some text is being converted, whereas longer amounts of text does not get converted. What could be the cause of this problem?

A. Polly has built in censorship, so if you try and send it text that is deemed offensive, it will not generate an MP3.
B. You’ve placed your DynamoDB table in a single availability zone, which is currently down, causing an outage.
C. Your lambda function needs a longer execution time. You should check how long is needed in the fringe cases and increase the timeout inside the function to slightly longer than that.
D. AWS X-ray service is interfering with the application and should be disabled.

Answer: C

Reference: AWS Lambda limits

Top

Q1: How does API Gateway deal with legacy SOAP applications?

A. Converts the response from the application to REST
B. Converts the response from the application to HTML
C. Provides webservice passthrough for SOAP applications
D. Converts the response from the application to XML

Answer: C
SOAP Applications send their responses in XML format. API Gateway supports SOAP applications, but only provides passthrough. API Gateway does not transform or convert the responses.
Reference: How to configure Amazon API Gateway as a SOAP webservice passthrough in minutes

Top

Q3: You have launched a new web application on AWS using API Gateway, Lambda and S3. Someone post a thread to reddit about your application and it starts to go viral.
Your start receiving 100000 requests every second and you notice that most requests are similar.
Your web application begins to struggle. What can you do to optimize performance of your application?

A. Enable API Gateway Accelerator
B. Enable API Gateway caching to cache frequent requests.
C. Change your route53 allias record to point to AWS Neptune and then configure Neptune to filter your API requests to genuine requests only.
D. Migrate your API Gateway to an Network Load Balancer and enable session stickiness for all sessions.

Answer: B.

Reference: Amazon API Gateway FAQs

Top

Q4: Which of the following services does X-ray integrate with? (Choose 3)

A. Elastic Load Balancer
B. Lambda
C. S3
D. API Gateway

Answer: A. B. and D.
AWS X-Ray helps developers analyze and debug production, distributed applications, such as those built using a microservices architecture. With X-Ray, you can understand how your application and its underlying services are performing to identify and troubleshoot the root cause of performance issues and errors.
You can use X-Ray with applications running on EC2, ECS, Lambda, and Elastic Beanstalk. In addition, the X-Ray SDK automatically captures metadata for API calls made to AWS services using the AWS SDK. In addition, the X-Ray SDK provides add-ons for MySQL and PostgreSQL drivers.

"Pass the AWS Cloud Practitioner Certification with flying colors: Master the Exam with 300+ Quizzes, Cheat Sheets, Flashcards, and Illustrated Study Guides - 2024 Edition"

Reference: AWS X-Ray

Top

Q5: You are a developer for a busy real estate company and you want to enable other real estate agents to the
ability to show properties on your books, but skinned so that it looks like their own website. You decide the most efficient way to do this is to
expose your API to the public. The project works well, however one of your competitors starts abusing this, sending your API tens of thousands
of requests per second. This generates a HTTP 429 error. Each agent connects to your API using individual API Keys. What actions can you take to stop this behavior?

A. Use AWS Shield Advanced API protection to block the requests.
B. Deploy multiple API Gateways and give the agent access to another API Gateway.
C. Place an AWS Web Application Firewall in front of API gateway and filter requests.
D. Throttle the agents API access using the individual API Keys

"Become a Canada Expert: Ace the Citizenship Test and Impress Everyone with Your Knowledge of Canadian History, Geography, Government, Culture, People, Languages, Travel, Wildlife, Hockey, Tourism, Sceneries, Arts, and Data Visualization. Get the Top 1000 Canada Quiz Now!"

Answer: D.
Throttling ensures that API traffic is controlled to help your backend services maintain performance and availability.
How can I protect my backend systems and applications from traffic spikes?
Amazon API Gateway provides throttling at multiple levels including global and by service call. Throttling limits can be set for standard rates and bursts. For example, API owners can set a rate limit of 1,000 requests per second for a specific method in their REST APIs, and also configure Amazon API Gateway to handle a burst of 2,000 requests per second for a few seconds. Amazon API Gateway tracks the number of requests per second. Any requests over the limit will receive a 429 HTTP response. The client SDKs generated by Amazon API Gateway retry calls automatically when met with this response.

Reference: Amazon API Gateway FAQs

Top

Q6: You are developing a new application using serverless infrastructure and are using services such as S3, DynamoDB, Lambda, API Gateway, CloudFront, CloudFormation and Polly.
You deploy your application to production and your end users begin complaining about receiving a HTTP 429 error. What could be the cause of the error?

A. You enabled API throttling for a rate limit of 1000 requests per second while in development and now that you have deployed to production your API Gateway is being throttled.
B. Your cloudFormation stack is not valid and is failling to deploy properly which is causing a HTTP 429 error.
C. Your lambda function does not have sufficient permissions to read to DynamoDB and this is generating a HTTP 429 error.
D. You have an S3 bucket policy which is preventing lambda from being able to write tyo your bucket, generating a HTTP 429 error.

Answer: A.
Amazon API Gateway provides throttling at multiple levels including global and by service call. Throttling limits can be set for standard rates and bursts. For example, API owners can set a rate limit of 1,000 requests per second for a specific method in their REST APIs, and also configure Amazon API Gateway to handle a burst of 2,000 requests per second for a few seconds. Amazon API Gateway tracks the number of requests per second. Any requests over the limit will receive a 429 HTTP response. The client SDKs generated by Amazon API Gateway retry calls automatically when met with this response.

Reference: Amazon API Gateway FAQs

Unlock the Secrets of Africa: Master African History, Geography, Culture, People, Cuisine, Economics, Languages, Music, Wildlife, Football, Politics, Animals, Tourism, Science and Environment with the Top 1000 Africa Quiz and Trivia. Get Yours Now!

Top

Q7: What is the format of structured notification messages sent by Amazon SNS?

A. An XML object containing MessageId, UnsubscribeURL, Subject, Message and other
values
B. An JSON object containing MessageId, DuplicateFlag, Message and other values
C. An XML object containing MessageId, DuplicateFlag, Message and other values
D. An JSON object containing MessageId, unsubscribeURL, Subject, Message and other
values

Answer: D.

The notification message sent by Amazon SNS for deliveries over HTTP, HTTPS, Email-JSON and SQS transport protocols will consist of a simple JSON object, which will include the following information: MessageId: A Universally Unique Identifier, unique for each notification published.Reference: Format of structured notification messages sent by Amazon SNS

Top

Dive into a comprehensive AWS Cloud Practitioner CLF-C02 Certification guide, masterfully weaving insights from Tutorials Dojo, Adrian Cantrill, Stephane Maarek, and AWS Skills Builder into one unified resource.

AWS Certification Exam Prep: DynamoDB Facts, Summaries and Questions/Answers.

AWS Certification Exam Prep: DynamoDB facts and summaries, AWS DynamoDB Top 10 Questions and Answers Dump

Definition 1: Amazon DynamoDB is a fully managed proprietary NoSQL database service that supports key-value and document data structures and is offered by Amazon.com as part of the Amazon Web Services portfolio. DynamoDB exposes a similar data model to and derives its name from Dynamo, but has a different underlying implementation. Dynamo had a multi-master design requiring the client to resolve version conflicts and DynamoDB uses synchronous replication across multiple datacenters for high durability and availability.

Definition 2: DynamoDB is a fast and flexible non-relational database service for any scale. DynamoDB enables customers to offload the administrative burdens of operating and scaling distributed databases to AWS so that they don’t have to worry about hardware provisioning, setup and configuration, throughput capacity planning, replication, software patching, or cluster scaling.

Amazon DynamoDB explained

Fully Managed
Fast, consistent Performance
Fine-grained access control
Flexible

AWS DynamoDB Facts and Summaries

Amazon DynamoDB is a low-latency NoSQL database.
DynamoDB consists of Tables, Items, and Attributes
DynamoDb supports both document and key-value data models
DynamoDB Supported documents formats are JSON, HTML, XML
DynamoDB has 2 types of Primary Keys: Partition Key and combination of Partition Key + Sort Key (Composite Key)
DynamoDB has 2 consistency models: Strongly Consistent / Eventually Consistent
DynamoDB Access is controlled using IAM policies.
DynamoDB has fine grained access control using IAM Condition parameter dynamodb:LeadingKeys to allow users to access only the items where the partition key vakue matches their user ID.
DynamoDB Indexes enable fast queries on specific data columns
DynamoDB indexes give you a different view of your data based on alternative Partition / Sort Keys.
DynamoDB Local Secondary indexes must be created when you create your table, they have same partition Key as your table, and they have a different Sort Key.
DynamoDB Global Secondary Index Can be created at any time: at table creation or after. They have a different partition Key as your table and a different sort key as your table.
A DynamoDB query operation finds items in a table using only the primary Key attribute: You provide the Primary Key name and a distinct value to search for.
A DynamoDB Scan operation examines every item in the table. By default, it return data attributes.
DynamoDB Query operation is generally more efficient than a Scan.
With DynamoDB, you can reduce the impact of a query or scan by setting a smaller page size which uses fewer read operations.
To optimize DynamoDB performance, isolate scan operations to specific tables and segregate them from your mission-critical traffic.
To optimize DynamoDB performance, try Parallel scans rather than the default sequential scan.
To optimize DynamoDB performance: Avoid using scan operations if you can: design tables in a way that you can use Query, Get, or BatchGetItems APIs.
When you scan your table in Amazon DynamoDB, you should follow the DynamoDB best practices for avoiding sudden bursts of read activity.
DynamoDb Provisioned Throughput is measured in Capacity Units.
- 1 Write Capacity Unit = 1 x 1KB Write per second.
- 1 Read Capacity Unit = 1 x 4KB Strongly Consistent Read Or 2 x 4KB Eventually Consistent Reads per second. Eventual consistent reads give us the maximum performance with the read operation.
What is the maximum throughput that can be provisioned for a single DynamoDB table?
DynamoDB is designed to scale without limits. However, if you want to exceed throughput rates of 10,000 write capacity units or 10,000 read capacity units for an individual table, you must Contact AWS to increase it.
If you want to provision more than 20,000 write capacity units or 20,000 read capacity units from a single subscriber account, you must first contact AWS to request a limit increase.
Dynamo Db Performance: DAX is a DynamoDB-compatible caching service that enables you to benefit from fast in-memory performance for demanding applications.
- As an in-memory cache, DAX reduces the response times of eventually-consistent read workloads by an order of magnitude, from single-digit milliseconds to microseconds
- DAX improves response times for Eventually Consistent reads only.
- With DAX, you point your API calls to the DAX cluster instead of your table.
- If the item you are querying is on the cache, DAX will return it; otherwise, it will perform and Eventually Consistent GetItem operation to your DynamoDB table.
- DAX reduces operational and application complexity by providing a managed service that is API compatible with Amazon DynamoDB, and thus requires only minimal functional changes to use with an existing application.
- DAX is not suitable for write-intensive applications or applications that require Strongly Consistent reads.
- For read-heavy or bursty workloads, DAX provides increased throughput and potential operational cost savings by reducing the need to over-provision read capacity units. This is especially beneficial for applications that require repeated reads for individual keys.
Dynamo Db Performance: ElastiCache
- In-memory cache sits between your application and database
- 2 different caching strategies: Lazy loading and Write Through: Lazy loading only caches the data when it is requested
- Elasticache Node failures are not fatal, just lots of cache misses
- Avoid stale data by implementing a TTL.
- Write-Through strategy writes data into cache whenever there is a change to the database. Data is never stale
- Write-Through penalty: Each write involves a write to the cache. Elasticache node failure means that data is missing until added or updated in the database.
- Elasticache is wasted resources if most of the data is never used.
Time To Live (TTL) for DynamoDB allows you to define when items in a table expire so that they can be automatically deleted from the database. TTL is provided at no extra cost as a way to reduce storage usage and reduce the cost of storing irrelevant data without using provisioned throughput. With TTL enabled on a table, you can set a timestamp for deletion on a per-item basis, allowing you to limit storage usage to only those records that are relevant.
DynamoDB Security: DynamoDB uses the CMK to generate and encrypt a unique data key for the table, known as the table key. With DynamoDB, AWS Owned, or AWS Managed CMK can be used to generate & encrypt keys. AWS Owned CMK is free of charge while AWS Managed CMK is chargeable. Customer managed CMK’s are not supported with encryption at rest.
Amazon DynamoDB offers fully managed encryption at rest. DynamoDB encryption at rest provides enhanced security by encrypting your data at rest using an AWS Key Management Service (AWS KMS) managed encryption key for DynamoDB. This functionality eliminates the operational burden and complexity involved in protecting sensitive data.
DynamoDB is a alternative solution which can be used for storage of session management. The latency of access to data is less , hence this can be used as a data store for session management
DynamoDB Streams Use Cases and Design Patterns:
How do you set up a relationship across multiple tables in which, based on the value of an item from one table, you update the item in a second table?
How do you trigger an event based on a particular transaction?
How do you audit or archive transactions?
How do you replicate data across multiple tables (similar to that of materialized views/streams/replication in relational data stores)?
As a NoSQL database, DynamoDB is not designed to support transactions. Although client-side libraries are available to mimic the transaction capabilities, they are not scalable and cost-effective. For example, the Java Transaction Library for DynamoDB creates 7N+4 additional writes for every write operation. This is partly because the library holds metadata to manage the transactions to ensure that it’s consistent and can be rolled back before commit.
You can use DynamoDB Streams to address all these use cases. DynamoDB Streams is a powerful service that you can combine with other AWS services to solve many similar problems. When enabled, DynamoDB Streams captures a time-ordered sequence of item-level modifications in a DynamoDB table and durably stores the information for up to 24 hours. Applications can access a series of stream records, which contain an item change, from a DynamoDB stream in near real time.
AWS maintains separate endpoints for DynamoDB and DynamoDB Streams. To work with database tables and indexes, your application must access a DynamoDB endpoint. To read and process DynamoDB Streams records, your application must access a DynamoDB Streams endpoint in the same Region
20 global secondary indexes are allowed per table? (by default)
What is one key difference between a global secondary index and a local secondary index?
A local secondary index must have the same partition key as the main table
How many tables can an AWS account have per region? 256
How many secondary indexes (global and local combined) are allowed per table? (by default): 25
You can define up to 5 local secondary indexes and 20 global secondary indexes per table (by default) – for a total of 25.
How can you increase your DynamoDB table limit in a region?
By contacting AWS and requesting a limit increase
For any AWS account, there is an initial limit of 256 tables per region.
The minimum length of a partition key value is 1 byte. The maximum length is 2048 bytes.
The minimum length of a sort key value is 1 byte. The maximum length is 1024 bytes.
For tables with local secondary indexes, there is a 10 GB size limit per partition key value. A table with local secondary indexes can store any number of items, as long as the total size for any one partition key value does not exceed 10 GB.
The following diagram shows a local secondary index named LastPostIndex. Note that the partition key is the same as that of the Thread table, but the sort key is LastPostDateTime.
AWS DynamoDB secondary indexes example
Relational vs Non Relational (SQL vs NoSQL)

Get the Read Aloud Wonderland Bedtime Adventure iBook here

Top
Reference: AWS DynamoDB

AWS DynamoDB Questions and Answers Dumps

Q0: What should the Developer enable on the DynamoDB table to optimize performance and minimize costs?

A. Amazon DynamoDB auto scaling
B. Amazon DynamoDB cross-region replication
C. Amazon DynamoDB Streams
D. Amazon DynamoDB Accelerator

D. DAX is a DynamoDB-compatible caching service that enables you to benefit from fast in-memory performance for demanding applications. DAX addresses three core scenarios:

As an in-memory cache, DAX reduces the response times of eventually-consistent read workloads by an order of magnitude, from single-digit milliseconds to microseconds.
DAX reduces operational and application complexity by providing a managed service that is API-compatible with Amazon DynamoDB, and thus requires only minimal functional changes to use with an existing application.
For read-heavy or bursty workloads, DAX provides increased throughput and potential operational cost savings by reducing the need to over-provision read capacity units. This is especially beneficial for applications that require repeated reads for individual keys.

Reference: AWS DAX

Top

Q2: A security system monitors 600 cameras, saving image metadata every 1 minute to an Amazon DynamoDb table. Each sample involves 1kb of data, and the data writes are evenly distributed over time. How much write throughput is required for the target table?

A. 6000
B. 10
C. 3600
D. 600

B. When you mention the write capacity of a table in Dynamo DB, you mention it as the number of 1KB writes per second. So in the above question, since the write is happening every minute, we need to divide the value of 600 by 60, to get the number of KB writes per second. This gives a value of 10.

You can specify the Write capacity in the Capacity tab of the DynamoDB table.

Reference: AWS working with tables

Advertise with us - Post Your Good Content Here
We are ranked in the Top 20 on Google

AI Dashboard is available on the Web, Apple, Google, and Microsoft, PRO version

Q3: You are developing an application that will interact with a DynamoDB table. The table is going to take in a lot of read and write operations. Which of the following would be the ideal partition key for the DynamoDB table to ensure ideal performance?

A. CustomerID
B. CustomerName
C. Location
D. Age

Answer- A
Use high-cardinality attributes. These are attributes that have distinct values for each item, like e-mailid, employee_no, customerid, sessionid, orderid, and so on..
Use composite attributes. Try to combine more than one attribute to form a unique key.
Reference: Choosing the right DynamoDB Partition Key

Top

Q4: A DynamoDB table is set with a Read Throughput capacity of 5 RCU. Which of the following read configuration will provide us the maximum read throughput?

A. Read capacity set to 5 for 4KB reads of data at strong consistency
B. Read capacity set to 5 for 4KB reads of data at eventual consistency
C. Read capacity set to 15 for 1KB reads of data at strong consistency
D. Read capacity set to 5 for 1KB reads of data at eventual consistency

Answer: B.
The calculation of throughput capacity for option B would be:
Read capacity(5) * Amount of data(4) = 20.
Since its required at eventual consistency , we can double the read throughput to 20*2=40

Invest in your future today by enrolling in this Azure Fundamentals - Pass the Azure Fundamentals Exam with Ease: Master the AZ-900 Certification with the Comprehensive Exam Preparation Guide!

Reference: Read/Write Capacity Mode

Top

Q5: Your team is developing a solution that will make use of DynamoDB tables. Due to the nature of the application, the data is needed across a couple of regions across the world. Which of the following would help reduce the latency of requests to DynamoDB from different regions?

A. Enable Multi-AZ for the DynamoDB table
B. Enable global tables for DynamoDB
C. Enable Indexes for the table
D. Increase the read and write throughput for the tablez

Answer: B
Amazon DynamoDB global tables provide a fully managed solution for deploying a multi-region, multimaster database, without having to build and maintain your own replication solution. When you create a global table, you specify the AWS regions where you want the table to be available. DynamoDB performs all of the necessary tasks to create identical tables in these regions, and propagate ongoing data changes to all of them.
Reference: Global Tables

Top

Q6: An application is currently accessing a DynamoDB table. Currently the tables queries are performing well. Changes have been made to the application and now the performance of the application is starting to degrade. After looking at the changes , you see that the queries are making use of an attribute which is not the partition key? Which of the following would be the adequate change to make to resolve the issue?

A. Add an index for the DynamoDB table
B. Change all the queries to ensure they use the partition key
C. Enable global tables for DynamoDB
D. Change the read capacity on the table

Answer: A
Amazon DynamoDB provides fast access to items in a table by specifying primary key values. However, many applications might benefit from having one or more secondary (or alternate) keys available, to allow efficient access to data with attributes other than the primary key. To address this, you can create one or more secondary indexes on a table, and issue Query or Scan requests against these indexes.

A secondary index is a data structure that contains a subset of attributes from a table, along with an alternate key to support Query operations. You can retrieve data from the index using a Query, in much the same way as you use Query with a table. A table can have multiple secondary indexes, which gives your applications access to many different query patterns.

Reference: Improving Data Access with Secondary Indexes

Top

Q7: Company B has created an e-commerce site using DynamoDB and is designing a products table that includes items purchased and the users who purchased the item.
When creating a primary key on a table which of the following would be the best attribute for the partition key? Select the BEST possible answer.

A. None of these are correct.
B. user_id where there are many users to few products
C. category_id where there are few categories to many products
D. product_id where there are few products to many users

Answer: B.
When designing tables it is important for the data to be distributed evenly across the entire table. It is best practice for performance to set your primary key where there are many primary keys to few rows. An example would be many users to few products. An example of bad design would be a primary key of product_id where there are few products but many users.
When designing tables it is important for the data to be distributed evenly across the entire table. It is best practice for performance to set your primary key where there are many primary keys to few rows. An example would be many users to few products. An example of bad design would be a primary key of product_id where there are few products but many users.
Reference: Partition Keys and Sort Keys

Top

Q8: Which API call can be used to retrieve up to 100 items at a time or 16 MB of data from a DynamoDB table?

A. BatchItem
B. GetItem
C. BatchGetItem
D. ChunkGetItem

Answer: C. BatchGetItem

"Pass the AWS Cloud Practitioner Certification with flying colors: Master the Exam with 300+ Quizzes, Cheat Sheets, Flashcards, and Illustrated Study Guides - 2024 Edition"

The BatchGetItem operation returns the attributes of one or more items from one or more tables. You identify requested items by primary key.

A single operation can retrieve up to 16 MB of data, which can contain as many as 100 items. BatchGetItem will return a partial result if the response size limit is exceeded, the table’s provisioned throughput is exceeded, or an internal processing failure occurs. If a partial result is returned, the operation returns a value for UnprocessedKeys. You can use this value to retry the operation starting with the next item to get.Reference: API-Specific Limits

Top

"Become a Canada Expert: Ace the Citizenship Test and Impress Everyone with Your Knowledge of Canadian History, Geography, Government, Culture, People, Languages, Travel, Wildlife, Hockey, Tourism, Sceneries, Arts, and Data Visualization. Get the Top 1000 Canada Quiz Now!"

Q9: Which DynamoDB limits can be raised by contacting AWS support?

A. The number of hash keys per account
B. The maximum storage used per account
C. The number of tables per account
D. The number of local secondary indexes per account
E. The number of provisioned throughput units per account

Answer: C. and E.

For any AWS account, there is an initial limit of 256 tables per region.
AWS places some default limits on the throughput you can provision.
These are the limits unless you request a higher amount.
To request a service limit increase see https://aws.amazon.com/support.

Reference: Limits in DynamoDB

Top

Q10: Which approach below provides the least impact to provisioned throughput on the “Product”
table?

A. Create an “Images” DynamoDB table to store the Image with a foreign key constraint to
the “Product” table
B. Add an image data type to the “Product” table to store the images in binary format
C. Serialize the image and store it in multiple DynamoDB tables
D. Store the images in Amazon S3 and add an S3 URL pointer to the “Product” table item
for each image

Answer: D.

Amazon DynamoDB currently limits the size of each item that you store in a table (see Limits in DynamoDB). If your application needs to store more data in an item than the DynamoDB size limit permits, you can try compressing one or more large attributes, or you can store them as an object in Amazon Simple Storage Service (Amazon S3) and store the Amazon S3 object identifier in your DynamoDB item.
Compressing large attribute values can let them fit within item limits in DynamoDB and reduce your storage costs. Compression algorithms such as GZIP or LZO produce binary output that you can then store in a Binary attribute type.
Reference: Best Practices for Storing Large Items and Attributes

Top

Unlock the Secrets of Africa: Master African History, Geography, Culture, People, Cuisine, Economics, Languages, Music, Wildlife, Football, Politics, Animals, Tourism, Science and Environment with the Top 1000 Africa Quiz and Trivia. Get Yours Now!

Q11: You’re creating a forum DynamoDB database for hosting forums. Your “thread” table contains the forum name and each “forum name” can have one or more “subjects”. What primary key type would you give the thread table in order to allow more than one subject to be tied to the forum primary key name?

A. Hash
B. Range and Hash
C. Primary and Range
D. Hash and Range

Answer: D.
Each forum name can have one or more subjects. In this case, ForumName is the hash attribute and Subject is the range attribute.

Reference: DynamoDB keys

Top

Dive into a comprehensive AWS Cloud Practitioner CLF-C02 Certification guide, masterfully weaving insights from Tutorials Dojo, Adrian Cantrill, Stephane Maarek, and AWS Skills Builder into one unified resource.

Amazon Aurora explained:

High scalability
High availability and durability
High Performance
Multi Region

Amazon ElastiCache Explained

In-Memory data store
High availability and reliability
Fully managed
Supports two pop
Open source engine

Amazon Redshift explained

Fast, fully managed, petabyte-scale data warehouse
Supports wide range of open data formats
Allows you to run SQL queries against large unstructured data in Amazon Simple Storage Service
Integrates with popular Business Intelligence (BI) and extract, Transform, Load (ETL) solutions.

Amazon Neptune Explained

Fully managed graph database
Supports open graph APIs
Used in Social Networking
Amazon Neptune Explained

AWS Certification Exam Prep: S3 Facts, Summaries, Questions and Answers

AWS S3 Facts and summaries, AWS S3 Top 10 Questions and Answers Dump

Definition 1: Amazon S3 or Amazon Simple Storage Service is a “simple storage service” offered by Amazon Web Services that provides object storage through a web service interface. Amazon S3 uses the same scalable storage infrastructure that Amazon.com uses to run its global e-commerce network.

Definition 2: Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance.

AWS S3 Explained graphically:

Amazon S3 Explained in pictures — Amazon S3 Explained

Get the Read Aloud Wonderland Bedtime Adventure iBook here

AWS S3 Facts and summaries

S3 is a universal namespace, meaning each S3 bucket you create must have a unique name that is not being used by anyone else in the world.
S3 is object based: i.e allows you to upload files.
Files can be from 0 Bytes to 5 TB
What is the maximum length, in bytes, of a DynamoDB range primary key attribute value?
The maximum length of a DynamoDB range primary key attribute value is 2048 bytes (NOT 256 bytes).
S3 has unlimited storage.
Files are stored in Buckets.
Read after write consistency for PUTS of new Objects
Eventual Consistency for overwrite PUTS and DELETES (can take some time to propagate)
S3 Storage Classes/Tiers:
- S3 Standard (durable, immediately available, frequently accesses)
- Amazon S3 Intelligent-Tiering (S3 Intelligent-Tiering): It works by storing objects in two access tiers: one tier that is optimized for frequent access and another lower-cost tier that is optimized for infrequent access.
- S3 Standard-Infrequent Access – S3 Standard-IA (durable, immediately available, infrequently accessed)
- S3 – One Zone-Infrequent Access – S3 One Zone IA: Same ad IA. However, data is stored in a single Availability Zone only
- S3 – Reduced Redundancy Storage (data that is easily reproducible, such as thumbnails, etc.)
- Glacier – Archived data, where you can wait 3-5 hours before accessing
You can have a bucket that has different objects stored in S3 Standard, S3 Intelligent-Tiering, S3 Standard-IA, and S3 One Zone-IA.
The default URL for S3 hosted websites lists the bucket name first followed by s3-website-region.amazonaws.com . Example: enoumen.com.s3-website-us-east-1.amazonaws.com
Core fundamentals of an S3 object
- Key (name)
- Value (data)
- Version (ID)
- Metadata
- Sub-resources (used to manage bucket-specific configuration)
  - Bucket Policies, ACLs,
  - CORS
  - Transfer Acceleration
Object-based storage only for files
Not suitable to install OS on.
Successful uploads will generate a HTTP 200 status code.
S3 Security – Summary
- By default, all newly created buckets are PRIVATE.
- You can set up access control to your buckets using:
  - Bucket Policies – Applied at the bucket level
  - Access Control Lists – Applied at an object level.
- S3 buckets can be configured to create access logs, which log all requests made to the S3 bucket. These logs can be written to another bucket.
S3 Encryption
- Encryption In-Transit (SSL/TLS)
- Encryption At Rest:
  - Server side Encryption (SSE-S3, SSE-KMS, SSE-C)
  - Client Side Encryption
- Remember that we can use a Bucket policy to prevent unencrypted files from being uploaded by creating a policy which only allows requests which include the x-amz-server-side-encryption parameter in the request header.
S3 CORS (Cross Origin Resource Sharing):
CORS defines a way for client web applications that are loaded in one domain to interact with resources in a different domain.
- Used to enable cross origin access for your AWS resources, e.g. S3 hosted website accessing javascript or image files located in another bucket. By default, resources in one bucket cannot access resources located in another. To allow this we need to configure CORS on the bucket being accessed and enable access for the origin (bucket) attempting to access.
- Always use the S3 website URL, not the regular bucket URL. E.g.: https://s3-eu-west-2.amazonaws.com/acloudguru
S3 CloudFront:
- Edge locations are not just READ only – you can WRITE to them too (i.e put an object on to them.)
- Objects are cached for the life of the TTL (Time to Live)
- You can clear cached objects, but you will be charged. (Invalidation)
S3 Performance optimization – 2 main approaches to Performance Optimization for S3:
- GET-Intensive Workloads – Use Cloudfront
- Mixed Workload – Avoid sequencial key names for your S3 objects. Instead, add a random prefix like a hex hash to the key name to prevent multiple objects from being stored on the same partition.
  - mybucket/7eh4-2019-03-04-15-00-00/cust1234234/photo1.jpg
  - mybucket/h35d-2019-03-04-15-00-00/cust1234234/photo2.jpg
  - mybucket/o3n6-2019-03-04-15-00-00/cust1234234/photo3.jpg
The best way to handle large objects uploads to the S3 service is to use the Multipart upload API. The Multipart upload API enables you to upload large objects in parts.
You can enable versioning on a bucket, even if that bucket already has objects in it. The already existing objects, though, will show their versions as null. All new objects will have version IDs.
Bucket names cannot start with a . or – characters. S3 bucket names can contain both the . and – characters. There can only be one . or one – between labels. E.G mybucket-com mybucket.com are valid names but mybucket–com and mybucket..com are not valid bucket names.
What is the maximum number of S3 buckets allowed per AWS account (by default)? 100
You successfully upload an item to the us-east-1 region. You then immediately make another API call and attempt to read the object. What will happen?
All AWS regions now have read-after-write consistency for PUT operations of new objects. Read-after-write consistency allows you to retrieve objects immediately after creation in Amazon S3. Other actions still follow the eventual consistency model (where you will sometimes get stale results if you have recently made changes)
S3 bucket policies require a Principal be defined. Review the access policy elements here
What checksums does Amazon S3 employ to detect data corruption?

Amazon S3 uses a combination of Content-MD5 checksums and cyclic redundancy checks (CRCs) to detect data corruption. Amazon S3 performs these checksums on data at rest and repairs any corruption using redundant data. In addition, the service calculates checksums on all network traffic to detect corruption of data packets when storing or retrieving data.

Top
Reference: AWS S3

AWS S3 Top 10 Questions and Answers Dump

Q0: You’ve written an application that uploads objects onto an S3 bucket. The size of the object varies between 200 – 500 MB. You’ve seen that the application sometimes takes a longer than expected time to upload the object. You want to improve the performance of the application. Which of the following would you consider?

A. Create multiple threads and upload the objects in the multiple threads
B. Write the items in batches for better performance
C. Use the Multipart upload API
D. Enable versioning on the Bucket

C. All other options are invalid since the best way to handle large object uploads to the S3 service is to use the Multipart upload API. The Multipart upload API enables you to upload large objects in parts. You can use this API to upload new large objects or make a copy of an existing object. Multipart uploading is a three-step process: You initiate the upload, you upload the object parts, and after you have uploaded all the parts, you complete the multipart upload. Upon receiving the complete multipart upload request, Amazon S3 constructs the object from the uploaded parts, and you can then access the object just as you would any other object in your bucket.

Reference: https://docs.aws.amazon.com/AmazonS3/latest/dev/mpuoverview.html

Top

Q2: You are using AWS SAM templates to deploy a serverless application. Which of the following resource will embed application from Amazon S3 buckets?

A. AWS::Serverless::Api
B. AWS::Serverless::Application
C. AWS::Serverless::Layerversion
D. AWS::Serverless::Function

Answer – B
AWS::Serverless::Application resource in AWS SAm template is used to embed application frm Amazon S3 buckets.
Reference: Declaring Serverless Resources

Top

Q3: A static web site has been hosted on a bucket and is now being accessed by users. One of the web pages javascript section has been changed to access data which is hosted in another S3 bucket. Now that same web page is no longer loading in the browser. Which of the following can help alleviate the error?

A. Enable versioning for the underlying S3 bucket.
B. Enable Replication so that the objects get replicated to the other bucket
C. Enable CORS for the bucket
D. Change the Bucket policy for the bucket to allow access from the other bucket

Answer – C

Cross-origin resource sharing (CORS) defines a way for client web applications that are loaded in one domain to interact with resources in a different domain. With CORS support, you can build rich client-side web applications with Amazon S3 and selectively allow cross-origin access to your Amazon S3 resources.

Cross-Origin Resource Sharing: Use-case Scenarios The following are example scenarios for using CORS:

Advertise with us - Post Your Good Content Here
We are ranked in the Top 20 on Google

AI Dashboard is available on the Web, Apple, Google, and Microsoft, PRO version

Scenario 1: Suppose that you are hosting a website in an Amazon S3 bucket named website as described in Hosting a Static Website on Amazon S3. Your users load the website endpoint http://website.s3-website-us-east-1.amazonaws.com. Now you want to use JavaScript on the webpages that are stored in this bucket to be able to make authenticated GET and PUT requests against the same bucket by using the Amazon S3 API endpoint for the bucket, website.s3.amazonaws.com. A browser would normally block JavaScript from allowing those requests, but with CORS you can congure your bucket to explicitly enable cross-origin requests from website.s3-website-us-east-1.amazonaws.com.

Scenario 2: Suppose that you want to host a web font from your S3 bucket. Again, browsers require a CORS check (also called a preight check) for loading web fonts. You would congure the bucket that is hosting the web font to allow any origin to make these requests.

Invest in your future today by enrolling in this Azure Fundamentals - Pass the Azure Fundamentals Exam with Ease: Master the AZ-900 Certification with the Comprehensive Exam Preparation Guide!

Reference: Cross-Origin Resource Sharing (CORS)

Top

Q4: Your mobile application includes a photo-sharing service that is expecting tens of thousands of users at launch. You will leverage Amazon Simple Storage Service (S3) for storage of the user Images, and you must decide how to authenticate and authorize your users for access to these images. You also need to manage the storage of these images. Which two of the following approaches should you use? Choose two answers from the options below

A. Create an Amazon S3 bucket per user, and use your application to generate the S3 URL for the appropriate content.
B. Use AWS Identity and Access Management (IAM) user accounts as your application-level user database, and offload the burden of authentication from your application code.
C. Authenticate your users at the application level, and use AWS Security Token Service (STS)to grant token-based authorization to S3 objects.
D. Authenticate your users at the application level, and send an SMS token message to the user. Create an Amazon S3 bucket with the same name as the SMS message token, and move the user’s objects to that bucket.

Answer- C
The AWS Security Token Service (STS) is a web service that enables you to request temporary, limited-privilege credentials for AWS Identity and Access Management (IAM) users or for users that you authenticate (federated users). The token can then be used to grant access to the objects in S3.
You can then provides access to the objects based on the key values generated via the user id.

Reference: The AWS Security Token Service (STS)

Top

Q5: Both ACLs and Bucket Policies can be used to grant access to S3 buckets. Which of the following statements is true about ACLs and Bucket policies?

A. Bucket Policies are Written in JSON and ACLs are written in XML
B. ACLs can be attached to S3 objects or S3 Buckets
C. Bucket Policies and ACLs are written in JSON
D. Bucket policies are only attached to s3 buckets, ACLs are only attached to s3 objects

Answer: A. and B.
Only Bucket Policies are written in JSON, ACLs are written in XML.
While Bucket policies are indeed only attached to S3 buckets, ACLs can be attached to S3 Buckets OR S3 Objects.
Reference:

Top

Q6: What are good options to improve S3 performance when you have significantly high numbers of GET requests?

A. Introduce random prefixes to S3 objects
B. Introduce random suffixes to S3 objects
C. Setup CloudFront for S3 objects
D. Migrate commonly used objects to Amazon Glacier

Answer: C
CloudFront caching is an excellent way to avoid putting extra strain on the S3 service and to improve the response times of reqeusts by caching data closer to users at CloudFront locations.
S3 Transfer Acceleration optimizes the TCP protocol and adds additional intelligence between the client and the S3 bucket, making S3 Transfer Acceleration a better choice if a higher throughput is desired. If you have objects that are smaller than 1GB or if the data set is less than 1GB in size, you should consider using Amazon CloudFront’s PUT/POST commands for optimal performance.
Reference: Amazon S3 Transfer Acceleration

Top

Q7: If an application is storing hourly log files from thousands of instances from a high traffic
web site, which naming scheme would give optimal performance on S3?

A. Sequential
B. HH-DD-MM-YYYY-log_instanceID
C. YYYY-MM-DD-HH-log_instanceID
D. instanceID_log-HH-DD-MM-YYYY
E. instanceID_log-YYYY-MM-DD-HH

Answer: A. B. C. D. and E.
Amazon S3 now provides increased performance to support at least 3,500 requests per second to add data and 5,500 requests per second to retrieve data, which can save significant processing time for no additional charge. Each S3 prefix can support these request rates, making it simple to increase performance significantly.
This S3 request rate performance increase removes any previous guidance to randomize object prefixes to achieve faster performance. That means you can now use logical or sequential naming patterns in S3 object naming without any performance implications.

Reference: Amazon S3 Announces Increased Request Rate Performance

Top

"Pass the AWS Cloud Practitioner Certification with flying colors: Master the Exam with 300+ Quizzes, Cheat Sheets, Flashcards, and Illustrated Study Guides - 2024 Edition"

Q8: You are working with the S3 API and receive an error message: 409 Conflict. What is the possible cause of this error

A. You’re attempting to remove a bucket without emptying the contents of the bucket first.
B. You’re attempting to upload an object to the bucket that is greater than 5TB in size.
C. Your request does not contain the proper metadata.
D. Amazon S3 is having internal issues.

Answer:A.

Reference: S3 Error codes

"Become a Canada Expert: Ace the Citizenship Test and Impress Everyone with Your Knowledge of Canadian History, Geography, Government, Culture, People, Languages, Travel, Wildlife, Hockey, Tourism, Sceneries, Arts, and Data Visualization. Get the Top 1000 Canada Quiz Now!"

Top

Q9: You created three S3 buckets – “mywebsite.com”, “downloads.mywebsite.com”, and “www.mywebsite.com”. You uploaded your files and enabled static website hosting. You specified both of the default documents under the “enable static website hosting” header. You also set the “Make Public” permission for the objects in each of the three buckets. You create the Route 53 Aliases for the three buckets. You are going to have your end users test your websites by browsing to http://mydomain.com/error.html, http://downloads.mydomain.com/index.html, and http://www.mydomain.com. What problems will your testers encounter?

A. http://mydomain.com/error.html will not work because you did not set a value for the error.html file
B. There will be no problems, all three sites should work.
C. http://www.mywebsite.com will not work because the URL does not include a file name at the end of it.
D. http://downloads.mywebsite.com/index.html will not work because the “downloads” prefix is not a supported prefix for S3 websites using Route 53 aliases

Answer: B.
It used to be that the only allowed domain prefix when creating Route 53 Aliases for S3 static websites was the “www” prefix. However, this is no longer the case. You can now use other subdomain.

Reference: Hosting a Static Website on Amazon S3

Top

Unlock the Secrets of Africa: Master African History, Geography, Culture, People, Cuisine, Economics, Languages, Music, Wildlife, Football, Politics, Animals, Tourism, Science and Environment with the Top 1000 Africa Quiz and Trivia. Get Yours Now!

Q10: Which of the following is NOT a common S3 API call?

A. UploadPart
B. ReadObject
C. PutObject
D. DownloadBucket

Answer: D.

Reference: s3api

Top

Dive into a comprehensive AWS Cloud Practitioner CLF-C02 Certification guide, masterfully weaving insights from Tutorials Dojo, Adrian Cantrill, Stephane Maarek, and AWS Skills Builder into one unified resource.

Other AWS Facts and Summaries

AWS S3 facts and summaries
AWS DynamoDB facts and summaries
AWS EC2 facts and summaries
AWS Lambda facts and summaries
AWS SQS facts and summaries
AWS RDS facts and summaries
AWS ECS facts and summaries
AWS CloudWatch facts and summaries
AWS SES facts and summaries
AWS EBS facts and summaries
AWS Serverless facts and summaries
AWS ELB facts and summaries
AWS Autoscaling facts and summaries
AWS VPC facts and summaries
AWS KMS facts and summaries
AWS Elastic Beanstalk facts and summaries
AWS CodeBuild facts and summaries
AWS CodeDeploy facts and summaries
AWS CodePipeline facts and summaries

Active Hydrating Toner, Anti-Aging Replenishing Advanced Face Moisturizer, with Vitamins A, C, E & Natural Botanicals to Promote Skin Balance & Collagen Production, 6.7 Fl Oz

Age Defying 0.3% Retinol Serum, Anti-Aging Dark Spot Remover for Face, Fine Lines & Wrinkle Pore Minimizer, with Vitamin E & Natural Botanicals

Firming Moisturizer, Advanced Hydrating Facial Replenishing Cream, with Hyaluronic Acid, Resveratrol & Natural Botanicals to Restore Skin's Strength, Radiance, and Resilience, 1.75 Oz

Skin Stem Cell Serum

Smartphone 101 - Pick a smartphone for me - android or iOS - Apple iPhone or Samsung Galaxy or Huawei or Xaomi or Google Pixel

Can AI Really Predict Lottery Results? We Asked an Expert.

Ace the 2023 AWS Solutions Architect Associate SAA-C03 Exam with Confidence

Pass the 2023 AWS Certified Machine Learning Specialty MLS-C01 Exam with Flying Colors

Football/Soccer World Cup 2022 Guide and Past World Cups History and Quiz illustrated

Djamgatech

Read Photos and PDFs Aloud for me iOS
Read Photos and PDFs Aloud for me android
Read Photos and PDFs Aloud For me Windows 10/11
Read Photos and PDFs Aloud For Amazon

Get 20% off Google Workspace (Google Meet) Business Plan (AMERICAS): M9HNXHX3WC9H7YE (Email us for more)

Get 20% off Google Google Workspace (Google Meet) Standard Plan with the following codes: 96DRHDRA9J7GTN6(Email us for more)

FREE 10000+ Quiz Trivia and and Brain Teasers for All Topics including Cloud Computing, General Knowledge, History, Television, Music, Art, Science, Movies, Films, US History, Soccer Football, World Cup, Data Science, Machine Learning, Geography, etc....

taimienphi.vn

List of Freely available programming books - What is the single most influential book every Programmers should read

#BlackOwned #BlackEntrepreneurs #BlackBuniness #AWSCertified #AWSCloudPractitioner #AWSCertification #AWSCLFC02 #CloudComputing #AWSStudyGuide #AWSTraining #AWSCareer #AWSExamPrep #AWSCommunity #AWSEducation #AWSBasics #AWSCertified #AWSMachineLearning #AWSCertification #AWSSpecialty #MachineLearning #AWSStudyGuide #CloudComputing #DataScience #AWSCertified #AWSSolutionsArchitect #AWSArchitectAssociate #AWSCertification #AWSStudyGuide #CloudComputing #AWSArchitecture #AWSTraining #AWSCareer #AWSExamPrep #AWSCommunity #AWSEducation #AzureFundamentals #AZ900 #MicrosoftAzure #ITCertification #CertificationPrep #StudyMaterials #TechLearning #MicrosoftCertified #AzureCertification #TechBooks

Top 1000 Canada Quiz and trivia: CANADA CITIZENSHIP TEST- HISTORY - GEOGRAPHY - GOVERNMENT- CULTURE - PEOPLE - LANGUAGES - TRAVEL - WILDLIFE - HOCKEY - TOURISM - SCENERIES - ARTS - DATA VISUALIZATION

Top 1000 Africa Quiz and trivia: HISTORY - GEOGRAPHY - WILDLIFE - CULTURE - PEOPLE - LANGUAGES - TRAVEL - TOURISM - SCENERIES - ARTS - DATA VISUALIZATION

Exploring the Pros and Cons of Visiting All Provinces and Territories in Canada.

Exploring the Advantages and Disadvantages of Visiting All 50 States in the USA

Health Health, a science-based community to discuss health news and the coronavirus (COVID-19) pandemic

Fentanyl’s deadly chemistry: How criminals make illicit opioids
by /u/Maxcactus on July 26, 2024 at 8:14 am
submitted by /u/Maxcactus [link] [comments]
A New Artificial Heart in Houston Might Save Millions
by /u/zsreport on July 25, 2024 at 11:13 pm
submitted by /u/zsreport [link] [comments]
Medical boards punished these doctors and nurses. South Carolina prisons hired them anyway.
by /u/jms1225 on July 25, 2024 at 8:06 pm
submitted by /u/jms1225 [link] [comments]
US infant mortality increased in 2022 for the first time in decades, CDC report shows
by /u/cnn on July 25, 2024 at 6:37 pm
submitted by /u/cnn [link] [comments]
Study raises hopes that shingles vaccine may delay onset of dementia | Dementia | The Guardian
by /u/chilladipa on July 25, 2024 at 3:38 pm
submitted by /u/chilladipa [link] [comments]

Today I Learned (TIL) You learn something new every day; what did you learn today? Submit interesting and specific facts about something that you just found out here.

TIL that George Takei respectfully declined an offer to voice himself in the Simpsons episode Marge vs. the Monorail because he didn't want to ridicule public transportation
by /u/Lucy_loveebby on July 26, 2024 at 7:08 am
submitted by /u/Lucy_loveebby [link] [comments]
TIL about Shizo Kanakuri, a Japanese runner who dropped out of the 1912 Olympic race (he fell asleep during his rest mid-race), but returned 54 years later to officially finish it. Kanakuri joked: “It was a long trip. Along the way, I got married, had six children and 10 grandchildren.”
by /u/EnigmaticScience on July 26, 2024 at 6:57 am
submitted by /u/EnigmaticScience [link] [comments]
TIL Gerhard Domagk’s discovery of the antibacterial properties of Prontosil won him the 1939 Nobel Prize in Medicine but German citizens were forbidden to accept the prize under Hitler. Domagk was forced to reject it by the Gespato before receiving it in 1947 but without the prize money
by /u/ubcstaffer123 on July 26, 2024 at 3:56 am
submitted by /u/ubcstaffer123 [link] [comments]
Today I learned that the actor Keegan-Michael Key had two biological half-siblings (one of whom was comic book writer Dwayne McDuffie) and Key didn't know about either of them until after both of them had died.
by /u/wimpykidfan37 on July 26, 2024 at 3:31 am
submitted by /u/wimpykidfan37 [link] [comments]
TIL that average human height went down from 5'10" (178 cm) for men and 5'6" (168 cm) for women to 5'5" (165 cm) and 5'1" (155 cm) 10,000 years ago and it took until the 20th century for average human height to match pre-Neolithic Revolution levels.
by /u/quadrahex on July 26, 2024 at 2:04 am
submitted by /u/quadrahex [link] [comments]

Reddit Science This community is a place to share and discuss new scientific research. Read about the latest advances in astronomy, biology, medicine, physics, social science, and more. Find and submit new publications and popular science coverage of current research.

New study uses Game of Thrones to advance understanding of face blindness. In people with prosopagnosia, the effect of familiarity was not found in the same regions of the brain as it was in neurotypical participants while watching the TV show. Face blindness affects approximately 1 in 50 people.
by /u/mvea on July 26, 2024 at 7:45 am
submitted by /u/mvea [link] [comments]
MIT scientists have discovered an intriguing new way to produce hydrogen fuel, using just soda cans, seawater and coffee grounds. The team says the chemical reaction could be put to work powering engines or fuel cells in marine vehicles like submarines that suck in seawater.
by /u/mvea on July 26, 2024 at 7:37 am
submitted by /u/mvea [link] [comments]
Science Impact: Experimental Warn-On-Forecast System Yields 75-Minute Lead Time on Violent Tornado {NOAA Press Release}
by /u/Elijah-Joyce-Weather on July 26, 2024 at 4:55 am
submitted by /u/Elijah-Joyce-Weather [link] [comments]
65 million Americans now own firearms for protection, suggests the results of a nationally representative survey carried out in 2023
by /u/FunnyGamer97 on July 26, 2024 at 2:16 am
submitted by /u/FunnyGamer97 [link] [comments]
Examining tornado exposure, post-tornado distress, and gender following the March 2020 tornado in Nashville, Tennessee
by /u/Elijah-Joyce-Weather on July 25, 2024 at 11:49 pm
submitted by /u/Elijah-Joyce-Weather [link] [comments]

Reddit Sports Sports News and Highlights from the NFL, NBA, NHL, MLB, MLS, and leagues around the world.

Canada's men's and women's soccer teams have relied on drones and spying for years, sources say
by /u/FireLychee on July 26, 2024 at 3:37 am
submitted by /u/FireLychee [link] [comments]
After a grueling Tour de France, top riders are racing to recover for Paris Olympics time trial
by /u/Oldtimer_2 on July 26, 2024 at 12:54 am
submitted by /u/Oldtimer_2 [link] [comments]
Canada WNT coach Bev Priestman suspended, sent home from Olympics amid spying scandal
by /u/Oldtimer_2 on July 26, 2024 at 12:49 am
submitted by /u/Oldtimer_2 [link] [comments]
'End of an Era': TNT's 'Inside the NBA' Ending After NBA Chooses Amazon
by /u/lame_building14 on July 26, 2024 at 12:36 am
submitted by /u/lame_building14 [link] [comments]
Canada axes coach from Olympics over drone use
by /u/chief_sitass on July 26, 2024 at 12:31 am
submitted by /u/chief_sitass [link] [comments]

Turn your dream into reality with Google Workspace: It’s free for the first 14 days.
Get 20% off Google Google Workspace (Google Meet) Standard Plan with the following codes:
96DRHDRA9J7GTN6
63F733CLLY7R7MM
63F7D7CPD9XXUVT
63FLKQHWV3AEEE6
63JGLWWK36CP7WM
63KKR9EULQRR7VE
63KNY4N7VHCUA9R
63LDXXFYU6VXDG9
63MGNRCKXURAYWC
63NGNDVVXJP4N99
63P4G3ELRPADKQU
With Google Workspace, Get custom email @yourcompany, Work from anywhere; Easily scale up or down
Google gives you the tools you need to run your business like a pro. Set up custom email, share files securely online, video chat from any device, and more.
Google Workspace provides a platform, a common ground, for all our internal teams and operations to collaboratively support our primary business goal, which is to deliver quality information to our readers quickly.
Get 20% off Google Workspace (Google Meet) Business Plan (AMERICAS): M9HNXHX3WC9H7YE
C37HCAQRVR7JTFK
C3AE76E7WATCTL9
C3C3RGUF9VW6LXE
C3D9LD4L736CALC
C3EQXV674DQ6PXP
C3G9M3JEHXM3XC7
C3GGR3H4TRHUD7L
C3LVUVC3LHKUEQK
C3PVGM4CHHPMWLE
C3QHQ763LWGTW4C
Even if you’re small, you want people to see you as a professional business. If you’re still growing, you need the building blocks to get you where you want to be. I’ve learned so much about business through Google Workspace—I can’t imagine working without it. (Email us for more codes)

Top 100 AWS Certified Data Analytics Specialty Certification Questions and Answers Dumps

The AWS Certified Data Analytics – Specialty (DAS-C01) covers the following domains:

Below are the Top 100 AWS Certified Data Analytics – Specialty Questions and Answers Dumps and References –

Question1: What combination of services do you need for the following requirements: accelerate petabyte-scale data transfers, load streaming data, and the ability to create scalable, private connections. Select the correct answer order.

AI Dashboard is available on the Web, Apple, Google, and Microsoft, PRO version

Question 10: You need to filter and transform incoming messages coming from a smart sensor you have connected with AWS. Once messages are received, you need to store them as time series data in DynamoDB. Which AWS service can you use?

Question 12: You currently have databases running on-site and in another data center off-site. What service allows you to consolidate to one database in Amazon?

Question 14: You have been hired as a consultant to provide a solution to integrate a client’s on-premises data center to AWS. The customer requires a 300 Mbps dedicated, private connection to their VPC. Which AWS tool do you need?

Question 16: A data engineer needs to create a dashboard to display social media trends during the last hour of a large company event. The dashboard needs to display the associated metrics with a latency of less than 1 minute.Which solution meets these requirements?

Question 18: You need to migrate data to AWS. It is estimated that the data transfer will take over a month via the current AWS Direct Connect connection your company has set up. Which AWS tool should you use?

Question 19: You currently have an on-premises Oracle database and have decided to leverage AWS and use Aurora. You need to do this as quickly as possible. How do you achieve this?

Question 22: A data lake is a central repository that enables which operation?

Question 23: What is the most cost-effective storage option for your data lake?

Question 24: Which services are used in the processing layer of a data lake architecture? (SELECT TWO)

Question 25: Which services can be used for data ingestion into your data lake? (SELECT TWO)

Question 26: Which service uses continuous data replication with high availability to consolidate databases into a petabyte-scale data warehouse by streaming data to amazon Redshift and Amazon S3?

Question 27: What is the AWS Glue Data Catalog?

Questions 28: What AWS Glue feature “catalogs” your data?

Question 29: During your data preparation stage, the raw data has been enriched to support additional insights. You need to improve query performance and reduce costs of the final analytics solution.

Which data formats meet these requirements (SELECT TWO)

Question 30: In which scenario would you use AWS Glue jobs?

Question 31: Your data resides in multiple data stores, including Amazon S3, Amazon RDS, and Amazon DynamoDB. You need to efficiently query the combined datasets.

Which tool can achieve this, using a single query, without moving data?

Question 32: Which benefit do you achieve by using AWS Lake Formation to build data lakes?

Question 33: What are the three stages to set up a data lake using AWS Lake Formation? (SELECT THREE)

Question 35: A digital media customer needs to quickly build a data lake solution for the data housed in a PostgreSQL database. As a solutions architect, what service and feature would meet this requirement?

Question 36: AWS Lake Formation has a set of suggested personas and IAM permissions. Which is a required persona?

Question 37: Which three types of blueprints does AWS Lake Formation support? (SELECT THREE)

Question 38: Which one of the following is the best description of the capabilities of Amazon QuickSight?

Question 39: Which benefits are provided by Amazon Redshift? (Select TWO)

Djamga Data Sciences Big Data – Data Analytics Youtube Playlist

Big Data – Data Analytics Jobs:

Big Data – Data Analytics – Data Sciences Latest News:

DATA ANALYTICS Q&A:

What Is a Data Scientist?

Do All Data Scientists Hold Graduate Degrees?

How Are Data Scientists Different From Data Analysts?

How Is Data Science Used in Medicine?

What are data scientists paid?

What is the workday like for a data scientist?

How do I become a Data Scientist?

What Should I Learn? What Order Do I Learn Them?

How Do I Learn Python?

How Do I Learn R?

How Do I Learn SQL?

Curated Threads & Resources

How Do I Learn Calculus?

Curated Threads & Resources about Data Science and Data Analytics

How Do I Learn Probability?

How Do I Learn Linear Algebra?

Data visualization example: Crypto race: DOGE (red) versus BTC (blue), 5/6/2020 – 5/5/2021

Basic Data Lake Architecture

Data Analytics Architecture on AWS

Data Analytics Process

Data Lake Storage:

Event Driven Data Analytics Workflow on AWS

What is a Data Lake?

What is a Data Warehouse?

What are benefits of a data warehouse?

Data Lake vs Data Warehouse – Comparison

Transactional Data Ingestion

Structured Query Language (SQL)

Comparison of OLTP and OLAP

What is Amazon Macie?

What is AWS Glue DataBrew?

What is AWS Glue?

What is Amazon Athena?

What is AWS Lake Formation?

What is Amazon Quicksight?

What is SPICE?

Serverless data lake reference architecture:

What are Data Lakes Best Practices?

References:

Top 30 AWS Certified Developer Associate Exam Tips

AWS Certified Developer Associate Exam Prep Urls

AI Dashboard is available on the Web, Apple, Google, and Microsoft, PRO version

What is the AWS Certified Developer Associate Exam?

The AWS Certified Developer – Associate examination is intended for individuals who perform a development role and have one or more years of hands-on experience developing and maintaining an AWS-based application. It validates an examinee’s ability to:

There are two types of questions on the examination:

AWS Certified Developer Associate info and details

Question 16: A data engineer needs to create a dashboard to display social media trends during the last hour of a large company event. The dashboard needs to display the associated metrics with a latency of less than 1 minute.
Which solution meets these requirements?