In this blog, we talk about big data and data analytics; we also give you the last updated top 20 AWS Certified Data Analytics – Specialty Questions and Answers Dumps
I- The AWS Certified Data Analytics – Specialty (DAS-C01) examination is intended for individuals who perform in a data analytics-focused role. This exam validates an examinee’s comprehensive understanding of using AWS services to design, build, secure, and maintain analytics solutions that provide insight from data.
The AWS Certified Data Analytics – Specialty (DAS-C01) covers the following domains:
Domain 1: Collection 18%
Domain 2: Storage and Data Management 22%
Domain 3: Processing 24%
Domain 4: Analysis and Visualization 18%
Domain 5: Security 18%
Below are the Top 20 AWS Certified Data Analytics – Specialty Questions and Answers Dumps and References –
Question1:What combination of services do you need for the following requirements: accelerate petabyte-scale data transfers, load streaming data, and the ability to create scalable, private connections. Select the correct answer order.
A) Snowball, Kinesis Firehose, Direct Connect
B) Data Migration Services, Kinesis Firehose, Direct Connect
C) Snowball, Data Migration Services, Direct Connect
D) Snowball, Direct Connection, Kinesis Firehose
AWS has many options to help get data into the cloud, including secure devices like AWS Import/Export Snowball to accelerate petabyte-scale data transfers, Amazon Kinesis Firehose to load streaming data, and scalable private connections through AWS Direct Connect.
Question 3: There is a five-day car rally race across Europe. The race coordinators are using a Kinesis stream and IoT sensors to monitor the movement of the cars. Each car has a sensor and data is getting back to the stream with the default stream settings. On the last day of the rally, data is sent to S3. When you go to interpret the data in S3, there is only data for the last day and nothing for the first 4 days. Which of the following is the most probable cause of this?
A) You did not have versioning enabled and would need to create individual buckets to prevent the data from being overwritten.
B) Data records are only accessible for a default of 24 hours from the time they are added to a stream.
C) One of the sensors failed, so there was no data to record.
D) You needed to use EMR to send the data to S3; Kinesis Streams are only compatible with DynamoDB.
Streams support changes to the data record retention period of your stream. An Amazon Kinesis stream is an ordered sequence of data records, meant to be written to and read from in real-time. Data records are therefore stored in shards in your stream temporarily. The period from when a record is added to when it is no longer accessible is called the retention period. An Amazon Kinesis stream stores records for 24 hours by default, up to 168 hours.
Question 4: A publisher website captures user activity and sends clickstream data to Amazon Kinesis Data Streams. The publisher wants to design a cost-effective solution to process the data to create a timeline of user activity within a session. The solution must be able to scale depending on the number of active sessions.
Which solution meets these requirements?
A) Include a variable in the clickstream data from the publisher website to maintain a counter for the number of active user sessions. Use a timestamp for the partition key for the stream. Configure the consumer application to read the data from the stream and change the number of processor threads based upon the counter. Deploy the consumer application on Amazon EC2 instances in an EC2 Auto Scaling group.
B) Include a variable in the clickstream to maintain a counter for each user action during their session. Use the action type as the partition key for the stream. Use the Kinesis Client Library (KCL) in the consumer application to retrieve the data from the stream and perform the processing. Configure the consumer application to read the data from the stream and change the number of processor threads based upon the
counter. Deploy the consumer application on AWS Lambda.
C) Include a session identifier in the clickstream data from the publisher website and use as the partition key for the stream. Use the Kinesis Client Library (KCL) in the consumer application to retrieve the data from the stream and perform the processing. Deploy the consumer application on Amazon EC2 instances in an
EC2 Auto Scaling group. Use an AWS Lambda function to reshard the stream based upon Amazon CloudWatch alarms.
D) Include a variable in the clickstream data from the publisher website to maintain a counter for the number of active user sessions. Use a timestamp for the partition key for the stream. Configure the consumer application to read the data from the stream and change the number of processor threads based upon the counter. Deploy the consumer application on AWS Lambda.
Partitioning by the session ID will allow a single processor to process all the actions for a user session in order. An AWS Lambda function can call the UpdateShardCount API action to change the number of shards in the stream. The KCL will automatically manage the number of processors to match the number of shards. Amazon EC2 Auto Scaling will assure the correct number of instances are running to meet the processing load.
Question 5: Your company has two batch processing applications that consume financial data about the day’s stock transactions. Each transaction needs to be stored durably and guarantee that a record of each application is delivered so the audit and billing batch processing applications can process the data. However, the two applications run separately and several hours apart and need access to the same transaction information. After reviewing the transaction information for the day, the information no longer needs to be stored. What is the best way to architect this application?
A) Use SQS for storing the transaction messages; when the billing batch process performs first and consumes the message, write the code in a way that does not remove the message after consumed, so it is available for the audit application several hours later. The audit application can consume the SQS message and remove it from the queue when completed.
B) Use Kinesis to store the transaction information. The billing application will consume data from the stream and the audit application can consume the same data several hours later.
C) Store the transaction information in a DynamoDB table. The billing application can read the rows while the audit application will read the rows then remove the data.
D) Use SQS for storing the transaction messages. When the billing batch process consumes each message, have the application create an identical message and place it in a different SQS for the audit application to use several hours later.
SQS would make this more difficult because the data does not need to persist after a full day.
Kinesis appears to be the best solution that allows multiple consumers to easily interact with the records.
Question 6: A company is currently using Amazon DynamoDB as the database for a user support application. The company is developing a new version of the application that will store a PDF file for each support case ranging in size from 1–10 MB. The file should be retrievable whenever the case is accessed in the application.
How can the company store the file in the MOST cost-effective manner?
A) Store the file in Amazon DocumentDB and the document ID as an attribute in the DynamoDB table.
B) Store the file in Amazon S3 and the object key as an attribute in the DynamoDB table.
C) Split the file into smaller parts and store the parts as multiple items in a separate DynamoDB table.
D) Store the file as an attribute in the DynamoDB table using Base64 encoding.
Use Amazon S3 to store large attribute values that cannot fit in an Amazon DynamoDB item. Store each file as an object in Amazon S3 and then store the object path in the DynamoDB item.
Question 7: Your client has a web app that emits multiple events to Amazon Kinesis Streams for reporting purposes. Critical events need to be immediately captured before processing can continue, but informational events do not need to delay processing. What solution should your client use to record these types of events without unnecessarily slowing the application?
A) Log all events using the Kinesis Producer Library.
B) Log critical events using the Kinesis Producer Library, and log informational events using the PutRecords API method.
C) Log critical events using the PutRecords API method, and log informational events using the Kinesis Producer Library.
D) Log all events using the PutRecords API method.
The PutRecords API can be used in code to be synchronous; it will wait for the API request to complete before the application continues. This means you can use it when you need to wait for the critical events to finish logging before continuing. The Kinesis Producer Library is asynchronous and can send many messages without needing to slow down your application. This makes the KPL ideal for the sending of many non-critical alerts asynchronously.
Question 8: You work for a start-up that tracks commercial delivery trucks via GPS. You receive coordinates that are transmitted from each delivery truck once every 6 seconds. You need to process these coordinates in near real-time from multiple sources and load them into Elasticsearch without significant technical overhead to maintain. Which tool should you use to digest the data?
A) Amazon SQS
B) Amazon EMR
C) AWS Data Pipeline
D) Amazon Kinesis Firehose
Amazon Kinesis Firehose is the easiest way to load streaming data into AWS. It can capture, transform, and load streaming data into Amazon S3, Amazon Redshift, and Amazon Elasticsearch Service, enabling near real-time analytics with existing business intelligence tools and dashboards.
Question 9: A company needs to implement a near-real-time fraud prevention feature for its ecommerce site. User and order details need to be delivered to an Amazon SageMaker endpoint to flag suspected fraud. The amount of input data needed for the inference could be as much as 1.5 MB.
Which solution meets the requirements with the LOWEST overall latency?
A) Create an Amazon Managed Streaming for Kafka cluster and ingest the data for each order into a topic. Use a Kafka consumer running on Amazon EC2 instances to read these messages and invoke the Amazon SageMaker endpoint.
B) Create an Amazon Kinesis Data Streams stream and ingest the data for each order into the stream. Create an AWS Lambda function to read these messages and invoke the Amazon SageMaker endpoint.
C) Create an Amazon Kinesis Data Firehose delivery stream and ingest the data for each order into the stream. Configure Kinesis Data Firehose to deliver the data to an Amazon S3 bucket. Trigger an AWS Lambda function with an S3 event notification to read the data and invoke the Amazon SageMaker endpoint.
D) Create an Amazon SNS topic and publish the data for each order to the topic. Subscribe the Amazon SageMaker endpoint to the SNS topic.
An Amazon Managed Streaming for Kafka cluster can be used to deliver the messages with very low latency. It has a configurable message size that can handle the 1.5 MB payload.
Question 10: You need to filter and transform incoming messages coming from a smart sensor you have connected with AWS. Once messages are received, you need to store them as time series data in DynamoDB. Which AWS service can you use?
A) IoT Device Shadow Service
D) IoT Rules Engine
The IoT rules engine will allow you to send sensor data over to AWS services like DynamoDB
Question 11: A media company is migrating its on-premises legacy Hadoop cluster with its associated data processing scripts and workflow to an Amazon EMR environment running the latest Hadoop release. The developers want to reuse the Java code that was written for data processing jobs for the on-premises cluster.
Which approach meets these requirements?
A) Deploy the existing Oracle Java Archive as a custom bootstrap action and run the job on the EMR cluster.
B) Compile the Java program for the desired Hadoop version and run it using a CUSTOM_JAR step on the EMR cluster.
C) Submit the Java program as an Apache Hive or Apache Spark step for the EMR cluster.
D) Use SSH to connect the master node of the EMR cluster and submit the Java program using the AWS CLI.
A CUSTOM JAR step can be configured to download a JAR file from an Amazon S3 bucket and execute it. Since the Hadoop versions are different, the Java application has to be recompiled.
Question 12: You currently have databases running on-site and in another data center off-site. What service allows you to consolidate to one database in Amazon?
A) AWS Kinesis
B) AWS Database Migration Service
C) AWS Data Pipeline
D) AWS RDS Aurora
AWS Database Migration Service can migrate your data to and from most of the widely used commercial and open source databases. It supports homogeneous migrations such as Oracle to Oracle, as well as heterogeneous migrations between different database platforms, such as Oracle to Amazon Aurora. Migrations can be from on-premises databases to Amazon RDS or Amazon EC2, databases running on EC2 to RDS, or vice versa, as well as from one RDS database to another RDS database.
Question 13: An online retail company wants to perform analytics on data in large Amazon S3 objects using Amazon EMR. An Apache Spark job repeatedly queries the same data to populate an analytics dashboard. The analytics team wants to minimize the time to load the data and create the dashboard.
Which approaches could improve the performance? (Select TWO.)
A) Copy the source data into Amazon Redshift and rewrite the Apache Spark code to create analytical reports by querying Amazon Redshift.
B) Copy the source data from Amazon S3 into Hadoop Distributed File System (HDFS) using s3distcp.
C) Load the data into Spark DataFrames.
D) Stream the data into Amazon Kinesis and use the Kinesis Connector Library (KCL) in multiple Spark jobs to perform analytical jobs.
E) Use Amazon S3 Select to retrieve the data necessary for the dashboards from the S3 objects.
C and E
One of the speed advantages of Apache Spark comes from loading data into immutable dataframes, which can be accessed repeatedly in memory. Spark DataFrames organizes distributed data into columns. This makes summaries and aggregates much quicker to calculate. Also, instead of loading an entire large Amazon S3 object, load only what is needed using Amazon S3 Select. Keeping the data in Amazon S3 avoids loading the large dataset into HDFS.
Question 14: You have been hired as a consultant to provide a solution to integrate a client’s on-premises data center to AWS. The customer requires a 300 Mbps dedicated, private connection to their VPC. Which AWS tool do you need?
A) VPC peering
B) Data Pipeline
C) Direct Connect
Direct Connect will provide a dedicated and private connection to an AWS VPC.
Question 15: Your organization has a variety of different services deployed on EC2 and needs to efficiently send application logs over to a central system for processing and analysis. They’ve determined it is best to use a managed AWS service to transfer their data from the EC2 instances into Amazon S3 and they’ve decided to use a solution that will do what?
A) Installs the AWS Direct Connect client on all EC2 instances and uses it to stream the data directly to S3.
B) Leverages the Kinesis Agent to send data to Kinesis Data Streams and output that data in S3.
C) Ingests the data directly from S3 by configuring regular Amazon Snowball transactions.
D) Leverages the Kinesis Agent to send data to Kinesis Firehose and output that data in S3.
Kinesis Firehose is a managed solution, and log files can be sent from EC2 to Firehose to S3 using the Kinesis agent.
Question 16: A data engineer needs to create a dashboard to display social media trends during the last hour of a large company event. The dashboard needs to display the associated metrics with a latency of less than 1 minute.
Which solution meets these requirements?
A) Publish the raw social media data to an Amazon Kinesis Data Firehose delivery stream. Use Kinesis Data Analytics for SQL Applications to perform a sliding window analysis to compute the metrics and output the results to a Kinesis Data Streams data stream. Configure an AWS Lambda function to save the stream data to an Amazon DynamoDB table. Deploy a real-time dashboard hosted in an Amazon S3 bucket to read and display the metrics data stored in the DynamoDB table.
B) Publish the raw social media data to an Amazon Kinesis Data Firehose delivery stream. Configure the stream to deliver the data to an Amazon Elasticsearch Service cluster with a buffer interval of 0 seconds. Use Kibana to perform the analysis and display the results.
C) Publish the raw social media data to an Amazon Kinesis Data Streams data stream. Configure an AWS Lambda function to compute the metrics on the stream data and save the results in an Amazon S3 bucket. Configure a dashboard in Amazon QuickSight to query the data using Amazon Athena and display the results.
D) Publish the raw social media data to an Amazon SNS topic. Subscribe an Amazon SQS queue to the topic. Configure Amazon EC2 instances as workers to poll the queue, compute the metrics, and save the results to an Amazon Aurora MySQL database. Configure a dashboard in Amazon QuickSight to query the data in Aurora and display the results.
Question 17: A real estate company is receiving new property listing data from its agents through .csv files every day and storing these files in Amazon S3. The data analytics team created an Amazon QuickSight visualization report that uses a dataset imported from the S3 files. The data analytics team wants the visualization report to reflect the current data up to the previous day. How can a data analyst meet these requirements?
A) Schedule an AWS Lambda function to drop and re-create the dataset daily.
B) Configure the visualization to query the data in Amazon S3 directly without loading the data into SPICE.
C) Schedule the dataset to refresh daily.
D) Close and open the Amazon QuickSight visualization.
Datasets created using Amazon S3 as the data source are automatically imported into SPICE. The Amazon QuickSight console allows for the refresh of SPICE data on a schedule.
Question 18: You need to migrate data to AWS. It is estimated that the data transfer will take over a month via the current AWS Direct Connect connection your company has set up. Which AWS tool should you use?
A) Establish additional Direct Connect connections.
B) Use Data Pipeline to migrate the data in bulk to S3.
C) Use Kinesis Firehose to stream all new and existing data into S3.
As a general rule, if it takes more than one week to upload your data to AWS using the spare capacity of your existing Internet connection, then you should consider using Snowball. For example, if you have a 100 Mb connection that you can solely dedicate to transferring your data and need to transfer 100 TB of data, it takes more than 100 days to complete a data transfer over that connection. You can make the same transfer by using multiple Snowballs in about a week.
Question 19: You currently have an on-premises Oracle database and have decided to leverage AWS and use Aurora. You need to do this as quickly as possible. How do you achieve this?
A) It is not possible to migrate an on-premises database to AWS at this time.
B) Use AWS Data Pipeline to create a target database, migrate the database schema, set up the data replication process, initiate the full load and a subsequent change data capture and apply, and conclude with a switchover of your production environment to the new database once the target database is caught up with the source database.
C) Use AWS Database Migration Services and create a target database, migrate the database schema, set up the data replication process, initiate the full load and a subsequent change data capture and apply, and conclude with a switch-over of your production environment to the new database once the target database is caught up with the source database.
D) Use AWS Glue to crawl the on-premises database schemas and then migrate them into AWS with Data Pipeline jobs.
DMS can efficiently support this sort of migration using the steps outlined. While AWS Glue can help you crawl schemas and store metadata on them inside of Glue for later use, it isn't the best tool for actually transitioning a database over to AWS itself. Similarly, while Data Pipeline is great for ETL and ELT jobs, it isn't the best option to migrate a database over to AWS.
Question 20: A financial company uses Amazon EMR for its analytics workloads. During the company’s annual security audit, the security team determined that none of the EMR clusters’ root volumes are encrypted. The security team recommends the company encrypt its EMR clusters’ root volume as soon as possible.
Which solution would meet these requirements?
A) Enable at-rest encryption for EMR File System (EMRFS) data in Amazon S3 in a security configuration. Re-create the cluster using the newly created security configuration.
B) Specify local disk encryption in a security configuration. Re-create the cluster using the newly created security configuration.
C) Detach the Amazon EBS volumes from the master node. Encrypt the EBS volume and attach it back to the master node.
D) Re-create the EMR cluster with LZO encryption enabled on all volumes.
Local disk encryption can be enabled as part of a security configuration to encrypt root and storage volumes.
What are unique distinctions and similarities between AWS, Azure and Google Cloud services? For each AWS service, what is the equivalent Azure and Google Cloud service? For each Azure service, what is the corresponding Google Service? AWS Services vs Azure vs Google Services? Side by side comparison between AWS, Google Cloud and Azure Service?
Category: Marketplace Easy-to-deploy and automatically configured third-party applications, including single virtual machine or multiple virtual machine solutions. References: [AWS]:AWS Marketplace [Azure]:Azure Marketplace [Google]:Google Cloud Marketplace Tags: #AWSMarketplace, #AzureMarketPlace, #GoogleMarketplace Differences: They are both digital catalog with thousands of software listings from independent software vendors that make it easy to find, test, buy, and deploy software that runs on their respective cloud platform.
Tags: #AmazonLex, #CogintiveServices, #AzureSpeech, #Api.ai, #DialogFlow, #Tensorflow Differences: api.ai provides us with such a platform which is easy to learn and comprehensive to develop conversation actions. It is a good example of the simplistic approach to solving complex man to machine communication problem using natural language processing in proximity to machine learning. Api.ai supports context based conversations now, which reduces the overhead of handling user context in session parameters. On the other hand in Lex this has to be handled in session. Also, api.ai can be used for both voice and text based conversations (assistant actions can be easily created using api.ai).
Category: Serverless Description: Integrate systems and run backend processes in response to events or schedules without provisioning or managing servers. References: [AWS]:AWS Lambda [Azure]:Azure Functions [Google]:Google Cloud Functions Tags:#AWSLAmbda, #AzureFunctions, #GoogleCloudFunctions Differences: Both AWS Lambda and Microsoft Azure Functions and Google Cloud Functions offer dynamic, configurable triggers that you can use to invoke your functions on their platforms. AWS Lambda, Azure and Google Cloud Functions support Node.js, Python, and C#. The beauty of serverless development is that, with minor changes, the code you write for one service should be portable to another with little effort – simply modify some interfaces, handle any input/output transforms, and an AWS Lambda Node.JS function is indistinguishable from a Microsoft Azure Node.js Function. AWS Lambda provides further support for Python and Java, while Azure Functions provides support for F# and PHP. AWS Lambda is built from the AMI, which runs on Linux, while Microsoft Azure Functions run in a Windows environment. AWS Lambda uses the AWS Machine architecture to reduce the scope of containerization, letting you spin up and tear down individual pieces of functionality in your application at will.
Category:Caching Description:An in-memory–based, distributed caching service that provides a high-performance store typically used to offload non transactional work from a database. References: [AWS]:AWS ElastiCache (works as an in-memory data store and cache to support the most demanding applications requiring sub-millisecond response times.) [Azure]:Azure Cache for Redis (based on the popular software Redis. It is typically used as a cache to improve the performance and scalability of systems that rely heavily on backend data-stores.) [Google]:Memcache (In-memory key-value store, originally intended for caching) Tags:#Redis, #Memcached <Differences: They all support horizontal scaling via sharding.They all improve the performance of web applications by allowing you to retrive information from fast, in-memory caches, instead of relying on slower disk-based databases.”, “Differences”: “ElastiCache supports Memcached and Redis. Memcached Cloud provides various data persistence options as well as remote backups for disaster recovery purposes. Redis offers persistence to disk, Memcache does not. This can be very helpful if you cache lots of data, since you remove the slowness around having a fully cold cache. Redis also offers several extra data structures that Memcache doesn’t— Lists, Sets, Sorted Sets, etc. Memcache only has Key/Value pairs. Memcache is multi-threaded. Redis is single-threaded and event driven. Redis is very fast, but it’ll never be multi-threaded. At hight scale, you can squeeze more connections and transactions out of Memcache. Memcache tends to be more memory efficient. This can make a big difference around the magnitude of 10s of millions or 100s of millions of keys. ElastiCache supports Memcached and Redis. Memcached Cloud provides various data persistence options as well as remote backups for disaster recovery purposes. Redis offers persistence to disk, Memcache does not. This can be very helpful if you cache lots of data, since you remove the slowness around having a fully cold cache. Redis also offers several extra data structures that Memcache doesn’t— Lists, Sets, Sorted Sets, etc. Memcache only has Key/Value pairs. Memcache is multi-threaded. Redis is single-threaded and event driven. Redis is very fast, but it’ll never be multi-threaded. At hight scale, you can squeeze more connections and transactions out of Memcache. Memcache tends to be more memory efficient. This can make a big difference around the magnitude of 10s of millions or 100s of millions of keys.
Category: Enterprise application services Description:Fully integrated Cloud service providing communications, email, document management in the cloud and available on a wide variety of devices. References: [AWS]:Amazon WorkMail, Amazon WorkDocs, Amazon Kendra (Sync and Index) [Azure]:Office 365 [Google]:G Suite Tags: #AmazonWorkDocs, #Office365, #GoogleGSuite Differences: G suite document processing applications like Google Docs are far behind Office 365 popular Word and Excel software, but G Suite User interface is intuite, simple and easy to navigate. Office 365 is too clunky. Get 20% off G-Suite Business Plan with Promo Code: PCQ49CJYK7EATNC
Category: Management Description: A unified management console that simplifies building, deploying, and operating your cloud resources. References: [AWS]:AWS Management Console, Trusted Advisor, AWS Usage and Billing Report, AWS Application Discovery Service, Amazon EC2 Systems Manager, AWS Personal Health Dashboard, AWS Compute Optimizer (Identify optimal AWS Compute resources) [Azure]:Azure portal, Azure Advisor, Azure Billing API, Azure Migrate, Azure Monitor, Azure Resource Health [Google]:Google CLoud Platform, Cost Management, Security Command Center, StackDriver Tags: #AWSConsole, #AzurePortal, #GoogleCloudConsole, #TrustedAdvisor, #AzureMonitor, #SecurityCommandCenter Differences: AWS Console categorizes its Infrastructure as a Service offerings into Compute, Storage and Content Delivery Network (CDN), Database, and Networking to help businesses and individuals grow. Azure excels in the Hybrid Cloud space allowing companies to integrate onsite servers with cloud offerings. Google has a strong offering in containers, since Google developed the Kubernetes standard that AWS and Azure now offer. GCP specializes in high compute offerings like Big Data, analytics and machine learning. It also offers considerable scale and load balancing – Google knows data centers and fast response time.
Enables both Speech to Text, and Text into Speech capabilities. The Speech Services are the unification of speech-to-text, text-to-speech, and speech-translation into a single Azure subscription. It’s easy to speech enable your applications, tools, and devices with the Speech SDK, Speech Devices SDK, or REST APIs. Amazon Polly is a Text-to-Speech (TTS) service that uses advanced deep learning technologies to synthesize speech that sounds like a human voice. With dozens of lifelike voices across a variety of languages, you can select the ideal voice and build speech-enabled applications that work in many different countries. Amazon Transcribe is an automatic speech recognition (ASR) service that makes it easy for developers to add speech-to-text capability to their applications. Using the Amazon Transcribe API, you can analyze audio files stored in Amazon S3 and have the service return a text file of the transcribed speech.
Computer Vision: Extract information from images to categorize and process visual data. Amazon Rekognition is a simple and easy to use API that can quickly analyze any image or video file stored in Amazon S3. Amazon Rekognition is always learning from new data, and we are continually adding new labels and facial recognition features to the service.
Face: Detect, identy, and analyze faces in photos.
The Virtual Assistant Template brings together a number of best practices we’ve identified through the building of conversational experiences and automates integration of components that we’ve found to be highly beneficial to Bot Framework developers.
Redeploy and extend your VMware-based enterprise workloads to Azure with Azure VMware Solution by CloudSimple. Keep using the VMware tools you already know to manage workloads on Azure without disrupting network, security, or data protection policies.
Fully managed service that enables developers to deploy microservices applications without managing virtual machines, storage, or networking. AWS App Mesh is a service mesh that provides application-level networking to make it easy for your services to communicate with each other across multiple types of compute infrastructure. App Mesh standardizes how your services communicate, giving you end-to-end visibility and ensuring high-availability for your applications.
Integrate systems and run backend processes in response to events or schedules without provisioning or managing servers. AWS Lambda is an event-driven, serverless computing platform provided by Amazon as a part of the Amazon Web Services. It is a computing service that runs code in response to events and automatically manages the computing resources required by that code
Managed relational database service where resiliency, scale, and maintenance are primarily handled by the platform. Amazon Relational Database Service is a distributed relational database service by Amazon Web Services. It is a web service running “in the cloud” designed to simplify the setup, operation, and scaling of a relational database for use in applications. Administration processes like patching the database software, backing up databases and enabling point-in-time recovery are managed automatically. Scaling storage and compute resources can be performed by a single API call as AWS does not offer an ssh connection to RDS instances.
An in-memory–based, distributed caching service that provides a high-performance store typically used to offload non transactional work from a database. Amazon ElastiCache is a fully managed in-memory data store and cache service by Amazon Web Services. The service improves the performance of web applications by retrieving information from managed in-memory caches, instead of relying entirely on slower disk-based databases. ElastiCache supports two open-source in-memory caching engines: Memcached and Redis.
Migration of database schema and data from one database format to a specific database technology in the cloud. AWS Database Migration Service helps you migrate databases to AWS quickly and securely. The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database. The AWS Database Migration Service can migrate your data to and from most widely used commercial and open-source databases.
Comprehensive solution for collecting, analyzing, and acting on telemetry from your cloud and on-premises environments. Amazon CloudWatch is a monitoring and observability service built for DevOps engineers, developers, site reliability engineers (SREs), and IT managers. CloudWatch provides you with data and actionable insights to monitor your applications, respond to system-wide performance changes, optimize resource utilization, and get a unified view of operational health. CloudWatch collects monitoring and operational data in the form of logs, metrics, and events, providing you with a unified view of AWS resources, applications, and services that run on AWS and on-premises servers. AWS X-Ray is an application performance management service that enables a developer to analyze and debug applications in the Amazon Web Services (AWS) public cloud. A developer can use AWS X-Ray to visualize how a distributed application is performing during development or production, and across multiple AWS regions and accounts.
A cloud service for collaborating on code development. AWS CodeDeploy is a fully managed deployment service that automates software deployments to a variety of compute services such as Amazon EC2, AWS Fargate, AWS Lambda, and your on-premises servers. AWS CodeDeploy makes it easier for you to rapidly release new features, helps you avoid downtime during application deployment, and handles the complexity of updating your applications. AWS CodePipeline is a fully managed continuous delivery service that helps you automate your release pipelines for fast and reliable application and infrastructure updates. CodePipeline automates the build, test, and deploy phases of your release process every time there is a code change, based on the release model you define. AWS CodeCommit is a source code storage and version-control service for Amazon Web Services’ public cloud customers. CodeCommit was designed to help IT teams collaborate on software development, including continuous integration and application delivery.
Collection of tools for building, debugging, deploying, diagnosing, and managing multiplatform scalable apps and services. The AWS Developer Tools are designed to help you build software like Amazon. They facilitate practices such as continuous delivery and infrastructure as code for serverless, containers, and Amazon EC2.
Built on top of the native REST API across all cloud services, various programming language-specific wrappers provide easier ways to create solutions. The AWS Command Line Interface (CLI) is a unified tool to manage your AWS services. With just one tool to download and configure, you can control multiple AWS services from the command line and automate them through scripts.
Configures and operates applications of all shapes and sizes, and provides templates to create and manage a collection of resources. AWS OpsWorks is a configuration management service that provides managed instances of Chef and Puppet. Chef and Puppet are automation platforms that allow you to use code to automate the configurations of your servers.
Provides a way for users to automate the manual, long-running, error-prone, and frequently repeated IT tasks. AWS CloudFormation provides a common language for you to describe and provision all the infrastructure resources in your cloud environment. CloudFormation allows you to use a simple text file to model and provision, in an automated and secure manner, all the resources needed for your applications across all regions and accounts.
Provides an isolated, private environment in the cloud. Users have control over their virtual networking environment, including selection of their own IP address range, creation of subnets, and configuration of route tables and network gateways.
Azure Digital Twins is an IoT service that helps you create comprehensive models of physical environments. Create spatial intelligence graphs to model the relationships and interactions between people, places, and devices. Query data from a physical space rather than disparate sensors.
Allows users to securely control access to services and resources while offering data security and protection. Create and manage users and groups, and use permissions to allow and deny access to resources.
Azure Policy is a service in Azure that you use to create, assign, and manage policies. These policies enforce different rules and effects over your resources, so those resources stay compliant with your corporate standards and service level agreements.
Azure management groups provide a level of scope above subscriptions. You organize subscriptions into containers called “management groups” and apply your governance conditions to the management groups. All subscriptions within a management group automatically inherit the conditions applied to the management group. Management groups give you enterprise-grade management at a large scale, no matter what type of subscriptions you have.
Easily join your distributed microservice architectures into a single global application using HTTP load balancing and path-based routing rules. Automate turning up new regions and scale-out with API-driven global actions, and independent fault-tolerance to your back end microservices in Azure—or anywhere.
Azure Stack is a hybrid cloud platform that enables you to run Azure services in your company’s or service provider’s datacenter. As a developer, you can build apps on Azure Stack. You can then deploy them to either Azure Stack or Azure, or you can build truly hybrid apps that take advantage of connectivity between an Azure Stack cloud and Azure.
Basically, it all comes down to what your organizational needs are and if there’s a particular area that’s especially important to your business (ex. serverless, or integration with Microsoft applications).
Some of the main things it comes down to is compute options, pricing, and purchasing options.
Here’s a brief comparison of the compute option features across cloud providers:
Here’s an example of a few instances’ costs (all are Linux OS):
Each provider offers a variety of options to lower costs from the listed On-Demand prices. These can fall under reservations, spot and preemptible instances and contracts.
Both AWS and Azure offer a way for customers to purchase compute capacity in advance in exchange for a discount: AWS Reserved Instances and Azure Reserved Virtual Machine Instances. There are a few interesting variations between the instances across the cloud providers which could affect which is more appealing to a business.
Another discounting mechanism is the idea of spot instances in AWS and low-priority VMs in Azure. These options allow users to purchase unused capacity for a steep discount.
With AWS and Azure, enterprise contracts are available. These are typically aimed at enterprise customers, and encourage large companies to commit to specific levels of usage and spend in exchange for an across-the-board discount – for example, AWS EDPs and Azure Enterprise Agreements.
You can read more about the differences between AWS and Azure to help decide which your business should use in this blog post