Definition 1: Amazon S3 or Amazon Simple Storage Service is a “simple storage service” offered by Amazon Web Services that provides object storage through a web service interface. Amazon S3 uses the same scalable storage infrastructure that Amazon.com uses to run its global e-commerce network.
Definition 2: Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance.
AWS S3 Explained graphically:
AWS S3 Facts and summaries
S3 is a universal namespace, meaning each S3 bucket you create must have a unique name that is not being used by anyone else in the world.
S3 is object-based, i.e. it allows you to upload files.
Files can be from 0 Bytes to 5 TB
S3 has unlimited storage.
Files are stored in Buckets.
Read-after-write consistency for PUTs of new objects
Eventual consistency for overwrite PUTs and DELETEs (changes can take some time to propagate)
S3 Standard (durable, immediately available, frequently accessed)
Amazon S3 Intelligent-Tiering (S3 Intelligent-Tiering): It works by storing objects in two access tiers: one tier that is optimized for frequent access and another lower-cost tier that is optimized for infrequent access.
S3 – One Zone-Infrequent Access (S3 One Zone-IA): Same as Standard-IA, but data is stored in a single Availability Zone only
S3 – Reduced Redundancy Storage (data that is easily reproducible, such as thumbnails, etc.)
Glacier – Archived data, where you can wait 3-5 hours before accessing
You can have a bucket that has different objects stored in S3 Standard, S3 Intelligent-Tiering, S3 Standard-IA, and S3 One Zone-IA.
The default URL for S3-hosted websites lists the bucket name first, followed by s3-website-region.amazonaws.com. Example: enoumen.com.s3-website-us-east-1.amazonaws.com
Core fundamentals of an S3 object
Key (name)
Value (data)
Version (ID)
Metadata
Sub-resources (used to manage bucket-specific configuration)
Bucket Policies, ACLs
CORS
Transfer Acceleration
Object-based storage only for files
Not suitable for installing an OS on.
Successful uploads will generate an HTTP 200 status code.
S3 Security – Summary
By default, all newly created buckets are PRIVATE.
You can set up access control to your buckets using:
Bucket Policies – Applied at the bucket level
Access Control Lists – Applied at an object level.
S3 buckets can be configured to create access logs, which log all requests made to the S3 bucket. These logs can be written to another bucket.
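As a minimal boto3 sketch (both bucket names are hypothetical, and the target bucket must already grant log-delivery permissions), access logging could be enabled like this:

```python
import boto3

s3 = boto3.client("s3")

# Write access logs for one bucket into a separate logging bucket,
# under a common key prefix.
s3.put_bucket_logging(
    Bucket="my-example-bucket",
    BucketLoggingStatus={
        "LoggingEnabled": {
            "TargetBucket": "my-log-bucket",
            "TargetPrefix": "access-logs/",
        }
    },
)
```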
S3 Encryption
Encryption In-Transit (SSL/TLS)
Encryption At Rest:
Server-Side Encryption (SSE-S3, SSE-KMS, SSE-C)
Client-Side Encryption
Remember that we can use a bucket policy to prevent unencrypted files from being uploaded, by creating a policy which only allows requests that include the x-amz-server-side-encryption header.
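As a sketch of what such a policy might look like (the bucket name is hypothetical), the following boto3 snippet denies any PutObject request that arrives without the x-amz-server-side-encryption header:

```python
import json
import boto3

s3 = boto3.client("s3")

# Hypothetical bucket name, for illustration only.
bucket = "my-example-bucket"

# Deny any PutObject request that omits the
# x-amz-server-side-encryption header.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyUnencryptedUploads",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
            "Condition": {"Null": {"s3:x-amz-server-side-encryption": "true"}},
        }
    ],
}

s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
```

With this Deny in place, a PUT without the header fails with an access-denied error, while uploads that specify SSE-S3 or SSE-KMS succeed.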
S3 CORS (Cross Origin Resource Sharing): CORS defines a way for client web applications that are loaded in one domain to interact with resources in a different domain.
Used to enable cross-origin access for your AWS resources, e.g. an S3-hosted website accessing JavaScript or image files located in another bucket. By default, resources in one bucket cannot access resources located in another. To allow this, we need to configure CORS on the bucket being accessed and enable access for the origin (bucket) attempting to access it.
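For illustration, a minimal boto3 sketch (bucket name and origin are hypothetical) that configures CORS on the bucket being accessed, allowing GET requests from the website bucket's origin:

```python
import boto3

s3 = boto3.client("s3")

# Allow the website bucket's origin to GET objects
# from the asset bucket (names hypothetical).
s3.put_bucket_cors(
    Bucket="my-asset-bucket",
    CORSConfiguration={
        "CORSRules": [
            {
                "AllowedOrigins": [
                    "http://my-site.s3-website-us-east-1.amazonaws.com"
                ],
                "AllowedMethods": ["GET"],
                "AllowedHeaders": ["*"],
                "MaxAgeSeconds": 3000,
            }
        ]
    },
)
```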
Always use the S3 website URL, not the regular bucket URL (e.g. https://s3-eu-west-2.amazonaws.com/acloudguru is the regular bucket URL form, which will not serve the website endpoint).
S3 CloudFront:
Edge locations are not just read-only – you can write to them too (i.e., PUT an object onto them).
Objects are cached for the life of the TTL (Time to Live)
You can clear cached objects, but you will be charged. (Invalidation)
S3 Performance Optimization – 2 main approaches to performance optimization for S3:
GET-Intensive Workloads – Use CloudFront
Mixed Workloads – Avoid sequential key names for your S3 objects. Instead, add a random prefix like a hex hash to the key name to prevent multiple objects from being stored on the same partition.
The best way to handle large objects uploads to the S3 service is to use the Multipart upload API. The Multipart upload API enables you to upload large objects in parts.
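As a hedged sketch, boto3's high-level transfer manager switches to multipart upload automatically once a file crosses a size threshold (the file and bucket names below are hypothetical):

```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Use multipart upload automatically for files above 100 MB,
# uploading up to 10 parts of 25 MB concurrently.
config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,
    multipart_chunksize=25 * 1024 * 1024,
    max_concurrency=10,
)

s3.upload_file(
    "large-backup.bin",          # hypothetical local file
    "my-example-bucket",         # hypothetical bucket
    "backups/large-backup.bin",  # destination key
    Config=config,
)
```

Below the threshold the file is sent as a single PUT; above it, parts are uploaded in parallel and can be retried individually on failure.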
You can enable versioning on a bucket, even if that bucket already has objects in it. The already existing objects, though, will show their versions as null. All new objects will have version IDs.
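A minimal sketch of this behavior with boto3 (bucket and key names hypothetical):

```python
import boto3

s3 = boto3.client("s3")

# Enable versioning on an existing bucket.
s3.put_bucket_versioning(
    Bucket="my-example-bucket",
    VersioningConfiguration={"Status": "Enabled"},
)

# Objects uploaded before versioning was enabled show a null version;
# new uploads receive a real version ID, returned in the response.
resp = s3.put_object(Bucket="my-example-bucket", Key="report.txt", Body=b"v2")
print(resp["VersionId"])
```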
Bucket names cannot start with a . or - character. S3 bucket names can contain both the . and - characters, but there can be only one . or - between labels. E.g., mybucket-com and mybucket.com are valid names, but mybucket--com and mybucket..com are not valid bucket names.
What is the maximum number of S3 buckets allowed per AWS account (by default)? 100
You successfully upload an item to the us-east-1 region. You then immediately make another API call and attempt to read the object. What will happen? All AWS regions now have read-after-write consistency for PUT operations of new objects. Read-after-write consistency allows you to retrieve objects immediately after creation in Amazon S3. Other actions still follow the eventual consistency model (where you will sometimes get stale results if you have recently made changes).
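A minimal sketch of what that guarantee means in practice (bucket and key hypothetical):

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "my-example-bucket", "new-object.txt"  # hypothetical

# Upload a brand-new object ...
s3.put_object(Bucket=bucket, Key=key, Body=b"hello")

# ... and read it back immediately: read-after-write consistency for
# new-object PUTs means this GET returns the data just written.
body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
assert body == b"hello"
```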
S3 bucket policies require a Principal to be defined. Review the AWS access policy elements documentation for details.
What checksums does Amazon S3 employ to detect data corruption? Amazon S3 uses a combination of Content-MD5 checksums and cyclic redundancy checks (CRCs) to detect data corruption. Amazon S3 performs these checksums on data at rest and repairs any corruption using redundant data. In addition, the service calculates checksums on all network traffic to detect corruption of data packets when storing or retrieving data.
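You can also supply a checksum yourself on upload. A small hedged example (bucket and key hypothetical) sending a Content-MD5 digest, which S3 verifies against the bytes it receives, rejecting the PUT on a mismatch:

```python
import base64
import hashlib
import boto3

s3 = boto3.client("s3")

data = b"payload to protect in transit"

# Base64-encoded MD5 digest of the payload; S3 recomputes the digest
# on arrival and fails the request if the two do not match.
md5 = base64.b64encode(hashlib.md5(data).digest()).decode()

s3.put_object(
    Bucket="my-example-bucket",  # hypothetical
    Key="checked-object.bin",
    Body=data,
    ContentMD5=md5,
)
```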
Q0: You’ve written an application that uploads objects onto an S3 bucket. The size of the object varies between 200 and 500 MB. You’ve seen that the application sometimes takes longer than expected to upload the object. You want to improve the performance of the application. Which of the following would you consider?
A. Create multiple threads and upload the objects in the multiple threads
B. Write the items in batches for better performance
C. Use the Multipart upload API
D. Enable versioning on the Bucket
Answer: C. All other options are invalid, since the best way to handle large object uploads to the S3 service is to use the Multipart upload API. The Multipart upload API enables you to upload large objects in parts. You can use this API to upload new large objects or make a copy of an existing object. Multipart uploading is a three-step process: you initiate the upload, you upload the object parts, and after you have uploaded all the parts, you complete the multipart upload. Upon receiving the complete-multipart-upload request, Amazon S3 constructs the object from the uploaded parts, and you can then access the object just as you would any other object in your bucket.
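The three steps map directly onto the low-level API. A minimal boto3 sketch (bucket, key, and file name are hypothetical):

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "my-example-bucket", "big-file.bin"  # hypothetical

# Step 1: initiate the upload.
upload = s3.create_multipart_upload(Bucket=bucket, Key=key)
upload_id = upload["UploadId"]

# Step 2: upload the parts (every part except the last must be >= 5 MB).
parts = []
with open("big-file.bin", "rb") as f:
    part_number = 1
    while chunk := f.read(5 * 1024 * 1024):
        resp = s3.upload_part(
            Bucket=bucket, Key=key, UploadId=upload_id,
            PartNumber=part_number, Body=chunk,
        )
        parts.append({"PartNumber": part_number, "ETag": resp["ETag"]})
        part_number += 1

# Step 3: complete the upload; S3 assembles the object from the parts.
s3.complete_multipart_upload(
    Bucket=bucket, Key=key, UploadId=upload_id,
    MultipartUpload={"Parts": parts},
)
```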
Q2: You are using AWS SAM templates to deploy a serverless application. Which of the following resource will embed application from Amazon S3 buckets?
A. AWS::Serverless::Api
B. AWS::Serverless::Application
C. AWS::Serverless::Layerversion
D. AWS::Serverless::Function
Answer: B. The AWS::Serverless::Application resource in an AWS SAM template is used to embed an application from Amazon S3 buckets. Reference: Declaring Serverless Resources
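This is roughly what the relevant fragment of a SAM template looks like; the resource name and S3 URL below are hypothetical placeholders:

```yaml
Resources:
  # Hypothetical nested application pulled from a packaged template in S3.
  SharedBackend:
    Type: AWS::Serverless::Application
    Properties:
      Location: https://s3.amazonaws.com/my-deployment-bucket/packaged-template.yaml
```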
Q3: A static website has been hosted on a bucket and is now being accessed by users. One of the web pages' JavaScript sections has been changed to access data which is hosted in another S3 bucket. Now that same web page is no longer loading in the browser. Which of the following can help alleviate the error?
A. Enable versioning for the underlying S3 bucket.
B. Enable Replication so that the objects get replicated to the other bucket
C. Enable CORS for the bucket
D. Change the Bucket policy for the bucket to allow access from the other bucket
Answer – C
Cross-origin resource sharing (CORS) defines a way for client web applications that are loaded in one domain to interact with resources in a different domain. With CORS support, you can build rich client-side web applications with Amazon S3 and selectively allow cross-origin access to your Amazon S3 resources.
Cross-Origin Resource Sharing: Use-case Scenarios. The following are example scenarios for using CORS:
Scenario 1: Suppose that you are hosting a website in an Amazon S3 bucket named website as described in Hosting a Static Website on Amazon S3. Your users load the website endpoint http://website.s3-website-us-east-1.amazonaws.com. Now you want to use JavaScript on the webpages that are stored in this bucket to be able to make authenticated GET and PUT requests against the same bucket by using the Amazon S3 API endpoint for the bucket, website.s3.amazonaws.com. A browser would normally block JavaScript from allowing those requests, but with CORS you can configure your bucket to explicitly enable cross-origin requests from website.s3-website-us-east-1.amazonaws.com.
Scenario 2: Suppose that you want to host a web font from your S3 bucket. Again, browsers require a CORS check (also called a preflight check) for loading web fonts. You would configure the bucket that is hosting the web font to allow any origin to make these requests.
Q4: Your mobile application includes a photo-sharing service that is expecting tens of thousands of users at launch. You will leverage Amazon Simple Storage Service (S3) for storage of the user images, and you must decide how to authenticate and authorize your users for access to these images. You also need to manage the storage of these images. Which two of the following approaches should you use? Choose two answers from the options below.
A. Create an Amazon S3 bucket per user, and use your application to generate the S3 URL for the appropriate content.
B. Use AWS Identity and Access Management (IAM) user accounts as your application-level user database, and offload the burden of authentication from your application code.
C. Authenticate your users at the application level, and use AWS Security Token Service (STS)to grant token-based authorization to S3 objects.
D. Authenticate your users at the application level, and send an SMS token message to the user. Create an Amazon S3 bucket with the same name as the SMS message token, and move the user’s objects to that bucket.
Answer: C. The AWS Security Token Service (STS) is a web service that enables you to request temporary, limited-privilege credentials for AWS Identity and Access Management (IAM) users or for users that you authenticate (federated users). The token can then be used to grant access to the objects in S3. You can then provide access to the objects based on key values generated via the user ID.
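As a hedged sketch of this flow (all names, ARNs, and the prefix scheme below are hypothetical), the application authenticates the user itself, then requests temporary credentials scoped to that user's own S3 prefix:

```python
import json
import boto3

sts = boto3.client("sts")

# After authenticating the user at the application level, request
# temporary credentials limited to that user's own prefix.
creds = sts.get_federation_token(
    Name="app-user-42",
    Policy=json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::photo-app-images/users/42/*",
        }],
    }),
    DurationSeconds=3600,
)["Credentials"]

# Build an S3 client that acts only with the temporary credentials.
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
```

The resulting client can act only within the intersection of the passed policy and the calling identity's own permissions.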
Q5: Both ACLs and Bucket Policies can be used to grant access to S3 buckets. Which of the following statements is true about ACLs and Bucket policies?
A. Bucket Policies are Written in JSON and ACLs are written in XML
B. ACLs can be attached to S3 objects or S3 Buckets
C. Bucket Policies and ACLs are written in JSON
D. Bucket policies are only attached to s3 buckets, ACLs are only attached to s3 objects
Answer: A. and B. Only bucket policies are written in JSON; ACLs are written in XML. While bucket policies are indeed only attached to S3 buckets, ACLs can be attached to S3 buckets OR S3 objects.
Q6: What are good options to improve S3 performance when you have significantly high numbers of GET requests?
A. Introduce random prefixes to S3 objects
B. Introduce random suffixes to S3 objects
C. Setup CloudFront for S3 objects
D. Migrate commonly used objects to Amazon Glacier
Answer: C. CloudFront caching is an excellent way to avoid putting extra strain on the S3 service and to improve the response times of requests by caching data closer to users at CloudFront edge locations. S3 Transfer Acceleration optimizes the TCP protocol and adds additional intelligence between the client and the S3 bucket, making S3 Transfer Acceleration a better choice if higher throughput is desired. If you have objects that are smaller than 1 GB or if the data set is less than 1 GB in size, you should consider using Amazon CloudFront’s PUT/POST commands for optimal performance. Reference: Amazon S3 Transfer Acceleration
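For reference, Transfer Acceleration is a per-bucket setting. A small hedged example (bucket name hypothetical):

```python
import boto3

s3 = boto3.client("s3")

# Enable Transfer Acceleration on the bucket; clients then upload
# through the s3-accelerate endpoint for this bucket.
s3.put_bucket_accelerate_configuration(
    Bucket="my-example-bucket",
    AccelerateConfiguration={"Status": "Enabled"},
)
```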
Q7: If an application is storing hourly log files from thousands of instances from a high traffic web site, which naming scheme would give optimal performance on S3?
A. Sequential
B. HH-DD-MM-YYYY-log_instanceID
C. YYYY-MM-DD-HH-log_instanceID
D. instanceID_log-HH-DD-MM-YYYY
E. instanceID_log-YYYY-MM-DD-HH
Answer: A. B. C. D. and E. Amazon S3 now provides increased performance to support at least 3,500 requests per second to add data and 5,500 requests per second to retrieve data, which can save significant processing time for no additional charge. Each S3 prefix can support these request rates, making it simple to increase performance significantly. This S3 request rate performance increase removes any previous guidance to randomize object prefixes to achieve faster performance. That means you can now use logical or sequential naming patterns in S3 object naming without any performance implications.
Q9: You created three S3 buckets – “mywebsite.com”, “downloads.mywebsite.com”, and “www.mywebsite.com”. You uploaded your files and enabled static website hosting. You specified both of the default documents under the “enable static website hosting” header. You also set the “Make Public” permission for the objects in each of the three buckets. You create the Route 53 aliases for the three buckets. You are going to have your end users test your websites by browsing to http://mywebsite.com/error.html, http://downloads.mywebsite.com/index.html, and http://www.mywebsite.com. What problems will your testers encounter?
A. http://mywebsite.com/error.html will not work because you did not set a value for the error.html file
B. There will be no problems, all three sites should work.
C. http://www.mywebsite.com will not work because the URL does not include a file name at the end of it.
D. http://downloads.mywebsite.com/index.html will not work because the “downloads” prefix is not a supported prefix for S3 websites using Route 53 aliases
Answer: B. It used to be that the only allowed domain prefix when creating Route 53 aliases for S3 static websites was the “www” prefix. However, this is no longer the case; you can now use other subdomains.
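For context, the “two default documents” the question refers to are the index and error documents. A minimal boto3 sketch of enabling static website hosting with both (bucket name hypothetical):

```python
import boto3

s3 = boto3.client("s3")

# Enable static website hosting with both default documents:
# index.html for the site root and error.html for error responses.
s3.put_bucket_website(
    Bucket="www.mywebsite.com",  # hypothetical
    WebsiteConfiguration={
        "IndexDocument": {"Suffix": "index.html"},
        "ErrorDocument": {"Key": "error.html"},
    },
)
```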