For storing avatars in s3, which approach has the lowest cost?
Problem
On my site I have the ability for users to upload avatars (which they upload to s3). They can only have one avatar at the moment, so I currently just have the filename set to their user id.
However, they may replace their avatars, which causes issues. Because the S3 bucket is served through CloudFront, I need to invalidate the cache for the filename when they upload their new avatar (because it replaces the old avatar file). But CloudFront invalidations are expensive.
It almost seems like Amazon would prefer I just not delete the old files, and instead just give the new avatar a new name and leave the old one in there, even though it's no longer being used by anyone. That way I can avoid the invalidation.
The other option is not to use CloudFront at all: just upload to S3 and make the bucket public. The problem with this is that CloudFront is much more generous with its GET request costs and allowances (2,000,000 per month with CloudFront versus only 20,000 per month with S3 buckets).
Basically, I have these 3 options, and I'm not sure which to pick, given the above:
1) Replace the files in s3 and then invalidate the file in CloudFront
- Problem: invalidations are expensive
2) Just dump more files in the s3 bucket and use CloudFront
- Problem: the old, unused avatars just sit there wasting s3 bucket space
3) Don't use CloudFront, and just replace the file in s3 and then serve it from the bucket
- Problem: CloudFront has cheaper GET requests and better user experience with latency
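As a rough way to compare options 1 and 2, the arithmetic can be sketched as below. The default prices (1,000 free invalidation paths per month, then $0.005 per path, and $0.023 per GB-month of storage) and the 50 KB average avatar size are assumptions for illustration; check the current AWS pricing pages for your region before relying on them.

```python
def monthly_invalidation_cost(paths: int,
                              free_paths: int = 1000,
                              price_per_path: float = 0.005) -> float:
    """Cost of invalidating `paths` CloudFront paths in one month.

    free_paths and price_per_path are assumed defaults, not quoted prices.
    """
    billable = max(0, paths - free_paths)
    return billable * price_per_path


def monthly_storage_cost(stale_files: int,
                         avg_size_kb: float = 50.0,
                         price_per_gb_month: float = 0.023) -> float:
    """Cost of keeping `stale_files` unused avatars in S3 for one month.

    avg_size_kb and price_per_gb_month are assumed defaults.
    """
    size_gb = stale_files * avg_size_kb / (1024 * 1024)
    return size_gb * price_per_gb_month


if __name__ == "__main__":
    changes = 10_000  # hypothetical number of avatar replacements per month
    print("invalidations:", monthly_invalidation_cost(changes))
    print("stale storage:", monthly_storage_cost(changes))
```

Under these assumed numbers, 10,000 replacements in a month cost about $45 in invalidations, while the 10,000 leftover files cost roughly one cent per month to keep. Storage accumulates if old files are never deleted, so the second figure grows over time.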
Edit: In this case, I'm asking about AWS with s3 and CloudFront. I'm looking to most effectively automate the process by which a user can upload an avatar to my backend service. Specifically, the overwriting of old avatars has me confused. The general process flow of "PreSigned URL -> Upload to s3 -> Propagate via CloudFront" seems an adequate one, but I'm not sure how to accommodate the overwriting of old, no longer used files in this plan, considering the choice seems to
Solution
For the best performance, naming many small files by user_id in an S3 bucket may not be the best idea unless the user_id is a random string. See how the S3 hashing algorithm works: https://aws.amazon.com/blogs/aws/amazon-s3-performance-tips-tricks-seattle-hiring-event/
So replacing the filename with a random-looking hashed value may not be that bad. Of course, you need to store the hash in the user profile or be able to calculate it from something. If you keep something like a profile id or a last-changed timestamp, you can take the sha1/md5 of (userid+lastchanged); whenever you update the timestamp, you have the hash of the current avatar.
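A minimal sketch of that naming scheme, assuming the user id is a string and the last-changed value is a Unix timestamp. The function name `avatar_key`, the `avatars/` prefix, and the `:` separator are hypothetical choices, not anything prescribed by S3 or CloudFront.

```python
import hashlib


def avatar_key(user_id: str, last_changed: int, ext: str = "png") -> str:
    """Derive a versioned S3 object key from user id and change timestamp.

    SHA-1 is used here only to produce a well-distributed name,
    not for any security purpose.
    """
    # The separator keeps ("12", 3) and ("1", 23) from hashing identically.
    digest = hashlib.sha1(f"{user_id}:{last_changed}".encode("utf-8")).hexdigest()
    return f"avatars/{digest}.{ext}"


if __name__ == "__main__":
    old = avatar_key("42", 1_700_000_000)
    new = avatar_key("42", 1_700_000_500)
    print(old)
    print(new)
```

Because the key changes whenever the timestamp changes, each new avatar gets a URL that CloudFront has never cached, so no invalidation is needed.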
You can simply delete the older one. CloudFront will remove it from its cache once it expires: https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Expiration.html
This is a more complex algorithm, but if cost is the concern, a little bit of extra work seems a reasonable approach. I'm not sure how you handle the uploads, but it may be a good opportunity to play a bit with Step Functions, which would make sure that the new avatar is uploaded correctly and that the old one is then removed as well.
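The upload-then-delete ordering can be sketched in plain Python before reaching for Step Functions. This is only an illustration under assumptions: `replace_avatar` is a hypothetical helper, `s3` is expected to behave like a boto3 S3 client (only its `put_object` and `delete_object` calls are used), and the long `Cache-Control` value assumes every upload gets a fresh key.

```python
from typing import Any, Optional


def replace_avatar(s3: Any, bucket: str, new_key: str,
                   old_key: Optional[str], body: bytes,
                   content_type: str = "image/png") -> None:
    """Upload the new avatar, then delete the old one.

    The upload runs first: if it raises, the old object is left in place
    and the user keeps a working avatar.
    """
    s3.put_object(
        Bucket=bucket,
        Key=new_key,
        Body=body,
        ContentType=content_type,
        # A long cache lifetime is safe only because the key is never reused.
        CacheControl="public, max-age=31536000, immutable",
    )
    # Skip the delete for first-time uploads or an unchanged key.
    if old_key and old_key != new_key:
        s3.delete_object(Bucket=bucket, Key=old_key)
```

With boto3 installed, `s3` would typically be created with `boto3.client("s3")`. After the call succeeds, the application would store `new_key` (or the timestamp it was derived from) in the user profile.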
It's kind of a programming question, but I get your point: it is very specific to the AWS environment and requires particular knowledge of how the infrastructure works.
Context
StackExchange DevOps Q#9826, answer score: 3