When working with Amazon S3 (Simple Storage Service), you’re probably using the S3 web console to download, copy, or upload file to S3 buckets. Using the console is perfectly fine, that’s what it was designed for, to begin with.
Not a reader? Watch this related video tutorial!Especially for admins who are used to more mouse-click than keyboard commands, the web console is probably the easiest. However, admins will eventually encounter the need to perform bulk file operations with Amazon S3, like an unattended file upload. The GUI is not the best tool for that.
For such automation requirements with Amazon Web Services, including Amazon S3, the AWS CLI tool provides admins with command-line options for managing Amazon S3 buckets and objects.
In this article, you will learn how to use the AWS CLI command-line tool to upload, copy, download, and synchronize files with Amazon S3. You will also learn the basics of providing access to your S3 bucket and configure that access profile to work with the AWS CLI tool.
Prerequisites
Since this a how-to article, there will be examples and demonstrations in the succeeding sections. For you to follow along successfully, you will need to meet several requirements.
- An AWS account. If you don’t have an existing AWS subscription, you can sign up for an AWS Free Tier.
- An AWS S3 bucket. You can use an existing bucket if you’d prefer. Still, it is recommended to create an empty bucket instead. Please refer to Creating a bucket.
- A Windows 10 computer with at least Windows PowerShell 5.1. In this article, PowerShell 7.0.2 will be used.
- The AWS CLI version 2 tool must be installed on your computer.
- Local folders and files that you will upload or synchronize with Amazon S3
Preparing Your AWS S3 Access
Suppose that you already have the requirements in place. You’d think you can already go and start operating AWS CLI with your S3 bucket. I mean, wouldn’t it be nice if it were that simple?
For those of you who are just beginning to work with Amazon S3 or AWS in general, this section aims to help you set up access to S3 and configure an AWS CLI profile.
The full documentation for creating an IAM user in AWS can be found in this link below. Creating an IAM User in Your AWS Account
Creating an IAM User with S3 Access Permission
When accessing AWS using the CLI, you will need to create one or more IAM users with enough access to the resources you intend to work with. In this section, you will create an IAM user with access to Amazon S3.
To create an IAM user with access to Amazon S3, you first need to login to your AWS IAM console. Under the Access management group, click on Users. Next, click on Add user.
Type in the IAM user’s name you are creating inside the User name* box such as s3Admin. In the Access type* selection, put a check on Programmatic access. Then, click the Next: Permissions button.
Next, click on Attach existing policies directly. Then, search for the AmazonS3FullAccess policy name and put a check on it. When done, click on Next: Tags.
Creating tags is optional in the Add tags page, and you can just skip this and click on the Next: Review button.
In the Review page, you are presented with a summary of the new account being created. Click Create user.
Finally, once the user is created, you must copy the Access key ID and the Secret access key values and save them for later user. Note that this is the only time that you can see these values.
Setting Up an AWS Profile On Your Computer
Now that you’ve created the IAM user with the appropriate access to Amazon S3, the next step is to set up the AWS CLI profile on your computer.
This section assumes that you already installed the AWS CLI version 2 tool as required. For the profile creation, you will need the following information:
- The Access key ID of the IAM user.
- The Secret access key associated with the IAM user.
- The Default region name is corresponding to the location of your AWS S3 bucket. You can check out the list of endpoints using this link. In this article, the AWS S3 bucket is located in the Asia Pacific (Sydney) region, and the corresponding endpoint is ap-southeast-2.
- The default output format. Use JSON for this.
To create the profile, open PowerShell, and type the command below and follow the prompts.
aws configure
Enter the Access key ID, Secret access key, Default region name, and default output name. Refer to the demonstration below.
Testing AWS CLI Access
After configuring the AWS CLI profile, you can confirm that the profile is working by running this command below in PowerShell.
aws s3 ls
The command above should list the Amazon S3 buckets that you have in your account. The demonstration below shows the command in action. The result shows that list of available S3 buckets indicates that the profile configuration was successful.
To learn about the AWS CLI commands specific to Amazon S3, you can visit the AWS CLI Command Reference S3 page.
Managing Files in S3
With AWS CLI, typical file management operations can be done like upload files to S3, download files from S3, delete objects in S3, and copy S3 objects to another S3 location. It’s all just a matter of knowing the right command, syntax, parameters, and options.
In the following sections, the environment used is consists of the following.
- Two S3 buckets, namely atasync1and atasync2. The screenshot below shows the existing S3 buckets in the Amazon S3 console.
- Local directory and files located under c:\sync.
Uploading Individual Files to S3
When you upload files to S3, you can upload one file at a time, or by uploading multiple files and folders recursively. Depending on your requirements, you may choose one over the other that you deem appropriate.
To upload a file to S3, you’ll need to provide two arguments (source and destination) to the aws s3 cp
command.
For example, to upload the file c:\sync\logs\log1.xml to the root of the atasync1 bucket, you can use the command below.
aws s3 cp c:\sync\logs\log1.xml s3://atasync1/
Note: S3 bucket names are always prefixed with S3:// when used with AWS CLI
Run the above command in PowerShell, but change the source and destination that fits your environment first. The output should look similar to the demonstration below.
The demo above shows that the file named c:\sync\logs\log1.xml was uploaded without errors to the S3 destination s3://atasync1/.
Use the command below to list the objects at the root of the S3 bucket.
aws s3 ls s3://atasync1/
Running the command above in PowerShell would result in a similar output, as shown in the demo below. As you can see in the output below, the file log1.xml is present in the root of the S3 location.
Uploading Multiple Files and Folders to S3 Recursively
The previous section showed you how to copy a single file to an S3 location. What if you need to upload multiple files from a folder and sub-folders? Surely you wouldn’t want to run the same command multiple times for different filenames, right?
The aws s3 cp
command has an option to process files and folders recursively, and this is the --recursive
option.
As an example, the directory c:\sync contains 166 objects (files and sub-folders).
Using the --recursive
option, all the contents of the c:\sync folder will be uploaded to S3 while also retaining the folder structure. To test, use the example code below, but make sure to change the source and destination appropriate to your environment.
You’ll notice from the code below, the source is c:\sync, and the destination is s3://atasync1/sync. The /sync key that follows the S3 bucket name indicates to AWS CLI to upload the files in the /sync folder in S3. If the /sync folder does not exist in S3, it will be automatically created.
aws s3 cp c:\sync s3://atasync1/sync --recursive
The code above will result in the output, as shown in the demonstration below.
Uploading Multiple Files and Folders to S3 Selectively
In some cases, uploading ALL types of files is not the best option. Like, when you only need to upload files with specific file extensions (e.g., *.ps1). Another two options available to the cp
command is the --include
and --exclude
.
While using the command in the previous section includes all files in the recursive upload, the command below will include only the files that match *.ps1 file extension and exclude every other file from the upload.
aws s3 cp c:\sync s3://atasync1/sync --recursive --exclude * --include *.ps1
The demonstration below shows how the code above works when executed.
Another example is if you want to include multiple different file extensions, you will need to specify the --include
option multiple times.
The example command below will include only the *.csv and *.png files to the copy command.
aws s3 cp c:\sync s3://atasync1/sync --recursive --exclude * --include *.csv --include *.png
Running the code above in PowerShell would present you with a similar result, as shown below.
Downloading Objects from S3
Based on the examples you’ve learned in this section, you can also perform the copy operations in reverse. Meaning, you can download objects from the S3 bucket location to the local machine.
Copying from S3 to local would require you to switch the positions of the source and the destination. The source being the S3 location, and the destination is the local path, like the one shown below.
aws s3 cp s3://atasync1/sync c:\sync
Note that the same options used when uploading files to S3 are also applicable when downloading objects from S3 to local. For example, downloading all objects using the command below with the --recursive
option.
aws s3 cp s3://atasync1/sync c:\sync --recursive
Copying Objects Between S3 Locations
Apart from uploading and downloading files and folders, using AWS CLI, you can also copy or move files between two S3 bucket locations.
You’ll notice the command below using one S3 location as the source, and another S3 location as the destination.
aws s3 cp s3://atasync1/Log1.xml s3://atasync2/
The demonstration below shows you the source file being copied to another S3 location using the command above.
Synchronizing Files and Folders with S3
You’ve learned how to upload, download, and copy files in S3 using the AWS CLI commands so far. In this section, you’ll learn about one more file operation command available in AWS CLI for S3, which is the sync
command. The sync
command only processes the updated, new, and deleted files.
There are some cases where you need to keep the contents of an S3 bucket updated and synchronized with a local directory on a server. For example, you may have a requirement to keep transaction logs on a server synchronized to S3 at an interval.
Using the command below, *.XML log files located under the c:\sync folder on the local server will be synced to the S3 location at s3://atasync1.
aws s3 sync C:\sync\ s3://atasync1/ --exclude * --include *.xml
The demonstration below shows that after running the command above in PowerShell, all *.XML files were uploaded to the S3 destination s3://atasync1/.
Synchronizing New and Updated Files with S3
In this next example, it is assumed that the contents of the log file Log1.xml were modified. The sync
command should pick up that modification and upload the changes done on the local file to S3, as shown in the demo below.
The command to use is still the same as the previous example.
As you can see from the output above, since only the file Log1.xml was changed locally, it was also the only file synchronized to S3.
Synchronizing Deletions with S3
By default, the sync
command does not process deletions. Any file deleted from the source location is not removed at the destination. Well, not unless you use the --delete
option.
In this next example, the file named Log5.xml has been deleted from the source. The command to synchronize the files will be appended with the --delete
option, as shown in the code below.
aws s3 sync C:\sync\ s3://atasync1/ --exclude * --include *.xml --delete
When you run the command above in PowerShell, the deleted file named Log5.xml should also be deleted at the destination S3 location. The sample result is shown below.
Summary
Amazon S3 is an excellent resource for storing files in the cloud. With the use of the AWS CLI tool, the way you utilize Amazon S3 is further expanded and opens the opportunity to automate your processes.
In this article, you’ve learned how to use the AWS CLI tool to upload, download, and synchronize files and folders between local locations and S3 buckets. You’ve also learned that S3 buckets’ contents can also be copied or moved to other S3 locations, too.
There can be many more use-case scenarios for using the AWS CLI tool to automate file management with Amazon S3. You can even try to combine it with PowerShell scripting and build your own tools or modules that are reusable. It is up to you to find those opportunities and show off your skills.