If you need to copy files to an Amazon Web Services (AWS) S3 bucket, copy files from bucket to bucket, and automate the process, the AWS software development kit (SDK) for Python called Boto3 is your best friend. Combining Boto3 and S3 allows you to move files around with ease in AWS.
In this tutorial, you will learn how to get started using the Boto3 Python library with S3 via an example-driven approach.
Let’s get started!
Prerequisites
This post will be a step-by-step tutorial. If you’d like to follow along, ensure you have the following in place:
- An AWS account
- An IAM user with an access key ID and secret key set up on your local machine. You can find the steps to set up an IAM user's access key and secret key in the AWS documentation. This tutorial will use an IAM user called myboto3user.
Ensure the IAM user is set up for programmatic access and that you assign it to the existing policy of AmazonS3FullAccess.
- Python v3.6 or later installed on your local machine. This tutorial will be using Python v3.9.2 on a Windows 10 machine.
- The Python package manager, pip.
- A code editor. Even though you can use any text editor to work with Python files, this tutorial will be using Visual Studio (VS) Code.
Installing Boto3
Before you can begin managing S3 with Boto3, you must install it first. Let’s start off this tutorial by downloading and installing Boto3 on your local computer.
The easiest way to install Boto3 is with the pip Python package manager. To install Boto3 with pip:
1. Open a cmd, Bash, or PowerShell session on your computer.
2. Run the pip install command as shown below, passing the name of the Python module (boto3) to install.
pip install boto3
pip is a Python package manager that installs software not included in Python's standard library.
Boto3 should now be successfully installed!
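Before moving on, it's worth confirming that both Boto3 and your IAM credentials are working. The snippet below is a minimal sketch, assuming your access key and secret key are already configured on the machine; it calls the AWS STS get_caller_identity() API and prints the ARN of the caller.
# Importing boto3 library
import boto3
# Creating a client connection with AWS STS
sts = boto3.client('sts')
# Ask AWS which identity the configured credentials belong to
identity = sts.get_caller_identity()
print(identity['Arn'])
If the call succeeds and prints an ARN ending in your IAM user name (such as myboto3user), Boto3 and your credentials are good to go.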
Creating an AWS S3 Bucket with Boto3
Once you have Boto3 installed, it’s time to see what it can do! Let’s now dive into some examples of working with AWS S3 starting with creating a new S3 bucket.
Boto3 supports two levels of interaction with AWS: client and resource. The client level provides low-level service access, while the resource level provides higher-level, more abstracted access. This tutorial will use client access.
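To make the difference concrete, below is a minimal sketch showing both connection styles side by side; the rest of this tutorial builds on the client style.
# Importing boto3 library
import boto3
# Client level: methods map closely to the raw S3 API operations
s3_client = boto3.client('s3')
# Resource level: object-oriented access to buckets and objects
s3_resource = boto3.resource('s3')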
1. Open your favorite code editor.
2. Copy and paste the following Python script into your code editor and save the file as main.py. The tutorial will save the file as ~\main.py. The following code snippet creates an S3 bucket called first-us-east-1-bucket and prints out a message to the console once complete.
# Importing boto3 library to make functionality available
import boto3
# Creating a client connection with AWS S3
s3 = boto3.client('s3')
# Creating a bucket
s3.create_bucket(Bucket='first-us-east-1-bucket')
print("Bucket created succesfully")
3. Open your terminal and execute the main.py script using python. If successful, you should see a single message of Bucket created successfully.
python ~\main.py
4. Now, open your favorite web browser, navigate to the AWS Management Console and log in.
5. Click on the search bar at the top of the console, search for ‘S3’, and click on the S3 menu item.
On the S3 page, you should now see your newly-created bucket as shown below.
The bucket is in the AWS Region US East because the default region is set to us-east-1 in the AWS profile.
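If your AWS profile defaults to a region other than us-east-1, S3 requires you to name the region explicitly when creating a bucket. The sketch below is illustrative only: the us-west-2 region and the bucket name are example values, not part of this tutorial's setup. It passes the CreateBucketConfiguration parameter to create_bucket().
# Importing boto3 library
import boto3
# Creating a client connection with AWS S3 in an example region
s3 = boto3.client('s3', region_name='us-west-2')
# Outside us-east-1, S3 requires an explicit LocationConstraint
s3.create_bucket(
    Bucket='first-us-west-2-bucket',  # Example bucket name
    CreateBucketConfiguration={'LocationConstraint': 'us-west-2'}
)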
How to List S3 Buckets in AWS
Now that you have at least one S3 bucket in your account, confirm it exists, this time not with the AWS Management Console but with Boto3. Boto3 can also list all S3 buckets in your account.
With your code editor open:
1. Copy and paste the following Python code into your code editor and save it as list_s3_buckets.py. This script queries AWS with the list_buckets() method, stores the response in a variable, loops (for) through the list (response['Buckets']), and prints the name (bucket["Name"]) of each S3 bucket it finds.
# Importing boto3 library
import boto3
# Creating a client connection with AWS S3
s3 = boto3.client('s3')
# Storing the response from the list_buckets() call
response = s3.list_buckets()
# Output the bucket names
print('Existing buckets:')
for bucket in response['Buckets']: # For loop to list all the buckets
    print(bucket["Name"])
2. Execute the script and you should see each S3 bucket name displayed in your account.
python list_s3_buckets.py
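As a side note, the resource-level API offers an arguably more Pythonic way to do the same thing. The sketch below lists every bucket with the buckets.all() collection instead of the client's list_buckets() call.
# Importing boto3 library
import boto3
# Creating a resource connection with AWS S3
s3 = boto3.resource('s3')
# Iterate over every bucket in the account and print its name
for bucket in s3.buckets.all():
    print(bucket.name)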
How to Upload a File to an AWS S3 Bucket
Now that you have an S3 bucket to work with, let’s begin creating some objects in it. To start, upload a file to your S3 bucket.
1. Create a new file or pick an existing one on your local computer to upload. This tutorial will use a file called ATA.txt.
2. Assuming you still have your code editor open, create a new Python script and save it as upload_s3_file.py by copying/pasting the following code. The script below opens the ~/ATA.txt file for reading (rb) and uploads the file (upload_fileobj()) to the first-us-east-1-bucket.
# Importing boto3 and os libraries
import boto3
import os
# Creating a client connection with AWS S3
s3 = boto3.client('s3')
# Read the file stored on your local machine; expanduser() resolves the ~ shortcut
with open(os.path.expanduser('~/ATA.txt'), 'rb') as data:
    # Upload the file ATA.txt to the first-us-east-1-bucket
    s3.upload_fileobj(data, 'first-us-east-1-bucket', 'ATA.txt')
3. Execute the script, which should upload the file.
python upload_s3_file.py
4. Assuming you still have the S3 page open in a browser, click on the bucket created earlier, and you should see the file has been uploaded successfully!
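If you'd rather not manage the file handle yourself, Boto3 also provides the upload_file() method, which takes a file path instead of an open file object and transparently uses multipart uploads for large files. A minimal sketch of the same upload:
# Importing boto3 and os libraries
import boto3
import os
# Creating a client connection with AWS S3
s3 = boto3.client('s3')
# Upload ATA.txt by path; Boto3 opens and streams the file for you
s3.upload_file(os.path.expanduser('~/ATA.txt'), 'first-us-east-1-bucket', 'ATA.txt')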
How to Upload Entire Folders with Boto3
Previously, you uploaded a single file to the AWS S3 bucket, but what if you need to upload a folder? Nothing in the boto3 library itself allows you to upload an entire directory, but you can still make it happen with a Python script.
1. Ensure you have a folder on your local computer with some files in it. This tutorial will use a folder called ATA.
2. Assuming you still have your code editor open, create a new Python script and save it as upload_s3_folder.py by copying/pasting the following code. The script below uses the os module to walk the directory tree with os.walk, recursively adds all the files in the folder to a zip file called ATA.zip using the zipfile module, and uploads the zip file to the first-us-east-1-bucket.
# Importing boto3, os, and zipfile libraries
import boto3
import os
import zipfile
# Creating a client connection with AWS S3
s3 = boto3.client('s3')
def zipdir(path, ziph):
    # ziph is the zipfile handle
    for root, dirs, files in os.walk(path):
        for file in files:
            ziph.write(os.path.join(root, file))
zipf = zipfile.ZipFile('ATA.zip', 'w', zipfile.ZIP_DEFLATED)
zipdir('ATA', zipf) # Passing the ATA folder in the arguments
zipf.close()
# Upload the zip file ATA.zip to the first-us-east-1-bucket
with open('ATA.zip', 'rb') as data:
    s3.upload_fileobj(data, 'first-us-east-1-bucket', 'ATA.zip')
3. Execute the script, which should upload the zip file of the ATA folder containing all your files to the bucket.
python upload_s3_folder.py
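Zipping is only one approach. If you'd rather keep the files individually browsable in S3, a common alternative is to walk the folder and upload each file on its own, reusing the relative path as the object key. The sketch below assumes the same ATA folder and bucket; keys containing forward slashes show up as folders in the S3 console.
# Importing boto3 and os libraries
import boto3
import os
# Creating a client connection with AWS S3
s3 = boto3.client('s3')
# Walk the ATA folder and upload each file under its relative path
for root, dirs, files in os.walk('ATA'):
    for file in files:
        local_path = os.path.join(root, file)
        # Use forward slashes so the keys form "folders" in S3
        key = local_path.replace(os.sep, '/')
        s3.upload_file(local_path, 'first-us-east-1-bucket', key)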
How to Copy Files Between S3 Buckets with Boto3
Previously, you worked with S3 from your local machine. Let's now stay in the cloud and copy files from one S3 bucket to another.
1. Create a second S3 bucket to transfer files to, using the know-how from the earlier section. This tutorial will be using a bucket called first-us-east-1-bucket-2 as the destination bucket.
2. Create a new Python script and save it as copy_s3_to_s3.py. Copy and paste in the following code. The script assumes you still have the ATA.txt file in the S3 bucket uploaded earlier. It will find the ATA.txt file in the first-us-east-1-bucket and copy it to the first-us-east-1-bucket-2 S3 bucket.
The Resource() API provides a higher-level abstraction than the raw, low-level calls made by service clients. At times, you need to work with resources directly rather than through the service API.
# Importing boto3 library
import boto3
# Creating the connection with the resource
s3 = boto3.resource('s3')
# Declaring the source to be copied
copy_source = {
    'Bucket': 'first-us-east-1-bucket',
    'Key': 'ATA.txt'
}
bucket = s3.Bucket('first-us-east-1-bucket-2')
# Copying the files to another bucket
bucket.copy(copy_source, 'ATA.txt')
3. Execute the script, which should copy the file.
python copy_s3_to_s3.py
You should now see the file has been copied from one bucket to the other.
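If you'd prefer to verify from code instead of the console, a quick sketch using the client-level list_objects_v2() call confirms the object landed in the destination bucket:
# Importing boto3 library
import boto3
# Creating a client connection with AWS S3
s3 = boto3.client('s3')
# List the objects in the destination bucket to confirm the copy
response = s3.list_objects_v2(Bucket='first-us-east-1-bucket-2')
for obj in response.get('Contents', []):
    print(obj['Key'])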
Conclusion
In this tutorial, you learned how to install the Boto3 AWS SDK for Python and work with the AWS S3 service. Although many automation tools manage and work with various AWS services, if you need to interact with AWS APIs from Python, Boto3 is the answer.
Now that you have an AWS S3 bucket set up with Boto3, what do you plan to manage next?