Introduction to Google Cloud Storage Transfer Service


The Google Cloud Storage Transfer Service is a fully managed service for automating and scheduling data transfers between storage locations, including from on-premises data centers to Google Cloud Storage. In this guide, we'll cover the basics of the service and walk through a sample Python snippet that creates a transfer job.


Key Concepts

Before we dive into the code, let's understand some key concepts related to the Google Cloud Storage Transfer Service:

  • Transfer Job: A transfer job is a configuration that defines what data to transfer and where to transfer it. It includes source and destination specifications, scheduling, and other parameters.
  • Sources and Destinations: You can transfer data from on-premises file systems, from other cloud providers such as Amazon S3 and Azure Blob Storage, and from other Google Cloud Storage buckets. The destination is typically a Google Cloud Storage bucket.
  • Scheduling: You can schedule transfer jobs to run periodically, which is useful for automating data transfers.
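
To make the scheduling concept concrete, here is a minimal sketch of a helper that builds the schedule portion of a transfer job from Python date objects. The function name build_daily_schedule and its parameters are hypothetical and not part of any Google library; the dictionary keys (scheduleStartDate, scheduleEndDate, startTimeOfDay) follow the Storage Transfer Service REST API, where times are interpreted as UTC.

```python
import datetime


def build_daily_schedule(start, end=None, hour=0, minute=0):
    """Return a `schedule` dict for a transfer job (hypothetical helper).

    `start` and `end` are datetime.date objects; the time of day is UTC.
    Without an end date the job keeps recurring; when the end date
    equals the start date the job is expected to run only once.
    """
    def as_date(d):
        # The API represents dates as separate year/month/day fields.
        return {"year": d.year, "month": d.month, "day": d.day}

    schedule = {
        "scheduleStartDate": as_date(start),
        "startTimeOfDay": {"hours": hour, "minutes": minute, "seconds": 0},
    }
    if end is not None:
        if end < start:
            raise ValueError("end date must not precede start date")
        schedule["scheduleEndDate"] = as_date(end)
    return schedule
```

For example, build_daily_schedule(datetime.date(2023, 10, 1), hour=12) produces a schedule that starts on October 1, 2023 at 12:00 UTC and has no end date.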

Sample Code: Initiating a Transfer Job

Here's a sample Python code snippet that creates a Storage Transfer Service job. It copies data from one Google Cloud Storage bucket to another; other source types, such as on-premises file systems, use a different source block in the transfer specification:


from google.oauth2 import service_account
from googleapiclient.discovery import build

# Path to the service account credentials JSON file
credentials_file = 'your-service-account-credentials.json'

# Project that owns the transfer job
project_id = 'your-project-id'

# Initialize the Storage Transfer Service API client
credentials = service_account.Credentials.from_service_account_file(
    credentials_file,
    scopes=['https://www.googleapis.com/auth/cloud-platform'])
storage_transfer = build('storagetransfer', 'v1', credentials=credentials)

# Define the transfer specification (bucket-to-bucket copy)
transfer_spec = {
    "gcsDataSource": {
        "bucketName": "source-bucket-name"
    },
    "gcsDataSink": {
        "bucketName": "destination-bucket-name"
    },
    "transferOptions": {
        "overwriteObjectsAlreadyExistingInSink": True
    }
}

# Define the schedule (times are in UTC)
schedule = {
    "scheduleStartDate": {
        "year": 2023,
        "month": 10,
        "day": 1
    },
    "startTimeOfDay": {
        "hours": 12,
        "minutes": 0,
        "seconds": 0
    }
}

# Assemble the transfer job
transfer_job = {
    "description": "Transfer job description",
    "status": "ENABLED",
    "projectId": project_id,
    "transferSpec": transfer_spec,
    "schedule": schedule,
    # The Pub/Sub topic must already exist; payloadFormat is required
    "notificationConfig": {
        "pubsubTopic": "projects/{}/topics/transfer-job-notifications".format(project_id),
        "payloadFormat": "JSON"
    }
}

# Create the transfer job. create() takes only the request body: the
# project ID travels inside it, and there is no location argument.
response = storage_transfer.transferJobs().create(body=transfer_job).execute()
print("Transfer job created with ID:", response['name'])

Make sure to replace your-service-account-credentials.json, your-project-id, source-bucket-name, destination-bucket-name, and other values with your specific credentials and configuration details.
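
Once a job exists, you will usually want to check how its runs went. The sketch below lists the transfer operations that belong to a job and returns their statuses. The helper names operations_filter and summarize_operations are hypothetical; the sketch assumes client is the object returned by build('storagetransfer', 'v1', ...) and that job_name is the name value returned when the job was created (for example, a string beginning with transferJobs/).

```python
import json


def operations_filter(project_id, job_names):
    """Build the JSON filter string that transferOperations.list expects."""
    return json.dumps({"projectId": project_id, "jobNames": list(job_names)})


def summarize_operations(client, project_id, job_name):
    """Return (operation name, status) pairs for one transfer job.

    Hypothetical helper: `client` is assumed to be a Storage Transfer
    Service client built with googleapiclient.discovery.build.
    Only the first page of results is read in this sketch.
    """
    response = client.transferOperations().list(
        name="transferOperations",
        filter=operations_filter(project_id, [job_name]),
    ).execute()
    # Each operation carries its run details, including status, in `metadata`.
    return [
        (op["name"], op.get("metadata", {}).get("status"))
        for op in response.get("operations", [])
    ]
```

Calling summarize_operations(storage_transfer, project_id, response['name']) after the job has run at least once should return one pair per run, with statuses such as IN_PROGRESS, SUCCESS, or FAILED.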


Conclusion

The Google Cloud Storage Transfer Service simplifies automating data transfers between storage locations. By configuring transfer jobs with a source, a destination, and a schedule, you can move data into and between Google Cloud Storage buckets without writing your own copy logic, which makes it a useful tool for data management and migration tasks.