Background
When you access a Google service programmatically, you may need to use a Google Service Account (GSA). Historically, it was relatively easy to authenticate with Google services. All you needed was a simple API key and you were up and running. But Google has since implemented more robust security for services needing a GSA such as the Indexing API.
To gain access to a Google service using your GSA, you need to use the OAuth 2.0 protocol with either Workload Identity Federation or service account keys. Which path you go down will largely be determined by which cloud provider you use. If you're using a cloud provider like Google, AWS, or Azure, then Google recommends Workload Identity Federation because it offers better security than service account keys as well as more granular permissions between projects.
I don't use a provider where Workload Identity Federation is an option so I'm going to use service account keys.
This article will look at how to obtain a service account key, how to store and retrieve its values as environment variables in development and production, and how to authenticate a request with a Google service.
Google Service Account Key
For completeness, I'll touch briefly on obtaining a GSA but mostly I'll just refer to the docs since it's a pretty simple task.
First you need to create a service account if you don't already have one set up for your project. A service account is created from the "IAM & Admin" section of the Google Cloud console.
On the sidebar you'll see "Service Accounts." Select this option followed by "Create Service Account" and follow the prompts. For the name itself, I generally use a "slugified" version of the project name (spaces replaced with dashes, special characters removed) with the suffix "gsa." For example, <my-project-name>-gsa.
From your service account, navigate to the "Keys" tab (IAM & Admin > Service Accounts > Keys) and select "Add Key" followed by "Create New Key." From the dialogue box, choose the key format you desire. For my purposes, I prefer to work with JSON. Select "Create" and your key file will download.
If your project configuration isn't stateless, i.e. not Kubernetes, then you can simply store this key file somewhere secure on your machine. I use Kubernetes for all my projects now so I'll instead use environment variables.
Environment Variables
The service account key provides these key-value pairs:
{
  "type": "service_account",
  "project_id": "<sensitive-data>",
  "private_key_id": "<sensitive-data>",
  "private_key": "<sensitive-data>",
  "client_email": "<sensitive-data>",
  "client_id": "<sensitive-data>",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/<sensitive-data>.iam.gserviceaccount.com",
  "universe_domain": "googleapis.com"
}
You'll notice not everything in the JSON key is sensitive data. I'm only going to go through the effort of managing environment variables for values where there's a security concern.
Why not just store the entire JSON as a single environment variable, e.g. something like GOOGLE_CREDS? After all, these values are used together as a group and are unlikely to be accessed individually. You could theoretically serialize/deserialize the complete object as an environment variable without fear of truncation. Environment variables are generally allowed to be pretty long. See this discussion on Stack Overflow on environment variable length.
That said, storing complex objects as environment variables isn't advised. Environment variables are meant to be simple key/value pairs of strings storing primitive values. Each should be semantically atomic.
Let's move on to storing the key. In development, I'll use a .env file so the data is available for testing. That will look something like:
GOOGLE_PROJECT_ID=my-project
GOOGLE_PRIVATE_KEY_ID=private-key-id
GOOGLE_PRIVATE_KEY=a-super-secret-long-private-key-string
GOOGLE_CLIENT_EMAIL=<my-service-account-email>
GOOGLE_CLIENT_ID=my-client-id
GOOGLE_CLIENT_X509_CERT_URL=https://www.googleapis.com/robot/v1/metadata/x509/<sensitive-data>.iam.gserviceaccount.com
Notice again, there's no need to store anything that's not sensitive as an environment variable. Save yourself the effort. Instead, we'll hardcode values like auth_uri and token_uri when and where we need them.
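Copying the values out of the downloaded key by hand is easy to get wrong, mostly because the private key spans multiple lines. Here's a minimal sketch of a converter, assuming the variable names from the listing above. The names `ENV_NAMES` and `key_to_env_lines` are hypothetical helpers of my own, not part of any Google library. Real newlines in the private key are written as a literal `\n` so that each variable stays on a single line.

```python
# Maps fields of the downloaded JSON key to the variable names used in this article.
ENV_NAMES = {
    "project_id": "GOOGLE_PROJECT_ID",
    "private_key_id": "GOOGLE_PRIVATE_KEY_ID",
    "private_key": "GOOGLE_PRIVATE_KEY",
    "client_email": "GOOGLE_CLIENT_EMAIL",
    "client_id": "GOOGLE_CLIENT_ID",
    "client_x509_cert_url": "GOOGLE_CLIENT_X509_CERT_URL",
}


def key_to_env_lines(key: dict) -> list[str]:
    """Return NAME=value lines for the sensitive fields of a service account key."""
    lines = []
    for field, env_name in ENV_NAMES.items():
        # Escape real newlines so the multi-line private key fits on one line.
        value = str(key[field]).replace("\n", "\\n")
        lines.append(f"{env_name}={value}")
    return lines
```

In practice you'd pass in the result of `json.load()` on the downloaded key file and write the returned lines to your .env file. Non-sensitive fields such as auth_uri are deliberately left out.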
In production, it's not good practice to use a .env file. File-based environment variable storage is problematic from a security standpoint. Instead, I'll set each variable individually in the process environment. On macOS/Linux this is export MY_VAR=my-value. From the command line in Windows, you can use set MY_VAR=my-value.
In practice, I'll use some sort of secrets vault like GitHub Secrets and inject the environment variables on deployment as part of a continuous integration and continuous delivery (CI/CD) pipeline. Check out my article Securely Update Kubernetes Secrets with Manual GitHub Workflows on how this can be accomplished for a Docker / Kubernetes deployment.
Authenticate With Google
With the environment variables containing our GSA credentials set in the process environment, we're finally ready to authenticate our request to a Google service. To do this, we need to create a Credentials instance from the google.oauth2.service_account module. First, let's install the necessary package google-api-python-client, which pulls in google-auth (the package that provides the google.oauth2 module) as a dependency:
pip install --upgrade google-api-python-client
With the requirement installed, we're now ready to grab the GSA credentials from the environment variables and build the Credentials instance.
import os

from google.oauth2.service_account import Credentials

# scopes for the indexing api for purposes of this example -> swap out for the scopes relevant to your task
scopes = ["https://www.googleapis.com/auth/indexing"]

google_key = {
    "type": "service_account",
    "project_id": os.environ.get("GOOGLE_PROJECT_ID"),
    "private_key_id": os.environ.get("GOOGLE_PRIVATE_KEY_ID"),
    # index directly so a missing key raises a KeyError naming the variable;
    # the replace restores real newlines that were escaped as "\n" for storage
    "private_key": os.environ["GOOGLE_PRIVATE_KEY"].replace("\\n", "\n"),
    "client_email": os.environ.get("GOOGLE_CLIENT_EMAIL"),
    "client_id": os.environ.get("GOOGLE_CLIENT_ID"),
    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
    "token_uri": "https://oauth2.googleapis.com/token",
    "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
    "client_x509_cert_url": os.environ.get("GOOGLE_CLIENT_X509_CERT_URL"),
    "universe_domain": "googleapis.com",
}

credentials = Credentials.from_service_account_info(google_key, scopes=scopes)

# if the key was stored as a file, this is how we'd approach this instead
# credentials = Credentials.from_service_account_file("<location-of-key-file>", scopes=scopes)
And with that, we're now ready to pass the Credentials instance to whatever service we're looking to connect to.
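One thing to watch: os.environ.get returns None for a variable that isn't set, so a forgotten secret can surface late as a confusing error. Here's a sketch of a fail-fast variant, assuming the variable names used throughout this article. `REQUIRED_VARS` and `build_service_account_info` are hypothetical names of my own. The function takes the environment as a mapping so it can be exercised without real credentials.

```python
import os
from typing import Mapping

# The variable names assumed throughout this article.
REQUIRED_VARS = (
    "GOOGLE_PROJECT_ID",
    "GOOGLE_PRIVATE_KEY_ID",
    "GOOGLE_PRIVATE_KEY",
    "GOOGLE_CLIENT_EMAIL",
    "GOOGLE_CLIENT_ID",
    "GOOGLE_CLIENT_X509_CERT_URL",
)


def build_service_account_info(env: Mapping[str, str] = os.environ) -> dict:
    """Assemble the service account info dict, failing fast on missing variables."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "type": "service_account",
        "project_id": env["GOOGLE_PROJECT_ID"],
        "private_key_id": env["GOOGLE_PRIVATE_KEY_ID"],
        # Restore the real newlines that were escaped as "\n" for storage.
        "private_key": env["GOOGLE_PRIVATE_KEY"].replace("\\n", "\n"),
        "client_email": env["GOOGLE_CLIENT_EMAIL"],
        "client_id": env["GOOGLE_CLIENT_ID"],
        "auth_uri": "https://accounts.google.com/o/oauth2/auth",
        "token_uri": "https://oauth2.googleapis.com/token",
        "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
        "client_x509_cert_url": env["GOOGLE_CLIENT_X509_CERT_URL"],
        "universe_domain": "googleapis.com",
    }
```

The returned dict can be handed to Credentials.from_service_account_info along with your scopes, exactly as in the example above.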
Example Using Indexing API
With the credentials created in the previous example, we'll connect to the Google Indexing API used for submitting requests to Google Search asking for its bots to crawl the URLs submitted for indexing. This is a simple two-step process. First you establish the service connection, then you make your request(s) to the service.
from googleapiclient import discovery
from service_account_credentials import credentials # previous example
# this would be in a try/except block in practice
service = discovery.build("indexing", "v3", credentials=credentials)
payload = {"url": "https://ianwaldron.com", "type": "URL_UPDATED"}
response = service.urlNotifications().publish(body=payload).execute()
print(response)
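The Indexing API accepts two notification types, URL_UPDATED and URL_DELETED. Below is a small sketch, assuming you'd like to catch a mistyped type or a relative URL before making the network call. `ALLOWED_TYPES`, `build_notification`, and `notify` are hypothetical helper names of my own, and `service` is the object returned by discovery.build in the example above.

```python
ALLOWED_TYPES = ("URL_UPDATED", "URL_DELETED")


def build_notification(url: str, notification_type: str = "URL_UPDATED") -> dict:
    """Return a request body for urlNotifications().publish()."""
    if notification_type not in ALLOWED_TYPES:
        raise ValueError(
            f"type must be one of {ALLOWED_TYPES}, got {notification_type!r}"
        )
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"expected an absolute URL, got {url!r}")
    return {"url": url, "type": notification_type}


def notify(service, url: str, notification_type: str = "URL_UPDATED") -> dict:
    """Publish one notification using an already-built Indexing API service."""
    payload = build_notification(url, notification_type)
    return service.urlNotifications().publish(body=payload).execute()
```

In practice the notify call would still sit inside a try/except for googleapiclient.errors.HttpError, since the validation here only covers mistakes that can be caught locally.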
A note regarding the Indexing API. To use this service, you have to prove that you're the owner of the URL/domain. This is done with Google's Search Console which I believe used to be called Webmaster Tools. You prove that you're the owner by adding a "property" using "Add Property" and following the steps from the property dropdown in the upper left of the portal.
Note, there's a potential to get into trouble here because simply verifying ownership of the property is not enough. If you stop here, you'll receive a "403" permission denied status code when calling the service.
googleapiclient.errors.HttpError: <HttpError 403 when requesting
https://indexing.googleapis.com/v3/urlNotifications:publish?alt=json returned
"Permission denied. Failed to verify the URL ownership.". Details: "Permission denied.
Failed to verify the URL ownership.">
To correct this, you need to add the service account as a user to the property with "Owner" privileges. This can be accomplished by navigating to "Settings" on the sidebar of the Search Console portal and then selecting "Users and permissions." From the dialogue box that's produced, enter the email address of the Google Service Account. This is the email that Google generates for the service account, not the email associated with your own Google account. It's also the same email from the service account key that we stored as the environment variable GOOGLE_CLIENT_EMAIL.
Last, and this is important, select "Owner" and not the default of "Full" for permissions. Select "Add" to commit the user to the property and you'll now be able to use the Indexing API service.
Summary
Here we explored how to authenticate a Google Service Account using OAuth 2.0 with service account keys. After creating a service account and downloading its key, we stored the sensitive values as environment variables. Then we retrieved those values when we needed to authenticate with a Google service. Last, we wrapped up with an example using the Google Indexing API to demonstrate how authenticating with Google services is done in practice.