IAN WALDRON

How To Authenticate Your Google Service Account In Python Using Environment Variables

Securely store your Google service account credentials with environment variables and use Python to authenticate with Google services.
May 30, 2024

Background

When you access a Google service programmatically, you may need to use a Google Service Account (GSA). Historically, it was relatively easy to authenticate with Google services. All you needed was a simple API key and you were up and running. But Google has since implemented more robust security for services needing a GSA such as the Indexing API.

To gain access to a Google service using your GSA, you need to use the OAuth 2.0 protocol with either Workload Identity Federation or service account keys. Which path you go down will largely be determined by where your workloads run. If you're using a cloud provider like Google, AWS, or Azure, then Google recommends Workload Identity Federation because it offers better security than service account keys as well as more granular permissions between projects.

I don't use a provider where Workload Identity Federation is an option so I'm instead going to use service account keys.

This article will look at how to obtain a service account key, store and retrieve environment variables in development and production, and how to authenticate a request with a Google service.

Google Service Account Key

For completeness, I'll touch briefly on obtaining a GSA but mostly I'll just refer to the docs since it's a pretty simple task.

First you need to create a service account if you don't already have one set up for your project. A service account is created from the IAM & Admin section of the Google Cloud console. Once upon a time, I believe this was referred to as the Developer Console if I'm not mistaken.

On the sidebar you'll see Service Accounts. Select this option followed by Create Service Account and follow the prompts. For the name itself, I generally use a sort of 'slugified' (spaces to dashes without special characters) version of the project name with a postfix 'gsa.' For example, <my-project-name>-gsa.

From your service account, navigate to the Keys tab (IAM & Admin > Service Accounts > Keys) and select Add Key followed by Create New Key. From the dialogue box, choose the key format you desire. For my purposes, I prefer to work with JSON. Select Create and your key file will download.

If your project configuration isn't stateless (i.e. you're not running on something like Kubernetes), then you can simply store this key file somewhere secure on your machine. I use Kubernetes for all my projects now, so I'll use environment variables instead.

Environment Variables

The service account key provides these key-value pairs:

{
  "type": "<sensitive-data>",
  "project_id": "<sensitive-data>",
  "private_key_id": "<sensitive-data>",
  "private_key": "<sensitive-data>",
  "client_email": "<sensitive-data>",
  "client_id": "<sensitive-data>",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/<sensitive-data>.iam.gserviceaccount.com",
  "universe_domain": "googleapis.com"
}


You'll notice not everything in the JSON key is sensitive data. I'm only going to go through the effort of managing environment variables for values where there's a security concern.

Why not just store the entire JSON as a single environment variable, e.g. something like GOOGLE_CREDS? I think that's a valid thing to do given that you're not likely to retrieve any single value alone. Rather, we're sending the entire group to Google to authenticate our request. However, there are system constraints that need to be managed when storing large environment variables.

My thought was to Base64-encode the entire JSON and decode it when needed. When implementing this, I ran into a problem where the value was being truncated and data was being lost. While I still think the approach is valid, debugging the issue wasn't worth the effort just to avoid creating a half-dozen environment variables instead of a single one. So instead, I'm storing each sensitive piece of information separately.

That said, I imagine it's best practice to store each value individually anyway. Environment variables aren't meant for storing complex data types like JSON. You'll be adding fragility to your project by asking your system to do something it wasn't meant to do. But I still find encoding/decoding the JSON key file as a string tempting nonetheless. If someone has a working solution to this, let me know.
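For reference, the encode/decode round trip itself can be sketched with nothing but the standard library. This is a minimal sketch, not a diagnosis of the truncation described above: the function names and the placeholder key are made up for illustration, and the encoded string would be stored under a single variable such as GOOGLE_CREDS. Note that base64.b64encode produces one unbroken line, which is worth confirming if the value is ever produced by a different tool.

```python
import base64
import json


def encode_key(key: dict) -> str:
    """Serialize the key to compact JSON and Base64-encode it as one ASCII line."""
    raw = json.dumps(key, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_key(encoded: str) -> dict:
    """Reverse encode_key: Base64-decode the string, then parse the JSON."""
    return json.loads(base64.b64decode(encoded).decode("utf-8"))


# placeholder values only; a real key has the fields shown earlier
sample_key = {
    "type": "service_account",
    "private_key": "-----BEGIN PRIVATE KEY-----\nabc123\n-----END PRIVATE KEY-----\n",
}

encoded = encode_key(sample_key)   # the value to store in the environment variable
restored = decode_key(encoded)     # the dict to hand to the credentials builder
```

The decoded dictionary has the same shape as the original key file, so it could be passed along in place of a hand-assembled dictionary.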

Let's move on to storing the key. In development, I'll use a .env file so the data is available for testing. That will look something like:

GOOGLE_PROJECT_ID=my-project
GOOGLE_PRIVATE_KEY_ID=private-key-id
GOOGLE_PRIVATE_KEY=a-super-secret-long-private-key-string
GOOGLE_CLIENT_EMAIL=<my-service-account-email>
GOOGLE_CLIENT_ID=my-client-id
GOOGLE_CLIENT_X509_CERT_URL=https://www.googleapis.com/robot/v1/metadata/x509/<sensitive-data>.iam.gserviceaccount.com


Notice again, no need to store anything that's not sensitive as an environment variable. Save yourself the effort. Instead, we'll drop those values like auth_uri and token_uri where and when we need them.
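Copying the values over by hand works fine, but the mapping from key file to .env lines can also be sketched in a few lines of Python. This is only an illustration: the function name and the placeholder key are hypothetical, the variable names simply follow the GOOGLE_ prefix pattern used above, and the private key's newlines are escaped so the value stays on one line.

```python
# fields treated as sensitive in this article; the rest are hardcoded later
SENSITIVE_FIELDS = (
    "project_id",
    "private_key_id",
    "private_key",
    "client_email",
    "client_id",
    "client_x509_cert_url",
)


def to_env_lines(key: dict) -> list:
    """Return NAME=value lines for the sensitive fields of a parsed key."""
    lines = []
    for field in SENSITIVE_FIELDS:
        # escape real newlines so each variable fits on a single line
        value = key[field].replace("\n", "\\n")
        lines.append(f"GOOGLE_{field.upper()}={value}")
    return lines


# placeholder key; in practice this would come from json.load() on the downloaded file
sample_key = {
    "type": "service_account",
    "project_id": "my-project",
    "private_key_id": "private-key-id",
    "private_key": "-----BEGIN PRIVATE KEY-----\nabc123\n-----END PRIVATE KEY-----\n",
    "client_email": "my-gsa@my-project.iam.gserviceaccount.com",
    "client_id": "my-client-id",
    "client_x509_cert_url": "https://example.com/cert",
    "token_uri": "https://oauth2.googleapis.com/token",
}

print("\n".join(to_env_lines(sample_key)))
```

Because the newlines are written out as a literal backslash followed by n, they have to be turned back into real newlines when the private key is read again.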

In production, I'm not going to use a .env file. Instead, I'll store each variable in GitHub Secrets. I'm using a continuous integration and continuous delivery (CI/CD) pipeline: when I push to the production branch of my repo, GitHub Actions builds my Docker container image, pushes the image to my registry, deploys the image to my Kubernetes cluster, and updates the environment secrets.

First, the secrets themselves need to be stored in GitHub. Navigate to the Settings tab of your repository. Then from the sidebar, select Secrets and variables > Actions. From the Secrets tab of this view, select New repository secret and enter your sensitive data as name-value pairs, repeating the process until all your data is stored. The sensitive data stored here will be available for GitHub Actions to pull in as environment variables.

I'll create a file build.yml to store my GitHub Actions directives:

name: Deploy to production

on:
  workflow_call:  # use if called by another workflow
  workflow_dispatch:
  push:
    branches: [ "main" ]  # deployment branch

jobs:
  build:
    runs-on: ubuntu-latest
    env:
      # env variables necessary for the workflow (perhaps a cluster name that's used repeatedly, etc.)
    steps:
      ...
      # stuff that gets you logged into your container registry, installs necessary utilities, etc.
      ...
      - name: Deployment Secrets
        # creates a file for environment variables assuming your Dockerfile establishes a directory 'app'
        # the secret names below are assumed to match the variable names
        run: |
          cat << EOF >> app/.env.prod
          GOOGLE_PROJECT_ID=${{ secrets.GOOGLE_PROJECT_ID }}
          GOOGLE_PRIVATE_KEY_ID=${{ secrets.GOOGLE_PRIVATE_KEY_ID }}
          GOOGLE_PRIVATE_KEY=${{ secrets.GOOGLE_PRIVATE_KEY }}
          GOOGLE_CLIENT_EMAIL=${{ secrets.GOOGLE_CLIENT_EMAIL }}
          GOOGLE_CLIENT_ID=${{ secrets.GOOGLE_CLIENT_ID }}
          GOOGLE_CLIENT_X509_CERT_URL=${{ secrets.GOOGLE_CLIENT_X509_CERT_URL }}
          EOF

          # only try to delete the secret if it exists to avoid raising an exception
          if kubectl get secret <secret-name>; then
            # the secret exists, delete it
            kubectl delete secret <secret-name>
          fi

          kubectl create secret generic <secret-name> --from-env-file=app/.env.prod

          # make sure you clean up your secrets
          rm app/.env.prod


A couple things to note here. First for our build.yml GitHub Action to be called, it needs to be located at:

.
`-- .github/
      `-- workflows/
            `-- build.yml

Next, I'm creating the environment file used to populate our secrets at app/.env.prod (or appending to it, since '>>' appends whereas '>' would overwrite). The app directory is generally where I copy my source code into the Docker container. Make sure whatever location you use is consistent with how you structured your Dockerfile.

Next, I check with an if statement if the secret already exists and only delete if that's true so we don't raise an exception if the secret isn't available when we try to delete (like when we deploy the image for the first time).

And last, I remove the file used to populate the secrets. As the workflow is currently structured, the file only ever exists on the runner, but out of an abundance of caution I take the extra step so I don't accidentally mishandle it downstream. With this we now have environment variables available to authenticate with Google.
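Before relying on those variables, it can help to confirm they actually arrived in the container. A missing name comes back from os.environ.get as None, which tends to surface later as a less obvious error. The following is a minimal sketch of such a check; the function name is hypothetical and the list simply mirrors the variables defined above.

```python
import os

# the variables this article stores for the service account key
REQUIRED_VARS = (
    "GOOGLE_PROJECT_ID",
    "GOOGLE_PRIVATE_KEY_ID",
    "GOOGLE_PRIVATE_KEY",
    "GOOGLE_CLIENT_EMAIL",
    "GOOGLE_CLIENT_ID",
    "GOOGLE_CLIENT_X509_CERT_URL",
)


def missing_variables(environ=os.environ) -> list:
    """Return the required names that are unset or empty in the given mapping."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


missing = missing_variables()
if missing:
    # report only the names, never the values
    print("Missing environment variables: " + ", ".join(missing))
```

Passing the mapping in as an argument keeps the check easy to exercise with a plain dictionary instead of the real environment.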

Authenticate With Google

We're finally ready to authenticate our request to a Google service. To do this, we need to create a Credentials instance from the google.oauth2.service_account module. First, let's install the necessary package google-api-python-client, which brings in google-auth (where google.oauth2 lives) as a dependency:

pip install --upgrade google-api-python-client

With the requirement installed, we're now ready to grab our key from environment variables and build the Credentials instance.

import os

from google.oauth2.service_account import Credentials

# scopes for the indexing api for purposes of this example -> swap out for the scopes relevant to your task
scopes = ["https://www.googleapis.com/auth/indexing"]

google_key = {
    "type": "service_account",
    "project_id": os.environ.get("GOOGLE_PROJECT_ID"),
    "private_key_id": os.environ.get("GOOGLE_PRIVATE_KEY_ID"),
    "private_key": os.environ.get("GOOGLE_PRIVATE_KEY").replace("\\n", "\n"),
    "client_email": os.environ.get("GOOGLE_CLIENT_EMAIL"),
    "client_id": os.environ.get("GOOGLE_CLIENT_ID"),
    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
    "token_uri": "https://oauth2.googleapis.com/token",
    "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
    "client_x509_cert_url": os.environ.get("GOOGLE_CLIENT_X509_CERT_URL"),
    "universe_domain": "googleapis.com"
}

credentials = Credentials.from_service_account_info(google_key, scopes=scopes)

# if the key was stored as a file, this is how we'd approach this instead
# credentials = Credentials.from_service_account_file("<location-of-key-file>", scopes=scopes)


And with that, we're now ready to pass the credentials instance to whatever service we're looking to connect to. 
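One detail in the example above deserves a closer look: the .replace("\\n", "\n") call on the private key. A value stored on a single line typically carries its line breaks as a literal backslash followed by n, while the PEM-formatted key needs real newline characters. Here is a small sketch of what that call does, using a made-up placeholder in place of a real key and a hypothetical helper name:

```python
def restore_newlines(value: str) -> str:
    """Turn literal backslash-n sequences back into real newline characters."""
    return value.replace("\\n", "\n")


# how a key might look after being read from a one-line environment variable
stored = "-----BEGIN PRIVATE KEY-----\\nabc123\\n-----END PRIVATE KEY-----\\n"

restored = restore_newlines(stored)

# the restored value spans multiple lines, as the PEM format expects
for line in restored.splitlines():
    print(line)
```

If the variable already contains real newlines, the call leaves the value unchanged, so it's safe to apply either way.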

Example Using Indexing API

With the credentials created in the previous example, we'll connect to the Google Indexing API used for submitting requests to Google Search asking for its bots to crawl the URLs submitted for indexing. This is a simple two-step process. First you establish the service connection, then you make your request(s) to the service.

from googleapiclient import discovery

from service_account_credentials import credentials  # previous example


# this would be in a try/except block in practice
service = discovery.build("indexing", "v3", credentials=credentials)

payload = {"url": "https://ianwaldron.com", "type": "URL_UPDATED"}
response = service.urlNotifications().publish(body=payload).execute()
print(response)
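If there are several URLs to submit, the same publish call can be wrapped in a small loop. This is a sketch only: publish_urls is a hypothetical helper, and it assumes the service object built above is passed in. It uses nothing beyond the urlNotifications().publish(...).execute() chain already shown.

```python
def publish_urls(service, urls, notification_type="URL_UPDATED"):
    """Send one notification per URL and return the responses in order."""
    responses = []
    for url in urls:
        payload = {"url": url, "type": notification_type}
        # same call chain as the single-URL example
        response = service.urlNotifications().publish(body=payload).execute()
        responses.append(response)
    return responses


# usage, assuming `service` was built as in the example above:
# responses = publish_urls(service, ["https://ianwaldron.com"])
```

Taking the service as a parameter keeps the helper independent of how the credentials were created.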


A note regarding the Indexing API. To use this service, you have to prove that you're the owner of the URL/domain. This is done with Google's Search Console which I believe used to be called Webmaster Tools. You prove that you're the owner by adding a 'property' using Add Property and following the steps from the property dropdown in the upper left of the portal.

Where this will get you into trouble is that simply verifying your ownership of the property is not enough, and you'll receive a 403 when calling the service:

googleapiclient.errors.HttpError: <HttpError 403 when requesting 
https://indexing.googleapis.com/v3/urlNotifications:publish?alt=json returned 
"Permission denied. Failed to verify the URL ownership.". Details: "Permission denied. 
Failed to verify the URL ownership.">

To correct this, you need to add the service account as a user to the property with 'Owner' privileges. This can be accomplished by navigating to Settings on the sidebar of the Search Console portal and then selecting Users and permissions. From the dialogue box that's produced, enter the email address of the Google Service Account. This is the email that Google generates for the service account, not the email associated with your own Google account. It's also the same email from the service account key that we stored as the environment variable 'GOOGLE_CLIENT_EMAIL.'

Last, and this is important, select 'Owner' and not the default of 'Full' for permissions. Select 'Add' to commit the user to the property and you'll now be able to use the Indexing API service.

Summary

Here we explored how to authenticate Google Service Accounts using the OAuth 2.0 protocol with service account keys. After creating a service account and downloading the service account key, we stored the sensitive values as environment variables using GitHub Actions. Then we retrieved the key values when we needed to authenticate with a Google service. And last, we capped things off with an example using the Google Indexing API to demonstrate how authenticating with Google services is done in practice.

Code for this article is available at my GitHub.