Google Search Console

This page contains the setup guide and reference information for the Google Search Console source connector.

Prerequisites

  • Google Account
  • A verified property in Google Search Console (or a list of the site URLs to enter as the Website URL Property)
  • Google Search Console API enabled for your project (Airbyte Open Source only)

Setup guide

Step 1: Set up Google Search Console

To authenticate the Google Search Console connector, use one of the following methods: OAuth with your Google Account, or a JSON key file for a Google service account.

You can authenticate using your Google Account with OAuth if you are the owner of the Google Search Console property or have view permissions. Follow Google's instructions to ensure that your account has the necessary permissions (Owner or Full User) to view the Google Search Console property. This option is recommended for Airbyte Cloud users, as it significantly simplifies the setup process and allows you to authenticate the connection directly from the Airbyte UI.

To authenticate with OAuth in Airbyte Open Source, you will need to create an authentication app and obtain the following credentials and tokens:

  • Client ID
  • Client Secret
  • Refresh Token
  • Access Token

More information on the steps to create an OAuth app to access Google APIs and obtain these credentials can be found in Google's documentation.
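As an illustration of how the credentials listed above relate to each other, the sketch below builds (but does not send) the standard OAuth 2.0 request that exchanges a refresh token for a new access token at Google's token endpoint. The function name and the placeholder values are hypothetical, and this is not necessarily how the connector itself performs the exchange.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Google's OAuth 2.0 token endpoint.
TOKEN_URL = "https://oauth2.googleapis.com/token"


def build_refresh_request(client_id: str, client_secret: str, refresh_token: str) -> Request:
    """Build, without sending, a request that trades a refresh token for an access token."""
    form = urlencode(
        {
            "client_id": client_id,
            "client_secret": client_secret,
            "refresh_token": refresh_token,
            "grant_type": "refresh_token",
        }
    ).encode("utf-8")
    return Request(
        TOKEN_URL,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder values (hypothetical); substitute the credentials from your own OAuth app.
request = build_refresh_request("YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET", "YOUR_REFRESH_TOKEN")
```

Passing the request to urllib.request.urlopen would perform the exchange; the JSON response is expected to carry the new access token in an access_token field.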

You can authenticate the connection using a JSON key file associated with a Google service account. This option is recommended for Airbyte Open Source users. Follow the steps below to create a service account and generate the JSON key file:

  1. Open the Service Accounts page.
  2. Select an existing project, or create a new project.
  3. At the top of the page, click + Create service account.
  4. Enter a name and description for the service account, then click Create and Continue.
  5. Under Service account permissions, select the roles to grant to the service account, then click Continue. We recommend the Viewer role.
    • Optional: Under Grant users access to this service account, you may specify the users or groups that are allowed to use and manage the service account.
  6. Go to the API Console/Credentials and click on the email address of the service account you just created.
  7. In the Keys tab, click + Add key, then click Create new key.
  8. Select JSON as the Key type. This will generate and download the JSON key file that you'll use for authentication. Click Continue.

Caution: This file serves as the only copy of your JSON service key, and you will not be able to re-download it. Be sure to store it in a secure location.

Note: You can return to the API Console/Credentials at any time to manage your service account or generate additional JSON keys. For more details about service account credentials, see Google's IAM documentation.
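Before pasting the downloaded key into Airbyte, it can help to confirm that the file really is a service account key. The following is a minimal sketch of such a check; the helper name and the chosen list of required fields are assumptions made for illustration, not part of the connector.

```python
import json

# Fields assumed to be present in every service account JSON key.
REQUIRED_FIELDS = ("type", "project_id", "private_key_id", "private_key", "client_email")


def check_service_account_key(raw_key: str) -> dict:
    """Parse a JSON key string and raise ValueError if it does not look like a service account key."""
    key = json.loads(raw_key)
    if not isinstance(key, dict):
        raise ValueError("key must be a JSON object")
    missing = [field for field in REQUIRED_FIELDS if field not in key]
    if missing:
        raise ValueError("key is missing fields: " + ", ".join(missing))
    if key["type"] != "service_account":
        raise ValueError("unexpected key type: " + repr(key["type"]))
    return key
```

A typical use is to read the downloaded file and call check_service_account_key(path.read_text()) before copying its contents into the Service Account JSON Key field.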

Note on delegating domain-wide authority to the service account

Domain-wide delegation is a powerful feature that allows service accounts to access users' data across an organization's Google Workspace environment through 'impersonation'. This authority is necessary in certain use cases, such as when a service account needs broad access across multiple users and services within a domain.

Note: Only the super admin of your Google Workspace domain can enable domain-wide delegation of authority to a service account.

To enable delegated domain-wide authority, follow the steps listed in the Google documentation. Make sure to grant the following OAuth scope to the service account:

  • https://www.googleapis.com/auth/webmasters.readonly

For more information on this topic, please refer to this Google article.

Step 2: Set up the Google Search Console connector in Airbyte

For Airbyte Cloud:

  1. Log into your Airbyte Cloud account.
  2. Click Sources and then click + New source.
  3. On the Set up the source page, select Google Search Console from the Source type dropdown.
  4. Enter a name for the Google Search Console connector.
  5. For Website URL Property, enter the specific website property in Google Search Console with data you want to replicate.
  6. For Start Date, use the provided datepicker or enter a date in the format YYYY-MM-DD. The default is 2021-01-01. Any data created on or after this date will be replicated.
  7. To authenticate the connection, follow the authentication instructions for your edition below.

For Airbyte Open Source:

  1. Navigate to the Airbyte Open Source dashboard.
  2. Click Sources and then click + New source.
  3. On the Set up the source page, select Google Search Console from the Source type dropdown.
  4. Enter a name for the Google Search Console connector.

Authentication:

  • For Airbyte Cloud:
    • Select OAuth from the Authentication dropdown, then click Sign in with Google to authorize your account.
  • For Airbyte Open Source:
    • (Recommended) Select Service Account Key Authorization from the Authentication dropdown, then enter the Admin Email and Service Account JSON Key. For the key, copy and paste the JSON key you obtained during the service account setup. It should begin with {"type": "service_account", "project_id": YOUR_PROJECT_ID, "private_key_id": YOUR_PRIVATE_KEY_ID, ...}
    • Alternatively, select OAuth from the Authentication dropdown, then enter your Client ID, Client Secret, Access Token and Refresh Token.

Remaining steps (both editions):

  1. (Optional) For End Date, provide a date in the format YYYY-MM-DD. Any data created between the defined Start Date and End Date will be replicated. Leaving this field blank will replicate all data created on or after the Start Date to the present.
  2. (Optional) For Custom Reports, provide an array of JSON objects representing any custom reports you wish to query the API with. Refer to the Custom reports section below for more information on formulating these reports.
  3. (Optional) For Data Freshness, choose whether to include "fresh" data that has not been finalized by Google and may be subject to change. If you are using Incremental sync mode, we highly recommend leaving this option at its default value of final. Refer to the Data Freshness section below for more information on this parameter.
  4. Click Set up source and wait for the tests to complete.
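The same settings can be written down as a configuration object and sanity-checked before you enter them. The sketch below uses the top-level property names from the Config fields reference (site_urls, authorization, start_date, end_date, custom_reports_array, data_state). The keys inside authorization and inside each custom report entry, as well as the validate_config helper, are assumptions made for illustration and may not match the connector's actual specification.

```python
import re
from datetime import date, datetime

DAY_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def parse_day(value: str) -> date:
    """Parse a YYYY-MM-DD string and raise ValueError for any other format."""
    if not DAY_PATTERN.fullmatch(value):
        raise ValueError("expected YYYY-MM-DD, got " + repr(value))
    return datetime.strptime(value, "%Y-%m-%d").date()


def validate_config(config: dict) -> None:
    """Run a few basic checks that mirror the setup steps above."""
    if not config.get("site_urls"):
        raise ValueError("site_urls must list at least one property")
    start = parse_day(config.get("start_date", "2021-01-01"))  # documented default
    end_raw = config.get("end_date")
    if end_raw is not None and parse_day(end_raw) < start:
        raise ValueError("end_date must not be earlier than start_date")
    if config.get("data_state", "final") not in ("final", "all"):
        raise ValueError("data_state must be 'final' or 'all'")


example_config = {
    "site_urls": ["https://example.com/"],
    # Assumed shape of the authorization object (hypothetical key names).
    "authorization": {
        "auth_type": "Service",
        "email": "admin@example.com",
        "service_account_info": "<contents of the JSON key file>",
    },
    "start_date": "2021-01-01",
    "end_date": "2021-12-31",
    # Assumed shape of a custom report entry (hypothetical key names).
    "custom_reports_array": [{"name": "country_date", "dimensions": ["country", "date"]}],
    "data_state": "final",
}

validate_config(example_config)
```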

Supported sync modes

The Google Search Console source connector supports the following sync modes:

Note: The granularity of the cursor is 1 day, so Incremental Sync in Append mode may produce duplicate records.
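If duplicates from Append mode end up in your destination, one way to clean them up downstream is to keep only the most recently synced record for each primary key. The sketch below shows the idea on in-memory records; the helper name, the sample rows, and the choice of key fields are assumptions for illustration.

```python
def dedupe_latest(records, key_fields):
    """Keep the last record seen for each primary key, in first-seen key order."""
    latest = {}
    for record in records:
        key = tuple(record[field] for field in key_fields)
        latest[key] = record  # a later record replaces an earlier one with the same key
    return list(latest.values())


# Hypothetical rows: the day 2024-01-02 was read by two consecutive syncs.
records = [
    {"site_url": "https://example.com/", "search_type": "web", "date": "2024-01-01", "clicks": 5},
    {"site_url": "https://example.com/", "search_type": "web", "date": "2024-01-02", "clicks": 10},
    {"site_url": "https://example.com/", "search_type": "web", "date": "2024-01-02", "clicks": 12},
]

deduped = dedupe_latest(records, ["date", "site_url", "search_type"])
```

After this call, deduped holds one row per day, and the row for 2024-01-02 is the one from the later sync.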

Supported Streams

Entity-Relationship Diagram (ERD)

Connector-specific configurations

Custom reports

Custom reports allow you to query the API with a custom set of dimensions to group results by. Results are grouped in the order that you supply these dimensions. Construct each custom report as follows:

  1. Click Add under the Custom Reports section.
  2. Enter the Name of the report. This will be the name of the stream.
  3. Select one or more Dimensions from the available dropdown list.

The available Dimensions are:

  • country
  • date
  • device
  • page
  • query

For example, to query the API for a report that groups results by country, then by date, you could enter the following custom report:

  • Name: country_date
  • Dimensions: ["country", "date"]

Please note that, for technical reasons, date is the default dimension and will be included in your query whether you specify it or not. By specifying it, you can change the order in which the results are grouped. The primary key will consist of your custom dimensions and the default dimension, along with site_url and search_type.

The information you provide in the UI custom report builder is then transformed into a custom stream identified by its Name.

You can use the Google APIs Explorer to build and test the reports you want to use.
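The dimension and primary key rules described above can be sketched as two small functions. The function names are hypothetical, and two details are assumptions because this page does not state them: where the implicit date dimension is placed when you omit it (appended last here), and the order of the fields within the primary key.

```python
DEFAULT_DIMENSION = "date"
VALID_DIMENSIONS = {"country", "date", "device", "page", "query"}


def report_dimensions(dimensions):
    """Return the dimensions a report is grouped by, always including date."""
    unknown = [d for d in dimensions if d not in VALID_DIMENSIONS]
    if unknown:
        raise ValueError("unknown dimensions: " + ", ".join(unknown))
    result = list(dimensions)
    if DEFAULT_DIMENSION not in result:
        # Assumption: the implicit date dimension is appended after the custom ones.
        result.append(DEFAULT_DIMENSION)
    return result


def report_primary_key(dimensions):
    """Primary key: the report's dimensions plus site_url and search_type."""
    return report_dimensions(dimensions) + ["site_url", "search_type"]


# The country_date example from above.
country_date_key = report_primary_key(["country", "date"])
```

For the country_date report this yields country, date, site_url and search_type as the key fields.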

Data Freshness

The Data Freshness parameter deals with the "freshness", or finality of the data that is being queried.

  • final: The query will include only finalized, stable data. This is data that has been processed, verified, and is unlikely to change. When you select this option, you are querying for the definitive statistics and information that Google has analyzed and confirmed.
  • all: The query will return both finalized data and what Google terms "fresh" data. Fresh data includes more recent data that hasn't gone through the full processing and verification that finalized data has. This option can give you more up-to-the-minute insights, but it may be subject to change as Google continues to process and analyze it.

Caution: When using Incremental Sync mode, we recommend leaving this parameter at its default value of final, as the all option may cause discrepancies between the data in your destination table and the finalized data in Google Search Console.
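To try both settings yourself, for example in the Google APIs Explorer, the choice maps to the dataState field of a Search Analytics query request body. The sketch below only assembles such a body; the function name is hypothetical, and how the connector builds its own requests is not shown on this page.

```python
def build_query_body(start_date, end_date, dimensions, data_state="final"):
    """Assemble a Search Analytics query body with an explicit data freshness setting."""
    if data_state not in ("final", "all"):
        raise ValueError("data_state must be 'final' or 'all'")
    return {
        "startDate": start_date,
        "endDate": end_date,
        "dimensions": list(dimensions),
        "dataState": data_state,
    }


# Finalized data only (the default) versus finalized plus fresh data.
final_body = build_query_body("2024-01-01", "2024-01-31", ["date", "country"])
fresh_body = build_query_body("2024-01-01", "2024-01-31", ["date", "country"], data_state="all")
```

Running the same query with both bodies and comparing the most recent days is a quick way to see how much the fresh data can still change.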

Data type map

Integration Type | Airbyte Type
string | string
number | number
array | array
object | object

Limitations & Troubleshooting

Connector limitations

Rate limiting

This connector attempts to back off gracefully when it hits the API's rate limits. For more information about these limits, see the Usage Limits documentation.
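For readers unfamiliar with the term, backing off means waiting progressively longer between retries after a rate-limit response. The following is a generic exponential backoff sketch, not the connector's actual implementation; RateLimitError, call_with_backoff, and the default delays are all hypothetical.

```python
import time


class RateLimitError(Exception):
    """Hypothetical error raised by a caller-supplied function when a quota is exceeded."""


def call_with_backoff(func, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func, retrying on RateLimitError with delays of base_delay * 2**attempt seconds."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)
```

The sleep parameter exists so that the waiting behaviour can be replaced in tests; in normal use the default time.sleep applies.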

Data retention

Google Search Console retains website data for only the last 16 months. Any data prior to this cutoff point will not be accessible. Please see this article for more information.
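When choosing a Start Date, it can be useful to estimate the earliest day that is still likely to be available. The sketch below subtracts 16 calendar months from a given day and clamps a requested start date to that cutoff. The helper names are hypothetical, and the exact cutoff Google applies may differ by a few days, so treat the result as an approximation.

```python
import calendar
from datetime import date

RETENTION_MONTHS = 16


def months_ago(day: date, months: int) -> date:
    """Return the date `months` calendar months before `day`, clamping to the month's last day."""
    total = day.year * 12 + (day.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def effective_start(start_date: date, today: date) -> date:
    """Clamp a requested start date to the approximate retention cutoff."""
    return max(start_date, months_ago(today, RETENTION_MONTHS))
```

For example, with today set to 2024-11-25, a requested Start Date of 2021-01-01 is clamped to 2023-07-25.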

Troubleshooting

  • Check out common troubleshooting issues for the Google Search Console source connector on our Airbyte Forum.

Reference

Config fields reference

Type | Property name
array<string> | site_urls
object | authorization
string | start_date
string | end_date
string | custom_reports
array<object> | custom_reports_array
string | data_state

Changelog

Version | Date | Pull Request | Subject
1.5.5 | 2024-11-25 | 43730 | Starting with this version, the Docker image is now rootless. Please note that this and future versions will not be compatible with Airbyte versions earlier than 0.64
1.5.4 | 2024-09-06 | 45196 | Fix request body for report streams by keyword
1.5.3 | 2024-08-03 | 43067 | Update dependencies
1.5.2 | 2024-07-27 | 42786 | Update dependencies
1.5.1 | 2024-07-20 | 42142 | Update dependencies
1.5.0 | 2024-07-17 | 42073 | Migrate to CDK v1.8.0
1.4.13 | 2024-07-13 | 41734 | Update dependencies
1.4.12 | 2024-07-10 | 41440 | Update dependencies
1.4.11 | 2024-07-09 | 41164 | Update dependencies
1.4.10 | 2024-07-06 | 40981 | Update dependencies
1.4.9 | 2024-06-27 | 40215 | Replaced deprecated AirbyteLogger with logging.Logger
1.4.8 | 2024-06-26 | 40532 | Update dependencies
1.4.7 | 2024-06-25 | 40312 | Update dependencies
1.4.6 | 2024-06-22 | 40077 | Update dependencies
1.4.5 | 2024-06-17 | 39516 | Update state handling for incremental streams
1.4.4 | 2024-06-04 | 39059 | [autopull] Upgrade base image to v1.2.1
1.4.3 | 2024-05-24 | 38649 | Update deprecated auth package
1.4.2 | 2024-04-19 | 36639 | Updating to 0.80.0 CDK
1.4.1 | 2024-04-12 | 36639 | Schema descriptions
1.4.0 | 2024-03-19 | 36267 | Pin airbyte-cdk version to ^0
1.3.7 | 2024-02-12 | 35163 | Manage dependencies with Poetry
1.3.6 | 2023-10-26 | 31863 | Base image migration: remove Dockerfile and use the python-connector-base image
1.3.5 | 2023-09-28 | 30822 | Fix primary key for custom reports
1.3.4 | 2023-09-27 | 30785 | Do not migrate config for the newly created connections
1.3.3 | 2023-08-29 | 29941 | Added primary key to each stream, added custom_report config migration
1.3.2 | 2023-08-25 | 29829 | Make Start Date a non-required, added the suggested streams, corrected public docs
1.3.1 | 2023-08-24 | 29329 | Update tooltip descriptions
1.3.0 | 2023-08-24 | 29750 | Add new Keyword-Site-Report-By-Site stream
1.2.2 | 2023-08-23 | 29741 | Handle HTTP-401, HTTP-403 errors
1.2.1 | 2023-07-04 | 27952 | Removed deprecated searchType, added discover(Discover results) and googleNews(Results from news.google.com, etc.) types
1.2.0 | 2023-06-29 | 27831 | Add new streams
1.1.0 | 2023-06-26 | 27738 | License Update: Elv2
1.0.2 | 2023-06-13 | 27307 | Fix data_state config typo
1.0.1 | 2023-05-30 | 26746 | Remove authSpecification from connector spec in favour of advancedAuth
1.0.0 | 2023-05-24 | 26452 | Add data_state parameter to specification
0.1.22 | 2023-03-20 | 22295 | Update specification examples
0.1.21 | 2023-02-14 | 22984 | Specified date formatting in specification
0.1.20 | 2023-02-02 | 22334 | Turn on default HttpAvailabilityStrategy
0.1.19 | 2023-01-27 | 22007 | Set AvailabilityStrategy for streams explicitly to None
0.1.18 | 2022-10-27 | 18568 | Improved config validation: custom_reports.dimension
0.1.17 | 2022-10-08 | 17751 | Improved config validation: start_date, end_date, site_urls
0.1.16 | 2022-09-28 | 17304 | Migrate to per-stream state.
0.1.15 | 2022-09-16 | 16819 | Check available site urls to avoid 403 error on sync
0.1.14 | 2022-09-08 | 16433 | Add custom analytics stream.
0.1.13 | 2022-07-21 | 14924 | Remove additionalProperties field from specs
0.1.12 | 2022-05-04 | 12482 | Update input configuration copy
0.1.11 | 2022-01-05 | 9186 | Fix incremental sync: keep all urls in state object
0.1.10 | 2021-12-23 | 9073 | Add slicing by date range
0.1.9 | 2021-12-22 | 9047 | Add 'order' to spec.json props
0.1.8 | 2021-12-21 | 8248 | Enable Sentry for performance and errors tracking
0.1.7 | 2021-11-26 | 7431 | Add default end_date param value
0.1.6 | 2021-09-27 | 6460 | Update OAuth Spec File
0.1.4 | 2021-09-23 | 6394 | Update Doc link Spec File
0.1.3 | 2021-09-23 | 6405 | Correct Spec File
0.1.2 | 2021-09-17 | 6222 | Correct Spec File
0.1.1 | 2021-09-22 | 6315 | Verify access to all sites when performing connection check
0.1.0 | 2021-09-03 | 5350 | Initial Release