
Automating SEO: Scraping and Reporting with Python

Published: 2025-09-05 21:00:49
Introduction

Search Engine Optimization (SEO) is an essential part of any digital marketing strategy, helping businesses rank higher on search engine results pages. However, manual SEO work is time-consuming and repetitive, especially for larger websites. Fortunately, Python offers tools and libraries that automate tasks such as data collection and reporting, making the process more efficient and accurate. In this article, we'll explore how to automate SEO with Python by pulling data from Google Search Console and Google Analytics and generating reports.

1. Scraping Google Search Console

Google Search Console (GSC) provides valuable insights into a website's performance, such as search queries, impressions, clicks, and crawl errors. We can pull this data automatically with Python through the Search Console API and the Google API Client Library. First, create a service account in the Google Cloud Console, download its JSON key, and add the service account's email address as a user on your GSC property. Then authenticate and retrieve the data with the searchanalytics().query() method.

```python
import csv

from google.oauth2 import service_account
from googleapiclient.discovery import build

# Authenticate with the service account's JSON key
SCOPES = ['https://www.googleapis.com/auth/webmasters.readonly']
creds = service_account.Credentials.from_service_account_file(
    'service-account.json', scopes=SCOPES)
service = build('searchconsole', 'v1', credentials=creds)

# Page through the Search Analytics report in batches
start_date = '2021-01-01'
end_date = '2021-01-31'
row_limit = 1000
start_row = 0
rows = []

while True:
    response = service.searchanalytics().query(
        siteUrl='https://example.com',
        body={
            'startDate': start_date,
            'endDate': end_date,
            'dimensions': ['date', 'query'],
            'rowLimit': row_limit,
            'startRow': start_row,
        },
    ).execute()
    batch = response.get('rows', [])
    if not batch:
        break
    rows.extend(batch)
    start_row += row_limit

# Save to CSV
with open('data.csv', 'w', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=['date', 'query', 'impressions', 'clicks'])
    writer.writeheader()
    for row in rows:
        writer.writerow({
            'date': row['keys'][0],   # first requested dimension
            'query': row['keys'][1],  # second requested dimension
            'impressions': row['impressions'],
            'clicks': row['clicks'],
        })
```
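Note that the Search Analytics endpoint does not return page tokens: each response is capped (rowLimit can go up to 25,000 per request), so the loop pages through the data by advancing startRow until an empty batch comes back. The keys list in each row holds the dimension values in the order they were requested.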

2. Scraping Google Analytics

Google Analytics (GA) provides website statistics, such as pageviews and bounce rate. For Universal Analytics properties, we can use the Core Reporting API (v3) and the same Google API Client Library to retrieve this data. First, create a service account, download its JSON key, and add the account's email address to the GA view with read permission. Then authenticate and retrieve the data with the data().ga().get() method.

```python
import csv

from google.oauth2 import service_account
from googleapiclient.discovery import build

# Authenticate with the service account's JSON key
SCOPES = ['https://www.googleapis.com/auth/analytics.readonly']
creds = service_account.Credentials.from_service_account_file(
    'service-account.json', scopes=SCOPES)
service = build('analytics', 'v3', credentials=creds)

# Get pageviews and bounce rate per page, most-viewed first
view_id = 'GA_VIEW_ID'
response = service.data().ga().get(
    ids='ga:' + view_id,
    start_date='2021-01-01',
    end_date='2021-01-31',
    metrics='ga:pageviews,ga:bounceRate',
    dimensions='ga:pagePath',
    sort='-ga:pageviews',
    max_results=1000,
).execute()

# Save to CSV; each row comes back as [pagePath, pageviews, bounceRate]
with open('ga.csv', 'w', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=['url', 'views', 'bounceRate'])
    writer.writeheader()
    for row in response.get('rows', []):
        writer.writerow({'url': row[0], 'views': row[1], 'bounceRate': row[2]})
```
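3. Generating a Report

With both exports in hand, the last step is turning the raw CSVs into something readable. The sketch below is one minimal way to do it with pandas, which isn't used elsewhere in this article; the file names match the scripts above, while the 70% bounce-rate threshold and the seo_report.xlsx output name are illustrative (the Excel export also needs openpyxl installed).

```python
import pandas as pd

# Load the two exports produced above
gsc = pd.read_csv('data.csv')  # columns: date, query, impressions, clicks
ga = pd.read_csv('ga.csv')     # columns: url, views, bounceRate

# Top 20 queries by total clicks over the period
top_queries = (gsc.groupby('query')[['clicks', 'impressions']]
                  .sum()
                  .sort_values('clicks', ascending=False)
                  .head(20))

# Pages whose bounce rate exceeds an illustrative 70% threshold
high_bounce = (ga[ga['bounceRate'].astype(float) > 70]
               .sort_values('bounceRate', ascending=False))

# Write a simple two-sheet Excel report
with pd.ExcelWriter('seo_report.xlsx') as writer:
    top_queries.to_excel(writer, sheet_name='Top queries')
    high_bounce.to_excel(writer, sheet_name='High bounce pages', index=False)
```

Run on the same cadence as the collection scripts, this produces a consistent snapshot that is easy to compare from one period to the next.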

Conclusion

To recap: create a service account, grant it access to your properties, pull search performance from Google Search Console and traffic metrics from Google Analytics with short Python scripts, and assemble the CSV exports into a report. The real payoff comes from consistency: schedule the scripts (with cron, for example) so the same data is collected the same way every week or month, and trends become visible instead of anecdotal.