search-console-report

# Search Console SEO Report Generator Generate professional, chart-rich PDF reports from Google Search Console data. Covers traffic trends, top pages, top keywords, country/device distribution, growth analysis, and actionable SEO recommendations. ## Prerequisites Before running this skill, verify these requirements: ### 1. Service Account Key File You need a Google Cloud Service Account JSON key file with access to the Search Console properties. The file looks like: ```json { "type": "service_account", "project_id": "...", "private_key_id": "...", "private_key": "-----BEGIN PRIVATE KEY-----\n...", "client_email": "xxx@project.iam.gserviceaccount.com", "token_uri": "https://oauth2.googleapis.com/token", ... } ``` Ask the user for the path to their key file. Common locations: `~/Downloads/*.json`, project directory. If the user doesn't have one yet, guide them through: 1. Create a Service Account in Google Cloud Console (IAM & Admin > Service Accounts) 2. Create a JSON key for it (Keys tab > Add Key > JSON) 3. Add the service account email as a user in Search Console (Settings > Users and permissions > Add user, "Restricted" permission is sufficient) 4. Enable the "Google Search Console API" in the project's API Library ### 2. Python Environment The script requires these packages: `pyjwt`, `cryptography`, `requests`, `matplotlib`, `pandas`, `reportlab`. Set up a virtual environment to avoid system conflicts: ```bash python3 -m venv /tmp/sc-env /tmp/sc-env/bin/pip install pyjwt cryptography requests matplotlib pandas reportlab ``` **IMPORTANT**: Always use `/tmp/sc-env/bin/python` to run scripts, not the system Python. **Timeout warning**: Package installation and first matplotlib import can be slow (60-120s). Set bash timeout to 180000ms for these operations. ### 3. Chinese Font for PDF (macOS) The PDF uses STHeiti for proper CJK + Latin + symbol rendering. Register it like this: ```python from reportlab.pdfbase import pdfmetrics from reportlab.pdfbase.ttfonts import TTFont pdfmetrics.registerFont(TTFont('CNFont', '/System/Library/Fonts/STHeiti Medium.ttc', subfontIndex=0)) pdfmetrics.registerFont(TTFont('CNFontLight', '/System/Library/Fonts/STHeiti Light.ttc', subfontIndex=0)) pdfmetrics.registerFontFamily('CNFont', normal='CNFontLight', bold='CNFont') ``` **CRITICAL font rules:** - Do NOT use `UnicodeCIDFont('STSong-Light')` — it causes English letter spacing to be too narrow and Unicode symbols like `•` (U+2022) to render as garbage characters (e.g. "煉"). - Always use TrueType fonts registered via `TTFont` for proper mixed CJK/Latin rendering. - On non-macOS systems, find an available CJK TTF font: `fc-list :lang=zh file` or look for Noto Sans CJK / WenQuanYi. ### 4. Matplotlib Font for Charts (macOS) **CRITICAL**: STHeiti alone does NOT cover Korean glyphs — keywords or country names in Korean will show as hollow rectangles (□). Use **Arial Unicode MS** instead, which covers CJK + Korean + most other scripts: ```python import matplotlib.font_manager as fm matplotlib.rcParams['font.family'] = 'sans-serif' matplotlib.rcParams['axes.unicode_minus'] = False _unicode_font_path = '/Library/Fonts/Arial Unicode.ttf' if not os.path.exists(_unicode_font_path): _unicode_font_path = '/System/Library/Fonts/STHeiti Medium.ttc' # fallback fm.fontManager.addfont(_unicode_font_path) _ufname = fm.FontProperties(fname=_unicode_font_path).get_name() matplotlib.rcParams['font.sans-serif'] = [_ufname, 'DejaVu Sans'] ``` **Do NOT** try to combine STHeiti + AppleSDGothicNeo via `font.sans-serif` list — matplotlib uses a single font per render pass and does not do per-glyph fallback, so the list only helps if glyphs exist in the first font. ## Step-by-Step Instructions ### Step 1: Gather Input from User Ask for or determine: - **Key file path**: Path to the Service Account JSON key file - **Site URLs**: One or more Search Console property URLs (format: `https://www.example.com/`) - **Date range**: Default to last 90 days. The user may request a custom range. - **Output path**: Where to save the PDF and data files. Default to project directory. - **Language**: Report can be in Chinese (default) or English — match the user's language. ### Step 2: Authenticate with Google API Use JWT-based Service Account authentication. Here is the exact authentication code: ```python import json, time, jwt, requests def get_access_token(key_file): with open(key_file) as f: creds = json.load(f) now = int(time.time()) payload = { "iss": creds["client_email"], "scope": "https://www.googleapis.com/auth/webmasters.readonly", "aud": creds["token_uri"], "iat": now, "exp": now + 3600, } signed_jwt = jwt.encode(payload, creds["private_key"], algorithm="RS256") resp = requests.post(creds["token_uri"], data={ "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer", "assertion": signed_jwt, }) resp.raise_for_status() return resp.json()["access_token"] ``` **Error handling**: If authentication fails with 403, the API may not be enabled or the service account may not have Search Console access. Tell the user which to check. ### Step 3: Fetch Data from Search Console API Use the Search Analytics API endpoint for each site. The base query function: ```python import datetime END_DATE = datetime.date.today() - datetime.timedelta(days=3) # Data has ~3 day lag START_DATE = END_DATE - datetime.timedelta(days=89) def query_sc(token, site_url, dimensions, start=None, end=None, row_limit=100): """Query Search Console Search Analytics API. Args: token: OAuth2 access token site_url: Full property URL, e.g. "https://www.example.com/" dimensions: List of dimensions. Valid values: - "date" — daily breakdown - "query" — search keywords - "page" — page URLs - "country" — ISO 3166-1 alpha-3 country codes (lowercase) - "device" — "DESKTOP", "MOBILE", "TABLET" - "searchAppearance" — rich result types Can combine: ["query", "page"] for keyword-page matrix start: Start date (datetime.date). Defaults to START_DATE. end: End date (datetime.date). Defaults to END_DATE. row_limit: Max rows (max 25000). Returns: List of row dicts with keys: "keys" (list), "clicks", "impressions", "ctr", "position" Note: "ctr" is a decimal (0.05 = 5%), multiply by 100 for display. """ url = f"https://www.googleapis.com/webmasters/v3/sites/{requests.utils.quote(site_url, safe='')}/searchAnalytics/query" body = { "startDate": (start or START_DATE).isoformat(), "endDate": (end or END_DATE).isoformat(), "dimensions": dimensions, "rowLimit": row_limit, } resp = requests.post(url, headers={"Authorization": f"Bearer {token}"}, json=body) resp.raise_for_status() return resp.json().get("rows", []) ``` For each site, fetch ALL of the following data (this is the complete list — do not skip any): | # | Query | Dimensions | row_limit | Purpose | |---|-------|-----------|-----------|---------| | 1 | Daily traffic trend | `["date"]` | 25000 | Time series for charts | | 2 | Top pages | `["page"]` | 50 | Most visited pages | | 3 | Top queries | `["query"]` | 50 | Most searched keywords | | 4 | Country distribution | `["country"]` | 30 | Geographic breakdown | | 5 | Device distribution | `["device"]` | 10 | Desktop/Mobile/Tablet split | | 6 | Search appearance | `["searchAppearance"]` | 20 | Rich result types | | 7 | Query-page combos | `["query", "page"]` | 100 | Which keywords drive which pages | | 8 | Period comparison (first half) | `["page"]` with first-half dates | 500 | Growth analysis | | 9 | Period comparison (second half) | `["page"]` with second-half dates | 500 | Growth analysis | **Period comparison logic**: Split the date range in half. For each page URL, compare clicks between the two halves. Categorize pages as: - **Growing**: clicks increased (sort by change descending) - **Declining**: clicks decreased (sort by change ascending) - **New**: appeared only in the second half - **Lost**: appeared only in the first half ### Step 4: Calculate Summary Statistics For each site, compute: ```python daily = site_data["daily_trend"] total_clicks = sum(d["clicks"] for d in daily) total_impressions = sum(d["impressions"] for d in daily) avg_ctr = sum(d["ctr"] for d in daily) / len(daily) # Already *100 if you stored it that way avg_position = sum(d["position"] for d in daily) / len(daily) # Trend: compare last 30 days vs first 30 days if len(daily) >= 60: first_30_clicks = sum(d["clicks"] for d in daily[:30]) last_30_clicks = sum(d["clicks"] for d in daily[-30:]) click_trend_pct = ((last_30_clicks - first_30_clicks) / max(first_30_clicks, 1)) * 100 # Same for impressions ``` ### Step 5: Save Raw Data as JSON Save all fetched data to `sc_detailed_data.json` for reproducibility: ```python with open(f"{output_dir}/sc_detailed_data.json", "w") as f: json.dump(all_data, f, ensure_ascii=False, indent=2) ``` ### Step 6: Generate Charts with Matplotlib **IMPORTANT**: Always set `matplotlib.use('Agg')` BEFORE importing pyplot (no display server available). Generate these charts (save as PNG, dpi=150): #### Chart 1: Combined Traffic Trend (all sites) - 2-row subplot: top = daily clicks, bottom = daily impressions - One line per site, color-coded - X-axis: dates formatted as `%m-%d`, rotated 45 degrees - Legend in upper-left #### Chart 2: Per-site Detail (one per site with enough data) - 2-row subplot: top = daily clicks with 7-day moving average, bottom = average position (inverted Y-axis — lower is better) - Fill area under clicks line with alpha=0.3 ```python # 7-day moving average calculation if len(clicks) >= 7: ma7 = [sum(clicks[max(0,i-6):i+1]) / min(7, i+1) for i in range(len(clicks))] ``` #### Chart 3: Device Distribution - 1-row, N-column pie charts (one per site) - Show percentage labels #### Chart 4: Country Distribution (pie chart per site) - Use `figsize=(10, 7)` and `radius=0.9` — smaller sizes make labels illegible - Place on its **own full-width row** in the PDF (do NOT put side-by-side with keyword chart) - In PDF, render at `width=16*cm` - Top 8 countries by clicks - Map country codes to readable names using this mapping: ```python COUNTRY_NAMES = { 'idn': 'Indonesia', 'hkg': 'Hong Kong', 'mac': 'Macau', 'kor': 'South Korea', 'usa': 'United States', 'jpn': 'Japan', 'sgp': 'Singapore', 'mys': 'Malaysia', 'twn': 'Taiwan', 'tha': 'Thailand', 'phl': 'Philippines', 'ind': 'India', 'vnm': 'Vietnam', 'gbr': 'United Kingdom', 'deu': 'Germany', 'fra': 'France', 'aus': 'Australia', 'can': 'Canada', 'bra': 'Brazil', 'mex': 'Mexico', 'chn': 'China', 'pak': 'Pakistan', 'bgd': 'Bangladesh', 'lka': 'Sri Lanka', 'mmr': 'Myanmar', 'khm': 'Cambodia', 'npl': 'Nepal', 'are': 'UAE', 'sau': 'Saudi Arabia', 'tur': 'Turkey', 'egy': 'Egypt', 'nga': 'Nigeria', 'ken': 'Kenya', 'zaf': 'South Africa', 'col': 'Colombia', 'arg': 'Argentina', 'per': 'Peru', 'chl': 'Chile', 'nzl': 'New Zealand', 'ita': 'Italy', 'esp': 'Spain', 'nld': 'Netherlands', 'rus': 'Russia', 'pol': 'Poland', } ``` ### Step 7: Generate PDF Report Use reportlab with A4 page size. The report has 7 sections: #### PDF Structure ``` Cover Page - Report title (in user's language) - Subtitle: "Google Search Console Data Analysis & Recommendations" - Report date, data range, data source, covered sites Section 1: Executive Summary - Summary table (all sites: clicks, impressions, avg CTR, avg position, trends) - Key findings (5-6 bullet points highlighting most important insights) Section 2: Traffic Trends - Combined traffic trend chart (all sites) - Per-site detail charts (clicks + position) Section 3: Top Pages (TOP 10) - Table per site: rank, page path, clicks, impressions, CTR, position - Shorten long URLs: if > 45 chars, truncate with "..." Section 4: Top Keywords (TOP 15) - Table per site: rank, keyword, clicks, impressions, CTR, position Section 5: Country & Device Distribution - Device pie charts - Country bar charts (for major sites) - Country tables (all sites, top 10) Section 6: Growth Analysis - Per site: growing pages table (green header), declining pages table (red header) - New page count, lost page count Section 7: Recommendations & Action Plan - AI-generated recommendations based on the data (see analysis guidelines below) - Priority action table (P0/P1/P2) ``` #### PDF Style Configuration Use **1.2cm margins** (not the default 2cm) to maximize content area. Usable width = 21cm - 2×1.2cm = **18.6cm**. Set all chart/table widths to 18.6cm accordingly. ```python from reportlab.lib.pagesizes import A4 from reportlab.lib.units import mm, cm from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle from reportlab.lib.enums import TA_CENTER from reportlab.platypus import (SimpleDocTemplate, Paragraph, Spacer, Table, TableStyle, Image, PageBreak, HRFlowable) from reportlab.lib import colors as rl_colors doc = SimpleDocTemplate(pdf_path, pagesize=A4, leftMargin=1.2*cm, rightMargin=1.2*cm, topMargin=1.2*cm, bottomMargin=1.2*cm) # Define styles — use 'CNFont' (the registered STHeiti font) styles = getSampleStyleSheet() styles.add(ParagraphStyle(name='CNTitle', fontName='CNFont', fontSize=22, alignment=TA_CENTER, spaceAfter=6*mm, leading=28)) styles.add(ParagraphStyle(name='CNSubtitle', fontName='CNFont', fontSize=12, alignment=TA_CENTER, textColor=rl_colors.grey, spaceAfter=10*mm)) styles.add(ParagraphStyle(name='CNHeading1', fontName='CNFont', fontSize=16, spaceAfter=4*mm, spaceBefore=8*mm, leading=22, textColor=rl_colors.HexColor('#1a73e8'))) styles.add(ParagraphStyle(name='CNHeading2', fontName='CNFont', fontSize=13, spaceAfter=3*mm, spaceBefore=5*mm, leading=18, textColor=rl_colors.HexColor('#333333'))) styles.add(ParagraphStyle(name='CNBody', fontName='CNFont', fontSize=10, spaceAfter=2*mm, leading=16)) styles.add(ParagraphStyle(name='CNSmall', fontName='CNFont', fontSize=8, textColor=rl_colors.grey, leading=12)) styles.add(ParagraphStyle(name='CNBullet', fontName='CNFont', fontSize=10, spaceAfter=1.5*mm, leading=16, leftIndent=10*mm, bulletIndent=5*mm)) ``` #### Table Style Template Use this consistent style for all data tables: ```python table_style = TableStyle([ ('FONTNAME', (0,0), (-1,-1), 'CNFont'), ('FONTSIZE', (0,0), (-1,-1), 7), # Small font for dense data ('BACKGROUND', (0,0), (-1,0), rl_colors.HexColor('#1a73e8')), # Blue header ('TEXTCOLOR', (0,0), (-1,0), rl_colors.white), ('ALIGN', (2,0), (-1,-1), 'RIGHT'), # Numbers right-aligned ('ALIGN', (0,0), (0,-1), 'CENTER'), # Rank column centered ('GRID', (0,0), (-1,-1), 0.5, rl_colors.HexColor('#dddddd')), ('ROWBACKGROUNDS', (0,1), (-1,-1), [rl_colors.white, rl_colors.HexColor('#f8f9fa')]), ('TOPPADDING', (0,0), (-1,-1), 2), ('BOTTOMPADDING', (0,0), (-1,-1), 2), ]) ``` #### Bullet Points Use Unicode bullet character `\u2022` (•) for list items: ```python story.append(Paragraph(f"\u2022 {text}", styles['CNBullet'])) ``` This renders correctly with STHeiti font. Do NOT use other bullet approaches. ### Step 8: Generate SEO Recommendations Analyze the data and generate recommendations following these guidelines: #### Analysis Framework 1. **CTR Analysis**: If average CTR < 5%, recommend Title/Description optimization. 2. **Position Opportunities**: Find keywords ranking 5-15 (page 1-2 boundary) — these are low-hanging fruit for optimization. 3. **Country Focus**: Identify the #1 traffic country and recommend localized content. 4. **Growth Momentum**: Sites with click growth > 100% are in "breakout" phase — recommend increasing content investment. 5. **New Sites**: Sites with < 30 days of data need basic SEO foundations (Sitemap submission, internal linking). 6. **Declining Pages**: If many pages have declining clicks, recommend content quality audit. 7. **Device Split**: If mobile > 60%, emphasize mobile optimization and Core Web Vitals. 8. **Technical SEO**: Always recommend hreflang for multi-region sites, 404 fixes, and page speed optimization. #### Priority Classification - **P0** (Do immediately): CTR optimization, fixing unindexed pages - **P1** (Do this month): Content localization, keyword optimization for positions 5-15 - **P2** (Plan for next quarter): hreflang implementation, Core Web Vitals, Sitemap tuning ### Step 9: Present Results After generating the PDF: 1. Confirm the PDF file path to the user 2. Provide a summary in chat covering: - Report structure (sections and page count) - Key highlights per site (1-2 sentences each) - Top 3 priority recommendations 3. Mention the raw data JSON file path for further analysis 4. Offer next steps (e.g., "Do you want me to analyze a specific page or keyword in more detail?") ## Common Errors and Solutions | Error | Cause | Solution | |-------|-------|----------| | `403 Forbidden` on API call | Service account not added to Search Console | Add the service account email as a user in Search Console settings | | `403 Google Search Console API has not been enabled` | API not enabled | Enable it at https://console.cloud.google.com/apis/library/searchconsole.googleapis.com | | Empty `rows` in response | No data for that site/date range | Check if the site URL exactly matches the Search Console property (trailing slash matters!) | | `jwt.encode` error | Missing `cryptography` package | `pip install cryptography` | | PDF shows garbled Chinese | Wrong font | Use TTFont with STHeiti, NOT UnicodeCIDFont with STSong-Light | | Matplotlib timeout on first run | Building font cache | Set bash timeout to 180000ms; this only happens once | | `MPLCONFIGDIR` warning | No write access to `~/.matplotlib` | Harmless; matplotlib creates a temp cache automatically | ## Example Usage **User**: "Help me generate an SEO report for my websites using Search Console" → Ask for key file path and site URLs, then run the full pipeline. **User**: "Analyze search performance for example.com over the last 90 days and export to PDF" → Run with default 90-day range, generate full report. **User**: "Compare search traffic between my 3 sites" → Run for all 3 sites, emphasize the comparison aspects in the summary table and trends chart. ## Layout Rules (Lessons Learned) ### Top Pages Table — Use Paragraph for Word Wrap **CRITICAL**: The Page column contains long URLs that WILL overflow into adjacent columns if you use plain strings. Always wrap ALL table cells in `Paragraph()` objects: ```python style_cell = ParagraphStyle('Cell', fontName='CNFontLight', fontSize=9, leading=13, wordWrap='CJK') style_cell_hd = ParagraphStyle('CellHd', fontName='CNFont', fontSize=9, leading=13, textColor=colors.white) page_rows = [[Paragraph("Page", style_cell_hd), Paragraph("Clicks", style_cell_hd), ...]] for r in rows: page_rows.append([Paragraph(url, style_cell), Paragraph(f"{clicks:,}", style_cell), ...]) ``` ### Country Chart — Keep on Its Own Row Do NOT put the country pie chart side-by-side with the keyword chart. The pie becomes too small to read. Always put it on a separate row at `width=16*cm`. ### File Naming — Always Include Date Output filename must include today's date to avoid overwriting previous reports: ```python OUTPUT = f"search_console_report_{datetime.date.today()}.pdf" ``` ### Per-Site Layout Order (One Page Per Site) 1. KPI metrics row (Clicks / Impressions / CTR / Avg. Position) 2. Top Keywords chart (full width, 18.6cm) 3. Country pie chart (own row, 16cm) 4. Top Pages table (with Paragraph cells) 5. PageBreak() ## Reference Script See `gen_report.py` in this skill directory for the complete, production-ready implementation with all fixes applied. ## Output Files | File | Description | |------|-------------| | `sc_detailed_data.json` | Raw API data for all sites (reproducible) | | `report_charts/*.png` | Generated chart images (temp, /tmp/) | | `project/reports/search_console_report_YYYY-MM-DD.pdf` | Final PDF report (date-stamped, never overwritten) |

search-console-report

标签

通过对话安装

方式一：安装 SkillHub 和技能

方式二：设置 SkillHub 为优先技能安装源

通过命令行安装

下载 Zip 包

search-console-report