meta-ad-spy

# Meta Ad Spy: Competitor Ad Intelligence Skill

A two-phase skill for extracting and analyzing competitor ads from Meta platforms.

## Architecture Overview

```
Phase 1: Playwright Scraper (No API key needed)
  └── facebook.com/ads/library → Ad creatives, copy, status, platforms, dates

Phase 2: Meta Graph API (Requires access token)
  └── graph.facebook.com/v23.0/ads_archive → Spend ranges, impressions, demographics

Analysis Layer: Claude synthesizes insights from both sources
```

---

## PHASE 1: Playwright Scraper

**When to use**: Always as the first step, or when the user has no API token.

**What it gets**: Ad creatives (image/video URLs), ad copy, CTA text, page name, start date, active status, platforms (Facebook/Instagram), ad format (carousel, video, static).

**What it can't get**: Spend ranges, impressions, demographic breakdown (those need Phase 2).

### Setup

```bash
pip install playwright --break-system-packages
playwright install chromium
```

`asyncio` ships with the Python standard library, so it does not need to be installed separately.

### Core Playwright Script

Write this to `/tmp/meta_ad_scraper.py`:

```python
import asyncio
import json
import sys
from typing import Optional
from urllib.parse import urlencode

from playwright.async_api import async_playwright

# JavaScript evaluated in the page. A raw string keeps the regex backslash intact.
EXTRACT_ADS_JS = r"""
() => {
  const ads = [];
  // Ad archive links carry the Library ID in their query string
  const links = document.querySelectorAll('a[href*="ads/archive"]');
  const seenIds = new Set();

  const getText = (el, fallback = '') => (el ? el.innerText.trim() : fallback);
  const getAttr = (el, attr, fallback = '') => (el ? el.getAttribute(attr) || fallback : fallback);

  links.forEach(link => {
    const href = link.href;
    const idMatch = href.match(/id=(\d+)/);
    if (!idMatch || seenIds.has(idMatch[1])) return;
    seenIds.add(idMatch[1]);

    // Walk up to 8 levels to approximate the ad container, stopping at the root
    let container = link;
    for (let i = 0; i < 8 && container.parentElement; i++) {
      container = container.parentElement;
    }

    ads.push({
      ad_archive_id: idMatch[1],
      ad_snapshot_url: href,
      page_name: getText(container.querySelector('[class*="page-name"], strong')),
      ad_body: getText(container.querySelector('[data-ad-preview="message"], [class*="body"]')),
      ad_title: getText(container.querySelector('[class*="title"]')),
      cta_text: getText(container.querySelector('[class*="cta"], button')),
      image_url: getAttr(container.querySelector('img[src*="fbcdn"]'), 'src'),
      started_running: getText(container.querySelector('[class*="started-running"]')),
      platforms: Array.from(container.querySelectorAll('[class*="platform"]'))
        .map(el => el.innerText.trim())
        .filter(Boolean),
      // Kept as a fallback for when the selectors above stop matching
      raw_text: (container.innerText || '').substring(0, 500),
    });
  });
  return ads;
}
"""


async def scrape_ad_library(
    search_query: Optional[str] = None,
    page_id: Optional[str] = None,
    country: str = "ALL",
    ad_type: str = "all",            # all | political_and_issue_ads | housing_ads | employment_ads | credit_ads
    active_status: str = "active",   # active | inactive | all
    media_type: str = "all",         # all | image | meme | video | none
    max_ads: int = 50,
) -> list[dict]:
    """
    Scrape Meta Ad Library for competitor ads.
    Either search_query or page_id must be provided.
    """
    if not search_query and not page_id:
        raise ValueError("Provide either search_query or page_id")

    # Build the URL; urlencode escapes spaces and special characters in the query
    params = {
        "active_status": active_status,
        "ad_type": ad_type,
        "country": country,
        "media_type": media_type,
    }
    if search_query:
        params["q"] = search_query
        params["search_type"] = "keyword_unordered"
    elif page_id:
        params["view_all_page_id"] = page_id
        params["search_type"] = "page"
    url = "https://www.facebook.com/ads/library/?" + urlencode(params)

    async with async_playwright() as p:
        browser = await p.chromium.launch(
            headless=True,
            args=[
                "--no-sandbox",
                "--disable-blink-features=AutomationControlled",
                "--disable-dev-shm-usage",
            ],
        )
        try:
            context = await browser.new_context(
                viewport={"width": 1440, "height": 900},
                user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36",
                locale="en-US",
            )
            # Stealth: mask webdriver
            await context.add_init_script("""
                Object.defineProperty(navigator, 'webdriver', { get: () => undefined });
            """)
            page = await context.new_page()

            print(f"[Phase 1] Navigating to: {url}")
            await page.goto(url, wait_until="networkidle", timeout=30000)
            await page.wait_for_timeout(3000)

            # Scroll to load more ads
            ads_loaded = 0
            scroll_attempts = 0
            while ads_loaded < max_ads and scroll_attempts < 20:
                await page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
                await page.wait_for_timeout(2000)
                # Count ad cards
                ad_cards = await page.query_selector_all(
                    '[data-testid="ad-card"], ._7jvw, [class*="x8t9es0"]'
                )
                ads_loaded = len(ad_cards)
                scroll_attempts += 1
                if scroll_attempts % 5 == 0:
                    print(f"[Phase 1] Loaded {ads_loaded} ads so far...")

            # Extract ad data via JavaScript
            ads_data = await page.evaluate(EXTRACT_ADS_JS)
            print(f"[Phase 1] Extracted {len(ads_data)} ads from DOM")
            return ads_data[:max_ads]
        finally:
            await browser.close()


async def main():
    query = sys.argv[1] if len(sys.argv) > 1 else "Nike shoes"
    ads = await scrape_ad_library(search_query=query, max_ads=20)
    print(json.dumps(ads, indent=2, ensure_ascii=False))


if __name__ == "__main__":
    asyncio.run(main())
```

### How to Run Phase 1

```bash
python /tmp/meta_ad_scraper.py "competitor brand name"
```

Or from within Python (for page ID lookups):

```python
# Inside an async function; otherwise wrap the call in asyncio.run(...)
ads = await scrape_ad_library(page_id="434174436675167", active_status="active")
```

### Filters Available in Phase 1

| Filter | Values | Notes |
|--------|--------|-------|
| `active_status` | `active`, `inactive`, `all` | `active` = currently running |
| `ad_type` | `all`, `political_and_issue_ads`, `housing_ads`, `employment_ads`, `credit_ads` | Default: `all` |
| `country` | `ALL`, `US`, `IN`, `GB`, `DE`, `FR`, `AU`, etc. | ISO codes |
| `media_type` | `all`, `image`, `meme`, `video`, `none` | Filter by creative format |
| `search_query` | Any keyword string | Brand name, product, keyword |
| `page_id` | Facebook Page ID | More precise than keyword search |

---

## PHASE 2: Meta Graph API

**When to use**: After Phase 1, or when the user wants spend/impression/demographic data.

**Requirements**: Meta developer account + access token (see setup below).

**What it gets**: Spend ranges, impression ranges, demographic distribution (EU/political), delivery by region, ad creative details, estimated audience size.

### Setup Instructions (tell the user)

1. Go to [Meta for Developers](https://developers.facebook.com/) → Create App
2. Go to [facebook.com/ID](https://www.facebook.com/ID) → Confirm identity (required for spend data)
3. Generate a User Access Token with `ads_read` permission from [Graph API Explorer](https://developers.facebook.com/tools/explorer/)
4. Set as env var: `export META_ACCESS_TOKEN="your_token_here"`

### Core API Script

Write this to `/tmp/meta_ad_api.py`:

```python
import json
import os
import sys
import time
from datetime import datetime
from typing import Optional

import requests

META_API_VERSION = "v23.0"
BASE_URL = f"https://graph.facebook.com/{META_API_VERSION}/ads_archive"

# Fields requested from the API. The ad's Library ID is returned as `id`.
ALL_FIELDS = [
    "id",
    "ad_creative_bodies",
    "ad_creative_link_captions",
    "ad_creative_link_descriptions",
    "ad_creative_link_titles",
    "ad_delivery_start_time",
    "ad_delivery_stop_time",
    "ad_snapshot_url",
    "bylines",
    "delivery_by_region",
    "demographic_distribution",
    "estimated_audience_size",
    "impressions",
    "page_id",
    "page_name",
    "publisher_platforms",
    "spend",
    "languages",
    "currency",
]


def query_ad_library(
    access_token: str,
    search_terms: Optional[str] = None,
    search_page_ids: Optional[list[str]] = None,
    ad_reached_countries: Optional[list[str]] = None,   # defaults to ["US"]
    ad_active_status: str = "ACTIVE",                   # ACTIVE | INACTIVE | ALL
    ad_type: str = "ALL",                               # ALL | POLITICAL_AND_ISSUE_ADS | etc.
    ad_delivery_date_min: Optional[str] = None,         # "YYYY-MM-DD"
    ad_delivery_date_max: Optional[str] = None,         # "YYYY-MM-DD"
    publisher_platforms: Optional[list[str]] = None,    # ["FACEBOOK", "INSTAGRAM"]
    languages: Optional[list[str]] = None,
    limit: int = 50,
    max_pages: int = 5,
) -> list[dict]:
    """
    Query Meta Ad Library API with full pagination support.
    Returns list of ad objects with all requested fields.
    """
    if not access_token:
        raise ValueError("META_ACCESS_TOKEN is required for Phase 2")

    params = {
        "access_token": access_token,
        "ad_active_status": ad_active_status,
        "ad_type": ad_type,
        "ad_reached_countries": json.dumps(ad_reached_countries or ["US"]),
        "fields": ",".join(ALL_FIELDS),
        "limit": min(limit, 500),  # API max per page
    }
    if search_terms:
        params["search_terms"] = search_terms
    if search_page_ids:
        params["search_page_ids"] = ",".join(search_page_ids)
    if ad_delivery_date_min:
        params["ad_delivery_date_min"] = ad_delivery_date_min
    if ad_delivery_date_max:
        params["ad_delivery_date_max"] = ad_delivery_date_max
    if publisher_platforms:
        params["publisher_platforms"] = json.dumps(publisher_platforms)
    if languages:
        params["languages"] = json.dumps(languages)

    all_ads = []
    page_count = 0
    next_url = None

    while page_count < max_pages:
        try:
            if next_url:
                # The `next` URL already carries the token and all parameters
                response = requests.get(next_url, timeout=30)
            else:
                response = requests.get(BASE_URL, params=params, timeout=30)

            # Graph API errors arrive as HTTP 4xx with a JSON body, so report the body
            if not response.ok:
                print(f"[Phase 2] API error {response.status_code}: {response.text[:500]}", file=sys.stderr)
                break
            data = response.json()

            ads = data.get("data", [])
            all_ads.extend(ads)
            page_count += 1
            print(f"[Phase 2] Page {page_count}: fetched {len(ads)} ads (total: {len(all_ads)})")

            # Pagination
            next_url = data.get("paging", {}).get("next")
            if not next_url or len(all_ads) >= limit:
                break
            time.sleep(1)  # Rate limit courtesy
        except requests.exceptions.RequestException as e:
            print(f"[Phase 2] Request error: {e}", file=sys.stderr)
            break

    return all_ads[:limit]


def analyze_ads(ads: list[dict]) -> dict:
    """
    Extract competitive intelligence insights from raw ad data.
    """
    if not ads:
        return {"error": "No ads found"}

    # Spend analysis (bounds are returned as strings)
    spends = []
    for ad in ads:
        spend = ad.get("spend", {})
        if isinstance(spend, dict):
            lo = spend.get("lower_bound", 0)
            hi = spend.get("upper_bound", 0)
            if lo and hi:
                spends.append({
                    "ad_id": ad.get("id"),
                    "min": int(lo),
                    "max": int(hi),
                    "midpoint": (int(lo) + int(hi)) // 2,
                })

    # Platform distribution
    platform_counts = {}
    for ad in ads:
        for p in ad.get("publisher_platforms", []):
            platform_counts[p] = platform_counts.get(p, 0) + 1

    # Ad longevity (proxy for performance: longer running = likely working)
    long_running = []
    for ad in ads:
        start = ad.get("ad_delivery_start_time")
        if not start:
            continue
        try:
            # Use only the date part so any time or timezone suffix is ignored
            started = datetime.strptime(start[:10], "%Y-%m-%d")
        except ValueError:
            continue
        long_running.append({
            "ad_id": ad.get("id"),
            "days_running": (datetime.now() - started).days,
            "page": ad.get("page_name"),
        })
    long_running.sort(key=lambda x: x["days_running"], reverse=True)

    return {
        "total_ads": len(ads),
        "spend_analysis": {
            "ads_with_spend_data": len(spends),
            "estimated_total_min_spend": sum(s["min"] for s in spends),
            "estimated_total_max_spend": sum(s["max"] for s in spends),
            "top_spenders": sorted(spends, key=lambda x: x["midpoint"], reverse=True)[:5],
        },
        "platform_distribution": platform_counts,
        "longest_running_ads": long_running[:10],
        "pages_advertising": list(set(ad.get("page_name") for ad in ads if ad.get("page_name"))),
        "sample_creatives": [
            {
                "page": ad.get("page_name"),
                "body": (ad.get("ad_creative_bodies") or [""])[0][:300],
                "title": (ad.get("ad_creative_link_titles") or [""])[0],
                "platforms": ad.get("publisher_platforms", []),
                "snapshot_url": ad.get("ad_snapshot_url"),
            }
            for ad in ads[:10]
        ],
    }


if __name__ == "__main__":
    token = os.environ.get("META_ACCESS_TOKEN", "")
    search = sys.argv[1] if len(sys.argv) > 1 else "Nike"

    ads = query_ad_library(token, search_terms=search, ad_reached_countries=["US"], limit=50)
    analysis = analyze_ads(ads)
    print(json.dumps(analysis, indent=2, ensure_ascii=False))

    # Also save raw data
    with open("/tmp/meta_ads_raw.json", "w", encoding="utf-8") as f:
        json.dump(ads, f, indent=2, ensure_ascii=False)
    print("\n[Phase 2] Raw data saved to /tmp/meta_ads_raw.json")
```

### API Filter Reference

| Parameter | Values | Notes |
|-----------|--------|-------|
| `search_terms` | Any string | Keyword search in ad content |
| `search_page_ids` | List of FB page IDs | Most precise competitor lookup |
| `ad_reached_countries` | `["US"]`, `["IN"]`, `["GB","DE"]` | Required parameter |
| `ad_active_status` | `ACTIVE`, `INACTIVE`, `ALL` | ACTIVE = currently live |
| `ad_type` | `ALL`, `POLITICAL_AND_ISSUE_ADS`, `HOUSING_ADS`, `EMPLOYMENT_ADS`, `FINANCIAL_PRODUCTS_AND_SERVICES_ADS` | Filter by category |
| `ad_delivery_date_min` | `"2024-01-01"` | Start of date range |
| `ad_delivery_date_max` | `"2024-12-31"` | End of date range |
| `publisher_platforms` | `["FACEBOOK"]`, `["INSTAGRAM"]`, `["FACEBOOK","INSTAGRAM"]` | Platform filter |
| `languages` | `["en"]`, `["hi"]`, `["es"]` | Language codes |

### Data Fields Available from API

**Always available (all ads):**
- `id`: Library ID, unique per ad
- `page_id`, `page_name`: Advertiser page
- `ad_creative_bodies`: Ad copy text(s)
- `ad_creative_link_titles`, `ad_creative_link_descriptions`: Headlines
- `ad_delivery_start_time`, `ad_delivery_stop_time`: Run dates
- `publisher_platforms`: FB/Instagram/Messenger/Audience Network
- `ad_snapshot_url`: Link to view the actual ad

**EU/UK/Political ads only:**
- `spend`: `{lower_bound, upper_bound, currency}`, a spend RANGE, not exact
- `impressions`: `{lower_bound, upper_bound}`, an impression RANGE
- `estimated_audience_size`: `{lower_bound, upper_bound}`
- `demographic_distribution`: `[{age, gender, percentage}]` array
- `delivery_by_region`: Geographic breakdown
- `bylines`: "Paid for by" disclaimer

> ⚠️ **Important**: Spend and impressions are RANGES, not exact numbers. For non-EU/non-political ads in most countries, including the US and India, spend/impression data will NOT be returned. The official API is primarily a transparency tool. For richer commercial ad data, see the third-party alternatives in `references/alternatives.md`.

---

## ANALYSIS WORKFLOW

When a user wants competitor ad intelligence, follow this flow:

### Step 1: Clarify the Target

Ask (or infer from context):
- **Who**: brand name OR Facebook Page ID (better)
- **Where**: country/region (`US`, `IN`, `ALL`, etc.)
- **What**: active only, or historical too?
- **Goal**: creative inspiration, spend monitoring, format analysis, copy patterns?

### Step 2: Find the Page ID (if only brand name given)

```bash
# Tell user to visit: https://www.facebook.com/ads/library/?q=BRAND_NAME
# The page_id appears in the URL when clicking on a page
# OR use the search API:
curl "https://graph.facebook.com/v23.0/pages/search?q=BRAND_NAME&access_token=TOKEN"
```

### Step 3: Run Phase 1 (Playwright)

Always run Phase 1 first. Write and execute `/tmp/meta_ad_scraper.py`.

### Step 4: Run Phase 2 (API), if token available

Check for the `META_ACCESS_TOKEN` env var. If present, run Phase 2. If missing, tell the user what Phase 2 would add, and give the setup instructions.

### Step 5: Synthesize & Report

Produce a structured competitive intelligence report covering:

```
## 🕵️ Competitor Ad Intelligence Report: [Brand Name]

### 1. Current Ad Activity
- How many ads active right now
- Platforms being used (FB vs Instagram split)
- Ad formats (video, image, carousel)

### 2. Creative Strategy Analysis
- Common themes in ad copy
- CTA patterns (Shop Now, Learn More, Sign Up, etc.)
- Headline formulas being used
- Hook styles (question, statement, social proof, urgency)

### 3. Ad Longevity Signals
- Longest-running ads (strong = likely performing well)
- New ads launched recently (testing phase)

### 4. Spend & Scale Signals (Phase 2 only, EU/political)
- Estimated spend ranges
- Impression volume estimates
- Geographic distribution

### 5. Audience Signals (Phase 2 EU only)
- Age/gender demographic breakdown
- Platform delivery split

### 6. Strategic Recommendations
- Gaps in competitor's strategy you can exploit
- Formats/messages they're NOT using
- High-performing creative patterns to draw inspiration from
```

---

## COMMON WORKFLOWS

The `await scrape_ad_library(...)` calls below must run inside an async function, or be wrapped in `asyncio.run(...)`.

### "What ads is [Brand] running right now?"

```python
# Phase 1
ads = await scrape_ad_library(search_query="Brand Name", active_status="active")

# Phase 2 (if token available)
ads_api = query_ad_library(token, search_terms="Brand Name", ad_active_status="ACTIVE")
```

### "Show me competitor video ads in India"

```python
ads = await scrape_ad_library(
    search_query="competitor name",
    country="IN",
    media_type="video",
    active_status="active"
)
```

### "How much is [Brand] spending on ads?" (EU/political only)

```python
ads = query_ad_library(
    token,
    search_terms="Brand",
    ad_reached_countries=["GB"],  # or EU countries
    ad_type="ALL",
)
analysis = analyze_ads(ads)
# Look at analysis["spend_analysis"]
```

### "Show me ads that have been running the longest" (= likely winners)

```python
ads = query_ad_library(token, search_terms="Brand", ad_active_status="ALL")
analysis = analyze_ads(ads)
# analysis["longest_running_ads"] is sorted by days running
```

### "Find ads about [topic/product keyword]"

```python
ads = await scrape_ad_library(search_query="keyword phrase", active_status="all")
```

---

## LIMITATIONS & WORKAROUNDS

| Limitation | Workaround |
|------------|------------|
| Spend data only for EU/political | Target EU countries in API query |
| No CTR/conversion data | Use ad longevity as performance proxy |
| Phase 1 DOM selectors may break | Fall back to raw text extraction + Claude analysis |
| Rate limits on API | Add `time.sleep(1)` between pages, use cursor pagination |
| Max ~429 ads per API session | Run multiple targeted queries, filter by date ranges |
| No exact targeting info | Infer from demographic distribution (EU only) |

---

## NOTES ON LEGALITY & ETHICS

- The Meta Ad Library is **public data**: no login is required to view commercial ads
- Meta publishes the Ad Library for advertising transparency, and reviewing competitors' ads is a common use of it
- The official API is a **transparency tool**; use it as intended
- Playwright scraping of publicly accessible pages is generally not treated as a violation of the US Computer Fraud and Abuse Act (ref: hiQ v. LinkedIn, 9th Cir. 2022)
- Do NOT attempt to scrape user data, private profiles, or non-public content
- Always respect rate limits and avoid aggressive scraping

---

## SEE ALSO

- `references/alternatives.md`: Third-party APIs with richer data (SearchAPI.io, AdLibrary.com)
- `references/page_id_lookup.md`: How to find competitor Facebook Page IDs
- `references/field_reference.md`: Complete API field reference
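
---

## APPENDIX: COMBINING PHASE 1 AND PHASE 2 RESULTS

The analysis layer draws on both sources, so it can help to join the two result lists on the ad's Library ID before writing the Step 5 report. The sketch below is one possible way to do that, not part of the skill's scripts: `merge_ad_records` and the `sources` key are hypothetical names, and it assumes scraper records carry the ID under `ad_archive_id` while API records carry it under `id` (either key is accepted).

```python
from typing import Optional


def merge_ad_records(scraped: list[dict], api_ads: list[dict]) -> list[dict]:
    """Join Phase 1 (scraper) and Phase 2 (API) records on the Library ID.

    Hypothetical helper. Assumes the ID lives under `ad_archive_id` or `id`.
    """

    def library_id(ad: dict) -> Optional[str]:
        raw = ad.get("ad_archive_id") or ad.get("id")
        return str(raw) if raw else None

    merged: dict[str, dict] = {}

    # Start from the scraped records, tagging where each one came from
    for ad in scraped:
        key = library_id(ad)
        if key:
            merged[key] = {**ad, "sources": ["scraper"]}

    for ad in api_ads:
        key = library_id(ad)
        if not key:
            continue
        if key in merged:
            # API values win on shared keys, but empty API values
            # never overwrite something the scraper already found
            filled = {k: v for k, v in ad.items() if v not in (None, "", [], {})}
            merged[key].update(filled)
            merged[key]["sources"].append("api")
        else:
            merged[key] = {**ad, "sources": ["api"]}

    return list(merged.values())
```

Records seen by only one phase are kept, so the `sources` list shows at a glance which ads have spend or demographic data attached and which have only scraped creative details.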
