Two data sources feed the database. The collection approach is a deliberate hybrid - each source covers what the other cannot access.
Playwright-based automation loads Shopee search pages in a real browser, so the session passes Shopee's security checks, including Akamai Bot Manager, TLS fingerprinting, and canvas and WebGL validation. API calls are made from within the authenticated browser session, which makes the requests look like ordinary user traffic. The scraper runs as a weekly cron job through residential proxies.
Shopee's official seller API provides ECOSAM's own store metrics - real order volumes, actual revenue data, ad spend performance, and fulfilment rates. This data is exact and real-time. The seller API cannot cover competitor stores, which is why the browser scraper handles the external landscape.
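The split between the two sources can be sketched as a simple routing rule. This is a minimal illustration, not the project's actual schema: the `Source`, `Observation`, and `source_for` names and the field layout are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    SCRAPER = "browser_scraper"   # competitor stores (external landscape)
    SELLER_API = "seller_api"     # own-store metrics


@dataclass(frozen=True)
class Observation:
    """One stored data point, tagged with the source it came from (hypothetical shape)."""
    store_id: str
    metric: str
    value: float
    source: Source


def source_for(store_id: str, own_store_id: str) -> Source:
    """Own-store data comes from the seller API; every other store from the scraper."""
    return Source.SELLER_API if store_id == own_store_id else Source.SCRAPER


def make_observation(store_id: str, metric: str, value: float, own_store_id: str) -> Observation:
    """Build an observation with its source tag filled in by the routing rule."""
    return Observation(store_id, metric, value, source_for(store_id, own_store_id))
```

Tagging each row with its source keeps exact seller-API figures distinguishable from scraped competitor figures when both land in the same database.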
Sample weekly report output
Technology stack
| Component | Technology | Purpose |
|---|---|---|
| Scraper | Playwright (Python) | Headless browser automation - the only reliable method for bypassing Shopee's bot detection at the session level |
| Proxies | Residential Proxy Pool | Rotates IP addresses to prevent rate limiting across weekly scrape runs |
| Database | PostgreSQL via Supabase | Relational storage for all tables, free tier available, managed hosting with built-in REST API |
| Vector Search | pgvector | Semantic similarity search on product embeddings - runs in the same PostgreSQL instance, no separate service needed |
| Embeddings | OpenAI text-embedding-3-small | Converts product text to 1536-dimension vectors for semantic matching and RAG retrieval |
| AI / LLM | Claude API (Anthropic) | Intent classification, query answering, recommendation generation - always grounded in retrieved data |
| Backend API | FastAPI (Python) | Serves all endpoints - query, alerts, recommendations, dashboard feed |
| Scheduler | Cron (Railway / Render) | Triggers the weekly scrape job and Monday report generation |
| WhatsApp | Twilio WhatsApp API | Delivers real-time alerts and powers the WhatsApp chat interface |
| Email | SendGrid | Delivers the formatted weekly intelligence digest |
| Frontend | Next.js (React) | Web dashboard - charts, competitor tables, AI chat, alert history |
| Hosting | Railway or Render | Managed hosting for the FastAPI backend and scheduled jobs |
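
To make the Vector Search row concrete, the sketch below ranks items by cosine distance, the quantity pgvector's `<=>` operator computes. It runs in plain Python on toy 3-dimensional vectors rather than real 1536-dimension embeddings. The item names, the `nearest` helper, and the table and column names in the SQL comment are all assumptions made for illustration.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity (smaller means more similar)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query: list[float], rows: dict[str, list[float]], k: int = 3) -> list[str]:
    """Return the names of the k rows closest to the query vector."""
    return sorted(rows, key=lambda name: cosine_distance(query, rows[name]))[:k]


# Toy stand-ins for product embeddings (hypothetical data).
EMBEDDINGS = {
    "item_a": [0.9, 0.1, 0.0],
    "item_b": [0.1, 0.9, 0.0],
    "item_c": [0.8, 0.2, 0.1],
}

# Roughly equivalent query in pgvector (table and column names are assumed):
#   SELECT name FROM products ORDER BY embedding <=> :query_vector LIMIT 2;
closest = nearest([1.0, 0.0, 0.0], EMBEDDINGS, k=2)
```

In the database, the same ordering is done by the `ORDER BY ... <=>` clause, so no separate vector service is needed.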
On AI accuracy: The intelligence layer uses pre-built query functions - not open-ended text generation. The AI selects which function to call, the function retrieves real database values, and the AI turns those values into a plain-language answer. It is structurally constrained to make numerical claims only from values actually retrieved from the collected data. No fine-tuning of any AI model is required - the system works entirely through context, retrieval, and structured tool use.
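
The function-selection pattern described above might look like the following sketch. It makes several assumptions: the model's choice arrives as a dict with `name` and `arguments` keys, an in-memory `PRICES` table stands in for the PostgreSQL tables, and the function names (`get_price`, `price_change`, `run_tool_call`) are hypothetical. No LLM call is made here; only the dispatch step is shown.

```python
from typing import Callable

# Stand-in for stored rows; in the described system these live in PostgreSQL.
PRICES = {
    ("store_a", "2024-W01"): 12.5,
    ("store_a", "2024-W02"): 11.0,
}


def get_price(store: str, week: str) -> dict:
    """Look up one stored price; raises KeyError if no row exists."""
    return {"store": store, "week": week, "price": PRICES[(store, week)]}


def price_change(store: str, week_from: str, week_to: str) -> dict:
    """Compute the change between two stored prices, in code rather than in the model."""
    return {"store": store, "change": PRICES[(store, week_to)] - PRICES[(store, week_from)]}


# The only functions the model is allowed to select.
QUERY_FUNCTIONS: dict[str, Callable[..., dict]] = {
    "get_price": get_price,
    "price_change": price_change,
}


def run_tool_call(call: dict) -> dict:
    """Execute a model-selected call, rejecting anything outside the registry."""
    name = call.get("name")
    if name not in QUERY_FUNCTIONS:
        raise ValueError(f"unknown query function: {name!r}")
    return QUERY_FUNCTIONS[name](**call.get("arguments", {}))


result = run_tool_call({
    "name": "price_change",
    "arguments": {"store": "store_a", "week_from": "2024-W01", "week_to": "2024-W02"},
})
```

Because the arithmetic happens inside `price_change`, the figure passed back to the model is computed from stored values, and a request for a function outside the registry fails instead of producing an answer.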