Version: MarketPulse

Crawler Empty-Result Screenshot Upload

Author(s)

Alapan Das

Last Updated Date

2026-04-23

Version History

Version	Date	Changes	Author
1.0	2026-04-23	Initial draft for crawler empty-result screenshot upload flow	Alapan Das

Purpose

When a listing crawl returns no results or no usable page content, the crawler captures a screenshot and uploads it to an external API for later debugging.

Scope

This applies to list-page crawling in VehicleScraper.scrape_listings(...) for both Facebook Marketplace and Craigslist.

High-Level Flow

The crawl page is loaded and extraction is attempted.
If the scraper detects an empty or failed result, it captures a full-page screenshot.
The screenshot is uploaded through a multipart/form-data API call.
The upload target path is derived as error/{jobId}/{image}, where {image} is the screenshot filename.

Trigger Conditions

A screenshot upload is attempted when:

The extracted listings are empty.
Page content is missing, including the case where no HTML is returned.
The Craigslist list selector is not found.
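
The three conditions above can be read as a single predicate. The following is a minimal sketch only: the function name should_capture_screenshot and its parameters are hypothetical and are not taken from VehicleScraper.

```python
from typing import Optional, Sequence


def should_capture_screenshot(
    listings: Optional[Sequence[dict]],
    html: Optional[str],
    list_selector_found: bool = True,
) -> bool:
    """Return True when any documented trigger condition holds.

    Hypothetical helper: names and signature are assumptions for illustration.
    """
    # Condition 1: extracted listings are empty (or were never produced).
    if not listings:
        return True
    # Condition 2: page content is missing, including no HTML at all.
    if html is None or not html.strip():
        return True
    # Condition 3: the Craigslist list selector was not found on the page.
    if not list_selector_found:
        return True
    return False
```

For Facebook Marketplace, a caller would presumably leave list_selector_found at its default, since the selector condition is described only for Craigslist.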

Upload API Contract

The uploader sends a multipart request equivalent to:

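import requests

# BASE_URL, file_path, file_name, and job_id are placeholders for values
# the crawler provides; they are not defined in this snippet.
url = f"{BASE_URL}/upload"

with open(file_path, "rb") as f:
    # Multipart file part: (filename, file object, content type)
    files = {
        "fileContent": (file_name, f, "image/png")
    }
    # Plain form fields sent alongside the file part
    data = {
        "fileName": file_name,
        "folderName": f"error/{job_id}/{file_name}"
    }

    # Send the request while the file handle is still open
    response = requests.post(url, files=files, data=data, timeout=30)
    # Raise an exception on any 4xx/5xx response
    response.raise_for_status()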

Field Mapping

fileContent: binary screenshot payload (image/png)
fileName: screenshot filename
folderName: logical storage path in backend storage, formatted as error/{jobId}/{image}

URL Construction

The upload URL is built from the crawler's BASE_URL:

Upload endpoint: {BASE_URL}/upload

If BASE_URL is empty, upload is skipped and a warning is logged.
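
A minimal sketch of this rule follows. The helper name build_upload_url and the logger name are hypothetical, and stripping whitespace and a trailing slash from BASE_URL is an assumption added to avoid a double slash; the source only states that an empty BASE_URL skips the upload and logs a warning.

```python
import logging
from typing import Optional

# Hypothetical logger name, chosen for illustration only.
logger = logging.getLogger("crawler.screenshot_upload")


def build_upload_url(base_url: Optional[str]) -> Optional[str]:
    """Return {BASE_URL}/upload, or None when BASE_URL is not configured."""
    if base_url is None or not base_url.strip():
        # Documented behaviour: skip the upload and log a warning.
        logger.warning("BASE_URL is not configured; skipping screenshot upload")
        return None
    # Assumption: tolerate a trailing slash so the result has no "//upload".
    return f"{base_url.strip().rstrip('/')}/upload"
```

A caller would treat a None return value as the signal to skip the upload step entirely.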

Naming and Storage

Screenshots are first written locally under:

crawler/artifacts/screenshots/

Filename format:

{platform}_{timestamp}_{random}.png

Examples:

fbm_20260422T093455123456Z_abc123.png
cgl_20260422T093500654321Z_def456.png
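
One way to produce names in this shape is sketched below. The function name is hypothetical, and two details are inferred from the examples rather than stated in the source: the timestamp is assumed to be UTC with microsecond precision, and the random part is assumed to be six lowercase hex characters.

```python
import secrets
from datetime import datetime, timezone
from typing import Optional


def build_screenshot_filename(platform: str, now: Optional[datetime] = None) -> str:
    """Build a name of the form {platform}_{timestamp}_{random}.png.

    Hypothetical helper; the timestamp and suffix formats are assumptions
    inferred from the example filenames.
    """
    current = now if now is not None else datetime.now(timezone.utc)
    # Assumed format: UTC, compact ISO-like, with six-digit microseconds.
    timestamp = current.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    # Assumed format: 3 random bytes rendered as 6 hex characters.
    suffix = secrets.token_hex(3)
    return f"{platform}_{timestamp}_{suffix}.png"
```

The platform prefixes fbm and cgl come from the examples above; the optional now argument exists only to make the function easy to test.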

Job Correlation

job_id is passed from start_filter_crawl(...) into scrape_listings(..., job_id=job_id).

When job_id is available, the upload path is:

error/{jobId}/{image}

If job_id is missing, the fallback segment unknown-job is used in place of {jobId}.
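
The path rule and its fallback can be sketched as follows. The helper name build_folder_name is hypothetical, and treating an empty or whitespace-only job_id the same as a missing one is an assumption.

```python
from typing import Optional


def build_folder_name(job_id: Optional[str], file_name: str) -> str:
    """Return error/{jobId}/{image}, using unknown-job when job_id is missing."""
    # Assumption: blank strings count as "missing", just like None.
    segment = str(job_id).strip() if job_id is not None else ""
    if not segment:
        segment = "unknown-job"
    return f"error/{segment}/{file_name}"
```

For example, a missing job_id with the file fbm_20260422T093455123456Z_abc123.png gives error/unknown-job/fbm_20260422T093455123456Z_abc123.png.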

Logging

The feature logs:

screenshot capture success or failure
upload success or failure
skip reason when BASE_URL is not configured

Current Implementation Files

crawler/app/scraper/scraper.py
crawler/app/scraper/agent.py

Operational Note

This feature does not block publishing of the crawl completion status. If the screenshot upload fails, the crawl flow continues and the failure is logged.
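
The non-blocking behaviour can be illustrated with a small wrapper that never lets an upload error reach the crawl flow. This is a sketch under assumptions: the wrapper name, the logger name, and the choice to pass the upload as a callable are all hypothetical and may differ from the code in scraper.py.

```python
import logging
from typing import Callable

# Hypothetical logger name, chosen for illustration only.
logger = logging.getLogger("crawler.screenshot_upload")


def upload_screenshot_safely(upload: Callable[[], object]) -> bool:
    """Run the upload and report success without ever raising.

    `upload` is any zero-argument callable that performs the request,
    for example a function wrapping the requests.post call.
    """
    try:
        upload()
    except Exception:
        # Log the failure with its traceback and let the crawl continue.
        logger.exception("Screenshot upload failed; continuing crawl")
        return False
    logger.info("Screenshot upload succeeded")
    return True
```

Because the wrapper returns a boolean instead of raising, the caller can go on to publish the crawl completion status regardless of the upload outcome.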
