Project Goal
The primary objective was to develop a high-performance Apify Actor that interfaces directly with the official TenderNed (TNS) API. This tool automates the retrieval of Dutch public procurement notices, providing a structured way for businesses to monitor and analyze government contracts without the overhead of manual portal searches.
TenderNed Publications Apify Scraper
How it was built
The architecture relies on server-side filtering for efficiency. Instead of downloading large datasets and filtering them locally, the actor passes specific query parameters (such as CPV codes, publication dates, and contract types) directly to the TNS API. I implemented a flexible pagination system that supports both page-based and offset-based logic, so that large crawls do not skip records. For deeper insights, I added an enrichment layer that fetches the official TED public-xml files and parses them into JSON. The actor also includes a browser-based extraction mode that captures metadata from Nuxt.js state objects and detail pages when API data alone isn’t sufficient.
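To make the filtering and pagination approach concrete, here is a minimal sketch rather than the actor's actual code. The parameter names (`page`, `size`, `offset`, `limit`, `cpvCodes`, `publicationDateFrom`), the zero-based page index, and the example URL are all assumptions for illustration; the real TNS API may name and number these differently. The `fetchPage` function is injected so the loop can be run without network access.

```javascript
// Translate a zero-based page index into query parameters.
// The parameter names are hypothetical, not confirmed TNS API names.
function pageParams(mode, index, pageSize) {
  if (mode === 'page') return { page: index, size: pageSize };
  if (mode === 'offset') return { offset: index * pageSize, limit: pageSize };
  throw new Error(`Unknown pagination mode: ${mode}`);
}

// Build one request URL: filters go to the server as query parameters,
// so no filtering has to happen locally.
function buildUrl(baseUrl, filters, mode, index, pageSize) {
  const url = new URL(baseUrl);
  const params = { ...filters, ...pageParams(mode, index, pageSize) };
  for (const [key, value] of Object.entries(params)) {
    // Skip filters the user left empty.
    if (value === undefined || value === null || value === '') continue;
    // Array values (e.g. several CPV codes) become repeated parameters.
    const values = Array.isArray(value) ? value : [value];
    for (const v of values) url.searchParams.append(key, String(v));
  }
  return url.toString();
}

// Walk pages until a short page signals the end, or maxPages is reached.
// fetchPage(url) must resolve to an array of records.
async function crawl(fetchPage, baseUrl, filters, options = {}) {
  const { mode = 'page', pageSize = 50, maxPages = 100 } = options;
  const items = [];
  for (let index = 0; index < maxPages; index += 1) {
    const url = buildUrl(baseUrl, filters, mode, index, pageSize);
    const batch = await fetchPage(url);
    items.push(...batch);
    if (batch.length < pageSize) break; // last page reached
  }
  return items;
}
```

Stopping on a short page is one common convention. If the API reports a total count, comparing against that is more robust, because a final page that happens to be exactly full would otherwise trigger one extra request.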
Technologies used
- Node.js & Apify SDK
- TenderNed (TNS) REST API
- XML-to-JSON Parsing
- Basic Authentication & Proxy Management
- Headless Browser Automation (for detail enrichment)
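
As an illustration of the Basic Authentication item in the list above, the sketch below builds the request header in Node.js. It assumes the credentials are sent in a standard HTTP `Authorization` header (RFC 7617). The function names and the `Accept` header are hypothetical and not taken from the actor's code.

```javascript
// Encode credentials as an HTTP Basic Authorization header value (RFC 7617).
function basicAuthHeader(username, password) {
  // A colon in the username would make the decoded pair ambiguous.
  if (username.includes(':')) {
    throw new Error('Username must not contain ":"');
  }
  const token = Buffer.from(`${username}:${password}`, 'utf8').toString('base64');
  return `Basic ${token}`;
}

// Hypothetical helper: headers for a JSON request to an authenticated API.
function buildRequestHeaders(username, password) {
  return {
    Authorization: basicAuthHeader(username, password),
    Accept: 'application/json',
  };
}
```

In an Apify Actor, credentials like these would normally come from the actor input or environment variables rather than being hard-coded.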