codinggames

Using Node.js and GitHub Actions to find a house on Funda.nl

Written on March 12, 2022 - 🕒 4 min. read

If you want to rent or buy a house in the Netherlands, you’d better brace yourself, because it is no easy task, especially if you’re looking in Amsterdam.

Applying to rent or bidding on a house is like a job interview, or like getting a visa for another country. You need to provide your payslips, work contract, recommendations from previous landlords, and more.

Not only that, but often you don’t even get a chance to apply, because the listing already has so many applicants that no more applications are accepted. Sometimes, within 2 days of a house being listed on Funda.nl, viewings can no longer be booked.

If you do manage to book a viewing, you will see the house alongside many other people, and sometimes you even have to queue for it.

Because of that, I had to constantly open Funda.nl and check for new houses, so as soon as a new house was available I could book a viewing.

Let’s automate it

The good thing about Funda.nl is that they have a vast variety of filters for your search, and those filters are saved in the URL. Hooray!

For example, this link lists all houses that were listed within the last day, with 2 bedrooms, at least 40m², and a maximum rent of €2000.
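
To make the "filters live in the URL" idea concrete, here is a minimal sketch that assembles the same search URL from a few parameters. The helper name buildSearchUrl and its parameter names are hypothetical, and the path segments are simply copied from the example URL used in this post, so other filters may follow different patterns.

```javascript
// Hypothetical helper: builds a Funda.nl rental search URL from a few filters.
// The segment formats are taken from the example URL in this post only.
const buildSearchUrl = ({ city, maxPrice, minArea, minBedrooms }) => {
  const segments = [
    'en',
    'huur',                        // rentals
    city.toLowerCase(),
    'beschikbaar',                 // only available listings
    `0-${maxPrice}`,               // price range in euros
    `${minArea}+woonopp`,          // minimum living area in m²
    `${minBedrooms}+slaapkamers`,  // minimum number of bedrooms
    '1-dag',                       // listed within the last day (kept fixed here)
  ];
  return `https://www.funda.nl/${segments.join('/')}/`;
};

const searchUrl = buildSearchUrl({
  city: 'Amsterdam',
  maxPrice: 2000,
  minArea: 40,
  minBedrooms: 2,
});
console.log(searchUrl);
```

Keeping the filters as parameters makes it easy to watch several searches at once by calling the helper with different values.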

Each result on this page is an HTML li element with the class search-result, so I can get all of them with the query document.querySelectorAll('.search-result'), and inside each of these elements, I can get the link to the house with element.querySelectorAll('a')[0].href.

With Puppeteer I can automate the process of going to that URL, checking whether any new house showed up, and saving it to a file. Since this is a Node.js script, I will use jsdom to work with the DOM elements.

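const { writeFileSync } = require('fs');
const puppeteer = require('puppeteer');
const jsdom = require('jsdom');

const runTask = async () => {
  const url = 'https://www.funda.nl/en/huur/amsterdam/beschikbaar/0-2000/40+woonopp/2+slaapkamers/1-dag/';
  const browser = await puppeteer.launch({
    headless: true,
  });

  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'domcontentloaded' });

    // Parse the rendered HTML with jsdom; passing `url` lets relative links resolve to absolute ones.
    const htmlString = await page.content();
    const dom = new jsdom.JSDOM(htmlString, { url });
    const result = dom.window.document.querySelectorAll('.search-result');

    // Collect the first link of every search result, skipping results without one.
    const urls = [];
    for (const element of result) {
      const href = element.querySelectorAll('a')[0]?.href;
      if (href) urls.push(href);
    }

    writeFileSync('urls.json', JSON.stringify(urls));
  } finally {
    // Always close the browser, otherwise the Node.js process keeps running.
    await browser.close();
  }
};

runTask().catch((error) => {
  console.error(error);
  process.exitCode = 1;
});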

Now I can simply run node script.js and I will have all the URLs in a urls.json file.

Sending new URLs to Telegram

Instead of saving the URLs to a urls.json file, I want them sent to a Telegram group by a Telegram bot. For that, I will use node-fetch.

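// require() works with node-fetch v2; v3 is ESM-only.
const nodeFetch = require('node-fetch');

// The bot token and chat ID are read from environment variables.
const { BOT_API, CHAT_ID } = process.env;

const runTask = async () => {
  // ...same scraping code as before, which produces `result`...

  for (const element of result) {
    // Await each request so errors surface and messages arrive in order.
    await nodeFetch(`https://api.telegram.org/bot${BOT_API}/sendMessage`, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        text: element?.querySelectorAll('a')?.[0]?.href,
        chat_id: CHAT_ID,
        parse_mode: 'markdown',
      }),
    });
  }
}

runTask();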
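
The request above ignores Telegram’s answer, so a wrong token or chat ID would fail silently. As a minimal sketch, and not the code from the original script, the call could be wrapped in a helper that checks the response. The name sendTelegramMessage and the injectable fetchImpl parameter are assumptions made for illustration; the default relies on the global fetch that ships with Node.js 18 and later.

```javascript
// Sends one text message through the Telegram Bot API and throws on failure.
// `fetchImpl` is injectable so the function can be exercised without network
// access; by default it uses the global fetch available in Node.js 18+.
const sendTelegramMessage = async ({
  botToken,
  chatId,
  text,
  fetchImpl = globalThis.fetch,
}) => {
  const response = await fetchImpl(
    `https://api.telegram.org/bot${botToken}/sendMessage`,
    {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ chat_id: chatId, text }),
    }
  );

  // Telegram replies with JSON shaped like { ok: true, result } on success
  // or { ok: false, description } on failure.
  const payload = await response.json();
  if (!payload.ok) {
    throw new Error(`Telegram request failed: ${payload.description}`);
  }
  return payload.result;
};
```

Inside the loop it would be called as await sendTelegramMessage({ botToken: BOT_API, chatId: CHAT_ID, text: href }), so a failed request stops the run instead of passing unnoticed.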

That’s great, but not very useful yet, because I don’t want to run the script manually. I will fix that in the next step.

Using GitHub Actions

GitHub Actions is great. I use it all the time for tasks like running tests and building my blog, but it can also run scripts on a schedule using cron syntax.

For this script, I need to use actions/checkout to check out my code, and stefanzweifel/git-auto-commit-action to persist a list of the URLs that were already sent to Telegram.

name: Run Task
on:
  workflow_dispatch:
  schedule:
    - cron: "*/30 * * * *"

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout
        uses: actions/[email protected]
        with:
          ref: ${{ github.head_ref }}

      - name: Running
        run: |
          npm cache clean --force
          npm install
          npm run task
        env:
          CHAT_ID: ${{ secrets.CHAT_ID }}
          BOT_API: ${{ secrets.BOT_API }}

      - uses: stefanzweifel/git-auto-commit-action@v4
        with:
          commit_message: Update db.json
          branch: main
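
The workflow commits a file after every run so that URLs already sent to Telegram are remembered between runs. The script side of that is not shown in this post, so the following is only a sketch of one way it could work. It assumes the db.json file named in the commit message holds a plain JSON array of URLs, and the function names loadSentUrls and filterAndRecordNewUrls are hypothetical.

```javascript
const { existsSync, readFileSync, writeFileSync } = require('fs');

// Reads the list of URLs that were already sent; a missing file means none yet.
const loadSentUrls = (dbPath) =>
  existsSync(dbPath) ? JSON.parse(readFileSync(dbPath, 'utf8')) : [];

// Returns only the URLs that were not sent before and records them in the file,
// so the next scheduled run skips them.
const filterAndRecordNewUrls = (urls, dbPath) => {
  const sent = new Set(loadSentUrls(dbPath));
  // Drop duplicates and empty values, then keep what has not been sent yet.
  const fresh = [...new Set(urls)].filter((url) => url && !sent.has(url));
  writeFileSync(dbPath, JSON.stringify([...sent, ...fresh], null, 2));
  return fresh;
};
```

With this in place, only the URLs returned by filterAndRecordNewUrls(urls, 'db.json') would be sent to Telegram, and the auto-commit step would push the updated file back to the repository.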

And that’s it! Now whenever a new house is listed in my URL, I will get a message on Telegram 🎉.

Bonus

Now run to book your viewing!

There is a very nice open-source Chrome extension that adds a lot of extra data about each house to Funda.nl, so I copied some of its code to add extra info to the houses sent to Telegram.

You can clone this project on GitHub, add your own list of Funda.nl URLs, and the script will automatically run every 30 minutes to check for new houses and send them to Telegram.

See you in the next one!

Comments

Pablo Spinelli on 8/11/23

Hi Pablo, I'm testing the script and I can't identify the error, basically it doesn't detect any publication. I'll share the main if you notice anything, THANKS A LOT! https://github.com/paspinelli/funda_search/blob/main/src/main.js ...I did several tests and I can't find it

Phil on 10/1/22

Niiiiice! It was a fun read!

samuel norbury on 3/25/22

Impressive work!