Testing My Pet Project. Just Because I Can

07.11.2025

I'm sharing the first stage of my open experiment with GetMeOne — the architecture and the performance testing approach. Next up: metrics, JMeter, k6, analysis, and real-world optimizations. Just like on a real project.

pet-project

🧪

Backstory

GetMeOne used to be just my personal little Python script. I got tired of manually refreshing OLX and Otodom pages while looking for an apartment in Poland, so I wrote a small script that pushed new listings into Google Sheets for me.

Back then, it was a script without a database, without a bot, without notifications. Just a way to make my life easier.

Time passed, and I thought: what if I turn this into something bigger? I added a database, filters, data normalization, and a notification system, and now GetMeOne is almost a ready-to-use service for any Telegram user.

But before releasing it to production, I decided to do something that most people don’t do with pet projects — test its performance. If I were a manual tester, I’d probably test the functionality. But I’m a performance engineer, so I do what I do best, and the rest… we’ll see 😅

By the way, you can try it here: GetMeOne bot.

I don’t know what the result will be — whether the system will handle the load I’m planning, or if I’ll have to redesign the architecture. So this isn’t just a test, it’s an open experiment: how I design the strategy, set goals, choose metrics, and check the system step by step.

In the following articles, I’ll share how I:

  • set up monitoring, tracing, and alerts in Grafana;
  • prepare the environment and scripts in different tools (JMeter, k6, and maybe something else);
  • analyze results, optimize, and hopefully avoid rewriting everything from scratch.

This is the first article in the series about pet project performance. Let’s go! 🚀

Architecture

GetMeOne is a service that collects listings from different sources (currently only OLX and Otodom), filters them based on user preferences, and sends relevant results to Telegram.

The system is built with a microservice architecture, where each component has a clear role. Sounds pretty good, right?


User and Interfaces

The Telegram User interacts with the system through two entry points:

  • Telegram Bot — receives new listings, manages filters and interface language.
  • WebApp (via Telegram WebApp) — creates and edits filters through the UI.

How it works:

  1. The user opens the WebApp → the request goes over HTTPS to Caddy (reverse proxy);
  2. Caddy forwards the request to the WebApp (AioHTTP);
  3. The WebApp loads data and the user creates a new filter;
  4. Changes are saved to the filters and user_prefs tables (a minimal handler sketch follows this list).
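
To make step 4 a bit more concrete, here's roughly what a "save filter" handler in the WebApp could look like. It's a minimal sketch, not the actual GetMeOne code: the route, payload fields, and column names are my assumptions; only the AioHTTP + PostgreSQL stack and the filters / user_prefs tables come from the description above.

```python
# Hypothetical sketch of the WebApp "save filter" handler (aiohttp + asyncpg).
# Route, payload fields, and column names are assumptions, not the real schema.
from aiohttp import web
import asyncpg

async def create_filter(request: web.Request) -> web.Response:
    payload = await request.json()
    pool: asyncpg.Pool = request.app["db"]
    async with pool.acquire() as conn:
        async with conn.transaction():
            # Save the filter itself
            await conn.execute(
                """
                INSERT INTO filters (user_id, city, price_min, price_max)
                VALUES ($1, $2, $3, $4)
                """,
                payload["user_id"], payload["city"],
                payload["price_min"], payload["price_max"],
            )
            # Keep user preferences (language, etc.) in sync
            await conn.execute(
                """
                INSERT INTO user_prefs (user_id, language)
                VALUES ($1, $2)
                ON CONFLICT (user_id) DO UPDATE SET language = EXCLUDED.language
                """,
                payload["user_id"], payload.get("language", "en"),
            )
    return web.json_response({"status": "ok"})

async def init_app() -> web.Application:
    app = web.Application()
    app["db"] = await asyncpg.create_pool(dsn="postgresql://localhost/getmeone")
    app.add_routes([web.post("/api/filters", create_filter)])
    return app

if __name__ == "__main__":
    web.run_app(init_app(), port=8080)
```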

Telegram Bot, in turn, handles commands (/filters, /language, /start), updates settings, and receives notifications from the Notifier.


Data Flow and Events

Here’s where it gets interesting — how data travels through the system:

  • Parsers (OLX, Otodom, Otomoto) — scrape the websites, normalize the data (convert it to a unified format), upsert it into the home table, and insert new uids into new_listings (a rough sketch follows this list).
  • Filter Service — processes records from new_listings (where processed=false), looks for matches with user filters, writes results to sent_listings, and marks the records as processed.
  • Notifier — takes tasks from sent_listings, renders templates (Jinja2), and sends them via the Telegram Bot API. After successful delivery, it updates the status.
  • Cronjobs — perform background work: clean up old records, archive outdated listings, and update locations.
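
Since the parser write path is where idempotency matters most, here's a rough sketch of what it could look like. The table names come from the architecture; the column names and the unique constraint on uid are assumptions for illustration.

```python
# Hypothetical parser write path: upsert the listing, enqueue its uid for filtering.
# Column names and the unique uid constraint are assumptions for illustration.
import asyncpg

async def save_listing(conn: asyncpg.Connection, listing: dict) -> None:
    async with conn.transaction():
        # Upsert into home: re-parsing the same page never creates duplicates.
        await conn.execute(
            """
            INSERT INTO home (uid, source, price, area, address, url)
            VALUES ($1, $2, $3, $4, $5, $6)
            ON CONFLICT (uid) DO UPDATE
                SET price = EXCLUDED.price, area = EXCLUDED.area
            """,
            listing["uid"], listing["source"], listing["price"],
            listing["area"], listing["address"], listing["url"],
        )
        # Enqueue for the Filter Service only if this uid hasn't been seen before.
        await conn.execute(
            """
            INSERT INTO new_listings (uid, processed)
            VALUES ($1, false)
            ON CONFLICT (uid) DO NOTHING
            """,
            listing["uid"],
        )
```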

Database (PostgreSQL)

The heart of the system is PostgreSQL. Here are the main tables and their roles:

| Table | Purpose | Who Writes | Who Reads |
| --- | --- | --- | --- |
| home | Normalized listings | Parsers | Filter, Notifier |
| new_listings | Queue of new uids to process | Parsers | Filter |
| filters | User filters | WebApp/Bot | Filter |
| user_prefs | User settings (language, etc.) | WebApp/Bot | Notifier |
| sent_listings | Queue of tasks to send | Filter, Notifier | Notifier |
| home_archive | Archive of outdated listings | Cron | — |
| locations | Geo-indexes and reference data | Cron | WebApp |

Yes, I use the database as a queue. It’s not the most elegant solution, but it’s good enough for a pet project.
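
If you're curious what "database as a queue" means in practice, the usual trick is to claim rows with FOR UPDATE SKIP LOCKED so several workers can poll the same table without stepping on each other. A sketch under the same assumed schema (the created_at column and the batch logic are my inventions, not the real Filter Service):

```python
# Hypothetical "DB as a queue" consumer: claim unprocessed rows, handle them,
# mark them as processed. SKIP LOCKED keeps concurrent workers from colliding.
import asyncio
import asyncpg

async def process_batch(pool: asyncpg.Pool, batch_size: int = 100) -> int:
    async with pool.acquire() as conn:
        async with conn.transaction():
            rows = await conn.fetch(
                """
                SELECT uid FROM new_listings
                WHERE processed = false
                ORDER BY created_at
                LIMIT $1
                FOR UPDATE SKIP LOCKED
                """,
                batch_size,
            )
            for row in rows:
                # The real Filter Service would match the listing against
                # user filters here and write hits into sent_listings.
                pass
            await conn.executemany(
                "UPDATE new_listings SET processed = true WHERE uid = $1",
                [(row["uid"],) for row in rows],
            )
            return len(rows)

async def main() -> None:
    pool = await asyncpg.create_pool(dsn="postgresql://localhost/getmeone")
    while True:
        if await process_batch(pool) == 0:
            await asyncio.sleep(1)  # queue is empty, back off briefly

asyncio.run(main())
```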


Monitoring and Observability

Without monitoring, performance testing is like shooting in the dark. So:

  • Prometheus collects /metrics from all services: parsers, WebApp, Bot, Filter, Notifier, DB, Caddy.
  • Grafana visualizes everything you need: parser RPS, filtering latency, delivery time, errors, resource usage, and much more.
  • Loki + Promtail provide centralized log collection (by service, tags, parser errors).

Metrics are available at the /metrics endpoint, logs are aggregated through Promtail, alerts are set up in Grafana. Everything as it should be!
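
For illustration, here's what exposing /metrics from one of the Python services might look like with prometheus_client. The metric names are mine, not the ones on the real dashboards.

```python
# Hypothetical /metrics endpoint (aiohttp + prometheus_client).
# Metric names are illustrative, not the ones used in the actual dashboards.
from aiohttp import web
from prometheus_client import Counter, Histogram, generate_latest, CONTENT_TYPE_LATEST

LISTINGS_PARSED = Counter("parser_listings_total", "Listings parsed", ["source"])
FILTER_LATENCY = Histogram(
    "filter_latency_seconds", "Time to match one listing against all filters"
)

async def metrics(request: web.Request) -> web.Response:
    return web.Response(
        body=generate_latest(), headers={"Content-Type": CONTENT_TYPE_LATEST}
    )

app = web.Application()
app.add_routes([web.get("/metrics", metrics)])

# Elsewhere in the service:
#   LISTINGS_PARSED.labels(source="olx").inc()
#   with FILTER_LATENCY.time():
#       match_listing_against_filters(listing)   # hypothetical function
```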


Sequence (end-to-end scenario)

Here’s what the full listing processing path looks like:

| Step | Action | Component |
| --- | --- | --- |
| S1 | Parser gets listings from website | Parser Service |
| S2 | Normalizes and saves to DB | Parser Service |
| S3 | INSERT into new_listings | Parser Service |
| S4 | Filtering by users | Filter Service |
| S5 | INSERT into sent_listings | Filter Service |
| S6 | Creating messages | Notifier |
| S7 | Sending to Telegram | Notifier + Bot |
| S8 | Marking as sent | Notifier |

Or even simpler — the path from website to Telegram notification:

Imagine a new apartment listing appears on Otodom.

  1. Parser finds it, extracts all the fields (price, area, address, etc.), and saves it to the database. If it’s truly a new listing, it also goes into new_listings.
  2. Filter Service notices the new item, finds users whose filters match, and adds tasks to sent_listings.
  3. Notifier takes these tasks, creates nice messages, and sends them via the Telegram Bot API (a rough sketch follows this list).
  4. After successful delivery, Notifier marks the records as “delivered”.
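
The Notifier step (3) is short enough to sketch in a few lines: render a Jinja2 template and call the Telegram Bot API's sendMessage. The template text and task fields are illustrative; only the Jinja2 + Bot API combination comes from the architecture above.

```python
# Hypothetical Notifier send path: render a Jinja2 template, call sendMessage.
# Template text and task fields are illustrative only.
import aiohttp
from jinja2 import Template

MESSAGE_TEMPLATE = Template(
    "🏠 New listing: {{ title }}\n"
    "💰 {{ price }} PLN, {{ area }} m²\n"
    "🔗 {{ url }}"
)

async def send_notification(token: str, chat_id: int, task: dict) -> bool:
    text = MESSAGE_TEMPLATE.render(**task)
    url = f"https://api.telegram.org/bot{token}/sendMessage"
    async with aiohttp.ClientSession() as session:
        async with session.post(url, json={"chat_id": chat_id, "text": text}) as resp:
            data = await resp.json()
    # The sent_listings row should only be marked "delivered" if Telegram said ok.
    return data.get("ok", False)
```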

All this should happen quickly enough. But what “quickly” actually means — that’s a question for the next chapter.


Key Architecture Principles

A few important decisions that make the system reliable:

  • Idempotency:

    • Duplicate inserts don’t create duplicates thanks to ON CONFLICT.
    • Parser and notifier retries are safe — you can restart without fear.
  • Queue via DB: new_listings and sent_listings are basically internal queues. Not Kafka, of course, but good enough for a start.

  • Component Isolation: each service does its own thing — parsing, filtering, notifications, UI, bot. One goes down — the others keep working.

  • Observability: all services are monitored and logged centrally.

  • Simple Deployment: everything runs on a single Hetzner server in Docker Compose. No Kubernetes (yet).


Performance Requirements (SLO)

When I started defining SLOs (Service Level Objectives), I wanted them to reflect not just numbers in a table, but real user experience.

Users don’t see filtering latency or queue depth. They see one thing — how many seconds pass between a listing being published on the website and it showing up in their notifications.

Main SLO: E2E latency

E2E latency (Source → Telegram) — time from publication on the website to receiving a notification.

Goal: ≤ 60 seconds (p95) ⚡

Why 60 seconds? Because if the listing arrives in 5 minutes, someone else will have already rented the apartment. We’re doing serious stuff here, right?
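
Just to be explicit about what p95 means here: it's the value below which 95% of measured delays fall, not the average. A tiny sketch of that calculation on made-up numbers (in real life the percentile comes from Prometheus histograms):

```python
# Sketch: p95 of end-to-end delays (seconds) from publication to notification.
# The sample values are made up; real data would come from monitoring.
import statistics

def p95(latencies_s: list[float]) -> float:
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile
    return statistics.quantiles(latencies_s, n=20)[18]

samples = [12.4, 18.9, 25.1, 31.0, 39.8, 44.2, 47.1, 52.7, 55.3, 58.5]
print(f"p95 = {p95(samples):.1f} s (SLO: 60 s)")
```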


Component SLOs

To fit within 60 seconds end-to-end, I broke the budget down by component:

| Component | Metric | Goal (p95) | Why exactly this |
| --- | --- | --- | --- |
| Parser | Parser latency | ≤ 5 s | HTTP request + parsing + normalization + write |
| Filter | Filter latency | ≤ 20 s | Finding matches among all filters |
| Notifier | Notifier delay | ≤ 6 s | k=3 messages, 3 seconds between each |
| Total | E2E (full cycle) | ≤ 60 s | With margin for various delays |

By the way, I wrote about what performance requirements are and how to create them properly in this article.


Resource SLOs

Besides functional metrics, we need to watch the infrastructure. Nobody needs a system that works fast but eats all resources and crashes every hour.

| Resource | Metric | Goal | Why we need it |
| --- | --- | --- | --- |
| CPU | process_cpu_seconds_total | ≤ 70% (p95) | Margin for cronjobs and peak loads |
| Memory | process_resident_memory_bytes | Growth ≤ 10% in 48h | Memory leak detection |
| Disk I/O | node_disk_io_time_seconds_total | ≤ 70% busy | So DB doesn’t slow everything down |
| DB latency | db_query_seconds | ≤ 50 ms | SELECT/INSERT should fly |

These metrics are especially important for the Soak test, where the system will run under load for 48 hours straight.


Testing Approach

Testing this kind of service is not a trivial task. You can’t just send a simple HTTP request and get a clear response time.

All the magic happens in the background: parsers work constantly, filters process data asynchronously, notifications are sent with delays. This reminds me of a project where we tested data replication — everything in the background through queues.

How I Will Load the System

The load script will create filters as the number of users grows. So the task is not to generate some number of requests per minute, but to bring the system to a target number of filters and then stop generating.

Sounds simple, but the magic starts later.

When a new listing appears, it goes through the whole chain:

  1. Parser → new_listings
  2. Filter Service checks it against all filters
  3. Notifier sends notifications to all matching users

The more filters in the system, the more work for Filter Service and Notifier. That’s how real load is created.
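
A rough sketch of what such a generator could look like: a pool of workers creates filters through the WebApp API until the target count is reached, then stops. The real scripts will live in JMeter and k6, as mentioned above; this is only to show the load model. The endpoint and payload mirror the hypothetical handler sketched earlier, the base URL is a placeholder, and the 4,500 figure is the Load test target described below.

```python
# Hypothetical load generator: create filters up to a target count, then stop.
# Endpoint, payload shape, and base URL are assumptions, not the real API.
import asyncio
import random
import aiohttp

BASE_URL = "https://example.invalid"   # placeholder for the real WebApp URL
TARGET_FILTERS = 4_500                 # the Load test below aims at 4,500 filters
CONCURRENCY = 20

async def create_filter(session: aiohttp.ClientSession, user_id: int) -> None:
    payload = {
        "user_id": user_id,
        "city": random.choice(["Warszawa", "Kraków", "Wrocław"]),
        "price_min": random.randrange(1500, 3000, 100),
        "price_max": random.randrange(3000, 6000, 100),
    }
    async with session.post(f"{BASE_URL}/api/filters", json=payload) as resp:
        resp.raise_for_status()

async def worker(session: aiohttp.ClientSession, counter: list[int]) -> None:
    while counter[0] < TARGET_FILTERS:
        counter[0] += 1
        await create_filter(session, user_id=counter[0])
        await asyncio.sleep(random.uniform(0.1, 0.5))  # pacing between "users"

async def main() -> None:
    counter = [0]  # shared counter; safe within a single event loop
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(*(worker(session, counter) for _ in range(CONCURRENCY)))

asyncio.run(main())
```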


Types of Tests

There will be four key testing stages (I wrote more about test types here):

  1. Baseline test

    • Small load to capture the initial system state.
    • This is our starting point for all future comparisons.
  2. Load test

    • Target load: 3,000 users with 4,500 filters.
    • Check if we meet SLOs under realistic load.
    • Compare results with Baseline.
  3. Capacity test

    • Increase load to the limit.
    • Goal — find the system ceiling. Where will it start breaking?
  4. Soak test

    • Run for 48 hours under constant load.
    • Check stability, memory leaks, and Cronjob behavior.

Important Details

  • Telegram API is partially mocked — most traffic goes through a mock, but a small percentage (messages to one real user — me) goes to the real Telegram API to check real behavior (a minimal mock sketch follows this list);
  • Otodom and OLX remain real sources. I don’t mock them — simulating real website behavior with pagination, anti-bot measures, and unpredictable HTML changes is too complex and wouldn’t look like production anyway. Traffic will be constant: before running tests, I clean the database for the last N days (7 by default), so the parsers re-fetch all listings and the load on external sites stays representative.
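
For the mocked part of the Telegram API, something as small as the sketch below could do: accept sendMessage calls and answer the way Telegram would, while the handful of "real" chat IDs get forwarded to the actual API. The route shape follows the public Bot API; everything else (IDs, port, the forwarding detail) is a placeholder.

```python
# Hypothetical Telegram Bot API mock: accept sendMessage and reply like Telegram,
# so the Notifier's code path stays unchanged under load.
from aiohttp import web

REAL_USER_IDS = {123456789}  # placeholder: the one real chat that still gets messages

async def send_message(request: web.Request) -> web.Response:
    payload = await request.json()
    if int(payload["chat_id"]) in REAL_USER_IDS:
        # In the real setup this call would be forwarded to api.telegram.org;
        # omitted here to keep the sketch short.
        pass
    # Mimic Telegram's response shape so the caller can't tell the difference.
    return web.json_response({"ok": True, "result": {"message_id": 1}})

app = web.Application()
app.add_routes([web.post("/bot{token}/sendMessage", send_message)])

if __name__ == "__main__":
    web.run_app(app, port=8081)
```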

What’s Next?

GetMeOne started as a small script for apartment hunting and turned into a full-fledged service with microservice architecture, monitoring, and a performance testing strategy. Sometimes I can’t believe this is happening 🤯

Now this project is a platform for an open experiment. Here you can watch how I do performance work in practice, with real errors, solutions, and conclusions.

In the following parts, I’ll share:

  • How I set up monitoring and tracing (Prometheus, Grafana, Loki)
  • How I prepared the test environment (mocks, data, scripts)
  • How I ran tests in JMeter and k6
  • What the results showed and what surprises there were
  • What optimizations I had to make (and if I had to at all)

The story is just beginning. Subscribe on LinkedIn (en) or Telegram (ru), it will be interesting! 🚀