Testing My Pet Project. Just Because I Can
I share the first stage of my open experiment with GetMeOne — designing the architecture and performance testing approach. Next up: metrics, JMeter, k6, analysis, and real-world optimizations. Just like on a real project.
🧪
Backstory
GetMeOne used to be just my personal little Python script. I got tired of manually refreshing OLX and Otodom pages when I was looking for an apartment in Poland, so I wrote a small script that sent me new listings to Google Sheets.
Back then, it was a script without a database, without a bot, without notifications. Just a way to make my life easier.
Time passed, and I thought: what if I turn this into something bigger? I added a database, filters, data normalization, a notification system and now GetMeOne is almost a ready-to-use service that any Telegram user can use.
But before releasing it to production, I decided to do something that most people don’t do with pet projects — test its performance. If I were a manual tester, I’d probably test the functionality. But I’m a performance engineer, so I do what I do best, and the rest… we’ll see 😅
By the way, you can try it here: GetMeOne bot.
I don’t know what the result will be — whether the system will handle the load I’m planning, or if I’ll have to redesign the architecture. So this isn’t just a test, it’s an open experiment: how I design the strategy, set goals, choose metrics, and check the system step by step.
In the following articles, I’ll share how I:
- set up monitoring, tracing, and alerts in Grafana;
- prepare the environment and scripts in different tools (JMeter, k6, and maybe something else);
- analyze results, optimize and hopefully don’t rewrite everything from scratch.
This is the first article in the series about pet project performance. Let’s go! 🚀
Architecture
GetMeOne is a service that collects listings from different sources (currently only OLX and Otodom), filters them based on user preferences, and sends relevant results to Telegram.
The system is built with a microservice architecture, where each component has a clear role. Sounds pretty good, right?
User and Interfaces
Telegram User interacts with the system through two entry points:
- Telegram Bot — receives new listings, manages filters and interface language.
- WebApp (via Telegram WebApp) — creates and edits filters through UI.
How it works:
- User opens WebApp → request goes through HTTPS to Caddy (reverse proxy);
- Caddy forwards the request to WebApp (AioHTTP);
- WebApp loads data and user creates a new filter;
- Changes are saved in the filters and user_prefs tables.
Telegram Bot, in turn, receives commands (/filters, /language, /start), updates settings, and receives notifications from Notifier.
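The bot side is conceptually simple: one handler per command. Here's a minimal sketch of what the /start and /filters handlers could look like, assuming aiogram 3; the actual framework and handler logic in GetMeOne may differ.

```python
# Hypothetical sketch of the bot's command routing, assuming aiogram 3.
# GetMeOne's actual framework, handler names, and texts may differ.
import asyncio

from aiogram import Bot, Dispatcher, Router
from aiogram.filters import Command
from aiogram.types import Message

router = Router()

@router.message(Command("start"))
async def cmd_start(message: Message) -> None:
    # Greet the user and point them to the WebApp for filter creation.
    await message.answer("Hi! Open the WebApp to create your first filter.")

@router.message(Command("filters"))
async def cmd_filters(message: Message) -> None:
    # In the real service this would load the user's filters from the DB.
    await message.answer("Here are your saved filters…")

async def main() -> None:
    bot = Bot(token="TELEGRAM_BOT_TOKEN")  # placeholder; read from env in practice
    dp = Dispatcher()
    dp.include_router(router)
    await dp.start_polling(bot)

if __name__ == "__main__":
    asyncio.run(main())
```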
Data Flow and Events
Here’s where it gets interesting — how data travels through the system:
- Parsers (OLX, Otodom, Otomoto) — scrape websites, normalize data (convert to a unified format), and do upsert into the home table, as well as insert into new_listings.
- Filter Service — processes records from new_listings (where processed = false), looks for matches with user filters, writes results to sent_listings, and marks processed records.
- Notifier — takes tasks from sent_listings, renders templates (Jinja2), sends via Telegram Bot API. After successful delivery, updates the status.
- Cronjobs — perform background work: clean old records, archive outdated listings, and update locations.
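To make the parser's write step concrete, here's a minimal sketch of the upsert into home plus the insert into new_listings, assuming asyncpg and a simplified set of columns; the real schema certainly has more fields.

```python
# Hypothetical write path of a parser, assuming asyncpg and a simplified schema.
# Column names are illustrative; the real `home` table has more fields.
import asyncpg

async def save_listing(pool: asyncpg.Pool, listing: dict) -> None:
    async with pool.acquire() as conn:
        async with conn.transaction():
            # Upsert the normalized listing: re-parsing the same uid is safe.
            await conn.execute(
                """
                INSERT INTO home (uid, source, price, area, address, url)
                VALUES ($1, $2, $3, $4, $5, $6)
                ON CONFLICT (uid) DO UPDATE
                SET price = EXCLUDED.price, area = EXCLUDED.area
                """,
                listing["uid"], listing["source"], listing["price"],
                listing["area"], listing["address"], listing["url"],
            )
            # Enqueue the uid for the Filter Service; duplicates are ignored.
            await conn.execute(
                """
                INSERT INTO new_listings (uid, processed)
                VALUES ($1, false)
                ON CONFLICT (uid) DO NOTHING
                """,
                listing["uid"],
            )
```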
Database (PostgreSQL)
The heart of the system is PostgreSQL. Here are the main tables and their roles:
| Table | Purpose | Who Writes | Who Reads |
|---|---|---|---|
| home | Normalized listings | Parsers | Filter, Notifier |
| new_listings | Queue of new uids to process | Parsers | Filter |
| filters | User filters | WebApp/Bot | Filter |
| user_prefs | User settings (language, etc.) | WebApp/Bot | Notifier |
| sent_listings | Queue of tasks to send | Filter, Notifier | Notifier |
| home_archive | Archive of outdated listings | Cron | — |
| locations | Geo-indexes and reference data | Cron | WebApp |
Yes, I use the database as a queue. It’s not the most elegant solution, but it’s good enough for a pet project.
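Since new_listings is basically a queue, the Filter Service just has to claim unprocessed rows. Here's a minimal sketch of one common pattern for that, polling with FOR UPDATE SKIP LOCKED and assuming asyncpg; whether GetMeOne does exactly this is an assumption on my part.

```python
# Hypothetical polling loop for the Filter Service, assuming asyncpg.
# FOR UPDATE SKIP LOCKED lets several workers share the queue without
# double-processing; the real implementation may be simpler.
import asyncio
import asyncpg

async def match_filters_and_enqueue(conn: asyncpg.Connection, uid: str) -> None:
    """Placeholder: find matching user filters and insert into sent_listings."""
    ...

async def process_new_listings(pool: asyncpg.Pool) -> None:
    while True:
        async with pool.acquire() as conn:
            async with conn.transaction():
                rows = await conn.fetch(
                    """
                    SELECT uid FROM new_listings
                    WHERE processed = false
                    ORDER BY uid
                    LIMIT 100
                    FOR UPDATE SKIP LOCKED
                    """
                )
                for row in rows:
                    await match_filters_and_enqueue(conn, row["uid"])
                await conn.executemany(
                    "UPDATE new_listings SET processed = true WHERE uid = $1",
                    [(row["uid"],) for row in rows],
                )
        await asyncio.sleep(1)  # poll interval
```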
Monitoring and Observability
Without monitoring, performance testing is shooting blindly. So:
- Prometheus collects /metrics from all services: parsers, WebApp, Bot, Filter, Notifier, DB, Caddy.
- Grafana visualizes everything you need: parser RPS, filtering latency, delivery time, errors, resource usage, and much more.
- Loki + Promtail collect centralized logs (by services, tags, parser errors).
Metrics are available at the /metrics endpoint, logs are aggregated through Promtail, alerts are set up in Grafana. Everything as it should be!
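For reference, exposing /metrics from an aiohttp service takes only a few lines with prometheus_client. A minimal sketch, with illustrative metric names rather than the exact ones GetMeOne exports:

```python
# Minimal /metrics endpoint for an aiohttp service using prometheus_client.
# Metric names are illustrative.
from aiohttp import web
from prometheus_client import Counter, Histogram, generate_latest, CONTENT_TYPE_LATEST

LISTINGS_PARSED = Counter("parser_listings_total", "Listings parsed", ["source"])
PARSE_LATENCY = Histogram("parser_latency_seconds", "Parse cycle latency")

async def metrics(request: web.Request) -> web.Response:
    resp = web.Response(body=generate_latest())
    resp.headers["Content-Type"] = CONTENT_TYPE_LATEST
    return resp

def create_app() -> web.Application:
    app = web.Application()
    app.router.add_get("/metrics", metrics)
    return app

if __name__ == "__main__":
    web.run_app(create_app(), port=8080)
```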
Sequence (end-to-end scenario)
Here’s what the full listing processing path looks like:
| Step | Action | Component |
|---|---|---|
| S1 | Parser gets listings from website | Parser Service |
| S2 | Normalizes and saves to DB | Parser Service |
| S3 | INSERT into new_listings | Parser Service |
| S4 | Filtering by users | Filter Service |
| S5 | INSERT into sent_listings | Filter Service |
| S6 | Creating messages | Notifier |
| S7 | Sending to Telegram | Notifier + Bot |
| S8 | Marking as sent | Notifier |
Or even simpler — the path from website to Telegram notification:
Imagine a new apartment listing appears on Otodom.
- Parser finds it, extracts all fields (price, area, address, etc.) and saves it to the database. If it's truly a new listing, it goes into new_listings.
- Filter Service notices the new item, finds users whose filters match, and adds tasks to sent_listings.
- Notifier takes these tasks, creates nice messages, and sends them via Telegram Bot API.
- After successful delivery, Notifier marks the records as “delivered”.
All this should happen quickly enough. But what does quickly mean — that’s a question for the next chapter.
Key Architecture Principles
A few important decisions that make the system reliable:
- Idempotency:
  - Duplicate inserts don't create duplicates thanks to ON CONFLICT.
  - Parser and notifier retries are safe — you can restart without fear.
- Queue via DB: new_listings and sent_listings are basically internal queues. Not Kafka, of course, but good enough for a start.
- Component Isolation: each service does its own thing — parsing, filtering, notifications, UI, bot. One goes down — the others keep working.
- Observability: all services are monitored and logged centrally.
- Simple Deployment: everything runs on a single Hetzner server in Docker Compose. No Kubernetes (yet).
Performance Requirements (SLO)
When I started defining SLOs (Service Level Objectives), I wanted them to reflect not just numbers in a table, but real user experience.
Users don’t see filtering latency or queue depth. They see one thing — how many seconds after a listing is published on the website does it appear in their notifications.
Main SLO: E2E latency
E2E latency (Source → Telegram) — time from publication on the website to receiving a notification.
Goal: ≤ 60 seconds (p95) ⚡
Why 60 seconds? Because if the listing arrives in 5 minutes, someone else will already rent the apartment. We’re doing serious stuff here, right?
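To check a p95 against this target, the end-to-end time has to be recorded as a histogram somewhere near the end of the chain. A sketch of how the Notifier could do it, assuming the listing's publication timestamp is carried through the pipeline; the metric and field names here are mine, not necessarily the project's.

```python
# Hypothetical E2E latency measurement on the Notifier side.
# Assumes each listing carries its publication timestamp through the pipeline.
import time

from prometheus_client import Histogram

E2E_LATENCY = Histogram(
    "e2e_latency_seconds",
    "Time from listing publication on the source site to Telegram delivery",
    buckets=(5, 10, 20, 30, 45, 60, 90, 120, 180, 300),
)

def record_delivery(published_at_unix: float) -> None:
    # Observe once per successfully delivered notification.
    E2E_LATENCY.observe(time.time() - published_at_unix)

# The SLO check in Grafana/Prometheus would then be a query like:
#   histogram_quantile(0.95, sum(rate(e2e_latency_seconds_bucket[5m])) by (le)) <= 60
```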
Component SLOs
To fit within 60 seconds end-to-end, I broke the system into components:
| Component | Metric | Goal (p95) | Why exactly this |
|---|---|---|---|
| Parser | Parser latency | ≤ 5 s | HTTP request + parsing + normalization + write |
| Filter | Filter latency | ≤ 20 s | Finding matches among all filters |
| Notifier | Notifier delay | ≤ 6 s | k=3 messages, 3 seconds between each |
| Total | E2E (full cycle) | ≤ 60 s | With margin for various delays |
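The Notifier budget is simple arithmetic: k = 3 messages with a 3-second pause between consecutive sends means two pauses, i.e. 6 seconds on top of the sends themselves. A minimal sketch of that pacing (the real Notifier is of course more involved):

```python
# Hypothetical pacing sketch for the Notifier: k messages to one user with a
# fixed pause between consecutive sends, so the delay budget is (k - 1) * pause.
import asyncio
from typing import Awaitable, Callable, Sequence

async def send_batch(
    send: Callable[[str], Awaitable[None]],
    messages: Sequence[str],
    pause_seconds: float = 3.0,
) -> None:
    for i, text in enumerate(messages):
        await send(text)
        if i < len(messages) - 1:
            await asyncio.sleep(pause_seconds)  # 3 messages -> 2 pauses -> 6 s
```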
By the way, I wrote about what performance requirements are and how to create them properly in this article.
Resource SLOs
Besides functional metrics, we need to watch the infrastructure. Nobody needs a system that works fast but eats all resources and crashes every hour.
| Resource | Metric | Goal | Why we need it |
|---|---|---|---|
| CPU | process_cpu_seconds_total | ≤ 70% (p95) | Margin for cronjobs and peak loads |
| Memory | process_resident_memory_bytes | Growth ≤ 10% in 48h | Memory leak detection |
| Disk I/O | node_disk_io_time_seconds_total | ≤ 70% busy | So DB doesn’t slow everything down |
| DB latency | db_query_seconds | ≤ 50 ms | SELECT/INSERT should fly |
These metrics are especially important for the Soak test, where the system will run under load for 48 hours straight.
Testing Approach
Testing this kind of service is not a trivial task. You can’t just send a simple HTTP request and get a clear response time.
All the magic happens in the background: parsers work constantly, filters process data asynchronously, notifications are sent with delays. This reminds me of a project where we tested data replication — everything in the background through queues.
How I Will Load the System
The load script will create filters as the number of simulated users grows. So the task is not to generate a certain number of requests per minute, but to bring the system to a target number of created filters and then stop generation.
Sounds simple, but the magic starts later.
When a new listing appears, it goes through the whole chain:
- Parser → new_listings
- Filter Service checks it against all filters
- Notifier sends notifications to all matching users
The more filters in the system, the more work for Filter Service and Notifier. That’s how real load is created.
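Since the driver only needs to bring the system to a target number of users and filters, a simple seeding script is enough. Here's a rough Python sketch of the idea; the endpoint and payload are assumptions for illustration, and the actual load scripts will be in JMeter/k6.

```python
# Rough seeding sketch: create N virtual users with filters through the WebApp API.
# The endpoint path and payload shape are assumptions for illustration only;
# the real load scripts will live in JMeter/k6.
import asyncio

import aiohttp

TARGET_USERS = 3_000  # target: 3,000 users with 4,500 filters in total

async def create_filter(session: aiohttp.ClientSession, user_id: int, price_to: int) -> None:
    payload = {"user_id": user_id, "city": "Warszawa", "price_to": price_to}
    async with session.post("https://example.invalid/api/filters", json=payload) as resp:
        resp.raise_for_status()

async def main() -> None:
    async with aiohttp.ClientSession() as session:
        tasks = []
        for user_id in range(TARGET_USERS):
            n_filters = 2 if user_id % 2 == 0 else 1  # averages 1.5 filters per user
            for i in range(n_filters):
                tasks.append(create_filter(session, user_id, 3000 + i * 500))
        await asyncio.gather(*tasks)

if __name__ == "__main__":
    asyncio.run(main())
```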
Types of Tests
There will be four key testing stages (I wrote more about test types here):
- Baseline test
  - Small load to capture the initial system state.
  - This is our starting point for all future comparisons.
- Load test
  - Target load: 3,000 users with 4,500 filters.
  - Check if we meet SLOs under realistic load.
  - Compare results with Baseline.
- Capacity test
  - Increase load to the limit.
  - Goal — find the system ceiling. Where will it start breaking?
- Soak test
  - Run for 48 hours under constant load.
  - Check stability, memory leaks, and Cronjob behavior.
Important Details
- Telegram API is partially mocked — most traffic goes through mock, but a small percentage (sending to one user — me) goes to the real Telegram API to check behavior;
- Otodom and OLX remain real sources. I don’t mock them — simulating real website behavior with pagination, anti-bot, and unpredictable HTML changes is too complex and doesn’t look like production. Traffic will be constant: before running tests, I clean the database for the last N days (default 7 days), so parsers re-fetch all listings and the load from external sites is representative.
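Mocking the Telegram Bot API is cheap because the services only need a sendMessage-shaped answer. A minimal sketch of such a mock in aiohttp, assuming the Telegram base URL is configurable in the services:

```python
# Minimal mock of the Telegram Bot API sendMessage method, for load tests.
# Assumes the services can point their Telegram base URL at this server.
import itertools

from aiohttp import web

_message_id = itertools.count(1)

async def send_message(request: web.Request) -> web.Response:
    data = await request.json()
    # Mimic the shape of a successful Bot API response.
    return web.json_response({
        "ok": True,
        "result": {
            "message_id": next(_message_id),
            "chat": {"id": data.get("chat_id")},
            "text": data.get("text", ""),
        },
    })

app = web.Application()
# Bot API URLs look like /bot<token>/sendMessage; the token part is ignored here.
app.router.add_post("/bot{token}/sendMessage", send_message)

if __name__ == "__main__":
    web.run_app(app, port=8081)
```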
What’s Next?
GetMeOne started as a small script for apartment hunting and turned into a full-fledged service with microservice architecture, monitoring, and a performance testing strategy. Sometimes I can’t believe this is happening 🤯
Now this project is a platform for an open experiment. Here you can watch how I do performance work in practice, with real errors, solutions, and conclusions.
In the following parts, I'll share:
- How I set up monitoring and tracing (Prometheus, Grafana, Loki)
- How I prepared the test environment (mocks, data, scripts)
- How I ran tests in JMeter and k6
- What the results showed and what surprises there were
- What optimizations I had to make (and if I had to at all)
The story is just beginning. Subscribe on LinkedIn (en) or Telegram (ru), it will be interesting! 🚀