The first part was about intent and structure.
This part exists because the system started to behave.
GetMeOne is no longer something I reason about only on paper. It runs continuously, touches real external systems, accumulates delays, and sometimes behaves in ways that are not obvious from code or diagrams. At this stage, performance stops being a design topic and becomes an observation problem.
This is where my thinking about performance changed the most.
The system does not have a request flow
There is no single request that enters GetMeOne and waits for a response.
Listings appear when external platforms publish them. Users do not wait synchronously. Notifications arrive later, and that delay is part of the product, not a failure.
Parser, Filter, and Notifier are not chained by calls. They are connected by time and data stored in the database. Each stage runs on its own schedule, with its own constraints and its own backlog.
Because of this, performance does not exist at a single point. It is distributed across minutes and sometimes hours. That alone makes many usual performance habits less useful.
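A minimal sketch of that shape, with illustrative intervals and placeholder step functions rather than the real schedules:

```python
import asyncio

# Hypothetical sketch: no stage calls another. Each loop reads its input
# from the database and writes its output back; the intervals are
# illustrative, not the real schedules.
async def run_on_schedule(step, interval_seconds: float):
    while True:
        await step()
        await asyncio.sleep(interval_seconds)

async def run_pipeline(parse_step, filter_step, notify_step):
    await asyncio.gather(
        run_on_schedule(parse_step, 300),   # Parser: pull new listings from platforms
        run_on_schedule(filter_step, 30),   # Filter: match stored listings to users
        run_on_schedule(notify_step, 60),   # Notifier: deliver matches via Telegram
    )
```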
End-to-end time became the main boundary
At some point I stopped asking whether a component was fast. The only question that matters is how long it takes for a listing to reach a user after it appears.
This end-to-end time is not a clean metric. It relies on timestamps written by different services. It assumes clocks behave correctly. It ignores some failure scenarios. I am aware of these limitations.
Still, this metric draws a real boundary. If it grows, the system becomes worse for users, even if CPU usage looks fine and individual stages look fast.
At this point, I stopped optimizing individual components in isolation. A fast parser, a fast filter, or a fast notifier stopped being a goal for me. If end-to-end time grows, local speed does not matter.
Everything else exists only to explain why this number changes.
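In its simplest form, the metric is just the distance between two timestamps written by different services. The names and values here are illustrative:

```python
from datetime import datetime, timezone

def end_to_end_seconds(published_at: datetime, notified_at: datetime) -> float:
    """Time from a listing appearing on the platform to the user being notified.
    Assumes both timestamps are timezone-aware and the clocks roughly agree."""
    return (notified_at - published_at).total_seconds()

# Example: a listing published at 10:00 and delivered at 10:07 -> 420 seconds.
delay = end_to_end_seconds(
    datetime(2024, 5, 1, 10, 0, tzinfo=timezone.utc),
    datetime(2024, 5, 1, 10, 7, tzinfo=timezone.utc),
)
```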
Stage metrics exist to explain accumulation
Parsing, filtering, and notification each have their own timing metrics. These metrics are not goals on their own. Making parsing faster does not automatically improve the system. Making notification faster can make things worse if upstream stages are unstable.
What matters is where time accumulates.
Sometimes parsing slows down because external platforms respond unevenly. Sometimes filtering falls behind because batches become more complex. Sometimes notification delays grow because Telegram enforces limits.
Looking at a single stage in isolation almost always leads to wrong conclusions. This is a common mistake I now recognize quickly. Teams optimize the stage that looks slow, instead of the place where time actually accumulates.
Only the relationship between stages explains system behavior.
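What I look at is closer to per-stage deltas over a window of listings, something in the spirit of this sketch; the timestamp field names are hypothetical:

```python
from statistics import median

# Hypothetical per-listing timestamps, one written by each stage.
STAGES = [
    ("parse",  "published_at", "parsed_at"),
    ("filter", "parsed_at",    "filtered_at"),
    ("notify", "filtered_at",  "notified_at"),
]

def stage_breakdown(listings: list[dict]) -> dict[str, float]:
    """Median seconds spent between stages. A single slow-looking stage means
    little; the relationship between these numbers is what explains behavior."""
    breakdown = {}
    for name, start_field, end_field in STAGES:
        deltas = [
            (row[end_field] - row[start_field]).total_seconds()
            for row in listings
        ]
        breakdown[name] = median(deltas) if deltas else 0.0
    return breakdown
```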
External systems are part of the performance model
OLX and Otodom are not mocked. They are sometimes slow. They change response patterns. They return more data than expected. That variability directly shapes how the system behaves under load.
Telegram is the only external system with controlled behavior, and even there the limitation is not speed but rules. Rate limits, retries, ordering. Ignoring these constraints would make any optimization meaningless.
Because externals are real, performance problems cannot be dismissed as test artifacts. They have to be handled as part of the system.
Optimization started after behavior became visible
Most optimizations in GetMeOne did not start with profiling tools.
They started with observation.
Backlogs growing slowly. Notification delays increasing without errors. Database connections being exhausted only under very specific interaction patterns.
These were not dramatic failures. The system kept working. That is exactly why they were easy to miss.
When I say “optimization” here, I do not mean a hunt for microseconds. I mean small design corrections that change how the system spends time and resources.
Database connections taught me about hidden concurrency
I hit “too many clients already” in PostgreSQL.
At first glance it looked like a database sizing problem. In reality it was a behavior problem.
Some helper functions created a new connection pool when no pool was passed in. Those pools were not being closed. In a system with multiple services, this turns into a quiet multiplication effect: a few services, each occasionally creating pools, each pool holding multiple connections.
What I took from it is simple. Connection pooling is not an implementation detail. It is part of system capacity.
The fix was to make connection usage bounded again: smaller pools per service, closing temporary pools, and logging that makes “a new pool was created” visible as a real signal. Raising max_connections helped with safety margin, but it was not the real solution.
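The shape of the fix, sketched with an asyncpg-style pool and a placeholder query; what matters is the ownership flag, the loud log line, and the close in finally:

```python
import logging
import asyncpg  # assumed driver; the pattern matters more than the library

log = logging.getLogger(__name__)

async def fetch_unnotified(dsn: str, pool=None):
    """Use the caller's pool when one is passed in; otherwise create a small
    temporary pool, log it as a real signal, and always close it."""
    owns_pool = pool is None
    if owns_pool:
        log.warning("No pool passed in, creating a temporary pool")  # the signal
        pool = await asyncpg.create_pool(dsn, min_size=1, max_size=2)
    try:
        async with pool.acquire() as conn:
            return await conn.fetch("SELECT id FROM listings WHERE notified = false")
    finally:
        if owns_pool:
            await pool.close()
```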
This change did not make anything faster in isolation. It made the system predictable under steady load.
I removed work from the notification path
Notifier used to do extra URL checks before sending. It felt like a quality feature. It also doubled the amount of external work in the most time-sensitive path.
When a user has a backlog of listings to receive, the notifier is already doing expensive work. Selecting data, rendering templates, sending messages, dealing with Telegram limits. Adding one more external HTTP operation per listing made that path heavier and more fragile.
I removed the check.
This was a trade. Some notifications can point to listings that are already inactive. In practice, the system already has a separate mechanism for refreshing state on the next parse cycle, and users still click through to confirm.
I accepted a small loss in freshness certainty to protect latency and stability in the delivery path.
I started caring about batch boundaries
Batch size sounds like a tuning parameter. In GetMeOne it directly affected end-to-end delay.
A large batch makes the parser efficient at writing to the database, but it also delays the moment when the first new listings become visible for filtering.
With smaller batches, the first results appear sooner. The system becomes more responsive even if total throughput stays similar.
This is a pattern I now expect to see. In asynchronous systems, batching is not just about efficiency. It is also about when work becomes observable to the next stage.
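A rough sketch of the trade-off, where store_batch stands in for the real database write:

```python
import asyncio

async def store_batch(batch: list) -> None:
    """Stand-in for the real batched INSERT; assumed to commit per call."""
    await asyncio.sleep(0)

async def parse_and_store(listings, batch_size: int = 20) -> None:
    """Smaller batches cost a little write efficiency, but the first new
    listings become visible to the filter much sooner."""
    batch = []
    for listing in listings:
        batch.append(listing)
        if len(batch) >= batch_size:
            await store_batch(batch)  # this is the moment work becomes observable
            batch = []
    if batch:
        await store_batch(batch)
```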
I optimized the fetch_page function, but not for elegance
Parsing is dominated by network. I cannot optimize OLX or Otodom response time. What I can control is overhead around it.
At some point I noticed that fetch_page was paying costs that had nothing to do with the remote server: recreating the SSL context on every call, creating new connectors, losing connection reuse, and decoding large HTML pages even when I only needed a small JSON fragment.
I moved toward reuse: a shared connector and SSL context, and extracting only the data I actually needed. Working with raw bytes and parsing just the embedded JSON reduced CPU overhead and memory churn.
This did not make the network faster. It removed self-inflicted costs. That matters when you repeat the same operation hundreds of times per cycle.
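A sketch of that direction, using aiohttp as an assumed client; the marker is a hypothetical example of where an embedded JSON fragment might start:

```python
import json
import ssl

import aiohttp

# Created once and reused; recreating these per request was the self-inflicted cost.
SSL_CONTEXT = ssl.create_default_context()
_session: aiohttp.ClientSession | None = None

async def get_session() -> aiohttp.ClientSession:
    global _session
    if _session is None or _session.closed:
        connector = aiohttp.TCPConnector(ssl=SSL_CONTEXT, limit=20)  # keep-alive reuse
        _session = aiohttp.ClientSession(connector=connector)
    return _session

async def fetch_page(url: str, marker: bytes = b'"__PRERENDERED_STATE__":'):
    """Download the page as bytes and parse only the embedded JSON fragment,
    instead of decoding and searching the whole HTML document as text."""
    session = await get_session()
    async with session.get(url) as resp:
        body = await resp.read()  # bytes; no full-page text decoding
    start = body.find(marker)
    if start == -1:
        return None
    tail = body[start + len(marker):].decode("utf-8", "ignore").lstrip()
    fragment, _ = json.JSONDecoder().raw_decode(tail)  # stops after the first JSON value
    return fragment
```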
Caching became a decision about what changes
I added caching in a few places, but not because caching is automatically faster. I cached because certain data almost never changes, while the system touches it constantly.
In parsers, loading large sets of existing listings from the database on every cycle was wasteful. The data changes slowly compared to parsing frequency. Caching shifted work away from the database and removed a repeated startup cost.
In filtering, location data for radius checks is stable, but queries repeat frequently. Caching locations reduced database roundtrips and made filtering less sensitive to input volume.
Caching is never free. It creates memory use and invalidation questions. In this system, I accepted simple TTL-based strategies and restart-based resets because the scope and risk are manageable.
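The cache itself is deliberately simple, something in the spirit of this sketch; the TTL value is illustrative:

```python
import time

class TTLCache:
    """Minimal TTL cache: enough when data changes far more slowly than it is read,
    and losing the cache on restart is acceptable."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._data: dict = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if time.monotonic() - stored_at > self.ttl:
            del self._data[key]
            return None
        return value

    def set(self, key, value):
        self._data[key] = (value, time.monotonic())

# Hypothetical usage: location rows for radius checks barely change,
# so a 30-minute TTL removes most of the repeated database roundtrips.
locations = TTLCache(ttl_seconds=30 * 60)
```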
I stopped sleeping when there was work
One change looks almost too small to mention.
Filter used to sleep on every loop, even when there was a backlog. That sleep is invisible in code review. It also directly becomes latency in an asynchronous pipeline because it inserts artificial waiting between stages.
I changed the behavior so the filter sleeps only when there is no work. It is not about squeezing performance. It is about not paying latency taxes for no reason.
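The change amounts to moving the sleep behind a check. fetch_pending and process_batch are placeholders for the real filter steps:

```python
import asyncio

async def filter_loop(fetch_pending, process_batch, idle_sleep: float = 5.0):
    """Sleep only when there is nothing to do. With a backlog, the next batch
    starts immediately instead of paying an artificial latency tax."""
    while True:
        batch = await fetch_pending()
        if not batch:
            await asyncio.sleep(idle_sleep)  # idle: waiting here is fine
            continue
        await process_batch(batch)           # busy: no sleep between batches
```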
Telegram forced a choice between throughput and delivery quality
Telegram rate limits shaped notification behavior.
I added intentional pacing between messages. This increases the time needed to flush a large backlog to a single user. It also reduces the risk of being rate limited or flagged as spam, and it makes delivery feel less like a burst.
This optimization does not look like an optimization. It makes individual batches slower. It makes the system more stable and predictable.
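The pacing itself is not sophisticated; a sketch with an assumed per-chat delay rather than Telegram's exact limits:

```python
import asyncio

PER_CHAT_DELAY_SECONDS = 1.0  # assumed pacing, not an official Telegram number

async def flush_backlog(send_message, chat_id: int, pending: list[str]) -> None:
    """Deliberately paced delivery of a user's backlog: slower to flush,
    but less likely to hit rate limits or feel like a burst of spam."""
    for text in pending:
        await send_message(chat_id, text)            # real sending function is assumed
        await asyncio.sleep(PER_CHAT_DELAY_SECONDS)  # pacing between messages to one user
```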
To validate changes safely, I kept a controlled mock mode on the Telegram sending path. External platforms were still real, but notifications could be simulated for most users while still generating real metrics and traces. This allowed me to test behavior without spamming real users.
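The mock path is a thin switch around the real call; the sender and metrics hook here are hypothetical stand-ins:

```python
import asyncio
import random
import time

async def send_notification(chat_id: int, text: str, *, mock: bool,
                            real_send, record_duration) -> None:
    """When mock is on, skip the real Telegram call but keep timing and
    metrics real, so traces and dashboards stay meaningful."""
    started = time.monotonic()
    if mock:
        await asyncio.sleep(random.uniform(0.02, 0.1))  # stand-in for network latency
    else:
        await real_send(chat_id, text)
    record_duration(time.monotonic() - started)
```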
None of these optimizations were clean wins. Every change traded one kind of risk for another. What mattered was not removing risk, but choosing which risk the system would live with.
Steady load exposed more than peak load
The most useful performance signals did not come from spikes. They came from steady, continuous activity.
Under constant load, small inefficiencies started to accumulate. Connection pools filled up slowly. Async tasks waited longer than expected. Stages drifted out of sync without throwing errors.
Nothing crashed. Nothing alerted immediately. The system simply became less responsive over time. This kind of behavior cannot be understood through peak testing alone. It requires watching the system live and letting it age.
Tracing helped locally, not globally
There is no distributed trace from parsing to notification. That is not a missing feature. It is a consequence of the architecture.
Tracing is still useful, but only within a stage. It explains where time is spent inside parsing loops, filtering batches, or notification sending. It does not explain the full journey of a listing through the system. Expecting it to do so would be misleading.
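A sketch of stage-local spans, using OpenTelemetry only as an illustration; the span names are hypothetical:

```python
from opentelemetry import trace  # assumed instrumentation library, for illustration

tracer = trace.get_tracer("getmeone.filter")

def filter_batch(batch: list) -> None:
    # Spans explain where time goes inside this stage; nothing is propagated
    # to the notifier, so the trace ends at the stage boundary.
    with tracer.start_as_current_span("filter_batch") as span:
        span.set_attribute("batch.size", len(batch))
        with tracer.start_as_current_span("load_locations"):
            pass  # database roundtrip or cache hit
        with tracer.start_as_current_span("apply_filters"):
            pass  # per-listing matching against user criteria
```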
What this system changed in my thinking
Working on GetMeOne forced me to stop thinking about performance as speed.
What matters is how time accumulates and where it accumulates. Asynchronous systems rarely fail loudly. They drift. If you only look for spikes and saturation, you miss the real problems.
Because of that, I no longer treat performance as a phase or a checklist. I think about it as system behavior under time and constraints. I no longer start performance work with tools or tests. I start with where time can accumulate and what the system is allowed to sacrifice.
GetMeOne is small.
Its limits are visible.
Its mistakes are cheap.
That is exactly why it is useful.
This text does not conclude anything. It records my current way of reasoning. Everything else is intentionally left open.