Author: Pavel Penchev

  • Optimise Business Process for Communication

    Optimise Business Process for Communication

    Mushrooms are among the oldest and, compared to us humans, simplest organisms. What we call a mushroom is actually just the fruiting body of a complex underground network of fungal threads called mycelium – that’s where all the exciting stuff happens. Not only does the mycelium provide nutrients, it also serves as a communication network.

    How to Optimize a Business Process for Communication – Lessons from a Mushroom

    While the mechanism is yet to be studied in full detail, scientists understand that the network sends various signals that stimulate growth in a particular direction, warn of danger, and more. The network communicates very efficiently – scientists have even made such networks solve basic labyrinths and optimize city traffic problems.

    Why Human Communication Doesn’t Scale for Business

    Humans are not like this.

    We don’t naturally form networks when building and collaborating; human communication just doesn’t scale. People are most comfortable communicating one-on-one – they pick up all the verbal and non-verbal signals and form a pretty good understanding between themselves. But as more people join the group, communication efficiency drops incredibly fast. There are multiple studies, both behavioral and historical in nature, on the optimal size of a group. The consensus is that groups larger than about 10 people cannot collaborate efficiently.

    This is hardly novel, but that’s not the point. What possessed me to write these lines is how often we forget or ignore that the drop in communication efficiency begins right after moving away from the one-to-one setup. Time and again, we see projects grow to tens of people, introduce multiple decision centers and heavy inter-team dependencies, and eventually deliver poor products or services.

    Why It’s Not Enough for Software to Scale – Business Communication Must Be Human as Well

    The solution is to implement a “no compromise” attitude, putting efficient human-modeled communication practices above all else and reshaping the organization and the product accordingly.

    After all, agile doesn’t only mean the development team is using delivery sprints – the entire business must be agile.

    Conway’s law states that an organization’s output unwittingly becomes a copy of the communication structure within that organization. Applied to our scenario: the structure of a piece of software will mirror the communication structure of the organization that built it.

    If you want world-class software, make efficient communication the factor your system is optimized for.

  • Simple, Simpler, Perfect – Finding Complexity in Simplicity

    Simple, Simpler, Perfect – Finding Complexity in Simplicity

    How to frame “simple, simpler, perfect”? A drum teacher once told me: “To play a simple beat really well, you must first master the complex stuff; practice a lot. Then revisit the simple beat.” At the time, I was not particularly convinced. I mean, how hard could an AC/DC drum pattern be? It is, in fact, really simple. But the drum teacher was wise, and I guarantee you, even with an untrained ear, in a blind test you’d pick AC/DC’s drummer over my playing any day of the week and twice on Sunday. Simple things – how you attack the note, the timing precision of each stroke – add up to playing a simple beat perfectly rather than just “kind of OK.”

    How the K.I.S.S. Concept Applies to Software Development

    This concept applies to software as well. Like music, software is composed: you combine different components and functionality into an interface a client can understand. And as in music, you can’t expect easy adoption if you’re composing avant-garde techno-folk-jazz.

    Simple

    Previously I wrote about dumb services architecture, but the application of the “simplicity concept” is tied most strongly to the client experience: if your core client experience is simple to understand, you’ll appeal to a much wider audience.

    To restate: your product improves in proportion to how much you polish the simple things in your software. Or, even more simply (pun intended): Simplicity = Scale.

    Simpler

    Scaling your software and business is more manageable when you focus on the core client experience. In the case of software, though, unlike music, the effects of this concept are multiplied.

    • Users will intuitively pick your polished product over the competition.
    • There is no need to educate users on how to use the software.
    • Users can demonstrate your software and persuade others to use it. With a strong core experience, users build a mental model of your product and become natural advocates for you.
    • Your software is easier to maintain and deploy. This may not always hold, especially if you leverage a simple user experience to hide a lot of complexity. Nevertheless, at least at the UI level, it has merit.

    Perfect

    Last week an event occurred that offers a perfect example of the above: Coinbase went public at a $100B valuation. You may or may not follow cryptocurrencies, but here’s the essence of the story: they beat all the competition within the crypto industry by creating a simple, polished core client experience. Everything else was secondary for them.

    Simply Complex Perfection

    In conclusion: before building, ask yourself a few questions. Is this client functionality necessary? Even if clients insist, will it bring value to your core experience? Are three layers of backend frameworks essential to make an SQL query? These decisions are hard to make. Paradoxically, building simply is more arduous than building complexly. But it pays off.

  • Search Event Data Collection – Progress So Far

    Search Event Data Collection – Progress So Far

    More than 2 years ago we open-sourced our search-collector, a lightweight JavaScript SDK that lets you collect search KPIs from your e-commerce website. This post illustrates our progress with search event data collection to date. Since launch, our search event collector has gathered close to a billion events, all while maintaining the utmost user privacy – the collector SDK does not track any personally identifiable information and uses no fingerprinting or associated techniques. The collector’s sole focus is to record search events and pass them to an endpoint.

    Why the Search Collector – and Why You Need It Too

    One may argue that Google Analytics provides everything you need. However, once you dive deeper into site search analytics, its deficiencies become apparent.

    • Google Analytics runs on sampled data. As a result, it does not paint an accurate picture.
    • Certain KPIs are impossible to implement – for example, product click tracking per keyword.
    • Google Analytics is often not optimally configured within the web-shop, and the engineering resources needed to fix it are rarely available.

    These types of scenarios led to the birth of the search event collector. As we would rather not impose a particular type of configuration, we structured the collector as an SDK. This strategy gives every team the flexibility to assemble a unique search metric collection solution fit for a particular purpose.

    How does search-collector work?

    Search-collector has two key concepts:

    Collectors

    Collectors are simple JavaScript classes that attach to a DOM element, e.g. the search box. When an event of interest happens at that element, the collector reacts, packages the relevant data, and passes it on to a Writer (see below).

    We provide many out-of-the-box collectors that help track events like typing, searches, refinements, add-to-baskets, and more.
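    To make this concrete, here is a minimal sketch of what a collector can look like. The class name, constructor signature, and the writer’s write method are illustrative assumptions, not the SDK’s exact API:

      // Hypothetical collector: watches a search box and reports
      // each submitted query to a writer.
      class SearchBoxCollector {
        constructor(element, writer) {
          this.element = element; // DOM node to observe, e.g. the search box
          this.writer = writer;   // receives the packaged event data
        }

        attach() {
          // React to the event of interest, package the relevant data,
          // and pass it on to the writer.
          this.element.addEventListener('change', (event) => {
            this.writer.write({
              type: 'search',
              keyword: event.target.value,
              timestamp: Date.now()
            });
          });
        }
      }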

    Writers

    Writers receive data from the collectors and deliver it to a storage location. Chaining writers together provides separation of concerns (SoC) and makes them reusable. For example, we offer a BufferingWriter whose only role is to buffer incoming data for a certain amount of time before sending the batch on to the endpoint. This prevents an HTTP request from firing on every keypress in the search box.

    Two writers of particular interest to readers of this post are the RestEventWriter and the SQSEventWriter, which send data to a specified REST endpoint or to Amazon’s SQS, respectively. In production, we mostly use the SQS writer due to its low cost and reliability.
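    To illustrate how the pieces chain together, a setup might look roughly like this (the constructor arguments shown are assumptions; consult the SDK for the exact signatures):

      // Hypothetical wiring: keystroke-level events are buffered,
      // then flushed in one batch to a REST endpoint.
      const restWriter = new RestEventWriter('https://shop.example.com/events');
      const bufferingWriter = new BufferingWriter(restWriter, 2000 /* ms */);

      // The collector only ever sees the outermost writer in the chain.
      const collector = new SearchBoxCollector(
        document.querySelector('#search-box'),
        bufferingWriter
      );
      collector.attach();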

    Search-Collector: Progress vs. Room For Improvement

    The Progress

    • The search-collector has reliably collected close to a billion events on both desktop and mobile.
    • We have not encountered any client issues, and the appeal of precise search tracking immediately captures the interest of web-shop and e-commerce owners. The resulting data is easy to digest and manage.
    • We package the collector as a single script bundle, so a single script tag adds the search-collector to the web-shop. This streamlined initial setup keeps later updates to the event collection flexible.
    • The SQS mechanism turned out to be a cheap and reliable option for search event storage.
    • The composable Collectors and Writers are flexible enough to capture almost any case we’ve encountered to date.

    Room For Improvement

    The tight coupling of the collector code to the web-shop’s DOM model sometimes creates issues.

    • For example, when DOM structure changes are made without notice. We’re working on a best-practice document and a new code version that encourages the use of custom client-side events.
      • For example, we will soon recommend that web-shops send a custom searchEvent when a search is triggered, while the collector code registers as a page-level listener for these events – see the sketch after this list.
    • Impression tracking on mobile is difficult. Events fire differently, and detecting whether a product was within the visible screen area does not work consistently across devices. Although impressions are rarely used, we’re working on improving this area.
    • Combining Google Analytics data (web-shops usually have it and use it) with search-collector data is not trivial. We’re close to launching our Search Insights product that does just that. It will be a considerable help if you have ever had to combine these data sources manually – mind the bot traffic.
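    A minimal sketch of that custom-event pattern, using the standard browser CustomEvent API (the searchEvent name comes from the recommendation above; the payload shape and the writer are illustrative assumptions):

      // Web-shop side: announce a search through a custom DOM event,
      // independent of the page’s DOM structure.
      document.dispatchEvent(new CustomEvent('searchEvent', {
        detail: { keyword: 'red shoes' } // hypothetical payload shape
      }));

      // Collector side: register once as a page-level listener, so
      // web-shop markup changes no longer break the tracking.
      document.addEventListener('searchEvent', (event) => {
        writer.write({
          type: 'search',
          keyword: event.detail.keyword,
          timestamp: Date.now()
        });
      });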

    Summary – Making Search More Measurable and Profitable

    2 years in, we’ve learned a great deal from our search-collector SDK project. On the one hand, we are collecting more data, with more seamless web-shop integration, than ever before, which ultimately allows a broader understanding of things like findability. On the other hand, the more information we gather, the more maintenance the collection pipelines require. It’s clear, however, that the value we add to our customers’ e-commerce shops far outweighs any limitations we’ve encountered.

    As a result, we continue on this journey and look forward to the next version of our search-collector. The new version will offer streamlined integration and added transparency into Google Analytics site-search data, all while maintaining the integration flexibility needed to ensure continuity of the collected data even after sudden, unforeseen changes to web-shop code.

    We’ll be launching soon, so please watch this space.

    Footnotes

    1. The Document Object Model (DOM) defines the logical structure of documents and the way a document is accessed and manipulated.
    2. Separation of concerns (SoC) is a design principle for separating a computer program into distinct sections such that each section addresses a separate concern.
    3. SQS (Amazon Simple Queue Service) is a message queue from which your services pull data. Standard queues provide at-least-once delivery; FIFO queues additionally support exactly-once processing.
  • Search Orchestration with Dumb Services

    Search Orchestration with Dumb Services

    You may think the phrase “dumb services” in a title about software orchestration is clickbait. I can assure you, however, that “Orchestration with Dumb Services” is a real and simple software orchestration concept – one certain to improve your sleep.

    Any engineer, DevOps specialist, or software architect can relate to the stress of running a loaded production system. To do it well, you need to automate, provision, monitor, and provide redundancy and fail-over to hit those SLAs. The following paragraphs cut to the chase – you won’t see any fancy buzzwords. I aim to help you avoid the pitfalls many companies stumble into when untangling monolithic software projects, or, for that matter, even when building small projects from scratch. While the concept is not applicable to every use case, it fits perfectly into the world of e-commerce search. It’s even applicable to full-text search. Generally, wherever the search index reads and writes are separate pipelines, this is for you! So, what are we waiting for? Let’s start orchestrating with dumb services.

    What is the Difference Between Regular and Dumb Services?

    To begin, let’s define the term “service”:

    A service is a piece of software that performs a distinct action.

    Nowadays, a container running as part of a Kubernetes cluster is a good example of a service. The cluster can spin up multiple instances of the container to meet demand. The configuration of a so-called regular service points it to the other services it needs – connections to databases, and so on.

    The diagram below illustrates regular services in action. As they grow, companies run many such hierarchically organized services.

    Regular Service Hierarchy

    Dumb Services

    Now, let’s clarify what “dumb service” means. In this context, a dumb service is a service that knows nothing about its environment. Its configuration is reduced to performance-related aspects (e.g. memory limits). When you start such a service, it does nothing – no connecting to other services, no joining of clusters; it just waits to be told what to do.

    Orchestrator Services

    To create a full system composed of dumb services, you deploy another service type called an “orchestrator”. The orchestrator is the “brain”, the dumb services are the “muscle” — the brain tells the muscles what to do.

    The orchestrator sends tasks to each service, directs the data exchange between services, and pushes data and configuration to the client-facing services. Every service state change is initiated by the orchestrator.

    Dumb Service Orchestration
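    To make the division of labor concrete, here is a minimal Node.js sketch; the endpoint name and payload are illustrative assumptions, not a prescribed protocol:

      // Dumb service: starts up knowing nothing about its environment.
      // No database connections, no cluster joining – it waits for tasks.
      const http = require('http');

      http.createServer((req, res) => {
        if (req.method === 'POST' && req.url === '/task') {
          let body = '';
          req.on('data', (chunk) => { body += chunk; });
          req.on('end', () => {
            const task = JSON.parse(body);
            // Execute using only the data and config the orchestrator
            // pushed along with the task – no lookups of sibling services.
            console.log('executing task:', task.action);
            res.end(JSON.stringify({ status: 'done' }));
          });
        } else {
          res.statusCode = 404;
          res.end();
        }
      }).listen(8080);

      // Orchestrator side (the “brain”): it alone knows the topology and
      // pushes tasks, data, and configuration to each service, e.g.:
      //   fetch('http://indexer-1:8080/task', {
      //     method: 'POST',
      //     body: JSON.stringify({ action: 'index', documents: docs })
      //   });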

    Let’s review our “regular vs. dumb” services in light of two key aspects of a software system — fault tolerance and scalability.

    Fault Tolerance

    Fault Tolerance with Regular Services

    In the regular-case diagram we illustrate a typical flow during a user request. The client-facing services at level 1 (labeled L1 in the diagram) need to call the internal services at levels 2 and 3 to complete the request. Naturally, in a larger system this call hierarchy goes much deeper. To meet the SLA, all services must be up all the time, as any incoming request could call a service further down the hierarchy. This is obviously a hard task: combining N services, each with 99.95% uptime, does not yield 99.95% for the entire system. In the worst case, a request that hits 5 services drops to roughly 99.75% (0.9995^5 ≈ 0.9975).
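    To put that in downtime terms (a back-of-the-envelope calculation using the 99.95% figure above): 99.95% availability allows roughly 0.0005 × 8,760 ≈ 4.4 hours of downtime per year, while 99.75% allows roughly 0.0025 × 8,760 ≈ 22 hours – a fivefold increase simply from chaining five services.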

    Fault Tolerance with Dumb Services

    Let’s compare this to a system composed of dumb services. The client-facing services on level 1 serve the request without any dependency on the services at levels 2 and 3. We only need to ensure the SLA of the L1 services to guarantee the SLA of the entire client-facing part of the system – services at levels 2 and 3 could go down without affecting user requests.

    Scalability

    Scaling Regular Services

    Scaling a system composed of regular services necessarily means scaling the entire system. If only one layer is scaled, the lower layers can become overloaded as user requests increase. Scaling also demands more automation, since the services must be correctly wired together as they multiply.

    Scaling Dumb Services Architecture

    Let’s take another look at our dumb services architecture. Each service can be scaled independently, as it has no direct dependencies on any other service. You can spin up as many client-facing services as you like to meet increased user demand without scaling any of the internal systems. And vice versa: you can increase the number of nodes for internal services on demand to meet a heavy indexing task, then easily spin them down. Again, all without affecting user requests.

    What about Testing?

    Finally, testing your service is simple: you start it and pass a task to it – no dependencies to consider.
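    For instance, a test against the hypothetical /task endpoint sketched earlier reduces to booting the service and posting a task:

      // Hypothetical test: no environment to mock – just the service and a task.
      const response = await fetch('http://localhost:8080/task', {
        method: 'POST',
        body: JSON.stringify({ action: 'index', documents: [] })
      });
      const result = await response.json();
      console.assert(result.status === 'done', 'task should complete');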

    Wrapping it up

    In conclusion, you can simplify your architecture significantly by applying this simple concept. However, as mentioned previously, it does not fit every use case: situations where your client-facing nodes are part of both the read and write data pipelines are harder to organize this way. Even so, any time you’re designing a system composed of multiple services, think about these patterns – they may save you a few sleepless nights.