Henrik Ingo: “Half the time it’s a problem with the application code and not with the database.”

Fwdays
Sep 30, 2018

We would like to introduce you to a new special project of Fwdays and speakers.in.ua. The Highload fwdays’18 conference was held on September 15, and below you can find an interview with Henrik Ingo, who works in R&D on the performance team at MongoDB and spoke at the conference. Enjoy reading :)

P.S. A Russian version of the interview can be found here.

Why did you choose performance? You have been working on it for about 15 years already.

I guess I’ve always been working on performance. I’ve been working with MySQL, and databases in general, for ten years. Consultants are often called in to solve performance problems with databases, and “performance problems” can mean many things. You call a consultant when you are stuck and need help. Maybe it’s an architectural problem. Half the time it’s a problem with the application code and not with the database; then I fix whatever the application is doing wrong that causes the problem. And sometimes it’s a performance problem where you need to buy a bigger server or find another solution. So when I say “I do performance,” it’s quite a broad topic. It falls under non-functional requirements: when you develop a product, you have the UI and the features, and at the end of the project you get to the non-functional tasks. These tasks are about availability and performance, which actually matter a great deal. If the database isn’t there, the application won’t work.

What are the first steps you take when you start working on performance?

There is one great principle I follow when I do performance work: you should test what the application actually does, i.e., what the user does. It’s quite common for people to run some benchmark against a database (maybe because it’s available in Open Source) or do some tuning, but this may have no relevance to what your application does in production.

A similar situation often arises when building monitoring solutions. You may have a monitoring process that connects to the database to check that it’s still there. Maybe it does a “SELECT 1”. This check may succeed while many things are still not working. The same is true for performance testing. Make sure you’re testing what the application is really doing.
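The gap between a “SELECT 1” probe and a check that exercises what the application needs can be sketched as follows. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the production database, and the `orders` table and both function names are hypothetical, not anything described in the interview.

```python
import sqlite3


def shallow_check(conn):
    """Liveness probe: only proves that the database answers at all."""
    return conn.execute("SELECT 1").fetchone() == (1,)


def representative_check(conn):
    """Run a query the application depends on (hypothetical `orders` table)."""
    try:
        conn.execute("SELECT id, total FROM orders LIMIT 1").fetchall()
        return True
    except sqlite3.Error:
        return False


# In-memory SQLite is an assumption made to keep the sketch self-contained.
conn = sqlite3.connect(":memory:")
shallow_ok = shallow_check(conn)              # succeeds: the database responds
deep_ok_before = representative_check(conn)   # fails: the table the app needs is missing

conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
deep_ok_after = representative_check(conn)    # succeeds once the schema exists
```

The shallow probe passes in both states, which is exactly why it says little about whether the application can do its work.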

When is MongoDB the best choice?

I usually answer this question in two steps.

1. You have these well-established relational databases based on SQL. You have to design the schema up front, so it’s a planning-heavy and design-heavy approach. However, a strict data model with many correctness checks sometimes has benefits. Then you have this new field of NoSQL databases, including MongoDB, that was invented ten years ago. And MongoDB is still growing. Typically, they do two things differently: one is a scale-out model (as in MongoDB), and the second is that data is not always tables, columns, and rows. Instead, you have JSON documents or some other very flexible model. That leads me to the second part.

2. Okay, you have a lot of different databases, so what is special about MongoDB? In the field of databases, you have a spectrum where some products focus on being simple, high-performance, scale-out databases, like MemcacheDB (https://en.wikipedia.org/wiki/MemcacheDB) or something similar. These are just key-value databases, and they can scale very well, but they don’t have a lot of features. In MongoDB, we always try to do both. Early on, MongoDB already supported replication, and it worked smoothly. I’m quite impressed by this database. We also have a very rich query language. The query language has a different syntax, but functionally it’s a match for what you can do with an SQL database. This is actually unique.

Are there some projects where MongoDB is not the best choice?

There are fewer and fewer. I used to say that we don’t have transactions, and that a very transactional application might be a problem. However, now you can do that with MongoDB. I have to think of a new answer; I’m sorry, but I don’t have one now.

What tools do you use in your work?

If you use our cloud service Atlas, there is a monitoring tool, but I use it very rarely. One reason is that when I was working as a consultant and went to customers, I couldn’t start installing tools; I can’t spend two days installing a management solution. I often use scripts written in Python. Sometimes these scripts don’t even have a GUI, but they can produce bar charts or something similar to give an overview. So I need lightweight tools that I can download and run “without being a root user.” For example, my colleague Thomas Rückstieß has developed a handy suite called Mtools, which stands for MongoDB tools. Mtools covers different things, some of its tools are for analysis, and I have used those a lot. Basically, the first type of tool I like to use is a graph. Even if you don’t know math and statistics well, try using Excel or whatever to draw a chart of your data. It may help you find what is the cause and what is the symptom.
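As a rough idea of what a lightweight, GUI-free script of this kind might look like, here is a minimal sketch that draws a bar chart in plain text using only the Python standard library. It is not one of the Mtools scripts; the function name `text_bar_chart` and the sample numbers are made up for illustration.

```python
def text_bar_chart(values, width=40):
    """Render {label: number} as horizontal bars scaled to the largest value."""
    if not values:
        return []
    peak = max(values.values())
    label_width = max(len(label) for label in values)
    lines = []
    for label, value in values.items():
        bar_len = round(width * value / peak) if peak > 0 else 0
        lines.append(f"{label.ljust(label_width)} | {'#' * bar_len} {value}")
    return lines


# Hypothetical data, e.g. slow operations counted per hour from a log file.
slow_ops_per_hour = {"09:00": 12, "10:00": 48, "11:00": 24}
chart = text_bar_chart(slow_ops_per_hour, width=20)
print("\n".join(chart))
```

Because it needs no installation and no root access, a script like this can run on a customer machine that only has Python available.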

The problem wasn’t on the MongoDB side; there was a lot of randomness

How did you solve a problem of performance testing with AWS (Amazon Web Services)?

For a couple of years, I’ve been working on our performance testing, and we used Amazon to run extensive performance tests. We could use as many as 16 Amazon servers. For the first few years of doing this, our results had a lot of variability. The problem wasn’t on the MongoDB side; there was a lot of randomness. We spent around three months on this project, much of it testing Amazon itself. We compared a lot of things: Linux versions, many CPU scaling options, etc. It was a long process. I already told you about one of my favorite principles; another that I want to emphasize here is testing one thing at a time. That is the approach we used to solve this problem.
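One way to put a number on “variability in results” while changing one thing at a time is to compare the coefficient of variation of repeated runs. The sketch below is an assumption about how such a comparison could be done, not the team’s actual tooling; the configuration names and throughput figures are invented.

```python
import statistics


def coefficient_of_variation(samples):
    """Sample standard deviation relative to the mean; lower is more repeatable."""
    return statistics.stdev(samples) / statistics.mean(samples)


# Hypothetical throughput (ops/sec) from repeated runs of the same test.
# The two configurations differ in exactly one setting; everything else is fixed.
runs = {
    "baseline": [1000, 1400, 800, 1200, 600],
    "one_setting_changed": [1000, 1010, 990, 1005, 995],
}
noise = {name: coefficient_of_variation(samples) for name, samples in runs.items()}

for name, cv in noise.items():
    print(f"{name}: {cv:.1%} run-to-run variation")
```

Since only one setting differs between the two sets of runs, a drop in variation can be attributed to that setting rather than to some other change made at the same time.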

I’ve participated in a lot of conferences, and one day I got tired of the existing tools

You contribute to great Open Source projects. Please tell us the story behind impress.js.

Impress.js is a framework and not an application yet. With this framework, you can create presentations in the browser using HTML and CSS, and impress.js makes that very simple. I’ve participated in a lot of conferences, and one day I got tired of the existing tools. That was how I found impress.js. The person who created it had stopped actively developing it by then, so I made improvements and a couple of pull requests. Over two years, I’ve added a lot of features.

Also, now I’m a maintainer of this framework. When I was a consultant, it was refreshing for me to have some project to work on.

How do you choose a topic for a conference? How much time does it take to prepare?

Topics come from experience that can be shared at conferences, so I must have something interesting to talk about. If I had a boring job, there would be nothing to talk about. That is the first point. Moreover, for many years I’ve been able to talk about Open Source products. You can submit several proposals to the conference organizers, and they choose. When I’m invited, I always ask what they want to hear from me. I always try to get feedback so that I can present the best possible topic for the specific conference.

Maybe one day you will change the world

What is your advice for people who want to follow your path?

I would recommend diving into Open Source. You can learn a lot from it. Also, do something you’re passionate about. Maybe one day you will change the world.

Stay tuned for the next interviews ;)
