[Translation] The faster you forget OOP, the better for you and your programs.



Object-oriented programming is an exceptionally bad idea which could only have originated in California.

- Edsger W. Dijkstra

Perhaps it is just my experience, but object-oriented programming seems to be the default, most common paradigm of software design. It is what students are usually taught, what online tutorials explain and, for some reason, what people apply spontaneously even when they never intended to.

I know how attractive it is, and how wonderful the idea looks on the surface. It took me many years to break its spell, and now I understand how terrible it is, and why. Because of this, I am firmly convinced that people need to understand what is wrong with OOP and know what can be used instead.

Many people have previously discussed the problems of OOP, and at the end of this post I will list my favorite articles and videos. But first I want to share my own opinion.

Data is more important than code


At its core, all software is designed to manipulate data in order to achieve a certain result. The result determines how the data is structured, and the data structure determines the necessary code.

This point is very important, so I will repeat it: goal -> data architecture -> code. The order cannot be changed! When designing a program, always start by working out the goal, and then sketch, at least roughly, the data architecture: the data structures and infrastructure needed to achieve that goal efficiently. Only then write the code that works with this architecture. If the goal changes over time, change the data architecture first, and then the code.

In my experience, the most serious problem with OOP is that it encourages ignoring the data model architecture and mindlessly storing everything in objects, which promise some vague benefits. If it looks like a candidate for a class, it goes into a class. Do I have a Customer? It goes into class Customer. Do I have a rendering context? It goes into class RenderingContext.

Instead of building a good data architecture, the developer's attention shifts toward inventing "good" classes, the relationships between them, taxonomies, inheritance hierarchies, and so on. This is not just a useless effort. It is actually quite harmful.

Encouraging complexity


When you design a data architecture explicitly, the result is usually the minimum viable set of data structures that serves the goal of the software. If you think in terms of abstract classes and objects, nothing limits how grandiose and complex the abstractions can become. Just take a look at FizzBuzz Enterprise Edition: such a simple task can take so many lines of code only because in OOP there is always room for new abstractions.

OOP advocates will say that keeping abstractions in check is a matter of developer skill. Maybe. But in practice, OOP programs tend to only grow and never shrink, because OOP encourages it.

Graphs everywhere


Since OOP requires scattering information across many small encapsulated objects, the number of references to these objects explodes as well. OOP requires you either to pass long lists of arguments everywhere or to hold references to related objects directly so that you can reach them quickly.

Your class Customer has a reference to class Order, and vice versa. The class OrderManager holds references to all Orders, and therefore indirectly to Customers. Everything tends to point at everything else, because over time more and more places in the code need to reach a related object.

You wanted a banana, but what you got was a gorilla holding the banana and the entire jungle.

OOP projects usually look not like well-designed data stores but like huge spaghetti graphs of objects pointing at each other, with methods that take long argument lists. When you start designing Context objects just to cut down the number of arguments passed around, you know you are writing real Enterprise-level OOP code.

Cross-cutting concerns


The vast majority of essential code does not operate on just one object; it implements cross-cutting concerns. Example: when class Player hits() a class Monster, where exactly does the data need to change? The Monster's hp has to decrease by the Player's attackPower; the Player's xp has to increase by the Monster's level if the Monster was killed. Should this happen in Player.hits(Monster m) or in Monster.isHitBy(Player p)? What if a class Weapon has to be taken into account? Do we pass it as an argument to isHitBy, or does Player have a currentWeapon() getter?

This simplified example with just three interacting classes is already turning into a typical OOP nightmare. A simple data transformation becomes a bunch of clumsy, intertwined methods that call each other, and the only reason is an OOP dogma: encapsulation. Add a bit of inheritance to the mix and you get a good example of what stereotypical Enterprise-level software looks like.
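For contrast, here is a minimal sketch of the same hit written as one plain function over plain data. The field names come from the example above; the Weapon record, its bonus field, and the function name resolve_hit are assumptions made purely for illustration.

```rust
// Plain data records; no methods, no hidden state.
#[derive(Debug)]
struct Player {
    attack_power: u32,
    xp: u32,
}

#[derive(Debug)]
struct Monster {
    hp: u32,
    level: u32,
}

// Hypothetical weapon record: an extra input becomes an extra parameter,
// not a new getter on Player.
struct Weapon {
    bonus: u32,
}

/// Applies one hit and returns true if the monster has no hp left afterwards.
fn resolve_hit(player: &mut Player, monster: &mut Monster, weapon: Option<&Weapon>) -> bool {
    let damage = player.attack_power + weapon.map_or(0, |w| w.bonus);
    // saturating_sub keeps hp at 0 instead of underflowing.
    monster.hp = monster.hp.saturating_sub(damage);
    let killed = monster.hp == 0;
    if killed {
        player.xp += monster.level;
    }
    killed
}

fn main() {
    let mut player = Player { attack_power: 7, xp: 0 };
    let mut monster = Monster { hp: 10, level: 3 };
    let sword = Weapon { bonus: 2 };

    let killed = resolve_hit(&mut player, &mut monster, Some(&sword));
    println!("killed: {killed}, monster: {monster:?}");
    let killed = resolve_hit(&mut player, &mut monster, Some(&sword));
    println!("killed: {killed}, player: {player:?}");
}
```

The whole transformation is visible in one place, and the question of which class "owns" the hit simply does not come up.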

Schizophrenic encapsulation of objects


Let's take a look at the definition of encapsulation:

Encapsulation is an OOP concept that binds data and functions to manipulate this data, protecting them from outside interference and misuse. Data encapsulation has led to the concept of data hiding important to OOP.

The intention is good, but in practice, encapsulation at the granularity of an object or class often leads to code that tries to separate everything from everything else (from itself). This creates a huge amount of boilerplate: getters, setters, numerous constructors, strange methods, all trying to protect us from mistakes that are very unlikely to occur at such a small scale. Here is a metaphor: I put a padlock on my left pocket so that my right hand cannot take anything out of it.

Don't get me wrong: enforcing constraints, especially on ADTs, is usually a good idea. But in OOP, with all these cross-references between objects, encapsulation often achieves nothing useful, and constraints scattered across many classes are hard to keep track of.

In my opinion, classes and objects are too fine-grained; isolation, APIs, and so on are better handled at the boundaries of "modules"/"components"/"libraries". And in my experience, OOP codebases (Java/Scala) are exactly the ones where modules/libraries are not used. Developers focus on building fences around each class, without much thought about which groups of classes together form a separate, reusable, coherent logical unit.

You can look at the same data differently


OOP requires organizing data rigidly: it has to be split into a set of logical objects, which defines the data architecture as a graph of objects with associated behavior (methods). However, it is often useful to have several ways of logically expressing operations on the same data.

If the program data, for example, is stored in a tabular, data-oriented form, you can create two or more modules, each of which works with the same data structure, but in a different way. If the data is broken into objects with methods, then it is no longer possible.
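As a rough sketch of this idea, the two functions below read one and the same table of rows in two unrelated ways. The OrderRow fields and both function names are assumptions made for illustration; in a real codebase each function would live in its own module.

```rust
use std::collections::HashMap;

// One plain table of order rows (hypothetical fields).
#[derive(Debug, Clone)]
struct OrderRow {
    customer_id: u32,
    amount_cents: u64,
    shipped: bool,
}

// "Billing" view: the table is a ledger to be summed per customer.
fn billing_total_per_customer(rows: &[OrderRow]) -> HashMap<u32, u64> {
    let mut totals = HashMap::new();
    for row in rows {
        *totals.entry(row.customer_id).or_insert(0) += row.amount_cents;
    }
    totals
}

// "Shipping" view: the very same table is a queue of rows not yet shipped.
fn shipping_pending_indexes(rows: &[OrderRow]) -> Vec<usize> {
    rows.iter()
        .enumerate()
        .filter(|(_, row)| !row.shipped)
        .map(|(index, _)| index)
        .collect()
}

fn main() {
    let rows = vec![
        OrderRow { customer_id: 1, amount_cents: 500, shipped: true },
        OrderRow { customer_id: 2, amount_cents: 300, shipped: false },
        OrderRow { customer_id: 1, amount_cents: 200, shipped: false },
    ];
    println!("totals: {:?}", billing_total_per_customer(&rows));
    println!("pending rows: {:?}", shipping_pending_indexes(&rows));
}
```

Neither view needs to know about the other, and neither forces its own shape onto the data.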

This is also the main cause of the object-relational impedance mismatch. Although a relational data structure is not always the best choice, it is usually flexible enough to be worked with in many different ways, using different paradigms. The rigidity of data organization in OOP, however, makes it incompatible with any other data architecture.

Poor performance


Data scattered across many small objects, heavy use of indirection and pointers, and the lack of a proper data architecture together lead to poor runtime performance. That is reason enough.

What approach should be used instead of OOP?


I don’t think there is a “silver bullet”, so I’ll just describe how it usually works in my code today.

First I study the data. I analyze what goes in and what comes out, the data format, and the volume. I work out how the data should be stored at run time and how it should be persisted: which operations must be supported and how fast (throughput, latency), etc.

Usually, if the data is of significant size, my design ends up close to a database. That is, I will have an object, for example a DataStore, with an API that exposes all the operations needed to query and store data. The data itself is held in ADT/PoD structures, and any references between data records are expressed as IDs (a number, UUID, or deterministic hash). Under the hood it usually closely resembles, or is actually backed by, a relational database: Vecs or HashMaps store the bulk of the data by index or ID, other structures serve as the "indexes" required for fast lookups, and so on. Other data structures, such as LRU caches, live here too.

The main part of the program logic takes a reference to such DataStores and performs the necessary operations on them. For concurrency and multithreading, I usually connect the different logical components through message passing, actor-style. Example actors: stdin reader, input data handler, trust manager, game state, etc. Such "actors" can be implemented as thread pools, pipeline stages, etc. When needed, they can have their own DataStore or share one with other actors.
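A minimal sketch of such a store might look like the following. The record types, field names, and methods are assumptions for illustration only; IDs here are simply positions in a Vec, and a HashMap plays the role of a secondary index.

```rust
use std::collections::HashMap;

// Records refer to each other by ID, never by pointer.
type CustomerId = u32;
type OrderId = u32;

#[derive(Debug, Clone, PartialEq)]
struct Customer {
    name: String,
}

#[derive(Debug, Clone, PartialEq)]
struct Order {
    customer: CustomerId,
    amount_cents: u64,
}

#[derive(Default)]
struct DataStore {
    customers: Vec<Customer>, // primary storage, CustomerId = position
    orders: Vec<Order>,       // primary storage, OrderId = position
    orders_by_customer: HashMap<CustomerId, Vec<OrderId>>, // secondary index
}

impl DataStore {
    fn add_customer(&mut self, name: &str) -> CustomerId {
        self.customers.push(Customer { name: name.to_string() });
        (self.customers.len() - 1) as CustomerId
    }

    /// Returns None if the customer ID is unknown.
    fn add_order(&mut self, customer: CustomerId, amount_cents: u64) -> Option<OrderId> {
        self.customers.get(customer as usize)?;
        self.orders.push(Order { customer, amount_cents });
        let id = (self.orders.len() - 1) as OrderId;
        // Keep the index in sync with the primary storage.
        self.orders_by_customer.entry(customer).or_default().push(id);
        Some(id)
    }

    /// Looks orders up through the index instead of scanning all orders.
    fn orders_of(&self, customer: CustomerId) -> Vec<&Order> {
        match self.orders_by_customer.get(&customer) {
            Some(ids) => ids.iter().map(|&id| &self.orders[id as usize]).collect(),
            None => Vec::new(),
        }
    }
}

fn main() {
    let mut store = DataStore::default();
    let alice = store.add_customer("Alice");
    store.add_order(alice, 1_500);
    store.add_order(alice, 250);
    println!("{} has {:?}", store.customers[alice as usize].name, store.orders_of(alice));
}
```

Because records only hold IDs, there are no object graphs to untangle: any relationship is a lookup in the store.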
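To make the actor idea concrete, here is a small sketch built on the standard library's std::sync::mpsc channels and threads. The GameMsg enum and the function names are hypothetical; a real system might use an actor framework or a thread pool instead.

```rust
use std::sync::mpsc;
use std::thread;

// Messages understood by a hypothetical "game state" actor.
enum GameMsg {
    AddScore(u32),
    // Carries a channel on which the actor sends its reply.
    GetScore(mpsc::Sender<u32>),
    Stop,
}

// The actor owns its state; nothing else can touch `score` directly.
fn run_game_state(inbox: mpsc::Receiver<GameMsg>) {
    let mut score = 0u32;
    for msg in inbox {
        match msg {
            GameMsg::AddScore(points) => score += points,
            GameMsg::GetScore(reply) => {
                // Ignore the error if the requester has already gone away.
                let _ = reply.send(score);
            }
            GameMsg::Stop => break,
        }
    }
}

// Starts the actor on its own thread and returns its mailbox.
fn spawn_game_state() -> (mpsc::Sender<GameMsg>, thread::JoinHandle<()>) {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || run_game_state(rx));
    (tx, handle)
}

fn main() {
    let (game, handle) = spawn_game_state();
    game.send(GameMsg::AddScore(10)).unwrap();
    game.send(GameMsg::AddScore(5)).unwrap();

    let (reply_tx, reply_rx) = mpsc::channel();
    game.send(GameMsg::GetScore(reply_tx)).unwrap();
    println!("score = {}", reply_rx.recv().unwrap());

    game.send(GameMsg::Stop).unwrap();
    handle.join().unwrap();
}
```

Messages from a single sender arrive in order, so the reply reflects both earlier AddScore messages.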

Such an architecture gives me convenient testing points: DataStores can have different implementations via polymorphism, and actors that communicate through messages can be instantiated separately and driven with test sequences of messages.
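One possible shape for the polymorphic part, assuming a hypothetical DataStore trait with two made-up operations: business logic depends only on the trait, so a test can hand it an in-memory implementation.

```rust
use std::collections::HashMap;

// Hypothetical storage interface. A production implementation might sit on
// top of a real database; tests use the in-memory version below.
trait DataStore {
    fn put_balance(&mut self, customer_id: u32, cents: i64);
    fn balance(&self, customer_id: u32) -> Option<i64>;
}

#[derive(Default)]
struct InMemoryStore {
    balances: HashMap<u32, i64>,
}

impl DataStore for InMemoryStore {
    fn put_balance(&mut self, customer_id: u32, cents: i64) {
        self.balances.insert(customer_id, cents);
    }

    fn balance(&self, customer_id: u32) -> Option<i64> {
        self.balances.get(&customer_id).copied()
    }
}

// "Business logic" sees only the trait, so any implementation works.
fn apply_payment(store: &mut dyn DataStore, customer_id: u32, cents: i64) -> Result<i64, String> {
    let current = store
        .balance(customer_id)
        .ok_or_else(|| format!("unknown customer {customer_id}"))?;
    let updated = current + cents;
    store.put_balance(customer_id, updated);
    Ok(updated)
}

fn main() {
    let mut store = InMemoryStore::default();
    store.put_balance(1, 500);
    println!("{:?}", apply_payment(&mut store, 1, 250));
    println!("{:?}", apply_payment(&mut store, 2, 100));
}
```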

The basic idea is this: just because my software works in an area where there are concepts, for example, customers and orders, it will not necessarily have a Customer class and related methods. Quite the contrary: the concept of Customer is just a set of data in tabular form in one or several DataStore , and the “business logic” code directly manipulates this data.

Additional reading


Like so much in software design, criticism of OOP is not an easy topic. Perhaps I did not manage to clearly convey my point of view and/or convince you. But if you're interested, here are some more links:
