Getting acquainted with Python, for comrades who have outgrown "language A vs. language B" and other prejudices



To every Habr reader with a sense of déjà vu: I was prompted to write this post by the article "Introduction to Python" and the comments on it. Unfortunately, the quality of that "introduction" was, ahem... let's not talk about sad things. Sadder still were the squabbles in the comments, of the "C++ is faster than Python", "Rust is even faster than C++", "nobody needs Python" variety. Surprisingly, nobody brought up Ruby!


As Bjarne Stroustrup said,


"There are only two kinds of languages: the ones people complain about and the ones nobody uses."

Welcome under the cut, everyone who would like to get acquainted with Python without descending into dirty swearing!


Morning in the mountains of the Eastern Caucasus began with shouting. Two young men sat on a large boulder and argued fervently, waving their hands. A minute later they began to shove each other, then grappled and fell off the boulder into what turned out to be a nettle bush. That bush evidently grew there for a reason: it immediately calmed the fighters down and brought a truce to their undying argument. As you have probably guessed, I was one of the debaters, the other was my best friend (hello, Quaker_t!), and the subject of our small talk was Visual Basic vs. Delphi!


Do you recognize yourself? Sometimes we turn our favorite programming language into a cult and are ready to defend it to the last! But the years go by, and the moment comes when "A vs. B" turns from a subject of dispute into "I am more comfortable working with A, but if necessary I will learn to work with B, C, D, E, and with anything at all". It's just that when we encounter a new programming language, old habits and culture may not let go of us for a long time.


I would like to introduce you to Python and help you carry your experience over to a new field. Like any technology, it has its strengths and weaknesses. Python, like C++, Rust, Ruby, JS, and all the rest, is a tool. Every tool comes with instructions, and you have to learn to use every tool correctly.


"Author, stop messing with our heads; weren't you going to introduce us to Python?" Let's get acquainted!


Python is a dynamic, high-level, general-purpose programming language. It is a mature language with a rich ecosystem and traditions. Although the language first saw the light of day in 1991, its modern look began to take shape in the early 2000s. Python comes with batteries included: its standard library has solutions for many occasions. Python is a popular programming language: Dropbox, Reddit, Instagram, Disqus, YouTube, Netflix, heck, even Eve Online and many others actively use it.


What is the reason for such popularity? With your permission, I will state my own version.


Python is a simple programming language. Dynamic typing. A garbage collector. Higher-order functions. Simple syntax for working with dictionaries, sets, tuples, and lists (including slicing). Python is great for beginners: it lets you start with procedural programming, move gradually to OOP, and get a taste of functional programming. But this simplicity is only the tip of the iceberg. Dive a little deeper and you come across Python's philosophy, the Zen of Python. Dive deeper still and you find yourself in a set of clear rules on how code should be laid out, the Style Guide for Python Code. As they descend, programmers gradually grasp the notion of the "Python way", or "Pythonic". At this wonderful stage of learning the language, you begin to understand why good Python programs are written this way and not otherwise, and why the language evolved in this direction and not another. Python did not excel at execution speed. But it excelled at the most important aspect of our work: readability. "Write code for people, not for the machine" is the foundation of foundations in Python.
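A small sketch of the syntax mentioned above may help here. All of the data and variable names are made up purely for illustration.

```python
# Illustrative data only; none of these names come from a real project.
languages = ['Python', 'Ruby', 'Rust', 'C++', 'Java']      # list
released = {'Python': 1991, 'Ruby': 1995, 'Java': 1995}    # dictionary
compiled = {'Rust', 'C++', 'Java'}                         # set
point = (10, 20)                                           # tuple

first_two = languages[:2]        # slice: the first two items
reversed_copy = languages[::-1]  # a negative step yields a reversed copy

# A higher-order function: sorted() takes another function as its `key`.
by_length = sorted(languages, key=len)

# Tuple unpacking and a membership test on a set.
x, y = point
is_compiled = 'Python' in compiled

print(first_two)           # ['Python', 'Ruby']
print(by_length)           # ['C++', 'Ruby', 'Rust', 'Java', 'Python']
print(x + y, is_compiled)  # 30 False
```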


Good Python code looks beautiful. And isn't writing beautiful code a pleasant activity?


Tip 0: Before reading on, please take a quick look at the Zen of Python. The language is built on these postulates, and our conversation will be much more pleasant if you know them too.


Which smart guy came up with indentation?


The first shock for those who have never seen Python code is that the body of a statement is marked by indentation:


  def main():
      ins = input('Please say something')

      for w in ins.split(' '):
          if w == 'hello':
              print('world!')

I remember evenings in the St. Petersburg Polytechnic dorm, when my roommate, VlK, would tell me with burning eyes what new thing he had dug up in Python. "Statement bodies marked by indentation? Seriously?" was my reaction. Indeed, to someone who had gone from Visual Basic (if ... end if) to C# (braces) by way of C, C++, and Java, this approach seemed strange, to say the least. "Do you indent your code?" asked VlK. Of course I did. More precisely, Visual Studio did it for me, and did it damn well. I never thought about formatting and indentation: they appeared in the code by themselves and seemed ordinary and familiar. But I had no comeback: the code was always indented. "Then why do you need braces, if the body of a statement is shifted to the right anyway?"


That evening I sat down with Python. Looking back, I can say exactly what helped me absorb the new material quickly: the code editor. Under the influence of the same VlK, shortly before these events I had switched from Windows to Ubuntu and to Emacs as my editor (this was 2007; PyCharm, Atom, VS Code, and the rest were still many years away). "Here comes the Emacs sales pitch..." you will say. Just a little :) Traditionally, the <tab> key in Emacs does not insert a tab character; it aligns the line according to the rules of the current mode. Press <tab>, and the line of code moves to the next suitable position.



So you never have to wonder if you aligned the code correctly.


Tip 1: While getting to know Python, use an editor that takes care of indentation for you.


And do you know the side effect of all this outrage? The programmer tries to avoid long constructs. As soon as a function grows beyond the height of the screen, it becomes harder to tell which construct a given block of code belongs to. And the deeper the nesting, the harder it gets. As a result, you try to write as concisely as possible, breaking up long bodies of functions, loops, conditionals, and so on.


Oh, that dynamic typing of yours


Oh, this debate has been around almost as long as the notion of "programming" itself! Dynamic typing is neither bad nor good. Dynamic typing is also a tool. In Python, dynamic typing gives you great freedom of action. And where there is more freedom of action, there is a greater chance of shooting yourself in the foot.


It is worth clarifying that typing in Python is strong, so adding a number to a string will not work:


  1 + '1'
  >>> TypeError: unsupported operand type(s) for +: 'int' and 'str'

Python also checks the function signature at call time and raises an exception if the call does not match it:


  def sum(x, y):
      return x + y

  sum(10, 20, 30)
  >>> TypeError: sum() takes 2 positional arguments but 3 were given

But when loading a script, Python will not tell you that a function expects a number rather than the string you are passing to it. You will only find out at run time:


  def sum(x, y):
      return x + y

  sum(10, '10')
  >>> TypeError: unsupported operand type(s) for +: 'int' and 'str'

This is all the more of a challenge for the programmer when writing large projects. Modern Python has answered this challenge with annotations and the typing module, and the community has developed programs that perform static type checking. As a result, the programmer learns about such errors before running the program:


  # main.py:
  def sum(x: int, y: int) -> int:
      return x + y

  sum(10, '10')

  $ mypy main.py
  main.py:5: error: Argument 2 to "sum" has incompatible type "str"; expected "int"

Python itself attaches no meaning to annotations, although it stores them in the __annotations__ attribute. The only requirement is that annotations be valid expressions from the language's point of view. Since their introduction in version 3.0 (more than ten years ago!), it is through the community's efforts that annotations came to be used as type hints for variables and arguments.
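To make this concrete, here is a minimal sketch showing that the interpreter stores annotations without enforcing them. The functions greet() and move() are invented for the example.

```python
def greet(name: str, times: int = 1) -> str:
    return ', '.join([f'Hello, {name}'] * times)


# The interpreter stores the annotations but does not enforce them.
print(greet.__annotations__)
# {'name': <class 'str'>, 'times': <class 'int'>, 'return': <class 'str'>}


# Any valid expression will do as an annotation, not only a type.
def move(distance: 'in meters', speed: 2 + 2 = 4):
    return distance / speed


print(move.__annotations__)  # {'distance': 'in meters', 'speed': 4}

# Passing a "wrong" type is not an error as far as the interpreter is concerned.
print(greet(42))  # Hello, 42
```

A static checker such as mypy would flag greet(42); the interpreter happily runs it.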


Another, more complicated example.
  # For those who are deep in the subject, a reminder: this is just an example :)

  from typing import TypeVar, Iterable

  Num = TypeVar('Num', int, float)

  def sum(items: Iterable[Num]) -> Num:
      accum = 0
      for item in items:
          accum += item
      return accum

  sum([1, 2, 3])
  >>> 6

Tip 2: In practice, dynamic typing causes the most trouble when reading and debugging code, especially if the code was written without annotations and you have to spend a lot of time working out the types of variables. You need not annotate and document the types of absolutely everything, but time spent on a detailed description of public interfaces and the most critical parts of the code will pay off handsomely!


Quack! Duck typing


Sometimes Python connoisseurs put on a mysterious air and talk about "duck typing".
Duck typing is the application of the "duck test" to programming:


If an object quacks like a duck, flies like a duck and walks like a duck, then most likely it’s a duck.

Consider an example:


  class RpgCharacter:
      def __init__(self, weapon):
          self.weapon = weapon

      def battle(self):
          self.weapon.attack()

This is classic dependency injection. The RpgCharacter class receives a weapon object in its constructor and later, in the battle() method, calls weapon.attack(). But RpgCharacter does not depend on a specific implementation of weapon. It can be a sword, a BFG 9000, or a whale with a flower pot, ready to land on the enemy's head at any moment. What matters is that the object has an attack() method; Python is not interested in anything else.



Strictly speaking, duck typing is not unique to Python. It is present in every dynamic language (known to me) that implements OOP.
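Here is a small sketch of the duck test in action. The Sword, Whale, and Pillow classes are hypothetical, and unlike the version above, battle() returns the result of the attack so that it can be printed.

```python
class RpgCharacter:
    def __init__(self, weapon):
        self.weapon = weapon

    def battle(self):
        return self.weapon.attack()


# Hypothetical weapons: they share no base class, only a method name.
class Sword:
    def attack(self):
        return 'slash'


class Whale:
    def attack(self):
        return 'falls on the enemy, together with a flower pot'


class Pillow:
    """Has no attack() method, so it does not "quack" like a weapon."""


for weapon in (Sword(), Whale()):
    print(RpgCharacter(weapon).battle())

try:
    RpgCharacter(Pillow()).battle()
except AttributeError as e:
    # The mismatch is discovered only at the moment of the call.
    print(e)  # 'Pillow' object has no attribute 'attack'
```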


This is another example of how careful you have to be when programming in the world of dynamic typing. A poorly named method? An ambiguously named variable? Your colleague, or you yourself half a year later, will be "delighted" to untangle such code :)


What would we do if we were using, say, Java?
  interface IWeapon {
      void attack();
  }

  public class Sword implements IWeapon {
      public void attack() {
          // ...
      }
  }

  public class RpgCharacter {
      IWeapon weapon;

      public RpgCharacter(IWeapon weapon) {
          this.weapon = weapon;
      }

      public void battle() {
          weapon.attack();
      }
  }

And we would get classic static typing, with type checking at compile time. The price: you cannot use an object that has an attack() method but does not explicitly implement the IWeapon interface.


Tip 3: If you wish, you can describe an interface by building your own abstract class. Better yet, take the time to test thoroughly and to write documentation for yourself and for the users of your code.
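One way to describe such an interface is the standard abc module. The sketch below is only an illustration: Weapon, Sword, and BrokenSword are hypothetical names, not part of any library.

```python
from abc import ABC, abstractmethod


class Weapon(ABC):
    """A hypothetical interface: anything a character can fight with."""

    @abstractmethod
    def attack(self) -> str:
        """Perform an attack and describe it."""


class Sword(Weapon):
    def attack(self) -> str:
        return 'slash'


class BrokenSword(Weapon):
    """Forgets to implement attack()."""


print(Sword().attack())  # slash

try:
    BrokenSword()
except TypeError as e:
    # A class with unimplemented abstract methods cannot be instantiated.
    print(type(e).__name__)  # TypeError
```

Note that the check happens when the object is created, still at run time. For structural ("duck") checks by a static analyzer, typing.Protocol (Python 3.8+) is an alternative.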


The procedural approach and __special_methods__()


Python is an object-oriented language, and at the root of the inheritance hierarchy is the object class:


  isinstance('abc', object)
  >>> True

  isinstance(10, object)
  >>> True

But where Java and C# use obj.ToString(), Python calls str(obj). Or, for example, instead of myList.length, Python has len(my_list). The creator of the language, Guido van Rossum, explained it this way:


When I read code that says len(x), I know that the length of something is being requested. This immediately tells me that the result will be an integer and the argument some kind of container. Conversely, when reading x.len(), I need to know already that x is some kind of container that implements a specific interface or inherits from a class that has a len() method. [Source].

However, under the hood len(), str(), and some other functions call methods defined on the object:


  class User:
      def __init__(self, name, last_name):
          self.name = name
          self.last_name = last_name

      def __str__(self):
          return f"Honorable {self.name} {self.last_name}"

  u = User('Alex', 'Black')
  label = str(u)
  print(label)
  >>> Honorable Alex Black

Special methods are also used by the language's operators, both arithmetic and Boolean, as well as by the for ... in ... loop, the with context statement, the [] index operator, and so on.
For example, the iterator protocol consists of two methods, __iter__() and __next__():


  # No Iterable, IEnumerable, std::iterator, etc.
  class InfinitePositiveIntegers:
      def __init__(self):
          self.counter = 0

      def __iter__(self):
          """Returns the object that will be iterated over.

          Called by the built-in iter() function.
          """
          return self

      def __next__(self):
          """Returns the next element of the iteration.

          Called by the built-in next() function.
          """
          self.counter += 1
          return self.counter

  for i in InfinitePositiveIntegers():
      print(i)
  >>> 1
  >>> 2
  >>> ...
  # to stop, press Ctrl+C
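Iteration is only one of the protocols. As a rough sketch, the hypothetical Playlist class below (invented for this example) hooks into len(), indexing, the in and + operators, and the with statement through special methods.

```python
class Playlist:
    """A hypothetical container used only to illustrate special methods."""

    def __init__(self, *tracks):
        self.tracks = list(tracks)

    def __len__(self):              # len(playlist)
        return len(self.tracks)

    def __getitem__(self, index):   # playlist[0], playlist[1:]
        return self.tracks[index]

    def __contains__(self, track):  # 'ABBA' in playlist
        return track in self.tracks

    def __add__(self, other):       # playlist + other_playlist
        return Playlist(*self.tracks, *other.tracks)

    def __enter__(self):            # with playlist as p: ...
        return self

    def __exit__(self, exc_type, exc, tb):
        self.tracks.clear()         # "release the resource" on exit
        return False                # do not swallow exceptions


rock = Playlist('Kino', 'DDT')
pop = Playlist('ABBA')
both = rock + pop

print(len(both))        # 3
print(both[0])          # Kino
print('ABBA' in both)   # True

with both as p:
    print(p[1:])        # ['DDT', 'ABBA']
print(len(both))        # 0
```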

Well, special methods, fine. But why do they look so peculiar? Guido explained that if they had ordinary names without underscores, programmers would sooner or later redefine them without meaning to. In other words, __method__() is a kind of foolproofing. As time has shown, the protection works :)


Tip 4: Study the built-in functions and the special methods of objects carefully. They are an integral part of the language, and without them it is impossible to speak it fluently.


Where is the encapsulation? Where is my private?! Where is my fairy tale?!!


Python has no access modifiers for class attributes. The insides of objects are open for access without any restrictions. However, there is a convention whereby attributes with the _ prefix are considered private, for example:


  import os

  class MyFile:
      # The field is considered private
      _os_handle = None

      def __init__(self, path: str):
          self._open(path)

      # The method is considered private
      def _open(self, path):
          # os.open() is a *low-level* function for opening files.
          # In practice, the built-in open() function is used instead,
          # but os.open() is perfect for an example.
          self._os_handle = os.open(path, os.O_RDWR | os.O_CREAT)

      # And this method is considered public
      def close(self):
          if self._os_handle is not None:
              os.close(self._os_handle)

  f = MyFile('/tmp/file.txt')
  print(f._os_handle)  # no problem accessing the "private" field!
  f.close()

Why?


There is nothing private in Python. Neither a class nor its instance will hide from you what lies inside (which is what makes the deepest introspection possible). Python trusts you. It seems to say: "Buddy, if you want to rummage around in dark corners, no problem. I trust that you have good reasons, and I hope you won't break anything."

After all, we are all adults here.

  - Karl Fast [Source].

And how do you avoid name collisions during inheritance?

Python has a special name mangling mechanism for attributes that start with a double underscore and do not end with one (__my_attr)! It exists to avoid name collisions during inheritance. For access from outside the bodies of the class's methods, Python prepends _ClassName to the attribute name (giving _ClassName__my_attr). For access from inside, nothing changes:


  class C:
      def __init__(self):
          self.__x = 10

      def get_x(self):
          return self.__x

  c = C()
  c.__x
  >>> AttributeError: 'C' object has no attribute '__x'

  print(c.get_x())
  >>> 10

  print(c._C__x)
  >>> 10

Let's look at a practical application. Say we want to add caching to the File class, which reads files from the local file system. A colleague has already written a mixin class for this purpose. To shield its methods and attributes from potential collisions, the colleague prefixed their names with __:


  class BaseFile:
      def __init__(self, path):
          self.path = path

  class LocalMixin:
      def read_from_local(self):
          with open(self.path) as f:
              return f.read()

  class CachedMixin:
      class CacheMissError(Exception):
          pass

      def __init__(self):
          # Now, even if the next class in the inheritance chain
          # has a __cache attribute or a __from_cache() method,
          # there will be no collision, or rather, no override!
          self.__cache = {}

      def __from_cache(self):
          return self.__cache[self.path]

      def read_from_cache(self):
          try:
              return self.__from_cache()
          except KeyError as e:
              raise self.CacheMissError() from e

      def store_to_cache(self, data):
          self.__cache[self.path] = data

  class File(CachedMixin, LocalMixin, BaseFile):
      def __init__(self, path):
          CachedMixin.__init__(self)
          BaseFile.__init__(self, path)

      def read(self):
          try:
              return self.read_from_cache()
          except CachedMixin.CacheMissError:
              data = self.read_from_local()
              self.store_to_cache(data)
              return data

If you are curious about how this mechanism is implemented in CPython, have a look at Python/compile.c.


Finally, because the language has properties, there is no point in writing Java-style getters and setters: getX(), setX(). Suppose that in an originally written Coordinates class,


  class Coordinates:
      def __init__(self, x, y):
          self.x = x
          self.y = y

  c = Coordinates(10, 10)
  print(c.x, c.y)
  >>> 10 10

it became necessary to control access to the x attribute. The right approach is to replace it with a property, thereby keeping the contract with the outside world intact.


  class Coordinates:
      _x = 0

      def __init__(self, x, y):
          self.x = x
          self.y = y

      @property
      def x(self):
          return self._x

      @x.setter
      def x(self, val):
          if val > 10:
              self._x = val
          else:
              raise ValueError('x should be greater than 10')

  c = Coordinates(20, 10)
  c.x = 5
  >>> ValueError: x should be greater than 10

Tip 5: Like so much in Python, the notion of private fields and methods rests on an established convention. Do not hold a grudge against library authors if "everything stopped working" because you were actively using the private fields of their classes. After all, we are all adults here :)


A bit about exceptions


Python culture has a peculiar approach to exceptions. In addition to the usual catching and handling à la C++/Java, you will come across exceptions used in the spirit of


"It's easier to ask for forgiveness than permission" (EAFP).

To paraphrase: don't write a superfluous if when execution will follow that branch in most cases anyway. Wrap the logic in try..except instead.


Example: imagine a POST request handler that creates a user in some database. The function receives a dictionary of key-value pairs as input:


  from typing import Dict

  def create_user_handler(data: Dict[str, str]):
      try:
          database.user.persist(
              username=data['username'],
              password=data['password']
          )
      except KeyError:
          print('A field is missing from the user creation data')

We did not pollute the code with checks for whether username or password is present in data. We expect that they most likely will be. We do not ask "permission" to use these fields; we "ask for forgiveness" when yet another script kiddie posts a form with missing data.


Just don't take it to absurdity!

For example, you want to check whether the user's last name is in the data and, if it is absent, set it to an empty value. An if is far more appropriate here:


  def create_user_handler(data):
      if 'last_name' not in data:
          data['last_name'] = ''

      try:
          database.user.persist(
              username=data['username'],
              password=data['password'],
              last_name=data['last_name']
          )
      except KeyError:
          print('A field is missing from the user creation data')
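For optional fields, dict.get() with a default value is a common alternative to an explicit if. The sketch below uses a hypothetical normalize_user() helper instead of the database call, so that it can run on its own.

```python
def normalize_user(data):
    """Return a new dict with the optional last_name filled in.

    A hypothetical helper for illustration; it is not part of the handler above.
    """
    return {
        'username': data['username'],            # required: KeyError if missing
        'password': data['password'],            # required: KeyError if missing
        'last_name': data.get('last_name', ''),  # optional: defaults to ''
    }


print(normalize_user({'username': 'alex', 'password': 'secret'}))
# {'username': 'alex', 'password': 'secret', 'last_name': ''}

try:
    normalize_user({'username': 'alex'})
except KeyError as e:
    print(f'Missing required field: {e}')  # Missing required field: 'password'
```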

Errors should never pass silently: do not silence exceptions! Modern Python has the wonderful raise from construct, which lets you preserve the context of the exception chain. For example:


  class MyProductError(Exception):
      def __init__(self):
          super().__init__('There has been a terrible product error')

  def calculate(x):
      try:
          return 10 / x
      except ZeroDivisionError as e:
          raise MyProductError() from e

Without raise from e, the original exception is kept only as implicit context (the __context__ attribute), and the link between cause and effect is not stated explicitly. With raise from X, the cause (i.e. X) of the raised exception is stored in the __cause__ attribute:


  try:
      calculate(0)
  except MyProductError as e:
      print(e.__cause__)

  >>> division by zero

But there is a small nuance when it comes to iteration: StopIteration

In iteration, raising the StopIteration exception is the official way to signal that an iterator is exhausted.


  class PositiveIntegers:
      def __init__(self, limit):
          self.counter = 0
          self.limit = limit

      def __iter__(self):
          return self

      def __next__(self):
          self.counter += 1

          if self.counter == self.limit:
              # no hasNext() or moveNext(),
              # only exceptions, only hardcore
              raise StopIteration()

          return self.counter

  for i in PositiveIntegers(5):
      print(i)
  >>> 1
  >>> 2
  >>> 3
  >>> 4

Tip 6: We only pay for exception handling in exceptional situations. Do not neglect them!


There should be one-- and preferably only one --obvious way to do it.


switch or pattern matching? Use if and dictionaries. do loops? That's what while and for are for. goto? I think you have guessed already. The same goes for some techniques and design patterns that are taken for granted in other languages. The most amazing thing is that there are no technical obstacles to implementing them; it is simply "not how things are done here".
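As a sketch of the "if and dictionaries" idea, a dictionary of functions can play the role of a switch table. All names below (HANDLERS, dispatch(), and the handlers) are hypothetical.

```python
def handle_get(path):
    return f'reading {path}'


def handle_post(path):
    return f'creating {path}'


def handle_unknown(path):
    return f'cannot handle {path}'


# The dictionary plays the role of the `switch` table.
HANDLERS = {
    'GET': handle_get,
    'POST': handle_post,
}


def dispatch(method, path):
    # .get() with a default replaces the `default:` branch.
    handler = HANDLERS.get(method, handle_unknown)
    return handler(path)


print(dispatch('GET', '/users'))     # reading /users
print(dispatch('DELETE', '/users'))  # cannot handle /users
```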


For example, you will not often see the Builder pattern in Python. Instead, people use the ability to pass arguments by name and to require explicitly that they be passed by name. Instead of


  human = HumanBuilder.withName("Alex").withLastName("Black").ofAge(20).withHobbies(['tennis', 'programming']).build()

you write


  human = Human(
      name="Alex",
      last_name="Black",
      age=20,
      hobbies=['tennis', 'programming']
  )
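How might Human "explicitly request" named arguments? One option is a bare * in the signature, which makes every parameter after it keyword-only. The class body below is an assumption made for illustration.

```python
class Human:
    # The bare `*` makes every following parameter keyword-only.
    def __init__(self, *, name, last_name, age=None, hobbies=()):
        self.name = name
        self.last_name = last_name
        self.age = age
        self.hobbies = list(hobbies)


human = Human(name='Alex', last_name='Black', age=20, hobbies=['tennis'])
print(human.name, human.age)  # Alex 20

try:
    Human('Alex', 'Black')  # positional arguments are rejected
except TypeError as e:
    print(type(e).__name__)  # TypeError
```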

The standard library does not use method chains for working with collections. I remember a colleague who came from the Kotlin world showing me code along these lines (taken from the official Kotlin documentation):


  val shortGreetings = people
      .filter { it.name.length < 10 }
      .map { "Hello, ${it.name}!" }

In Python, map(), filter(), and many others are functions, not collection methods. Rewritten one-to-one, this code becomes:


  short_greetings = map(lambda h: f"Hello, {h.name}", filter(lambda h: len(h.name) < 10, people))

I think it looks awful. So instead of long chains like .takewhile().filter().map().reduce(), it is better to use so-called comprehensions, or good old loops. Incidentally, the Kotlin documentation gives this very example as the counterpart of a list comprehension. In Python it looks like this:


  short_greetings = [
      f"Hello {h.name}"
      for h in people
      if len(h.name) < 10
  ]

For those who miss chains

There are libraries such as Pipe or py_linq !


Method chains are used where they are more effective than the standard tools. For example, in the Django web framework, chains are used to build a database query object:


  query = User.objects \
      .filter(last_visited__gte='2019-05-01') \
      .order_by('username') \
      .values('username', 'last_visited') \
      [:5]

Tip 7: Before doing something that is very familiar from past experience but unusual in Python, ask yourself what an experienced Pythonista would do.


Slow Python


Yes.


Yes, if we are talking about execution speed compared to statically typed and compiled languages.


But you seem to want a detailed answer?


CPython, the reference implementation of Python, is far from the most efficient one. An important reason is the developers' wish not to overcomplicate it. The logic is clear enough: code that is not too clever means fewer bugs, easier changes, and, in the end, more people willing to read, understand, and extend it.


On his blog, Jake VanderPlas examines what happens under the hood in CPython when two variables holding integer values are added:


  a = 1
  b = 2
  c = a + b

Even without venturing deep into the CPython jungle, we can say that the interpreter has to create three objects on the heap for the variables a, b, and c, each storing a type and (pointers to) values; determine the types and values again when performing the addition in order to call something like binary_add<int, int>(a->val, b->val); and write the result to c.
This is terribly inefficient compared to a similar C program.
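You can peek at some of this from Python itself using the standard dis and sys modules. The add() function is just an example, and the exact sizes and opcode names depend on the CPython version and build, so no output is shown for them.

```python
import dis
import sys


def add(a, b):
    return a + b


# In CPython every int is a full-blown heap object with a type and a
# reference count, so it is larger than a 4- or 8-byte C integer.
print(type(1), isinstance(1, object))
print(sys.getsizeof(1), 'bytes')  # the exact number depends on the build

# The `+` compiles to a generic binary instruction; the operand types
# are looked up only at run time. Opcode names vary between versions.
for instruction in dis.get_instructions(add):
    print(instruction.opname, instruction.argrepr)
```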


Another problem with CPython is the Global Interpreter Lock (GIL). This mechanism, essentially a boolean flag guarded by a mutex, is used to synchronize the execution of bytecode. The GIL simplifies the development of code that runs in a multi-threaded environment: CPython does not have to worry about synchronizing access to its variables or about deadlocks. The price is that only one thread at a time gets access to the bytecode and executes it.



If you are wondering what attempts are being made to eradicate the GIL

I recommend reading the article by Anthony Shaw " Has the GIL been slain? ".


What are the ways out?


  1. Python interacts well with native libraries. In the simplest case (CFFI), you describe the source and signature of a function in Python and call it from a dynamic library. For full-fledged work with the interpreter and its environment, Python provides an API for writing extensions in C/C++. And if you search a bit, you can find extensions implemented in Rust, Go, and even Kotlin Native!
  2. Use an alternative implementation of Python, for example:
    • PyPy, with a built-in JIT compiler. The speed gain will be smaller than with a native extension, but perhaps in your particular case more will not be needed?
    • Cython, a transpiler and compiler of a superset of Python into C code.
    • IronPython, an implementation running on top of the .NET framework.

Tip 8: If execution speed is known to matter to you up front, it is more effective to use Python as the glue between native components than to try to cram in what will not fit. And if you are working on an application where I/O (network, database, file system) is the bottleneck, then by the time Python's speed stops satisfying you, you will know exactly how to solve the problem :)


Basic Tools


How do the first steps in Python begin? If you have Linux or macOS at hand, then in 95% of cases Python is installed out of the box. If you live on the cutting edge of progress, it is most likely version 3.x and not the obsolete 2.7. For comrades on Windows, things are a bit more complicated. There are several options: Docker, Windows Subsystem for Linux, Cygwin, and, finally, the official Python installer for Windows.


Tip 9: If possible, use the latest version of Python. The language keeps evolving; every release fixes bugs and always brings something new and useful.


Have you already written "Hello world", and does it work? Excellent! In a couple of days you will take up machine learning and will need some library from the Python Package Index (PyPI).


To avoid version conflicts when installing packages, Python uses so-called virtual environments. They let you partially isolate an environment by creating a directory in which installed packages are placed. The directory also contains shell scripts for managing the environment, and the pip package installer is included as well. With a virtual environment activated, pip installs packages into it. All of this is tied together by utilities such as pipenv or poetry, the analogs of npm, bundler, cargo, and so on.


Tip 0xA: Your main helpers in dependency management are pip and virtualenv. Everything else is convenient, pretty, high-level wrappers. After all, all that you and Python need is the right sys.path, the list of directories in which modules are searched for on import.
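A quick way to see what the interpreter is working with is to print sys.path. The virtual environment check below is a common idiom that assumes the environment was created with the standard venv module or a recent virtualenv.

```python
import sys

# sys.path is an ordinary list of directories searched on import.
for directory in sys.path:
    print(directory)

# Inside an activated virtual environment, sys.prefix points at the
# environment directory while sys.base_prefix points at the base install.
in_virtualenv = sys.prefix != sys.base_prefix
print('Running inside a virtual environment:', in_virtualenv)
```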


What's next?


Have you read the official tutorial? Then take a look at the tutorials for the tools mentioned above. And, as in the famous copypasta:


Tomorrow we look for a book called Dive into Python on the Internet...

I am sure you have plenty of ideas and are itching to start a new project in Python. After all, rarely does a day go by on Habr without an article about using Python somewhere it, seemingly, did not belong at all :)


Dare, comrades!
