Why 2D vector graphics are much more complicated than 3D

Recently there has been a burst of fantastic research into 2D rendering. Petr Kobalicek and Fabian Yzerman are working on Blend2D: one of the fastest and most accurate CPU rasterizers on the market, built around an innovative JIT approach. Patrick Walton of Mozilla has explored not one but three different approaches in Pathfinder, culminating in Pathfinder v3. Raph Levien has built a compute-based pipeline using the technology described in Gan et al.'s 2014 research paper on vector textures. Signed distance fields seem to be advancing further too, in work by Adam Simmons and Sarah Frisken.

Some may ask: why all the fuss about 2D? It can't possibly be harder than 3D, right? 3D is a whole extra dimension! Over there we have real-time ray tracing with accurate lighting on the horizon, and you're telling me we still haven't mastered humble 2D graphics with solid colors?

For those not well versed in the details of a modern GPU, this is indeed very surprising! But 2D graphics has many unique constraints that make it extremely difficult, and on top of that it is hard to parallelize. Let's walk through the history that brought us here.

The rise of PostScript

In the beginning was the plotter. The first graphics device that could interact with a computer was the plotter: one or more pens that move around on paper. Everything works with a "pen down" command, then the drawing head moves in some fashion, possibly along a curve, and then a "pen up" command. HP, the manufacturer of some of the earliest plotters, used a BASIC variant called AGL on the controlling computer, which then sent commands to the plotter in another language, such as HP-GL. In the 1970s, graphics terminals became cheaper and more popular, starting with the Tektronix 4010. It displayed an image on a CRT, but don't be fooled: this is not a pixel display. Tektronix came out of the analog oscilloscope industry, and these machines worked by steering the electron beam along specific paths. Thus, the Tektronix 4010 had no pixel output. Instead, you sent it commands in a simple graphics mode that could draw lines, but again in "pen down", "pen up" fashion.

As in so many other areas, everything was changed by an invention at Xerox PARC. Researchers there began developing a new kind of printer, one more computationally expressive than a plotter. This new printer ran a small, stack-based, Turing-complete programming language similar to Forth, and they named it... Interpress! Xerox, of course, could not find a worthy use for it, so the inventors jumped ship and founded a small startup called Adobe. They took Interpress with them, and as they fixed and improved it, it changed beyond recognition, so they gave it a different name: PostScript. Besides the delightful Turing-complete stack language, the fourth chapter of the original PostScript Language Reference describes the Imaging Model, which is almost identical to modern programming interfaces. Example 4.1 from the manual contains sample code that can be translated almost line by line to HTML5 <canvas>.

 /box {                 % function box() {
     newpath            %     ctx.beginPath();
     0 0 moveto         %     ctx.moveTo(0, 0);
     0 1 lineto         %     ctx.lineTo(0, 1);
     1 1 lineto         %     ctx.lineTo(1, 1);
     1 0 lineto         %     ctx.lineTo(1, 0);
     closepath          %     ctx.closePath();
 } def                  % }
 gsave                  % ctx.save();
 72 72 scale            % ctx.scale(72, 72);
 box fill               % box(); ctx.fill();
 2 2 translate          % ctx.translate(2, 2);
 box fill               % box(); ctx.fill();
 grestore               % ctx.restore();

This is no coincidence.

Apple's Steve Jobs met the Interpress engineers during his visit to PARC. Jobs thought the printing business could be profitable and tried to buy Adobe in its infancy. Adobe instead made a counteroffer and eventually sold Apple a five-year PostScript license. The third pillar of Jobs's plan was funding a small startup, Aldus, which made a WYSIWYG application for creating PostScript documents, called PageMaker. In early 1985, Apple released the first PostScript-compatible printer, the Apple LaserWriter. The combination of Macintosh, PageMaker, and LaserWriter instantly turned the printing industry on its head, and the new "desktop publishing" craze cemented PostScript's place in history. Chief competitor Hewlett-Packard eventually bought a PostScript license as well, in 1991, after consumer pressure, for its competing LaserJet series of printers.

PostScript slowly turned from a printer control language into a file format. Clever programmers studied the form in which PostScript commands were sent to the printer and began to create PostScript documents by hand, adding charts, graphs, and drawings to their documents, and using PostScript to display graphics on screen. There was demand for graphics outside the printer! Adobe noticed and quickly released Encapsulated PostScript, which was little more than a few specially formatted PostScript comments carrying metadata about the image size, plus restrictions on the use of printer commands such as "page feed". That same year, 1985, Adobe began developing Illustrator, an application that let artists work in Encapsulated PostScript through a convenient UI. These files could then be placed into a word processor, which produced... PostScript documents, to be sent to PostScript printers. The whole world switched to PostScript, and Adobe could not have been happier. When Microsoft was working on Windows 1.0 and wanted its own graphics API for developers, a key goal was compatibility with existing printers, so that graphics could be sent to a printer as easily as to the screen. That API was eventually released as GDI, a core component used by every engineer during the explosive growth of Windows in the '90s. Generations of programmers on the Windows platform came to unknowingly equate 2D vector graphics with the PostScript imaging model, granting it de facto status.

The one serious problem with PostScript was its Turing completeness: viewing page 86 of a document means first running the script for pages 1-85. And that can be slow. Adobe heard this complaint from users and decided to create a new document format without these restrictions, called the "Portable Document Format", or "PDF" for short. The programming language was thrown out, but the graphics technology remained the same. To quote the PDF specification, chapter 2.1, "Imaging Model":

At the heart of PDF is its ability to describe the appearance of sophisticated graphics and typography. This is achieved through the use of the Adobe imaging model, the same high-level, device-independent representation used in the PostScript page description language.

When the W3C consortium was evaluating candidates for 2D markup on the web, Adobe championed PGML, an XML-based format whose foundation was the PostScript graphics model:

PGML should include the PDF/PostScript imaging model to ensure scalable 2D graphics that meet the needs of both casual users and graphics professionals.

Microsoft's competing VML was based on GDI, which, as we know, is based on PostScript. The two rival proposals, both essentially PostScript at heart, were merged, and the W3C adopted the "Scalable Vector Graphics" (SVG) standard we know and love today.

Old as it may be, let's not pretend that the innovations PostScript brought into the world are anything less than a technological marvel. Apple's LaserWriter had twice the processing power of the Macintosh that drove it, purely to interpret PostScript and rasterize vector paths into dots on paper. That may seem excessive, but if you were already buying a fancy printer with a laser inside, an expensive CPU was not so surprising. In its first incarnation, PostScript introduced a fairly sophisticated imaging model with all the features we now take for granted. But the most powerful, killer feature? Fonts. At the time, fonts were drawn by hand with ruler and protractor and cast onto film for photochemical printing. In 1977, Donald Knuth showed the world what his METAFONT system, introduced together with his TeX typesetting system, was capable of, but it never caught on. It required the user to describe fonts mathematically, with brushes and curves, and most font designers did not want to learn that. And the intricate curves turned to mush at small sizes: the printers of the day did not have enough resolution, so letters smudged and ran into each other. PostScript proposed a new solution: an algorithm for "snapping" outlines onto the coarser grids that printers operated on. This is known as grid fitting. To keep the geometry from distorting too much, fonts were allowed to specify "hints" declaring which parts of the geometry were most important and should be preserved.
Adobe's original business model was to sell this font technology to printer makers and to sell specially remastered fonts with added hints to publishers, which is why Adobe still sells its own versions of Times and Futura. Incidentally, this is possible because fonts, or more formally "typefaces", are one of five things explicitly excluded from US copyright law, having originally been deemed "too plain or utilitarian to be creative works". What is copyrightable instead is the digital program that reproduces the font on screen. So that people could not copy Adobe's fonts and add their own hints, the Type 1 Font format was originally proprietary to Adobe and contained "font encryption" code. Only Adobe's PostScript could interpret Type 1 fonts, and only Adobe implemented the proprietary hinting technology that kept text sharp at small sizes.
Grid fitting, by the way, became so popular that when Microsoft and Apple grew tired of paying Adobe's licensing fees, they invented an alternative method for their alternative font format, TrueType. Instead of declarative "hints", TrueType gives the font author a full Turing-complete stack-based language, so the author can control every aspect of grid fitting (neatly sidestepping Adobe's patents on declarative hints). For many years there was a war between the Adobe Type 1 and TrueType formats, with font designers stuck in the middle, shipping both formats to users. In the end the industry reached a compromise: OpenType. But rather than actually declaring a winner, it simply shoved both specifications into one file format. By then Adobe made its money from Photoshop and Illustrator rather than from selling Type 1 fonts, so it dropped the encryption, polished the format, and contributed its fonts as CFF/Type 2, which live inside OpenType as the cff table. TrueType, in turn, was pasted in as the glyf and related tables. Ugly as it is, OpenType did seem to settle things for users, essentially by fiat: all software simply has to support both kinds of font, because OpenType requires supporting both.
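To make the idea of grid fitting concrete, here is a toy sketch. This is not Adobe's hinting algorithm or TrueType's instruction language, just an illustrative assumption: snap the edges of a vertical stem to whole pixel boundaries while forcing a minimum one-pixel width, so that thin stems neither blur across two pixels nor vanish.

```python
# Illustrative grid-fitting sketch (NOT the actual Type 1 or TrueType
# algorithm): snap a vertical stem's edges to the pixel grid while keeping
# the stem at least one pixel wide, so it stays crisp at small sizes.

def fit_stem(left: float, right: float) -> tuple:
    """Snap stem edges (in pixel units) to integer pixel boundaries."""
    width = max(1, round(right - left))  # preserve a minimum stem width
    snapped_left = round(left)           # snap the left edge to the grid
    return snapped_left, snapped_left + width

# A 0.4px-wide stem at x=3.3 would smudge across two pixels if rasterized
# naively; after fitting it covers exactly one full pixel.
print(fit_stem(3.3, 3.7))    # -> (3, 4)
print(fit_stem(10.6, 12.4))  # -> (11, 13)
```

Real hinting engines are vastly more involved (they coordinate stems against each other and against the baseline), but the core trade of geometric accuracy for on-grid sharpness is the same.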

Of course, one has to ask: if not PostScript, then what would be in its place? There were other contenders. The aforementioned METAFONT did not use filled outlines to define letterforms. Instead Knuth, in his typical fashion, proposed in his paper "Mathematical Typography" a mathematical notion of the "most pleasing curve" for typography. You specify several points, and an algorithm finds the correct "most pleasing" curve through them. You can layer these strokes on top of one another: define some of them as "pens" and then "drag the pens" along some other line. Knuth, a computer scientist at heart, even added recursion. His student John Hobby developed and implemented algorithms for computing the "most pleasing curve", for layering nested strokes, and for rasterizing such curves. For more on METAFONT, curves, and the history of typography in general, I strongly recommend the book, as well as John Hobby's papers.

Fortunately, the renewed interest in 2D research means Knuth's and Hobby's splines have not been entirely forgotten. Abstruse and unconventional as they definitely are, they recently snuck into Apple's iWork suite, where they are now the default spline type.

The rise of the triangle

Without diving too deep into the mathematics, we call approaches like Bezier curves and Hobby splines implicit curves, because they are specified as a mathematical function that generates the curve. They look good at any resolution, which is exactly what you want for scalable 2D images.
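As a minimal illustration of why such curve descriptions are resolution-independent, here is de Casteljau evaluation of a cubic Bezier in plain Python (the helper and point names are my own): since the curve is just a function of a parameter t, you can sample it as finely as any output resolution demands.

```python
# Evaluate a cubic Bezier curve with de Casteljau's algorithm:
# repeated linear interpolation between control points.

def lerp(a, b, t):
    """Linear interpolation between 2D points a and b."""
    return (a[0] + (b[0] - a[0]) * t, a[1] + (b[1] - a[1]) * t)

def cubic_bezier(p0, p1, p2, p3, t):
    """Point on the cubic Bezier (p0..p3) at parameter t in [0, 1]."""
    a, b, c = lerp(p0, p1, t), lerp(p1, p2, t), lerp(p2, p3, t)
    d, e = lerp(a, b, t), lerp(b, c, t)
    return lerp(d, e, t)

# Endpoints are hit exactly; the midpoint of this symmetric arch is (0.5, 0.75).
print(cubic_bezier((0, 0), (0, 1), (1, 1), (1, 0), 0.5))  # -> (0.5, 0.75)
```

Rasterizing the curve at 10x the resolution is simply a matter of evaluating more values of t; no detail is ever lost, which is the whole appeal for glyphs and scalable art.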

2D graphics kept its momentum around these implicit curves, which are all but indispensable for modeling glyphs. The hardware and software to compute these paths in real time were expensive, but the big push for vector graphics came from the printing industry, where most of the existing industrial equipment already cost far more than a laser printer with a fancy CPU.

3D graphics, however, went a completely different way. From the outset, the near-universal approach was polygons, often marked up by hand and entered into the computer manually. Yet this approach was not the only one. The 3D equivalent of an implicit curve is an implicit surface, built from basic geometric primitives such as spheres, cylinders, and cubes. A perfect sphere with infinite resolution can be represented by a simple equation, so in the early days of 3D it was a clear favorite over polygons for geometry. One of the few companies to develop graphics with implicit surfaces was MAGI. Combined with clever artistic use of procedural textures, they won the contract with Disney to design the "light cycles" for the 1982 film Tron. Unfortunately, the approach quickly faded. As CPUs sped up and problems like hidden surface removal were worked out, the number of triangles you could display in a scene grew rapidly, and for complex shapes it was much easier for artists to think in terms of polygons and vertices they could click and drag, rather than combinations of cubes and cylinders.

That is not to say implicit surfaces were never used in the modeling process. Techniques like the Catmull-Clark algorithm became an accepted industry standard by the early '80s, letting artists create smooth, organic shapes from simple geometry. It was not until the early 2000s, though, that the Catmull-Clark algorithm was even defined as an "implicit surface" that could be computed with an equation; before then it was treated as an iterative algorithm: a way to subdivide polygons into ever more polygons.

Triangles took over the world, and content creation tools followed. New generations of developers and designers in video games and film effects were trained exclusively on polygon-mesh modeling programs such as Maya, 3DS Max, and Softimage. When "3D graphics accelerators" (GPUs) arrived on the scene in the mid-1990s, they were designed specifically to accelerate the content that already existed: triangles. Early GPU designs such as the NVIDIA NV1 did have limited hardware support for curves, but it was buggy and quickly dropped from the product line.

This culture largely carries through to what we see today. The dominant 2D imaging model, PostScript, started from a product that had to render curves in real time, while the 3D industry shunned curves as too difficult to work with and instead relied on offline tooling to pre-convert curves into triangles.

The return of implicit surfaces

But why could implicit 2D curves be computed in real time on a printer in the '80s, while the same implicit 3D surfaces were still buggy in the early 2000s? Well, the Catmull-Clark algorithm is far more complicated than a Bezier curve. Bezier curves do exist in 3D, where they are known as B-splines, and they are computable enough, but they have the drawback of constraining how the mesh can be connected. Surfaces like Catmull-Clark and NURBS allow arbitrarily connected meshes, expanding artists' possibilities, but this can produce polynomials of degree greater than four, which in general have no analytic solution. Instead you get approximations based on subdividing polygons, as Pixar's OpenSubdiv does. If anyone ever finds an analytic way to compute the roots of Catmull-Clark surfaces or NURBS, Autodesk will pay handsomely. Compared to all that, triangles look much friendlier: just evaluate three linear plane equations and you have an easy answer.
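Those "three linear equations" can be sketched directly. This is a simplified version of the edge-function test that rasterizers use (winding direction is assumed counter-clockwise, and boundary ties are glossed over): evaluate one signed linear expression per edge, and the point is inside exactly when all three agree.

```python
# Point-in-triangle via three edge functions: no root finding, just
# three signed linear expressions that must all be non-negative.

def edge(ax, ay, bx, by, px, py):
    """Twice the signed area of (a, b, p); >= 0 means p is left of a->b."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def point_in_triangle(a, b, c, p):
    """p is inside the counter-clockwise triangle (a, b, c) iff all edges agree."""
    return (edge(*a, *b, *p) >= 0 and
            edge(*b, *c, *p) >= 0 and
            edge(*c, *a, *p) >= 0)

tri = ((0, 0), (4, 0), (0, 4))          # counter-clockwise triangle
print(point_in_triangle(*tri, (1, 1)))  # -> True
print(point_in_triangle(*tri, (3, 3)))  # -> False
```

This is why a GPU can burn through millions of triangles per frame: each coverage test is a handful of multiply-adds, with no polynomial root-finding in sight.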

... But what if we don't need an exact solution? That is the question graphics developer Iñigo Quilez asked while researching implicit surfaces. The answer? Signed distance fields (SDFs). Instead of giving you the exact intersection point with a surface, they tell you how far away from it you are. Much like the difference between an analytically computed integral and Euler integration, if you know the distance to the nearest object you can "march" through the scene, asking at each point how far away you are and stepping forward by that distance. Such surfaces have breathed new life into the field through the demoscene and communities like Shadertoy. This hack on MAGI's old modeling technique yields incredible results, such as Quilez's Surfer Boy, computed with infinite precision as an implicit surface. There are no algebraic roots to find for Surfer Boy; you just feel your way through the scene.
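A minimal sphere-tracing sketch shows the idea. This is the general technique, not Quilez's actual code, and the scene here is a single hypothetical sphere: each step asks the SDF "how far is the nearest surface?" and advances by exactly that amount, which is always safe.

```python
import math

def sdf_sphere(p, center=(0.0, 0.0, 5.0), radius=1.0):
    """Signed distance from point p to a sphere; negative means inside."""
    return math.dist(p, center) - radius

def raymarch(origin, direction, sdf, max_steps=64, eps=1e-4):
    """March along a (unit) ray until the SDF says we hit a surface."""
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + d * t for o, d in zip(origin, direction))
        dist = sdf(p)
        if dist < eps:
            return t      # hit: distance travelled along the ray
        t += dist         # safe step: nothing is closer than `dist`
    return None           # miss

# A ray marching down +z from the origin hits the unit sphere at z = 4.
t = raymarch((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), sdf_sphere)
print(t)  # -> 4.0
```

Swap `sdf_sphere` for any function that returns a distance, including unions and smooth blends of dozens of primitives, and the marching loop does not change: that composability is what the demoscene runs on.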

The problem, of course, is that building a Surfer Boy takes a genius like Quilez. There are no tools for SDF geometry; all the code is written by hand. Still, given the exciting revival of implicit surfaces and the natural shapes curves give you, there is now a great deal of interest in the technique. MediaMolecule's PS4 game Dreams is a content creation kit built on a combination of implicit surfaces, tearing down and rebuilding most of the traditional graphics pipeline along the way. It is a promising approach, and the tools are intuitive and fun. Oculus Medium and unbound.io have also done good research here. It is definitely a compelling glimpse of what the future of 3D graphics and next-generation tools might look like.

Some of these approaches, however, transfer to 2D less well than you might think. Typical 3D game scenes tend to have elaborate materials and textures but comparatively little geometric detail. That means antialiasing matters less, because silhouettes matter less. Approaches like 4x MSAA may be fine for many games, but for small fonts in solid colors, rather than 16 fixed sample locations you would much rather compute the exact area under the curve for each pixel, which gives you as much resolution as you want.
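The contrast can be sketched in a deliberately simplified setting (my own toy example, not any particular rasterizer's code): the exact area that one straight shape edge, treated as a half-plane, covers inside a unit pixel, versus estimating that coverage from a fixed grid of sample points as MSAA-style approaches do.

```python
# Analytic coverage vs. fixed-sample coverage for the half-plane
# {y < m*x + c} inside the unit pixel [0,1] x [0,1]. Simplifying
# assumption: the edge stays within the pixel's y-range over x in [0,1].

def analytic_coverage(m, c):
    """Exact area below the line inside the pixel (a trapezoid)."""
    return (c + (m + c)) / 2.0   # average of f(0)=c and f(1)=m+c

def sampled_coverage(m, c, n=4):
    """Fraction of an n*n grid of fixed sample points below the line."""
    hits = sum(1 for i in range(n) for j in range(n)
               if (j + 0.5) / n < m * (i + 0.5) / n + c)
    return hits / (n * n)

m, c = 0.3, 0.25                 # edge: y = 0.3*x + 0.25
print(analytic_coverage(m, c))   # -> 0.4 (the true coverage)
print(sampled_coverage(m, c))    # -> 0.375 (quantized to 16 levels)
```

The sampled estimate can only ever take 17 values per pixel, which is what makes small text shimmer, while the analytic answer is continuous. Real 2D rasterizers do the analytic version for full curved outlines, which is considerably harder but the principle is the same.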

Rotating the camera in a 3D game causes effects similar to saccadic suppression, as the brain readjusts to the new view. In many games this helps hide artifacts from post-processing effects such as temporal antialiasing, which Dreams and unbound.io lean on heavily to get good scene performance. In a typical 2D scene, by contrast, we have no such luxury of perspective, and trying to exploit it would make glyphs and shapes boil and shimmer with those artifacts in full view. In 2D the artifacts read differently, and expectations are higher: stability under zooming, panning, and scrolling is essential.

None of these effects is impossible to achieve on a GPU, but they show a radical departure from "3D" content, with different priorities. Ultimately, rendering 2D graphics is hard because it is about shapes — precise letters and symbols — rather than materials and lighting, which are mostly solid colors. As they evolved, graphics accelerators chose not to bother with real-time implicit geometry such as curves, and focused instead on everything that happens inside those curves. Perhaps, if PostScript had not won, we would have a 2D imaging model without Bezier curves as a baseline real-time requirement. Perhaps in such a world better geometric representations would have beaten triangles, content creation tools would center on 3D splines, and GPUs would support curves in hardware, in real time. It is always fun to dream, after all.
