Why Study Abstract Things?

A major trend in modern mathematics is abstraction. Rather than studying the more familiar objects, such as the natural numbers or the real numbers, mathematicians often instead consider general systems satisfying given properties. The idea is that one is concerned less with the concrete form of something than with its structure, its defining properties. Of course, the question one should ask is why one would opt to study these abstract systems over more concrete objects. We will explore this question, and several answers, through a number of examples.

An Analogy for Abstraction

To get a better sense of what abstraction means, let us first discuss a most primitive analogy. Imagine, for example, that you are a sheepherder from long before the advent of number systems as we know them. For this role, you would need to keep track of your sheep, so you go through your herd and count: one sheep, two sheep, three sheep, and so on. (Since we have not yet defined or discovered numbers at this point, you should think instead of the concrete physical state of there being however many sheep.) Next, imagine you are some other person, who for some reason needs to count chickens—one chicken, two chickens, etc.—or perhaps trees, or people, or bowls, or whatever else comes to mind. Now, if you think through all these hypothetical examples for long enough, you eventually arrive at an epiphany: the processes of counting sheep, chickens, trees, people, bowls, and so on are in some sense all the same! Although three sheep are clearly not the same as three bowls, there is still something common to enumerating sheep and enumerating bowls. This intuition leads to the notion of natural numbers—rather than thinking of there physically being three sheep or three bowls, we introduce the abstract notion of "three" to represent the commonality between these physical states, as well as the corresponding states for chickens, trees, and any other examples that come to mind. In other words, this abstraction represents an understanding of the common element among the processes of counting various things.

If we wish, we can take this counting analogy even further. For instance, you may observe that "combining a group of three sheep with a group of two sheep results in a group of five sheep" is in some sense equivalent to "combining a group of three people with a group of two people results in a group of five people". Again, this directs us toward an abstraction capturing the common theme: \(3 + 2 = 5\). Of course, this analogy also extends to multiplication, exponentiation, and so on. Although this entire thought experiment seems silly by modern sensibilities, it does demonstrate, in very elementary terms, what abstraction entails.

Modular Arithmetic

Let us now consider more examples, this time in a direction toward abstract algebra. While most people view addition or multiplication as operations one would apply to numbers, be they natural numbers, integers, or real numbers, one already learns in high school that these notions can be extended to other objects. For example, one learns to add two vectors or matrices (of the same dimensions) by performing numerical addition on each entry, while in linear algebra, one also learns to multiply two matrices (of appropriate dimensions). An apt analogy can be found in computer science, in particular in typed programming languages such as C++. Those languages contain several data types (integers, floating point, and so on, as well as custom data types), which one can think of as different systems. Each of these data types may have its own definition of "addition". For example, integer addition is considered different from floating point addition, although they have similar interpretations; custom data types may also define their own version of addition (for example, ordered pairs of integers under componentwise, vector-style addition).
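To make this analogy concrete, below is a minimal sketch in C++ of a custom data type carrying its own version of addition; the type name `IntPair` is purely illustrative, not from any standard library.

```cpp
#include <iostream>

// A custom data type: an ordered pair of integers.
struct IntPair {
    int x, y;
};

// This type's own version of "addition": componentwise,
// exactly like vector addition.
IntPair operator+(IntPair a, IntPair b) {
    return {a.x + b.x, a.y + b.y};
}

int main() {
    IntPair p{3, 1}, q{2, 4};
    IntPair r = p + q;  // uses IntPair's addition, not integer addition
    std::cout << "(" << r.x << ", " << r.y << ")\n";  // prints "(5, 5)"
    return 0;
}
```

The expression `p + q` denotes an operation quite different from integer addition, yet both are naturally called "addition"; this shared structure is precisely the kind of commonality that abstraction isolates.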
We next consider a more creative example by asking the following question: if the time is 11 o'clock, then after 4 hours pass, what is the new time? While one can plausibly interpret this question as "\(11 + 4\)", the reasonable answer would be "3 o'clock", not "15 o'clock" (assuming, of course, a 12-hour cycle). This example of adding times is manifestly different from the usual addition of natural numbers, but the point is that both cases can reasonably be interpreted as some form of "addition". Thus, we can view this clock setting as a different system of addition, one which behaves like the usual addition of natural numbers, except that after surpassing \(12\), we loop back to zero. In this new system, the rules of addition tell us that "\(2 + 3 = 5\)" and "\(7 + 9 = 4\)". We can also devise an analogous system of addition representing 24-hour clocks (which loops back to zero after surpassing \(24\)), or, if we feel especially obnoxious, 37-hour or 1379-hour clocks.

If we want, we can also define multiplication and exponentiation in these "clock systems", again with the rule that one loops back to zero after surpassing \(12\) or some other base. For example, in the "12-hour clock" system, we can write \[ 11 + 4 \equiv 3 \pmod{12} \text{,} \qquad 11 \cdot 4 \equiv 8 \pmod{12} \text{,} \] with the second congruence holding since \( 11 \cdot 4 = 44 = 3 \cdot 12 + 8 \). These systems are collectively referred to as modular arithmetic, whose significance goes far beyond the mere formalization of clock arithmetic; it has led to many fantastic and clever applications. For instance, modular arithmetic involving very large numbers has been tremendously influential in cryptography, with one of the most notable examples being the RSA algorithm.
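These clock systems are easy to experiment with on a computer. Below is a minimal sketch in C++, using the remainder operator `%` to implement the wrap-around rule; the helper names `add_mod` and `mul_mod` are illustrative.

```cpp
#include <iostream>

// "Clock" addition and multiplication modulo m: perform the usual
// operation, then wrap around past the modulus.
int add_mod(int a, int b, int m) { return (a + b) % m; }
int mul_mod(int a, int b, int m) { return (a * b) % m; }

int main() {
    std::cout << add_mod(11, 4, 12) << "\n";  // 3, since 11 + 4 = 15 = 12 + 3
    std::cout << mul_mod(11, 4, 12) << "\n";  // 8, since 11 * 4 = 44 = 3 * 12 + 8
    std::cout << add_mod(7, 9, 12) << "\n";   // 4, as in the clock example above
    return 0;
}
```

Cryptographic applications such as RSA follow the same wrap-around principle, but with modular exponentiation and moduli that are hundreds of digits long.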
Abstract Algebra

Now, as in our primitive example for the natural numbers, we can again construct abstract concepts that encapsulate the properties common to all of the above systems that can be interpreted as "addition" or "multiplication". From considering addition or multiplication on its own, one arrives at the notion of groups. We can also construct more complicated abstract systems containing both addition and multiplication; this leads to the theories of rings and fields. The study of such objects forms a major field of mathematics now known as (abstract) algebra.

What is interesting is that applications of these abstract concepts go beyond adding and multiplying. For instance, groups have also been used to model symmetries and transformations. For a simple example, let us compare the 12-hour clock to a regular dodecagon (a polygon with 12 sides). The rotational symmetries of this dodecagon—the ways one can spin it without changing its appearance—can be enumerated as counterclockwise (or clockwise) rotations of \(0, 30, 60, 90, \dots, 330\) degrees. Moreover, we can associate a \(30\)-degree rotation with adding \(1\) in the (12-hour) clock arithmetic, a \(60\)-degree rotation with adding \(2\), and so on. More explicitly, if we think of the dodecagon as slowly spinning, at 30 degrees per hour, then each hour of spinning corresponds to a rotational symmetry, and adding \(n\) hours corresponds directly to the rotational symmetry of spinning by \(30 n\) degrees. In short, via the abstract notion of groups, we have made a connection between two rather different concepts: addition modulo \(12\) and the rotational symmetries of a dodecagon.

Such abstractions can also lead to new perspectives on the more concrete objects that we ultimately care about. One elaborate example is the rather elementary statement of Fermat's last theorem: if \(n\) is an integer greater than \(2\), then there are no positive integers \(a, b, c\) such that \(a^n + b^n = c^n\). Very surprisingly, this simple statement (from the 1600s!) was not proved until the 1990s, and the proof relied upon an intense amount of advanced abstract algebra and number theory.

Distances and Limits

For a different set of examples, let us now turn toward first-year calculus, in particular to the notion of limits of (real-valued) sequences and functions. While we will not get into the exact definitions here (\(\delta\)'s and \(\varepsilon\)'s!), what is important is that the definition of a limit depends only on the distance between numbers: \( d (x, y) = | x - y | \). Even the fact that we are dealing with real numbers may not be so important; the fundamental intuition behind limits is that of something tending "arbitrarily closely" toward another object. As a result, we have another situation that is ripe for abstraction—as long as we have a system with an appropriate notion of distance, we can define limits analogously to how we do for real numbers. For instance, one has an intuitive notion of "distance" between points in two- or three-dimensional space from the Pythagorean theorem: \[ d ( (x_1, y_1), (x_2, y_2) ) = \sqrt{ ( x_2 - x_1 )^2 + ( y_2 - y_1 )^2 } \text{.} \] By using this distance instead of the one-dimensional distance from before, we obtain limits for ordered pairs and triples of numbers, which are precisely the limits encountered in vector calculus.

However, this is still too limiting (pun not intended), as there is no need to constrain ourselves to finite-dimensional objects. Many notions of distance have been constructed on various infinite-dimensional spaces (say, of functions), allowing us to expand our previous outlook on limits far beyond numbers, points on a plane, or points in space. These infinite-dimensional extensions of the toolbox of basic calculus form a significant part of the mathematical foundations of many modern areas of study, such as partial differential equations, quantum mechanics, and signal processing.

As an example, one often solves differential equations—more specifically, proves that a differential equation has a solution—by using such infinite-dimensional limits of functions. Without going into details, the idea is to obtain almost-solutions that better and better approximate the actual solution. Then, by using a very abstract theorem about systems with distances—the contraction mapping theorem—one can show that these approximating functions have a limit, and that this limit is the actual solution to the differential equation. This process, called Picard iteration, is often sketched in a basic differential equations course.
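To give a computational flavor of this process, here is a minimal numerical sketch of Picard iteration in C++ for the toy problem \(y' = y\), \(y(0) = 1\), whose exact solution is \(e^t\). The grid size, tolerance, and variable names are illustrative choices, not part of the general theory.

```cpp
#include <algorithm>
#include <cmath>
#include <iostream>
#include <vector>

// Picard iteration for y'(t) = y(t), y(0) = 1 on [0, 1]: repeatedly apply
// the integral map (Ty)(t) = 1 + \int_0^t y(s) ds, discretized on a grid.
// The iterates converge to the actual solution, y(t) = e^t.
int main() {
    const int N = 1000;               // number of grid points
    const double h = 1.0 / (N - 1);   // grid spacing

    std::vector<double> y(N, 1.0);    // initial guess: the constant function 1

    for (int iter = 0; iter < 30; ++iter) {
        std::vector<double> next(N);
        next[0] = 1.0;
        double integral = 0.0;
        for (int i = 1; i < N; ++i) {
            integral += 0.5 * h * (y[i - 1] + y[i]);  // trapezoid rule
            next[i] = 1.0 + integral;
        }

        // Sup distance between successive iterates: a "distance" on an
        // infinite-dimensional space of functions, as discussed above.
        double dist = 0.0;
        for (int i = 0; i < N; ++i)
            dist = std::max(dist, std::abs(next[i] - y[i]));

        y = next;
        if (dist < 1e-12) break;      // iterates have essentially converged
    }

    std::cout << "y(1) ~ " << y[N - 1]
              << " (exact: " << std::exp(1.0) << ")\n";
    return 0;
}
```

Here the \(n\)-th iterate is (up to discretization error) the degree-\(n\) Taylor polynomial of \(e^t\), and the sup distance between successive iterates shrinks rapidly; on a short enough time interval, the integral map is a genuine contraction with respect to this distance, which is exactly the setting of the contraction mapping theorem.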
What is even more surprising, though, is that the same abstract contraction mapping theorem that generates solutions to differential equations has many other, seemingly unrelated uses. For example, by cleverly constructing a "distance" on regions in the plane, one can use abstract limits and the contraction mapping theorem to construct many well-known fractals, such as the Sierpinski triangle and the Koch snowflake. This is another example of discovering unexpected connections between different topics. Moreover, this connection is uniquely mathematical, in that one would never have encountered such insights without first expanding to a broader, more abstract perspective within the framework of mathematics.

The Brachistochrone Problem

For another example built on similar calculus foundations, we look at a historical physics problem from several centuries ago, the brachistochrone problem: given two points \(A\) and \(B\) in space, with \(B\) lower than \(A\), and assuming (for simplicity) Newtonian gravity and no friction, find the shape of a ramp joining \(A\) to \(B\) that minimizes the time needed for a ball to roll along the ramp from \(A\) to \(B\). While your initial intuition might suggest that a straight ramp is best (as it represents the shortest path from \(A\) to \(B\)), the optimal solution is in fact a rounded curve called a cycloid. If you want to convince yourself, you can see, for example, this video demonstration (courtesy of KoonPhysics).

The main idea behind the solution comes from a basic principle in calculus: if a differentiable function attains its maximum or minimum value at an interior point, then its derivative must vanish at that point. The creative step is taking this principle—which one encounters in a first-year calculus course—and generalizing it to infinite-dimensional spaces of curves. From the principles of Newtonian mechanics, one can construct a function mapping each curve from \(A\) to \(B\) to the time required for a ball to roll down that curve. One then discovers the optimal cycloid by taking "derivatives" (in this infinite-dimensional space of curves!) of this function and finding the curves at which this derivative vanishes—just as in first-year calculus! This provides a very early scientific application of abstracting the concept of differentiation. These ideas also led to the development of what is called the calculus of variations, which remains an active topic of research in mathematics today, as well as a fundamental tool for several areas of mathematics and physics.
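To make this slightly more precise (as a sketch, with \(y(x)\) measuring the vertical drop below \(A\) and \(g\) denoting the gravitational acceleration), conservation of energy gives the ball a speed of \( \sqrt{2 g y} \), so the total travel time along a curve \(y(x)\) is \[ T[y] = \int_0^{x_B} \sqrt{ \frac{ 1 + y'(x)^2 }{ 2 g \, y(x) } } \, dx \text{.} \] Setting the infinite-dimensional "derivative" of \(T\) to zero (via what is known as the Euler–Lagrange equation) singles out the cycloid, which can be written in parametric form as \[ x (\theta) = a ( \theta - \sin \theta ) \text{,} \qquad y (\theta) = a ( 1 - \cos \theta ) \text{,} \] where the constant \(a\) is chosen so that the curve passes through \(B\).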
The Many Uses of Abstraction

In summary, there are many reasons mathematicians often elect to consider more abstract concepts. For example, we have discussed the following:

- Abstraction isolates the structure common to many different concrete systems (counting sheep versus counting bowls, or the many guises of "addition"), so that a single abstract idea applies to all of them at once.
- Abstract concepts reveal unexpected connections between seemingly unrelated topics, such as clock arithmetic and the rotational symmetries of a dodecagon, or differential equations and fractals.
- Abstraction provides new perspectives and tools for the concrete problems we ultimately care about, as with Fermat's last theorem and the brachistochrone problem.

Given the considerable benefits that abstraction has produced in modern mathematics and in other mathematically inclined disciplines, we will likely see many more abstract ideas, as well as many interesting applications and connections, in the years to come.