The word ‘linear’ is used in so many different contexts that it kind of loses its importance. After a lot of contemplation, it turns out that this word, as simple as it may sound, has far-reaching consequences. The point of this post is to make exactly this clear and precise (well, maybe not precise). There are lots of ways to describe linearity. Someone who has taken a calculus course in high school or college will say that things are nice when everything is linear: linear equations, or straight lines, have a constant slope. Oops, we need to define what ‘slope’ is.

Let’s say we would like to buy sugar. A kilogram of sugar costs $10 and, to simplify things, we pay a flat tax of $5 no matter how much we buy. If we want to buy 10 kilograms of sugar, we need $105, i.e., 10 times 10 plus 5 (meh). A linear equation is anything that can be written in this form, i.e., c = ax + b, where a and b are fixed numbers, x is the amount of sugar in kilograms, and c is the total cost of buying x kilograms. In our example a = 10 and b = 5. The number a is often called the slope since it decides how steep the function is: every extra kilogram raises the cost by exactly a, so if a is a big positive number, buying sugar will be expensive, and so on. Well, to be honest it also depends on b, but let’s say that we always have $b. Note that this form also allows a to be zero (now what does that mean in our example? I’ll leave that for you). But the situation is bad when a is zero. Why? Answering that requires some reasoning and a bit of machinery and terminology.
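If it helps to see the sugar example as something you can run, here is a minimal Python sketch. The function name `total_cost` and its parameter names are made up for illustration; only the numbers ($10 per kilogram, $5 flat tax) come from the example above.

```python
def total_cost(kilograms, price_per_kg=10, flat_tax=5):
    """Return c = a*x + b, with a = price_per_kg and b = flat_tax."""
    return price_per_kg * kilograms + flat_tax


# Buying 10 kilograms: 10 times 10 plus 5.
print(total_cost(10))  # 105

# The slope a is the extra cost of one more kilogram,
# and it is the same no matter where we start.
print(total_cost(4) - total_cost(3))    # 10
print(total_cost(51) - total_cost(50))  # 10
```

The last two lines are the ‘constant slope’ in action: one more kilogram always adds exactly a to the bill.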

What is a function? For us, a function is a box or a machine which takes a number as input and outputs ‘a’ number, meaning exactly one. Pretty simple, right? In our example, we say that c is a function of x, in short c = f(x). Once someone gives us the numbers a and b, we can quickly compute the value of c for any fixed x using simple high school algebra. Wait a minute, we can also rewrite the equation as x = (c - b)/a. Tada! Now we have x as a function of c, x = g(c). But there’s one small thing: we can only do this if a is not equal to zero. Ouch!

We now turn to invertible functions. We say that a function y = f(x) is invertible if for every value of y there is exactly one corresponding value of x; that is, given the output, we should be able to figure out what the input must have been. When a is zero, c = b, which means that no matter what x is, our output will always be b. In other words, given the total cost, one can never figure out how much sugar was bought.
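Here is a small Python sketch of going back and forth between x and c, assuming the same sugar numbers as before. The names `total_cost` and `kilograms_bought` are invented for this illustration, and raising an error when a is zero is just one way to express ‘no inverse exists’.

```python
def total_cost(x, a=10, b=5):
    """f: kilograms of sugar -> total cost, c = a*x + b."""
    return a * x + b


def kilograms_bought(c, a=10, b=5):
    """g: total cost -> kilograms of sugar, x = (c - b) / a."""
    if a == 0:
        # With a = 0 every x gives the same cost b, so x cannot be recovered.
        raise ValueError("a is zero: the function is constant and has no inverse")
    return (c - b) / a


# Round trip: g undoes f.
print(kilograms_bought(total_cost(10)))  # 10.0

# With a = 0 the cost forgets how much sugar we bought.
print(total_cost(1, a=0), total_cost(100, a=0))  # 5 5
```

Calling `kilograms_bought(5, a=0)` raises the `ValueError`, which is the ‘ouch’ from above written as code.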

But mathematicians are always looking for invertible functions. Why? Simply because we can work with either x or y, whichever is more convenient, and convert the final answer by applying f (or the inverse of f) appropriately. So we want linear functions to be invertible, and therefore we require a to be nonzero. Goodbye, constant functions! I admit this is kind of difficult to swallow: when I was in high school, I was told that straight lines are linear functions, and constant functions are straight lines, so why throw them away? Don’t be mad at me, it is what it is. To be precise, I’m trying to motivate linear (invertible) transformations, which we will see next time. In fact, we don’t gain much by allowing constant functions. There isn’t much to say about them since they’re just constant functions (MEH). Next time we will go much deeper and start talking about matrices.