Alright, before we start: I’m gonna call the Conditional Gradient Algorithm the Frank-Wolfe Algorithm (FW), after the authors of the first ever paper on these methods, Marguerite Frank and Philip Wolfe, in 1956! The simplex method (one of the top 10 algorithms of the 20th century; I’d say simplex gets first place for me) was introduced by Dantzig in 1947 for solving linear optimization (LO) problems (I strongly encourage you to read the history on the wiki page, it’s amazing!). An LO algorithm answers the following question. Let’s say we are given two types of data: 1. a linear function, say f; 2. a legal set: a (sub)set of points that can be described using just linear equations or inequalities. Now, how can I pick the point in the legal set where f achieves its minimum value (or maximum value; for linear problems it doesn’t matter which, though in general this is not true)?
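To make the LO question concrete, here is a minimal sketch in Python for the two-variable case. It is not the simplex method: it just brute-forces every corner of the legal set, which works because a linear function on a bounded legal set attains its minimum at a corner. The function name `solve_lo_2d`, the `(a, b, r)` encoding of an inequality `a*x + b*y <= r`, and the example data are all my own assumptions for illustration.

```python
from itertools import combinations

def solve_lo_2d(c, constraints, tol=1e-9):
    """Minimize c[0]*x + c[1]*y subject to a*x + b*y <= r for each (a, b, r).

    Brute force, NOT simplex: intersect every pair of constraint boundaries,
    keep the intersections that lie in the legal set (its corners), and
    return the best one. Assumes the legal set is bounded and non-empty.
    """
    best = None
    for (a1, b1, r1), (a2, b2, r2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < tol:  # parallel boundaries never meet
            continue
        # Cramer's rule for the intersection of the two boundary lines
        x = (r1 * b2 - r2 * b1) / det
        y = (a1 * r2 - a2 * r1) / det
        # Keep the point only if it satisfies every inequality
        if all(a * x + b * y <= r + tol for a, b, r in constraints):
            value = c[0] * x + c[1] * y
            if best is None or value < best[0]:
                best = (value, (x, y))
    return best

# Hypothetical legal set: x >= 0, y >= 0, x + y <= 4, x <= 3
legal_set = [(-1, 0, 0), (0, -1, 0), (1, 1, 4), (1, 0, 3)]
# Hypothetical linear function f(x, y) = -x - 2y
value, point = solve_lo_2d((-1, -2), legal_set)
print(value, point)  # minimum value -8 at the corner x = 0, y = 4
```

Checking all pairs of constraints is hopeless once there are many variables, which is exactly why a smarter method like simplex is such a big deal.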
This will hopefully be the last post on matrix completion problems. But since the MBI inequality is widely used, I decided to squeeze in one more application, to the Community Detection problem (yay!), in a future post. So first, the matrix completion problem: we will see how a simple algorithm can yield exact results. To that end, since the problem as stated is ill-posed, we will first make some assumptions on the matrix that we are completing, namely the low rank assumption (we will discuss the case where the matrix we are completing is square).
These last three weeks I was in Utah for a summer math school (or conference, whichever you prefer) organized by the Institute for Advanced Study (IAS), Princeton, New Jersey. Basically we had three weeks of lectures, problem sessions, and endless fun discussions day in, day out, with free food and a free stay. The resort was amazing: we stayed at the Zermatt resort (if you’re using headphones you might wanna turn the volume down a bit). Anyway, the point of this post is the lectures given by Roman Vershynin that I sat through on Random Matrices and applications (basically this is a scribe), specifically the famous problem known as matrix completion. Later (hopefully soon) I’ll write about another topic that definitely got my attention, called Differential Privacy.
In the basics we saw that vector spaces (v.s.) are defined over a field. Field is the keyword. Recall that examples of fields include the real numbers ‘R’, the complex numbers ‘C’, and the rational numbers ‘Q’. Let’s take the last example, Q. Q consists of numbers of the form a/b where a and b are integers like -2, 3, 5, 10 etc., with b nonzero. Examples of numbers that are not rational include the square root of 2, pi, e etc. A v.s. of dimension n is a set of n-tuples of numbers where each number in the tuple comes from the underlying field. The question is, why do we need a field? That is, why can’t we just define it over the set of integers ‘Z’? Recall that Z shares all properties with Q except that not all elements have a multiplicative inverse, i.e., there is no number in Z whose product with 21 equals 1, but if we ask for an element in Q, then we have 1/21 * 21 = 1. But that’s ok, because nothing is stopping me from defining n-tuples from Z. For example, let’s look at Z^2, which consists of tuples of the form (a,b) where a and b are integers. Well, the world didn’t come to an end anyway. In fact, we can see that if we take two 2-tuples of this form and add them coordinate-wise, we get a tuple of the same form with different integers. Secondly, if we multiply (coordinate-wise) a 2-tuple by an integer, we still end up with a 2-tuple where each coordinate is again an integer. So what’s the problem?
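The two closure checks and the missing inverse of 21 are easy to try out. Below is a small sketch in Python; the helper names `add`, `scale` and `inverse_in_Z` are made up for this illustration, and `Fraction` from the standard library stands in for Q.

```python
from fractions import Fraction

def add(u, v):
    """Coordinate-wise sum of two tuples."""
    return tuple(a + b for a, b in zip(u, v))

def scale(k, u):
    """Multiply every coordinate of the tuple u by the scalar k."""
    return tuple(k * a for a in u)

def inverse_in_Z(n):
    """Return the integer m with n * m == 1, or None if there is none.

    Only 1 and -1 have a multiplicative inverse among the integers.
    """
    return n if n in (1, -1) else None

u, v = (2, -3), (5, 10)
w = add(u, v)    # (7, 7): still a tuple of integers
s = scale(3, u)  # (6, -9): still a tuple of integers

inv_z = inverse_in_Z(21)   # None: no integer times 21 gives 1
inv_q = 1 / Fraction(21)   # Fraction(1, 21): the inverse exists in Q
print(w, s, inv_z, inv_q * 21)
```

So adding tuples and scaling by integers never leaves Z^2; the only thing that fails is dividing by a scalar like 21.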
If you noticed, a significant aspect of this blog is digression all over, so I thought it only natural for me to write about it. It turns out that one of the main reasons why math is ‘hard’ for those who feel that it is hard is that they learn math the hard way, which is why it is hard. That obviously doesn’t sound like a thing, but the truth is that it is as good as any other reason. Let’s take an example.
Here we will see how the phenomenon of time zones can be made very ‘easy’. For non-frequent travelers, it is a pain every time they have to figure out how long the travel duration is, even though most airlines provide the information on the ticket. I personally have tried computing the duration of actual travel in my mind using just the local time of departure and the local time of arrival. Why would someone do this? Well, why not? It’s just a good mental exercise, and the calculations are often interesting when there are more than 1 or 2 stops, or worse, if we miss a connecting flight. In the latter case we have no other option but to compute it ourselves or, well, ask someone to compute it for us. So, what makes timezone computations kinda hard or interesting or complex? From the mathematical point of view, it is because time zones are not continuous. Let me try to motivate what I mean. One reason why timezones can be confusing is that the places where the time changes are often arbitrary, in the sense that they are chosen by a specific set of people, so most people don’t know the reason why India has just 1 timezone or why the US has 6 timezones. The question here is not why we need multiple timezones but why those specific numbers. One might try to make a crude guess that the number of timezones for a country increases with the size of the country in terms of land area. Well, this might be a very bad assumption, because the area, even if we assume countries to be rectangles, is length times breadth (the product of the differences between the extreme latitudes and the extreme longitudes), but only breadth (longitudes) contributes to the timezones. A smarter option would be to just use longitudes, but that might be bad too, because of daylight savings (see the timezones of the US when daylight savings are on, it’s ridiculous!). Anyway, a mathematician would easily see that the way timezones work is completely arbitrary with the current setup.
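As an aside, the mental exercise above (travel duration from local departure and arrival times) boils down to converting both times to a common clock and subtracting. Here is a rough sketch in Python. The function name `travel_duration`, the fixed offsets and the example itinerary are my own assumptions; in particular, fixed offsets ignore daylight savings, so this only works if you already know the correct offset on the day of travel.

```python
from datetime import datetime, timedelta, timezone

def travel_duration(depart_local, depart_offset, arrive_local, arrive_offset):
    """Duration of travel from local clock times and their UTC offsets.

    Times are strings like "2024-01-10 22:00"; offsets are timedeltas
    relative to UTC. Fixed offsets are an assumption of this sketch.
    """
    fmt = "%Y-%m-%d %H:%M"
    depart = datetime.strptime(depart_local, fmt).replace(tzinfo=timezone(depart_offset))
    arrive = datetime.strptime(arrive_local, fmt).replace(tzinfo=timezone(arrive_offset))
    # Subtracting timezone-aware datetimes compares them on the UTC clock
    return arrive - depart

IST = timedelta(hours=5, minutes=30)  # India, UTC+5:30
EST = timedelta(hours=-5)             # US Eastern standard time, UTC-5

# Hypothetical itinerary: leave India at 22:00, land on the US east coast at 06:30
duration = travel_duration("2024-01-10 22:00", IST, "2024-01-11 06:30", EST)
print(duration)  # 19:00:00
```

The local clocks are only 8.5 hours apart, but the 10.5 hour gap between the two offsets makes the actual travel time 19 hours.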
How do we fix this? Here we go: I’ll explain the process, since I figured that is the cleanest way to explain it.