A PhD Thesis

Someone close to me finished their thesis in a separate discipline within the last year and a half. It has been well received (news about it came out recently), so I thought I would share it. I am not that familiar with the discipline, so I cannot comment much on the content, though I had some exposure to related fields nearly 15 years ago, so it is not totally unfamiliar either, and I find it interesting. I have not finished reading it, but I am interested in the stochastic approach presented for debris transport.

Here is a Google link containing links to the thesis and related papers:

Another quick chapter

Work is still busy, but I still come across a number of articles that could be of interest to the audience of this blog:

And for those interested in studying data science, you could have a day and a half of free access to DataCamp:

A few data science learning links

Well, it’s been busy this past month and I haven’t had time for derivations, but I have nonetheless come across some interesting articles concerning my interests relating to this blog (in this case, just data science). I hope it’s helpful:


COVID-19 and What’s With Reporting about Exponential Increases in Cases?

“Flattening the curve” has become a popular expression nowadays, referring to slowing the spread of the new coronavirus (for a reference, see https://www.livescience.com/coronavirus-flatten-the-curve.html).  In contrast to a “flattened curve” (South Korea), there are plots of exponential growth (most other countries, and South Korea in the early stages of the disease):


First off, why do many plots start on the day when 100 cases were reached?  Before there are a lot of cases, the spread of the disease can be statistically noisy; for example, if quite a few of those infected early on are socially distant, then the disease might only be transmitted over their few interactions (there’s the separate issue of testing for the disease, but that can be addressed at a later point).  Also, the slow incubation period (the slow time to show symptoms) could cause noisiness in the plot because, in the early stages, people weren’t tested until there was a reason to.  Anyhow, many of the plots in the lower right-hand corner (you can look at each country individually) confirm that the early trend is not as clearly linear as it is after many cases have shown up: https://gisanddata.maps.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6 (which I got from here: http://www.cidrap.umn.edu/covid-19/maps-visuals). Also note that the slow incubation period is the reason why the effects of social distancing take weeks to be noticed.

To explain the exponential beginnings, we can look at a number of models used to describe the spread of the disease:


(Or for the commonly referred to SEIR model: https://sites.me.ucsb.edu/~moehlis/APC514/tutorials/tutorial_seasonal/node4.html )

Let’s look at the early stages (and also assume that once infected, you cannot be infected again, though in the early stages, we can ignore this relation).  Taking I to be the number of infected individuals and H to be the number of healthy individuals, it is assumed that the rate of infection is proportional to both of these values:

\frac{d I}{d t} = k H I = k I (N - I)

where everyone in the total population N is either healthy or infected, N = H + I.  Note that even if it is not true that every individual has the same number of contacts, statistically speaking, the relation often holds.  The solution to the differential equation is the sigmoidal function (for reference: https://www.reddit.com/r/dataisbeautiful/comments/fohr58/oc_the_technical_problems_of_fitting_a_logistic/)

I(t) = \frac{kN}{k+\exp(-(kNt-t_0))}= N \frac{\exp(kNt-t_0)}{\exp(kNt-t_0)+1/k}

For small values (t << (t_0 + \ln (1/k))/(kN) or I(t=0) = N \frac{\exp(-t_0)}{\exp(-t_0)+1/k}  << N), this curve is exponential:

I(t) \approx k N \exp(kNt-t_0)

Another way of seeing this is

\frac{d I}{d t}  \approx k I N

when I << N, and the solution of that equation is an exponential

I(t) \approx k N \exp(kNt-t_0)

the same as the above!

As an exercise, you can plot the approximate and exact solutions and see how they differ (when they are the same and when they differ significantly).
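One possible way to start on that exercise is the minimal sketch below, which evaluates both formulas side by side. The function names and the parameter values (k = 1e-6, N = 1e6, t_0 = 0) are assumptions picked only for illustration, so that roughly one person is infected at t = 0; they are not fitted to any data.

```python
import math


def exact_infected(t, k, n_total, t0):
    """Logistic solution I(t) = N e^x / (e^x + 1/k), with x = k*N*t - t0."""
    x = k * n_total * t - t0
    return n_total / (1.0 + math.exp(-x) / k)


def approx_infected(t, k, n_total, t0):
    """Early-stage approximation I(t) ~ k*N*exp(k*N*t - t0)."""
    return k * n_total * math.exp(k * n_total * t - t0)


def relative_gap(t, k, n_total, t0):
    """(approximate - exact) / exact at time t."""
    exact = exact_infected(t, k, n_total, t0)
    return (approx_infected(t, k, n_total, t0) - exact) / exact


# Illustrative values only (assumptions, not fitted to real case counts).
K = 1e-6
N_TOTAL = 1e6
T0 = 0.0

print(f"{'t':>4} {'exact':>14} {'approx':>14} {'rel. gap':>10}")
for t in range(0, 21, 4):
    print(
        f"{t:>4} "
        f"{exact_infected(t, K, N_TOTAL, T0):>14.2f} "
        f"{approx_infected(t, K, N_TOTAL, T0):>14.2f} "
        f"{relative_gap(t, K, N_TOTAL, T0):>10.2e}"
    )
```

For these two formulas the relative gap works out to k \exp(kNt-t_0), so the curves agree while that quantity is small and part ways once it approaches 1, which is where the exact curve starts bending towards N.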





Another podcast — Athlete chooses math :p

I haven’t watched it yet, but I heard that John Urschel was a talented mathematician when he was playing in the NFL. Enjoy:

If you enjoyed this, depending on your inclinations (especially if they are to the more applied side), you might wish to listen to DataCamp’s podcast, e.g.:

Dataframed Podcast on “Data Nerdism,” Fun, and General Thoughts about Education

A cute mathematical problem, the catenary

I originally saw the problem in the link below somewhere else (though I do not recall where).  The problem is solved in the video, but you can also look at my notes.

Can You Solve Amazon’s Hanging Cable Interview Question?

In the Wikipedia article on the Catenary, look at the Mathematical Description and Analysis sections for relevant details which I will describe below.

Offhand, to determine the distance between the poles if the lowest point of the cable is 20 metres off the ground, I would have used y = a \cosh(x/L) + b because I remembered the mathematical form of a hanging cable.  However, balancing the horizontal tension in the middle of the cable and gravity along the length of the cable with the force between the pole and cable (and assuming no elastic effect changing the length of the cable) allows you to equate a and L in the equation above.  The same result can probably be obtained by looking at an infinitesimal element of cable, but the former approach is mathematically easier. b depends on a because a + b = 20.  We know that half the length of the cable is 40 metres, so (using the equation for the length of a curve and skipping a few steps) 40 = \int_0^d \sqrt{1+(dy/dx)^2}dx = a \sinh(d/a), where d is half the distance between the poles. The final equation is a \cosh(d/a) + (20-a) = 50.  Hyperbolic functions have quadratic relations between them: \cosh^2(d/a) - \sinh^2(d/a) = 1 gives (30+a)^2 - 40^2 = a^2, so a = 35/3 metres and \sinh(d/a) = 120/35. Inverting the hyperbolic sine and taking the appropriate (positive) root gives half the separation distance:

d = \frac{35}{3} \ln\left( 120/35 + \sqrt{120^2/35^2+1} \right) = \frac{35}{3} \ln 7 \approx 22.7 (metres)
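As a numerical check of the steps above, here is a minimal sketch. The function name `catenary_half_span` and its argument names are made up for this example. It assumes an inextensible cable hung between two poles of equal height, and it treats the fully folded cable as a special case.

```python
import math


def catenary_half_span(cable_length, pole_height, lowest_point):
    """Return (a, d): the catenary parameter and half the pole separation.

    Uses a*sinh(d/a) = half the cable length and a*cosh(d/a) - a = sag,
    combined through cosh^2 - sinh^2 = 1.
    """
    sag = pole_height - lowest_point
    half_length = cable_length / 2.0
    if sag <= 0 or sag > half_length:
        raise ValueError("the sag must be positive and at most half the cable length")
    if sag == half_length:
        # The cable hangs straight down and back up: poles side by side.
        return 0.0, 0.0
    # (sag + a)^2 - half_length^2 = a^2  =>  a = (half_length^2 - sag^2) / (2 sag)
    a = (half_length**2 - sag**2) / (2.0 * sag)
    d = a * math.asinh(half_length / a)
    return a, d


a, d = catenary_half_span(cable_length=80.0, pole_height=50.0, lowest_point=20.0)
print(f"a = {a:.4f} m, half separation d = {d:.2f} m, full separation = {2 * d:.2f} m")
```

Calling the same function with a sag equal to half the cable length returns a separation of zero rather than dividing by zero.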

Next, I’ll address the easy problem, which requires little math; if the cable is 80 m long, but the height of the drop is 40 metres, that means that the poles must be side by side (i.e., the cable is folded in half), otherwise the cable cannot drop down that far.

A popular article on teaching math

For your interest:

I would need more time to write a more thorough review, but keep in mind that it’s written for a general audience (possibly with over-simplifications). That being said, some of the things proposed (like multiple approaches and being conscious of not skipping steps) are things I also think are important to take into account.

Traffic thought experiment

Often when I am walking/driving, I like looking at details of my surroundings. I like looking at waves on the St Lawrence River and I definitely want to try to explain the wave patterns better. My M.Sc. supervisor might have some papers to help understand that better (examples or just a link), which I will hopefully get to in the not-too-distant future. One thing that really bothers a lot of people is a traffic jam. I had some thoughts on this topic and am getting around to writing about it. I will try tailoring this post to address a wide audience.  There should be a follow-up article exploring some more mathematical details.

There are quite a few papers (including some work by an academic “great-uncle” (PhD supervisor’s postdoctoral co-supervisor) of mine, Nigel Goldenfeld) studying traffic and the origins of traffic jams. Do traffic jams occur as an intrinsic part of the system (cars interacting with each other on a network of roads)? Or is it because of individual behaviours which give rise to these problems? You might think the latter is more reasonable, but in certain cases very different behaviours of the constituent parts (how drivers drive their vehicles) can result in the same behaviour of the system (traffic jams), if some very general rules are followed.

Let’s start with a one-lane road:

Cars on a road

A “typical” car is 4 metres long (in the diagram L = 4 m). To estimate D in the diagram, let us consider it in terms of how far apart the cars need to be to stop safely. Say a person needs about 2 seconds to react to what is in front of them (this might be an estimate for anticipated stopping time on the highway). I will deviate from that guess (which might be explored in the follow-up post) and instead break up the estimate as follows:

D = \Delta t_r v + \frac{v^2}{2a}

We can take the reaction time, \Delta t_r, to be about half a second, so the distance covered while reacting is \Delta t_r v. Assuming constant deceleration a for intense braking (say 5 m/s^2 for reference), the time taken to stop is t = v/a and the distance covered while stopping is d = v^2/(2a).

The least space a single car takes up when trying to be safe is about:

L + D = L + v^2/(2a)+ v \Delta t_r

Correspondingly, the maximum density of cars is the reciprocal of the above relation, so:

\rho = \frac{1}{4 + v^2/(2a)+ v \Delta t_r}

Taking a = 5 m/s^2 and \Delta t_r = 0.5 s with v in m/s, we get this relation between car density (in cars per m) and speed:

Density (cars/m) vs Speed (m/s)

or in tabular form (where rho is the density in cars per metre and v is the speed in metres per second):

rho (cars/m)    v (m/s)
0.250000        0.0
0.217391        1.0
0.185185        2.0
0.156250        3.0
0.131579        4.0
0.111111        5.0
0.094340        6.0
0.080645        7.0
0.069444        8.0
0.060241        9.0
0.052632        10.0
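The density values can be reproduced with a few lines of Python. This is only a sketch of the formula above; the function and parameter names are my own choices for illustration, with defaults set to the values used in this post (L = 4 m, a = 5 m/s^2, \Delta t_r = 0.5 s).

```python
def safe_spacing(v, car_length=4.0, decel=5.0, reaction_time=0.5):
    """Road length (m) one car occupies: L + v*dt_r + v^2 / (2a)."""
    return car_length + v * reaction_time + v**2 / (2.0 * decel)


def max_density(v, car_length=4.0, decel=5.0, reaction_time=0.5):
    """Maximum safe density (cars per metre) at speed v (m/s)."""
    return 1.0 / safe_spacing(v, car_length, decel, reaction_time)


print("rho (cars/m)    v (m/s)")
for step in range(11):
    v = float(step)
    print(f"{max_density(v):.6f}        {v:.1f}")
```

Changing the default arguments is an easy way to see how sensitive the density is to the assumed reaction time or braking deceleration.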

According to this model, a car effectively takes up less space at lower speeds when trying to be safe, so it is intuitive that highways slow down to the speed of surrounding roads at high traffic density (many cars on a single road at the same time).  Let us extend this by looking at the flow rate of cars, \Phi, which is the velocity times the density:

\Phi = v \rho = \frac{v}{4 + v^2/(2a)+ v /2}

Flow rate (cars/sec) vs velocity (m/sec)

Looking at some integer values, there’s a peak around 6 m/sec:

phi (cars/sec)    v (m/sec)
0.000000          0.0
0.370370          2.0
0.526316          4.0
0.566038          6.0
0.555556          8.0
0.526316          10.0
0.491803          12.0
0.457516          14.0
0.425532          16.0
0.396476          18.0
0.370370          20.0

(Note that this is not exact, but I’m using this method to illustrate that although rigor and exactness in calculations and quantitative methods are nice and often important, in a lot of cases they are difficult to obtain because of too many unknowns.  In such cases, resorting to something simple to get a sense of what is being studied can help.)

Experimenting with different parameters, I often got peak flow rates at speeds around 5-10 m/sec, which is 18-36 km/h or roughly 11-22 mph. With the current parameters, the peak occurs at a little over 20 km/h. (On a side note, I am wondering if this gives some intuition into one factor as to what suitable speed limits should be: at typical traffic levels, what is a safe speed on that road?) Interestingly, there is a conflict between how fast an individual travels through a region and how fast all vehicles do.  On a nearly empty road, you can choose to drive however fast you want (being mindful of the speed limit).  Eventually, as the density of cars increases, the speed at which they can safely travel decreases.  This results in the maximum flow of cars on the road (the number of cars passing a fixed point per unit time) decreasing.  This corresponds with the intuition that traffic jams occur at high traffic volumes but not at low ones (unless there’s construction).
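For anyone who wants to repeat that experimenting, here is a small sketch of the flow-rate formula. Setting d\Phi/dv = 0 for \Phi = v / (L + v \Delta t_r + v^2/(2a)) gives L = v^2/(2a), so within this idealized model the peak sits at v = \sqrt{2aL}, independent of the reaction time. The function names and default values below are my own assumptions for illustration.

```python
import math


def flow_rate(v, car_length=4.0, decel=5.0, reaction_time=0.5):
    """Flow (cars per second) past a fixed point at speed v (m/s)."""
    return v / (car_length + v * reaction_time + v**2 / (2.0 * decel))


def peak_speed(car_length=4.0, decel=5.0):
    """Speed (m/s) maximizing the flow in this model: sqrt(2 a L)."""
    return math.sqrt(2.0 * decel * car_length)


v_star = peak_speed()
print(f"peak at {v_star:.2f} m/s = {3.6 * v_star:.1f} km/h, "
      f"flow = {flow_rate(v_star):.4f} cars/sec")

# Try other (hypothetical) braking decelerations and car lengths.
for decel in (3.0, 5.0, 8.0):
    for car_length in (4.0, 6.0):
        v_peak = peak_speed(car_length, decel)
        print(f"a = {decel} m/s^2, L = {car_length} m -> peak at {3.6 * v_peak:.1f} km/h")
```

The reaction time still lowers the height of the peak, even though it does not move it.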

The above reasoning is qualitative, but it can help someone analysing a problem by giving them some intuition. In a future post, let’s see how this model performs under perturbations. That analysis will be appropriate for an upper-year undergraduate student (though younger students with the appropriate calculus skills should be able to follow the arguments as well). That being said, be cautious of this particular model because it was only conceived as a thought experiment and is highly idealized (network effects were not even considered, even though real traffic happens in a “grid” of multi-lane roads with various signage affecting the traffic flow).  When doing mathematics, that approach can be sufficient, but not when engaging in an empirical discipline like the natural sciences (e.g., physics, chemistry, biology) or social sciences (e.g., economics or politics).