I live in Bordeaux. Like many other European cities, we have a great transportation system here (Europe ftw woohoo!). Every day I take the B-line tramway to the campus. And here comes my struggle every single morning: when I am about 20m from the station, I see the tram slowly arriving, and at that moment I have to decide whether to run or wait for the next one. One observation that I have drawn after countless mornings is that it doesn’t matter whether I leave my house 4 or 5 minutes earlier or later: the average waiting time remains the same, and it is usually about the normal interval between 2 trams. So supposing that the tram schedule is fixed (indeed it is ?!?), this means that even if I leave my house a bit earlier, I still cannot catch that ‘previous’ tram ?!?
Today, while I was shuffling through StackOverflow, trying to find something that could light up my boring Saturday afternoon, I stumbled upon a discussion of “The Waiting Paradox” (or The Inspection Paradox), which basically sums up my whole experience: given a bus that arrives at the bus stop approximately every 15 minutes and a passenger who arrives at a random time, the passenger would on average have to wait the full 15 minutes!
Before diving into the details, let’s talk about the Poisson distribution first. The Poisson distribution is usually used to model the number of events happening over a period of time or a region of space: for example, the number of customers entering a store in a given time interval, the number of phone calls we receive in an hour, and in our case, the number of buses coming in the next hour. The Poisson distribution and the Exponential distribution are strongly related but fundamentally different, as the former is discrete (a count) and the latter is continuous (a waiting time). They are related in that the time between consecutive events in a Poisson process follows an Exponential distribution. What makes the Poisson process unique among renewal processes is its memoryless property, inherited from the Exponential distribution. Memoryless means that the expected time until the next event is the same no matter how long it has been since the last event occurred. A closely related idea, the Markov property (the future depends only on the present state, not on the past), is what underlies Markov Chain Monte Carlo, a powerful Machine Learning technique that I will talk about in future posts.
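To make the memoryless property a bit more concrete, here is a minimal sketch in Python using only the standard library. The function name `mean_remaining_wait`, the sample size and the seed are my own choices for illustration, and the rate of 4 events per 60 minutes is simply the bus example used in this post. It draws Exponential gaps and measures how much longer we still have to wait, given that we have already waited some number of minutes.

```python
import random

def mean_remaining_wait(rate, elapsed, n=200_000, seed=0):
    """Average extra wait, given that `elapsed` minutes have already passed
    without an event. Gaps are drawn from an Exponential(rate) distribution."""
    rng = random.Random(seed)
    remaining = []
    while len(remaining) < n:
        gap = rng.expovariate(rate)          # time between two events
        if gap > elapsed:                    # keep only gaps still "running"
            remaining.append(gap - elapsed)  # what is left to wait
    return sum(remaining) / len(remaining)

rate = 4 / 60  # assumed rate: 4 events per 60 minutes, i.e. mean gap of 15

fresh = mean_remaining_wait(rate, elapsed=0)       # just started waiting
after_ten = mean_remaining_wait(rate, elapsed=10)  # already waited 10 minutes

print(f"expected wait from scratch:        {fresh:.2f} min")
print(f"expected wait after 10 min waited: {after_ten:.2f} min")
```

Both numbers should come out close to 15 minutes: having already waited 10 minutes does not bring the next event any closer on average.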
Back to our topic: The Waiting Paradox is based on the Poisson process. If we think about it, the intuitive answer does make sense in a perfect world: if every bus arrives exactly on time (in our case every 15 minutes), the waiting time of a passenger who shows up at a random moment is uniformly distributed between 0 and 15 minutes, and the expected waiting time is E(X) = 15/2 = 7.5 minutes. And that’s what people intuitively believe. However, this never happens in real life, i.e. the buses never come exactly on time. The assumption of perfectly equal intervals is too strong, so we need something more realistic, like: on average there are 4 buses arriving per hour (which still conveys the idea of 15 minutes per bus, but in a much weaker way). We then model the arrivals with a Poisson process with rate λ = 4 buses/60 minutes = 1/15 per minute. Thus, as mentioned earlier, the time between 2 buses (2 events) follows an Exponential distribution with mean 1/λ = 15 minutes. And because this distribution is memoryless, it does not matter how long ago the previous bus left when we reach the stop: our expected waiting time is still E(X) = 1/λ = 15 minutes.
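The two scenarios above can be compared with a small simulation. This is only a sketch under my own assumptions: the names `average_wait`, `MEAN_GAP`, `N_BUSES` and `N_PASSENGERS`, the sample sizes and the seed are all made up for illustration. It builds one timetable where buses come exactly every 15 minutes and one where the gaps are Exponential with a mean of 15 minutes, then drops passengers at uniformly random times and averages how long each one waits for the next bus.

```python
import bisect
import random
from itertools import accumulate

MEAN_GAP = 15.0         # average minutes between buses (4 buses per hour)
N_BUSES = 100_000       # length of each simulated timetable
N_PASSENGERS = 200_000  # passengers dropped at random times

def average_wait(bus_times, n_passengers, rng):
    """Mean wait of passengers arriving uniformly at random before the last bus."""
    horizon = bus_times[-1]
    total = 0.0
    for _ in range(n_passengers):
        t = rng.uniform(0, horizon)
        # first bus that arrives at or after time t
        next_bus = bus_times[bisect.bisect_left(bus_times, t)]
        total += next_bus - t
    return total / n_passengers

rng = random.Random(42)

# Perfect world: a bus exactly every 15 minutes.
regular = [MEAN_GAP * (i + 1) for i in range(N_BUSES)]
# Poisson process: Exponential gaps with the same 15-minute mean.
poisson = list(accumulate(rng.expovariate(1 / MEAN_GAP) for _ in range(N_BUSES)))

regular_wait = average_wait(regular, N_PASSENGERS, rng)
poisson_wait = average_wait(poisson, N_PASSENGERS, rng)

print(f"fixed schedule:  {regular_wait:.2f} min")
print(f"Poisson process: {poisson_wait:.2f} min")
```

The fixed schedule should give roughly 7.5 minutes and the Poisson one roughly 15. The reason is that a passenger arriving at a random time is more likely to land inside a long gap than a short one, simply because long gaps cover more of the timeline.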
This little example shows us that sometimes our “mental model” can be misleading because it is far simpler than the real-life situation. As scientists and engineers, our mission is to resist the “temptation” of our intuition and to use scientific knowledge to back up our arguments.