Understanding Markov Chains through Dice Rolling Simulations
Why should Markov Chains matter to you?
Markov Chains serve as a fundamental element in numerous Machine Learning frameworks, including Hidden Markov Models (HMM) and Reinforcement Learning. They extend their utility beyond AI, influencing fields like Finance—particularly in analyzing stock price trends—and Engineering Physics, as seen in Brownian motion. A prominent application of Markov Chains is Google's original PageRank algorithm, which evaluated page importance based on this methodology.
Additionally, I find the underlying principles of Markov Chains intriguing because they offer a distinctive perspective compared to many conventional Data Science methods. Typically in Machine Learning, predictions about future events are made based on historical data. In contrast, Markov Chains focus solely on the current state to forecast future outcomes—an approach I will clarify shortly.
How would you forecast the weather?
Imagine you were tasked with predicting the weather for an entire year in a specific location. How would you tackle this?
One method involves determining the likelihood of rainy versus sunny days. For instance, if you reside in London, where precipitation occurs approximately one-third of the time, you could assign a probability of P(Raining) = 1/3 and P(Sunny) = 2/3. A simple Python simulation could then yield your predictions.
An example outcome might appear as follows:
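As a rough sketch (not the article's original notebook), this naive simulation could be written with NumPy; the variable names and the fixed seed are my own choices:

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # fixed seed so the run is reproducible

# Each day is drawn independently: rain with probability 1/3, sun with 2/3
p_rain = 1 / 3
forecast = rng.choice(["Raining", "Sunny"], size=365, p=[p_rain, 1 - p_rain])

print(forecast[:7])                    # the first simulated week
print((forecast == "Raining").mean())  # close to 1/3 over the whole year
```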
However, there’s a crucial limitation to this approach. In reality, we know that:
- If it rains one day, the likelihood of rain the next day increases.
- Conversely, if it’s sunny today, the chances of a sunny day tomorrow are higher.
Our initial model overlooks this correlation. Ideally, we want a model in which a rainy day increases the probability of rain the following day, and a sunny day increases the probability of sun the day after.
This is where Markov Chains come into play.
Markov Chain for Weather Simulation
Let’s outline how a Markov Chain would be structured for our weather forecasting scenario:
A Markov Chain comprises three main components: states (in this case, rainy or sunny), events (such as transitioning from sunny to rainy), and the probabilities assigned to each event (e.g., a 0.3 probability of moving from sunny to rainy or remaining rainy).
In this model, rather than assigning daily probabilities for sunny or rainy weather, we incorporate probabilities for staying in a sunny or rainy state, as well as for transitioning to the other state.
Formally, we consider two states: sunny and rainy, each with a probability for either leaving the state or remaining in it.
Transition Matrix
For simulations, we utilize a transition matrix instead of a diagram. Our transition matrix would look like this:
How do we interpret this matrix? Each row corresponds to the current state and each column to the next state, so an entry is the probability of moving from its row's state to its column's state. For instance, the bottom-left element can be read as:
> The probability of transitioning from a sunny day to a rainy one is 0.3.
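To make this concrete, the matrix can be written as a NumPy array and used to simulate the weather day by day. The sunny row (0.3 to rainy, 0.7 to sunny) follows the example above; the rainy row is only a placeholder I chose for illustration, not a value from the original diagram:

```python
import numpy as np

states = ["Raining", "Sunny"]

# Rows = today's weather, columns = tomorrow's weather (order: Raining, Sunny).
# P[1, 0] = 0.3 is the sunny -> rainy probability from the text;
# the rainy row (0.6 / 0.4) is an assumed, illustrative value.
P = np.array([
    [0.6, 0.4],   # Raining -> Raining, Raining -> Sunny  (assumed)
    [0.3, 0.7],   # Sunny   -> Raining, Sunny   -> Sunny
])

rng = np.random.default_rng(seed=0)
today = 1                      # start the year on a sunny day
year = [today]
for _ in range(364):
    today = rng.choice(2, p=P[today])  # tomorrow depends only on today
    year.append(today)

print([states[d] for d in year[:7]])
```

Because each day's distribution is read from the row of the previous day, rainy days now tend to follow rainy days, which is exactly the correlation the independent model above was missing.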
Dice Rolling Example
Now, let's consider a more complex example involving dice rolls. Imagine two players, A and B, who take turns rolling a six-sided die.
Player A bets that two consecutive rolls summing to 10 will appear first, while Player B bets that two consecutive sixes will appear first. The game continues until one of these outcomes occurs.
What is the probability of Player A winning?
How do we model this? You guessed it—using Markov Chains.
Markov Chain for Dice Rolls
Let’s explore how to approach this scenario.
For Player A, the only combinations of two consecutive rolls that add up to 10 are a 5 followed by another 5, a 4 followed by a 6, and a 6 followed by a 4. Player B's only winning condition is rolling two sixes consecutively.
This means that rolling a 1, 2, or 3 becomes irrelevant for our analysis: these rolls represent our Initial State S. The other relevant states are 4, 5, and 6.
Here’s a preliminary draft of our Markov Chain:
We begin with State S (rolling a 1, 2, or 3), which has a probability of 3/6 = 0.5 of remaining in State S. Each of the other relevant states (4, 5, and 6) has a 1/6 probability of staying where it is, a 1/6 probability of transitioning to either of the other two, and a 0.5 probability of moving back to State S. It's important to note that the outgoing probabilities of every state must add up to 1.
This setup looks promising, but we still need to include details about the specific outcomes we care about. Let’s refine that:
This diagram is similar to the previous one but now includes the terminal states (marked in green for Player A's wins and red for Player B's wins). These terminal states have a probability of 1 of remaining in themselves and are therefore called absorbing states; once one of them is reached, the game concludes.
Notably, transitions can occur between state 5 and state 6 and between state 5 and state 4, but not between states 4 and 6: if a 6 is rolled right after a 4 (or a 4 right after a 6), the game ends immediately in an absorbing state, which explains the absence of arrows between those two states.
Despite these clarifications, we still need to determine the probability of Player A winning. To achieve that, we must conduct simulations using the transition matrix of this Markov Chain.
I created four additional states beyond rolling a 1, 2, 3, 4, 5, or 6: rolling a 4 followed by a 6 (46), rolling two fives in succession (55), rolling a 6 followed by a 4 (64), and rolling two sixes in a row (66).
Player A wins when reaching states 46, 55, or 64, while Player B wins upon reaching state 66.
Running Simulations
Let’s see how we can implement simulations using this transition matrix with Python. The code is available on my GitHub.
Initially, we define s, a vector representing the probability of reaching each state at the start. The order mirrors that of the transition matrix—where the first value s[0] corresponds to rolling a 1, s[1] corresponds to a 2, and so forth until the state 66 (rolling two sixes).
The first six entries hold a probability of 1/6 (0.1667) since any of the six faces can come up on the first roll. The absorbing states (46, 55, 64, and 66) hold a probability of 0 since they cannot be reached with a single roll.
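As a sketch, that vector can be written with exact fractions (the rounded 0.1667 mentioned above would work just as well):

```python
import numpy as np

# State order: 1, 2, 3, 4, 5, 6, 46, 55, 64, 66.
# After the first roll we are in one of the six "face" states with equal probability;
# the four absorbing states cannot be reached yet.
s = np.array([1/6, 1/6, 1/6, 1/6, 1/6, 1/6, 0, 0, 0, 0])
```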
Next, we define P, the transition matrix established earlier, as a NumPy array.
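The matrix can be reconstructed from the state descriptions above; here is a sketch, again with exact 1/6 entries so that each row sums to 1 up to floating-point rounding:

```python
# Row = current state, column = next state (same order as s).
P = np.array([
    # 1    2    3    4    5    6    46   55   64   66
    [1/6, 1/6, 1/6, 1/6, 1/6, 1/6, 0,   0,   0,   0  ],  # after a 1
    [1/6, 1/6, 1/6, 1/6, 1/6, 1/6, 0,   0,   0,   0  ],  # after a 2
    [1/6, 1/6, 1/6, 1/6, 1/6, 1/6, 0,   0,   0,   0  ],  # after a 3
    [1/6, 1/6, 1/6, 1/6, 1/6, 0,   1/6, 0,   0,   0  ],  # after a 4: rolling a 6 ends the game (46)
    [1/6, 1/6, 1/6, 1/6, 0,   1/6, 0,   1/6, 0,   0  ],  # after a 5: rolling a 5 ends the game (55)
    [1/6, 1/6, 1/6, 0,   1/6, 0,   0,   0,   1/6, 1/6],  # after a 6: a 4 (64) or a 6 (66) ends the game
    [0,   0,   0,   0,   0,   0,   1,   0,   0,   0  ],  # 46 is absorbing (Player A wins)
    [0,   0,   0,   0,   0,   0,   0,   1,   0,   0  ],  # 55 is absorbing (Player A wins)
    [0,   0,   0,   0,   0,   0,   0,   0,   1,   0  ],  # 64 is absorbing (Player A wins)
    [0,   0,   0,   0,   0,   0,   0,   0,   0,   1  ],  # 66 is absorbing (Player B wins)
])
```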
Before executing the simulations, we should verify that each row of P sums to 1 (a useful check for understanding transition matrices).
In our case, all rows sum to 1 (slightly above 1 for the first six rows, because the code uses a rounded approximation of 1/6).
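A quick way to run that check on the matrix sketched above:

```python
print(P.sum(axis=1))                    # one value per row, each should be (about) 1.0
assert np.allclose(P.sum(axis=1), 1.0)  # fails if any row is not a valid distribution
```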
Now, we can proceed to run the simulations. This involves multiplying P by itself repeatedly and then applying it to our initial state vector s. This simulates numerous dice rolls starting from the initial state s.
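As a sketch of that computation (the number of steps is an arbitrary large value; by then the chain has almost surely landed in an absorbing state):

```python
n_steps = 1000                            # "many" rolls; any sufficiently large value works
P_n = np.linalg.matrix_power(P, n_steps)  # P multiplied by itself n_steps times
result = s @ P_n                          # distribution over the ten states after n_steps rolls
print(result.round(4))
```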
The results are as follows:
After conducting a large number of rolls, we observe that the probabilities for the first six states are effectively zero, signifying that either Player A or Player B will eventually win, concluding the game.
The first three non-zero probabilities in the vector represent states 46, 55, and 64 (Player A's victories). Summing these probabilities gives us the overall probability of Player A winning.
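With the sketch above, that sum is just three entries of `result` (indices 6, 7, and 8 in the state order used for s and P):

```python
p_a_wins = result[6] + result[7] + result[8]  # states 46, 55 and 64
p_b_wins = result[9]                          # state 66
print(round(p_a_wins, 2), round(p_b_wins, 2))
```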
The conclusion reveals that Player A has a winning probability of 0.77, while Player B's chance stands at 0.23 (1 - 0.77).
Further Exploration
I hope this introduction to Markov Chains has piqued your interest.
I encourage you to explore different scenarios involving dice rolls and replicate the steps outlined here (such as aiming for a sum of 7 or adding a third player) to solidify your understanding of this concept. Additionally, feel free to utilize my code for simulations with NumPy.
I hope you found this tutorial insightful! Your feedback would be greatly appreciated.
Connect with me on LinkedIn and GitHub to discuss Data Science and Machine Learning, and follow me on Medium! You can also support my work by becoming a Medium member through this link.