Decoding Westeros: Optimizing Predictive Analytics in the Game of Thrones Universe

Mehmet Emre CETIN
Sep 22, 2023
Updated: Nov 26, 2023

Decision Tree, Pruning, Bagging, and Random Forest, with a fantastical twist incorporating elements from the world of Game of Thrones (image made by DALL-E).

Unraveling the Game of Thrones Dataset: Insights and Inconsistencies

The first impression of the Game of Thrones data is that everyone is classified, one way or another, as either married or not married. People in this dataset found a way to legitimize their road to victory, in other words, to sitting on the Iron Throne, which is quite mind-boggling. Marriage is essential in the Game of Thrones universe: people married for military control, for alliances, for any kind of power, and of course for money (Bechky, 2015). There would be a marriage list for the White Walkers too if their names were added to this dataset. Marriage is that vital.

Speaking of White Walkers, why are they not in this dataset? Even the Night King did not make it in, although he is in the books and is a legendary character and killer (A Wiki of Ice and Fire). They have the will to cleanse the whole of Westeros. Staying alive depends precisely on whether there is a White Walker in this universe or not, and there would be no winter discussion without them. Some may argue that they are already dead, which would be a plausible idea. However, how can they hold a self-aware opinion (Refinery29, 2017) on how to take revenge on Westeros if they are completely dead? So leaving out the White Walkers deserves criticism.

Judging who is going to die is closely tied to knowledge of this universe. Ignorance might be bliss (Gray, 1747), but not in Game of Thrones. One has to know one's enemies, and literally everything else, because one can easily die. Speaking of 'ones', Jaqen H'ghar is in the dataset, but according to this dataset he did not appear in any of the books. He would be a colourful character in the books too. Oh wait, he is in the books (A Wiki of Ice and Fire; Martin, 2005, 2011, 2012; Diamant, n.d.), yet this dataset states that he is not. Furthermore, in the TV series he is a Faceless Man; in this dataset he would be a dataless man, because finding data to identify him is almost impossible. One could say: a man has to hide, or a man has no data.

According to this dataset, the Faceless Man has no relation to dead people. How should this be interpreted? One option is that he kills others and no one knows about it. Another is that he does not do anything at all. A third is that this dataset has no information about his business in Westeros. The options could be extended to the extraordinary, but guessing what happened is not necessary and is beyond the scope of this paper. In any case, this dataset has missing information, which is no longer a surprise for anyone analyzing it a second time.

 age           isAlive        popularity      numDeadRelations
 Mode :logical   Mode :logical   Mode :logical   Mode :logical   
 FALSE:431       FALSE:431       FALSE:431       FALSE:431       
 isMarried         book1           book2           book3        
 Mode :logical   Mode :logical   Mode :logical   Mode :logical  
 FALSE:431       FALSE:431       FALSE:431       FALSE:431      
   book4           book5            plod           gender       
 Mode :logical   Mode :logical   Mode :logical   Mode :logical  
 FALSE:431       FALSE:431       FALSE:431       FALSE:431      
 above_avg      
 Mode :logical  
 FALSE:431

Constructing the Model for Decision Tree (pruning) - Bagging - Random Forest

Dates of birth are contradictory in places. For example, Aegon Targaryen (son of Rhaegar) was born in 281 and is two years old, while Barra was born in 298 and is one year old. These two do not share the same timeline, and they will not have the same people around them: the person who is going to kill one of them might already have died in the other's timeline. More examples could be given, but one is enough. Date of birth is removed from the model to avoid these predicaments. Ramsay Snow (aka Bolton) has been selected as a classifier. Anyone who has watched the show or read the books without knowing him cannot yet claim to know them. He can be considered young. He is married. Unlike Margaery Tyrell, he represents the other side of the Game of Thrones universe (Game of Thrones Fandom): he likes to play mind games by torturing people, and he exploits others' weaknesses rather than making alliances with them. It is questionable whether he could rule the whole of Westeros. On the other hand, he could be a threshold for defining GoT's realness.

Classification Tree

Classification tree:
tree(formula = above_avg ~ . - popularity, data = train_G)
Variables actually used in tree construction:
[1] "numDeadRelations" "book1"            "age"              "book2"           
[5] "isMarried"        "plod"             "book4"            "gender"          
Number of terminal nodes:  13 
Residual mean deviance:  0.3633 = 120.6 / 332 
Misclassification error rate: 0.07826 = 27 / 345

The first observation from the Decision (Classification) Tree model is that the tree has 13 terminal nodes and misclassifies 0.07826 (27 of 345) of the training observations.
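The two summary figures in the output can be reproduced by hand. The sketch below is an illustration in Python rather than the R used for the analysis; it assumes that the residual mean deviance is the total deviance divided by the number of observations minus the number of terminal nodes, which matches the 120.6 / 332 shown in the output.

```python
# Figures copied from the tree summary.
n_obs = 345          # training observations
n_leaves = 13        # terminal nodes
deviance = 120.6     # total residual deviance
misclassified = 27   # wrongly classified training observations

# Assumed definition: deviance per residual degree of freedom.
residual_mean_deviance = deviance / (n_obs - n_leaves)
error_rate = misclassified / n_obs

print(round(residual_mean_deviance, 4))  # 0.3633
print(round(error_rate, 5))              # 0.07826
```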

One interpretation of this tree is that a man can be identified as popular if he has more than one dead relation, appears in book 1, and is older than 20.

tree_pred  No Yes
      No  294  17
      Yes  10  24

It can be argued that 92% correct predictions on the training dataset is enough. However, only 72 people in the original dataset have a popularity of 0.5. One approach could be to remove popularity from the dataset, but this is not feasible when the question is who is going to live. No one is going to make plans for Torrek, whose popularity is 0 (A Wiki of Ice and Fire) and who is eventually going to die. Obviously, the importance of Torrek's survival is not comparable to that of Tywin Lannister (Game of Thrones Fandom). All in all, accepting a slightly lower training result might be better.

Pruning Graph - Cross Validation and Deviances

Based on the cross-validation results, a tree size of 5 has been chosen for pruning.

Classification Tree after pruning - 5 Branches

Unfortunately, there is only one "Yes" in this tree. So, finding a popular person might depend on a different classification threshold; 0.45 might not be the optimal choice.

tree_pred  No Yes
      No  294  17
      Yes  10  24

Training misclassification rate is 27/345.

tree_predT No Yes
       No  75   9
       Yes  1   1

Misclassification rate for test data set is 10/86.

Confusion Matrix and Statistics

          Reference
Prediction  No Yes
       No  294  17
       Yes  10  24
                                          
               Accuracy : 0.9217          
                 95% CI : (0.8882, 0.9478)
    No Information Rate : 0.8812          
    P-Value [Acc > NIR] : 0.009423        
                                          
                  Kappa : 0.5965          
                                          
 Mcnemar's Test P-Value : 0.248213        
                                          
            Sensitivity : 0.9671          
            Specificity : 0.5854          
         Pos Pred Value : 0.9453          
         Neg Pred Value : 0.7059          
             Prevalence : 0.8812          
         Detection Rate : 0.8522          
   Detection Prevalence : 0.9014          
      Balanced Accuracy : 0.7762          
                                          
       'Positive' Class : No

The model's accuracy on the training data set is 92%, with a 95% confidence interval of (0.8882, 0.9478). The Kappa score is 0.5965, which can be described as moderate (McHugh, 2012).
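To make the accuracy and Kappa figures less of a black box, here is a minimal Python sketch that recomputes them from the training confusion matrix. It assumes the standard definition of Cohen's Kappa (observed agreement corrected for chance agreement); the function name is hypothetical and not part of the original analysis.

```python
def accuracy_and_kappa(matrix):
    """matrix[i][j] holds the count for prediction i and reference j."""
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


train_tree = [[294, 17], [10, 24]]
acc, kappa = accuracy_and_kappa(train_tree)
print(round(acc, 4), round(kappa, 4))  # 0.9217 0.5965
```

The gap between 0.92 accuracy and 0.60 Kappa comes from the class imbalance: always answering "No" would already be right 88% of the time.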

Confusion Matrix and Statistics

          Reference
Prediction No Yes
       No  75   9
       Yes  1   1
                                          
               Accuracy : 0.8837          
                 95% CI : (0.7965, 0.9428)
    No Information Rate : 0.8837          
    P-Value [Acc > NIR] : 0.58320         
                                          
                  Kappa : 0.1331          
                                          
 Mcnemar's Test P-Value : 0.02686         
                                          
            Sensitivity : 0.9868          
            Specificity : 0.1000          
         Pos Pred Value : 0.8929          
         Neg Pred Value : 0.5000          
             Prevalence : 0.8837          
         Detection Rate : 0.8721          
   Detection Prevalence : 0.9767          
      Balanced Accuracy : 0.5434          
                                          
       'Positive' Class : No

The model's accuracy on the test data set is 88.37%, with a 95% confidence interval of (0.7965, 0.9428), which is wider than the interval for the training data. The Kappa score is 0.1331, which indicates little to no agreement (McHugh, 2012).
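The class-wise rows of the output explain why accuracy alone is misleading here. The following Python sketch recomputes them from the test confusion matrix, assuming rows are predictions, columns are references, and 'No' is the positive class, as stated in the output. The helper name is hypothetical.

```python
def class_metrics(matrix):
    """Rows are predictions, columns are references; index 0 is the positive class."""
    tp, fp = matrix[0]
    fn, tn = matrix[1]
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "pos_pred_value": tp / (tp + fp),
        "neg_pred_value": tn / (tn + fn),
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


test_tree = [[75, 9], [1, 1]]
metrics = class_metrics(test_tree)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

A specificity of 0.10 means that only one of the ten truly popular characters in the test set is recognised as popular.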

Bagging

Bagged CART 

345 samples
 12 predictor
  2 classes: 'No', 'Yes' 

No pre-processing
Resampling: Cross-Validated (10 fold) 
Summary of sample sizes: 311, 311, 311, 310, 310, 310, ... 
Resampling results:

  Accuracy   Kappa    
  0.8929412  0.4334649

In Bagging, the model's cross-validated accuracy is 89.29%. The Kappa score is 0.4335, which can be described as moderate (McHugh, 2012).
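The "Summary of sample sizes" line follows from how 10-fold cross-validation splits the data: each fold is held out once, and the model is fitted on the rest. The Python sketch below assumes folds that are as equal in size as possible; the actual resampling may stratify by class, so this is only an approximation of what the software does.

```python
def training_sizes(n_samples, n_folds=10):
    """Size of the training part for each fold when folds are as even as possible."""
    base, extra = divmod(n_samples, n_folds)
    fold_sizes = [base + 1 if i < extra else base for i in range(n_folds)]
    return [n_samples - size for size in fold_sizes]


print(sorted(set(training_sizes(345))))  # [310, 311]
print(sorted(set(training_sizes(86))))   # [77, 78]
```

Both results agree with the sample sizes printed for the training and test runs.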

Variable Importance - Train Dataset

The first observation is that age, percentage likelihood of death (plod), and the number of dead relations are the most important variables. The Classification Tree uses these three variables too. Being in book 1 also finds room here.

Test Results

Bagged CART 

86 samples
12 predictors
 2 classes: 'No', 'Yes' 

No pre-processing
Resampling: Cross-Validated (10 fold) 
Summary of sample sizes: 78, 77, 78, 77, 77, 78, ... 
Resampling results:

  Accuracy   Kappa    
  0.8833333  0.3891925

The test data set's accuracy is 88.33%. The Kappa score is 0.3892, which falls just below the moderate range and can be described as fair (McHugh, 2012).
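Since the Kappa labels are used repeatedly, a small lookup keeps them consistent. The Python sketch below assumes the interpretation bands attributed to Cohen as summarised in McHugh (2012): at or below 0 no agreement, up to 0.20 none to slight, up to 0.40 fair, up to 0.60 moderate, up to 0.80 substantial, and up to 1.00 almost perfect. The function name is hypothetical.

```python
def interpret_kappa(kappa):
    """Map a Kappa value to a verbal label (bands as summarised in McHugh, 2012)."""
    if kappa <= 0:
        return "no agreement"
    bands = [
        (0.20, "none to slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("Kappa cannot exceed 1")


for value in (0.5965, 0.4334649, 0.3891925, 0.1331, 0.0):
    print(value, interpret_kappa(value))
```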

Variable Importance - Test Dataset

For the test data set, the number of dead relations is the most important variable. Percentage likelihood of death and book 1 follow. Book 2 also emerged as one of the strongest predictors.

Misclassification Rate

Confusion Matrix and Statistics

          Reference
Prediction  No Yes
       No  304   0
       Yes   0  41
                                     
               Accuracy : 1          
                 95% CI : (0.9894, 1)
    No Information Rate : 0.8812     
    P-Value [Acc > NIR] : < 2.2e-16  
                                     
                  Kappa : 1          
                                     
 Mcnemar's Test P-Value : NA         
                                     
            Sensitivity : 1.0000     
            Specificity : 1.0000     
         Pos Pred Value : 1.0000     
         Neg Pred Value : 1.0000     
             Prevalence : 0.8812     
         Detection Rate : 0.8812     
   Detection Prevalence : 0.8812     
      Balanced Accuracy : 1.0000     
                                     
       'Positive' Class : No

Bagging's misclassification results are concerning because its accuracy on the training data is 1, which may point to overfitting. Some adjustments need to be made to the model if Bagging is to be chosen. Kappa is 1, which is perfect agreement (McHugh, 2012).
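One possible reading of the perfect score, offered here as an assumption rather than a diagnosis, is that the bagged trees are being evaluated on rows they were trained on. Each tree in a bagged ensemble is fitted to a bootstrap sample drawn with replacement, and such a sample is expected to contain only a fixed share of the distinct rows. The Python sketch below computes that share for 345 observations.

```python
def expected_unique_fraction(n):
    """Expected share of distinct rows in a bootstrap sample of size n drawn from n rows."""
    return 1 - (1 - 1 / n) ** n


share = expected_unique_fraction(345)
print(round(share, 3))  # 0.633
```

Roughly a third of the rows are left out of each tree, so scoring every tree only on its left-out rows (an out-of-bag estimate) would usually be a more cautious check than predicting the full training set.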

Confusion Matrix and Statistics

          Reference
Prediction No Yes
       No  76   1
       Yes  0   9
                                          
               Accuracy : 0.9884          
                 95% CI : (0.9369, 0.9997)
    No Information Rate : 0.8837          
    P-Value [Acc > NIR] : 0.0002976       
                                          
                  Kappa : 0.9409          
                                          
 Mcnemar's Test P-Value : 1.0000000       
                                          
            Sensitivity : 1.0000          
            Specificity : 0.9000          
         Pos Pred Value : 0.9870          
         Neg Pred Value : 1.0000          
             Prevalence : 0.8837          
         Detection Rate : 0.8837          
   Detection Prevalence : 0.8953          
      Balanced Accuracy : 0.9500          
                                          
       'Positive' Class : No

The same scenario applies here. Bagging's misclassification results are concerning because its accuracy is 0.9884. Some adjustments need to be made to the model if Bagging is to be chosen. Kappa is 0.9409, which is almost perfect agreement (McHugh, 2012).

Random Forest

Before moving on to Cross-Validation, the misclassification rate has been examined.

    predictForest
       No Yes
  No  297   7
  Yes  19  22

The training misclassification rate is 26/345, almost the same as for the Classification Tree (27/345).

     predictForestt
      No Yes
  No  76   0
  Yes 10   0

For the test data set, the misclassification rate is 10/86, the same as for the Classification Tree.

In order, the number of dead relations, book 1, and percentage likelihood of death are the most important variables.

The interpretation of this graphic is that the error rate decreases by 11% in splits on the number of dead relations. The error rate also decreases by 5% in splits on book 1 and on the percentage likelihood of death.

Cross-Validation for Random Forest

 predictionCV
       No Yes
  No  294  10
  Yes  17  24

predictionCVt
      No Yes
  No  76   0
  Yes 10   0

After Cross-Validation, the same misclassification rates are reached, so the interpretation is not repeated.

Confusion Matrix and Statistics

          Reference
Prediction  No Yes
       No  294  17
       Yes  10  24
                                          
               Accuracy : 0.9217          
                 95% CI : (0.8882, 0.9478)
    No Information Rate : 0.8812          
    P-Value [Acc > NIR] : 0.009423        
                                          
                  Kappa : 0.5965          
                                          
 Mcnemar's Test P-Value : 0.248213        
                                          
            Sensitivity : 0.9671          
            Specificity : 0.5854          
         Pos Pred Value : 0.9453          
         Neg Pred Value : 0.7059          
             Prevalence : 0.8812          
         Detection Rate : 0.8522          
   Detection Prevalence : 0.9014          
      Balanced Accuracy : 0.7762          
                                          
       'Positive' Class : No

For Random Forest, accuracy on the training data set is 92.17%. The accuracy confidence interval is (0.8882, 0.9478), the same as for the Classification Tree. The Kappa score is 0.5965 (moderate) (McHugh, 2012), which is also the same.
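The "P-Value [Acc > NIR]" row asks whether the accuracy is better than always predicting the majority class. The Python sketch below assumes this is a one-sided exact binomial test of the number of correct predictions against the No Information Rate; the function name is hypothetical.

```python
from math import comb


def p_value_acc_gt_nir(correct, total, nir):
    """P(X >= correct) for X ~ Binomial(total, nir): a one-sided exact binomial test."""
    return sum(
        comb(total, k) * nir ** k * (1 - nir) ** (total - k)
        for k in range(correct, total + 1)
    )


# Training data: 318 of 345 correct, and 304 of 345 references are 'No'.
p_train = p_value_acc_gt_nir(318, 345, 304 / 345)
print(f"{p_train:.6f}")
```

Under that assumption the result should land close to the 0.009423 in the output, which suggests the training accuracy is above the majority-class baseline.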

Confusion Matrix and Statistics

          Reference
Prediction No Yes
       No  76  10
       Yes  0   0
                                          
               Accuracy : 0.8837          
                 95% CI : (0.7965, 0.9428)
    No Information Rate : 0.8837          
    P-Value [Acc > NIR] : 0.583199        
                                          
                  Kappa : 0               
                                          
 Mcnemar's Test P-Value : 0.004427        
                                          
            Sensitivity : 1.0000          
            Specificity : 0.0000          
         Pos Pred Value : 0.8837          
         Neg Pred Value :    NaN          
             Prevalence : 0.8837          
         Detection Rate : 0.8837          
   Detection Prevalence : 1.0000          
      Balanced Accuracy : 0.5000          
                                          
       'Positive' Class : No

The model's accuracy on the test data set is 88.37%, with a confidence interval of (0.7965, 0.9428). Both results are the same as for the Classification Tree. The Kappa score here, however, is zero, which is interpreted as no agreement (McHugh, 2012).
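The small McNemar's Test P-Value in this output says that the two kinds of error are unbalanced: all ten mistakes go in one direction. The Python sketch below assumes the chi-squared form of the test with continuity correction and one degree of freedom, applied to the two off-diagonal counts; the function name is hypothetical.

```python
from math import erfc, sqrt


def mcnemar_p_value(b, c):
    """McNemar's chi-squared test with continuity correction on off-diagonal counts b and c."""
    statistic = (abs(b - c) - 1) ** 2 / (b + c)
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    return erfc(sqrt(statistic / 2))


print(f"{mcnemar_p_value(10, 0):.6f}")  # Random Forest, test data
print(f"{mcnemar_p_value(9, 1):.5f}")   # Classification Tree, test data
print(f"{mcnemar_p_value(17, 10):.6f}") # Classification Tree, training data
```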

Conclusion for Classification Tree, Bagging, and Random Forest

Comparison for Classification Tree, Bagging, and Random Forest

The interpretation and comparison start with the Area Under the Curve (AUC), because no comments have been made about it so far. An AUC of 77% can be regarded as a good result for the Classification Tree and Random Forest (Ludidi et al., 2012; Tuan et al., 2008; Jong Won & Sun Hae, 2015). As for Bagging, obtaining 1 as a result is a bit concerning. F1 scores can help to interpret the table. The F1 scores of the Classification Tree and Random Forest are close to their AUC scores, but not at the expected level. If the two results had been really close, the models could be described as good classifiers (Chavez-Badiola et al., 2020). As for the misclassification rates, Random Forest has the lowest value. Overall, CT and RF could be the chosen models for this scenario at the model construction stage.
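The 77% AUC coincides with the balanced accuracy of 0.7762 reported for the training confusion matrix. That is what one would expect if the ROC curve was built from hard class labels rather than predicted probabilities, which is an assumption about how the AUC was computed. In that case the curve has a single interior point, and the trapezoid rule gives the area directly, as this Python sketch shows.

```python
def auc_trapezoid(points):
    """Area under a ROC curve given (false positive rate, true positive rate) points."""
    ordered = sorted(points)
    area = 0.0
    for (x1, y1), (x2, y2) in zip(ordered, ordered[1:]):
        area += (x2 - x1) * (y1 + y2) / 2
    return area


# Training matrix of the Classification Tree, with 'No' as the positive class.
tp, fp, fn, tn = 294, 17, 10, 24
tpr = tp / (tp + fn)
fpr = fp / (fp + tn)
auc = auc_trapezoid([(0.0, 0.0), (fpr, tpr), (1.0, 1.0)])
print(round(auc, 4))  # 0.7762
```

An AUC based on predicted probabilities could differ from this value, because it would use many thresholds instead of one.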

For the test data sets, the Classification Tree has the highest AUC score, which this time differs from that of Random Forest. Speaking of Random Forest, its F1 score is very low, which makes this model a bad classifier (IBM). Test results did not surpass the training results, which can be regarded as a good outcome. The Classification Tree's F1 score is closer to its AUC than in the training results, where the gap between CT's AUC and F1 score was a bit large.
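The F1 comparison can be sketched from the test confusion matrices. The Python example below assumes that 'Yes' (popular) is treated as the positive class for F1, which the text does not state; with 'No' as the positive class the numbers would look very different. The function name is hypothetical.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; defined as 0 when there are no positives at all."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Counts read from the test confusion matrices with 'Yes' as the positive class (an assumption).
tree_test = f1_score(tp=1, fp=1, fn=9)
forest_test = f1_score(tp=0, fp=0, fn=10)
print(round(tree_test, 4), forest_test)  # 0.1667 0.0
```

Under this assumption Random Forest never predicts "Yes" on the test data, so its F1 score collapses to zero even though its accuracy is 88%.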

In a nutshell, the Classification Tree can be selected as the final model. It proved its consistency in comparison with the others. A few suggestions for future researchers follow. First, changing the popularity measure could be a good idea for analyzing this data set through a different lens. Second, another classifier might be tried: the number of dead relations could be exciting to look at when it comes to understanding the Game of Thrones universe. Third, adding more classifiers might be colourful; for instance, popularity and being married (or not) as combined classifiers come to mind as a different approach to this data set. Fourth, to find a better approach to misclassification rates, looking at other researchers' work might be a plausible idea (for instance Bielsa et al., 2012; Zhou et al., 2020), because there could be plenty of reasons behind them.

Bibliography

Bechky, P. S. (2015). The International Law of Game of Thrones. Alabama Law Review Online, 67(1).
Bielsa, S., Porcel, J. M., Castellote, J., Mas, E., Esquerda, A., & Light, R. W. (2012). Solving the Light's criteria misclassification rate of cardiac and hepatic transudates. Respirology, 17(4), 721-726.
Chavez-Badiola, A., Farias, A. F. S., Mendizabal-Ruiz, G., Garcia-Sanchez, R., Drakeley, A. J., & Garcia-Sandoval, J. P. (2020). Predicting pregnancy test results after embryo transfer by image feature extraction and analysis using machine learning. Scientific reports, 10(1), 1-6.
Diamant, C. De-centring Human Agency in Pop Culture: the Slave Religions of R’hllor and the Many-Faced God in A Song of Ice and Fire by George R. Martin.
Gray, T. (1747). An ode on a distant prospect of Eton College (p. 51). London: R. Dodsley and sold.
Jong Won, M., & Sun Hae, L. (2015). Validation of the K6/K10 scales of psychological distress and their optimal cutoff scores for older Koreans. The International Journal of Aging and Human Development, 80(3), 264-282.
Ludidi, S., Conchillo, J. M., Keszthelyi, D., Van Avesaat, M., Kruimel, J. W., Jonkers, D. M., & Masclee, A. A. M. (2012). Rectal hypersensitivity as hallmark for irritable bowel syndrome: defining the optimal cutoff. Neurogastroenterology & Motility, 24(8), 729-e346.
Martin, G. R. (2005). A feast for crows. Bantam.
Martin, G. R. (2011). A dance with dragons (Vol. 5). Bantam.
Martin, G. R. (2012). A clash of kings (Vol. 2). Bantam.
McHugh M. L. (2012). Interrater reliability: the kappa statistic. Biochemia medica, 22(3), 276–282.
Tuan, N. T., Adair, L. S., He, K., & Popkin, B. M. (2008). Optimal cutoff values for overweight: using body mass index to predict incidence of hypertension in 18-to 65-year-old Chinese adults. The Journal of nutrition, 138(7), 1377-1382.
Zhou, X., Wang, X., Hu, C., & Wang, R. (2020). An analysis on the relationship between uncertainty and misclassification rate of classifiers. Information Sciences, 535, 16-27.