The Law of Parsimony in Statistics

The principle of parsimony, also known as Occam's razor, says that when more than one explanation fits the facts, we should choose the simplest one that is adequate. Applying the principle of parsimony, we tend to prefer the explanation that postulates the fewest entities. Parsimony is not about simplicity for its own sake, however; it is about choosing the simplest explanation that is still adequate. We can therefore say that the principle is justified as "the simplest hypothesis containing all the information necessary to account for the experience at hand." The principle of parsimony applies to many scenarios in daily life, including predictions from data science models. As an example of trading accuracy for parsimony, consider Netflix as a case study. In 2006, Netflix announced the Netflix Prize, worth $1 million to whoever improved its recommendation system by 10%. The objective was to reduce the root mean squared error (RMSE) from 0.9525 to 0.8572 or less. One year after the start of the contest, the RMSE had been improved by 8.43%, and the leading model was an ensemble of 107 algorithms; Netflix integrated a subset of these algorithms into its recommendation system.
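As a quick sanity check, the Netflix Prize target follows directly from the baseline RMSE (a minimal sketch; only the 0.9525 baseline and the 10% goal are taken from the paragraph above):

```python
# Netflix Prize goal: reduce the baseline RMSE of 0.9525 by 10%.
baseline_rmse = 0.9525
target_rmse = baseline_rmse * 0.90  # 10% reduction -> 0.85725

improvement = (baseline_rmse - target_rmse) / baseline_rmse
print(target_rmse, improvement)  # improvement is exactly the 10% goal
```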

Two years later, the magic 10% mark was crossed. However, when Netflix analyzed some of the methods that pushed the RMSE below the target, it concluded that the additional accuracy gains did not justify the engineering effort required to bring those methods into a production environment. Netflix therefore opted for the more parsimonious, slightly less accurate model. In other words, whether to accept more complexity in exchange for additional accuracy is not a data science decision but a business decision. Applied to statistics, a model that has few parameters but achieves a satisfactory goodness of fit should be preferred to a model that has many parameters and achieves only a slightly better fit. There are two main approaches to modelling in statistics. One is to develop a model that is easy to interpret and explains the relationship between X and Y. The other is to create a model that delivers accurate predictions, regardless of its functional form or complexity. In a perfect world, we would build a model that is simple, interpretable, and has the highest predictive power. In practice, however, this is rarely feasible.

In this blog post, I compare the two modelling approaches and explain in which situations parsimony or accuracy is preferred. The law of parsimony tells us that when there are alternative explanations for an event, the simplest one is the most likely to be correct. (The context of the quote is irrelevant to my point here, which is that "the law of parsimony" is sometimes used to support a naïve view of causality in which an event can have only one explanation.)

In biology, the principle appears when determining evolutionary relationships between species. These relationships are represented by phylogenetic trees, constructed by identifying common ancestors. The principle of parsimony applies here when one chooses the phylogenetic tree that requires the fewest evolutionary changes. (See also Epstein, R. (1984). The principle of parsimony and some applications in psychology. The Journal of Mind and Behavior, 119–130.)

Comparison between hierarchical and stepwise methods: hierarchical and stepwise methods both involve adding predictors to the model, and it is useful to know whether those additions actually improve it (Field, 2013). Since higher R² values indicate better fit, an easy way to check whether a model has improved after adding predictors is to see whether its R² is higher than that of the old model. However, R² always grows as predictors are added, so the real question is whether it grows significantly (Field, 2013). The significance of the change in R² can be assessed with an F statistic, just as the significance of R² itself is (Field, 2013): F_change = (ΔR² / k_added) / ((1 − R²_new) / (N − k_new − 1)), where ΔR² is the change in R², k_added is the number of predictors added, k_new is the total number of predictors in the new model, and N is the sample size.

Consider two cases: case 1, where there are a total of 8 pieces of supporting evidence to explain an event, and case 2, where there are 5 pieces of supporting evidence to explain the same event.
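The F test for the change in R² can be sketched in a few lines of Python (a minimal sketch; the function name and the example numbers are hypothetical, not from the text):

```python
def f_change(r2_old, r2_new, n, k_old, k_new):
    """F statistic for the change in R^2 when moving from a nested model
    with k_old predictors to a larger model with k_new predictors
    (Field, 2013). n is the sample size."""
    k_added = k_new - k_old
    return ((r2_new - r2_old) / k_added) / ((1 - r2_new) / (n - k_new - 1))

# Hypothetical example: adding 2 predictors raises R^2 from 0.30 to 0.35
# with N = 100 observations.
f = f_change(r2_old=0.30, r2_new=0.35, n=100, k_old=2, k_new=4)
print(round(f, 3))  # compare against the F distribution with (2, 95) df
```

If the resulting F exceeds the critical value for (k_added, N − k_new − 1) degrees of freedom, the added predictors improved the model significantly.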

So, by the principle of parsimony, we tend to choose case 2, provided that all the evidence is important and relevant. There are two main pitfalls in building models for parsimony or accuracy: underfitting and overfitting. If we create a very simple model, there is a risk that it will not pick up the real signal. Such a model underfits the data, and we fail to capture the relationship between X and Y. On the other hand, if we build a model for predictive purposes, we want to maximize predictive accuracy, which is often achieved with very complex models. However, the more complex a model is, the greater the risk that it captures noise in the training data. We then get an almost perfect fit on the training data, but the model predicts poorly on new observations. This phenomenon is called overfitting.
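To make the trade-off concrete, here is a small sketch (assuming NumPy; the data and polynomial degrees are hypothetical): a degree-6 polynomial interpolates seven noisy points from a linear process perfectly, yet extrapolates far worse than a simple straight-line fit.

```python
import numpy as np

# Seven points from a linear signal y = 2x plus small, fixed "noise".
x = np.arange(7.0)
noise = np.array([0.2, -0.1, 0.3, -0.2, 0.1, -0.3, 0.2])
y = 2.0 * x + noise

simple = np.polyfit(x, y, 1)     # parsimonious: a straight line
complex_ = np.polyfit(x, y, 6)   # flexible: interpolates all 7 points

# Training fit: the degree-6 model has (near) zero residuals ...
assert np.allclose(np.polyval(complex_, x), y, atol=1e-6)

# ... but at a new point x = 7 (true value 14) it predicts wildly,
# while the simple model stays close: it fit the noise, not the signal.
err_simple = abs(np.polyval(simple, 7.0) - 14.0)
err_complex = abs(np.polyval(complex_, 7.0) - 14.0)
print(err_simple, err_complex)
```

The complex model "wins" on the training data and loses badly on the new observation, which is exactly the overfitting pattern described above.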

Akaike weights are based on the AIC. There is generally a trade-off between goodness of fit and parsimony: models with low parsimony (i.e., models with many parameters) tend to fit the data better than highly parsimonious models. This is not necessarily a good thing; adding parameters usually produces a model that fits the available data well, but the same model is likely to be useless for predicting new data sets. The idea behind parsimonious models comes from Occam's razor, or "the law of brevity" (sometimes called lex parsimoniae in Latin). The law states that you should not use more "things" than necessary; in the case of statistical models, these "things" are parameters. Parsimonious models have optimal parsimony: just the right number of predictors needed to explain the data adequately. Developing and validating a model on the same dataset is sometimes referred to as data dredging. Although there is an underlying relationship between the variables, and stronger relationships tend to produce more extreme statistics (e.g., higher t-statistics), these statistics are random variables, and their realized values contain error. So when we select variables based on high (or low) realized values, the selection may reflect the underlying true values, the error, or both. If we do that, we will be just as surprised as the coach after the second race.
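Akaike weights, mentioned above, can be computed directly from a set of AIC values (a minimal sketch; the function name and the three AIC values are hypothetical):

```python
import math

def akaike_weights(aics):
    """Convert AIC values into Akaike weights: each model's relative
    likelihood exp(-delta_i / 2), normalized so the weights sum to 1."""
    best = min(aics)
    rel = [math.exp(-(a - best) / 2.0) for a in aics]
    total = sum(rel)
    return [r / total for r in rel]

# Hypothetical AICs for three candidate models; lower AIC is better,
# so the first model receives the largest weight.
weights = akaike_weights([100.0, 101.2, 104.7])
print([round(w, 3) for w in weights])
```

A weight can be read as the probability, among the candidate set, that a given model is the best approximating one, which makes the fit-versus-parsimony trade-off explicit.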

This is true whether we select variables based on high t-statistics or low intercorrelations. One of the most important themes in this book is model simplification. The principle of parsimony is attributed to the early-14th-century English nominalist philosopher William of Ockham, who insisted that, given a set of equally good explanations for a phenomenon, the correct explanation is the simplest one. It is called Occam's razor because he "shaved" his explanations down to the bare minimum: his point was that, in explaining something, entities should not be multiplied unnecessarily. In particular, things whose existence is not known should not be postulated as existing for explanatory purposes, except in cases of absolute necessity. For statistical modelling, the principle of parsimony means that models should have as few parameters as possible (minitab.com. (2015). The danger of overfitting regression models. Retrieved from blog.minitab.com/blog/adventures-in-statistics-2/the-danger-of-overfitting-regression-models).

Speaking of parsimony, I came across a quote in Commentary magazine (page 80 of the December 2004 issue). To reiterate: how much parsimony matters relative to accuracy is not a fixed rule; it depends on the purpose and resources of the individual or company.

We have, however, addressed the situations in which parsimony or accuracy is desired. The need for parsimony and interpretability explains, for example, why logistic regression is often preferred to discriminant analysis: its coefficients can be estimated and interpreted clearly, since they represent effects on the log-odds.
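The interpretability point about logistic regression can be illustrated in one line (a sketch; the coefficient value is hypothetical): exponentiating a coefficient turns a log-odds effect into an odds ratio.

```python
import math

beta = 0.693  # hypothetical logistic regression coefficient (log-odds scale)
odds_ratio = math.exp(beta)
print(round(odds_ratio, 2))  # ~2.0: each unit increase in X doubles the odds
```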
