img - Nigeria's Premier Online Forum - Machine learning predicts World Cup winner
Welcome, Guest.
Did you miss your activation email?

Date: March 22, 2019, 04:05:28 AM


Machine learning predicts World Cup winner: Sports : - Nigeria's Premier Online Forum


Machine learning predicts World Cup winner

By: dayan (M) |Time : June 13, 2018, 07:58:35 PM
Researchers have predicted the outcome after simulating the entire soccer tournament 100,000 times.

by Emerging Technology from the arXiv

The 2018 soccer World Cup kicks off in Russia on Thursday and is likely to be one of the most widely viewed sporting events in history, more popular even than the Olympics. So the potential winners are of significant interest.

One way to gauge likely outcomes is to look at bookmakers’ odds. These companies use professional statisticians to analyze extensive databases of results in a way that quantifies the probability of different outcomes of any possible match. In this way, bookmakers can offer odds on all the games that will kick off in the next few weeks, as well as odds on potential winners.

An even better estimate comes from combing the odds from lots of different bookmakers. This approach suggests Brazil is the clear favorite to win the 2018 World Cup, with a probability of 16.6 percent, followed by Germany (12.8 percent) and Spain (12.5 percent).

But in recent years, researchers have developed machine-learning techniques that have the potential to outperform conventional statistical approaches. What do these new techniques predict as the likely outcome of the 2018 World Cup?

An answer comes from the work of Andreas Groll at the Technical University of Dortmund in Germany and a few colleagues. These guys use a combination of machine learning and conventional statistics, a method called a random-forest approach, to identify a different most likely winner.

First some background. The random-forest technique has emerged in recent years as a powerful way to analyze large data sets while avoiding some of the pitfalls of other data-mining methods. It is based on the idea that some future event can be determined by a decision tree in which an outcome is calculated at each branch by reference to a set of training data.

However, decision trees suffer from a well-known problem. In the latter stages of the branching process, decisions can become severely distorted by training data that is sparse and prone to huge variation at this kind of resolution, a problem known as overfitting.

The random-forest approach is different. Instead of calculating the outcome at every branch, the process calculates the outcome of random branches. And it does this many times, each time with a different set of randomly selected branches. The final result is the average of all these randomly constructed decision trees.

This approach has significant advantages. First, it does not suffer from the same overfitting problem that plagues ordinary decision trees.  It also reveals which factors are most important in determining the outcome.

So if a particular decision tree includes lots of parameters, it becomes easy to see which ones have the biggest impact on the outcome and which do not. These less important factors can then be ignored in future.

Groll and co use exactly this approach to model the 2018 World Cup. They model the outcome of each game the teams are likely to play and use the results to construct the most probable course of the tournament.

Groll and co begin with a wide range of potential factors that might determine the outcome. These include economic factors such as a country’s GDP and population, FIFA’s ranking of national teams, and the properties of the teams themselves, such as their average age, the number of Champions League players they have, whether they have home advantage, and so on.

Interestingly, the random-forest approach allows Groll and co to include other ranking attempts, such as the rankings used by bookmakers.

Plugging all this into the model provides some interesting insights. For example, the most influential factors turn out to be the team rankings created by other methods, including those from bookmakers, FIFA, and others.

Other important factors include GDP and the number of Champions League players on the team. Unimportant factors include the country’s population, the nationality of the coach, and so on.

The predictions arrived at through this process differ from others in some important ways. For a start, the random-forest method picks out Spain as the most likely winner, with a probability of 17.8 percent.

However, a big factor in this prediction is the structure of the tournament itself. If Germany clears the group phase of the competition, it is more likely to face strong opposition in the 16-team knockout phase. Because of this, the random-forest method calculates Germany’s chances of reaching the quarter-finals as 58 percent. By contrast, Spain is unlikely to face strong opposition in the final 16 and so has a 73 percent chance of reaching the quarter-finals.

If both make the quarter-finals, they have a more or less equal chance of winning. “Spain is slightly favored over Germany mainly due to the fact that Germany has a comparatively high chance to drop out in the round-of-sixteen,” say Groll and co.

But there is an additional twist. The random-tree process makes it possible to simulate the entire tournament, and this produces a different result.

Groll and co simulated the entire tournament 100,000 times. “According to the most probable tournament course, instead of the Spanish the German team would win the World Cup,” they say.

Of course, because of the huge number of permutations of games, this course is still extremely unlikely. Groll and co put the odds at about 1 in 100,000.

So there you have it. At the beginning of the tournament, Spain has the best chances of winning, according to Groll and co. But if Germany makes the quarter-finals, it then becomes the front-runner.

The tournament kicks off on Thursday, when the hosts, Russia, take on Saudi Arabia. Sadly, neither of these teams looks likely to make even the quarter-finals.



Re: Machine learning predicts World Cup winner

By: dayan (M) |Time : June 13, 2018, 08:04:25 PM
I like things like this because it provides an opportunity to know how accurate "these machines" can get to predicting human-centric activities.
As long as one never loses sight of the fact that the predictions are derived 100% from past outcomes, then one won't be heartbroken if a totally underrated team (underrated by the machines, of course) ends up winning the cup.
If I were a betting man, this would have provided me with an opportunity to make a killing by betting against the machines!  8)

Re: Machine learning predicts World Cup winner

By: dayan (M) |Time : June 16, 2018, 04:44:59 PM
Today, Iceland held Argentina to a 1:1 draw.
I can almost bet that the "machine learning" would have predicted a victory for Argentina, but lets keep watching this world cup unfold.
Hahaha.  :P

Re: Machine learning predicts World Cup winner

By: dayan (M) |Time : June 18, 2018, 02:44:24 AM
Today, Mexico defeated the defending Champions Germany by 1:0
My guess is that "Machine learning" didn't see this coming.
But to be fair it the machine, it is still early days.  :))

Re: Machine learning predicts World Cup winner

By: dayan (M) |Time : June 27, 2018, 08:54:32 PM
So, Machine Learning couldn't see Germany's exit coming.
That is why machines can never even begin to predict anything futuristic outside of numbers already known.
NOBODY (ie human system) KNOWS THE FUTURE!

Re: Machine learning predicts World Cup winner

By: dayan (M) |Time : July 10, 2018, 09:41:43 PM
Finally, the "machine learning" FAILED to even come close to predict eventual winner of this world cup, because all its predicted possible winners are out already.

Re: Machine learning predicts World Cup winner

By: princepoint |Time : July 10, 2018, 11:48:07 PM
Let watch and see different predictions cat pig now machine Awaiting result see. How it goes France already reach final

0 Members and 1 Guest are viewing this topic. Reply

web site traffic statistics