Can artificial intelligence and other types of data analytics pick a Final Four in this year’s NCAA basketball tournament? Given all the computing power and the wealth of data available these days, picking winners should be a no-brainer. Add in the power of artificial intelligence and the results should be a foregone conclusion.
Hold your dribble.
When it comes to basketball, data analytics may be as arcane as astrology at telling the future on the hardcourt. St. John’s University coach Rick Pitino was totally mystified when his team was left out of the tournament. The NCAA’s own data analytics system, called NET, had ranked St. John’s at number 32, which seemed to indicate that the Red Storm’s inclusion in the 68-team event was an easy layup. In the end, St. John’s wasn’t even listed as a bubble team.
“I think we all should probably never mention that word (NET) again because it’s fraudulent,” Pitino told the sporting press. “I think the NET is something that shouldn’t even be mentioned anymore.” Pitino also named Big East rival Seton Hall University as a fellow victim.
Another snubbed team was Providence. “I think the analytics are bull—,” said coach Kim English bluntly. Even teams that made the cut were bent out of shape. Coach Greg McDermott of number-three-seeded Creighton noted that the Big East was “the second-best league in the country and we have three teams in the top three seed lines” while other conferences have more teams represented. The SEC and Big 12 each have eight teams in the tournament.
The coaches’ remarks might be colored by some bitterness, but they are not alone in their opinion. Sheldon H. Jacobson, a computer science professor at the University of Illinois at Urbana-Champaign, also believes the NCAA selection process is broken. He cites the number of head-scratching upsets in 2023, which may indicate that team seedings or rankings in the tournament do not accurately reflect team performance. Jacobson notes that in 2023, no team seeded 1, 2, or 3 reached the Final Four, the first time that’s happened since 1986. The problem may be that the NET system weighs all games played during the season equally, thereby leading to an incorrect seeding. Early season games rarely provide a good assessment of how well a team is playing in March. Seeding Purdue at number one in 2023 was not an accurate reflection of the team’s performance at the end of the season, and Purdue was eliminated in the first round.
One underrated factor may be the rankings of mid-major teams. “Mid-majors are underrated by the selection committee,” said Jacobson in an email. “Power conference teams gain nothing to play them early in the season and have everything to lose. When forced to play them, the results can be unexpected.”
Hindsight is one thing but what’s a hoops fan to do in 2024 when the NCAA’s own data analytics doesn’t appear to offer any reliable guidelines to bracket picks? One key indicator may be how well a team is playing in the games running up to the tournament. A “bracket buster” may be a team that has been inconsistent over the course of the season, says Jacobson. Meanwhile, a team’s record over the past seven games may be a key indicator of how well the team is playing as the tournament begins.
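To see why the distinction matters, consider a rough sketch in Python. It is not the NET formula, and it is not a method attributed to Jacobson; the two teams, the 30-game schedule and the `decay` value are all made up for illustration. It simply compares a winning percentage that counts every game equally with two measures that lean on recent play: the record over the last seven games and a recency-weighted winning percentage.

```python
def win_pct(results):
    """Equal-weight winning percentage.

    results is a list of 1 (win) or 0 (loss), in the order the games were played.
    """
    return sum(results) / len(results)


def recent_win_pct(results, last_n=7):
    """Winning percentage over only the last `last_n` games."""
    recent = results[-last_n:]
    return sum(recent) / len(recent)


def recency_weighted_win_pct(results, decay=0.9):
    """Winning percentage in which older games count for less.

    The most recent game has weight 1; each earlier game's weight is
    multiplied by `decay` (a hypothetical tuning value, not an NCAA figure).
    """
    n = len(results)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    weighted_wins = sum(w * r for w, r in zip(weights, results))
    return weighted_wins / sum(weights)


# Two hypothetical teams with identical 20-10 records (invented data).
fast_starter = [1] * 20 + [0] * 10     # won early, slumped late
strong_finisher = [0] * 10 + [1] * 20  # slumped early, won late

for name, games in [("fast starter", fast_starter),
                    ("strong finisher", strong_finisher)]:
    print(name,
          round(win_pct(games), 3),
          round(recent_win_pct(games), 3),
          round(recency_weighted_win_pct(games), 3))
```

Counted equally, the two teams look identical. Once late-season games carry more weight, the strong finisher pulls well ahead, which is the kind of gap an equal-weight ranking would hide.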
“There is a lot of debate in the academic world about players or teams with a ‘hot hand,’” says Jackie Silverman-Lerner, assistant professor of marketing at the University of Delaware, via email. “It might not last all the way to the Final Four but a team that was ‘hot’ over the last couple of weeks of the season may be able to carry that momentum into the first few rounds of the tournament better than, say, a team with the same record but fewer concentrated wins at the end of the season.”
“Bracketologists” also are turning to artificial intelligence for help, amassing data on a team’s past performance, including box score information such as free-throw percentage, turnovers and assists, to develop what they hope are predictive algorithms. One challenge is that AI relies on extremely large datasets, while the information available on NCAA tournament teams is a relatively small sample from an AI perspective.
What artificial intelligence and other forms of data analytics can’t account for is what Silverman calls “noise,” luck and other idiosyncratic factors that can’t be quantified by even the best models. “Unexpected injuries, boosts in confidence and lucky plays can all impact a game’s outcome,” she says. “Of course, when a frontrunner experiences these boons they simply win by more points but when an underdog team gets lucky, it can completely change the game.” Or your brackets.
As the tournament begins, Houston, Purdue, North Carolina and UConn (University of Connecticut) are ranked as number one seeds in their assigned regions. So how likely are all these teams to make it to the Final Four?
“The expected rarely occurs,” notes Jacobson. “Games are played on the court, not on a computer.”