Data Tells How Napoli Avoids a Capitulation and Keeps Up

Posted on Dec 11, 2021


Data Science Objective

Data science is used here to analyze SSC Napoli, a European football club based in Naples, Italy, that was founded in 1926.


Throughout their history, they have spent the majority of their time playing in the Italian top flight, Serie A. They are the largest club located south of Rome and cater to one of the most passionate fanbases in the country, if not the world. The peak of their success came between 1984 and 1991, when the great Diego Armando Maradona led them to two Serie A titles and a UEFA Cup.

Despite countless top-tier players and a loyal fanbase, the club has failed to reclaim the title glory it has been chasing. Several factors played a part in this failure: an inability to retain key talent long term, accusations of corruption against the club, and, most importantly, a bankruptcy in 2004 that saw them relegated to the third division of Italian soccer.

Economic Difficulties

Post-bankruptcy, the club was bought by film mogul Aurelio de Laurentiis and has undergone a complete overhaul. This period under de Laurentiis’ stewardship has been one of the brightest in the history of the club, with numerous appearances in the Champions League, a batch of elite, world-class talent, and a scouting network that has allowed them to discover undervalued players who can be sold on for a profit.

However, the last two years saw them finish 7th and 5th in the league, causing them to miss out on what could amount to hundreds of millions of dollars in Champions League revenue. With a high wage bill, transfer expenditure records broken, revenues slashed, and COVID crippling European football clubs financially, the 2021/22 season may prove pivotal to the sustained success of the club going forward.

Present Day

To start the 2021/22 season, Napoli got off to one of the best starts in the history of the league. With 31 points from a possible 33 in their first 11 matches, they sat tied for first place with AC Milan. However, as so often in the history of Napoli, when things go well they must come back to Earth. The most important players on the team have suffered injuries, some will be out for the Africa Cup of Nations in January, and lesser-quality players are now being exposed.

Through 16 matches of the Serie A season, Napoli now sit in 3rd position with 36 points. Although their form has deteriorated, they are only 2 points off the top spot in the table, and the gap between 1st and 4th is 4 points. With the next transfer window opening on January 1st and closing on January 31st, can Napoli identify replacements to continue their title push and ensure financial stability going forward?

Current League Standings

Napoli's Trajectory Over the Course of the Season

The line chart below shows the flattening of Napoli's points per game over their previous 5 league matches.

Napoli had a relatively fit squad through the first 11 matches. Over that span, they accumulated an average of 2.82 points per match. Between Matchday 12 and Matchday 16, the club has earned an average of 1 point per match, which is on par with teams facing a relegation battle. The injuries have clearly taken a toll on the Neapolitans.
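As a quick check, the two averages can be reproduced from the point totals quoted in this article (31 points after Matchday 11 and 36 after Matchday 16). This is only a sketch; the constant and function names are illustrative and are not taken from the author's code.

```python
# Cumulative points quoted in the article.
POINTS_AFTER_MD11 = 31
POINTS_AFTER_MD16 = 36


def points_per_match(points: int, matches: int) -> float:
    """Average points earned per match over a span of matches."""
    return points / matches


# Matchday 1-11: all 31 points over 11 matches.
early_ppm = points_per_match(POINTS_AFTER_MD11, 11)

# Matchday 12-16: the 5 points added over the following 5 matches.
late_ppm = points_per_match(POINTS_AFTER_MD16 - POINTS_AFTER_MD11, 16 - 11)

print(f"Matchday 1-11:  {early_ppm:.2f} points per match")
print(f"Matchday 12-16: {late_ppm:.2f} points per match")
```

The second span works out to 5 points from 5 matches, which is where the figure of 1 point per match comes from.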

The Problem - Injuries

The box plot below shows that Napoli have the highest median minutes per match out of any team in Serie A, when filtering for the top 11 players with the most minutes played. This means they are relying on a core group of individuals to play a substantial amount of time.

The sliced data frame below shows Napoli also have the 4th highest average minutes per match from their 11 most played individuals.
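The workload comparison can be sketched roughly as follows: for each team, keep the players with the most total minutes, then summarize their minutes per match with a median and a mean. The rows, team names, and the function `workload_by_team` are hypothetical placeholders rather than the article's dataset, and plain Python is used here instead of whatever data frame library the original analysis relied on.

```python
from statistics import mean, median

# Hypothetical rows: (team, player, total_minutes, matches_played).
# Placeholder values only, not the article's data.
rows = [
    ("Team A", "A1", 1470, 16),
    ("Team A", "A2", 1440, 16),
    ("Team A", "A3", 1200, 16),
    ("Team A", "A4", 300, 10),
    ("Team B", "B1", 1350, 16),
    ("Team B", "B2", 1280, 16),
    ("Team B", "B3", 1120, 16),
]


def workload_by_team(rows, top_n=11):
    """Median and mean minutes per match for each team's top_n most-used players."""
    by_team = {}
    for team, _player, minutes, matches in rows:
        by_team.setdefault(team, []).append((minutes, minutes / matches))

    summary = {}
    for team, players in by_team.items():
        # Tuples sort by total minutes first, so this keeps the most-used players.
        top = sorted(players, reverse=True)[:top_n]
        per_match = [mpm for _total, mpm in top]
        summary[team] = {"median": median(per_match), "mean": mean(per_match)}
    return summary


# top_n=3 only because the placeholder data is tiny; the article uses 11.
summary = workload_by_team(rows, top_n=3)
ranked = sorted(summary, key=lambda team: summary[team]["median"], reverse=True)
for team in ranked:
    print(team, summary[team])
```

Reporting both statistics matters because a team can rank first on the median and lower on the mean when a few of its top players have had shortened appearances.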


Looking at the data for Napoli's 11 most-used players, we see that 7 individuals play an average of 90 minutes per match or more. If we omit Ospina (goalkeepers are unlikely to pick up fatigue-related injuries), that's 6 of 10, or 60%, of Napoli's outfield players averaging 90 minutes or more!


This data is further skewed, as Osimhen was red carded (sent off for an infraction) in the 22nd minute of the opening match and taken out of the match against Inter in the 55th minute due to injury. In addition, Lorenzo Insigne has been subbed off early in matches due to a persistent muscle fatigue issue that must be nursed.

From the data above, we can conclude that Napoli reuse the most players in the following positions: fullback, center back, and central midfield. In order to sustain their title run, or even hold on to a top 4 position for the revenue boost guaranteed by Champions League football, it is essential for the club to improve the squad depth in these areas.

Transfer Targets

Key Criteria

When identifying transfers, we will be sticking to the following key criteria:

First criterion: all targets must currently play in Serie A. As Napoli are in the midst of a title race, it is crucial to bring in players who are familiar with the style of play in Italy and have experience playing against their opponents. This also helps avoid registration restrictions.

Second criterion: players must have at least 630 minutes of game time under their belt in the league this season. This gives us a more accurate look at the data, as a player's true style starts to be reflected in the numbers.
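The two key criteria amount to a simple filter, sketched below. The `Player` class, its field names, and the sample players are hypothetical and exist only to illustrate the rule; the only values taken from the article are the league name and the 630-minute threshold (seven full 90-minute matches).

```python
from dataclasses import dataclass

MIN_MINUTES = 630  # 7 full matches of 90 minutes


@dataclass(frozen=True)
class Player:
    name: str
    league: str
    minutes: int  # league minutes played this season


def meets_key_criteria(player, league="Serie A", min_minutes=MIN_MINUTES):
    """True if the player is in the target league with enough minutes played."""
    return player.league == league and player.minutes >= min_minutes


# Hypothetical candidates, not real scouting data.
candidates = [
    Player("Player A", "Serie A", 900),
    Player("Player B", "Serie A", 450),   # too few minutes
    Player("Player C", "Ligue 1", 1200),  # wrong league
]

eligible = [p.name for p in candidates if meets_key_criteria(p)]
print(eligible)  # ['Player A']
```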

Secondary Criteria

We will also be sticking to the following other less important criteria:

Third criterion: we subscribe to the philosophy that a team's defense is only as good as its worst defender, while its attack can be as strong as its best attacker. For this reason, there is a stronger emphasis on defending metrics than on ball progression when looking at defensive-minded players.

Last criterion: we will restrict the total number of potential signings to 2 players. This is because Napoli missed out on Champions League revenue the previous two seasons, were hurt financially by COVID, and will need to have funds available to sign their current loan player, Zambo Anguissa, on a permanent basis. Since 4 of the top 5 outfield players with the most minutes per match are full backs or center backs, these are the areas we will focus on.

Full Backs

The scatter plot below shows that Mario Rui is performing worse than the league average for defensive duel win %, while facing more defensive duels per 90 minutes than the average fullback. This suggests that, because of his subpar defensive duel win %, opposition teams are identifying him as the weak link and targeting his side in their attack.
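The reading of the scatter plot can be expressed as a quadrant check against the league averages on both axes. The sketch below uses hypothetical fullbacks and made-up numbers; only the logic (duel volume versus duel win %) mirrors the comparison described above.

```python
from statistics import mean

# Hypothetical fullbacks: name -> (defensive duels per 90, defensive duel win %).
fullbacks = {
    "Fullback A": (9.0, 55.0),
    "Fullback B": (6.0, 65.0),
    "Fullback C": (7.0, 62.0),
    "Fullback D": (6.0, 66.0),
}

# Averages across the sample stand in for the league averages.
avg_duels = mean([duels for duels, _win in fullbacks.values()])
avg_win = mean([win for _duels, win in fullbacks.values()])


def quadrant(duels, win):
    """Place a fullback relative to the average on both axes."""
    volume = "high volume" if duels > avg_duels else "low volume"
    quality = "above-average win %" if win >= avg_win else "below-average win %"
    return f"{volume}, {quality}"


for name, (duels, win) in fullbacks.items():
    print(f"{name}: {quadrant(duels, win)}")
```

A player landing in the "high volume, below-average win %" quadrant matches the profile described for Mario Rui: he is challenged often and loses more of those challenges than his peers.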

Below, we will take a deeper dive into how he compares relative to other outside backs of the big 7 clubs in Italy. This will allow us to justify whether his defensive abilities are suitable relative to other outside backs playing for clubs competing for Champions League qualification.


What we discover is damning against Rui. Every player rated below him is either a natural wing back (an outside back who has more of an attacking role; typically deployed in formations with 3 central defenders) or a player named Hysaj, who Napoli let go on a free due to a poor quality of play.

In Napoli's system, it is crucial for defenders to pass the ball successfully. Since they often play out of the back and have full backs support the attack, they must be capable of distributing the ball to players further up the field or in dangerous areas. However, as stated before, a defense is only as good as its weakest link. For that reason, it is essential for Napoli to identify a left back who is a competent passer but, more importantly, has better defensive metrics.

Below, we filter our data frame to find players who are above average in both defensive duel win % and accurate pass %. To be specific, this includes all left backs with a pass completion % greater than or equal to 85% and a defensive duel win % greater than or equal to 60%. This method identifies Rogerio, a young Brazilian left back playing for Sassuolo.
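A minimal sketch of that filter is shown here, using the two thresholds stated above (85% pass completion and 60% defensive duel win rate). The left backs and their numbers are hypothetical placeholders, so the output does not reproduce the article's result.

```python
MIN_PASS_PCT = 85.0       # accurate pass % threshold from the article
MIN_DUEL_WIN_PCT = 60.0   # defensive duel win % threshold from the article

# Hypothetical left backs: (name, accurate pass %, defensive duel win %).
left_backs = [
    ("Left Back A", 87.2, 63.5),
    ("Left Back B", 88.0, 54.1),  # fails the duel threshold
    ("Left Back C", 79.4, 66.0),  # fails the passing threshold
    ("Left Back D", 85.0, 60.0),  # sits exactly on both thresholds
]

shortlist = [
    name
    for name, pass_pct, duel_pct in left_backs
    if pass_pct >= MIN_PASS_PCT and duel_pct >= MIN_DUEL_WIN_PCT
]
print(shortlist)  # ['Left Back A', 'Left Back D']
```

Both comparisons are inclusive, matching the "greater than or equal to" wording of the criteria.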

Sassuolo have an attacking style of play and utilize their fullbacks in a similar way to Napoli. Since he is only 23 years old, this also gives Napoli time to continue his development on a larger stage and flip him for a profit down the road.


Central Defender

When looking at central defenders, we will look at three categories: defensive duels, aerial prowess, and passing. First, we will start with defensive duels.

The data below shows us two things:

First, Koulibaly and Rrahmani have defensive duel win rates of 69.7% and 67.65%, respectively. This is superior to both starting fullbacks for Napoli, which is logical as their role has a larger emphasis on defending.

Second, both defenders are encountering fewer duels per 90 minutes than the fullbacks. This could mean that teams identify Napoli as being stronger defensively in the middle of the park, and are attacking from out wide as a result.

When looking at aerial metrics, Napoli's duo in the center average a win rate of 54.76% against a league average of 55%. When looking to provide depth, it may be beneficial to target a player who is above the league average for aerial duel win %. Since we see teams attacking Napoli from out wide (evident from the defensive duels per 90 at the fullback positions compared to the center back positions), this may leave Napoli exposed to balls crossed into the box at target men.

Lastly, we identified how Napoli like to play the ball out of the back. This means it is important for the central defenders to have a competent passing ability, in order to prevent turnovers in dangerous areas. Napoli excels in this category, with an accurate pass % of 93.38% compared to the league average of 88.9%.

The short list for center backs created below pulls central defenders who have a defensive duel win % on par with Napoli's current pairing, an above-average aerial duel win %, and an above-average accurate pass %.
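The three-part shortlist rule could be sketched as below. The league averages (55% aerial win rate, 88.9% accurate pass rate) and the two Napoli duel win rates come from this article. Treating "on par" as "at least the lower of the two Napoli figures" is an assumption made for illustration, and the candidate center backs are hypothetical.

```python
# Figures quoted in the article.
NAPOLI_DUEL_WIN = {"Koulibaly": 69.7, "Rrahmani": 67.65}
LEAGUE_AVG_AERIAL_WIN = 55.0
LEAGUE_AVG_PASS = 88.9

# Assumption: "on par" means at least the lower of Napoli's two starters.
DUEL_FLOOR = min(NAPOLI_DUEL_WIN.values())


def on_shortlist(cb):
    """Apply the three criteria: duels, aerial ability, and passing."""
    return (
        cb["duel_win"] >= DUEL_FLOOR
        and cb["aerial_win"] > LEAGUE_AVG_AERIAL_WIN
        and cb["pass_pct"] > LEAGUE_AVG_PASS
    )


# Hypothetical center backs, not real scouting data.
center_backs = [
    {"name": "CB A", "duel_win": 70.1, "aerial_win": 61.0, "pass_pct": 90.2},
    {"name": "CB B", "duel_win": 72.0, "aerial_win": 52.0, "pass_pct": 91.0},
    {"name": "CB C", "duel_win": 66.0, "aerial_win": 63.0, "pass_pct": 92.5},
    {"name": "CB D", "duel_win": 68.0, "aerial_win": 58.0, "pass_pct": 86.0},
]

shortlist = [cb["name"] for cb in center_backs if on_shortlist(cb)]
print(shortlist)  # ['CB A']
```

A stricter or looser reading of "on par" only changes `DUEL_FLOOR`; the other two thresholds are fixed by the league averages.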

Since Demiral just arrived at Atalanta on loan from Juventus, we know he is an unrealistic target. This leaves us with Gian Marco Ferrari, an Italian central defender for Sassuolo in the prime of his career at 29 years of age.

Key Findings

From the data presented above, we can come to the following conclusions:

1) Napoli's league form has taken a significant dive, which is reflected in their average points per match from Matchday 1-11 compared to Matchday 12-16. During their previous 5 matches, injuries became a common theme within the club.

2) When looking at every Serie A team's top 11 players with the most minutes played, it is evident that Napoli are relying on a core group of players more than their counterparts. No other club's top 11 players have a minutes per match median above 90 minutes, while Napoli's is at 92.39 minutes. This is further skewed as we know players like Osimhen and Insigne have been removed early from matches involuntarily due to injuries and a red card.

Secondary Findings

4) A common theme across our shortlists is players from Sassuolo. As they employ a similar play style and recruit players with strong metrics in key areas, it would be wise for Napoli to earmark the club as a potential pipeline of talent.

About Author

Kosta Marcopulos

My name is Kosta Marcopulos and I worked as an Assistant Buyer for Ross Stores after graduating with a degree in finance and economics from the University of Alabama. After working in the fashion industry for two years,...