Every year there is talk of how a team near the top of the table only won because they beat the teams around them or failed to prosper because they couldn’t get enough points off the teams below them (Liverpool, in particular, being a case in point the this season)
Let’s have a look at that for the history of the Premier League as far as the top teams go. This year, there have been, realistically, six teams vying for the top four places which guarantees entry into the following year’s, exclusive, Champions League. I will, thus, concentrate on how the ‘Top Sixes’ have fared against each other over the 25 years of the Premier League
Let’s load the libraries and data required. The latter has been personally accumulated over many years but similar data can also be obtained from James Curley’s engsoccerdata package
library(DT)
library(knitr)
library(plotly)
library(crosstalk)
library(htmltools)
library(tidyverse)
standings <- readRDS("data/standings.rds")
glimpse(standings)
## Observations: 19,492
## Variables: 20
## $ season <chr> "2004/05", "2004/05", "2004/05", "2004/05", "200...
## $ final_Pos <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
## $ team <chr> "Chelsea", "Chelsea", "Chelsea", "Chelsea", "Che...
## $ GF <int> 1, 3, 1, 2, 3, 0, 1, 3, 4, 1, 3, 1, 0, 1, 3, 2, ...
## $ GA <int> 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, ...
## $ gameDate <date> 2005-05-15, 2005-05-10, 2005-05-07, 2005-04-30,...
## $ tmGameOrder <int> 506, 505, 504, 503, 502, 501, 500, 499, 498, 497...
## $ tmYrGameOrder <int> 38, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, ...
## $ venue <chr> "A", "A", "H", "A", "H", "H", "H", "A", "H", "H"...
## $ MATCHID <int> 5285, 5277, 5265, 5253, 5243, 5235, 5217, 5212, ...
## $ OppTeam <chr> "Newcastle U", "Man. Utd.", "Charlton", "Bolton"...
## $ points <dbl> 1, 3, 3, 3, 3, 1, 1, 3, 3, 3, 3, 3, 1, 3, 3, 3, ...
## $ cumGF <int> 72, 71, 68, 67, 65, 62, 62, 61, 58, 54, 53, 50, ...
## $ cumGA <int> 15, 14, 13, 13, 13, 12, 12, 11, 10, 9, 9, 8, 8, ...
## $ cumPts <dbl> 95, 94, 91, 88, 85, 82, 81, 80, 77, 74, 71, 68, ...
## $ cumGD <int> 57, 57, 55, 54, 52, 50, 50, 50, 48, 45, 44, 42, ...
## $ allGames <dbl> 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, ...
## $ position <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
## $ res <chr> "Draw", "Win", "Win", "Win", "Win", "Draw", "Dra...
## $ tt <chr> "<table cellpadding='4' style='line-height:1'><t...
The ‘standings’ data.frame has a row for each team for each round so, firstly, I need to construct a subset restricted to end-of-season data for the top six each year, retaining just the team, season, final position and points (3 for win, 1 for tie/draw) fields
topSixes <- standings %>%
group_by(season) %>%
filter(tmYrGameOrder==max(tmYrGameOrder)&final_Pos<=6) %>%
select(team,season,final_Pos,cumPts)
# test for specific season
(topSixes %>%
filter(season=="2015/16"))
## # A tibble: 6 x 4
## # Groups: season [1]
## team season final_Pos cumPts
## <chr> <chr> <int> <dbl>
## 1 Leicester C 2015/16 1 81
## 2 Arsenal 2015/16 2 71
## 3 Tottenham H 2015/16 3 70
## 4 Man. City 2015/16 4 66
## 5 Man. Utd. 2015/16 5 66
## 6 Southampton 2015/16 6 63
As you may recall, last year Leicester convincingly won the title and Arsenal did their perennial - up until then at least - eclipsing of local rivals, Spurs
We now want to construct a minitable for the encounters between these clubs. It usually pays to look at one season first and then extend to all years, using the mapping functions from the purrr package (part of the tidyverse package loaded above)
# top teams for 2016/17
oppsOneYear <- topSixes %>%
filter(season=="2016/17") %>%
.$team
#function to create results for club
make_OneYeartable <- function(x) {
temp <- standings %>%
filter(season=="2016/17"&team==x&OppTeam %in% oppsOneYear) %>%
summarize(totPoints=sum(points),totGF=sum(GF),totGA=sum(GA),totGD=totGF-totGA)
cbind(team=x,temp)
}
## apply the function to each of the six teams. Use map_df as we want a data.frame as outcome
dataOneYear <-map_df(oppsOneYear, make_OneYeartable)
dataOneYear
## team totPoints totGF totGA totGD
## 1 Chelsea 16 16 14 2
## 2 Tottenham H 15 13 10 3
## 3 Man. City 10 11 15 -4
## 4 Liverpool 20 16 9 7
## 5 Arsenal 9 15 18 -3
## 6 Man. Utd. 10 7 12 -5
Arsenal’s relative failure against the top clubs cost them a place in the Champions League as they finished just one point shy of fourth place
OK, now we can look at applying the process to all seasons. This requires use of the map2_df() function
make_table <- function(x,y) {
# need to create a separate top 6 for each season
opps <- topSixes %>%
filter(season==x) %>%
.$team
temp <- standings %>%
filter(season==x&team==y&OppTeam %in% opps) %>% #10 games
summarize(totPoints=sum(points),totGF=sum(GF),totGA=sum(GA),totGD=totGF-totGA)
# add both season and team to the data
cbind(season=x,team=y,temp)
}
# use map2 is specialised to iterate over two arguments
data <-map2_df(topSixes$season,topSixes$team, make_table)
# add back in the relevant full table data (still needs bit of work)
data <- data %>%
left_join(topSixes) %>%
# create minitable position
arrange(desc(totPoints),desc(totGD),desc(totGF),team) %>%
group_by(season) %>%
mutate(pos=row_number()) %>%
ungroup() %>%
mutate(other_pts=cumPts-totPoints) %>%
select(season,team,pos,points=totPoints,GD=totGD,other_pts,all_pos=final_Pos,all_pts=cumPts) %>%
arrange(season,pos)
head(data)
## # A tibble: 6 x 8
## season team pos points GD other_pts all_pos all_pts
## <chr> <chr> <int> <dbl> <int> <dbl> <int> <dbl>
## 1 1992/93 Man. Utd. 1 19 7 65 1 84
## 2 1992/93 Blackburn 2 18 13 53 4 71
## 3 1992/93 Aston Villa 3 14 0 60 2 74
## 4 1992/93 Liverpool 4 13 -2 46 6 59
## 5 1992/93 Norwich C 5 13 -10 59 3 72
## 6 1992/93 QPR 6 7 -8 56 5 63
In 1992/93, Year One of the Premier League, Blackburn and Liverpool (sound familiar) suffered by not taking advantage of lesser teams
One piece of trivia for this season is that Manchester United have only managed one goal away from home against others in the top six and that was a consolation goal in the last of them, against Spurs
teams <- topSixes %>%
filter(season=="2016/17") %>%
.$team
# mini1 <- standings %>%
# filter(team=="Man. Utd." & season=="2016/17" & OppTeam %in% teams) %>%
# arrange(gameDate) %>%
# mutate(result=paste0(GF,"-",GA)) %>%
# select(Opponent=OppTeam,date=gameDate,venue,result) %>%
# DT::datatable(width=400,class='compact stripe hover row-border order-column',rownames=FALSE,options= list(paging = FALSE, searching = FALSE,info=FALSE))
#
#
# ## not working as in other files?? - though is in system
# #htmlwidgets::saveWidget(mini1, file = "mini1.html", selfcontained = TRUE)
standings %>%
filter(team=="Man. Utd." & season=="2016/17" & OppTeam %in% teams) %>%
arrange(gameDate) %>%
mutate(result=paste0(GF,"-",GA)) %>%
select(Opponent=OppTeam,date=gameDate,venue,result) %>%
kable()
Opponent | date | venue | result |
---|---|---|---|
Man. City | 2016-09-10 | H | 1-2 |
Liverpool | 2016-10-17 | A | 0-0 |
Chelsea | 2016-10-23 | A | 0-4 |
Arsenal | 2016-11-19 | H | 1-1 |
Tottenham H | 2016-12-11 | H | 1-0 |
Liverpool | 2017-01-15 | H | 1-1 |
Chelsea | 2017-04-16 | H | 2-0 |
Man. City | 2017-04-27 | A | 0-0 |
Arsenal | 2017-05-07 | A | 0-2 |
Tottenham H | 2017-05-14 | A | 1-2 |
The chart below shows results for each year. Hover points for team and use the filter created using the crosstalk package to identify just one team The points are jittered to account for situation where two teams have same number of points
sd <- SharedData$new(data)
fs <- filter_select(
id = "team",
label = "Select Team",
sharedData = sd,
group = ~ team,
allLevels = FALSE,
multiple = FALSE
)
## this is needed as crosstalk does not work nicely with bootstrap, apparently
fs_nobootstrap <- fs
attr(fs_nobootstrap, "html_dependencies") <- Filter(
function(dep) {dep$name != "bootstrap"},
attr(fs_nobootstrap, "html_dependencies")
)
myChart <- sd %>%
plot_ly(x=~jitter(points),y=~season,color=~as.factor(pos), height=700) %>%
add_markers(hoverinfo="text",
text=~paste0(team,
"<br>Pos: ",pos,
"<br>Pts: ",points)) %>%
layout(title="Performance against other top 6 teams by season",
xaxis=list(title="Points Accrued"),
yaxis=list(title="")) %>%
config(displayModeBar = F,showLink = F)
## combine the selector and chart
tagList(
fs_nobootstrap,
br(),
myChart
)
Just to reinforce how disappointing this season has been - at least as far as the league is concerned - Manchester United’s points total of 10 is the joint worst they have ever registered in the PL with 2001/2. That year, they actually lost 6 of 10 againts their closest rivals. In that season, 36 yr old Laurent Blanc was a mainstay of their defence and Juan Veron patrolled the midfield
Share this post
Twitter
Google+
Facebook
Reddit
LinkedIn
StumbleUpon
Email