Stats

A simple package for analyzing pklmart data

Now let’s look at some overall statistics we can get from the data.



team_first_serve_win_frac

 team_first_serve_win_frac (team_id:str)

Takes a team id and returns that team’s first serve win fraction.

team_id_test = 'T2'
t = Team(team_id_test)
team_win_frac_test = t.num_games_won/t.num_games_played
print("{} won {:.2f}% of games they played".format(get_team_name(team_id_test), team_win_frac_test*100))
Anna Leigh Waters & Leigh Waters won 66.67% of games they played
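
For illustration, here is a minimal sketch of how a per-team first serve win fraction could be computed from a table of games. The column names (`first_serve_team_id`, `winner_team_id`), the sample rows, and the helper `first_serve_win_frac_for_team` are all hypothetical; they are not the pklmart schema or the package's own implementation.

```python
import pandas as pd

# Hypothetical game table: column names and rows are assumptions for illustration.
games = pd.DataFrame({
    "game_id": ["G1", "G2", "G3", "G4"],
    "first_serve_team_id": ["T1", "T2", "T1", "T3"],
    "winner_team_id": ["T1", "T1", "T2", "T3"],
})

def first_serve_win_frac_for_team(game_df: pd.DataFrame, team_id: str) -> float:
    """Fraction of the games in which `team_id` served first that it also won."""
    served_first = game_df[game_df["first_serve_team_id"] == team_id]
    if served_first.empty:
        return float("nan")  # the team never served first in this sample
    return float((served_first["winner_team_id"] == team_id).mean())

# T1 served first in G1 (won) and G3 (lost), so its fraction is 0.5.
t1_frac = first_serve_win_frac_for_team(games, "T1")
print("T1 first serve win fraction: {:.2f}".format(t1_frac))
```

Returning NaN for a team that never served first keeps "no data" distinct from a genuine 0% win rate.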

Let’s look at the win percentage of the team serving first, marginalized over all games.



get_frac_first_serve_wins

 get_frac_first_serve_wins (game_df:pandas.core.frame.DataFrame)

Returns the fraction of games won by the first server for a given DataFrame of games.



get_first_serve_team

 get_first_serve_team (game_id:str)

Returns the team_id of the team that served first for a given game with game_id.

first_serve_win_frac = get_frac_first_serve_wins(game)
print("The first serving team won {:.2f}% of games".format(first_serve_win_frac*100))
The first serving team won 57.41% of games
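
The overall fraction is the share of games in which the first-serving team is also the winner. The sketch below shows that calculation on a small made-up table. The column names `first_serve_team_id` and `winner_team_id` and the function `frac_first_serve_wins` are assumptions for illustration and may not match how pklmart stores this information.

```python
import pandas as pd

def frac_first_serve_wins(game_df: pd.DataFrame) -> float:
    """Fraction of games won by the team that served first.

    Assumes hypothetical columns `first_serve_team_id` and `winner_team_id`.
    """
    if game_df.empty:
        return float("nan")  # no games, so the fraction is undefined
    # Boolean Series: True where the first server also won the game.
    first_server_won = game_df["first_serve_team_id"] == game_df["winner_team_id"]
    return float(first_server_won.mean())

# Made-up sample: the first server wins 3 of the 5 games.
sample_games = pd.DataFrame({
    "game_id": ["G1", "G2", "G3", "G4", "G5"],
    "first_serve_team_id": ["T1", "T2", "T3", "T1", "T2"],
    "winner_team_id": ["T1", "T2", "T3", "T2", "T1"],
})

overall_frac = frac_first_serve_wins(sample_games)
print("The first serving team won {:.2f}% of games".format(overall_frac * 100))
```

Taking the mean of a boolean column counts `True` as 1 and `False` as 0, which gives the fraction directly.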

Let’s see how Jessie Irvine (listed as "Jesse Irvine" in the data) compares to the average.



get_teams_from_player

 get_teams_from_player (player_id:str)

Returns the team_ids of the teams that a player with player_id played for.

player_name_test = "Jesse Irvine"
player_id_test = get_player_id(player_name_test)
player_id_test
'P1'
# Find the teams that Jesse Irvine played for
team_ids_test = get_teams_from_player(player_id_test)
for team_id in team_ids_test:
    print(get_team_name(team_id))
Jesse Irvine & Catherine Parenteau
Jesse Irvine & Anna Bright
Jesse Irvine & Lucy Kovalova
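
As a rough sketch of the lookup above, a player's teams can be found by filtering a team table on either player slot. The table layout (`team_id`, `player_id_1`, `player_id_2`), the sample IDs, and the helper `teams_for_player` are hypothetical and only meant to show the idea, not the package's actual code.

```python
import pandas as pd

# Hypothetical doubles team table: each team has two player slots.
teams = pd.DataFrame({
    "team_id": ["T1", "T2", "T3"],
    "player_id_1": ["P1", "P2", "P3"],
    "player_id_2": ["P4", "P1", "P5"],
})

def teams_for_player(team_df: pd.DataFrame, player_id: str) -> list:
    """Return the team_ids of every team that lists `player_id` in either slot."""
    on_team = (team_df["player_id_1"] == player_id) | (team_df["player_id_2"] == player_id)
    return team_df.loc[on_team, "team_id"].tolist()

p1_teams = teams_for_player(teams, "P1")
print(p1_teams)
```

Checking both slots matters because the same player may be listed first on one team and second on another.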


team_win_frac

 team_win_frac (team_id:str)

Returns the fraction of games won by a team with team_id.



games_played_by_team

 games_played_by_team (team_id:str)

Returns the number of games played by a team with team_id.

# Total number of games played by Jesse Irvine on any team
net_games_played = sum(games_played_by_team(team_id) for team_id in team_ids_test)
# Games-weighted average first serve win fraction for Jesse Irvine
avg_first_serve_win_frac_test = sum(team_first_serve_win_frac(team_id) * games_played_by_team(team_id) for team_id in team_ids_test) / net_games_played
# Games-weighted average overall win fraction for Jesse Irvine
avg_tot_win_frac_test = sum(team_win_frac(team_id) * games_played_by_team(team_id) for team_id in team_ids_test) / net_games_played
print("{}'s average first serve win percentage is {:.2f}%".format(player_name_test, avg_first_serve_win_frac_test*100))
print("{}'s average overall win percentage is {:.2f}%".format(player_name_test, avg_tot_win_frac_test*100))
Jesse Irvine's average first serve win percentage is 33.33%
Jesse Irvine's average overall win percentage is 66.67%
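
The player-level numbers above are games-weighted averages of per-team fractions. The sketch below works through that arithmetic with made-up values, so the effect of the weighting is visible. The helper `weighted_average` and the sample numbers are hypothetical and not taken from the pklmart data.

```python
def weighted_average(values, weights):
    """Weighted mean: sum(value * weight) / sum(weight)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must sum to a non-zero number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# Made-up per-team numbers for one player (not real data).
team_win_fracs = [0.5, 1.0]   # win fraction on each team
team_games_played = [6, 2]    # games played with each team

weighted = weighted_average(team_win_fracs, team_games_played)
unweighted = sum(team_win_fracs) / len(team_win_fracs)

# (0.5 * 6 + 1.0 * 2) / 8 = 0.625, versus a plain mean of 0.75
print("Weighted: {:.3f}, unweighted: {:.3f}".format(weighted, unweighted))
```

Weighting by games played stops a team with only a few games from dominating the average. For the first serve fraction, weighting by total games played matches the pooled fraction only if each team served first in the same share of its games; weighting by the number of games each team served first would reproduce the pooled value exactly.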
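import matplotlib.pyplot as plt
import seaborn as sns

# `colors` is assumed to be a palette defined earlier in the notebook
sns.barplot(x=['All first', player_name_test + " first", player_name_test + " overall"], y=[first_serve_win_frac, avg_first_serve_win_frac_test, avg_tot_win_frac_test], palette=colors)
plt.title("First Serve Advantage?")
plt.ylabel("Win Fraction")  # plotted values are fractions between 0 and 1, not percentages
# plt.savefig('figures/first_serve_win_percentage.pdf')
plt.show()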