projections
-
class
pybbda.analysis.projections.
MarcelProjectionsBatting
(stats_df=None, primary_pos_df=None)
-
COMPUTED_METRICS
= ['1B', '2B', '3B', 'HR', 'BB', 'HBP', 'SB', 'CS', 'SO', 'SH', 'SF']
-
LEAGUE_AVG_PT
= 100
-
METRIC_WEIGHTS
= (5, 4, 3)
-
NUM_REGRESSION_PLAYING_TIME
= 200
-
PLAYING_TIME_COLUMN
= 'PA'
-
PT_WEIGHTS
= (0.5, 0.1, 0)
-
RECIPROCAL_AGE_METRICS
= ['SO', 'CS']
-
REQUIRED_COLUMNS
= ['AB', 'BB']
-
compute_playing_time_projection
(metric_values, pt_values, metric_weights, pt_weights, seasonal_averages, num_regression_pt)
Computes the playing-time projection. metric_values, metric_weights, and seasonal_averages are not used; they are included for consistency with compute_rate_projection.
- Parameters
metric_values – not used
pt_values – playing time values
metric_weights – not used
pt_weights – playing time weights
seasonal_averages – not used
num_regression_pt – number of playing-time units to use for regression
- Returns
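A minimal sketch of a Marcel-style playing-time projection, written in plain Python rather than with arrays. It assumes the projection is the weighted sum of past playing time plus num_regression_pt, and that values are ordered most recent season first. Both points are assumptions based on the usual Marcel rule, not taken from the pybbda source, and the function name playing_time_projection is hypothetical.

```python
def playing_time_projection(pt_values, pt_weights, num_regression_pt):
    """Weighted sum of past playing time plus a fixed regression amount.

    Assumption: pt_values and pt_weights are ordered most recent season
    first, and the regression amount is simply added to the weighted sum.
    """
    if len(pt_values) != len(pt_weights):
        raise ValueError("pt_values and pt_weights must have the same length")
    weighted = sum(w * pt for w, pt in zip(pt_weights, pt_values))
    return weighted + num_regression_pt


# Plate appearances for the three seasons before the projected one,
# combined with the batting defaults PT_WEIGHTS = (0.5, 0.1, 0) and
# NUM_REGRESSION_PLAYING_TIME = 200 listed above.
projected_pa = playing_time_projection((600, 550, 500), (0.5, 0.1, 0), 200)
print(round(projected_pa, 1))
```

With a zero weight on the third season, only the two most recent seasons contribute to the result.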
-
compute_rate_projection
(metric_values, pt_values, metric_weights, pt_weights, seasonal_averages, num_regression_pt)
Computes the rate projection. Each array of values must have the same length as its array of weights. pt_weights is not used; it is included for consistency with compute_playing_time_projection.
- Parameters
metric_values – float array
pt_values – float array
metric_weights – float array
pt_weights – not used
seasonal_averages – float array
num_regression_pt – float
- Returns
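A minimal sketch of a Marcel-style rate projection for a single player, in plain Python. It assumes the textbook form: the player's weighted totals are combined with num_regression_pt units of league-average performance. The function name rate_projection, the way the league rate is weighted, and the sample numbers are all illustrative assumptions and may differ from what pybbda does internally.

```python
def rate_projection(metric_values, pt_values, metric_weights,
                    seasonal_averages, num_regression_pt):
    """Regress a player's weighted rate toward the league rate.

    All sequences hold one entry per past season, in the same order.
    seasonal_averages are league rates (metric per unit of playing time).
    Assumes num_regression_pt > 0.
    """
    n = len(metric_values)
    if not (len(pt_values) == len(metric_weights) == len(seasonal_averages) == n):
        raise ValueError("all per-season inputs must have the same length")

    weighted_metric = sum(w * x for w, x in zip(metric_weights, metric_values))
    weighted_pt = sum(w * pt for w, pt in zip(metric_weights, pt_values))

    if weighted_pt > 0:
        # League rate weighted the same way as the player's own record.
        league_rate = sum(
            w * pt * avg
            for w, pt, avg in zip(metric_weights, pt_values, seasonal_averages)
        ) / weighted_pt
    else:
        # No playing time on record: use the plain mean of the league rates.
        league_rate = sum(seasonal_averages) / n

    return (weighted_metric + num_regression_pt * league_rate) / (
        weighted_pt + num_regression_pt
    )


# Home runs and plate appearances over three seasons, METRIC_WEIGHTS = (5, 4, 3),
# a flat illustrative league rate of 0.03 HR per PA, and an illustrative
# regression amount of 1200 PA.
hr_rate = rate_projection((30, 25, 20), (600, 550, 500), (5, 4, 3),
                          (0.03, 0.03, 0.03), 1200)
print(round(hr_rate, 4))
```

The larger num_regression_pt is relative to the player's weighted playing time, the closer the result sits to the league rate.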
-
filter_non_representative_data
(stats_df, primary_pos_df)
Filters out pitchers who appear as batters. primary_pos_df is a data frame containing the columns playerID, yearID, and primaryPos.
- Parameters
stats_df – a data frame like Lahman batting
primary_pos_df – data frame
- Returns
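The filtering idea can be sketched without a data frame library, using lists of dictionaries as stand-ins for rows. The function name drop_pitchers, the position code "P", and the player IDs are hypothetical; the sketch assumes a player-season is dropped when its primaryPos marks a pitcher.

```python
def drop_pitchers(batting_rows, primary_pos_rows, pitcher_code="P"):
    """Keep batting rows whose (playerID, yearID) is not a pitcher season.

    pitcher_code is an assumed label for pitchers in primaryPos.
    """
    pitcher_seasons = {
        (row["playerID"], row["yearID"])
        for row in primary_pos_rows
        if row["primaryPos"] == pitcher_code
    }
    return [
        row for row in batting_rows
        if (row["playerID"], row["yearID"]) not in pitcher_seasons
    ]


# Made-up rows for illustration only.
batting = [
    {"playerID": "batter01", "yearID": 2019, "AB": 550, "BB": 60},
    {"playerID": "pitcher01", "yearID": 2019, "AB": 40, "BB": 1},
]
primary_pos = [
    {"playerID": "batter01", "yearID": 2019, "primaryPos": "1B"},
    {"playerID": "pitcher01", "yearID": 2019, "primaryPos": "P"},
]
kept = drop_pitchers(batting, primary_pos)
print([row["playerID"] for row in kept])
```

Matching on both playerID and yearID means a player who changes primary position between seasons is filtered season by season.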
-
get_num_regression_pt
(stats_df)
- Parameters
stats_df – data frame
- Returns
float
-
metric_projection
(metric_name, projected_season)
Returns the projection for metric_name.
- Parameters
metric_name – str
projected_season – int
- Returns
data frame
-
metric_projection_detail
(metric_name, projected_season)
Returns the projection result for metric_name, with the detailed components reported separately. The details are intended primarily for debugging.
- Parameters
metric_name – str
projected_season – int
- Returns
data frame
-
preprocess_data
(stats_df)
Preprocesses the data.
- Parameters
stats_df – a data frame like Lahman batting
- Returns
data frame
-
projections
(projected_season, computed_metrics=None)
Returns projections for all metrics in computed_metrics. If computed_metrics is None, the default set (COMPUTED_METRICS) is used.
- Parameters
projected_season – int
computed_metrics – list(str)
- Returns
data frame
-
seasonal_average
(stats_df, metric_name, playing_time_column)
Computes the seasonal average rate of metric_name.
- Parameters
stats_df – data frame
metric_name – str
playing_time_column – str
- Returns
data frame
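A minimal sketch of a per-season league rate, with plain dictionaries standing in for data frame rows. It assumes the rate is the season's total of the metric divided by the season's total playing time; whether pybbda aggregates this way is an assumption. The function name seasonal_rate and the season_column parameter are hypothetical.

```python
from collections import defaultdict


def seasonal_rate(rows, metric_name, playing_time_column, season_column="yearID"):
    """Return {season: total metric / total playing time} for each season."""
    metric_totals = defaultdict(float)
    pt_totals = defaultdict(float)
    for row in rows:
        season = row[season_column]
        metric_totals[season] += row[metric_name]
        pt_totals[season] += row[playing_time_column]
    # Seasons with no recorded playing time are skipped to avoid dividing by zero.
    return {
        season: metric_totals[season] / pt_totals[season]
        for season in pt_totals
        if pt_totals[season] > 0
    }


# Made-up rows for illustration only.
rows = [
    {"yearID": 2019, "HR": 30, "PA": 600},
    {"yearID": 2019, "HR": 10, "PA": 400},
    {"yearID": 2020, "HR": 5, "PA": 250},
]
rates = seasonal_rate(rows, "HR", "PA")
print(rates)
```

Dividing totals, rather than averaging each player's own rate, weights every player by playing time.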
-
validate_data
(stats_df)
-
-
class
pybbda.analysis.projections.
MarcelProjectionsPitching
(stats_df=None, primary_pos_df=None)
-
COMPUTED_METRICS
= ['H', 'HR', 'ER', 'BB', 'SO', 'HBP', 'R']
-
LEAGUE_AVG_PT
= 134
-
METRIC_WEIGHTS
= (3, 2, 1)
-
NUM_REGRESSION_PLAYING_TIME
= None
-
PLAYING_TIME_COLUMN
= 'IPouts'
-
PT_WEIGHTS
= (0.5, 0.1, 0)
-
RECIPROCAL_AGE_METRICS
= ['H', 'HR', 'ER', 'BB', 'HBP', 'R']
-
REQUIRED_COLUMNS
= ['IPouts']
-
compute_playing_time_projection
(metric_values, pt_values, metric_weights, pt_weights, seasonal_averages, num_regression_pt)
Computes the playing-time projection. metric_values, metric_weights, and seasonal_averages are not used; they are included for consistency with compute_rate_projection.
- Parameters
metric_values – not used
pt_values – playing time values
metric_weights – not used
pt_weights – playing time weights
seasonal_averages – not used
num_regression_pt – number of playing-time units to use for regression
- Returns
-
compute_rate_projection
(metric_values, pt_values, metric_weights, pt_weights, seasonal_averages, num_regression_pt)
Computes the rate projection. Each array of values must have the same length as its array of weights. pt_weights is not used; it is included for consistency with compute_playing_time_projection.
- Parameters
metric_values – float array
pt_values – float array
metric_weights – float array
pt_weights – not used
seasonal_averages – float array
num_regression_pt – float
- Returns
-
filter_non_representative_data
(stats_df, primary_pos_df)
Filters out position players who appear as pitchers. primary_pos_df is a data frame containing the columns playerID, yearID, and primaryPos.
- Parameters
stats_df – data frame like Lahman pitching
primary_pos_df – data frame
- Returns
data frame
-
get_num_regression_pt
(stats_df)
Gets the number of batters faced to use for the regression component, computed as a function of the fraction of games pitched as a starter.
- Parameters
stats_df – data frame like Lahman pitching
- Returns
numpy array
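One way such a function could look is a linear blend between a reliever amount and a starter amount, driven by the fraction of games started. This is a sketch under stated assumptions: the linear form, the function name regression_playing_time, and the endpoint values in the example are hypothetical and are not taken from the pybbda source. G and GS refer to the games and games-started columns of a Lahman-style pitching table.

```python
def regression_playing_time(games, games_started, reliever_pt, starter_pt):
    """Blend reliever and starter regression amounts by starter fraction.

    Assumption: a pitcher with no games on record is treated as a reliever.
    """
    if games_started > games:
        raise ValueError("games_started cannot exceed games")
    fraction_started = games_started / games if games > 0 else 0.0
    return reliever_pt + (starter_pt - reliever_pt) * fraction_started


# Illustrative endpoints, expressed in outs: 75 for a pure reliever and
# 180 for a pure starter. These numbers are placeholders, not library values.
swingman = regression_playing_time(30, 15, reliever_pt=75, starter_pt=180)
print(swingman)
```

A pitcher who starts half of his games lands halfway between the two endpoints.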
-
metric_projection
(metric_name, projected_season)
Returns the projection for metric_name.
- Parameters
metric_name – str
projected_season – int
- Returns
data frame
-
metric_projection_detail
(metric_name, projected_season)
Returns the projection result for metric_name, with the detailed components reported separately. The details are intended primarily for debugging.
- Parameters
metric_name – str
projected_season – int
- Returns
data frame
-
preprocess_data
(stats_df)
Preprocesses the data.
- Parameters
stats_df – data frame like Lahman pitching
- Returns
data frame
-
projections
(projected_season, computed_metrics=None)
Returns projections for all metrics in computed_metrics. If computed_metrics is None, the default set (COMPUTED_METRICS) is used.
- Parameters
projected_season – int
computed_metrics – list(str)
- Returns
data frame
-
seasonal_average
(stats_df, metric_name, playing_time_column)
Computes the seasonal average rate of metric_name.
- Parameters
stats_df – data frame
metric_name – str
playing_time_column – str
- Returns
data frame
-
validate_data
(stats_df)
-