reports module

reports.annual_charts(ds_dict, adict, cdict, plot_year_group=True, plot_job_group=True, quantiles=10, plot_init_quarter=True, plot_running_quarter=True, pcnt_ylim=0.75, cpay_stride=500, fixed_col_name='eg_initQ', running_col_name='eg_runQ', figsize=None, date_grouper='ldate', chartstyle='ticks', verbose_status=True, tick_size=13, legend_size=14, label_size=14, title_size=14, adjust_chart_top=0.85)

Generates multiple charts representing general annual attribute statistics of all calculated datasets for all employee groups FOR ALL ACTIVE EMPLOYEES (annual results for all employees).

The user may select grouping analysis by any or all of the following:

  1. longevity or date of hire year

  2. job level

  3. initial employee group list quantile membership

  4. annual employee group list quantile membership

Stores the output as images in multiple folders within the reports/<case_name>/ann_charts folder.
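The difference between initial and annual (running) quantile membership, options 3 and 4 above, can be illustrated with a minimal sketch. The `quantile_numbers` helper, the position-based binning rule, and the sample list are assumptions made for illustration; the reports module may compute quantile membership differently.

```python
def quantile_numbers(ordered_ids, quantiles=10):
    """Map each id to a 1-based quantile bin based on its list position.

    Hypothetical helper: equal-width bins by position is an assumption.
    """
    n = len(ordered_ids)
    return {emp: (pos * quantiles) // n + 1
            for pos, emp in enumerate(ordered_ids)}


# Hypothetical ordered list for one employee group.
initial_list = ["a", "b", "c", "d", "e", "f", "g", "h"]
retired = {"a", "b", "c", "d"}

# Initial membership is assigned once and never changes.
init_q = quantile_numbers(initial_list, quantiles=4)

# After some employees leave, the fixed labels are simply carried forward,
# while the running labels are recomputed on the remaining employees.
remaining = [emp for emp in initial_list if emp not in retired]
fixed_q = {emp: init_q[emp] for emp in remaining}
running_q = quantile_numbers(remaining, quantiles=4)
```

In this sketch the remaining employees keep their original bins (3 and 4) under the fixed scheme, but are spread across bins 1 through 4 under the running scheme.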

inputs
ds_dict (dictionary)

output of load_datasets function, a dictionary of datasets

adict (dictionary)

dataset column name description dictionary

cdict (dictionary)

program colors dictionary

plot_year_group (boolean)

if True, create chart images grouped by the date_grouper input year

date_grouper (string)

column name representing a column of dates within a dataframe. Year membership of this column will be used for grouping. Input is limited to ‘ldate’ or ‘doh’.

plot_job_group (boolean)

if True, create chart images grouped by job level held by employees

quantiles (integer)

the number of binning quantiles to measure for the initial and running (annually updated) quantile membership analysis (default is 10)

plot_init_quarter (boolean)

if True, produce output grouped by initial list quantile membership, for each employee group

plot_running_quarter (boolean)

if True, produce output grouped by annual list quantile membership, for each employee group

pcnt_ylim (float)

maximum y axis value for percentage attribute charts, expressed as a float. Example: .75 sets the maximum displayed chart value to 75%

cpay_stride (integer)

y axis chart tick interval (in thousands) for charts displaying cpay (career pay)

fixed_col_name (string)

label to use for quantile number column when calculating using the initial quantile membership for all results

running_col_name (string)

label to use for quantile number column when calculating using a continuously updated quantile membership for all results

figsize (tuple)

optional size of all generated chart images. Default is None. This input will allow creation of larger chart images than the default small charts, at the price of an increase in the time required to run the function.

chartstyle (string)

any valid seaborn charting style (‘ticks’, ‘dark’, ‘white’, ‘darkgrid’, ‘whitegrid’), default is ‘ticks’

verbose_status (boolean)

if True, print status of calculations as function is running

tick_size (integer or float)

text size of tick labels on the output chart images

legend_size (integer or float)

text size of the legend on the output chart images

label_size (integer or float)

text size of the x and y axis labels on the output chart images

title_size (integer or float)

text size of the title on the output chart images

adjust_chart_top (float)

input to permit adjustment of the top location of the generated charts, used to ensure the full chart title is captured by the save chart figure code. The default top position is 1.0; the default value for this input is .85, which “shrinks” the charts slightly vertically so that the two-line chart titles are captured when saving the charts to file as images.
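The year-membership grouping controlled by the date_grouper input can be sketched as follows. The sample records, the `empkey` field, and the `group_by_year` helper are hypothetical and use plain dictionaries in place of a dataframe; this is not the module's own code.

```python
from collections import defaultdict
from datetime import date

# Hypothetical employee records; a real dataset would be a dataframe.
employees = [
    {"empkey": 1, "ldate": date(1998, 3, 1), "doh": date(1997, 11, 15)},
    {"empkey": 2, "ldate": date(1998, 9, 20), "doh": date(1998, 9, 20)},
    {"empkey": 3, "ldate": date(2001, 1, 5), "doh": date(2000, 12, 1)},
]


def group_by_year(records, date_grouper="ldate"):
    """Group employee keys by the year of the chosen date column."""
    if date_grouper not in ("ldate", "doh"):
        raise ValueError("date_grouper must be 'ldate' or 'doh'")
    groups = defaultdict(list)
    for rec in records:
        groups[rec[date_grouper].year].append(rec["empkey"])
    return dict(groups)


by_ldate = group_by_year(employees, "ldate")
by_doh = group_by_year(employees, "doh")
```

The same employees land in different year groups depending on which date column is selected, which is why the choice of date_grouper changes the grouped chart output.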

reports.job_diff_to_excel(base_ds, compare_ds, ds_dict, add_cpay=True, diff_color=True, row_color=True, lighten_factor=0.65, neg_color='red', pos_color='blue', zero_color='white', id_cols=['lname', 'ldate', 'retdate'])

Generates a spreadsheet which reports the differential number of months spent at each job level between two outcome datasets. Results are reported for every employee.

The order of the employees shown will be the order from the “compare” dataset input.

The user may choose to apply formatting to the output spreadsheet. However, generating the output with formatting is much slower than generating it without.

Stores the output within the reports/<case_name>/by_employee folder.
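The idea of a per-employee job month differential can be sketched in plain Python. The helper names, the `(employee, job level)` row layout, and the sign convention (compare minus base) are assumptions made for illustration, not the function's actual implementation.

```python
from collections import Counter


def months_per_job(monthly_rows):
    """Count months at each job level per employee from (emp, job) rows."""
    counts = {}
    for emp, job in monthly_rows:
        counts.setdefault(emp, Counter())[job] += 1
    return counts


def job_month_diff(base_rows, compare_rows, job_levels):
    """Return compare-minus-base month counts, in compare dataset order."""
    base = months_per_job(base_rows)
    compare = months_per_job(compare_rows)
    # Preserve the employee order found in the compare dataset.
    order = list(dict.fromkeys(emp for emp, _ in compare_rows))
    return {
        emp: [compare[emp][job] - base.get(emp, Counter())[job]
              for job in job_levels]
        for emp in order
    }


# Hypothetical monthly records: one (employee, job level) pair per month.
base_rows = [("smith", 2), ("smith", 2), ("smith", 1),
             ("jones", 3), ("jones", 3), ("jones", 3)]
compare_rows = [("jones", 3), ("jones", 2), ("jones", 2),
                ("smith", 2), ("smith", 1), ("smith", 1)]

diff = job_month_diff(base_rows, compare_rows, job_levels=[1, 2, 3])
```

Here "jones" gains two months at level 2 and loses two at level 3, while "smith" gains one month at level 1 and loses one at level 2.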

inputs
base_ds (dataframe)

baseline dataset

compare_ds (dataframe)

comparison dataset

add_cpay (boolean)

if True, add a “cpay_diff” column to show data model pay differential (compare vs. base)

diff_color (boolean)

if True, use the neg_color, pos_color, and zero_color inputs to color the spreadsheet job differential output

row_color (boolean)

color spreadsheet rows by employee group if True. Color will be a tint (lighter color version) of the colors used to represent the employee groups in chart output.

lighten_factor (float)

when the “row_color” input is True, this input controls the tint of the normal employee group colors to use for the cell background row coloring. The input is limited from 0.0 to 1.0 and a higher value will make the coloring lighter.

neg_color, pos_color, zero_color (color values)

these inputs determine the font colors to use for negative, positive, and zero job differential values within the spreadsheet output. Inputs may be string hex values, or rgb values within tuples or lists

id_cols (list)

list of columns to include within the spreadsheet output which are in addition to the job level columns. This list (with the addition of the “order” column) will also be colored according to employee group when the “row_color” input is set to True.
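The effect of the lighten_factor input on row coloring can be approximated with a simple blend toward white. The `tint` helper and the linear blending formula are assumptions for illustration; the module may derive its tints differently.

```python
def tint(rgb, lighten_factor=0.65):
    """Blend an RGB color (channels in 0.0-1.0) toward white.

    Hypothetical helper: 0.0 returns the color unchanged, 1.0 returns white.
    """
    if not 0.0 <= lighten_factor <= 1.0:
        raise ValueError("lighten_factor must be between 0.0 and 1.0")
    return tuple(c + (1.0 - c) * lighten_factor for c in rgb)


# Hypothetical employee group color.
group_color = (0.0, 0.5, 1.0)

row_tint = tint(group_color, lighten_factor=0.5)
unchanged = tint(group_color, lighten_factor=0.0)
white = tint(group_color, lighten_factor=1.0)
```

A higher factor moves every channel closer to 1.0, which matches the description that a higher value makes the row coloring lighter.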

reports.retirement_charts(ds_dict, adict, cdict, plot_year_group=True, date_grouper='ldate', plot_job_group=True, plot_init_quarter=True, plot_running_quarter=True, quantiles=10, pcnt_ylim=0.75, cpay_stride=500, fixed_col_name='eg_initQ', running_col_name='eg_runQ', figsize=None, chartstyle='ticks', verbose_status=True, tick_size=13, legend_size=14, label_size=14, title_size=14, adjust_chart_top=0.85)

Generates multiple charts representing general attribute statistics of all calculated datasets for all employee groups AT RETIREMENT ONLY.

The user may select grouping analysis by any or all of the following:

  1. longevity or date of hire year

  2. job level

  3. initial employee group list quantile membership

  4. annual employee group list quantile membership

Stores the output as images in multiple folders within the reports/<case_name>/ret_charts folder.
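One way to picture "at retirement only" data is to keep just the final monthly record for each employee. The sketch below assumes that the last record represents the retirement month, and the `empkey`, `mnum`, and `jnum` field names and the `final_rows` helper are hypothetical; the module may select retirement data another way.

```python
def final_rows(monthly_rows):
    """Keep only the record with the highest month number per employee."""
    last = {}
    for row in monthly_rows:
        key = row["empkey"]
        if key not in last or row["mnum"] > last[key]["mnum"]:
            last[key] = row
    return list(last.values())


# Hypothetical monthly records (month number and job level per employee).
rows = [
    {"empkey": 1, "mnum": 0, "jnum": 3},
    {"empkey": 1, "mnum": 1, "jnum": 2},
    {"empkey": 2, "mnum": 0, "jnum": 4},
    {"empkey": 2, "mnum": 1, "jnum": 4},
    {"empkey": 2, "mnum": 2, "jnum": 3},
]

at_retirement = final_rows(rows)
```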

inputs
ds_dict (dictionary)

output of load_datasets function, a dictionary of datasets

adict (dictionary)

dataset column name description dictionary

cdict (dictionary)

program colors dictionary

plot_year_group (boolean)

if True, create chart images grouped by the date_grouper input year

date_grouper (string)

column name representing a column of dates within a dataframe. Year membership of this column will be used for grouping. Input is limited to ‘ldate’ or ‘doh’.

plot_job_group (boolean)

if True, create chart images grouped by job level held by employees

quantiles (integer)

the number of binning quantiles to measure for the initial and running (annually updated) quantile membership analysis (default is 10)

plot_init_quarter (boolean)

if True, produce output grouped by initial list quantile membership, for each employee group

plot_running_quarter (boolean)

if True, produce output grouped by annual list quantile membership, for each employee group

pcnt_ylim (float)

maximum y axis value for percentage attribute charts, expressed as a float. Example: .75 sets the maximum displayed chart value to 75%

cpay_stride (integer)

y axis chart tick interval (in thousands) for charts displaying cpay (career pay)

fixed_col_name (string)

label to use for quantile number column when calculating using the initial quantile membership for all results

running_col_name (string)

label to use for quantile number column when calculating using a continuously updated quantile membership for all results

figsize (tuple)

optional size of all generated chart images. Default is None. This input will allow creation of larger chart images than the default small charts, at the price of an increase in the time required to run the function.

chartstyle (string)

any valid seaborn charting style (‘ticks’, ‘dark’, ‘white’, ‘darkgrid’, ‘whitegrid’), default is ‘ticks’

verbose_status (boolean)

if True, print status of calculations as function is running

tick_size (integer or float)

text size of tick labels on the output chart images

legend_size (integer or float)

text size of the legend on the output chart images

label_size (integer or float)

text size of the x and y axis labels on the output chart images

title_size (integer or float)

text size of the title on the output chart images

adjust_chart_top (float)

input to permit adjustment of the top location of the generated charts, used to ensure the full chart title is captured by the save chart figure code. The default top position is 1.0; the default value for this input is .85, which “shrinks” the charts slightly vertically so that the two-line chart titles are captured when saving the charts to file as images.

reports.stats_to_excel(ds_dict, quantiles=10, date_grouper='ldate', fixed_col_name='eg_initQ', running_col_name='eg_runQ')

Create a set of basic statistics for each calculated dataset and write the results as spreadsheets within the reports folder.

There are 2 spreadsheets produced, one related to retirement data and the other related to annual data.

The retirement information is grouped by employees retiring in future years, further grouped by longevity or initial job.

The annual information is grouped by the model year, and further grouped by 10% quantiles, either by initial quantile membership or by an annual quantile adjustment of remaining employees.
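The two-level grouping described for the annual information can be sketched as a mean of one attribute per (model year, quantile) pair. The `year` and `jnum` column names, the sample rows, and the `mean_by_year_and_quantile` helper are hypothetical; only the 'eg_initQ' label comes from the function's default inputs, and the statistics actually written to the spreadsheets may differ.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical annual records with an initial quantile label per employee.
rows = [
    {"year": 2020, "eg_initQ": 1, "jnum": 1},
    {"year": 2020, "eg_initQ": 1, "jnum": 3},
    {"year": 2020, "eg_initQ": 2, "jnum": 4},
    {"year": 2021, "eg_initQ": 1, "jnum": 1},
    {"year": 2021, "eg_initQ": 2, "jnum": 3},
    {"year": 2021, "eg_initQ": 2, "jnum": 2},
]


def mean_by_year_and_quantile(records, quantile_col="eg_initQ", attr="jnum"):
    """Average one attribute for each (year, quantile) combination."""
    buckets = defaultdict(list)
    for rec in records:
        buckets[(rec["year"], rec[quantile_col])].append(rec[attr])
    return {key: mean(vals) for key, vals in sorted(buckets.items())}


summary = mean_by_year_and_quantile(rows)
```

Passing a running quantile column name (for example 'eg_runQ') as `quantile_col` would produce the same kind of summary using annually updated membership instead.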

inputs
ds_dict (dictionary)

output of load_datasets function, a dictionary of datasets

quantiles (integer)

the number of binning quantiles to measure for the initial and running (annually updated) quantile membership analysis (default is 10)

date_grouper (string)

column name representing a column of dates within a dataframe. Year membership of this column will be used for grouping. Input is limited to ‘ldate’ or ‘doh’.

fixed_col_name (string)

label to use for quantile number column when calculating using the initial quantile membership for all results

running_col_name (string)

label to use for quantile number column when calculating using a continuously updated quantile membership for all results