if the user passes in a list of more than one model_id, or calls predict_all, we will add only one item to our DB to track features, to be somewhat more space efficient
if they pass a single model at a time to .predict, we will save data each time
the user can pass in the same row of data multiple times, and we will add it to our DB multiple times (they will be able to pass drop_duplicates=True at analytics time)
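The logging rules above can be sketched roughly as follows. This is a minimal illustration, not the real implementation: the `PredictionTracker` class, the `feature_log` table, and the SQLite backing store are all assumptions made for the example.

```python
import json
import sqlite3


class PredictionTracker:
    """Illustrative sketch of the logging rules above (names are assumptions)."""

    def __init__(self, models, db_path=":memory:"):
        self.models = models  # assumed shape: {model_id: predict_fn}
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS feature_log (model_ids TEXT, features TEXT)"
        )

    def _log(self, model_ids, row):
        # duplicate rows are stored as-is; dedup is deferred to analytics time
        # via drop_duplicates=True
        self.conn.execute(
            "INSERT INTO feature_log VALUES (?, ?)",
            (json.dumps(sorted(model_ids)), json.dumps(row)),
        )

    def predict(self, model_ids, row):
        preds = {m: self.models[m](row) for m in model_ids}
        # one call adds exactly one log row, whether it covers one model or
        # many; repeated single-model .predict calls each add their own row
        self._log(model_ids, row)
        return preds

    def predict_all(self, row):
        # predict_all runs every model but still logs only one row
        return self.predict(list(self.models), row)
```

Logging once per call (rather than once per model) is what keeps the multi-model and predict_all paths space efficient.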
our analyze_discrepancies function will take a print=True param, which will use tabulate to pretty-print a table of features and their results
analyze_discrepancies will return a list of dictionaries (sorted by feature importance, if available)
each dictionary will represent a row
the first dictionary will contain summary information (in aggregate how much predictions differ between the two envs, how much the actuals differ, avg values for each, how many rows have 0 discrepancies, total counts of missing features, etc.)
each row will have feature_name, avg_val, num_missing, avg_discrepancy, median_discrepancy, avg_abs_discrepancy, median_abs_discrepancy, plus all of the above as a percent of the "usable range" of that feature (95th percentile minus 5th percentile at training time)
analyze_discrepancies will take in an optional model_id. if none, we'll look at all of our data
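Putting the pieces above together, a hedged sketch of analyze_discrepancies might look like the following. The input shapes (paired `env_a`/`env_b` values per logged feature, a `usable_range` dict from training time) are assumptions, and the `print` param is renamed `print_table` here only to avoid shadowing the builtin; the summary's zero-discrepancy count is simplified to a per-feature count.

```python
import statistics


def analyze_discrepancies(rows, usable_range, importances=None,
                          model_id=None, print_table=False):
    """Sketch of the analytics described above; all signatures are assumptions.

    rows: list of {"model_id": ..., "feature": ..., "env_a": ..., "env_b": ...}
    usable_range: {feature: 95th percentile - 5th percentile at training time}
    """
    if model_id is not None:
        rows = [r for r in rows if r["model_id"] == model_id]

    by_feature = {}
    for r in rows:
        by_feature.setdefault(r["feature"], []).append(r)

    results = []
    for feat, frows in by_feature.items():
        pairs = [(r["env_a"], r["env_b"]) for r in frows
                 if r["env_a"] is not None and r["env_b"] is not None]
        num_missing = len(frows) - len(pairs)
        diffs = [a - b for a, b in pairs] or [0.0]
        rng = usable_range.get(feat) or 1.0
        row = {
            "feature_name": feat,
            "avg_val": statistics.mean(a for a, _ in pairs) if pairs else None,
            "num_missing": num_missing,
            "avg_discrepancy": statistics.mean(diffs),
            "median_discrepancy": statistics.median(diffs),
            "avg_abs_discrepancy": statistics.mean(abs(d) for d in diffs),
            "median_abs_discrepancy": statistics.median(abs(d) for d in diffs),
        }
        # each discrepancy stat again as a percent of the usable range
        for k in ("avg_discrepancy", "median_discrepancy",
                  "avg_abs_discrepancy", "median_abs_discrepancy"):
            row[k + "_pct_range"] = 100.0 * row[k] / rng
        results.append(row)

    importances = importances or {}
    results.sort(key=lambda r: -importances.get(r["feature_name"], 0.0))

    # first dict is aggregate summary info (simplified here)
    summary = {
        "feature_name": "__summary__",
        "num_missing": sum(r["num_missing"] for r in results),
        "avg_abs_discrepancy": statistics.mean(
            r["avg_abs_discrepancy"] for r in results) if results else 0.0,
        "features_with_zero_discrepancy": sum(
            1 for r in results if r["avg_abs_discrepancy"] == 0.0),
    }
    out = [summary] + results

    if print_table:
        from tabulate import tabulate  # third-party dependency
        print(tabulate(out, headers="keys"))
    return out
```

Normalizing by the usable range lets discrepancies on differently scaled features be compared on one table.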
tracking avg features will not focus on discrepancies at all (tracking features becoming more or less out of whack over time is out of scope for MVP); it will only focus on showing serving-time values
the user can pass in a feature name to track_features and we will only return results for that feature
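A minimal sketch of track_features under these constraints, assuming the serving log carries a date and value per feature observation (the log shape and field names are illustrative, not the real schema):

```python
import statistics
from collections import defaultdict


def track_features(log_rows, feature_name=None):
    """Sketch of serving-time feature tracking (field names are assumptions).

    log_rows: list of {"date": "YYYY-MM-DD", "feature": str, "value": float}
    Returns average serving-time values per feature per date; discrepancy
    tracking is deliberately out of scope for MVP.
    """
    if feature_name is not None:
        # optional filter: only return results for the named feature
        log_rows = [r for r in log_rows if r["feature"] == feature_name]

    buckets = defaultdict(list)
    for r in log_rows:
        buckets[(r["feature"], r["date"])].append(r["value"])

    return [
        {"feature_name": feat, "date": date, "avg_val": statistics.mean(vals)}
        for (feat, date), vals in sorted(buckets.items())
    ]
```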