in MVP:
analyze feature discrepancies between the two environments
basic support for multiple models
only save one copy of data when getting predictions from multiple models
analyze prediction discrepancies between the two environments
relatively detailed analytics on feature discrepancies between the two environments
tracking avg feature values
analyzing discrepancies in labels
saving concordia
loading concordia
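The feature-discrepancy analytics above might look something like the following minimal sketch. All names here (`feature_discrepancies`, the dict-of-dicts report shape) are illustrative assumptions, not the actual concordia API; it just shows tracking avg feature values per environment and reporting the delta plus a separately-reported missing rate.

```python
# Hypothetical sketch: per-feature discrepancy analytics between the
# "training" environment and the "live" environment. Names are
# illustrative, not the actual concordia API.
from statistics import mean

def feature_discrepancies(train_rows, live_rows):
    """Compare average feature values across the two environments.

    Each argument is a list of dicts mapping feature name -> value.
    Returns {feature: {"train_avg", "live_avg", "delta", "live_missing_rate"}}.
    """
    report = {}
    features = set().union(*train_rows, *live_rows)
    for f in sorted(features):
        train_vals = [r[f] for r in train_rows if f in r]
        live_vals = [r[f] for r in live_rows if f in r]
        report[f] = {
            "train_avg": mean(train_vals) if train_vals else None,
            "live_avg": mean(live_vals) if live_vals else None,
            "delta": (mean(live_vals) - mean(train_vals))
                     if train_vals and live_vals else None,
            # prevalence of missing values in live, reported separately
            "live_missing_rate": 1 - len(live_vals) / len(live_rows),
        }
    return report

train = [{"age": 30.0, "income": 50_000.0}, {"age": 40.0, "income": 70_000.0}]
live = [{"age": 35.0}, {"age": 45.0, "income": 90_000.0}]
print(feature_discrepancies(train, live)["age"]["delta"])  # 5.0
```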
Borderline
impact of feature discrepancies on predictions?
get list of deltas
get "clean" data from our training features
iterate through deltas list a fixed number of times (say, 10 times)
modify the "clean" data by the amount of the delta
get predictions on the modified data
compare that to our baseline predictions
goal is "in isolation, this feature discrepancy by itself causes X delta in our predictions"
which really gets us to: "if I only have a certain amount of time to make predictions as accurate as possible, where should I start?"
we are explicitly NOT taking prevalence of deltas into account; we'll just report that figure separately. the interesting part here is "when this feature is messed up, what impact does that have (on average)?" the use case in mind is a feature that's only missing sometimes, and we want to understand whether it's worth looking into further or not.
alternative is just to sort by feature importances
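The isolation loop described above could be sketched roughly like this. Everything here is an assumption for illustration: `model.predict`, the `(feature, delta)` list shape, and the choice of mean absolute shift as the impact metric are not confirmed by the notes.

```python
# Hypothetical sketch of the isolation loop: apply each observed feature
# delta to otherwise-"clean" training data and measure the average shift
# in predictions vs. the clean baseline. All names are assumptions.
import copy
from statistics import mean

def isolated_delta_impact(model, clean_rows, deltas, n_samples=10):
    """For each (feature, delta) pair, answer: in isolation, how much
    does this discrepancy move predictions vs. the clean baseline?"""
    sample = clean_rows[:n_samples]                 # fixed number of rows
    baseline = [model.predict(r) for r in sample]   # baseline predictions
    impact = {}
    for feature, delta in deltas:
        modified = copy.deepcopy(sample)
        for row in modified:
            row[feature] += delta                   # inject the discrepancy
        preds = [model.predict(r) for r in modified]
        # average absolute prediction shift caused by this delta alone
        impact[(feature, delta)] = mean(
            abs(p - b) for p, b in zip(preds, baseline))
    return impact

class ToyModel:
    """Stand-in linear model for demonstration only."""
    def predict(self, row):
        return 2 * row["x"] + row["y"]

rows = [{"x": 1.0, "y": 2.0}, {"x": 3.0, "y": 4.0}]
impact = isolated_delta_impact(ToyModel(), rows, [("x", 0.5), ("y", 1.0)])
# a 0.5 delta in "x" shifts predictions by 1.0 on average; so does a
# 1.0 delta in "y" -- same isolated impact despite different raw deltas
```

Reporting per-delta impact and delta prevalence as separate figures (rather than multiplying them together) keeps the "where should I start?" question answerable either way.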
explicitly not in MVP:
any kind of integration with charting or alerting software
predict_all
list_models
any kind of error/value checking before getting predictions (i.e., if we're missing features, let the user choose whether to warn, ignore, or raise an error)
prediction_id
two different forms of live env tracking: one for testing and one for live models
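Even though it's deferred past the MVP, the missing-feature check described above (user chooses warn / ignore / raise) is a small pattern; a hedged sketch, with all names hypothetical:

```python
# Hypothetical sketch of the deferred pre-prediction check: the caller
# chooses whether missing features warn, are ignored, or raise.
import warnings

def check_features(row, expected, on_missing="raise"):
    """Return the list of expected features absent from `row`,
    handling them per the caller's chosen policy."""
    missing = [f for f in expected if f not in row]
    if missing:
        if on_missing == "raise":
            raise ValueError(f"missing features: {missing}")
        if on_missing == "warn":
            warnings.warn(f"missing features: {missing}")
        # on_missing == "ignore": fall through silently
    return missing

check_features({"age": 30}, ["age", "income"], on_missing="ignore")  # -> ["income"]
```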