MVP Scope #8

@ClimbsRocks

Description

  • Get live predictions
  • Save live features to db
  • Save training features to db
  • Save training preds to db
  • Save live preds to db
  • Work for both predict and predict_proba
  • Analyze feature discrepancies between the two environments
  • Basic support for multiple models
  • Only save one copy of data when getting predictions from multiple models
  • Analyze prediction discrepancies between the two environments
  • Relatively detailed analytics on feature discrepancies between the two environments
  • Track avg feature values
  • Analyze discrepancies in labels
  • Save the Concordia object
  • Load the Concordia object
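
The save-and-compare flow above can be sketched in a few lines. This is a toy, in-memory stand-in (all names here are hypothetical, not Concordia's actual API): persist features from both environments, then compare per-feature averages to surface discrepancies.

```python
# Hypothetical sketch: an in-memory stand-in for the db that stores
# features per environment and reports avg-feature-value discrepancies.
from statistics import mean

class FeatureStore:
    def __init__(self):
        self.rows = {"train": [], "live": []}

    def save(self, env, row):
        # In the real system this would write to the db; here we just append.
        self.rows[env].append(row)

    def avg_feature_values(self, env):
        rows = self.rows[env]
        return {k: mean(r[k] for r in rows) for k in rows[0]}

    def feature_discrepancies(self):
        # Positive delta = live average is higher than training average.
        train_avg = self.avg_feature_values("train")
        live_avg = self.avg_feature_values("live")
        return {k: live_avg[k] - train_avg[k] for k in train_avg}

store = FeatureStore()
store.save("train", {"age": 30, "income": 50_000})
store.save("train", {"age": 40, "income": 70_000})
store.save("live", {"age": 35, "income": 0})        # income missing upstream
store.save("live", {"age": 45, "income": 80_000})

print(store.feature_discrepancies())
# {'age': 5.0, 'income': -20000.0}
```

A real implementation would key rows by model and timestamp so the same data isn't saved twice when multiple models share a prediction request.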

Borderline

  • impact on predictions of feature discrepancies?
    • Get the list of deltas
    • Get "clean" data from our training features
    • Iterate through the deltas list a fixed number of times (say, 10 times)
    • Modify the "clean" data by the amount of the delta
    • Get predictions on the modified data
    • Compare those to our baseline predictions
    • The goal is "in isolation, this feature discrepancy by itself causes X delta in our predictions"
    • Which really gets us to "if I only have a certain amount of time to make predictions as accurate as possible, where should I start?"
    • We are explicitly NOT taking into account the prevalence of deltas; we'll just report that figure separately. The interesting part is "when this feature is messed up, what impact does that have, on average?" The use case in mind is a feature that's only missing sometimes, where we want to understand whether it's worth looking into further or not.
    • The alternative is simply to sort by feature importances
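
The isolation loop above can be sketched as follows (all names are hypothetical): apply each observed feature delta to clean training rows, one feature at a time, and measure how much the model's prediction moves on average.

```python
# Hypothetical sketch of the "impact in isolation" loop: for each
# (feature, delta) pair, perturb only that feature on clean rows and
# average the absolute shift in the model's prediction.
def isolated_impact(model_predict, clean_rows, deltas, n_samples=10):
    sample = clean_rows[:n_samples]
    baseline = [model_predict(row) for row in sample]
    impacts = {}
    for feature, delta in deltas.items():
        shifts = []
        for row, base in zip(sample, baseline):
            modified = dict(row)
            modified[feature] = modified[feature] + delta  # perturb one feature
            shifts.append(abs(model_predict(modified) - base))
        impacts[feature] = sum(shifts) / len(shifts)
    return impacts

# Toy linear "model" so the numbers are checkable by hand.
def model_predict(row):
    return 2 * row["age"] + row["income"] / 1000

rows = [{"age": 30, "income": 50_000}, {"age": 40, "income": 70_000}]
deltas = {"age": 1.0, "income": -10_000}
print(isolated_impact(model_predict, rows, deltas))
# {'age': 2.0, 'income': 10.0}
```

Note this deliberately ignores how often each delta occurs, matching the scope above: prevalence would be reported separately, and the per-delta impact answers "is this worth looking into?"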

Explicitly not in MVP:

  • Any kind of integration with charting or alerting software
  • predict_all
  • list_models
  • Any kind of error/value checking before getting predictions (i.e., if we're missing features, let the user choose whether to warn, ignore, or raise an error)
  • prediction_id
  • Two different forms of live env tracking: one for testing and one for live models
  • Support for shadow-only models
  • Handling shadow models at analytics time
  • feature_importances
