Aloha as an on demand QA tool #119

@jmorra

Description

After speaking with a bunch of people, I think we should consider repositioning Aloha as a full-service model representation language. Aloha already solves a key problem in ML: the tight coupling between feature functions and the model. Another key problem I think we should address is determining the correctness of data at score time. To do this, I propose the following framework: at train time, we should record statistics on the features in the model, including at least

  1. The P(occurrence) of the feature
  2. The mean value of the feature
  3. The variance of the feature
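
All three statistics can be maintained in a single streaming pass. As a minimal sketch (this is not Aloha's actual API; the class and method names are hypothetical), using Welford's online algorithm so the same accumulator could later back the score-time window:

```java
// Hypothetical sketch of per-feature train-time statistics.
// Welford's algorithm keeps the mean and variance numerically stable
// without storing the observed values.
public class FeatureStats {
    private long total;    // examples seen
    private long present;  // examples where the feature occurred
    private double mean;   // running mean of observed values
    private double m2;     // running sum of squared deviations from the mean

    /** Record one example; pass hasValue = false when the feature is missing. */
    public void observe(boolean hasValue, double value) {
        total++;
        if (hasValue) {
            present++;
            double delta = value - mean;
            mean += delta / present;
            m2 += delta * (value - mean);
        }
    }

    public double pOccurrence() { return total == 0 ? 0.0 : (double) present / total; }
    public double mean()        { return mean; }
    public double variance()    { return present < 2 ? 0.0 : m2 / (present - 1); }
}
```

The same accumulator, reset and bounded to recent examples, is what the "QA window" below would run at score time.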

If we record these, then at score time we can have a field, called something like "QA window", that continues to compute the same statistics using streaming algorithms. We can then have another field called a policy. The missingValues field already does something like this, but I think we should extend it to at least three options:

  1. Nothing
  2. Notify
  3. Refuse to score
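
To make the proposal concrete, here is one way the policy could be wired up. The drift test (a z-score on the window mean against the train-time distribution) and the threshold are my assumptions for illustration, not part of the issue:

```java
// Illustrative only: gating scoring on drift between train-time and
// score-time feature statistics. Names and the drift test are hypothetical.
public class QaPolicy {
    public enum Action { NOTHING, NOTIFY, REFUSE_TO_SCORE }

    private final Action onAnomaly;   // what to do when drift is detected
    private final double zThreshold;  // how many standard errors count as drift

    public QaPolicy(Action onAnomaly, double zThreshold) {
        this.onAnomaly = onAnomaly;
        this.zThreshold = zThreshold;
    }

    /** Compare a score-time window mean (over n examples) to the train-time stats. */
    public Action check(double trainMean, double trainVar, double windowMean, long n) {
        if (trainVar <= 0 || n == 0) return Action.NOTHING;
        double z = Math.abs(windowMean - trainMean) / Math.sqrt(trainVar / n);
        return z > zThreshold ? onAnomaly : Action.NOTHING;
    }
}
```

A model configured with "Nothing" would skip the check entirely; "Notify" and "Refuse to score" differ only in the configured Action.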

This will also necessitate building a notification system into Aloha, which honestly I don't think is too hard. An email and message notification system would, I think, be sufficient.
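
The notification piece could be as small as a pluggable interface; the names here are hypothetical, not an existing Aloha API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical notification hook: email and chat-message senders would
// implement the same interface.
interface Notifier {
    void send(String featureName, String message);
}

// In-memory implementation, useful for tests.
class RecordingNotifier implements Notifier {
    final List<String> sent = new ArrayList<>();
    public void send(String featureName, String message) {
        sent.add(featureName + ": " + message);
    }
}
```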

If all of this existed, Aloha could be marketed as a very full-featured model representation language, far superior to PMML (and frankly to anything else I can think of), and it would get us much closer to a true self-service modeling framework.
