Full product evaluation loop

Define realistic scenarios for every major capability, run them, fix the failures, and rerun the whole evaluation set.

Evaluation · Matthew Berman · evaluation, qa, release

Use this to test a product as users actually experience it.

Stop when the full scenario set passes against the original criteria.
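
The loop can be sketched in a few lines of Python. This is only an illustrative outline, not a prescribed implementation: `Scenario`, `run_all`, `evaluation_loop`, the `fix` callback, and the `max_rounds` cap are all hypothetical names introduced here, and the toy `product` dict stands in for a real product. The point it shows is that pass criteria are fixed when scenarios are defined, and that every round reruns the full set rather than only the scenarios that failed.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Scenario:
    """One realistic user scenario (hypothetical structure)."""
    name: str
    capability: str
    run: Callable[[], object]          # exercises the product as a user would
    passes: Callable[[object], bool]   # criterion fixed up front, never edited later


def run_all(scenarios: List[Scenario]) -> List[str]:
    """Run every scenario and return the names of the ones that failed."""
    failures = []
    for scenario in scenarios:
        try:
            ok = scenario.passes(scenario.run())
        except Exception:
            ok = False  # a crash counts as a failure
        if not ok:
            failures.append(scenario.name)
    return failures


def evaluation_loop(scenarios: List[Scenario],
                    fix: Callable[[List[str]], None],
                    max_rounds: int = 5) -> int:
    """Rerun the whole set until it passes; return the number of rounds used."""
    for round_no in range(1, max_rounds + 1):
        failures = run_all(scenarios)   # always the full set, not just old failures
        if not failures:
            return round_no             # stop condition: everything passes
        fix(failures)                   # change the product, not the criteria
    raise RuntimeError(f"scenarios still failing after {max_rounds} rounds")


# Toy stand-in for a product, with one deliberate bug.
product = {"greeting": "helo", "farewell": "bye"}

scenarios = [
    Scenario("greets user", "onboarding",
             lambda: product["greeting"], lambda out: out == "hello"),
    Scenario("says goodbye", "offboarding",
             lambda: product["farewell"], lambda out: out == "bye"),
]


def fix(failures: List[str]) -> None:
    """Stand-in for the real fix step: patch whatever the failures point at."""
    if "greets user" in failures:
        product["greeting"] = "hello"


rounds = evaluation_loop(scenarios, fix)
print(f"full scenario set passed after {rounds} round(s)")
```

Rerunning the full set each round is what catches a fix for one capability that quietly breaks another; the `max_rounds` cap is an added safeguard so the sketch cannot loop forever.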