Use this before deploying any ML model: evaluation done correctly prevents shipping a model that looks good on metrics but fails in production.
You are a senior {{role}} brought in to help {{target_user}} complete an ML Model Evaluation Framework.

# Context
Original working context:
- Act as an ML evaluation expert. Design a comprehensive evaluation framework for a {{model_type}} solving {{describe_the_problem}}.
- Step 1: primary and secondary metrics with justification and implementation code.
- Step 2: evaluation dataset design: how to build a test set that is representative, not leaked, and covers edge cases.
- Step 3: slice analysis: which subgroups of the data must be evaluated separately (demographic, temporal, geographic).
- Step 4: offline vs. online evaluation: how to decide when the model is ready for A/B testing.

# Goal
Produce the exact deliverable requested for this use-case. Make the output practical, specific, and ready to use.

# Constraints
- Use the user's variables exactly where relevant.
- Avoid generic filler and vague advice.
- Be specific to the stated audience, platform, market, role, industry, or situation.
- Ask only essential clarifying questions if required; otherwise make reasonable assumptions and continue.

# Output
Return the final deliverable in a clean, skimmable format with clear headings, bullets, tables, scripts, templates, or steps as appropriate.
Replace each {{double-curly}} placeholder with your real context.
A model with 95% overall accuracy that scores only 60% on your most important user segment is worse than one with 90% overall accuracy that holds up on that segment, so always do slice analysis.
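To make the point concrete, here is a minimal Python sketch of slice analysis using only the standard library. The function names `accuracy_by_slice` and `flag_weak_slices`, the `max_gap` threshold of 0.10, and the "priority" segment label are all hypothetical choices for illustration, and the data is synthetic, built so that overall accuracy is 95% while the priority segment sits at 60%.

```python
from collections import defaultdict


def accuracy_by_slice(y_true, y_pred, slices):
    """Return (overall_accuracy, {slice_name: accuracy}) for parallel lists."""
    if not (len(y_true) == len(y_pred) == len(slices)):
        raise ValueError("y_true, y_pred and slices must have the same length")
    if not y_true:
        raise ValueError("need at least one example")
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, name in zip(y_true, y_pred, slices):
        total[name] += 1
        correct[name] += int(truth == pred)
    overall = sum(correct.values()) / len(y_true)
    per_slice = {name: correct[name] / total[name] for name in total}
    return overall, per_slice


def flag_weak_slices(per_slice, overall, max_gap=0.10):
    """List slices whose accuracy trails the overall figure by more than max_gap.

    max_gap is an assumed, illustrative threshold, not a recommended value.
    """
    return sorted(name for name, acc in per_slice.items() if overall - acc > max_gap)


# Synthetic toy data: 100 examples, 10 in a hypothetical "priority" segment.
y_true = [1] * 100
slices = ["priority"] * 10 + ["other"] * 90
# priority: 6 of 10 correct; other: 89 of 90 correct.
y_pred = [1] * 6 + [0] * 4 + [1] * 89 + [0] * 1

overall, per_slice = accuracy_by_slice(y_true, y_pred, slices)
weak = flag_weak_slices(per_slice, overall)

print(f"overall accuracy: {overall:.2f}")             # overall accuracy: 0.95
print(f"priority accuracy: {per_slice['priority']:.2f}")  # priority accuracy: 0.60
print(f"weak slices: {weak}")                         # weak slices: ['priority']
```

The headline number alone would pass most release checks here; only the per-slice breakdown exposes the weak segment. In a real framework the slice labels would come from the subgroups named in Step 3 (demographic, temporal, geographic), and accuracy would be swapped for whichever primary metric Step 1 selects.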