Test Cases for Problems
Test cases are used to automate the marking of student code submissions.
There are four types of test cases, described below. By default, all test cases are Validate test cases, which make up the vast majority of test cases used on Grok.
Enabling Other Types of Test Cases
By default, all test cases are Validate test cases. You can enable the other types using the Show Advanced option on the Problem, then change the Test Case Type to one of the types described below.
Types of Test Cases
By default, all test cases are Validate test cases. See above for how to enable other types of test cases.
| Type of Test Case | Correctness (typically used to assess the correctness of the solution) | Elegance (typically used to assess the style & elegance of the solution, rather than the correctness) |
| --- | --- | --- |
| Normal | Validate | Suggest |
| Assignment | Assess | Review |
What's the practical difference between Validate and Suggest test cases?
Essentially, there are two differences:
- The user experience of passing vs not passing the Validate tests. This does not affect a tutor's ability to see the student's latest marked submission.
- Each Validate test case only runs if the previous one passed, whereas ALL of the Suggest test cases run. There is an advanced option to "group" Validate test cases, in which case all tests in the group run at once; see Grouping Tests below for how to set this up. A sketch of the two run strategies follows this list.
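To make the difference concrete, here is a minimal Python sketch of the two run strategies. This is an illustration only, not Grok's implementation; the test names and the `run_validate`/`run_suggest` functions are invented for the example.

```python
def run_validate(tests):
    """Run tests in order, stopping at the first failure (Validate behaviour)."""
    results = []
    for name, test in tests:
        ok = test()
        results.append((name, ok))
        if not ok:
            break  # later Validate tests do not run
    return results

def run_suggest(tests):
    """Run every test, regardless of earlier failures (Suggest behaviour)."""
    return [(name, test()) for name, test in tests]

tests = [
    ("runs at all", lambda: True),
    ("handles empty input", lambda: False),
    ("handles large input", lambda: True),
]
print(run_validate(tests))  # stops after the second test fails
print(run_suggest(tests))   # reports all three results
```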
Using Test Cases for Assessments
With test cases, what Grok envisions is that you set an appropriate minimum acceptance criterion using the Validate test cases (e.g. Is the output format correct? Does it pass the example input/output pairs in the problem description?), and use the Assess test cases to test harder or more nuanced cases. The idea is that students who pass more Assess test cases achieve higher marks.
Generally we (Grok) assume that any student who passes the Validate test cases will receive a passing mark for this assessment. That's not a requirement, of course. But it's our recommendation, and how the system was designed to work. That's also not to say that you can't still assess the students who don't pass all the Validate cases - you absolutely can assess their submissions and assign them a mark.
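As a concrete (hypothetical) example, suppose the problem asks students to read two integers and print their sum. A test plan along the recommended lines might look like the following; the structure and field names are illustrative, not Grok's test case format.

```python
# Hypothetical test plan for a "read two integers, print their sum" problem.
# The Validate/Assess split below is illustrative only.

validate_cases = [
    # Minimum acceptance criteria: the example pairs from the problem description.
    {"stdin": "2\n3\n", "expected_stdout": "5\n"},
    {"stdin": "0\n0\n", "expected_stdout": "0\n"},
]

assess_cases = [
    # Harder, more nuanced cases; passing more of these earns higher marks.
    {"stdin": "-7\n7\n", "expected_stdout": "0\n"},                  # negatives
    {"stdin": "999999999\n1\n", "expected_stdout": "1000000000\n"},  # large values
]
```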
Checker Options
Each test case has a number of optional checker options which are passed to the checker. These options only take effect if the appropriate checker is selected. The most common output checker used for Python code is the "Differ", which diffs the actual input and output of the program against the expected input and output of the test case.
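At its core, a differ-style check is just a comparison of two texts. The sketch below is an illustration, not the Differ's actual code; the friendliness options described next relax this strict comparison.

```python
def differ(actual: str, expected: str) -> bool:
    """Illustrative strict diff: pass iff the output matches exactly, line by line."""
    return actual.splitlines() == expected.splitlines()

print(differ("Hello, world!\n", "Hello, world!\n"))  # True
print(differ("hello world", "Hello, world!"))        # False: case and punctuation differ
```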
Friendliness
The friendliness options only apply when using the Differ checker, and are set on each individual test case.
The differ operates by comparing the expected input and output to the actual input and output.
- Space (ignore whitespace) - removes all whitespace from the actual and expected before diffing
- Punct (ignore punctuation) - removes all punctuation from the actual and expected before diffing
- Case (ignore case) - converts the actual and expected to lowercase before diffing
- Sort (ignore line order) - lexicographically sorts all of the lines of actual and expected before diffing
- Sort | Uniq (ignore line order and duplicates) - does `sort` and then removes duplicate lines in actual and expected before diffing
- Norm floats (round floating point numbers) - rounds all floating point numbers in actual and expected to the provided number of significant figures before diffing (note: significant figures, not decimal places)
- Slice from (ignore lines from N) - remove the first N lines of output from actual and expected before diffing. This happens before sort and sort | uniq.
- Slice to (ignore lines to N) - removes all lines after line N from actual and expected before diffing. If set to a negative number, it counts back from the last line (like Python slices). This happens before sort and sort | uniq.
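A rough sketch of how these normalisations might compose is below. The function, its option names, and the order in which options are applied (beyond slicing happening before sorting) are assumptions for illustration, not Grok's implementation.

```python
import re
import string

def normalise(text, space=False, punct=False, case=False, sort=False,
              uniq=False, sig_figs=None, slice_from=None, slice_to=None):
    """Apply differ-style friendliness options to one side before comparing."""
    lines = text.splitlines()
    # Slicing happens before sort and sort|uniq, as noted above.
    lines = lines[slice_from:slice_to]
    if sig_figs is not None:
        # Significant figures, not decimal places: Python's 'g' format does this.
        round_float = lambda m: f"{float(m.group()):.{sig_figs}g}"
        lines = [re.sub(r"-?\d+\.\d+", round_float, line) for line in lines]
    if case:
        lines = [line.lower() for line in lines]
    if punct:
        lines = [line.translate(str.maketrans("", "", string.punctuation))
                 for line in lines]
    if space:
        lines = ["".join(line.split()) for line in lines]
    if sort:
        lines = sorted(lines)
    if uniq:
        lines = [line for i, line in enumerate(lines)
                 if i == 0 or line != lines[i - 1]]
    return lines

# Example: with case and sort enabled, these two outputs diff as equal.
assert (normalise("Banana\napple\n", case=True, sort=True)
        == normalise("Apple\nbanana\n", case=True, sort=True))
```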
Grouping Tests
Using the advanced grouping option mentioned above, you can arrange tests so that each group only runs once the previous group has passed. For example:
- Test 1: Simple validation test with example to ensure code runs at all
- Tests 2 - 5: Validation tests of edge cases. Only run if Test 1 passes.
- Tests 6 - 8: Assess tests to assist manual marking. Only run if all Tests 2 - 5 pass.
Changing behaviour when a Validation test fails
Another way to get the results of all tests, even when an earlier validation test fails, is to alter the problem's validation test behaviour.
Options are:
- Stop on fail - If a validation test fails, do not run any further tests beyond the current group, and do not show the results of any later tests in the current group
- Stop on fail but show all group results - As above, but show all results for the current test group (even if they occur after the failing test)
- Continue on fail - Run and show results of all tests, regardless of earlier test failures.
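The three behaviours can be sketched as follows. This is an illustration of the semantics described above, not Grok's code; the mode names and the `groups` structure are invented for the example.

```python
def run_validation(groups, mode):
    """Run grouped validation tests and return the results to show.

    mode is one of 'stop', 'stop_show_group', or 'continue' (invented names
    matching the three options above). Each group is a list of (name, test)
    pairs, where test is a zero-argument callable returning True or False.
    """
    shown = []
    for group in groups:
        results = [(name, test()) for name, test in group]
        first_fail = next((i for i, (_, ok) in enumerate(results) if not ok), None)
        if mode == "continue" or first_fail is None:
            shown.extend(results)  # no failure, or we show everything regardless
            continue
        if mode == "stop":
            shown.extend(results[:first_fail + 1])  # hide later tests in the group
        else:  # 'stop_show_group'
            shown.extend(results)                   # show the whole failing group
        return shown                                # run no groups beyond this one
    return shown
```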