How to author SCTs for Workspace projects
Written by George Boorman
Updated over a week ago

Submission Correctness Tests (SCTs) are scripts that check whether a learner's solution matches the project solution. Once SCTs have been authored, you can check them by previewing your project from the Teach Editor and running the solution code. This will produce one of the four outputs described under "Types of outputs" below.

How to approach SCTs

SCTs are designed to check any outputs that must be produced as per the project's Instructions. They should follow a general logic of:

  1. Check that the variable a learner needs to create exists.

  2. Check that the variable is the correct data type/class.

  3. Check that the data is correct, e.g., if it's a pandas DataFrame then first check the column names, then the values, subsetting if necessary to enable granular feedback on where learners might have gone wrong.
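The three steps above can be sketched for a DataFrame as follows. This is only an illustration: the variable top_areas, its columns, and its values are hypothetical, and the first lines stand in for the learner's notebook, which in a real SCT would already have run. It also assumes the SCT runs at the top level of the same session as the learner's code.

```python
import pandas as pd

# Stand-in for the learner's notebook output (hypothetical variable and data)
top_areas = pd.DataFrame({"area": ["Central", "Hollywood"], "crimes": [120, 95]})

# Hypothetical expected answer taken from the project solution
expected_columns = ["area", "crimes"]
expected_crimes = [120, 95]

# 1. Check the variable exists (looked up by name, so a missing variable
#    fails with the custom message rather than a NameError)
assert "top_areas" in globals(), "Did you create a variable called top_areas?"

# 2. Check the data type
assert isinstance(top_areas, pd.DataFrame), "Did you save top_areas as a pandas DataFrame?"

# 3a. Check the column names
assert sorted(top_areas.columns.tolist()) == sorted(expected_columns), \
    'top_areas should contain two columns: "area" and "crimes".'

# 3b. Check the values one column at a time for more granular feedback
assert top_areas["crimes"].tolist() == expected_crimes, \
    'Expected different values in the "crimes" column of top_areas.'
```

Each assertion only runs if the ones before it passed, so the later checks can safely assume the variable exists and has the right type.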

When authoring SCTs you should put yourself into the learners' shoes, thinking about the possible ways they may try to solve your project. This will allow you to design tests that allow the project to be solved in various ways, as long as the end results are correct.

SCTs are not designed to check intermediate steps used to produce the final output. As an example, in this project, learners work with LAPD crime data and are required to find the hour of day with the largest volume of crime, saving it as an integer variable called peak_crime_hour. To find this information they may choose to extract the hour of day from a column called "TIME OCC" (the time that the crime occurred) and store it as a new column in the DataFrame called "HOUR OCC". However, we haven't specified that they need to create this column, so we should not test for its existence.
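To illustrate why only the final output is tested, here is a minimal sketch in which two different approaches reach the same peak_crime_hour. The five-row DataFrame is toy data made up for this example, and it assumes "TIME OCC" holds four-character HHMM strings; the real dataset may be formatted differently.

```python
import pandas as pd

# Toy stand-in for the crime data (assumed format: "HHMM" strings)
crimes = pd.DataFrame({"TIME OCC": ["1230", "1215", "0900", "1245", "2310"]})

# Approach A: store the hour in a new "HOUR OCC" column, then count
crimes_a = crimes.copy()
crimes_a["HOUR OCC"] = crimes_a["TIME OCC"].str[:2].astype(int)
peak_a = int(crimes_a["HOUR OCC"].value_counts().idxmax())

# Approach B: compute the hour on the fly without adding a column
peak_b = int(crimes["TIME OCC"].str[:2].astype(int).mode()[0])

# The same SCTs pass for both approaches because they only look at the
# final variable, never at "HOUR OCC"
for peak_crime_hour in (peak_a, peak_b):
    assert isinstance(peak_crime_hour, int), "Did you create an integer variable called peak_crime_hour?"
    assert peak_crime_hour == 12, "Did you identify the hour of day with the largest volume of crime?"
```

A test that required the "HOUR OCC" column would wrongly fail the second approach, even though its answer is correct.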

Writing SCTs

SCTs in projects are written as a list of assertions in the project's native language (Python for Python or SQL projects, R for R projects). The Workspace evaluates each assertion in order until one fails. If a failure occurs, the error is passed up to the project and displayed to the learner.

We encourage you to catch all possible errors and return them as custom messages using the assertions. For example, it makes sense to check that df is defined before checking how many columns it has. This technique allows you to give the learner more meaningful feedback than generic error messages.
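As a small sketch of this idea, the snippet below compares two ways of checking that df exists when the learner has not created it. It assumes the SCT runs at the top level of the same session as the learner's code, so that learner variables appear in globals(); the outcomes list is only there to record what happened for the demonstration.

```python
outcomes = []

# Referring to an undefined name raises NameError before the assertion
# message is ever used
try:
    assert df is not None, "Did you create a variable called df?"
except NameError as err:
    outcomes.append(("NameError", str(err)))

# Looking the name up as a string fails with the custom message instead
try:
    assert "df" in globals(), "Did you create a variable called df?"
except AssertionError as err:
    outcomes.append(("AssertionError", str(err)))

for kind, message in outcomes:
    print(f"{kind}: {message}")
```

In a real SCT the try/except wrappers would be dropped; the point is that the second form lets the learner see your message rather than a generic one.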

Conventions

We recommend starting with simple tests and increasing in complexity. For example, first check the existence of a variable, then check the data type, then check the number of columns, then the names of columns, then the values in the columns, etc.

Use comments to help describe to future content developers what your tests are doing.

For example, # This test checks that the column names are correct.

Always return a string describing to the user what is wrong when the assertion fails.

For example, “The output should contain 4 columns: "col_one", "col_two", "col_three", and "col_four", please check your code and try again”.

Examples of effective tests

Python/SQL

Please use these Python tests, which are from this project, as inspiration for what is possible. Feel free to try things beyond this too.

import numpy as np
import pandas as pd
# Question 1
# Check peak_crime_hour exists
assert peak_crime_hour is not None, "Did you create a variable called peak_crime_hour?"

# Check the data type (a Python int or any NumPy integer)
assert isinstance(peak_crime_hour, (int, np.integer)), "Did you create an integer variable called peak_crime_hour?"

# Check peak_crime_hour is correct
assert peak_crime_hour == 12, "Did you identify the hour of day with the largest volume of crime?"

# Question 2
# Check peak_night_crime_location exists
assert peak_night_crime_location is not None, "Did you create a variable called peak_night_crime_location?"

# Check the data type
assert isinstance(peak_night_crime_location, str), "Did you save peak_night_crime_location as a string data type?"

# Check peak_night_crime_location is correct
assert peak_night_crime_location == "Central", "Did you find the location with the most crime between 10pm and 4am? Expected somewhere else."

# Question 3
# Check victim_ages exists
assert victim_ages is not None, "Did you create a variable called victim_ages?"

# Check victim_ages is a pandas Series
assert isinstance(victim_ages, pd.Series), "Did you save victim_ages as a pandas Series?"

# Correct labels
# Store the correct labels
victim_ages_labels = ['0-17', '18-25', '26-34', '35-44', '45-54', '55-64', '65+']

# Check labels exist in index
assert sorted(victim_ages.index.tolist()) == sorted(victim_ages_labels), "Did you split victim ages into groups aligned with the Instructions? Remember to exclude negative age values when creating bins."

# Correct number of crimes per age bracket
correct_victim_age_values = [4528, 28291, 47470, 42157, 28353, 20169, 14747]

# Check values exist in Series
assert sorted(victim_ages.values.tolist()) == sorted(correct_victim_age_values), "Expected a different count of crimes per age range in victim_ages."

# Correct answer
correct_victim_ages = dict(zip(victim_ages_labels, correct_victim_age_values))

# Individually check the number of crimes per age bracket are correct
for key, val in correct_victim_ages.items():
    assert victim_ages.loc[key] == val, f"Expected a different number of crimes for {key} age group."

R Projects

Here is an example of R SCTs, which use the stopifnot() function.

# Check if the variable best_feature_df is defined
stopifnot("You should have a dataframe called best_feature_df, please make sure you have assigned this." = exists("best_feature_df"))

# Check if best_feature_df is the right data type
stopifnot("You should have a dataframe called best_feature_df, it looks like you have a different data type assigned to best_feature_df, please check and try again." = is.data.frame(best_feature_df))

# Check if best_feature_df contains one row and two columns
stopifnot("The output should contain one row and two columns, please recheck your best_feature_df." = all.equal(dim(best_feature_df), c(1, 2)))

# Check for the correct feature
stopifnot("Expected a different feature in best_feature_df, please recheck your model evaluation." = best_feature_df[1,1] == "driving_experience")

# Check accuracy score
stopifnot("Expected a different accuracy score in best_feature_df, please recheck your model evaluation." = round(best_feature_df[1,2], 1) == 0.8)

Types of outputs

Below are the types of outputs produced following execution of SCTs, which run when a learner clicks "Submit Project" (note that this button triggers the running of all cells in the notebook, followed by the execution of sct.py / sct.R).

1. Success

Triggered when none of the SCTs raise an error, including an AssertionError (raised when an assertion fails).

Always returns the same fixed success message, and then triggers the end of the project.

2. AssertionError

Triggered when the learner's output doesn't satisfy one of the SCTs, so that assertion raises an AssertionError with its message.

This one will return “Your solution doesn’t look quite right” followed by the exact message returned by your AssertionError.

3. NameError

Triggered when your SCT attempts to access a variable that is not defined, such as asserting df == something when the learner hasn't created a variable called df yet.

This always returns a fixed message, with the variable name at the end. You may wish to assert that a variable exists and return a custom message instead.

4. Any other error

Triggered when any other error is raised in the SCTs or the learner's code.

This message is not customizable.
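The four outcomes can be pictured with a short simulation. This is not the Workspace's actual implementation; classify_sct_run is a hypothetical helper written for this article, and the labels it returns are placeholders rather than the exact messages learners see.

```python
def classify_sct_run(sct_source, learner_namespace):
    """Run SCT source against a copy of the learner's variables and label the outcome."""
    namespace = dict(learner_namespace)
    try:
        exec(sct_source, namespace)
    except AssertionError as err:
        # 2. An assertion failed: the custom message is available
        return "AssertionError", str(err)
    except NameError as err:
        # 3. The SCT referred to a variable that does not exist
        return "NameError", str(err)
    except Exception as err:
        # 4. Any other error
        return "Other error", str(err)
    # 1. Nothing was raised
    return "Success", ""


sct = 'assert peak_crime_hour == 12, "Did you identify the hour of day with the largest volume of crime?"'

print(classify_sct_run(sct, {"peak_crime_hour": 12}))
print(classify_sct_run(sct, {"peak_crime_hour": 11}))
print(classify_sct_run(sct, {}))
print(classify_sct_run("assert 1 / 0 == 1", {}))
```

Only the AssertionError case carries a message you wrote yourself, which is why it pays to turn likely NameErrors and other errors into explicit assertions.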
