The Instructions outline the challenge the learner needs to complete and provides details about the format that the answer needs to be provided in, i.e., the name of the variable(s), the number (and names) of columns, data type(s), and any quality restrictions such as missing values. When writing this section, think about the minimum information that learners will require in order to satisfy the SCTs (see this article for more information) and successfully complete the project.
It should focus on information to understand what is expected of them along with the answer format. It does not include information about how to complete the project (this should only be included in the Guides and Resources.
The vision is that learners who have completed the prerequisite course(s) or developed equivalent knowledge and skills elsewhere will be capable of completing the project based solely on reading the Context and Instructions, rather than relying on other sections such as Guides or Resources. The approach of their solution may not align with the Solution Code developed, but as long as the Submission Correctness Tests (SCTs) are met (put another way, the quality criteria for the output(s) are satisfied) then the attempt would be considered successful and the learner deemed to have demonstrated the Knowledge, Skills, and Abilities (KSAs) for the project.
Write 1-4 sentences explicitly detailing what learners must achieve. This should not detail how they complete the project.
Be specific about output/variable requirements. If learners must produce a string variable called
peak_performance then let them know this information, otherwise they might not pass the project's Submission Correctness Tests (SCTs). Likewise, if they must create a pandas DataFrame with specific columns then this information needs to be included.
Use one Instruction, in bullet point format, for each key output that learners will need to produce.
Limit to five Instructions. Best practice is one to three Instructions. If you need more than five then the project is likely too complex, too large, or you are being too prescriptive in the Instructions.
Utilize markdown syntax. For example, if you are referring to a variable then use backticks to render the name of the variable as code so it is clear to learners that this is part of the syntax they will need to use.
Common problems and their solutions
Being too vague. While instructions should allow learners to interpret what is needed and action in their own way, it is important to be specific about what the outcome should be in terms of outputs and performance requirements. For example, rather than saying “Build a model to predict Y and evaluate the model’s performance” you should say “Build a model, stored as
classification_model, to predict the target Y with an f1 score of at least 0.75.”.
Copy and paste. While it might be appropriate to include certain syntax, such as referring to a column name, you should avoid providing so much detail that learners can copy the instructions to execute the code required.
Example Python project Instructions
The following are Instructions from Analyzing Crime in Los Angeles:
The LAPD has asked you to help them by finding answers to the following questions:
Which hour has the highest frequency of crimes? Store as an integer variable called
Which area has the largest frequency of night crimes (crimes committed between 10pm and 4am)? Save as a string variable called
Identify the number of crimes committed against victims by age group (<18, 18-25, 26-34, 35-44, 45-54, 55-64, 65+). Save as a pandas Series called
Example SQL project Instructions
The following are Instructions from Exploring London's Travel Network:
Write three SQL queries to answer the following questions:
What are the most popular transport types, measured by the total number of journeys? The output should contain two columns, 1)
TOTAL_JOURNEYS_MILLIONS, and be sorted by the second column in descending order. Save the query as
Which five months and years were the most popular for the Emirates Airline? Return an output containing
JOURNEYS_MILLIONS, with the latter rounded to two decimal places and aliased as
ROUNDED_JOURNEYS_MILLIONS. Exclude null values and save the result as
Find the five years with the lowest volume of
Underground & DLRjourneys, saving as
least_popular_years_tube. The results should contain the columns