The ability to write functions or methods is a fundamental skill for data scientists, so it is important that this technique is taught and used regularly throughout the DataCamp curriculum. However, getting students to write functions introduces a number of challenges.
It adds considerable cognitive load: students have to reason about what the function does and about its syntax, while keeping track of the values of variables that they can't easily inspect directly.
Providing accurate, targeted feedback with the SCTs is harder than with a script.
Feedback from students suggests that writing functions is one of the topics on DataCamp that they find trickiest.
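The point about variables that students can't easily access can be illustrated with a minimal sketch. It borrows the geometric mean computation used in the example later in this section; the data and the intermediate variable names are hypothetical, chosen only for illustration.

```r
# Hypothetical data for illustration
x <- c(1, 10, 100)

# Script: each intermediate value lives in the workspace and can be printed
log_x <- log(x)
mean_log_x <- mean(log_x)
exp(mean_log_x)

# Function: the same intermediates exist only while the function runs
geomean <- function(values) {
  logs <- log(values)
  mean_of_logs <- mean(logs)
  exp(mean_of_logs)
}
geomean(x)

# The script's intermediate can be inspected; the function's cannot
exists("mean_log_x")    # TRUE
exists("mean_of_logs")  # FALSE
```

In the script, a student who gets a wrong answer can print `log_x` or `mean_log_x` to see where things went wrong. Inside the function, the equivalent values are gone once the call returns.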
A great strategy is to break the work into two steps: write a script first, then convert it into a function. This separates the tasks of "how do I get the right answer?" and "how do I write a function?".
Depending on the complexity of the function, you can have students do this either in a single sequential exercise or in separate, consecutive exercises.
A great bonus of this approach is that it will help students internalize the "prototype -> productionalize" workflow that many software engineers and data scientists follow.
Example
You want the students to write a function that calculates the geometric mean. This can be done in two sequential exercises, each with two steps.
Ex 1, Step 1: Write a script performing the computation.
Sample code
# Compute the geometric mean of x
___(___(___(x)))
Solution code
# Compute the geometric mean of x
exp(mean(log(x)))
Ex 1, Step 2: Convert the previous step to a function
Sample code
# Convert the script to a geomean function
exp(mean(log(x)))
Solution code
# Convert the script to a geomean function
geomean <- function(x) {
  exp(mean(log(x)))
}
Ex 2, Step 1: Update the script to handle missing values
Sample code
# Update this to ignore missing values
exp(mean(log(x)))
Solution code
# Update this to ignore missing values
exp(mean(log(x), na.rm = TRUE))
Ex 2, Step 2: Convert the previous step to a function
Sample code
# Convert the script to a geomean function
exp(mean(log(x), na.rm = TRUE))
Solution code
# Convert the script to a geomean function
geomean <- function(x, na.rm = FALSE) {
  exp(mean(log(x), na.rm = na.rm))
}
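As a quick sanity check of the final version, the function can be compared against the script it was converted from. This is a sketch rather than part of the exercises, and the vector `x` below is made-up illustration data.

```r
# Final geomean() from Ex 2, Step 2
geomean <- function(x, na.rm = FALSE) {
  exp(mean(log(x), na.rm = na.rm))
}

# Hypothetical input with one missing value
x <- c(1, 10, 100, NA)

# Script version from Ex 2, Step 1
script_result <- exp(mean(log(x), na.rm = TRUE))

# The function should give the same answer as the script
function_result <- geomean(x, na.rm = TRUE)
all.equal(script_result, function_result)

# With the default na.rm = FALSE, the missing value propagates
geomean(x)
```

Defaulting `na.rm` to `FALSE` mirrors the behavior of `mean()` itself, so a missing value in the input yields `NA` unless the caller explicitly asks to drop it.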
As you can see, even getting the students to write a fairly simple function can take a lot of time. This is a very introductory example, and the difficulty level that you can get away with depends on the Roles that you are targeting, but in general you should err on the side of getting the students to write shorter functions with fewer input arguments.