Before you begin developing, think about the jargon your DataCamp content will use. If students cannot follow because terms haven't been defined, you will likely receive low ratings and few completions. To prepare, write a list of the technical terms, jargon, and acronyms in your content that will likely need to be defined or discussed. Listed below are some common questions, problems, and suggestions to help you think about how to approach jargon.

What can I assume that the students will know already?

It depends a lot on the prerequisites for the course. For example, an introductory statistics course needs to explain what a statistical model is. Most other statistics courses don't need to, since they will have the introductory courses as prerequisites.

Think of synonyms

Often there are several different words for a particular concept. For example, columns in a rectangular dataset may also be referred to as fields, features, or variables, depending upon context. Many students may have heard one term but not the others, so it can be useful to explain synonyms to them. Once you've introduced and explained the synonyms, pick one term and use it consistently throughout your content.
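If you want a quick way to check a draft for consistency, something like the following sketch may help. It is only an illustration, not a DataCamp tool: the `PREFERRED_TERMS` mapping and the `find_inconsistent_terms` function are hypothetical names, and the example assumes you chose "column" as your preferred term.

```python
import re

# Hypothetical mapping: the term you chose -> synonyms to avoid once explained.
PREFERRED_TERMS = {
    "column": ["field", "feature", "variable"],
}


def find_inconsistent_terms(text, preferred_terms=PREFERRED_TERMS):
    """Return (synonym, preferred, count) tuples for each synonym found in text."""
    findings = []
    for preferred, synonyms in preferred_terms.items():
        for synonym in synonyms:
            # Match whole words, singular or simple plural, ignoring case.
            pattern = r"\b" + re.escape(synonym) + r"s?\b"
            count = len(re.findall(pattern, text, flags=re.IGNORECASE))
            if count:
                findings.append((synonym, preferred, count))
    return findings


draft = (
    "Each column holds one measurement. "
    "Select the fields you need, then drop unused columns."
)
for synonym, preferred, count in find_inconsistent_terms(draft):
    print(f"'{synonym}' appears {count} time(s); consider '{preferred}' instead")
```

A check like this only flags candidates. A word such as "variable" may be the right choice in some sentences (for example, when referring to a variable in code), so review each match by hand.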

I can't think of anything

Split the problem up into different areas. Are there any terms related to the code? Are there any terms related to the statistical modeling techniques? Are there any terms from the application domain? Are there any terms that might be needed to explain the dataset?

Examples

From a course on experimental design. This has an extensive list of statistical terms.

  • Randomization, replication, blocking, Latin square, Greco-Latin square, factorial, ANOVA, t-test, F-test, normality, Q-Q plot, variance, type I/II error, null/alternative hypothesis, effect size, factor/categorical variable

From a course on clinical trials analysis. This includes both statistical terms and domain-specific terms.

  • Bias, blinding, randomization, imbalance, covariates, endpoints, power, multiplicity, significance, non-inferiority, equivalence, bioequivalence

From a course on data privacy. This is an extensive list of both statistical terms and privacy-related terms.

  • Statistical Disclosure Limitation, Data Synthesis, K-Anonymity, Neighboring Databases, Randomized Response, Differential Privacy, Composition Rules, Group Privacy, Post-processing, Global Sensitivity, Histogram queries, Laplace Mechanism, Exponential Mechanism, Differential Privacy Data Synthesis