Can I use data with profanity or sensitive content?
Many learners take DataCamp content at work, so in general you should ensure that your content, including the datasets, is safe for work. A learner who is offended and abandons your course or project stops learning, and you lose revshare because they didn't complete your content.
Social media datasets pose a problem. Twitter data, in particular, is notorious for containing profanity and offensive content. Analyzing social media is an important skill for data scientists, so you are allowed to use those datasets as long as you follow these principles:
Notify the learners
If the dataset contains profanity or offensive content, add a note to the first exercise or task where learners see this data. For example:
> Be aware that this is real data from Twitter, and there is some use of profanity.
Don't push an agenda
If the dataset includes political or religious content, you must not push your own agenda or viewpoint. Remain as neutral as possible.
Try to avoid printing the sensitive content
A learner actively digging through the dataset to find sensitive content is different from an exercise that requires them to print it on screen. If possible, avoid printing the data, or cherry-pick safe parts of the dataset to print.
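One way to cherry-pick safe rows is to filter the text against a word list before showing a preview. The following is a minimal sketch, not a prescribed method: the names `BLOCKLIST`, `is_safe`, and `safe_preview` are hypothetical, the blocklist holds mild placeholder words, and the sample tweets are made up for illustration. A real course would need a vetted word list, and a simple word match will not catch every kind of offensive content.

```python
import re

# Hypothetical placeholder blocklist; a real course would use a vetted word list.
BLOCKLIST = {"darn", "heck"}


def is_safe(text, blocklist=BLOCKLIST):
    """Return True if no blocklisted word appears as a whole word in text."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return words.isdisjoint(blocklist)


def safe_preview(texts, n=5, blocklist=BLOCKLIST):
    """Return up to n texts, in original order, that pass the blocklist check."""
    preview = []
    for text in texts:
        if len(preview) >= n:
            break
        if is_safe(text, blocklist):
            preview.append(text)
    return preview


# Made-up sample data standing in for a social media dataset.
tweets = [
    "Loving the new data viz course!",
    "What the heck is a p-value",
    "Just shipped my first model",
    "Darn, my kernel died again",
]

# Print only a small preview of rows that passed the check.
for tweet in safe_preview(tweets, n=2):
    print(tweet)
```

A filter like this only reduces the chance of printing something offensive, so the note to learners is still worth including.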
Choose safer topics
If the learning objective allows it, just choose a dataset that isn't likely to offend learners.