SQL Server requires a specific set of steps in order to properly create and load a database into the course editor.
Below are the steps to ensure your database is properly loaded into the DataCamp Teach editor:
1. Set up your CSV files.
Each table in the database you use will need to be exported into a separate, UTF-8 encoded .csv file (this is an option in Microsoft Excel) and uploaded to the course editor in the
course-specs branch. Make sure there are no commas or special characters in the data contained in these files before you upload them to your course.
2. Creating your SQL file.
To set up your CSV files as a relational database, you need to write a .sql file and upload it to the course editor in the course-specs branch. The .sql file should include all of the information in this example file.
To properly initialize your database, the .sql file should:
- Drop + create a table for each .csv file in your database.
- Create the table with one column corresponding to each column in the CSV file that specifies the data type.
- Replace the table names in the .sql file in each “BULK INSERT” statement with the names of the .csv files you intend to include, with the term “-pipe” at the end. So if you have a table called “soccer”, you should bulk insert “soccer-pipe.csv”. This will be explained in Step 3.
- Include ‘GO’ after every statement.
- Upload the .sql file to the
course-specsbranch of your course in the Teach editor.
In general, we recommend you use only one .sql file per course. Your tables are not set up with primary/foreign keys so you can have unrelated tables in the same .sql file. When setting up your exercises, you will specify which tables you want students to see (more below).
3. Update your requirements.sh file.
After setting up your .sql file, you will need to edit the requirements.sh file in the
course-specs branch of your GitHub repository. This file will initialize the database and allow you to load the exercises and make tables available for your students to view in the exercises.
Here is an example file:
Follow these steps to initialize your database:
- Uncomment the
wgetstatement in line 31 (remove the # sign). Copy the link to your .csv files in the course editor. Paste them into the code on line 31, replacing the existing link. Each .csv file link will need to be pasted with its own
wgetstatement on a separate line.
- Copy the link to your .sql file into the course editor with a
wgetstatement, similar to the previous step (See line 32). Paste this AFTER the .csv file
In order to properly import the .csv files, a few additional processing steps are necessary, which are set up using the requirements.sh file.
- Set up the code in line 35 to strip the header from every .csv file in your database. (one separate line per file). Replace ‘UFOData-small.csv’ with the name of your .csv file, and give it a name with ‘-noheader.csv’ at the end.
- Set up the code to replace the new lines with pipes (how the data is read from one line to the next). Using the file you just created (with ‘-noheader.csv’ at the end), replace ‘UFOData-pipe.csv’ with your filename and ‘-pipe.csv’ at the end. *Note: This is why your .sql file has to include the .csv file with -pipe.csv at the end!
- Uncomment lines 41 and 42. Replace UFOData.sql with the name of your .sql file.
Here is what the file should look like (see the red lines for what's been changed):
4. Set up your pre-exercise code.
In every exercise you create on the Teach editor, you will need to load the database using the pre-exercise code. Paste the following code into each exercise’s pre-exercise code, replacing DATABASENAME and TABLE, keeping them in single quotations:
connect(full_path = "mssql+pyodbc://$DB_USER:$DB_PASSWORD@localhost:1433/DATABASENAME?driver=MSODBC")
set_options(visible_tables = ['TABLE'])
You may choose to load one or more tables for students to see under
set_options. This will be visible to students in their console as they work on your exercises.
- Preview your exercises. You won’t know for sure that the database has loaded properly until you get a blank query result with the visible tables listed as tabs on the bottom of your screen. See below:
- You will need to set up a complete query in the solution code that accurately interacts with your database before your exercises load properly. If your solution code is blank or your query is inaccurate (i.e., missing a comma), your preview session will break.
- You will need to carefully consider what queries you include in your exercises. Correlated subqueries, for example, will take a long time to initialize in the exercise.