Writing an item (when the database is already available)

When writing a SQL item, be careful about the following:

  • Add a schema diagram to the context.

  • Use ORDER BY so the rows are always returned in the same order.

  • Use LIMIT so that all the rows in the output fit on the page without scrolling.
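The ORDER BY and LIMIT advice can be sketched with a small query. This is a minimal, self-contained example using an in-memory SQLite database and a hypothetical tracks table (the table and data are illustrative, not from a real item):

```python
import sqlite3

# Build a tiny in-memory database with a hypothetical tracks table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (name TEXT, popularity INT)")
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?)",
    [("Song A", 55), ("Song B", 90), ("Song C", 71)],
)

# ORDER BY pins the row order so the item always has one correct output;
# LIMIT keeps the result short enough to fit on the page without scrolling.
rows = conn.execute(
    "SELECT name, popularity FROM tracks ORDER BY popularity DESC LIMIT 2"
).fetchall()
print(rows)  # [('Song B', 90), ('Song C', 71)]
```

Without the ORDER BY, the database is free to return the rows in any order, so the same correct query could appear to produce different outputs on different runs.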

Pre-challenge Code

You need to specify which database you are using for this item in the item pre-challenge code. The example pre-challenge code below loads the spotify database for the item.

# load the spotify database
engine = connect('postgresql', 'spotify')

Schema Diagram

If you need to create a schema diagram, please use QuickDBD. We don’t have a paid version for this website, but the free version is good enough to generate a nice-looking schema diagram.

After creating the schema diagram, export it as a PNG file and upload it to the Assets in Teach.

Adding a new Database

If you want to add a new database to the assessment, follow the steps below. The instructions include an example of creating a database with multiple tables.

Step 1 - Upload the CSV files to the Assets in Teach

This could be one file or multiple files, depending on how many tables you want to include in the database.

Suppose I want to create a database called spotify, which includes three tables. I will upload the three CSV files into the Assets.

Step 2 - Save as a .sql file with code inside

The code should include all the tables and their columns, plus the links generated from the CSV files you uploaded to the Assets in Teach.

After uploading the three CSV files, a spotify.sql file is created with the following code inside:

DROP TABLE IF EXISTS "artists";
CREATE TABLE artists (
    id TEXT,
    followers INT,
    name TEXT,
    popularity INT
);

DROP TABLE IF EXISTS "tracks";
CREATE TABLE tracks (
    id TEXT,
    name TEXT,
    artist_id TEXT,
    release_date TEXT,
    popularity INT,
    duration_ms INT
);

DROP TABLE IF EXISTS "features";
CREATE TABLE features (
    song_id TEXT,
    valence INT,
    dance_level INT,
    energy INT,
    loudness INT
);

COPY artists
FROM PROGRAM 'curl "https://assets.datacamp.com/production/repositories/5447/datasets/1fda3e9acb2fcd0128fb597ae61b2b893be5d870/spotify_artists.csv"' (DELIMITER ',', FORMAT CSV, HEADER);

COPY tracks
FROM PROGRAM 'curl "https://assets.datacamp.com/production/repositories/5447/datasets/c34b9681dc5ccf2363bdbb7c17507b260e6c4e2f/spotify_tracks.csv"' (DELIMITER ',', FORMAT CSV, HEADER);

COPY features
FROM PROGRAM 'curl "https://assets.datacamp.com/production/repositories/5447/datasets/37a9ece5f74f2a383b4b45d70c0a0069a3ee8037/spotify_features.csv"' (DELIMITER ',', FORMAT CSV, HEADER);
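Writing the CREATE TABLE statements by hand is error-prone when a CSV has many columns. As a rough sketch (not an official tool), a few lines of Python can draft the statement from a CSV header; the `create_table_sql` helper below is hypothetical, and it types every column as TEXT, so you would still adjust the types (INT, etc.) to match the actual data as in spotify.sql above:

```python
import csv
import io


def create_table_sql(table, csv_text):
    """Draft a DROP/CREATE TABLE pair from the header row of a CSV.

    Every column is typed TEXT here as a starting point; adjust the
    types by hand to match the real data before using the statement.
    """
    header = next(csv.reader(io.StringIO(csv_text)))
    cols = ",\n".join(f"    {name} TEXT" for name in header)
    return f'DROP TABLE IF EXISTS "{table}";\nCREATE TABLE {table} (\n{cols}\n);'


# Hypothetical first lines of an uploaded CSV file.
sample = "id,followers,name,popularity\n1,10,abc,50\n"
sql = create_table_sql("artists", sample)
print(sql)
```

The output is a skeleton matching the structure of the spotify.sql statements above, with one TEXT column per CSV header field.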

Step 3 - Upload the .sql file to the Assets in Teach

After saving a spotify.sql file, upload the file to the Assets in Teach.

Step 4 - Add the link of the .sql file to the requirements.sh file in the repository for the assessment with a wget command.

The code should include the link generated from the .sql file you uploaded to the Assets in Teach. Also, make sure the database name in the code is the one you are trying to create.

After uploading the spotify.sql file to the Assets in Teach, the following code is added to the requirements.sh file in the repository.

# download the database file from the Assets
wget https://assets.datacamp.com/production/repositories/6140/datasets/a2805a7e8873aa3859f70c9ab1e8ef1d975090c7/spotify.sql

# initialize and start the postgresql server
# every database name below should be the one you are trying to use
service postgresql start \
  && sudo -u postgres createdb --owner repl spotify \
  && sudo -u postgres psql --echo-all --dbname spotify --file spotify.sql \
  && service postgresql stop

When the last step is done, it will trigger a build. After the build succeeds, refresh Teach and write an item to test whether you can successfully query data from that database. If you cannot query the data, don't panic!

The following checks might help you make it work:

Check 1 - Code in the .sql file

  • Do all the column names for each table match the CSV files?

  • Do all the data types for each column match the CSV files?

  • Are the table names consistent?
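The first two checks can be partly automated. This is a small sketch (the `check_columns` helper and the sample data are hypothetical) that compares the column names declared in the .sql file against the header row of the uploaded CSV:

```python
import csv
import io


def check_columns(expected, csv_text):
    """Report columns declared in the .sql file but absent from the CSV
    header, and CSV header columns absent from the .sql file."""
    header = next(csv.reader(io.StringIO(csv_text)))
    missing = [c for c in expected if c not in header]
    extra = [c for c in header if c not in expected]
    return missing, extra


# Columns declared for the artists table in spotify.sql.
expected = ["id", "followers", "name", "popularity"]
# Hypothetical header row of the uploaded CSV (note the mismatch).
csv_text = "id,followers,artist_name,popularity\n"
missing, extra = check_columns(expected, csv_text)
print(missing, extra)  # ['name'] ['artist_name']
```

Any name in either list means the CREATE TABLE statement and the CSV disagree, which is a common reason the COPY step fails during the build.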

Check 2 - Code in the requirements.sh file in the repository

  • Is the link to the .sql file the latest one?

  • Have you changed all the database names in the example code?

If you are still experiencing issues, please contact us in the Slack channel #certification-item-writers.
