The world of databases is becoming more accessible thanks to tools that convert natural language into SQL. One such tool is Text2SQL.ai, which simplifies querying databases by allowing users to input queries in plain English and automatically generating the corresponding SQL commands. This can be particularly helpful for users who may not be familiar with SQL syntax but need to interact with databases.
In this blog, we will explore how Text2SQL.ai performs when querying the Chinook Database, a sample dataset that models a digital music store. We’ll start with a simple query and then move on to more complex queries, demonstrating how well Text2SQL.ai handles these different levels of complexity.
The Chinook Database Overview
The Chinook Database has a typical structure you’d expect from a digital music store. It includes the following tables:
- Artist: Stores information about music artists.
- Album: Contains albums by artists.
- Track: Lists individual tracks with details like duration and price.
- Genre: Describes the genre of each track.
- MediaType: Represents the format of each track.
With this schema, we’ll test how well Text2SQL.ai can generate SQL for different types of queries. We load the schema and data first.
-- Create the Artist table
CREATE TABLE Artist (
ArtistId INTEGER PRIMARY KEY AUTOINCREMENT,
Name NVARCHAR(120)
);
-- Insert sample data into Artist table
INSERT INTO Artist (Name) VALUES ('AC/DC');
INSERT INTO Artist (Name) VALUES ('Aerosmith');
INSERT INTO Artist (Name) VALUES ('Queen');
-- Create the Album table
CREATE TABLE Album (
AlbumId INTEGER PRIMARY KEY AUTOINCREMENT,
Title NVARCHAR(160),
ArtistId INTEGER,
FOREIGN KEY (ArtistId) REFERENCES Artist(ArtistId)
);
-- Insert sample data into Album table
INSERT INTO Album (Title, ArtistId) VALUES ('High Voltage', 1);
INSERT INTO Album (Title, ArtistId) VALUES ('Rocks', 2);
INSERT INTO Album (Title, ArtistId) VALUES ('A Night at the Opera', 3);
-- Create the Genre table
CREATE TABLE Genre (
GenreId INTEGER PRIMARY KEY AUTOINCREMENT,
Name NVARCHAR(120)
);
-- Insert sample data into Genre table
INSERT INTO Genre (Name) VALUES ('Rock');
INSERT INTO Genre (Name) VALUES ('Jazz');
INSERT INTO Genre (Name) VALUES ('Metal');
-- Create the MediaType table
CREATE TABLE MediaType (
MediaTypeId INTEGER PRIMARY KEY AUTOINCREMENT,
Name NVARCHAR(120)
);
-- Insert sample data into MediaType table
INSERT INTO MediaType (Name) VALUES ('MPEG audio file');
INSERT INTO MediaType (Name) VALUES ('Protected AAC audio file');
INSERT INTO MediaType (Name) VALUES ('Protected MPEG-4 video file');
-- Create the Track table
CREATE TABLE Track (
TrackId INTEGER PRIMARY KEY AUTOINCREMENT,
Name NVARCHAR(200),
AlbumId INTEGER,
MediaTypeId INTEGER,
GenreId INTEGER,
Composer NVARCHAR(220),
Milliseconds INTEGER,
Bytes INTEGER,
UnitPrice DECIMAL(10, 2),
FOREIGN KEY (AlbumId) REFERENCES Album(AlbumId),
FOREIGN KEY (MediaTypeId) REFERENCES MediaType(MediaTypeId),
FOREIGN KEY (GenreId) REFERENCES Genre(GenreId)
);
-- Insert sample data into Track table
INSERT INTO Track (Name, AlbumId, MediaTypeId, GenreId, Composer, Milliseconds, Bytes, UnitPrice)
VALUES ('It\'s A Long Way To The Top', 1, 1, 1, 'Angus Young, Malcolm Young, Bon Scott', 320000, 10485760, 0.99);
INSERT INTO Track (Name, AlbumId, MediaTypeId, GenreId, Composer, Milliseconds, Bytes, UnitPrice)
VALUES ('Back In Black', 1, 1, 1, 'Angus Young, Malcolm Young, Brian Johnson', 300000, 10485760, 0.99);
INSERT INTO Track (Name, AlbumId, MediaTypeId, GenreId, Composer, Milliseconds, Bytes, UnitPrice)
VALUES ('Sweet Emotion', 2, 1, 1, 'Steven Tyler, Tom Hamilton', 250000, 8192000, 0.99);
INSERT INTO Track (Name, AlbumId, MediaTypeId, GenreId, Composer, Milliseconds, Bytes, UnitPrice)
VALUES ('Bohemian Rhapsody', 3, 2, 1, 'Freddie Mercury', 354000, 12800000, 1.29);
Scenario 1: Simple Query
Query: List all artists.
We’ll start with a very basic query that retrieves the names of all artists in the database. This is a straightforward query where we expect Text2SQL.ai to generate a simple SELECT
statement.
Natural Language:
List all artists.
SQL Generated:
SELECT * FROM Artist;
This SQL statement is exactly what we expect. It selects all columns from the Artist table and will return all the artist names. Text2SQL.ai performs perfectly on this simple task.
Scenario 2: Mid-Complexity Query
Query: Show the albums by AC/DC.
For the next query, we want to find all albums by the artist AC/DC. This requires joining the Album and Artist tables to match the artist’s name with their albums.
Natural Language:
Show the albums by AC/DC.
SQL Generated:
SELECT Album.Title
FROM Album
JOIN Artist ON Album.ArtistId = Artist.ArtistId
WHERE Artist.Name = 'AC/DC';
Text2SQL.ai correctly generates a SQL query that joins the Album and Artist tables and filters for albums by AC/DC. It demonstrates its ability to handle more complex queries involving JOIN
operations.
Scenario 3: Complex Query
Query: Find the total duration of all tracks by Queen.
In this query, we’re asking to calculate the total duration of all tracks by the artist Queen. This requires joining the Track, Album, and Artist tables and summing up the total duration of the tracks.
Natural Language:
Find the total duration of all tracks by Queen.
SQL Generated:
SELECT SUM(Track.Milliseconds)
FROM Track
JOIN Album ON Track.AlbumId = Album.AlbumId
JOIN Artist ON Album.ArtistId = Artist.ArtistId
WHERE Artist.Name = 'Queen';
The generated SQL is spot on. Text2SQL.ai correctly uses the SUM
function to calculate the total duration of all tracks by Queen by joining the necessary tables. This shows that Text2SQL.ai is capable of handling aggregation functions and more complex joins.
Scenario 4: Ultra Complex Query
Query: Find the top 3 most expensive albums and their total track count, then return the artist with the most albums from that list.
This query is quite complex. It involves multiple steps:
- Find the top 3 most expensive albums by summing the prices of their tracks.
- Count the number of tracks for each of those albums.
- Identify the artist with the most albums among these top 3 albums.
This requires multiple joins, aggregation functions, and the use of Common Table Expressions (CTEs) to break down the query into manageable parts.
Natural Language:
Find the top 3 most expensive albums and the total number of tracks on each. Then find the artist with the most albums among these.
SQL Generated:
WITH
AlbumPrices AS (
SELECT
Album.AlbumId,
Album.Title,
Album.ArtistId,
SUM(Track.UnitPrice) AS TotalPrice,
COUNT(Track.TrackId) AS TrackCount
FROM
Album
JOIN Track ON Album.AlbumId = Track.AlbumId
GROUP BY
Album.AlbumId
),
TopAlbums AS (
SELECT
AlbumId,
Title,
ArtistId,
TotalPrice,
TrackCount
FROM
AlbumPrices
ORDER BY
TotalPrice DESC
LIMIT
3
),
ArtistAlbumCount AS (
SELECT
ArtistId,
COUNT(AlbumId) AS AlbumCount
FROM
TopAlbums
GROUP BY
ArtistId
)
SELECT
a.Name AS ArtistName,
aac.AlbumCount
FROM
ArtistAlbumCount aac
JOIN Artist a ON aac.ArtistId = a.ArtistId
ORDER BY
aac.AlbumCount DESC
LIMIT
1;
The query is correct and efficiently broken down into several steps:
- AlbumPrices: This CTE calculates the total price of each album and counts the number of tracks.
- TopAlbums: This CTE selects the top 3 albums based on their total price.
- ArtistAlbumCount: This CTE counts how many albums each artist has among the top 3 albums.
- Finally, the query returns the artist with the most albums from the top 3 most expensive ones.
This shows that Text2SQL.ai is capable of handling complex SQL scenarios involving multiple CTEs, joins, and aggregation functions. It can handle advanced use cases with ease.
Conclusion
Text2SQL.ai performed exceptionally well in our tests using the Chinook Database, from simple queries to highly complex ones. The tool is capable of generating accurate SQL for a range of use cases, whether you’re asking for a simple list of artists or performing multi-step aggregations and joins. After testing the Chinook Database, a great next step would be to explore whether Text2SQL.ai can handle more complicated scenarios, e.g. found in enterprise resource planning (ERP) systems like SAP HANA.
For non-technical users who need to query a database without writing SQL themselves, Text2SQL.ai provides a user-friendly interface that can handle even the most complex queries. Whether you’re dealing with a simple lookup or an intricate query involving multiple steps, this tool can help streamline the process and save you time.
Feel free to test Text2SQL.ai with your own database and queries—you may be surprised by just how powerful it can be!