Indexing Essentials in SQL | Atlassian (2024)

What is Indexing?

Indexing makes columns faster to query by creating pointers to where data is stored within a database.

Imagine you want to find a piece of information that is within a large database. To get this information out of the database the computer will look through every row until it finds it. If the data you are looking for is towards the very end, this query would take a long time to run.

Visualization for finding the last entry:

[Figure: every row of the unindexed table is checked in turn until the last entry is found]

If the table were ordered alphabetically, searching for a name could happen a lot faster because we could skip looking for the data in certain rows. If we wanted to search for “Zack” and we knew the data was in alphabetical order, we could jump halfway down the data to see whether Zack comes before or after that row. We could then halve the remaining rows and make the same comparison.

[Figure: searching the alphabetically ordered data for “Zack” by repeatedly halving the remaining rows]

This took 3 comparisons to find the right answer instead of 8 in the unindexed data.
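The difference between the two searches can be sketched in a few lines of Python. This is only an illustration, not how a database engine is implemented: the eight names are hypothetical stand-ins for the rows in the figures, and a "comparison" here means one row being checked against the target.

```python
def linear_search(rows, target):
    """Scan rows front to back; return (position, comparisons made)."""
    comparisons = 0
    for position, name in enumerate(rows):
        comparisons += 1
        if name == target:
            return position, comparisons
    return None, comparisons


def binary_search(sorted_rows, target):
    """Halve the search range each step; return (position, comparisons made)."""
    comparisons = 0
    lo, hi = 0, len(sorted_rows)
    while lo < hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if sorted_rows[mid] == target:
            return mid, comparisons
        if sorted_rows[mid] < target:
            lo = mid + 1  # target sorts after the midpoint
        else:
            hi = mid      # target sorts before the midpoint
    return None, comparisons


# Hypothetical names, in the order they were inserted.
insertion_order = ["Matt", "Dave", "Andrew", "Todd", "Blake", "Evan", "Nick", "Zack"]
alphabetical = sorted(insertion_order)

print(linear_search(insertion_order, "Zack"))  # (7, 8)
print(binary_search(alphabetical, "Zack"))     # (7, 3)
```

With this data the sorted search checks Matt, then Todd, then Zack, while the unsorted scan has to touch all eight rows.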

Indexes allow us to create sorted lists without having to create all new sorted tables, which would take up a lot of storage space.

What exactly is an Index?

An index is a structure that holds the field the index is sorting and a pointer from each entry to its corresponding record in the original table, where the data is actually stored. Indexes are used in things like a contact list, where the data may be physically stored in the order you add people’s contact information but it is easier to find people when they are listed in alphabetical order.

Let’s look at the index from the previous example and see how it maps back to the original Friends table:

[Figure: the name index mapped back to the original Friends table]

We can see here that the table stores its data ordered by an incrementing id, based on the order in which the data was added, while the index stores the names in alphabetical order.
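As a rough sketch of that mapping, the index can be modeled as a sorted list of (name, pointer) pairs, where the pointer is simply the row's position in the table. The rows, the cities, and the `find_by_name` helper are all hypothetical; a real database stores pointers to pages on disk rather than list positions.

```python
import bisect

# Hypothetical table rows, kept in the order they were added.
friends = [
    {"id": 1, "name": "Matt", "city": "Denver"},
    {"id": 2, "name": "Todd", "city": "Austin"},
    {"id": 3, "name": "Blake", "city": "Boston"},
    {"id": 4, "name": "Zack", "city": "Seattle"},
]

# The index holds only the sorted field plus a pointer (list position) per row.
name_index = sorted((row["name"], position) for position, row in enumerate(friends))
index_keys = [name for name, _ in name_index]


def find_by_name(target):
    """Binary-search the index, then follow the pointer back to the full row."""
    i = bisect.bisect_left(index_keys, target)
    if i < len(index_keys) and index_keys[i] == target:
        _, position = name_index[i]
        return friends[position]
    return None


print(name_index)
print(find_by_name("Zack"))
```

The table itself is never reordered; only the small index is kept sorted.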

Types of Indexing

There are two types of database indexes:

  1. Clustered
  2. Non-clustered

Both clustered and non-clustered indexes are stored and searched as B-trees, a data structure similar to a binary tree. A B-tree is a “self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time.” Basically, it creates a tree-like structure that sorts data for quick searching.

[Figure: B-tree of the name index]

Here is a B-tree of the index we created. Our smallest entry is the leftmost entry and our largest is the rightmost entry. Every query starts at the top node and works its way down the tree: if the target entry is less than the current node, the left path is followed; if it is greater, the right path is followed. In our case, the search for Zack checked against Matt, then Todd, and then Zack.
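The top-down walk can be sketched with a plain binary tree. This is a simplification: a real B-tree keeps several keys in each node and rebalances itself on writes, which this sketch does not attempt. The `Node`, `build`, and `search_path` names and the list of names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(sorted_keys):
    """Build a balanced tree by making the middle key the root of each subtree."""
    if not sorted_keys:
        return None
    mid = len(sorted_keys) // 2
    return Node(sorted_keys[mid], build(sorted_keys[:mid]), build(sorted_keys[mid + 1:]))


def search_path(root, target):
    """Return (found, keys visited) while walking down from the root."""
    path = []
    node = root
    while node is not None:
        path.append(node.key)
        if target == node.key:
            return True, path
        # Smaller targets live down the left path, larger ones down the right.
        node = node.left if target < node.key else node.right
    return False, path


names = ["Andrew", "Blake", "Dave", "Evan", "Matt", "Nick", "Todd", "Zack"]
tree = build(names)
print(search_path(tree, "Zack"))  # (True, ['Matt', 'Todd', 'Zack'])
```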

To increase efficiency, many B-trees limit the number of characters stored in an entry. The B-tree does this on its own and does not require the column data to be restricted. In the example above, the B-tree limits entries to 4 characters.

Clustered Indexes

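A clustered index is the single index per table that uses the primary key to organize the data within the table. It keeps the primary key stored in increasing order, which is also the order in which the table's rows are physically stored. (This describes engines such as SQL Server and MySQL's InnoDB. PostgreSQL stores rows in an unordered heap and builds an ordinary B-tree index on the primary key, although this article treats that index as the clustered one.)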

  • Clustered indexes do not have to be explicitly declared.
  • Created when the table is created.
  • Use the primary key sorted in ascending order.

Creating clustered Indexes

The clustered index will be automatically created when the primary key is defined:

CREATE TABLE friends (id INT PRIMARY KEY, name VARCHAR, city VARCHAR);

Once filled in, that table would look something like this:

[Figure: the friends table once filled in]

The created table, “friends”, will automatically have a clustered index called “friends_pkey”, organized around the primary key “id”:

[Figure: the “friends_pkey” clustered index]

When searching the table by “id”, the ascending order of the column allows for optimal searches to be performed. Since the numbers are ordered, the search can navigate the B-tree allowing searches to happen in logarithmic time.

However, in order to search for the “name” or “city” in the table, we would have to look at every entry because these columns do not have an index. This is where non-clustered indexes become very useful.
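One way to see this contrast without a PostgreSQL server is SQLite, which ships with Python. The sketch below is an assumption-laden stand-in: SQLite's planner and its `EXPLAIN QUERY PLAN` wording differ from PostgreSQL's, and the rows are hypothetical. It only shows that a lookup by primary key is a keyed search while a lookup by an unindexed column is a full scan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE friends (id INTEGER PRIMARY KEY, name VARCHAR, city VARCHAR)")
conn.executemany(
    "INSERT INTO friends (id, name, city) VALUES (?, ?, ?)",
    [(1, "Matt", "Denver"), (2, "Todd", "Austin"), (3, "Blake", "Boston"), (4, "Zack", "Seattle")],
)


def plan(sql, params=()):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # the last column holds the text


id_plan = plan("SELECT * FROM friends WHERE id = ?", (4,))
name_plan = plan("SELECT * FROM friends WHERE name = ?", ("Zack",))

print(id_plan)    # a SEARCH using the primary key
print(name_plan)  # a SCAN of the whole table
```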

Non-clustered Indexes

Non-clustered indexes are sorted references for a specific field from the main table that hold pointers back to the original entries of the table. The first example we showed is an example of a non-clustered index:

[Figure: the non-clustered name index from the first example]

They are used to increase the speed of queries on the table by making specific columns more easily searchable. Non-clustered indexes can be created by data analysts or developers after a table has been created and filled.

Note: Non-clustered indexes are not new tables. Non-clustered indexes hold the field that they are responsible for sorting and a pointer from each of those entries back to the full entry in the table.

You can think of these just like indexes in a book. The index points to the location in the book where you can find the data you are looking for.

Non-clustered indexes point to memory addresses instead of storing data themselves. This makes them slower to query than clustered indexes but typically much faster than a non-indexed column.

You can create many non-clustered indexes. As of SQL Server 2008, a table can have up to 999 non-clustered indexes, and there is no limit in PostgreSQL.

Creating non-clustered indexes (PostgreSQL)

To create an index to sort our friends’ names alphabetically:

CREATE INDEX friends_name_asc ON friends(name ASC);

This would create an index called “friends_name_asc”, indicating that this index stores the names from “friends” alphabetically, in ascending order.

[Figure: the “friends_name_asc” index]

Note that the “city” column is not present in this index. That is because indexes do not store all of the information from the original table. The “id” column would be a pointer back to the original table. The pointer logic would look like this:

[Figure: pointers from the “friends_name_asc” index back to the friends table]

Creating Indexes

In PostgreSQL, the “\d” command is used to list details on a table, including table name, the table columns and their data types, indexes, and constraints.

The details of our friends table now look like this:

Query providing details on the friends table: \d friends;

[Figure: output of \d friends]

Looking at the above image, the “friends_name_asc” is now an associated index of the “friends” table. That means the query plan, the plan that SQL creates when determining the best way to perform a query, will begin to use the index when queries are made. Notice that “friends_pkey” is listed as an index even though we never declared it as one. That is the clustered index referenced earlier in the article, which is automatically created from the primary key.

We can also see there is a “friends_city_desc” index. That index was created similarly to the names index:

CREATE INDEX friends_city_desc ON friends(city DESC);

This new index will be used to sort the cities and will be stored in reverse alphabetical order because the keyword “DESC” was passed, short for “descending”. This provides a way for our database to swiftly query city names.
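The same two indexes can be tried out from Python with the bundled SQLite engine. This is a sketch under clear assumptions: SQLite is not PostgreSQL, `PRAGMA index_list` is only a rough counterpart to the index section of the psql listing, and the plan text will not match PostgreSQL's. The table is left empty because only the index definitions and the planner's choice matter here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE friends (id INTEGER PRIMARY KEY, name VARCHAR, city VARCHAR)")
conn.execute("CREATE INDEX friends_name_asc ON friends(name ASC)")
conn.execute("CREATE INDEX friends_city_desc ON friends(city DESC)")

# List the indexes attached to the table; the index name is the second column.
index_names = sorted(row[1] for row in conn.execute("PRAGMA index_list('friends')"))

# Ask the planner how it would run a lookup by name now that the index exists.
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM friends WHERE name = ?", ("Blake",)
).fetchall()
name_plan = " | ".join(row[-1] for row in plan_rows)

print(index_names)
print(name_plan)
```

In this SQLite stand-in, the plan for the name lookup names `friends_name_asc`, which is the behavior the article describes for the query plan.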

Searching Indexes

After your non-clustered indexes are created, you can begin querying with them. Indexes use an efficient search method known as binary search. A binary search repeatedly cuts the data in half, checking whether the entry you are searching for comes before or after the entry in the middle of the current portion of data. This works well with B-trees because a search starts at the middle entry: entries down the left path are smaller than (come before) the current entry, and entries down the right path are larger (come after). In a table this would look like:

[Figure: binary search steps on the indexed table]

Comparing this method to the query of the non-indexed table at the beginning of the article, we are able to reduce the total number of searches from eight to three. Using this method, a search of 1,000,000 entries can be reduced down to just 20 jumps in a binary search.
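The figure of 20 follows from counting how many times a range can be halved. A small sketch, with a hypothetical helper name, that counts the worst case for a given number of entries:

```python
def worst_case_probes(n):
    """Count how many times n entries can be halved before none are left to check."""
    probes = 0
    while n > 0:
        probes += 1
        n //= 2
    return probes


print(worst_case_probes(1_000_000))  # 20
print(worst_case_probes(8))          # 4
```

For eight entries the worst case is four probes; the search for Zack in the earlier example needed only three because Zack was not the worst-placed entry.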

When to use Indexes

Indexes are meant to speed up the performance of a database, so use indexing whenever it significantly improves the performance of your database. As your database becomes larger and larger, the more likely you are to see benefits from indexing.

When not to use Indexes

When data is written to the database, the original table (the clustered index) is updated first and then all of the indexes on that table are updated. Every write therefore does extra work for each index, and queries that rely on an index may have to wait until it has been updated. If the database is constantly receiving writes, this overhead can cancel out the benefit of the indexes. This is why indexes are typically applied more heavily to databases in data warehouses, which get new data on a scheduled basis (off-peak hours), than to production databases, which might be receiving new writes all the time.

NOTE: The newest version of Postgres (in beta at the time of writing) will allow you to query the database while the indexes are being updated.
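The extra work per write can be illustrated with a toy model. This is not how PostgreSQL maintains its indexes internally; the `IndexedTable` class, its counter, and the sample rows are hypothetical and only show that each insert touches the table once and every index once.

```python
import bisect


class IndexedTable:
    """Toy table whose inserts must also maintain every index."""

    def __init__(self, indexed_columns):
        self.rows = []
        self.indexes = {column: [] for column in indexed_columns}
        self.index_writes = 0

    def insert(self, row):
        position = len(self.rows)
        self.rows.append(row)  # first, the table itself is written
        for column, entries in self.indexes.items():
            # then each index gets a new (value, pointer) entry in sorted position
            bisect.insort(entries, (row[column], position))
            self.index_writes += 1


table = IndexedTable(["name", "city"])
for name, city in [("Matt", "Denver"), ("Blake", "Austin"), ("Zack", "Boston")]:
    table.insert({"name": name, "city": city})

print(table.index_writes)  # 6: three rows times two indexes
```

Each additional index adds one more sorted insertion to every write, which is the cost to weigh against faster reads.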

Testing Index performance

To test if indexes will begin to decrease query times, you can run a set of queries on your database, record the time it takes those queries to finish, and then begin creating indexes and rerunning your tests.

To do this, try using the EXPLAIN ANALYZE command in PostgreSQL:

EXPLAIN ANALYZE SELECT * FROM friends WHERE name = 'Blake';

Which on my small database yielded:

[Figure: EXPLAIN ANALYZE output for the query]

This output will tell you which method of search from the query plan was chosen and how long the planning and execution of the query took.

Only create one index at a time because not all indexes will decrease query time.

  • PostgreSQL’s query planning is pretty efficient, so adding a new index may not affect how fast queries are performed.
  • Adding an index will always mean storing more data.
  • Adding an index will increase how long it takes your database to fully update after a write operation.

If adding an index does not decrease query time, you can simply remove it from the database.

To remove an index use the DROP INDEX command:

DROP INDEX friends_name_asc;

The outline of the database now looks like:

[Figure: output of \d friends after dropping the index]

Which shows the successful removal of the index for searching names.
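The measure, index, re-measure, and drop loop described in this section could be scripted roughly as follows. The sketch uses Python's bundled SQLite rather than PostgreSQL, so the timings say nothing about a real server; the `time_query` helper, the generated rows, and the repeat count are all hypothetical choices.

```python
import sqlite3
import time


def time_query(conn, sql, params, repeats=50):
    """Average wall-clock seconds per run of the query."""
    start = time.perf_counter()
    for _ in range(repeats):
        conn.execute(sql, params).fetchall()
    return (time.perf_counter() - start) / repeats


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE friends (id INTEGER PRIMARY KEY, name VARCHAR, city VARCHAR)")
conn.executemany(
    "INSERT INTO friends (name, city) VALUES (?, ?)",
    [(f"friend{i:05d}", f"city{i % 50}") for i in range(5000)],
)

query = "SELECT * FROM friends WHERE name = ?"
params = ("friend04999",)  # the last row added, the slowest case for a full scan

before = time_query(conn, query, params)
conn.execute("CREATE INDEX friends_name_asc ON friends(name ASC)")
after = time_query(conn, query, params)

# Keep the index only if it made the query faster in this measurement.
if after >= before:
    conn.execute("DROP INDEX friends_name_asc")

print(f"before: {before:.6f}s  after: {after:.6f}s")
```

Timings vary from run to run, so a real test would repeat the measurement several times and use queries and data volumes that resemble production.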

Summary

  • Indexing can vastly reduce the time of queries
  • Every table with a primary key has one clustered index
  • Every table can have many non-clustered indexes to aid in querying
  • Non-clustered indexes hold pointers back to the main table
  • Not every database will benefit from indexing
  • Not every index will increase the query speed for the database

Article information

Author: Duncan Muller
