Hashing in Data Structure is a method that makes it possible to search for, add, and delete data in an average of O(1) time. It is what makes "Hash Maps," "Dictionaries" in Python, and "Objects" in JavaScript work.
What is Hashing in Data Structure?
Using a mathematical formula called a Hash formula, hashing maps enormous volumes of data into compact tables. The idea is to change a "Key" (such a name or an ID) into a "Index" (a number) that tells you where to find something in a Hash Table.
Important Components Of Hashing:
- Hash Table: A list of pointers to the real data.
- Hash Function: The algorithm that finds a key and gives it a certain index in the table.
- Collision: When two different keys produce the same index (the biggest challenge in hashing).
Importance Of Hashing in Data Structure
It takes O(n) time to find an element in an array of size n. Hashing in Data Structure lets you get straight to the index where the data is kept.Here is why it is important:
- Speed: It provides the fastest way to retrieve data.
- Uniqueness: It is widely used to identify duplicate records in massive datasets.
- Security: Hashing is the basis of modern cryptography and password storage (though cryptographic hashing is slightly different from data structure hashing).
What is The Hash Function in Data Structure?
A "good" hash function should be:
- Deterministic: The same key must always produce the same index.
- Fast: It shouldn't take more time to calculate the hash than to search the list.
- Uniformly Distributed: It should spread keys evenly across the table to minimize collisions.
Common Hash Functions:
- Division Method: h(k) = k \pmod m (where m is the table size).
- Mid-Square Method: Square the key and extract the middle digits.
- Folding Method: Break the key into parts and add them together.
Collision Resolution Techniques for Hashing in Data Structure
Since a hash function maps a large universe of keys to a smaller set of indices, collisions are inevitable. Hashing in data structures uses two main strategies to handle this:
A. Chaining (Open Hashing)
Each cell of the hash table points to a Linked List of records that have the same hash index.
- Pros: Easy to implement; the table never gets "full."
- Cons: If many keys map to one index, the search time becomes O(n) (a "Degenerate" case).
B. Open Addressing (Closed Hashing)
All elements are stored in the hash table itself. If a collision occurs, we look for the next empty slot.
- Linear Probing: Look at the next slot (i + 1, i + 2, i + 3, and so on)
- Quadratic Probing: Look at slots at intervals of i² (1, 4, 9, and so on.)
- Double Hashing: Use a second hash function to find the step size.
Example of Hashing in Data Structure
Let’s look at a simple hashing in data structure example using the Division Method (k mod 10).
Keys to store: 12, 25, 32, 44, 55
| Key |
Hash (k(mod10)) |
Action |
| 12 |
2 |
Store at index 2. |
| 25 |
5 |
Store at index 5. |
| 32 |
2 |
Collision! If using Linear Probing, move to index 3. |
| 44 |
4 |
Store at index 4. |
| 55 |
5 |
Collision! Move to next empty slot (index 6). |
Hashing in Data Structure in C
In C, hashing is often implemented using an array of structures or linked lists. Here is a basic skeleton of hashing in data structure in c.
C
#include <stdio.h>
#define SIZE 10
int hash_table[SIZE];
int hashFunction(int key) {
return key % SIZE;
}
void insert(int key) {
int index = hashFunction(key);
while(hash_table[index] != 0) { // Linear Probing
index = (index + 1) % SIZE;
}
hash_table[index] = key;
}
int main() {
insert(42);
insert(22); // Collision handled by Linear Probing
printf("Value at index 2: %d\n", hash_table[2]);
printf("Value at index 3: %d\n", hash_table[3]);
return 0;
}
Efficiency of Hashing in Data Structure
- The efficiency of hashing in data structures depends on the load factor (alpha).
- Load factor (alpha) = Number of elements stored (n) / Size of the hash table (m)
- If the load factor is close to 0, the hash table has more empty slots, so operations are usually faster.
If the load factor goes beyond 0.7, collisions tend to increase, and the table may need rehashing, which means resizing it to a larger table, often with a prime number size.
- A slightly cleaner version for article use:
- Efficiency of Hashing in Data Structure
The efficiency of hashing in data structures depends on the load factor (alpha), which is calculated as:
- Load factor (alpha) = Number of elements stored (n) / Size of the hash table (m)
- When the load factor is close to 0, the table has plenty of empty space, so searching, inserting, and deleting are usually faster.
When the load factor becomes too high, usually above 0.7, collisions increase, and the hash table may need rehashing, which means resizing it to a larger table.).
Applications of Hashing in Data Structure
- Database Indexing: Hash indexes are used by most databases to quickly find "exact match" queries.
- Caching: Hash tables are used by web browsers to save data from sites that were just visited so that they can load quickly.
- Compiler Design: A "Symbol Table" (which is made possible by hashing) is used by compilers to keep track of variable names and their values.
- Blockchain: Each block in a blockchain has a hash of the block before it, which makes sure that the data hasn't been changed.
Hashing in Data Structure in Hindi (संक्षिप्त विवरण)
यदि आप hashing in data structure in hindi समझना चाहते हैं, तो इसे इस तरह देखें: हेशिंग एक ऐसी तकनीक है जो डेटा को उसके "Key" के आधार पर एक यूनिक इंडेक्स में बदल देती है। इससे डेटा को ढूंढना बहुत तेज़ हो जाता है। उदाहरण के लिए, जैसे एक लाइब्रेरी में हर किताब का एक खास कोड होता है जिससे उसे तुरंत ढूंढा जा सके, वैसे ही हेशिंग काम करती है।
Also Read :