Hashing Algorithm And Its Techniques In DBMS: In a large database, data is stored at various locations. It becomes hectic and time-consuming when locating a specific type of data in a database via linear search or binary search. This problem is solved by “Hashing”.
Hashing is an advantageous technique which uses a hash function to find the exact location of a data record in minimum amount of time.
For example, we recorded data of multiple students in an alphabetical format in a database of college. But, it is still difficult to locate data every time using linear search.
Also See: Examples of Database Management System.
Hashing Algorithm And Its Techniques In DBMS
In this article, we will discuss what is Hashing algorithm and what are its various techniques like Division Method, Mid- Square Method and Folding method in DBMS.
A hash function used in hashing is also called ‘hashing algorithm’. A hashing algorithm uses a hash key to locate a data record. A hash key is a string of characters which is transformed into shorter-length hash address by a hashing algorithm.
A hashing function can vary from a simplest mathematical function to any complex function. Though it is required that a hash-function should be easy and quick to generate results.
While inserting a data record, the hash function uses a hash key to produce a relative hash address and allocate this hash address to each data record. Any time, when the database needs to search, update or delete a record; it looks for hash addresses related to each record and performs its desired operation.
A hash-function is termed to be “good” if it does not generate same hash-address for different hash-keys.
Following are some known hashing-algorithms used in the database. All of these hashing algorithms are easy and quick to compute results:-
Hashing Techniques In DBMS
- Division method:- A hash function which uses division method is represented as;
where; k signifies a hash key and m is chosen to be a prime no. which is greater than total no. of keys. In the above formulas, k(mod m) indicates the remainder when k is divided by m. The first formulae range hash addresses from 0 to m-1 where second formulae range hash addresses from 1 to m.
Also See: What is RDBMS
Consider a class with 68 students and each student is given a unique 4-digit student number. 1024, 2448 and 3466 are few unique student numbers. Here, student nos. denote k i.e. hash key and we choose m to be 71 which is greater than total no. of keys and also a prime number.
Using first formulae, value of hash address H(k) results into;
H(1024)=1024(mod 71)=14, H(2448)=2448(mod 71)=34, H(3466)=3466(mod 71)=48
Where 14, 34 and 48 are the hash-addresses of students associated with student no. 1024, 2448 and 3466. Similarly, if we use second formulae, the value of hash-addresses results into 15, 35 and 49 respectively.
- Mid-square method:- In this hashing algorithm, we square hash key(k) and eliminate digits from both ends of k2. It is recommended that same positions are always used from all the k2 values to retrieve hash-addresses. The formulae of mid-square method is represented below;
Where l is the value of hash address computed after eliminating digits from both ends of k2.
Following are the results of above example using mid-square method;
Notice that fourth and fifth digits are used to retrieve hash address counting from right.
- Folding method:- This hashing algorithm chop a hash key into no. of parts and compute a hash address after adding these parts and ignoring the carry. We can also reverse even-numbered parts to retrieve a hash key. Folding method hashing-algorithm is represented by;
Considering above example, hash keys available are: 1024, 2448 and 3466. Now compute hash-address using this method;
H(1024) = 10+24 = 34, H(2448) = 24+88 = 12, H(3466) = 34+66 = 00
Carry is ignored when we generated results using keys 2448 and 3466; ‘00’ is also a hash address in the database, it is not a nil number.
Now, observe that each hashing-algorithm results into a different hash address related to each student no. and these hash addresses are not equivalent to serial nos. of students.
So it was all about Hashing Algorithm And Its Techniques In DBMS. If you have any doubt in your mind then please comment below, we will surely help you.