Hash Tables

S1ght

Expert Member
Joined
Jan 23, 2006
Messages
3,301
Reaction score
27
Location
Berlin
Hi all,

Am having some trouble understanding hash tables/collision handling with random access files. I understand how to hash a key to get the record number in the file and then how to get a position in the file, so I can read/write to random access files assuming there are no collisions.

Now when they share the same record number in the file is where I have problems, I've read a lot online about hash tables but still can't grasp it completely :o I'm trying to implement it with Bucket Addressing with size of 5. There are 13 records in the file and am using Visual Basic.

So far I've gotten as far as declaring a variable of type hashfile and I have an array of records where the record has the key value in it...

Can anyone maybe explain this in a nice easy to understand way :)

Thanx in advance
S1ght
 
go clubbing... find a girl.. get laid... = no problem with "hash tables"
 
Should i post links to all the times you asked for help with homework? :p
 
I think your hash size may be your problem. Maybe increase the size of your hash to avoid collisions. SHA1 and MD5 both return 32 byte hashes
 
If you're writing to a record that's already taken, it keeps looking forward until it finds an unused block, and uses that.

Or am I understanding your question wrong?

That's one method of doing it but the problem with that is that it can cause it to occupy spaces of other records, so say you have a record at position 2 and position 3 is still empty so it goes there but the next record you try to write was meant to go to 3 so it then goes to 4 so its not the best method if you want most things to go to their original positions
 
Wow - haven't used hash tables in ages. Haven't really had any use for them either... Will have to think about this a bit, but don't count on a decent reply :p
 
Visual Basic is just our 1 subject where they made us take it i swear :p we do c++ as our major.
 
That's one method of doing it but the problem with that is that it can cause it to occupy spaces of other records, so say you have a record at position 2 and position 3 is still empty so it goes there but the next record you try to write was meant to go to 3 so it then goes to 4 so its not the best method if you want most things to go to their original positions

That's a shortcoming of a hash table. But a hash table was never meant to be perfect, just better than different methods. You can try other methods of collision resolution, but they'll usually have drawbacks of their own.

Since you're working with a file, you can go and physically shift everything up to the next position so that you can squeeze your new value into the optimum position, but that means anything read on the first hash takes longer to find. You gain performance on the one read, but lose it on another, and your insert is MUCH slower.

Either way, you didn't mention much on what the problem was, so it's kinda hard to solve it...
 
Im with Pyro, whats your problem exactly ? Maybe im just not understanding clearly? Instead of storing values store link lists (so if two values collide, store the second after the first and keep track of an index somehow), as far as i know that is one way to implement hash tables. I might just be understanding your question wrong though....
 
From what I can see he is try is trying to to use hash index's on buckets but the problem seems to be trying to use a file as a db with read write without locking. Files are not a good storage system medium. Unless you are on a AS400 for example.....
 
BSc IT at UJ, Informatics(VB) and Computer Science(c++ and Java), got some friends doing BIT at Tuks though
 
When I saw the C++ and compulsory VB, I just knew there is some form of Informatics in the mix...
 
Top
Sign up to the MyBroadband newsletter
X