Database design (MySQL)

thesadguy · Jan 7, 2010

I have 2 tables, entity_A and entity_B.

entity_A has a field that links to one or more records in entityB.

entity_B has only a handful of records (say 20), while entity_A has a few thousand.

What is the best way to design for this? It doesn't seem to make sense to have entity_B have an entity_A_id field, as multiple records in entity_A would require linkage to the same entity_B record. In this case, entity_B would be many times larger than entity_A, with a LOT of redundant information (as I say, a few thousand records instead of just 20).

Currently, the system maintains an object for each entity_A record. The object is serialised and stored in the database. When it needs to be retrieved, it is unserialised. The entity_B table, is only used for ease-of-use for the user; displaying a form with a list of entity_B records and an "Add" button.

The big problem with this approach is that the entity_B data is not easily searchable.

Any ideas?

Other Pineapple Smurf · Jan 7, 2010

You will need a mapping table (Table C) if table_a needs to reference more than one row from table_b

example:

Table A:
ref
name
...

Table B:
ref
name
...

Table C:
ref_a
ref_b

SQL: (Psudo)

SELECT a.*, b.* FROM table_c JOIN table_a JOIN table_b WHERE table_a.ref=table_c.ref_a AND table_b.ref=table_c.ref_b;

Other Pineapple Smurf · Jan 7, 2010

BTW: Just remember that disk space is cheap and DBs are highly optimised apps. Use the basics and never reinvent the wheel.

Mapping tables can also grow but never be afraid of their size. Also use foreign key constraints on your tables. It makes it more bullet proof and easier to understand what you are doing.

After working with DBs everyday for the last 3yrs I've accepted this.

thesadguy · Jan 7, 2010

cbrunsdonza said:
You will need a mapping table (Table C) if table_a needs to reference more than one row from table_b

How would that look?

Couldn't I just as easily do what I described above, have table_b have multiple records, one for each of table_a's records' entity_b entries? Then table_b would be very large, but it would work.

Having a table_c to do the mapping would surely require a record for each mapping of a table_b record to a table_a record? Or am I missing the point?

Mr.Jax · Jan 7, 2010

"In this case, entity_B would be many times larger than entity_A" not quite true.
Multiple entity A records can reference the exact same record in entity B, no need to duplicate the data in entity B at all.

Ofcourse, assuming I understand you correctly.

Real world example: ent_B contains 'Cities' and the user can add new Cities

ent_A contains addresses, like street, house numberm, suburb and ofcource the city referenced in ent_B

Does that sort of describe the way the tables are used ?
If so, then the design is fine, as long as data is not duplicated in ent_B, because thats just not neccessary.
(I don't know MySQL that well, I know oracle, SQL server very well. Does MySQL have primary keys or unique indexes ? )

thesadguy · Jan 7, 2010

But as I say, having 20 records vs ~3 x size(table_a)... doesn't make sense.

Other Pineapple Smurf · Jan 7, 2010

thesadguy said:
But as I say, having 20 records vs ~3 x size(table_a)... doesn't make sense.

Don't fight it.

Say you have 1000 entries in table_a (unique rows) and each need to reference 3 (or even 20) rows in table_b

Your mapping table will have 3000 rows then.

Also dbs tend to grow. Its a good habit to learn mapping tables now. You will always use them in the future.

thesadguy · Jan 7, 2010

Sorry, can't divulge the ACTUAL usage of the tables, but a correct analogy follows.

2 tables: People and Cities

People contains a few thousand people, and one of the fields is "cities_lived_in"
Cities contains only about 20 cities.

Clearly some people would have lived in more than 1 city.

It's guaranteed that every Person has lived in AT LEAST one of the Cities.

Other Pineapple Smurf · Jan 7, 2010

*** UPDATED TO BE MORE RELEVANT TO YOUR QUESTION ***

Table A: rows 3
=====
Piet
Koos
Jan
...

Table B: rows 4
=====
Cape Town
Jhb
Durban
Pofadder
...

Table C: 6 rows

Table_A_ref
Table_B_ref
When_Lived
=====
Piet - Cape Town - 2009
Koos - Cape Town -2009
Koos - Jhb -2001
Jan - Pofadder - 2009
Jan - Cape Town - 2008
Jan - Jhb - 2007

* SELECT WILL GIVE YOU *

Piet, ..., Cape Town, ..., 2009
Koos, ..., Cape Town, ..., 2009
Koos, ..., Jhb, ..., 2001
Jan, ..., Pofadder, ..., 2009
Jan, ..., Cape Town, ..., 2008
Jan, ..., Jhb, ..., 2007

Nod · Jan 7, 2010

Get a book called "Physical Database design for mere mortals". Explain the use of linking tables, and how to normalize databases.

Other Pineapple Smurf · Jan 7, 2010

thesadguy said:
Sorry, can't divulge the ACTUAL usage of the tables, but a correct analogy follows.

2 tables: People and Cities

People contains a few thousand people, and one of the fields is "cities_lived_in"
Cities contains only about 20 cities.

Clearly some people would have lived in more than 1 city.

It's guaranteed that every Person has lived in AT LEAST one of the Cities.

OK

This is a textbook mapping table example. Any SQL beginners book will resolve it with my recommendation.

See the above post for a psudo layout

flarkit · Jan 7, 2010

thesadguy said:
Sorry, can't divulge the ACTUAL usage of the tables, but a correct analogy follows.

2 tables: People and Cities

People contains a few thousand people, and one of the fields is "cities_lived_in"
Cities contains only about 20 cities.

Clearly some people would have lived in more than 1 city.

It's guaranteed that every Person has lived in AT LEAST one of the Cities.

That analogy/scenario implies using a mapping table, unless you only use the people table to track the cities they've lived in and don't mind duplicating the people for each city

Mr.Jax · Jan 7, 2010

Right, got it ta.

As the others have said, linking table is best practice.

PARENT CHILD SETUP:

PEOPLE
- PERSONID (primary key)
- NAME
- bla bla bla

CITIES
- CITY_ID
- NAME
- BLABLABLA

PEOPLE_CITIES
- PERSONID (primary key, and foreign key to PEOPLE )
- CITY_ID (primary key, and foreign key to CITIES )
-

Above, as others said, perfect for your application.

thesadguy said:
Sorry, can't divulge the ACTUAL usage of the tables, but a correct analogy follows.

2 tables: People and Cities

People contains a few thousand people, and one of the fields is "cities_lived_in"
Cities contains only about 20 cities.

Clearly some people would have lived in more than 1 city.

It's guaranteed that every Person has lived in AT LEAST one of the Cities.

greggpb · Jan 7, 2010

cbrunsdonza said:
*** UPDATED TO BE MORE RELEVANT TO YOUR QUESTION ***

Table A: rows 3
=====
Piet
Koos
Jan
...

Table B: rows 4
=====
Cape Town
Jhb
Durban
Pofadder
...

Table C: 6 rows

Table_A_ref
Table_B_ref
When_Lived
=====
Piet - Cape Town - 2009
Koos - Cape Town -2009
Koos - Jhb -2001
Jan - Pofadder - 2009
Jan - Cape Town - 2008
Jan - Jhb - 2007

* SELECT WILL GIVE YOU *

Piet, ..., Cape Town, ..., 2009
Koos, ..., Cape Town, ..., 2009
Koos, ..., Jhb, ..., 2001
Jan, ..., Pofadder, ..., 2009
Jan, ..., Cape Town, ..., 2008
Jan, ..., Jhb, ..., 2007

++

thesadguy · Jan 7, 2010

Ok. If you guys say so, I trust you. Just didn't seem like the elegant solution I was looking for.

So you reckon I should have a table that contains all unique people, another table that contains all unique cities, and then a third table that contains the mappings between them? Bearing in mind, that we have quite a few fields that we need to reference in this way?

Basically have an extra mapping table for each of these relationships?

This kind of thing isn't frowned upon?

thesadguy · Jan 7, 2010

Oh, and how would I go about designing this to be user-expandable?

Currently, the user is able to add a new "entity_X" table and populate it with unique records. Then (possibly many) of these entity_X records need to be linked to various People records. This is easily accomplished using the abovementioned object/serialise/unserialise method. But it seems that using the mapping table would require 2 new tables to be built as a new entity table is added, and I would probably need to keep a third extra table to describe to the system how they all relate?

Nod · Jan 7, 2010

Without knowing what exactly you're trying to do, we can only help as so far as the information you provide to us.

Don't be afraid to use more tables. In the end it is better to query multiple smaller tables, than one big table. Also, for indexes your rows should be as unique as possible, and having one big table will make this less likely.

thesadguy · Jan 7, 2010

Ok, so, while it doesn't seem that elegant, it is best to

1) Have a separate table for each piece of unique information (i.e. people table with fields for personal information)
2) Have a separate table for each type of data we're trying to extract

?

greggpb · Jan 7, 2010

Try smothing like that for data driven field additions

PERSON(ID,Name)
1,Gregg
2,Bob

CITY(id,Name)
1,Cape Town
2,Durban

PERSON_CITY(ID,Person_ID,City_ID)
1,1,1
2,1,2
3,2,1

DATA_TYPE(ID,Type )
1,Street
2,Surburb

PERSON_CITY_DATA(PersonCity_ID,DataType_ID,DataPointValue)
1,1,MainRoad
1,2,Seapoint
2,1,Gimpy Street

PERSON_CITY Table could be left out depeding on how the data would be used.

Mr.Jax · Jan 7, 2010

If you have multiple pieces of data that would be referenced like that, you can still use only one physical linking table

Simply add a 'link type' to the table, obviously part of the PK, (we do this in our systems btw)

e.g.

ENTITY_LINK
- SOURCE ID
- TARGET ID
- LINK_TYPE

Unfortunately, you will lose some strict referential integrity if the source/target objects are from various different tables (e.g. Cities and Suburbs all referenced by TARGET_ID)

EDIT: Example
SOURCEID,TARGETID,LINKTYPE
Jan, Pretoria, 1
Jan, Jhb, 1
Jan, Koffie, 2
Jan, Tee, 2
etc...

Join the MyBroadband community

Get started

Database design (MySQL)

Active Member

Honorary Master

Honorary Master

Active Member

Expert Member

Active Member

Honorary Master

Active Member

Honorary Master

Honorary Master

Honorary Master

Executive Member

Expert Member

Expert Member

Active Member

Active Member

Honorary Master

Active Member

Expert Member

Expert Member