No announcement yet.

New fields for large table that only impact a small subset of the data

  • Filter
  • Time
  • Show
Clear All
new posts

  • New fields for large table that only impact a small subset of the data

    We need to add new fields to one of our tables but only a small subset of records will be impacted (or contain relevant values). I'm looking for some guidance about whether or not we should add the fields to the table or create a new table which will have a 1 to 1 relationship with the old table but only contain records where relevant data exists.

    Some stats:

    Current table has 2.5mm records, only about 50,000 records actually matter ( we may at some point clean it up ).

    I need to add these fields to the table:

    field1 ( varchar (40) )
    field2 ( int, indexed )
    field3 ( date )
    field4 ( enum ('value1' (default), 'value2', 'value3', 'value4') )
    field5 ( int )
    field6 ( date )

    Only 50,000 records or 2.5% will ever have data in the new fields. Does it make sense to add them to the current table or create a new table with a 1 to 1 join relationship?

    Any suggestions / help is appreciated.


  • #2
    Hmm, well since it's so few rows that will contain the data you are right in suggesting to keep it in a separate table.

    But then again if we say that each row increase with about 20-25 bytes with the extra columns then the total size of the table might gain between 50MB and 75MB.

    So..., I think I would go with adding the columns to the existing table to keep the design simple, but you could do it either way and I don't think you would really notice any difference performance wise (except if you don't have a index on one of these extra fields, and you are forcing MySQL to perform a table scan of these fields, at that point a table scan of the 2.5% table and then retrieving the corresponding rows from the main table via primary key would be faster than traversing 100% of the rows where 97.5% is empty ) ).


    • #3
      Thanks for the response, we came to pretty much the same conclusion and are adding the fields to the existing tables.