PrevPrev Go to previous topic
NextNext Go to next topic
Last Post 09 Jun 2011 12:00 AM by  SuperUser Account
Paediatric tables / more design issues
 2 Replies
Sort:
You are not authorized to post a reply.
Author Messages

SuperUser Account



Basic Member


Posts:289
Basic Member


--
26 Apr 2011 12:00 AM

    Hello everybody,

    I have discovered another issue with the pediatric tables and need your input to fix it the right way.

    Problem: tblPREG had only MOTHER_ID as primary key. This implies that a woman can only get pregnant once.

    Proposed fix: Add MENS_D (the date of the last known menstrual period) to tblPREG's primary key. This allows women to get pregnant multiple times. However, it requires also adding the MENS_D field to all tables which reference records in tblPREG, i.e. tblPREG_OBS, tblPREG_OUT, tblDELIVERY_MUM.

    I edited the wiki articles accordingly and added an article Notes on Pregnancy Tables which summarizes the overall model of the current proposal.

    This makes the primary keys of the individual tables rather large (3 fields for tblDELIVERY_MUM). Besides that, MENS_D is only approximatively known, therefore it is rather unsuited for a key. It may even become a quality assurance problem if the cohorts do not enforce referential integrity in their databases and we would then end up with pregnancy outcomes referencing no or non-existing pregnancies etc.

    Request for comments:

    • Is MENS_D a good key variable?
    • Is it also applicable and unambiguous in case of an In Vitro Fertilisation?
    • Are there more reliable variables that can be used as a key?
    • Should we just use dedicated ID variables instead (e.g. a pregnancy ID which is referenced in tblPREG_OBS, etc.) to enforce referential integrity?

    I'm especially interested in the opinions of the pediatric cohorts' data managers!

    Kind regards,

    Simon


    SuperUser Account



    Basic Member


    Posts:289
    Basic Member


    --
    27 Apr 2011 12:00 AM


    Forwarding Yacine's reply via email:

    Dear Simon,

    I have had a look at the current design of the pediatric tables. I understand your issue about primary key and uniqueness. Please find below some comments and suggestions

    tblPREG - Pregnancy

    following your comments, i would suggest to have a dedicated ID variable, like PREG_SEQ, to take into account the case of multiples pregnancies within the life of a cohort making the couple (MOTHER_ID, PREG_SEQ) the primary key for this table. The cohort team should be able to manage such a variable that will be under their control. The Date of last menstrual period MENS_D (or even the variable GYNAE_D) will be used to control the consistency of PREG_SEQ.

    tblPREG_OBS - Obstretical problems during pregnancy in a relationship

    (0,1) with tblPREG should have the same primary key/reference key (MOTHER_ID, PREG_SEQ). An occurence should exist in tblPREG_OBS only if PROB_Y of tblPREG is Yes. I wondered whether a variable is not missing in this table ? We need to have beside the variable PROB_T (all types of obstetrical problem), an additional numeric variable PROB_V (Problem_Value) taking the values: 1=Yes, 0=No, 9=Unknown.

    tblPREG_OUT

    We assume that CHILD_ID exists even when the variable OUTCOM takes the values 10, 11, 20, 21, ie the pregnancy outcome is negative. This is an important thing to check with the cohort managers. The primary key could be (MOTHER_ID, PREG_SEQ, CHILD_ID) with (MOTHER_ID, PREG_SEQ) as the reference key with tblPREG.

    -tblDELIVERY_MUM and tblDELIVERY_CHILD

    Obvioulsy tables tblDELIVERY_MUM and tblDELIVERY_CHILD are linked to tblPREG in a relationship (1,1*) and (1,n*) respectively, that means that in some cases a PREGNANCY could not lead to any record in either tblDELIVERY_MUM or tblDELIVERY_CHILD (in case of termination for instance, QA check). (MOTHER_ID, PREG_SEQ) could serve as a primary key (and reference key) for tblDELIVERY_MUM. (MOTHER_ID, PREG_SEQ, CHILD_ID) and (MOTHER_ID, PREG_SEQ) will be the primary key and the reference key to tblPREG.

    -tblNEWBORN is in relationship (1,1) with tblDELIVERY_CHILD

    tblNEWBORN_ABNORM is in relationship (1,1*) with tblNEWBORN, an

    occurence in tblNEWBORN_ABNORM existing only if ABNORM_Y of tblNEWBORN = Yes. As for table tblPREG_OBS, i wondered whether we don't need to add the variable ABNORM_V (value corresponding to ABNORM_T) unless you are assuming that cohort will only report the occurring abnormalities (is we don't want to register the No and Unknown values). Please could you clarify ?

    I hope these comments will be useful.

    Best regards, Yacine

    SuperUser Account



    Basic Member


    Posts:289
    Basic Member


    --
    09 Jun 2011 12:00 AM


    I can understand the idea behind the PREG_SEQ variable, yet I'm not a fan of it. The general idea with HICDEP was to create a format for exchange that cohorts could map into. Not a format for which they would have to generate any values, even those that are just numeric IDs.

    However - I would design my own database the same way and I don't have any good alternatives that wouldn't in some extreme allow for missing consistency.

    So I'm just raising this to everyone to try and think with the data at hand instead of creating new IDs. This should be the last resort - as done here

    /Jesper
    You are not authorized to post a reply.