2 Demographic and Clinical Variables
The steps taken to harmonise the data columns for demographic and clinical variables are discussed here.
1 Age
age_years is the harmonised positive integer data field that denotes the age of the patient at the time of the CT scan.
It is harmonised as follows:
Cohort ID | Original Response | Harmonisation Response |
---|---|---|
Cohort A | Column age of positive integer values. 0 is used to indicate unknown values. | Values of 0 in age will be changed to NA. age_years will take the values of age. |
Cohort B | Column Age of positive integer values. | age_years will take the values of Age. |
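The age rules for the two cohorts could be sketched as follows. This is a minimal illustration, not the project's actual pipeline: the function names are hypothetical, and Python's `None` stands in for NA.

```python
from typing import Optional


def harmonise_age_cohort_a(age: Optional[int]) -> Optional[int]:
    """Cohort A: 0 encodes an unknown age, so it becomes None (NA)."""
    if age is None or age == 0:
        return None
    return age


def harmonise_age_cohort_b(age: Optional[int]) -> Optional[int]:
    """Cohort B: Age is copied to age_years unchanged."""
    return age


# Hypothetical sample values for Cohort A's age column.
cohort_a_ages = [54, 0, 71]
age_years_a = [harmonise_age_cohort_a(a) for a in cohort_a_ages]
```

Recoding 0 before copying keeps the unknown marker out of age_years, where it would otherwise look like a valid age.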
2 Sex
sex is the harmonised data field that denotes the sex of the patient at the time of the CT scan.
It holds the following values:
Value | Description |
---|---|
0 | female |
1 | male |
-1 | unknown |
It is harmonised as follows:
Cohort ID | Original Response | Harmonisation Response |
---|---|---|
Cohort A | Column sex with | Change the values of sex as follows: |
Cohort B | Column Sex with | Map the values of Sex to sex as follows: |
3 Height, Weight, BMI and BSA
height_cm is the harmonised positive real data field that denotes the height in cm of the patient at the time of the CT scan.
weight is the harmonised positive real data field that denotes the weight in kg of the patient at the time of the CT scan.
bsa_m2 is the harmonised positive real data field that denotes the body surface area in m2 of the patient at the time of the CT scan.
bmi is the harmonised positive real data field that denotes the body mass index of the patient at the time of the CT scan.
Values with more than two decimal places are rounded to two decimal places.
They are harmonised as follows:
Cohort ID | Original Response | Harmonisation Response |
---|---|---|
Cohort A | Column height in cm of positive real numeric values to one decimal place. | height_cm will take the values of height. |
Cohort B | Column Height in cm of positive integer values. | height_cm will take the values of Height. |
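A minimal sketch of the height rule, combined with the two-decimal-place rule, is given below. The helper names and the dictionary-per-row layout are assumptions made for illustration. The source does not state a rounding mode, so Python's built-in `round` is used here as a stand-in.

```python
from typing import Optional


def to_two_decimals(value: Optional[float]) -> Optional[float]:
    """Round to two decimal places; missing values stay missing."""
    if value is None:
        return None
    # Assumption: built-in round() is an acceptable rounding rule here.
    return round(float(value), 2)


def harmonise_height(row: dict, source_column: str) -> dict:
    """Copy the cohort's height column into height_cm (hypothetical helper)."""
    out = dict(row)
    out["height_cm"] = to_two_decimals(row.get(source_column))
    return out


# Cohort A stores height to one decimal place, Cohort B as integers.
row_a = harmonise_height({"height": 172.5}, "height")
row_b = harmonise_height({"Height": 168}, "Height")
```

Values that already have two or fewer decimal places pass through numerically unchanged, so the same helper can be applied to both cohorts.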
4 Smoking History
smoke_current is the harmonised data field that denotes whether the patient is a current smoker at the time of the CT scan. smoke_past is the harmonised data field that denotes whether the patient is a past smoker at the time of the CT scan.
They hold the following values:
Value | Description |
---|---|
0 | no |
1 | yes |
-1 | unknown |
They are harmonised as follows:
Cohort ID | Original Response | Harmonisation Response |
---|---|---|
Cohort A | Column smoke_current_good with | smoke_current will take the values of smoke_current_good. |
Cohort B | Column Smoke History with | Map the values of Smoke History to smoke_current as follows: |
After harmonisation, we validate the values of smoke_current and smoke_past to ensure that only the following combinations occur:
Description | smoke_current | smoke_past |
---|---|---|
Non-smoker | 0 | 0 |
Past smoker | 0 | 1 |
Current smoker | 1 | 0 |
Unknown | -1 | -1 |
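The validation step for the table above could look like the sketch below. The function name and the list-of-dictionaries layout are illustrative assumptions, and only the four allowed combinations are taken from the table.

```python
# Allowed (smoke_current, smoke_past) pairs, taken from the table above.
ALLOWED_SMOKING_STATES = {
    (0, 0): "Non-smoker",
    (0, 1): "Past smoker",
    (1, 0): "Current smoker",
    (-1, -1): "Unknown",
}


def invalid_smoking_rows(rows):
    """Return (index, smoke_current, smoke_past) for each disallowed row."""
    return [
        (i, r["smoke_current"], r["smoke_past"])
        for i, r in enumerate(rows)
        if (r["smoke_current"], r["smoke_past"]) not in ALLOWED_SMOKING_STATES
    ]


# Hypothetical harmonised rows: the second and third are inconsistent.
rows = [
    {"smoke_current": 0, "smoke_past": 1},
    {"smoke_current": 1, "smoke_past": 1},
    {"smoke_current": -1, "smoke_past": 0},
]
problems = invalid_smoking_rows(rows)
```

Returning the offending rows, instead of raising on the first one, makes it easier to review every inconsistency in a cohort in one pass.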
5 Have Shortness of Breath
have_sob is the harmonised data field that denotes whether the patient has shortness of breath at the time of the CT scan.
It holds the following values:
Value | Description |
---|---|
0 | no |
1 | yes |
-1 | unknown |
have_sob is harmonised as follows:
Cohort ID | Original Response | Harmonisation Response |
---|---|---|
Cohort A | Column have_sob with | have_sob remains unchanged. |
Cohort B | Column Dyspnea with | Map the values of Dyspnea to have_sob as follows: |
6 Have Chest Pain
have_chest_pain is the harmonised data field that denotes whether the patient has chest pain at the time of the CT scan.
It holds the following values:
Value | Description |
---|---|
0 | no |
1 | yes |
-1 | unknown |
have_chest_pain is harmonised as follows:
Cohort ID | Original Response | Harmonisation Response |
---|---|---|
Cohort A | Column chest_pain_type with | Map the values of chest_pain_type to have_chest_pain as follows: |
Cohort B | Column Chest Pain Character with | Map the values of Chest Pain Character to have_chest_pain as follows: |
7 Symptoms
symptoms is the harmonised data field that denotes the patient’s symptoms at the time of the CT scan.
It holds the following values:
Value | Description |
---|---|
0 | asymptomatic |
1 | chest pain |
2 | only dyspnea |
3 | others |
-1 | unknown |
When a patient has more than one of chest pain, dyspnea and other symptoms:
- If a patient has all three symptoms, chest pain will take the highest priority. Hence, symptoms = 1.
- If a patient has both dyspnea and other symptoms (not chest pain related), dyspnea will take the higher priority. Hence, symptoms = 2.
The general approach is to assume that patients are asymptomatic (symptoms = 0) unless the data indicate chest pain (symptoms = 1), dyspnea (symptoms = 2) or other symptoms such as heart palpitations (symptoms = 3). If all symptom-related data fields are missing, symptoms = -1.
symptoms is harmonised as follows:
Cohort ID | Original Response | Harmonisation Response |
---|---|---|
Cohort A | Column have_sob with | Map the values of chest_pain_type and have_sob to symptoms as follows: |
Cohort B | Column Dyspnea with | Map the values of Chest Pain Character and Dyspnea to symptoms as follows: |
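The priority rules for symptoms can be sketched as a small function. The three inputs are hypothetical intermediate flags (True, False, or None for missing) and are not columns named in the source. Treating a partially missing record with no positive flag as asymptomatic is an assumption based on the general approach described above.

```python
from typing import Optional


def derive_symptoms(
    chest_pain: Optional[bool],
    dyspnea: Optional[bool],
    other: Optional[bool],
) -> int:
    """Apply the priority chest pain > dyspnea > other symptoms."""
    if chest_pain:
        return 1  # chest pain outranks everything else
    if dyspnea:
        return 2  # only dyspnea, or dyspnea plus other symptoms
    if other:
        return 3  # other symptoms such as heart palpitations
    if chest_pain is None and dyspnea is None and other is None:
        return -1  # every symptom-related field is missing
    return 0  # assumed asymptomatic


all_three = derive_symptoms(True, True, True)
dyspnea_and_other = derive_symptoms(False, True, True)
```

Checking the flags in priority order means each rule only needs to consider the cases that earlier rules did not already claim.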
8 Chest Pain Type
chest_pain_type is the harmonised data field that denotes the patient’s chest pain type at the time of the CT scan.
It holds the following values:
Value | Description |
---|---|
0 | no symptoms |
1 | typical |
2 | atypical |
3 | nonanginal |
4 | dyspnea |
-1 | unknown |
When a patient has more than one of chest pain, dyspnea and other symptoms:
- If a patient has both chest pain (typical, atypical or nonanginal) and dyspnea, chest pain will take the higher priority. Hence, chest_pain_type will be either 1, 2 or 3.
- If a patient has both dyspnea and other symptoms (not chest pain related), dyspnea will take the higher priority. Hence, chest_pain_type will be 4.
- If a patient has other symptoms that are neither chest pain nor dyspnea, like heart palpitations, chest_pain_type will be -1.
The general approach is to assume that patients are asymptomatic (chest_pain_type = 0) unless the data indicate a specific type of chest pain (chest_pain_type = 1, 2 or 3), dyspnea (chest_pain_type = 4) or other symptoms such as heart palpitations (chest_pain_type = -1). If all symptom-related data fields are missing, chest_pain_type = -1.
chest_pain_type is harmonised as follows:
Cohort ID | Original Response | Harmonisation Response |
---|---|---|
Cohort A | Column have_sob with | Map the values of chest_pain_type and have_sob to chest_pain_type as follows: |
Cohort B | Column Dyspnea with | Map the values of Chest Pain Character and Dyspnea to chest_pain_type as follows: |
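A comparable sketch for chest_pain_type is shown below. The inputs are again hypothetical intermediate values: `pain_type` is assumed to be already coded as 1, 2 or 3 when chest pain is present, 0 when it is absent and None when it is missing.

```python
from typing import Optional


def derive_chest_pain_type(
    pain_type: Optional[int],
    dyspnea: Optional[bool],
    other: Optional[bool],
) -> int:
    """Apply the priority chest pain > dyspnea > other symptoms."""
    if pain_type in (1, 2, 3):
        return pain_type  # typical, atypical or nonanginal chest pain
    if dyspnea:
        return 4  # dyspnea without chest pain
    if other:
        return -1  # other symptoms have no chest pain type
    if pain_type is None and dyspnea is None and other is None:
        return -1  # every symptom-related field is missing
    return 0  # assumed asymptomatic


atypical_with_dyspnea = derive_chest_pain_type(2, True, False)
dyspnea_with_other = derive_chest_pain_type(0, True, True)
```

Note that -1 covers two distinct situations here, other symptoms and fully missing data, so chest_pain_type alone cannot tell them apart.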
After harmonisation, we validate the values of chest_pain_type and symptoms to ensure that only the following combinations occur:
Description | symptoms | chest_pain_type |
---|---|---|
Asymptomatic | 0 | 0 |
Have chest pain | 1 | 1, 2 or 3 |
Only dyspnea | 2 | 4 |
Other symptoms | 3 | -1 |
Unknown | -1 | -1 |
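The cross-field check could be sketched as a lookup from each symptoms value to the chest_pain_type values it permits. The names below are illustrative, and other symptoms are coded as symptoms = 3, following the value table for symptoms.

```python
# For each symptoms value, the chest_pain_type values that may accompany it.
ALLOWED_CHEST_PAIN_TYPES = {
    0: {0},         # asymptomatic
    1: {1, 2, 3},   # chest pain: typical, atypical or nonanginal
    2: {4},         # only dyspnea
    3: {-1},        # other symptoms
    -1: {-1},       # unknown
}


def is_consistent(symptoms: int, chest_pain_type: int) -> bool:
    """Return True if the pair is one of the allowed combinations."""
    return chest_pain_type in ALLOWED_CHEST_PAIN_TYPES.get(symptoms, set())


# Hypothetical (symptoms, chest_pain_type) pairs to check.
pairs = [(1, 2), (2, 4), (0, 4), (3, -1)]
flags = [is_consistent(s, c) for s, c in pairs]
```

An unrecognised symptoms value falls back to an empty set, so it is reported as inconsistent and does not raise an error.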