Chamanparaa, P., Moghimbeigi, A., Faradmal, J., Poorolajal, J. (2015). Exploring the spatial patterns of three prevalent cancer latent risk factors in Iran; Using a shared component model. International Journal of Epidemiologic Research, 2(2), 68-77.


Exploring the spatial patterns of three prevalent cancer latent risk factors in Iran; Using a shared component model

^{1}Golestan Research Center of Gastroenterology and Hepatology, Golestan University of Medical Sciences, Golestan, Iran

^{2}Modeling of Noncommunicable Disease Research Center, Department of Biostatistics and Epidemiology, School of Public Health, Hamadan University of Medical Sciences, Hamadan, Iran

Abstract

Background and aims: This study modeled the incidence rates of colorectal, breast and prostate cancers using a shared component model in order to explore the spatial pattern of their shared risk factors (i.e., obesity and low physical activity) and to estimate the relative weight of these shared components. Methods: New cases of colorectal, breast and prostate cancer in 2009, provided by the Management Center of the Ministry of Health and Medical Education, were analyzed with a Bayesian shared component model. In addition, the BYM (Besag, York and Mollié) model was applied to investigate the geographical pattern of the incidence rate of each disease individually. Results: The obesity component had a larger effect on the incidence of the relevant cancers in Ardabil, West Azarbaijan, Gilan, Zanjan, Kurdistan, Qazvin, Tehran, Mazandaran, Hamadan, Kermanshah, Semnan, Golestan, Yazd and Kerman, and this component was more important for prostate cancer than for colorectal and breast cancers. The low physical activity component had a larger effect on the incidence of colorectal and breast cancers in Ardabil, Zanjan, Qazvin, Tehran, Mazandaran, Markazi, Lorestan, Kermanshah, Ilam, Khuzestan, South Khorasan, Yazd, Kerman and Fars, and it was more important for breast cancer than for colorectal cancer. Conclusion: Based on the deviance information criterion, joint modeling of the three cancers under study using a shared component model was better than modeling them individually using the BYM model.

Cancer is one of the most common causes of death in the world and the second leading cause of death in Iran, where approximately 70,000 new cases occur annually.^{1,2} According to the 2009 National Cancer Registry report, breast, colorectal and prostate cancers were among the most common cancers in all Iranian provinces.^{3}

Breast cancer is the most common cancer in the world after lung cancer,^{4} and it is the most common cancer among women both in Iran and worldwide.^{3,5} About 21.4% of Iranian women with cancer have this type of cancer.^{6} Many risk factors for breast cancer have been reported, but it is impossible to identify the specific ones.^{7} In one study, 21% of all breast cancer deaths worldwide were attributed to overweight, low physical activity and alcohol consumption. In high-income countries obesity was an important modifiable risk factor, whereas in low- and middle-income countries low physical activity was.^{8}

Colorectal cancer is the second most common cancer in the world;^{9} nearly a million new cases are diagnosed every year, and half of these cases result in death.^{10} Modifiable risk factors associated with this cancer include poor diet, low physical activity, being overweight, smoking and alcohol consumption.^{11}

Moreover, according to global cancer statistics, prostate cancer is the second most common cancer in males, accounting for 11.7% of the total new cancer cases. This proportion is 19% in developed countries and 3.5% in developing countries.^{12} The main causes of this disease are still unknown.^{13} Poor diet, being overweight and smoking have been noted as modifiable risk factors for this cancer.^{14}

One suitable way to analyse a data set is to produce and inspect graphs that display its salient features. In spatial epidemiology this is called disease mapping. Disease mapping acts as an exploratory analysis to gain an impression of the geographical distribution of a disease or its risk factors.^{15} Its main objectives are to describe high-risk areas, to formulate aetiological hypotheses, and to provide detailed maps of disease risk that support better allocation of resources and better public health policies.^{16} Population-based maps, such as those produced using the standardized mortality ratio (SMR) or the standardized incidence ratio (SIR), are based on unbiased estimators of relative risk (RR) and help determine the geographical variation of disease incidence or mortality rates.^{17}

These measures also have disadvantages. Because they are ratio estimates, small changes in the expected values can lead to large changes in the estimated risk. When the expected value is zero (or close to zero), the index is very large for any positive observed count or cannot be estimated at all. These measures also ignore the expected similarity in relative risk between adjacent areas. It is therefore difficult to draw clear conclusions from them.^{18} Different methods have been proposed to resolve these problems. Among them, Bayesian methods have been emphasized because of their greater flexibility in modelling complex data structures and their more reliable results. They also make it possible to account for the spatial correlation of disease rates between neighbouring areas (the tendency of neighbouring areas to have similar disease rates), so that the effect of geographical structure is considered and more realistic estimates of RR are obtained.^{17} Many studies have focused on the geographical modelling of individual diseases, but many diseases share risk factors, which has recently led to the development of joint disease mapping.^{19} Joint disease mapping can be defined as the spatial modelling of two or more diseases in two or more subsets of the at-risk population.^{17,20} The main advantages of these models are their ability to assess the common and specific patterns of risk for different diseases, improved accuracy in estimating patterns of disease variation, and the identification of joint clusters associated with risk factors common to the diseases.^{20}

In the past two decades, many methods have been proposed for joint disease mapping. The first studies to introduce joint disease mapping were performed by Langford et al.^{21} and Leyland et al.^{22} A shared component model was then proposed for detecting joint and selective clustering of two diseases.^{23} Subsequently, Held et al. developed a shared component model for more than two diseases.^{24} In another study, four methods of joint modelling were compared, and it was concluded that the shared component model offers more versatility in answering basic epidemiological questions.^{25}

Mahaki and colleagues applied a shared component model to data from 2007 to study the spatial distribution of latent risk factors (smoking, obesity, inadequate consumption of fruit and vegetables, socioeconomic status and low physical activity) shared among the seven most prevalent cancers: esophagus, stomach, bladder, colorectal, lung, prostate and breast.^{26}

In another study, Chamanpara and colleagues jointly modelled the geographical variation of esophageal and gastric cancers in Golestan, Iran, using data from 2004 to 2008, with a diet low in fruit and vegetables considered as the shared component.^{27}

Given the inherent relationship between these cancers and the risk factors mentioned for breast, colorectal and prostate cancers, we can assume that obesity is a risk factor common to all three, while low physical activity is common to breast and colorectal cancers. In this study, we therefore used a Bayesian shared component model for joint modelling of the incidence rates of these three cancers in Iran in 2009, in order to explore the pattern of spatial correlation among them and to estimate the relative weights of the shared risk factors, obesity and low physical activity.

METHODS

In this study, we used the new cases of colorectal (ICD-10 codes C18-C20, C26), prostate (C61) and breast (C50) cancers in all provinces in 2009, as reported by the Non-communicable Disease Management Centre of the Ministry of Health and Medical Education. According to the 2006 and 2011 censuses, the total population of the country was 70,495,782 and 75,149,669 persons, respectively. Because the observed cases are from 2009, we took the at-risk population to be a weighted combination of the two census populations (0.6 × the population in 2006 + 0.4 × the population in 2011), which gives an estimate of 72,357,336.8. We used the shared component model proposed by Held et al.^{24} to jointly model the spatial variation in the incidence rates of the cancers. We considered obesity or overweight (shared by all three cancers) and low physical activity (shared by breast and colorectal cancers) as the latent shared risk factors. A feature of the model is that each shared component (obesity or low physical activity) is treated as the dominant surrogate for all common latent risk factors of these cancers. The joint maps therefore show the spatial variation of all unobserved, spatially structured risk factors that affect the diseases, with the components under study chosen as their representatives.^{28}
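The at-risk population calculation can be reproduced with a few lines of Python. This is only a sketch of the arithmetic stated above; the function name and argument names are illustrative and do not come from the study.

```python
def at_risk_population(pop_2006, pop_2011, w_2006=0.6, w_2011=0.4):
    """Weighted combination of two census counts, using the weights given in the text."""
    return w_2006 * pop_2006 + w_2011 * pop_2011


# National census totals quoted in the text.
national_2009 = at_risk_population(70_495_782, 75_149_669)
print(f"{national_2009:.1f}")  # 72357336.8
```

The same function could be applied province by province, given the two census counts for each province.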

Let $O_{ij}$ and $E_{ij}$ denote the observed and expected numbers of cases of the j-th disease in the i-th province. The expected number of cases in each province is calculated by multiplying the overall incidence rate of the disease by the population of the province. $O_{ij}$ is assumed to follow a Poisson distribution with mean $\mu_{ij} = E_{ij} R_{ij}$, where $R_{ij}$ is the relative risk of disease j in province i and is an unknown parameter of the model. The maximum likelihood estimate of the relative risk, $\hat{R}_{ij}$, is obtained by dividing the observed number of cases of the j-th disease in the i-th province by its expected value.
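As a minimal sketch of this step for a single disease, the expected counts and the crude relative risk estimates (observed divided by expected) can be computed as follows. The three-province counts and populations are hypothetical values chosen for illustration, not data from the study.

```python
def expected_cases(observed, population):
    """Expected count per province: overall incidence rate times province population."""
    overall_rate = sum(observed) / sum(population)
    return [overall_rate * p for p in population]


def crude_relative_risk(observed, population):
    """Maximum likelihood estimate of the relative risk: observed / expected."""
    expected = expected_cases(observed, population)
    return [o / e for o, e in zip(observed, expected)]


# Hypothetical data for three provinces (illustration only).
observed = [30, 10, 60]
population = [100_000, 50_000, 150_000]

expected = expected_cases(observed, population)
rr_hat = crude_relative_risk(observed, population)
for o, e, r in zip(observed, expected, rr_hat):
    print(f"observed={o:3d}  expected={e:6.2f}  RR={r:.2f}")
```

By construction the expected counts sum to the observed total, so the estimates are centred around one.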

In addition, to take into account the information from the neighbors of each province, we used the popular BYM model. In this model, the logarithm of the relative risk of the j-th disease in the i-th province is written as follows:

$$\log(R_{ij}) = \alpha_j + u_{ij} + v_{ij}$$

where $\alpha_j$ is an intercept, and $u_{ij}$ and $v_{ij}$ are the structured and unstructured random effects, respectively. The random effect $v_{ij}$ (uncorrelated heterogeneity) models unstructured dispersion between regions and follows a normal distribution with zero mean and variance $\sigma_v^2$. The structured random effect $u_{ij}$ (correlated heterogeneity) captures local dependence in space by assigning weights to adjacent areas. It follows the conditional autoregressive normal (CAR normal) model, in which the conditional distribution of each area-specific structured component has a mean equal to the average of its neighbors and a variance inversely proportional to the number of these neighbors. So we have:

$$u_{ij} \mid u_{lj},\ l \neq i \ \sim\ N\!\left(\frac{1}{n_i}\sum_{l \in \partial_i} u_{lj},\ \frac{\sigma_u^2}{n_i}\right)$$

where $l \in \partial_i$ indexes the provinces adjacent to province $i$ (i= 1, 2, ..., 287) and $n_i$ is the number of adjacent provinces.
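The conditional mean and variance of the CAR normal prior can be illustrated with a small sketch. The four-area adjacency structure, the effect values and the name `sigma2_u` for the variance parameter of the structured effect are all assumptions made for this example.

```python
def car_conditional(u, neighbours, i, sigma2_u):
    """Conditional mean and variance of u[i] given its neighbours under a CAR normal prior.

    The mean is the average of the neighbouring effects and the variance is
    sigma2_u divided by the number of neighbours.
    """
    adjacent = neighbours[i]
    n_i = len(adjacent)
    mean = sum(u[l] for l in adjacent) / n_i
    variance = sigma2_u / n_i
    return mean, variance


# Hypothetical example: four areas arranged in a line (0 - 1 - 2 - 3).
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
u = [0.2, -0.1, 0.4, 0.0]

mean_1, var_1 = car_conditional(u, neighbours, i=1, sigma2_u=1.0)
print(mean_1, var_1)
```

Areas with more neighbours get a smaller conditional variance, which is how the prior smooths estimates more strongly where more neighbouring information is available.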

For the joint analysis of the spatial distribution of the three cancer incidence rates, we used the Bayesian shared component model proposed by Held et al.^{24}, with obesity and low physical activity as the shared components. In this model, the logarithms of the RRs are written as follows:
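A form of the model consistent with the definitions that follow can be sketched as below. The double subscript on $\delta$ (cancer, then component) is an assumed notation, since the text refers to the scale parameters only as $\delta$; the obesity component enters all three equations and the low physical activity component only the colorectal and breast equations.

```latex
\begin{aligned}
\log R_{i1} &= \alpha_1 + \delta_{11}\lambda_{i1} + \delta_{12}\lambda_{i2} + \varepsilon_{i1} && \text{(colorectal)}\\
\log R_{i2} &= \alpha_2 + \delta_{21}\lambda_{i1} + \delta_{22}\lambda_{i2} + \varepsilon_{i2} && \text{(breast)}\\
\log R_{i3} &= \alpha_3 + \delta_{31}\lambda_{i1} + \varepsilon_{i3} && \text{(prostate)}
\end{aligned}
```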

where $R_{i1}$, $R_{i2}$ and $R_{i3}$ are the RRs of colorectal, breast and prostate cancers in the i-th province, respectively. The parameter $\alpha_j$ is the intercept of the j-th disease, and $\lambda_{i1}$ is the shared component for obesity, which is common to the three cancers under study. $\lambda_{i2}$ is the shared component for low physical activity, which is common to breast and colorectal cancers. Each shared component enters the log RR weighted by a scale parameter, $\delta$, which allows a different risk gradient (on the log scale) for each cancer. The terms $\varepsilon_{ij}$ are heterogeneity effects that capture variation not explained by the other model terms.

Bayesian models assign prior distributions to all unknown parameters, whether fixed or random effects. In the joint model, the shared random effects $\lambda_i$ were given normal conditional autoregressive priors with unit weights for neighboring provinces, to capture local dependence in space. We used a uniform prior distribution for the disease-specific intercepts $\alpha_j$ and independent normal prior distributions for the logarithms of the scaling parameters, $\log \delta$. The terms $\varepsilon_{ij}$ were given a multivariate normal prior distribution whose covariance matrix $\Sigma$ describes the correlation between the cancers. The inverse of this matrix, the precision matrix $\Sigma^{-1}$, was assumed to follow a Wishart(Q, 3) prior distribution, where Q is a diagonal matrix with 1s on the diagonal.^{29-31}

We fitted the BYM and shared component models to the available data using OpenBUGS version 3.1.2, running two independent Markov chains. To assess the convergence of the chains, after visual inspection we applied the Gelman-Rubin and Raftery-Lewis diagnostics using the coda package in R. After a burn-in of 30,000 iterations to remove the effect of the initial values, the following 15,000 iterations were sampled from each of the two chains with lag = 15 to reduce possible autocorrelation. The estimated RRs were then mapped using the GeoBUGS package. The deviance information criterion (DIC) was used to check the appropriateness of the model: the DIC of the joint model was compared with the sum of the DIC values of the three individual BYM models.^{32}
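The thinning and convergence check can be illustrated with a short Python sketch. It computes the basic Gelman-Rubin potential scale reduction factor; it is not the coda implementation used in the study, which applies additional corrections. The chains are simulated stand-ins for MCMC output, and reading lag = 15 as "keep every 15th draw" is an assumption.

```python
import random
from statistics import mean, variance


def thin(chain, lag):
    """Keep every `lag`-th draw to reduce autocorrelation."""
    return chain[::lag]


def gelman_rubin(chains):
    """Basic potential scale reduction factor for chains of equal length."""
    n = len(chains[0])
    within = mean([variance(c) for c in chains])        # W: mean within-chain variance
    between = n * variance([mean(c) for c in chains])   # B: between-chain variance
    pooled = (n - 1) / n * within + between / n          # pooled variance estimate
    return (pooled / within) ** 0.5


# Simulated stand-ins for two chains after burn-in (illustration only).
rng = random.Random(0)
chain_a = thin([rng.gauss(0.0, 1.0) for _ in range(15_000)], 15)
chain_b = thin([rng.gauss(0.0, 1.0) for _ in range(15_000)], 15)

r_hat = gelman_rubin([chain_a, chain_b])
print(len(chain_a), round(r_hat, 3))
```

Values close to 1 suggest that the chains have mixed; values well above 1 indicate that the chains are still exploring different regions of the parameter space.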

RESULTS

Ilam and Tehran provinces had the smallest and largest populations in 2009, with 550,512 and 13,891,781 persons, respectively. In the same year, 6210, 7822 and 3856 cases of colorectal, breast and prostate cancer were reported, respectively. Figure 1 shows the pattern of RR for the studied cancers, estimated separately with the BYM model. According to this figure, colorectal cancer had a high RR in the central, northern and northwestern provinces (RR>1.5: Tehran; 1.2

The estimates of the two shared components under study are presented in Figures 2 and 3. According to the map in Figure 2, the shared component for obesity had a larger effect on cancer incidence in the northern, northwestern and central regions, including the provinces of Ardabil, West Azarbaijan, Gilan, Zanjan, Kurdistan, Qazvin, Tehran, Mazandaran, Hamadan, Kermanshah, Semnan, Golestan, Yazd and Kerman.

The shared component for low physical activity, shown in Figure 3, had a larger effect on cancer incidence in the provinces of Ardabil, Zanjan, Qazvin, Tehran, Mazandaran, Markazi, Lorestan, Kermanshah, Ilam, Khuzestan, South Khorasan, Yazd, Kerman and Fars.

Table 1 displays the posterior median estimates of the scale parameters (level of importance) of each shared component for the different cancers. If the ratio of two weights is greater than one, the shared component is more important for the disease whose weight is in the numerator. Therefore, for a given shared component, the largest estimated scale parameter identifies the disease for which that component is most important.^{28}

Table 1: Posterior median and 95% CrI for weights (level of importance) of three cancers in the shared component model

| Risk factor | Cancer | Median (95% CrI^{*}) |
|---|---|---|
| Obesity | Colorectal | 0.976 (0.444, 2.210) |
| Obesity | Breast | 0.971 (0.432, 2.199) |
| Obesity | Prostate | 0.995 (0.437, 2.146) |
| Low physical activity | Colorectal | 0.985 (0.447, 2.169) |
| Low physical activity | Breast | 0.991 (0.449, 2.146) |

^{*}Bayesian credible interval
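The comparison rule described for Table 1 can be illustrated with the posterior medians reported there. This sketch compares point estimates only and ignores the credible intervals; note also that a ratio of posterior medians is not the same as the posterior median of the ratio. The dictionary layout and function name are illustrative.

```python
# Posterior medians of the scale parameters, taken from Table 1.
weights = {
    "obesity": {"colorectal": 0.976, "breast": 0.971, "prostate": 0.995},
    "low physical activity": {"colorectal": 0.985, "breast": 0.991},
}


def weight_ratio(component, numerator, denominator):
    """Ratio of two median weights; > 1 favours the cancer in the numerator."""
    return weights[component][numerator] / weights[component][denominator]


def most_important_cancer(component):
    """Cancer with the largest median weight for the given shared component."""
    by_cancer = weights[component]
    return max(by_cancer, key=by_cancer.get)


print(round(weight_ratio("obesity", "prostate", "colorectal"), 3))
print(most_important_cancer("obesity"))
print(most_important_cancer("low physical activity"))
```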

Figure 1: Map of RR for (a) colorectal cancer, (b) breast cancer and (c) prostate cancer in 2009, estimated with the Besag et al. (BYM) model

Figure 2: Map of the posterior median of the shared component representing obesity/overweight (including colorectal, breast and prostate cancers)

Figure 3: Map of the shared component representing low physical activity (colorectal and breast cancers)

Cancer is one of the most common causes of death in the world and it is the second leading cause of death in Iran where approximately 70,000 new cases of cancer occur in the country annually.1-2 According to the National Cancer Registry report in 2009, breast, colorectal and prostate cancers were among the most common cancers in all of the Iranian provinces.^{3}

Breast cancer is the most common cancer in the world after the lung cancer,^{4} and it is the most common cancer among Iranian women as well as throughout the world.^{3,5} In fact, about 21.4% of women in Iran who suffer from cancer are among this type of cancer.^{6} Many risk factors of breast cancer have been reported, but it is impossible to identify the specific ones.^{7} In a study, 21% of all deaths in the world related to breast cancer were attributed to overweight, low physical activity and alcohol consumption. Additionally, in high income and low or medium income countries obesity and low physical activity were important modifiable risk factors, respectively.^{8}

On the other hand, colorectal cancer is the second most common cancer in the world,^{9} where nearly a million new cases of this cancer are diagnosed every year and half of these cases resulted to death.^{10} Modifiable risk factors associated with this cancer include poor diet, low physical activity, being overweight, smoking and alcohol consumption.^{11}

Moreover, according to the global cancer statistics, prostate cancer is the second most common cancer in males, as 11.7% of the total new cancer cases are related to this type of cancer. This proportion is 19% and 3.5% in the developed and developing countries, respectively.^{12} The main causes of this disease are still unknown.^{13} Poor diets, being overweight and smoking have been noted the modifiable risk factors for this cancer.^{14}

One of the suitable methods for analysing each set of data is producing and inspecting graphs which display an outstanding feature of the data. In spatial epidemiology this is called disease mapping. Disease mapping acts as an exploratory analysis to gain an impression of the geographical distribution of disease or its risk factors.^{15} The main objectives of disease mapping are to describe the areas with high risk to formulate hypothesis of etiology and provide detailed maps of disease risks in order to allocate the better resources and public health policies.^{16} All of the population-based maps like the maps produced using standardized mortality ratio (SMR) or standardized incidence ratio (SIR) are unbiased estimators of relative risks (RR) that help us to determine the geographical variation of disease incidence or mortality rates.^{17}

But these methods have also some disadvantages: because these indexes are based on the proportion estimation, small changes in the expected values can lead to large changes in risk estimation; When the expected value is zero (or near to zero), the value of these indexes for each positive observed value is too large or impossible to be estimated; Also, these methods don't consider the expected similarities in the relative risk of adjacent or neighbour areas. So, it can be said that it is difficult to make a clear decision based on these criteria.^{18} To resolve these problems, different methods have been proposed. Amongst them Bayesian methods have been emphasized because of greater flexibility in modelling the complexity of data structures and more reliable results. In addition, this method allows us to take into account the spatial correlation of disease rates between neighbouring areas (the tendency of neighbouring areas to be more similar in disease rates) in order to consider the effect of geographical structure, and to provide more realistic estimates of RR.^{17} Many studies have focused on geographical modelling of disease, while many diseases have common risk factors that recently led to the appearance of joint disease mapping.^{19} Joint disease mapping can be defined as the spatial modelling of two or more diseases in two or more subsets of the at risk population.^{17,20} The significant advantages of these models are: Their ability to assess the common and specific patterns of different disease risk; improving estimation accuracy of diseases variation patterns; and determining joint clusters associated with diseases common risk factors.^{20}

In the past two decades, many methods have been proposed for joint disease mapping. The first study that introduced joint disease mapping has been performed by Langford et al.^{21} and Leyland et al.^{22} Then, a shared component model has been proposed for detecting joint and selective clustering of two diseases.^{23} After this study, Held et al. developed a shared component model for more than two diseases.^{24} Moreover, in another study, four methods of joint modelling were compared and it was concluded that the shared component model adds more versatility in answering epidemiological basic questions.^{25}

Mahaki and colleagues studied the spatial distribution of latent risk factors including smoking, obesity, inadequate consumption of fruit and vegetables, socioeconomic status and low physical activity which were in common among seven most prevalent cancers esophagus, stomach, bladder, colorectal, lung, prostate and breast cancers using a shared component model for data in 2007.^{26}

In another study, Chamanpara and colleagues modelled the geographical variation of esophagus and gastric cancers jointly using the data from 2004 to 2008, in Golestan, Iran where diet low in fruit and vegetable intake was considered as a shared component.^{27}

Because of the inherent relationship between these cancers, we can assume, among the risk factors mentioned for breast, colorectal and prostate cancers, that obesity is a common risk factor for all these cancers, while low physical activity is common for breast and colorectal cancers. So, in this study, we intended to use a Bayesian shared component model for joint modelling of these three cancer incidence rates in Iran, in order to explore the pattern of spatial correlation among them, and to estimate the relative weight of the shared risk factors, obesity and low physical activity, in the population of Iran in 2009.

METHODS

In this study, we applied new cases of colorectal (ICD10 code C18-C20, C26), prostate (C61) and breast (C50) cancers in all provinces in 2009 that reported by No communicable Disease Management Centre of the ministry of Health and Medical Education. According to the obtained censuses in 2006 and 2011, the total population of the country was 70495782 and 75149669 persons, respectively. Since we have used the new observed cases of cancers in 2009, we considered at risk population as the proportion of the population reported in two censuses (0.6× the population in 2006 + 0.4 × the population in 2011) and it was estimated as 72357336.8. In this article, we used a shared component model proposed by Held et al.^{24} for jointly modelling of the spatial variations for showing the incidence rates of cancers. We considered obesity or overweight (shared risk factor for all three cancers) and low physical activity (shared between breast and colorectal cancers) as the latent shared risk factors. In fact, a common feature of the model is, considering the shared components (obesity or low physical activity) as the dominant surrogates of all common and latent risk factors of these cancers. So, the result of joint maps shows the spatial variation of all unobserved spatially-structured risk factors that effect on diseases where the understudy components have been chosen as representative of them.^{28}

Let O _{i j} and E _{i j} represent the number of the observed and expected cases for j-th disease in i-th province. The expected number of cases in each province is calculated by multiplying the total incidence rate of disease and population of province. Let O _{i j} follows a Poisson distribution with mean μ _{i j}=E _{i j}.R _{i j} where E _{i j} and R _{i j} are the number of expected cases and the relative risk for disease j in the province i, respectively. The R _{i j} is unknown parameter of the model. Maximum likelihood estimation of the incidence rates, (R ̂ _{i j}), is obtained by dividing the number of the observed cases by its expected value of j-th disease in the i-th province.

In addition, to consider the information of the adjacent neighbors of each province, we used the popular BYM model. In this model, the logarithm of the relative risk of j-th disease in i-th province was written as below:

log(R _{i j})= α_{j}+u_{i j}+v_{i j}

Where α_{j} is an intercept, u_{i j} and v_{i j} are the structured and unstructured random effects. The random effect v_{i j} (uncorrelated heterogeneity) is a component that models the effect of unstructured dispersion between regions and it follows a normal distribution with zero mean and variance of . The structured random effect, u_{i j}, (correlated heterogeneity), considers local dependence in space and assumes weight for adjacent areas. Also, this component model of the conditional autoregressive normal (CAR normal) assumes the conditional distribution of each area-specific structured component with a mean equal to the average of its neighbors, and variance inversely proportional to the number of these neighbors. So we have:

Where l shows adjacent provinces with the province i (i= 1, 2, ..., 287) and n_{i} shows the number of adjacent provinces.

On the other hand, in this study, a Bayesian shared component model proposed by Held et al.^{24} was used for the joint analysis of the spatial distribution of three cancer incidence rates. The obesity and low physical activity were considered as the shared components. Thus in this model, the logarithm of RRs is as below:

Where the R_{_i1} , R_{_i2}, and R_{_i3} are the RRs of colorectal, breast and prostate cancers in i-th province, respectively. The parameter α_{_j} is the interceptor of j-th disease and the parameter λ_{_i1} is the shared component of obesity that is common for three understudy cancers. The λ_{_i2} is the share component of low physical activity that is common among the breast and colorectal cancer. Each shared component related to RR weighted by the scale parameter, δ, to allow different risk gradient (on the log scale). The terms ε_{_i j }are the heterogeneous effects to capture possible variations not explained by the other model terms.

The Bayesian models allocated priors to unknown parameters, whether fixed or random effects. In the joint model, the shared random effects, λ_{_i}, were considered normal conditional autoregressive priors with unit weights for the neighboring provinces to capture local dependence in space. We considered a uniform prior distribution for the intercept that is specific to each disease α_{_j}, independent normal prior distributions were used for the logarithms of the scaling parameters, log δ, and for the terms ε_{_i j}, assuming a multivariate normal prior distribution with covariance matrix showing the correlation between the cancers. The inverse of this matrix known as a precision matrix, Σ-1, models to arise from a Wishart (Q,3) prior distribution, where Q is set to be a diagonal matrix with 1s.^{29-31}

We fitted the BYM and shared the components model to the available data using Open BUGS version 3.1.2. We considered two independent Markov chains. To ensure the convergence of chains, after visual inspections, we used Gelman-Rubin and Raftery-Lewis diagnostic tests via R using the coda package. After a sufficient (30,000) burn-in to remove the effects of the initials, the following 15,000 iterations were sampled from each of the two chains choosing lag=15 to avoid possible autocorrelation. The estimated RRs were subsequently mapped to GeoBUGS package. Also, for checking the appropriateness of the model, the deviance information criterion (DIC) was used. In this case, the DIC of the joint model was compared with the sum of the DIC values from the three individual BYM models.^{32}

RESULTS

Ilam and Tehran provinces had the minimum and maximum population in 2009, 550512 and 13891781 persons, respectively. Also in this year, the number of colorectal, breast and prostate cases was reported as 6210, 7822 and 3856, respectively. Figure 1 shows the pattern of RR for studied cancers separately estimated by BYM model. According to this Figure the colorectal cancer had high RR in the central, north and northwest provinces (RR>1.5: Tehran; 1.2

The estimations of the two understudy shared components are presented in Figures 2 and 3. According to the map in Figure 2 the shared component of obesity had more effect on cancer incidence in the north, northwest and central regions, including the provinces of Ardabil, West Azarbaijan, Gilan, Zanjan, Kurdistan, Qazvin, Tehran, Mazandarn, Hamedan, Kermanshah, Semnan, Golestan, Yazd and Kerman.

The shared component of low physical activity, shown in Figure 3, had a larger effect on cancer incidence in the provinces of Ardabil, Zanjan, Qazvin, Tehran, Mazandaran, Markazi, Lorestan, Kermanshah, Ilam, Khuzestan, South Khorasan, Yazd, Kerman and Fars.

Table 1 displays the posterior median estimates of the scale parameters (levels of importance) of each shared component for the different cancers. If the ratio of two weights is greater than one, the shared component is more important for the disease whose weight is in the numerator. Therefore, the largest estimated scale parameter for a given shared component identifies the disease for which that component is most important.^{28}

Table 1: Posterior median and 95% CrI for weights (level of importance) of three cancers in the shared component model

| Risk factor | Cancer | Median (95% CrI^{*}) |
| --- | --- | --- |
| Obesity | Colorectal | 0.976 (0.444, 2.210) |
| Obesity | Breast | 0.971 (0.432, 2.199) |
| Obesity | Prostate | 0.995 (0.437, 2.146) |
| Low physical activity | Colorectal | 0.985 (0.447, 2.169) |
| Low physical activity | Breast | 0.991 (0.449, 2.146) |

^{*}Bayesian credibility interval

Figure 1: Maps of RR for (a) colorectal cancer, (b) breast cancer and (c) prostate cancer in 2009, estimated with the Besag, York and Mollie (BYM) model

Figure 2: Map of the posterior median of the shared component representing obesity/overweight (including colorectal, breast and prostate cancers)

Figure 3: Map of the posterior median of the shared component representing low physical activity (including colorectal and breast cancers)

As seen in Table 1, the posterior medians of the weights related to the obesity component were ẟ_{11}= 0.9755, ẟ_{12}= 0.9707, and ẟ_{13}= 0.995. Thus ẟ_{11}/ẟ_{12}= 1.005, which shows that the obesity component was slightly more associated with colorectal cancer than with breast cancer. Moreover, because ẟ_{13} was the largest weight, the obesity shared component was more strongly related to prostate cancer than to the other cancers (ẟ_{13}/ẟ_{12}= 1.025, and ẟ_{13}/ẟ_{11}= 1.020).

Also, the posterior medians of the weights related to low physical activity were ẟ_{21}= 0.9845 and ẟ_{22}= 0.9906, showing that this component was slightly more related to breast cancer than to colorectal cancer (ẟ_{22}/ẟ_{21}= 1.006).
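
The weight ratios quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the dictionary layout and the function name are hypothetical, the colorectal and breast values are the posterior medians quoted in the text, and the prostate value is taken from Table 1 (0.995, reported to three decimals).

```python
# Posterior medians of the scale parameters (weights) for each component.
weights = {
    "obesity": {"colorectal": 0.9755, "breast": 0.9707, "prostate": 0.995},
    "low_physical_activity": {"colorectal": 0.9845, "breast": 0.9906},
}


def weight_ratio(component, numerator, denominator):
    """Ratio of two weights for one shared component; a value above 1 means
    the component matters more for the cancer in the numerator."""
    w = weights[component]
    return w[numerator] / w[denominator]


ratios = {
    "obesity: colorectal/breast": weight_ratio("obesity", "colorectal", "breast"),
    "obesity: prostate/breast": weight_ratio("obesity", "prostate", "breast"),
    "obesity: prostate/colorectal": weight_ratio("obesity", "prostate", "colorectal"),
    "low activity: breast/colorectal": weight_ratio(
        "low_physical_activity", "breast", "colorectal"
    ),
}

for label, value in ratios.items():
    print(f"{label}: {value:.3f}")
```

All four ratios are only slightly above 1, and the 95% credible intervals in Table 1 overlap almost entirely, so the ranking of the cancers within each component rests on small differences in the posterior medians.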

Finally, the DIC of the joint model was 668.8, and the sum of the DIC values of the BYM models for the three diseases was 669.6. The DIC was therefore slightly lower (by 0.8) for the joint model, indicating an advantage of modeling the diseases jointly over modeling them individually. This improvement in DIC is due to the reduction in the posterior deviance and the effective number of parameters of the joint model compared to the individual models.^{28}
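
The model comparison reduces to a subtraction of the two reported totals. The sketch below shows that step, together with the usual decomposition of the DIC into the posterior mean deviance plus the effective number of parameters (Spiegelhalter et al.^{32}). The function and variable names are hypothetical, and the separate deviance and parameter terms are not reported in this paper, so only the totals are used.

```python
def dic(mean_deviance, effective_parameters):
    """DIC as posterior mean deviance plus effective number of parameters."""
    return mean_deviance + effective_parameters


# Totals reported in the text.
dic_joint = 668.8          # shared component (joint) model
dic_separate_sum = 669.6   # sum over the three individual BYM models

# A positive difference means the joint model has the lower (better) DIC.
delta = dic_separate_sum - dic_joint
preferred = "joint" if delta > 0 else "separate"

print(f"DIC difference (separate - joint): {delta:.1f}")
print(f"Model with the lower DIC: {preferred}")
```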

DISCUSSION

In this study, we applied the shared component model proposed by Held et al.^{24} to examine the spatial pattern of risk factors shared between the three most common cancers in Iran. We also described the data sources, the method for calculating expected values, and the model's assumptions and structure, which could be used to perform similar analyses. Moreover, the reported RR value for each province shows the risk for a person who lives in that province relative to the total population.

The separate mapping results based on the estimated RRs showed the spatial variation of cancer incidence rates in the country and identified high-risk provinces. In general, the provinces of Tehran, Semnan, Isfahan and Yazd had higher incidence rates for at least two cancers (RR>1.2). In addition, Mazandaran and Markazi provinces for all three cancers, and East Azarbaijan, Fars and Gilan provinces for at least two cancers, had a relative risk greater than one (RR>1). According to Figures 2 and 3, both risk factors under study were generally more common in the north and centre of the country.

The results of this study were in accordance with those of the study conducted by Mahaki et al. on the obesity and low physical activity components using data from 2007. In both studies, the provinces of Gilan, Mazandaran, Golestan, Semnan, Tehran and Qazvin for the obesity shared component, and the provinces of Mazandaran, Ardabil, Tehran, Yazd, Kerman, Lorestan, Markazi and Khuzestan for the low physical activity shared component, had RR>1.^{26} Moreover, Chamanpara et al. concluded that the component representing a diet low in fruit and vegetable intake had a larger effect on cancer incidence in the northern half of the target area (RR>1).^{27}

One of the outstanding features of the shared component model is that it allows us to estimate the weights (importance levels) of the components for each disease; this estimate shows the importance of each latent component for each relevant disease. In addition, using the DIC, we found that joint modeling of the three cancers under study was better than modeling them individually with the BYM model. Despite these advantages, this study has some limitations that must be noted. We assumed independence between the shared components when fitting the model, but in reality there may be interaction between them. In addition, provinces that border other countries have missing neighbours. This phenomenon, called the edge effect, may occur in other similar studies and may lead to over- or underestimation. There may also remain other possible confounding risk factors that were not considered.^{23-24}

Another limitation is that data on the risk factors under study were not available at the individual level, so we carried out the analysis at the provincial level using data on the relevant diseases and included them as covariates in the model. Because of this limitation, ecological bias cannot be excluded and no explicit causal conclusion can be drawn; the risk estimates at the area level may not reflect the risk at the individual level.^{33}

Given the above limitations, the derived maps should be interpreted with caution. Since most cancers have long latency periods, and many years may pass between exposure to the risk factors and disease diagnosis, an important extension of this model would be one that includes the time dimension.^{17}

CONFLICTS OF INTEREST

All contributing authors declare no conflicts of interest.

ACKNOWLEDGEMENT

This article was part of an M.Sc. thesis in Biostatistics and was supported by Hamadan University of Medical Sciences.

References

1. Kolahdoozan S, Sadjadi A, Radmard AR, Khademi H. Five common cancers in Iran. Arch Iran Med. 2010; 13(2): 143-6.

2. Ramezani Gourabi B. Recognition of geographical diffusion Esophagus Cancer in Southwestern of Caspian Sea. J Am Sci. 2011; 7(2): 297-302.

3. Ministry of Health and Medical Education. National cancer registration report 2009. Tehran: Ministry of Health and Medical Education; 2012.

4. Stewart BW, Wild CP. World Cancer Report, 2014. Available at: Oncology-News/World-Cancer-Report-2014. Accessed 14 July 2014.

5. World Health Organization. Breast cancer: prevention and control. Available at: http://www.who.int/cancer/detection/breastcancer/en/. Accessed.

6. Noroozi A, Jomand T, Tahmasebi R. Determinants of breast self-examination performance among Iranian women: an application of the health belief model. J Cancer Educ. 2011; 26(2): 365-74.

7. Lacey JV Jr, Kreimer AR, Buys SS, Marcus PM, Chang SC, Leitzmann MF, et al. Breast cancer epidemiology according to recognized breast cancer risk factors in the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial Cohort. BMC Cancer. 2009; 9: 84.

8. Danaei G, Vander Hoorn S, Lopez AD, Murray CJ, Ezzati M; Comparative Risk Assessment collaborating group (Cancers). Causes of cancer in the world: comparative risk assessment of nine behavioural and environmental risk factors. Lancet. 2005; 366(9499): 1784-93.

9. International Agency for Research on Cancer. World cancer factsheet. World Health Organization. www.cancerresearchuk.org 2012.

10. Stone WL, Krishnan K, Campbell SE, Qui M, Whaley SG, Yang H. Tocopherols and the treatment of colon cancer. Ann N Y Acad Sci. 2004; 1031: 223-33.

11. American Cancer Society. What are the risk factors for colorectal cancer? Available at: http://www.cancer.org/cancer/colonandrectumcancer/moreinformation/colonandrectumcancerearlydetection/colorectal-cancer-early-detection-risk-factors-for-crc. Accessed 17 July 2014.

12. Parkin DM, Bray F, Ferlay J, Pisani P. Global cancer statistics, 2002. CA Cancer J Clin. 2005; 55(2): 74-108.

13. Hsing AW, Devesa SS. Trends and patterns of prostate cancer: what do they suggest? Epidemiol Rev. 2001; 23(1): 3-13.

14. American Cancer Society. What are the risk factors for prostate cancer? Available at: http://www.cancer.org/cancer/prostatecancer/detailedguide/prostate-cancer-risk-factors. Accessed 17 July 2014.

15. Berke O. Exploratory disease mapping: kriging the spatial risk function from regional count data. Int J Health Geogr. 2004; 3(1): 18.

16. Lawson AB, Biggeri AB, Boehning D, Lesaffre E, Viel JF, Clark A, et al. Disease mapping models: an empirical evaluation. Disease Mapping Collaborative Group. Stat Med. 2000; 19(17-18): 2217-41.

17. Tzala E, Best N. Bayesian latent variable modelling of multivariate spatio-temporal variation in cancer mortality. Stat Methods Med Res. 2008; 17(1): 97-118.

18. Clayton D, Kaldor J. Empirical Bayes estimates of age-standardized relative risks for use in disease mapping. Biometrics. 1987; 43(3): 671-81.

19. Assuncao RM, Castro MS. Multiple cancer sites incidence rates estimation using a multivariate Bayesian model. Int J Epidemiol. 2004; 33(3): 508-16.

20. Dabney AR, Wakefield JC. Issues in the mapping of two diseases. Stat Methods Med Res. 2005; 14(1): 83-112.

21. Langford IH, Leyland AH, Rasbash J, Goldstein H. Multilevel modelling of the geographical distributions of diseases. J R Stat Soc Ser C Appl Stat. 1999; 48(2): 253-68.

22. Leyland AH, Langford IH, Rasbash J, Goldstein H. Multivariate spatial models for event data. Stat Med. 2000; 19(17-18): 2469-78.

23. Knorr-Held L, Best NG. A shared component model for detecting joint and selective clustering of two diseases. J R Stat Soc Ser A Stat Soc. 2001; 164(1): 73-85.

24. Held L, Natario I, Fenton SE, Rue H, Becker N. Towards joint disease mapping. Stat Methods Med Res. 2005; 14(1): 61-82.

25. Manda SM, Feltbower RG, Gilthorpe MS. Review and empirical comparison of joint mapping of multiple diseases. South Afr J Epidemiol Infect. 2011; 27(4): 169-82.

26. Mahaki B, Mehrabi Y, Kavousi A, Akbari ME, Waldhoer T, Schmid VJ, et al. Multivariate disease mapping of seven prevalent cancers in Iran using a shared component model. Asian Pac J Cancer Prev. 2011; 12(9): 2353-8.

27. Chamanpara P, Moghimbeigi A, Faradmal J, Poorolajal J. Joint disease mapping of two digestive cancers in Golestan province, Iran: Using a Shared Component Model. Osong Public Health Res Perspect. 2015.

28. Dreassi E. Polytomous disease mapping to detect uncommon risk factors for related diseases. Biom J. 2007; 49(4): 520-9.

29. Best N, Hansell AL. Geographic variations in risk: adjusting for unmeasured confounders through joint modeling of multiple diseases. Epidemiology. 2009; 20(3): 400-10.

30. Onicescu G, Hill EG, Lawson AB, Korte JE, Gillespie MB. Joint disease mapping of cervical and male oropharyngeal cancer incidence in blacks and whites in South Carolina. Spat Spatiotemporal Epidemiol. 2010; 1(2-3): 133-41.

31. Downing A, Forman D, Gilthorpe MS, Edwards KL, Manda SO. Joint disease mapping using six cancers in the Yorkshire region of England. Int J Health Geogr. 2008; 7: 41.

32. Spiegelhalter DJ, Best NG, Carlin BP, Van Der Linde A. Bayesian measures of model complexity and fit. J R Stat Soc Series B. 2002; 64(4): 583-639.

33. Macnab YC. Bayesian multivariate disease mapping and ecological regression with errors in covariates: Bayesian estimation of Dalys and 'preventable' Dalys. Stat Med. 2009; 28(9): 1369-85.