# Spatial and Temporal Distribution of *Plasmodium vivax* Malaria in Korea Estimated with a Hierarchical Generalized Linear Model


## Abstract

### Objectives

Spatial and temporal correlations were estimated with a hierarchical generalized linear model to determine the transmission pattern of *Plasmodium vivax* malaria in Korea from 2001 to 2011.

### Methods

Malaria cases reported to the Korea Centers for Disease Control and Prevention from 2001 to 2011 were analyzed with descriptive statistics, and the incidence was estimated according to age, sex, and year with the hierarchical generalized linear model. Spatial and temporal correlations were estimated and the best model was selected from nine candidate models. Results are presented as disease maps according to age and sex.

### Results

The incidence according to age was highest in the 20–25-year-old group (244.52 infections/100,000). After estimation, the mean ages of infected males and females were 31.0 years and 45.3 years, with incidences of 7.8 infections/100,000 and 7.1 infections/100,000, respectively. The mean month of infection was mid-July, with an incidence of 10.4 infections/100,000. The best-fit model showed both spatial and temporal correlation in malarial transmission. When the 20–29-year-old male group was omitted from the disease map, incidence was very low or negligible in areas distant from the demilitarized zone between the Republic of Korea and the Democratic People’s Republic of Korea (North Korea).

### Conclusion

Malarial transmission in a region of Korea was influenced by the incidence in adjacent regions and by the incidence in recent years. Since malaria in Korea mainly originates from mosquitoes from North Korea, the incidence should continue to decrease if there is no further outbreak in North Korea.

**Keywords:** epidemiology; incidence; Korea; linear models; malaria; *Plasmodium vivax*

## 1. Introduction

Malaria caused by infection with *Plasmodium vivax* has been a public health concern in Korea (Republic of Korea) since 1993. At that time, it was suggested that the prevalence would soon decrease, since the spread of mosquitoes from North Korea was the main cause of the re-emerging infection [1]. However, the incidence has not decreased rapidly since then, and it was still fluctuating more than 10 years later because of fluctuations in malaria incidence in North Korea [2]. *P. vivax* is the only species that causes malaria in the Korean Peninsula. To estimate the annual pattern of malaria incidence by age and sex in Korea, the hierarchical generalized linear model (HGLM) was applied. Spatial and temporal correlations were also included in the model to obtain the best estimates. The estimated incidence is presented as a disease map.

To draw a disease map by estimating incidence in small regions, empirical Bayesian and hierarchical Bayesian estimation methods have been used [3,4]. An alternative method is the HGLM [5]. The Bayesian approach requires prior distributions to be assumed for the parameters, whereas the HGLM is based on the hierarchical likelihood, so the parameter estimates are not sensitive to a wrongly chosen prior. Bayesian approaches also require complicated computation, such as Gibbs sampling, for parameter estimation. The HGLM does not require such procedures and can be fitted easily with statistical packages.

The main questions in this analysis are as follows: Is malaria incidence in one region correlated with that in adjacent regions? Is malaria incidence in one year correlated with that in the preceding year? What are the characteristics of malaria incidence according to age and sex? Which HGLM best represents the spatial and temporal correlation of malaria incidence?

The results may help the prediction of malaria transmission according to time and geographic location, and may provide strategies for malaria prevention in Korea.

## 2. Materials and Methods

### 2.1. Data collection and management

Cases of malaria infection are reported to the Korea Centers for Disease Control and Prevention (CDC) by medical clinics, hospitals, and health centers according to the “Law on the prevention and control of infectious diseases” in Korea. All reported cases are *P. vivax* malaria. Data from 2001–2011 were collected by the CDC. The raw data include age, sex, address (city or county), and month and year of occurrence.

Data were prepared for the analysis as follows. The variables for infected persons were mean age, sex, address, mean month of occurrence, and year of occurrence. Address was classified into 232 cities or counties. Data were grouped according to sex, address, and year. The age of each group was the mean age of the infected persons in that group, and the month of infection was the mean of their months of infection. The number of records entered into the analysis was therefore 5104 (2 sexes × 232 cities or counties × 11 years). Incidence in each city or county was standardized as the number of infections per 100,000 individuals. The population of each city or county in each year was obtained from the website of Statistics Korea, available from: http://kostat.go.kr.
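The grouping and standardization described above can be sketched in a few lines. This is an illustrative Python example, not the authors' R code: the case records, the region name, and the population figures are all hypothetical, and groups with no cases are simply absent from the result here, whereas the study design has one cell for every combination of sex, region, and year.

```python
from collections import defaultdict

# Hypothetical case records: (sex, region, year, age, month of onset).
cases = [
    ("male", "Region A", 2011, 22, 7),
    ("male", "Region A", 2011, 24, 8),
    ("female", "Region A", 2011, 45, 7),
]

# Hypothetical population of each (sex, region, year) group.
population = {
    ("male", "Region A", 2011): 50_000,
    ("female", "Region A", 2011): 40_000,
}


def summarize(cases, population):
    """Group cases by (sex, region, year) and return, per group, the case
    count, mean age, mean month, and incidence per 100,000 individuals."""
    groups = defaultdict(list)
    for sex, region, year, age, month in cases:
        groups[(sex, region, year)].append((age, month))

    summary = {}
    for key, rows in groups.items():
        n = len(rows)
        summary[key] = {
            "cases": n,
            "mean_age": sum(age for age, _ in rows) / n,
            "mean_month": sum(month for _, month in rows) / n,
            "incidence_per_100k": n * 100_000 / population[key],
        }
    return summary


summary = summarize(cases, population)

# Number of sex x region x year cells in the study design.
n_cells = 2 * 232 * 11
```

With these made-up inputs, the male group has a mean age of 23.0 years and a mean month of 7.5, which is how a value such as "mid-July" arises from averaging months of onset.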

All the results in this paper can be obtained with the dhglmfit() function in the R package dhglm, available from the Comprehensive R Archive Network (CRAN) [6].

### 2.2. Statistical model

#### 2.2.1. Equation

As the response variable, we analyzed the observed numbers $y_{itk}$ of malaria cases, where $i$ indexes the city or county ($1, \ldots, 232$), $t$ the year ($2001, \ldots, 2011$), and $k$ the sex (1: male, 2: female). A Poisson HGLM with the log link was considered, with the following terms:

$\eta_{itk}$ = linear predictor; $\mu_{itk}$ = conditional expected number of cases; $N_{itk}$ = population of the group (city or county), obtained from Statistics Korea according to sex and year; $\beta_0$ = intercept; $\beta_1$ = slope; $I(\cdot)$ = indicator function; $\mathit{age}_{ik}$ = mean age at infection between 2001 and 2011 in the $i$th city or county; $\mathit{month}_i$ = mean month of infection between 2001 and 2011 in the $i$th city or county; $b(\mathit{age}_{ik})$ = smoothing spline on the mean age at infection; $c(\mathit{month}_i)$ = smoothing spline on the mean month of infection; $u_i$ = random spatial effects; $v_{it}$ = random temporal effects.

Three models were considered for the spatial effect and three for the temporal effect, giving a total of nine models. Among the nine models, the one with the lowest conditional Akaike Information Criterion (AIC) was selected as the best-fitting model of malaria distribution in Korea from 2001 to 2011.
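The displayed model equation is not reproduced in this text. A form consistent with the terms defined above is sketched here as an assumption, not as the authors' exact specification: a Poisson model with a log link and the population as an offset. The argument of the indicator function is not stated in this text; the sketch assumes that it codes sex ($k = 2$, female).

$$
\begin{aligned}
y_{itk} \mid u_i, v_{it} &\sim \mathrm{Poisson}(\mu_{itk}), \\
\eta_{itk} = \log \mu_{itk} &= \log N_{itk} + \beta_0 + \beta_1 I(k = 2) + b(\mathit{age}_{ik}) + c(\mathit{month}_i) + u_i + v_{it}
\end{aligned}
$$

Under this reading, $\exp(\eta_{itk} - \log N_{itk})$ is the expected incidence per person, which corresponds to the incidence per 100,000 individuals after multiplication by 100,000.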

#### 2.2.2. Spatial models for $u_i$

Spatial Model 1 is the independent model, where the incidence of a region is assumed to be independent of adjacent regions.

$u_i \sim \text{i.i.d. } N(0, \lambda_1)$

Spatial Models 2 and 3 assume spatial effects where the incidence of a region is affected through adjacent regions.

Spatial Model 2 is the intrinsic autoregressive (IAR) model of Besag and Higdon [7], $u_i \sim \mathrm{IAR}$ with kernel $\sum_{i \sim j} (u_i - u_j)^2 / \lambda_1$.

Spatial Model 3 is the Markov random field (MRF) model [8], $u_i \sim \mathrm{MRF}$ with $\mathrm{var}(u)^{-1} = (I - \rho N)/\lambda_1$, where $\rho$ is the spatial correlation and $N$ is the incidence matrix for neighboring cities or counties.
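The two spatially correlated models can be made concrete with a toy example. The following Python sketch is illustrative only and is not the authors' R script: it uses a hypothetical map of four regions arranged in a row, made-up values for the spatial effects, and assumes that each neighboring pair is counted once in the IAR kernel.

```python
# Hypothetical neighbor (incidence) matrix N for four regions 0-1-2-3 in a row:
# N[i][j] = 1 when regions i and j share a border, 0 otherwise.
N = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]


def iar_kernel(u, N, lam):
    """IAR kernel: sum of (u_i - u_j)^2 over neighboring pairs i ~ j,
    divided by lambda. Each unordered pair is counted once (an assumption)."""
    total = 0.0
    n = len(u)
    for i in range(n):
        for j in range(i + 1, n):
            if N[i][j]:
                total += (u[i] - u[j]) ** 2
    return total / lam


def mrf_precision(N, rho, lam):
    """MRF inverse covariance matrix: (I - rho * N) / lambda."""
    n = len(N)
    return [
        [((1.0 if i == j else 0.0) - rho * N[i][j]) / lam for j in range(n)]
        for i in range(n)
    ]


u = [2.0, 1.0, 1.0, 0.0]  # hypothetical spatial effects
kernel = iar_kernel(u, N, lam=0.5)
Q = mrf_precision(N, rho=0.25, lam=0.5)
```

In the MRF precision matrix, the off-diagonal entries are nonzero only for neighboring regions, so setting $\rho = 0$ recovers the independent Spatial Model 1.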

#### 2.2.3. Temporal models for $v_{it}$

Temporal Model 1 is the independent model; the incidence of each year does not depend upon the previous years,

$v_{it} \sim \text{i.i.d. } N(0, \lambda_2)$

Temporal Models 2 and 3 assume temporal effects, where the incidence of each year depends upon the previous year.

Temporal Model 2 is the random walk model, where $v_{it} = v_t \sim \mathrm{RW}$ with kernel $\sum_t (v_t - v_{t-1})^2 / \lambda_2$.

Temporal Model 3 is the first-order autoregressive model, $v_{it} \sim \mathrm{AR}(1)$ with kernel $\sum_i \sum_t (v_t - \rho v_{t-1})^2 / \lambda_2$.
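The relation between the two temporally correlated models can be checked numerically. The following Python sketch is illustrative and is not the authors' R script: the yearly effects are made-up values for a single region, so the outer sum over regions in the AR(1) kernel is omitted.

```python
def rw_kernel(v, lam):
    """Random walk kernel: sum over t of (v_t - v_{t-1})^2, divided by lambda."""
    return sum((v[t] - v[t - 1]) ** 2 for t in range(1, len(v))) / lam


def ar1_kernel(v, rho, lam):
    """AR(1) kernel for one region: sum over t of (v_t - rho * v_{t-1})^2,
    divided by lambda."""
    return sum((v[t] - rho * v[t - 1]) ** 2 for t in range(1, len(v))) / lam


v = [1.0, 2.0, 2.0, 4.0]  # hypothetical yearly effects for one region
rw = rw_kernel(v, lam=2.0)
ar_half = ar1_kernel(v, rho=0.5, lam=2.0)
ar_one = ar1_kernel(v, rho=1.0, lam=2.0)
```

With $\rho = 1$ the AR(1) kernel equals the random walk kernel, so Temporal Model 3 can be read as a generalization of Temporal Model 2 in which the dependence on the previous year is estimated rather than fixed.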

All script descriptions for each model are in the supplement.

### 2.3. Disease mapping

The map of Korea was obtained from the National Geographic Information Institute. The incidence for each region (city or county) and year was entered, and regions were colored according to incidence level. MapWizard for Excel was used to draw the disease maps [9].

## 3. Results

### 3.1. The best-fit model

The combination of Spatial Model 3 and Temporal Model 3 had the lowest conditional AIC value. Results of the fitted model are summarized in Table 1.

### 3.2. Temporal distribution of malaria

The change in malaria incidence in Korea according to year and sex is shown in Figure 1. The total number of malaria cases in Korea from 2001 to 2011 was 17,044, comprising 14,269 males and 2,775 females. The incidence decreased between 2001 and 2004 and increased again until 2007. It then decreased in 2008 and increased until 2010. Monthly incidence peaked in July and August, at 8.15 infections/100,000 and 8.53 infections/100,000, respectively (Figure 2). The mean month of malaria incidence was mid-July, with 10.2 infections/100,000 after estimation by the best-fit HGLM (Figure 3).

### 3.3. Spatial distribution of malaria

The disease map showing the annual malaria incidence in each city or county by year is presented in Figure 4. The incidence was highest in the area near the demilitarized zone (DMZ) in the northern part of Korea.

### 3.4. Incidence according to sex and age

The incidence according to age was highest in the 20–25-year-old group, at 244.52 infections/100,000 (Figure 5). The smoothing spline chart of the incidence distribution according to age after estimation is shown in Figure 6. After estimation by the best-fit HGLM, the mean age of infected males was 31.0 years and that of infected females was 45.3 years, with incidences of 7.8 infections/100,000 and 7.1 infections/100,000, respectively.

### 3.5. Spatial distribution of malaria for males only, males excluding 20–29-year-olds, and females only in 2011

The disease map for males only in 2011 was nearly identical to the overall disease map of malaria in 2011 (Figure 3). When males aged 20–29 years were omitted, the disease map showed low or negligible incidence in areas other than the northern part of Korea. The cities or counties not adjacent to the DMZ with countable incidence were Uljin, Ichon, Boryeong, Gunsan, and Iksan. The disease map for females only showed that, apart from the regions adjacent to the DMZ, only two cities or counties had countable incidence, namely Jincheon and Hongseong (Figure 7).

## 4. Discussion

The combination of Spatial Model 3 (the Markov random field model) and Temporal Model 3 (the first-order autoregressive model) provided the best fit to malaria incidence in Korea. According to the best-fit model, malaria incidence was significantly correlated between adjacent regions, and the incidence in a given year was significantly correlated with that of recent years.

Since malaria in Korea is primarily transmitted by mosquitoes from North Korea, the distance from the DMZ is important (Figure 1) [1]. Therefore, the incidence rate of a region should be affected by the incidence of adjacent regions. Also, when there is a malaria infection in an area distant from the DMZ, the infected mosquito can move to another adjacent region. The annual incidence is affected by that of recent years since there might be a dormant infection of malaria with a long incubation period [10]. Also the infected mosquito can survive into the next year. The best-fit model can explain these phenomena in Korea very well (Figure 3).

The peak incidence was seen in the 20–24-year-old group (Figure 5). This is because the population most vulnerable to malaria consists of soldiers who work near the DMZ and discharged soldiers who were infected during military service. As for females, middle-aged women in rural areas work actively on farms, where the density of mosquitoes is higher than in urban areas (Figure 6).

According to the pattern of annual incidence, the incidence of malaria in Korea might decrease year by year. The epidemiological pattern in Korea is typical of unstable malaria: transmission or incidence varies from year to year and outbreaks sometimes occur [2]. Re-emergence depends on the malaria infection status in North Korea [11]. If the public health status of North Korea does not deteriorate rapidly, the malaria incidence in North Korea will decrease. Although there is a regional cycle of malaria infection in a few regions in Korea, the incidence is very low except in areas adjacent to the DMZ. All malaria cases are now reported to the CDC of the Republic of Korea and can be treated well with a standard regimen of chloroquine and primaquine. A few cases of chloroquine-resistant strains have been reported [12], but this does not explain the persistence of incidence. It has been said that malaria is endemic in Korea, since there is a regional cycle of malaria infection in a few regions as well as rapid dissemination of newly introduced genotypes [13]. However, since malaria incidence is limited to regions adjacent to the DMZ, continuous transmission of malaria throughout the whole country is unlikely. These findings are well illustrated by the disease maps in Figure 7, where the map of males only excluding the 20–29-year-old group and the map of females only show very low or negligible incidence except in a few regions. The incidence of infection in men is nearly the same as the incidence in both sexes combined. The incidence in the following year could be estimated with a further regression model within the HGLM framework.

In conclusion, in estimating malaria incidence in Korea from 2001–2011, the model combining spatial and temporal effects gave the best fit. According to the model, the incidence in a region was influenced by that in adjacent regions, and the incidence in a given year was affected by that of the preceding year. Malaria incidence in Korea has shown a decreasing trend over recent years, which should continue if there is no further outbreak of malaria in North Korea.

Supplement. R script for the estimation of *Plasmodium vivax* malaria from data reported to the Korea Centers for Disease Control and Prevention, from 2001–2011.

## Acknowledgements

This work was supported by a research grant from the Centers for Disease Control and Prevention, Republic of Korea, in 2012 (2012E2400100).