Suggesting a new alternative method of measuring mental disorders without the use of Paper-Pencil tests based on EEG

Document Type: Original Article


1 Psychology Dept., Tabriz University, I.R. Iran

2 Social Health Determinants Research Center, Community Health Dept., Shahrekord University of Medical Sciences, Shahrekord, I.R. Iran.


Background and aims: Paper-pencil tests have inherent problems in the evaluation of mental disorders, including learning of the questions by the respondent and presenting oneself in an overly good or bad light. This study aimed to propose a new alternative method of measuring mental disorders using EEG, without a paper-pencil test.
Methods: The study population consisted of depressed patients referred to psychiatric clinics in Tabriz. A sample of 107 patients was selected by convenience sampling, and the Beck test was administered. EEG was recorded at the F4 site while a film of 5 emotional images from the International Affective Picture System (IAPS), a database of normed images, was displayed. The recording screen was designed by the author in the Biograph Infinity software of the device. Additional software was written by the author to extract from the recorded EEG the mean αpeak frequency associated with each image. The research variables α1peak, α2peak, α3peak, α4peak and α5peak of each patient were then analyzed with SPSS. Finally, another 26 patients were selected to measure sensitivity, specificity, positive predictive value, negative predictive value and the ROC curve against a gold standard.
Results: The multiple regression analysis showed that α1peak, the αpeak frequency associated with image 1, had more explanatory power than the other variables, with a beta value of 0.289, followed by α3peak. A regression equation for predicting an individual's Beck score from his/her EEG was obtained in terms of αpeak frequency.
Discussion: This research showed that Beck's depression score could be predicted from EEG without using any questionnaire, with high sensitivity (100%) and negative predictive value (NPV, 100%) but modest specificity (30.8%), positive predictive value (PPV, 59.1%) and area under the ROC curve (57.4%).




Measurement is the assignment of numbers to individuals in a systematic way to represent their characteristics, whereas a test is an objective and standardized measure of a sample of behavior. As the definitions imply, the concept of measurement includes testing. A test evaluates only a sample of behavior. Each test consists of several questions that determine the subject's position on the measured traits. Although scientific work on psychological tests began in the late 19th century, the idea of assessing human knowledge for various purposes has existed since ancient times. For example, in the ancient Chinese empire, state employees were tested every 3 years using oral exams, and the results were used as criteria for performance evaluation and for the employees' advancement. The first psychological test in the strict sense was the Binet-Simon Intelligence Scale, which was developed in 1905 to measure the intelligence of children. For this reason, Binet is called the father of psychological testing. Psychological testing emerged thanks to the efforts of many scientists, including Fechner, Wundt, Cattell and Galton.1 Because of the indefinite nature of mental disorders, measuring them differs from measuring physical disorders and requires different methods. Until now, psychologists and psychiatrists have widely used normed paper-pencil tests for this purpose. However, problems such as learning the test, denial of the disease, presenting oneself in an overly good or bad light, and resistance have led to the introduction of new approaches in this field, including computer-based tests. Computers have had a large impact on the administration of tests. One important type of computer-based test is the computer adaptive test.
These tests offer a more consistent approach to assessing progress and attitudes. Typically, a small set of 1, 2 or 3 questions of average difficulty is first presented to estimate the individual's ability. Depending on the subject's answers, the next question is chosen to match his/her ability as closely as possible. If all the questions are answered incorrectly, the next question is easier. If all the questions are answered correctly, the next question or questions are more difficult. If a question is answered correctly, another question of the same difficulty is presented as a marker; a marker question is one that establishes the overall ability level. Extensive research has been conducted on the comparability of computerized questionnaires with standard ones.

Wilson et al. administered an attitude test to 98 female students in both computerized and paper-pencil forms. They demonstrated that the means and variances obtained by the 2 methods were comparable once the methods had been adjusted.2

Holden and Hickman developed computerized and paper-pencil forms of the Jenkins Activity Survey.3 This scale is used to identify people with type A personality. Sixty people took all forms of the test. The results suggested that the variances, means, reliability and validity of the forms were comparable. Merten and Ruch applied the approach of Holden and Hickman to the Eysenck Personality Questionnaire and the Carroll Rating Scale for Depression.4 Half of the subjects answered the computerized questionnaire and the other half answered the standard questionnaire. The means and standard deviations of the 2 methods were comparable. Researchers have also studied computerized neuropsychological tests. Pellegrino et al. compared a collection of 10 computerized tests of spatial ability with paper-pencil tests.5 The results showed that computerized tests of spatial ability can replace the standard paper-pencil tests. Furthermore, Pellegrino compared a computer-based version of the Halstead Category Test with the original.5 This test was conducted on a group of people with neuropsychiatric deficits. The report shows that the computer-based test is similar to the standard Halstead test. However, some studies have reported conflicting results. French and Beaumont, for example, reported a significant difference between the computerized and standard forms of the Progressive Matrices test and concluded that they could not be used interchangeably.6

In this study, we sought a new method that does not face these problems. Related work had already begun before ours, including the extraction of an individual's preferences from his/her EEG in the area of neuromarketing.7,8

Another related line of work is QEEG-based treatment, in which QEEG is used to guide therapy. QEEG is a diagnostic tool that converts brain waves from analog form (EEG) to digital form for analysis. This makes it possible to study the electrical potentials produced by brain cell processes during mental activities such as attention, memory and decision-making. QEEG, also called a brain map, displays the electrical activity recorded on the surface of the skull as a range of colors that reflect the activity level of brain cells in different regions. Examples include QEEG-based neurofeedback therapy for OCD and QEEG-based treatment of personality disorders, which use QEEG to make neurofeedback treatment more successful. In these cases, no paper-pencil test is administered to the subjects. The weakness of this approach, which the present study addresses, is that the QEEG evaluation is purely internal: software such as Neuroguide works against a normative curve and lacks an external gold standard such as the Beck test, so the gap is apparent even in QEEG-based treatment.

Another interesting series of studies compares an individual's EEG with paper-pencil test results, as in the study by Merkelbach, Muris and Horselenberg.9

Accordingly, the aim of this study is to determine the degree of mental disorder from the EEG. As previous findings in neuromarketing have shown, alpha peak frequency at F4 is the best parameter for extracting preferences and emotions from the EEG of right-handed people.7,8 According to Davidson, there are 3 assumptions about the EEG of depressed and normal people:10

A) In normal people, alpha in the right hemisphere is larger than in the left hemisphere, and this differs in depressed people; B) alpha is negatively related to brain activity; C) the right and left hemispheres are associated with negative and positive emotions, respectively.

Therefore, in the neurofeedback treatment of depression, when an asymmetry is present, alpha is trained to decrease at F3 and simultaneously to increase at F4 using a 2-channel montage. The prefrontal cortex is part of the neural networks involved in processing mood and emotion. Furthermore, the 2 hemispheres differ with respect to positive and negative emotions: the right hemisphere is more involved in negative emotions and the left hemisphere in positive emotions. Numerous studies have shown that the left dorsolateral prefrontal cortex (DLPFC) becomes more active during positive emotions. Injury to the left DLPFC following stroke, trauma or epilepsy is often associated with depression, while damage to the right DLPFC is associated with elevated mood.11

Stimulation of the DLPFC with direct current has been found to be associated with positive emotional changes.11 Recent studies aimed at changing the activity of the prefrontal cortex and balancing the left and right sides have shown significant effects of tDCS in reducing the symptoms of major depressive disorder.12 In support of this assumption, SSRIs also reduce alpha at F3.



The present study is a correlational study that used regression and correlation methods. The data were collected in the field. It is a cross-sectional, applied study.

Because depression is widespread and a methodology is needed to eliminate the problems of paper-pencil tests, depression was chosen as the disorder for this study. The sample consisted of 107 subjects, selected by convenience sampling, who were referred to the researcher by psychiatrists in Tabriz as patients with depression. After the test process was introduced and explained, the Beck test was administered to the subjects to assess baseline depression.

In compliance with professional ethics, patients were assured, both at referral and in a note at the top of the Beck questionnaire, that no electricity would enter their body through the electrodes of the Procomp device. Written consent was also obtained from each subject to conduct the test and publish the results.

The research instruments in the present study were the Procomp Infinity device, used to record the EEG, and the Beck Depression Inventory.

Beck Depression Inventory, version 2 (BDI-II): It consists of 21 questions, each scored from 0 to 3.13 The inventory has a correlation of 0.71 with the Hamilton depression rating scale, and its one-week test-retest reliability is 0.93.

EEG device: A 2-channel device (Procomp 2, made by Thought Technology) with a frequency range of 0.5 to 40 Hertz was used in the present study. A sampling rate of 8 Hertz and single-channel recording at the F4 site were used.















Figure 1: The displayed images in movie



The neurofeedback device (Procomp2 Infinity) was used to extract features from the EEG: it displayed five images selected from a global database of emotional images to the user while the EEG was recorded concurrently. The hardware and software had previously been adapted to fit the researcher's test. IAPS is a global database of emotional pictures, maintained by the University of Florida, in which images are classified according to their emotional arousal. The researcher selected 5 photos from 5 different categories and emotional conditions and made a 40-second film of them.

Each image was displayed for 5 seconds, followed by a neutral image for 3 seconds, then the next image, and so on until the end of the film. The individual's EEG was recorded throughout.
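The timing described above can be written out as a short sketch. It assumes that every image, including the last, is followed by the 3-second neutral image, which is what makes 5 images add up to the 40-second film; the function name build_schedule and its parameters are illustrative and are not part of the Procomp software.

```python
def build_schedule(n_images=5, image_s=5.0, neutral_s=3.0):
    """Return one (start, end) display window in seconds per image.

    Assumption: each image is followed by a neutral image of neutral_s seconds.
    """
    period = image_s + neutral_s
    return [(i * period, i * period + image_s) for i in range(n_images)]


schedule = build_schedule()
film_length = len(schedule) * (5.0 + 3.0)
print(schedule)     # [(0.0, 5.0), (8.0, 13.0), (16.0, 21.0), (24.0, 29.0), (32.0, 37.0)]
print(film_length)  # 40.0
```

Under this assumption the image windows start every 8 seconds, and a film of only 2 images would end its second image at 13 seconds.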


Selected images from the IAPS global database and their properties

Fly on the cake: IAPS No. 7360
Happy child: IAPS No. 2058
Fish in the ocean: IAPS No. 1900
Happy Cats: IAPS No. 1463
Angry snake: IAPS No. 1052

Figure 2: The displayed images and their IAPS numbers



The input of this model comprises 5 features extracted from the EEG recorded at the F4 site in patients with depression, each of them an αpeak frequency. The site (F4) and the feature (αpeak frequency) were selected on the basis of previous studies. The data-gathering screen, built in the Procomp Infinity (neurofeedback device) software, is shown in Figure 3. On the right side, the film of 5 pictures is displayed; subjects viewed it on a monitor in front of them. On the left side, the EEG and the alpha peak plot are shown. After recording, the alpha peak of the EEG is extracted through the Export Data window (Figure 4). At this stage, the analog signal is converted to digital data.




Figure 3: The EEG recording window of Procomp infinity


Figure 4: The EEG data extraction window in the text format of Procomp Infinity software



A computer program written by the author divided the raw EEG record into 5 segments: α1peak, α2peak, α3peak, α4peak and α5peak.

In this program, the time series of alpha peak frequency was read for each patient, and the mean αpeak frequency associated with each animated image was computed over the time window given in the table:


Table 1: Times associated with the animated images

αpeak frequency mean    Times associated with the animated images
α1peak                  0 to 5 seconds
α2peak                  8 to 13 seconds
α3peak                  16 to 21 seconds
α4peak                  24 to 29 seconds
α5peak                  32 to 37 seconds


The mean αpeak frequency was then obtained within each of these time limits. Finally, the data were entered into SPSS and a multiple regression analysis was performed.
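As a minimal sketch of the averaging step (not the author's program, whose source is not given), the mean αpeak frequency in each image window could be computed as follows. The function name window_means, the use of NumPy, and the assumption that the exported alpha peak series has 8 samples per second are illustrative choices.

```python
import numpy as np

# Image display windows in seconds, taken from Table 1.
WINDOWS = [(0, 5), (8, 13), (16, 21), (24, 29), (32, 37)]


def window_means(alpha_peak, fs, windows=WINDOWS):
    """Mean alpha peak frequency inside each image window.

    alpha_peak: 1-D sequence of alpha peak frequency samples (Hz)
    fs: number of samples per second in that series (assumed known)
    """
    x = np.asarray(alpha_peak, dtype=float)
    t = np.arange(x.size) / fs  # time stamp of each sample in seconds
    return [float(x[(t >= start) & (t < end)].mean()) for start, end in windows]


# Synthetic 40-second series at an assumed 8 samples per second (not patient data).
fs = 8
series = np.full(40 * fs, 9.0)
series[0:5 * fs] = 10.0  # pretend the alpha peak was 10 Hz during image 1
means = window_means(series, fs)
print(means)  # [10.0, 9.0, 9.0, 9.0, 9.0]
```

The five returned values play the role of α1peak to α5peak for one patient; samples recorded while the neutral image is shown are ignored.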

The reason for choosing F4: A recent double-blind, randomized, sham-controlled trial on depression therapy, reported by Fregni, showed significant antidepressant effects after anodal polarization of the dorsolateral prefrontal cortex (DLPFC).14 (Sham is similar to a placebo and is used in interventions such as ECT, where the patient is connected to a powered-off device.) Boggio compared 3 groups, comprising 40 patients with major depression, in a randomized double-blind clinical trial.15 The largest reduction in symptoms, measured with the Hamilton Depression Rating Scale (HDRS), was observed in the group that received anodal DLPFC stimulation. That group had a 40% reduction in symptoms, while the group in which the electrodes were placed over the occipital cortex showed a 21% decrease and the control group a 10% reduction. Five patients recovered fully in the DLPFC group, and no full recovery was observed in the 2 other groups. The antidepressant effect persisted one month after the last treatment in the DLPFC group. These findings should be confirmed and replicated by other centers.

In the multiple regression model, a set of independent variables was entered into the equation to determine the coefficient of determination (R2) and the weight of each variable (Beta). The Enter method was used to determine the contribution of each variable to explaining the dependent variable. In this method, all independent variables are entered into the analysis simultaneously and their effects on the dependent variable are determined.
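The following sketch illustrates what a simultaneous-entry regression computes. It is not the SPSS procedure used in the study: it is an ordinary least-squares fit with NumPy on synthetic data, and the function name enter_regression and all data values are assumptions made for illustration.

```python
import numpy as np


def enter_regression(X, y):
    """Ordinary least squares with all predictors entered at once."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])          # intercept column plus predictors
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares solution
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    # Standardized weights (Beta): b_j * sd(x_j) / sd(y)
    beta = coef[1:] * X.std(axis=0, ddof=1) / y.std(ddof=1)
    return {"intercept": float(coef[0]), "b": coef[1:], "beta": beta,
            "r2": r2, "adj_r2": adj_r2}


# Synthetic data only: 107 rows, 5 predictors, y driven by the 1st and 3rd columns.
rng = np.random.default_rng(0)
X = rng.normal(10.0, 1.0, size=(107, 5))
y = 2.0 * X[:, 0] + 3.0 * X[:, 2] + rng.normal(0.0, 0.5, size=107)
fit = enter_regression(X, y)
print(np.round(fit["b"], 2), round(fit["r2"], 3), round(fit["adj_r2"], 3))
```

The unstandardized coefficients b give the regression equation, while the standardized weights beta allow predictors measured on different scales to be compared.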

The independent variables entered into the multiple regression equation were α1peak, α2peak, α3peak, α4peak and α5peak.

The multiple correlation coefficient R is 0.823 and the adjusted coefficient of determination (adjusted R2) is 0.725; that is, 79.1% of the variation in depression is related to the variables studied in this research. Thus, these variables are important in explaining depression. For the significance of the regression equation, refer to the table of the regression analysis of variance.



Table 2: Results of regression variance analysis

Columns: Sum of squares, Degrees of freedom, Mean square, Significance level

According to the regression analysis of variance, the F-value is 8.725. Because the significance level is smaller than the error level, the regression equation is accepted; the observed coefficient of determination is therefore statistically significant. To see which variables are more important in explaining depression, refer to the table of standardized beta coefficients.

The table below shows the variables under study and the importance and influence of each on depression.



Table 3: Estimated regression coefficients

Columns: Unstandardized coefficients (with estimation error), Standardized coefficients

Given the significance of the model (R2 = 79.1%), only 2 variables out of 5 (i.e., 2 images of 5), α1peak and α3peak, were significant, and according to their standardized coefficients they had an almost similar influence on the individual. The regression equation for predicting the Beck score from the individual's EEG, based on the first and third pictures, is therefore as follows:


Beck = 6.516 × α1peak frequency + 9 × α3peak frequency


The results of the regression analysis showed that an individual's depression score can be predicted in real time, without a questionnaire, from the EEG recorded at F4 alone while the first and third images are presented as visual stimuli.
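Written as a function, the equation above reads as follows. This is only a transcription of the published coefficients; the function name predict_beck is hypothetical, and the input values in the example are placeholders rather than patient data.

```python
def predict_beck(alpha1_peak, alpha3_peak):
    """Predicted Beck score from the mean alpha peak frequencies of images 1 and 3.

    Coefficients are copied from the regression equation in the text.
    """
    return 6.516 * alpha1_peak + 9.0 * alpha3_peak


# Placeholder inputs, chosen only to show the arithmetic.
score = predict_beck(1.0, 2.0)
print(round(score, 3))  # 24.516
```

In practice the two arguments would be the window means for the first and third images of one recording.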

The diagnostic performance measures, namely sensitivity, specificity, and negative and positive predictive value, were calculated. Sensitivity is the proportion of true positives correctly identified by the test, while specificity is the proportion of true negatives correctly identified by the test. The negative predictive value is the probability that the disease is absent if the test result is negative. The positive predictive value is the probability that the disease is present if the test result is positive.

A group of 26 patients was selected randomly to measure sensitivity, specificity, and positive and negative predictive value. For each patient, the clinical psychologist's opinion on the presence and severity of depression, based on a clinical interview and Beck's test, was recorded as the gold standard.

The regression equation obtained from the 2 variables α1peak and α3peak was used to calculate the predicted values from the EEG, and the values of sensitivity, specificity, and positive and negative predictive value were then obtained.

Beck = 6.516 × α1peak frequency + 9 × α3peak frequency

The obtained values of sensitivity, specificity, and positive and negative predictive value were 100%, 30.8%, 59.1% and 100%, respectively. Given the high sensitivity of the methodology, the test correctly identified all the patients. The relatively modest specificity reflects the modest power of the methodology in separating out the healthy group. The area under the ROC curve was 57.4%:


Figure 5: ROC curve
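As a check on the reported figures, the four measures can be recomputed from a 2 × 2 table. The counts used below (13 true positives, 9 false positives, 4 true negatives, 0 false negatives) are not stated in the study; they are an assumption, chosen because they sum to the 26 patients and reproduce the reported percentages.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV and NPV from the counts of a 2 x 2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients flagged by the test
        "specificity": tn / (tn + fp),  # healthy patients cleared by the test
        "ppv": tp / (tp + fp),          # probability of disease given a positive test
        "npv": tn / (tn + fn),          # probability of no disease given a negative test
    }


# Assumed counts (not reported in the study), consistent with n = 26.
metrics = diagnostic_metrics(tp=13, fp=9, tn=4, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
# sensitivity: 100.0%
# specificity: 30.8%
# ppv: 59.1%
# npv: 100.0%
```

With these counts, no depressed patient is missed, but 9 of the 13 non-depressed patients are flagged as depressed, which is what the low specificity expresses.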



Paper-pencil tests have inherent problems in the evaluation of mental disorders, including learning of the questions, presenting oneself in an overly good or bad light, denial of the disease and other problems. This study aimed to propose a new alternative method of measuring mental disorders without a paper-pencil test.

Given the limited access to patients, convenience sampling was used: subjects with an initial diagnosis of depression by a psychiatrist were referred to a clinical psychologist, who administered Beck's test. The subjects were also referred to a researcher in Feb., Mar., Apr., and Mar. 2015 to obtain the gold standard.

In the multiple regression model for predicting depression, the set of independent variables was entered into the equation to determine the coefficient of determination (R2) and the weight of each variable (Beta). The results revealed that α1peak, with β=0.591, was a stronger determinant than the other variables. This means that the higher α1peak, the higher the level of depression. The α3peak variable, with β=0.521, was the next strongest determinant. The obtained values of sensitivity, specificity, and positive and negative predictive value were 100%, 30.8%, 59.1% and 100%, respectively. Given the high sensitivity of the methodology, the test correctly identified all the patients. The relatively modest specificity reflects the modest power of the methodology in separating out the healthy group. The area under the ROC curve was 57.4%. Although earlier studies also recorded alpha peak frequency at the F4 site only, this research is the first of its kind to target psychological and clinical disorders rather than neuromarketing.

The present study is consistent with previous studies in terms of the location of the EEG recording, at F3 or F4 over the dorsolateral prefrontal cortex, and in the extraction of alpha peak frequency as the relevant feature.7,8,10

Aurup and Akgunduz, in their study "Preference Extraction from EEG", obtained 90% agreement in predicting subjects' emotional preferences using the alpha peak frequency of EEG recorded at the F4 site.7 In another study, "Pair-Wise Preference Comparisons Using Alpha-Peak Frequencies", Aurup reported that a low alpha peak at F4 indicated a stronger preference for one item over another.8 In the present study, there is a strong relationship between alpha peak frequency and the level of depression.



Given the results of this study, when this method is used with new depressed patients to predict the Beck depression score (without a paper-pencil test), it is recommended to make the film from the first and third images only:








Fish in ocean: IAPS No. 1900
Angry snake: IAPS No. 1052

Figure 6: Pictures associated with the significant αpeak frequencies (α1peak frequency and α3peak frequency)



















Figure 7: Initial scenario of pictures



It is also recommended to shorten the recording from 40 seconds to 13 seconds (5 seconds for the first image + 3 neutral seconds + 5 seconds for the second image) and to use the following timeline:




5 seconds | 3 seconds | 5 seconds

Figure 8: Final scenario of pictures


Aurup and Akgunduz, in their study on extracting users' preferences from EEG, obtained 90% agreement in predicting subjects' emotional preferences using the alpha peak frequency of EEG recorded at the F4 site, and Aurup, in "Pair-Wise Preference Comparisons Using Alpha-Peak Frequencies", reported that a low alpha peak at F4 indicated a stronger preference for one item over another. Likewise, in the present study there is a strong relationship between alpha peak frequency recorded at F4 and the level of depression.7,8

The present study can be a first step toward predicting, and even diagnosing, depression directly from brain data, without a questionnaire or even a specialist's diagnosis. Although there is a long way to go, studies of this type show that progress in this direction is possible by carefully choosing the recording site and by careful recording and data processing with effective instruments and methods.



The authors declare no conflict of interest.



Special thanks to the psychiatrists who cooperated with us in referring patients for the present study, especially Dr. Majid Torabi.

1. Marks IM, Kenwright M, McDonough M, Whittaker M, Mataix-Cols D. Saving clinicians' time by delegating routine aspects of therapy to a computer: A randomized controlled trial in phobia/panic disorder. Psychol Med. 2004; 34(1):9-17.

2. Wilson FR, Genco KT, Yager GG. Assessing the equivalence of paper-and-pencil vs. computerized tests: Demonstration of a promising methodology. Comput Human Behav. 1985; 1(3): 265-75.

3. Holden RR, Hickman D. Computerized versus standard administration of the Jenkins Activity Survey (Form T). J Human Stress. 1987; 13(4): 175-9.

4. Merten T, Ruch W. A comparison of computerized and conventional administration of the German versions of the Eysenck Personality Questionnaire and the Carroll Rating Scale for Depression. Pers Individ Dif. 1996; 20(3): 281-91.

5. Pellegrino JW, Hunt EB, Abate R, Farr S. A computer-based test battery for the assessment of static and dynamic spatial reasoning abilities. Behav Res Methods Instrum Comput. 1987; 19(2): 231-6.

6. French CC, Beaumont JG. A clinical study of the automated assessment of intelligence by the Mill Hill Vocabulary Test and the Standard Progressive Matrices Test. J Clin Psychol. 1990; 46(2): 129-40.

7. Aurup GM, Akgunduz A. Preference Extraction from EEG: An Approach to Aesthetic Product Development. In: Proceedings of the 2012 International Conference on Industrial Engineering and Operations Management; Istanbul, Turkey: 1178-86.

8. Aurup GM, Akgunduz A. Pair-Wise Preference Comparisons Using Alpha-Peak Frequencies. J integr des Process Sci. 2012; 16(4): 3-18.

9. Merkelbach H, Muris P, Horselenberg R, de Jong P. EEG correlates of a pencil and paper test measuring hemisphericity. J Clin Psychol. 1997; 57: 739-44.

10. Davidson RJ. Emotion and affective style: Hemispheric substrates. Psychol Sci. 1992; 3(1): 39-43.

11. Boggio PS, Bermpohl F, Vergara AO, Muniz AL, Nahas FH, Leme PB, Rigonatti SP, Fregni F. Go-no-go task performance improvement after anodal transcranial DC stimulation of the left dorsolateral prefrontal cortex in major depression. J Affect Disord. 2007; 101(1-3): 91-8.

12. Nitsche MA, Boggio PS, Fregni F, Pascual-Leone A. Treatment of depression with transcranial direct current stimulation (tDCS): A review. Exp Neurol. 2009; 219(1): 14-9.

13. Beck AT, Steer RA, Brown GK. Beck Depression Inventory-II M. San Antonio, TX. Philadelphia: The Psychological Corporation; 1996.

14. Nitsche MA, Boggio PS, Fregni F, Pascual-Leone A. Treatment of depression with transcranial direct current stimulation (tDCS): a review. Bipolar Disord. 2006; 8(2): 203-4.

15. Begum S, Ahmed MU, Funk P, Xiong N, Von Schéele B. A case based decision support system for individual stress diagnosis using fuzzy similarity matching. Comput Intell. 2009; 25(3): 180-95.