Sample size calculation on the web: can we rely on the results?

Sample size calculation is one of the most frequent concerns of researchers. Many formulas exist for calculating sample size, and the correct one must be selected according to the type of study and the type of data. Any mistake can mislead the researcher and thereby distort the study results; even an overestimation wastes cost, time and materials. Both statisticians and other researchers frequently turn to online tools for sample size calculation. We searched for online sample size calculators, reviewed them, and checked their results against known formulas for common research objectives. Considering the four most common types of sample size calculation in health studies (estimating a proportion or a mean, and comparing two proportions or two means), only 5 (8.3%) of 60 websites or software packages provided all four types of calculation. The overall exact calculation rate was 8.3% per calculation and 16.1% per site. Many of the sites calculate sample size only for estimating proportions, and most of the results are not accurate.


Introduction
Statisticians, methodologists and non-statistician researchers frequently ask the question "How many subjects do I need for my study?" [1]. Sample size affects the feasibility of the research, and its proper calculation has a significant effect on the power and validity of the results [2]. Researchers now solve most of their problems through online resources, and face-to-face consultations are reserved for occasions when they are unavoidable. For internet-assisted sample size calculation, the first step is to find a suitable site that offers online calculation or downloadable files, but the accuracy of the calculations is one of the most important considerations. There are many online sample size calculation sites with different specifications. Non-statistician users tend to avoid complicated calculation procedures, whereas professional statisticians may need advanced calculations for relatively rare study types. In this review, we list sample size calculation websites and check whether their calculations agree with the results of basic formulas.

Methods
We used the Google search engine with the keywords "sample size calculator", "sample size calculation", "sample size in health studies", "sample size web site" and "sample size software" to retrieve related sites. The Software Informer website (available at http://software.informer.com) was also used to find downloadable files for sample size calculation. Each site was reviewed, and data were collected on its URL, its owner, the availability of the calculation equation(s), and whether it provided calculations other than sample size, e.g. confidence levels or statistical tests. Standard formulas were extracted for the four most common types of sample size calculation in health studies (estimating the proportion of a qualitative variable, estimating the mean of a quantitative variable, comparing two proportions, and comparing two means of two different populations) from a detailed review of several statistical resources, including biostatistics textbooks [3][4][5][6][7][8] and articles [9]. Four hypothetical examples were considered, one for each type of calculation, and the results were calculated with the selected formulas (Table 1). The results of each site or software package were then compared with these standard results. A difference of more than 1 from the standard result, i.e. more than could be explained by rounding the final result, was considered an inaccurate calculation (not the same as the formula).

The ease of use of each site was also evaluated to determine whether it was simple and user-friendly or aimed at professionals. To evaluate complexity, two medical researchers with some experience in medical research and sample size calculation were asked to use the calculators and score them as easy, moderate or advanced to use. Sites were scored as easy if they required only basic input data and clearly explained each step of the calculation to the user; as moderate if they needed detailed information about the study, e.g. one-tailed or two-tailed calculations; and as advanced if they required very detailed additional information that is unfamiliar in routine medical research practice given the statistical knowledge of a medical researcher, e.g. complex survey analysis or cost analysis. When the two observers' ratings agreed, that rating was taken as the ease-of-use level; when they disagreed, a third evaluator reviewed the site and assigned the rating.
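The four types of calculation described above can be sketched in Python as follows. The formulas are the common normal-approximation textbook forms and are an assumption here, since the formulas selected for Table 1 are not reproduced in this section. The function names, parameter names and default values are illustrative only and are not the hypothetical examples used in the study.

```python
import math
from statistics import NormalDist


def z_quantile(prob):
    """Standard normal quantile, e.g. 0.975 gives about 1.96."""
    return NormalDist().inv_cdf(prob)


def n_estimate_proportion(p, d, confidence=0.95):
    """Sample size to estimate a proportion p within absolute margin d."""
    z = z_quantile(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / d ** 2)


def n_estimate_mean(sd, d, confidence=0.95):
    """Sample size to estimate a mean within margin d, given standard deviation sd."""
    z = z_quantile(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * sd ** 2 / d ** 2)


def n_compare_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size, two-sided test of two proportions (unpooled variance)."""
    z_a = z_quantile(1 - alpha / 2)
    z_b = z_quantile(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_a + z_b) ** 2 * variance / (p1 - p2) ** 2)


def n_compare_means(sd, delta, alpha=0.05, power=0.80):
    """Per-group sample size, two-sided test of two means with a common sd."""
    z_a = z_quantile(1 - alpha / 2)
    z_b = z_quantile(power)
    return math.ceil(2 * (z_a + z_b) ** 2 * sd ** 2 / delta ** 2)


if __name__ == "__main__":
    # Illustrative inputs only (assumed, not taken from Table 1).
    print(n_estimate_proportion(p=0.5, d=0.05))
    print(n_estimate_mean(sd=10, d=2))
    print(n_compare_proportions(p1=0.5, p2=0.3))
    print(n_compare_means(sd=10, delta=5))
```

Results are rounded up to the next whole subject. Other textbook variants (pooled variance, continuity correction, t-based quantiles) give slightly different values, which is one reason a calculator's result can differ from a reference formula.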

Discussion
We provide a list of available online sample size calculators with a brief description of each, an evaluation of the accuracy of their calculations, and their level of ease of use and complexity.
Online sample size calculators range from easy and user-friendly to advanced and professional, but the most important question is "How valid are their calculations?". According to our findings, few sites provide all the common types of sample size calculation needed for common types of research, and only a few provide accurate calculations. The most important reason for inaccurate results may be the fixed assumptions built into many sites, e.g. assuming a fixed proportion of 50% when the aim of the study is to estimate a prevalence. Another reason may be that the sites use formulas different from those we considered in our assessment. Providing the basic formula for each calculation may help researchers choose a more accurate path for sample size calculation. In some situations, e.g. the comparison of two proportions, different formulas give almost the same sample size. In others, such as estimation, the available formulas differ more, and so do their results. Our study showed that some sites used a formula that takes type II error into account when calculating sample size for estimating a proportion or a mean [12,26,49,55,57]. We did not rate or assess the calculations of these sites, but when there are so many faults in the simple calculations, how can we expect accurate results from these more complex sites? Multi-language support may make sites easier to use for users whose language is not English. In our review we found two sites that provided multi-language support [23,37]. Some of the references used in this study are good resources for reviewing the principles of sample size calculation and can be helpful for researchers [7,8]. Specialized statistical software such as STATA, SAS and R also provides commands for sample size calculation and power analysis, which can be useful for more sophisticated approaches.
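The effect of a fixed 50% proportion can be illustrated with a short sketch. It assumes the common normal-approximation formula n = z²p(1−p)/d² for estimating a proportion (an assumption; the reviewed sites may use other formulas), and the expected prevalences and the margin below are hypothetical values chosen for illustration.

```python
import math
from statistics import NormalDist


def n_for_proportion(p, d, confidence=0.95):
    """Sample size to estimate proportion p within absolute margin d (assumed formula)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / d ** 2)


margin = 0.05  # hypothetical absolute precision
n_fixed = n_for_proportion(0.50, margin)  # what a calculator with p fixed at 50% returns

for expected in (0.05, 0.10, 0.30, 0.50):  # hypothetical expected prevalences
    n_expected = n_for_proportion(expected, margin)
    print(f"p = {expected:.2f}: n = {n_expected}, with p fixed at 0.50: n = {n_fixed}")
```

Under these assumptions, an expected prevalence of 10% with a 5% margin needs 139 subjects, whereas a calculator that fixes the proportion at 50% returns 385. Because p(1−p) is largest at p = 0.5, the fixed value always gives the largest sample size for a given margin.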

Conclusion
Sites that provide the formulas they use appear to be more trustworthy for online sample size calculation, since researchers then have at least one way to check the accuracy of a calculation against the provided formula. However, researchers who perform sophisticated studies need the assistance of statisticians for sample size calculation.
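Such a check can be sketched as follows, using the criterion from the Methods that a difference of more than 1 from the formula result counts as inaccurate. The function name, the site labels and all numbers are hypothetical and are not results from this review.

```python
def is_accurate(site_result, standard_result, tolerance=1):
    """True if a calculator's result is within `tolerance` of the formula result."""
    return abs(site_result - standard_result) <= tolerance


# Hypothetical formula result and hypothetical calculator outputs.
standard_n = 385
reported = {"site_a": 385, "site_b": 384, "site_c": 377}

for site, n in reported.items():
    verdict = "same as formula" if is_accurate(n, standard_n) else "inaccurate"
    print(f"{site}: reported n = {n} ({verdict})")
```

The tolerance of 1 allows for differences caused only by rounding the final result up or down.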