Choosing the optimum number of samples when testing your product boils down to a common trade-off: cost vs. benefit. Staying within your testing budget is important, but ensuring that your data is robust enough to withstand scrutiny from the regulatory body is paramount and can help avoid costly delays due to re-testing. As a contract testing lab, DDL rarely advises on sample size. However, we have observed common patterns in the sample sizes our customers choose when submitting products for testing. We outline these patterns here in the hope that they will be useful to those researching how best to structure their testing regimen.
The first determination to make when choosing a statistical method for sample size is whether the test data you will obtain are attribute or variable. Attribute data, also called binomial data, are qualitative. Pass/fail and go/no-go are common types of attribute data, for example whether a measured dimension falls within the tolerances on the drawing. Variable data are numerical measurements, for example the seal strength of a heat-sealed pouch or the tensile strength of a poly film. This article looks specifically at attribute testing, as it is common in the world of medical device testing and is more easily generalized than variable testing.
To determine the optimal sample size, you must first determine what level of risk you are able to tolerate; an internal regulatory or quality department or a consultant can provide good guidance. A high-risk product requires more test samples in order to achieve an acceptable confidence level. Statistically speaking, a higher-risk product means that you need to assign a more stringent acceptable quality level (AQL), p0, to your experimental design. The AQL represents the maximum allowable proportion of defective items in a lot. For example, if a maximum of 5% of your parts can be defective, your p0 value would be 0.05. From there, using the cumulative geometric distribution function, you can determine your optimum sample size for an attribute test: it is the smallest number of samples n for which 1 - (1 - p0)^n is at least the desired confidence level.
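As a rough illustration of the calculation described above, the following Python sketch solves the cumulative geometric distribution for the smallest sample size when no failures are allowed. The function name and its parameters are hypothetical and chosen only for this example; it is a minimal sketch, not a substitute for guidance from your quality or regulatory group.

```python
import math


def zero_failure_sample_size(confidence, p0):
    """Smallest n such that 1 - (1 - p0)**n >= confidence.

    confidence: desired confidence level, e.g. 0.95 (illustrative name)
    p0: maximum allowable proportion defective, e.g. 0.05
    Assumes every tested sample passes (zero failures).
    """
    if not (0 < confidence < 1 and 0 < p0 < 1):
        raise ValueError("confidence and p0 must be strictly between 0 and 1")
    # Rearranging (1 - p0)**n <= 1 - confidence gives
    # n >= ln(1 - confidence) / ln(1 - p0); round up to a whole sample.
    return math.ceil(math.log(1 - confidence) / math.log(1 - p0))


print(zero_failure_sample_size(0.95, 0.05))  # 59
print(zero_failure_sample_size(0.95, 0.10))  # 29
```

With a 95% confidence level, a p0 of 0.05 gives 59 samples and a p0 of 0.10 gives 29 samples.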
The most common sample sizes DDL sees for attribute tests are 29 and 59. To obtain 95% confidence that your product’s passing rate is at least 95% (commonly summarized as “95/95”), 59 samples must be tested and all 59 must pass. If your product has lower risk and you are able to accept a lower passing rate of 90%, only 29 passing samples are needed to obtain 95% confidence, or “95/90”. These numbers assume that there are no failures in any of the samples. That, unfortunately, is not always the case. In the event of an isolated failure, a different equation, based on the negative binomial distribution, must be used. To maintain the same confidence levels as stated above with one failure, the required sample size is 46 for a p0 of 0.10 and 93 for a p0 of 0.05, and it increases with each additional failure.
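The sample sizes quoted above, with and without a failure, can be reproduced with a short Python sketch. Note that this example uses the cumulative binomial distribution rather than the negative binomial form; the two are equivalent ways of stating the same condition and give the same sample sizes. The function name and parameters are hypothetical and chosen only for illustration.

```python
from math import comb


def sample_size_with_failures(confidence, p0, failures):
    """Smallest n for which observing at most `failures` failures in n
    samples supports the claim "defect rate <= p0" at the given confidence.

    The condition checked is: if the true defect rate were exactly p0,
    the chance of seeing `failures` or fewer failures in n samples
    is no more than 1 - confidence.
    """
    if not (0 < confidence < 1 and 0 < p0 < 1) or failures < 0:
        raise ValueError("invalid confidence, p0, or failures")
    n = failures + 1
    while True:
        # Cumulative binomial probability of 0..failures defects in n samples
        prob = sum(
            comb(n, k) * p0**k * (1 - p0) ** (n - k)
            for k in range(failures + 1)
        )
        if prob <= 1 - confidence:
            return n
        n += 1


for allowed in (0, 1):
    print(allowed,
          sample_size_with_failures(0.95, 0.10, allowed),
          sample_size_with_failures(0.95, 0.05, allowed))
# 0 29 59
# 1 46 93
```

Passing a larger value for the allowed number of failures shows how quickly the required sample size grows.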
Determining sample size for an attribute test is a fairly straightforward task once the statistical requirements are known, but its importance cannot be overstated. Not only can it help ensure that regulatory requirements are met, it also provides evidence that the quality of the product is high, meaning increased customer satisfaction and patient safety.