Statistics overview

This overview introduces some important aspects of calculating and using table-based statistics in QPSMR Companion. In Companion, you control statistics using product format settings.

Significance testing, confidence limits & probabilities

Question: Is this figure significant? Reply: Compared to what?

When discussing 95% or 99% significance, a survey figure (percentage or mean score) can only be significantly different when compared with some other figure from our survey, or a known value (from another source). The program has many ways of marking a figure as 95% significant but what does this mean?

For example

Suppose we have a product rating mean score for two subsets of the data: Men have a mean of 1.56 and Women have a mean of 1.34. When the two are compared, Companion flags Men as having a significantly higher mean than Women at the 95% level. This means that if the two subsets had been drawn from the same universe, there would be at most a 1 in 20 chance of obtaining a difference this large (0.22). We can therefore be 95% confident that there is a real difference between Men and Women when rating the product, assuming perfect sampling.

The word “significance” in this context is unfortunate, because it is not the same as the normal use of the word. On a rating scale of 1 to 5 we might have a mean score difference of 0.01 that is “significant” and a difference of 2.00 that is not “significant”. The word “confidence” is much more descriptive.

So, when Companion marks a figure at 99%, there is only a 1 in 100 chance that a difference this large would appear if there were no real difference between the samples. The reverse is not true; we cannot say that if Companion does not mark a figure then there is no difference. It may just be that the sample sizes are not large enough for us to be confident. The more interviews we do, the more real differences will be found.

Some of the tests in the program are followed by a probability figure. This is a value between 0.0 and 1.0, so small figures are significant and large ones are not. A 95% significance shows as a probability of 0.05 or less, and 99% significance as 0.01 or less. To convert a probability to a significance level, subtract the probability from 1.0 and express the result as a percentage. For example, a probability of 0.1278 becomes 0.8722, which is 87.22% significant.
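
The conversion described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function name significance_percent is hypothetical and is not part of Companion.

```python
def significance_percent(probability: float) -> float:
    """Convert a test probability (0.0 to 1.0) to a significance level in percent."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0.0 and 1.0")
    # Subtract from 1.0, then express the result as a percentage.
    return round((1.0 - probability) * 100.0, 2)


print(significance_percent(0.1278))  # the example from the text: 87.22
print(significance_percent(0.05))    # the threshold for 95% significance
```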

What Companion compares with what?

The majority of the tests in the program compare subsets of the data shown as the columns of a table. Companion only tests columns against each other when they appear in the same break under the same header.

In addition, Companion compares each column with the total column. This does not use an overlapping test; instead, Companion subtracts the column from the total to give “the rest”, and uses “the rest” for the comparison.

Against the total column

As an example we will use six columns: Total, Male, Female, Young, Middle, Old.

Using format SHG0, SHG11 (default), or SHG12, Companion tests whether data subsets (columns) are different from the remainder of the sample. In this way the program will highlight any “interesting” columns for further investigation.

In our example

Companion tests Males against Females, Young against Middle and Old together, Middle against Young and Old together, and also Old against Young and Middle together.

The purpose of testing in this way is to highlight columns (breakdowns) which are “different”. This testing method includes the total sample in every test and is more likely to detect differences than other comparison types.

Because Companion calculates “rest of the data” by subtracting the column under investigation from the total, this standard method will only work on tables with a total column.

Companion marks cells with + or - for 95% significance and with ++ or -- for 99%. You can change this using formats SMA and SMB. SMA and SMB are set to + (plus sign), so cells are marked with plus or minus, depending on whether the proportion or mean is significantly higher (+) or lower (-). In Excel output, cells are coloured green if higher and red if lower.
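
As an illustration of testing a column against “the rest”, the sketch below derives the rest by subtraction and applies a pooled two-proportion Z test. The exact formula Companion uses is not stated here, so the test, the function names and the figures are all assumptions made for the example.

```python
import math


def rest_of_sample(total_count, total_base, col_count, col_base):
    """Derive "the rest" by subtracting the column from the total column."""
    return total_count - col_count, total_base - col_base


def two_proportion_z(count_a, base_a, count_b, base_b):
    """Pooled two-proportion Z statistic and its two-sided probability."""
    pooled = (count_a + count_b) / (base_a + base_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / base_a + 1 / base_b))
    if se == 0:
        return 0.0, 1.0  # no variation, so nothing to test
    z = (count_a / base_a - count_b / base_b) / se
    return z, math.erfc(abs(z) / math.sqrt(2))


def marker(z, probability):
    """Return +/- at the 95% level, ++/-- at the 99% level, else an empty string."""
    sign = "+" if z > 0 else "-"
    if probability <= 0.01:
        return sign * 2
    if probability <= 0.05:
        return sign
    return ""


# Hypothetical row: 200 of 500 respondents in the Total column,
# 75 of 150 respondents in the Young column.
rest_count, rest_base = rest_of_sample(200, 500, 75, 150)
z, p = two_proportion_z(75, 150, rest_count, rest_base)
print(rest_count, rest_base, marker(z, p))
```

With these made-up figures, Young (50%) is compared with the remaining 350 respondents (about 36%) and would be marked ++.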

Column identifiers

This common method of comparison in Companion uses labels on each column as an identifier, for example with formats SHG1 or SHG11 (default):

Demographics^Area\North (a)
Mid (b)
South (c)
Not stated
Sex\Male (m)
Female (f)

The word “Demographics” is an over-header above the following columns, in this case all of the columns. The words “Area” and “Sex” are headers above the following columns.

You must include the identifiers at the end of the individual labels (preferably on a new line), enclosed in parentheses. In this method, Companion compares individual pairs of columns; the range of the comparisons depends on format SHG. See also “Combined testing” below.

With the normal formats SHG1 or SHG11 (test within headers), Companion compares each area with the other two, and places the appropriate letter markers against the cell if it finds significant differences.

For example

If Companion finds “Mid” to be significantly higher than both the other areas at 95% (default SLA95), it marks the value with the lower case letters “ac”. If Companion finds “Mid” to be significantly higher than both the other areas at 99% (default SLB99), it marks the value with the upper case letters “AC”. Companion does not test the “Not stated” column, because it has no identifier. Companion also compares Males and Females, so if Females are significantly higher, it places the letter “m” (lower or upper case depending on the significance level) next to the cell.

If you use the rarely used formats SHG2 or SHG12 (test within over-headers), then Companion compares all five identified columns with each other.

IMPORTANT: Only the higher figure of a pair is marked.
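
The marking rules for column identifiers can be sketched as follows. This is a minimal illustration only: it assumes a pooled two-proportion Z test for each pair, and the function names, counts and bases are hypothetical rather than taken from Companion.

```python
import math
from itertools import combinations


def two_sided_probability(count_a, base_a, count_b, base_b):
    """Two-sided probability from a pooled two-proportion Z test."""
    pooled = (count_a + count_b) / (base_a + base_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / base_a + 1 / base_b))
    if se == 0:
        return 1.0
    z = (count_a / base_a - count_b / base_b) / se
    return math.erfc(abs(z) / math.sqrt(2))


def letter_markers(columns):
    """columns maps a label to (identifier or None, count, base) under one header."""
    marks = {label: "" for label in columns}
    # Columns without an identifier (such as "Not stated") are not tested.
    tested = [(label, *spec) for label, spec in columns.items() if spec[0]]
    for a, b in combinations(tested, 2):
        p = two_sided_probability(a[2], a[3], b[2], b[3])
        if p > 0.05:
            continue
        # Only the higher column of the pair receives the other column's letter.
        higher, lower = (a, b) if a[2] / a[3] > b[2] / b[3] else (b, a)
        letter = lower[1].upper() if p <= 0.01 else lower[1].lower()
        marks[higher[0]] += letter
    return marks


# Hypothetical counts and bases for the Area header.
area = {
    "North": ("a", 40, 200),
    "Mid": ("b", 90, 200),
    "South": ("c", 60, 200),
    "Not stated": (None, 5, 20),
}
print(letter_markers(area))
```

With these made-up figures, Mid (45%) is higher than both North (20%) and South (30%) at the 99% level and is marked “AC”, while South is higher than North at the 95% level only and is marked “a”.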

Combined testing

Using SHG11 (default) or SHG12 causes Companion to mark cells with markers from both the total comparison method and the column identifiers method.

Types of test

This section describes the types of testing that can be done:

Distribution Z tests or t-tests

Where a table has a list of items (for example “Likes”) down the side, you can use format SIG to mark cells using Z tests or t tests on proportions. Format SIG3 (default) does a standard t test.

Companion treats each table row separately and marks a cell depending on whether its percentage differs from the percentage in the column it is compared with in the same row.

Mean or average t tests

Where a table has rows from which a mean score or average is produced, Companion will calculate t tests and mark significant differences using format TTV1 (default).
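
To show the kind of calculation involved, the sketch below computes a t statistic for two column means from summary figures. It assumes Welch's unequal-variance form and a large-sample normal approximation for the probability; the variant Companion applies is not stated here. The means come from the Men and Women example earlier, but the standard deviations and sample sizes are hypothetical.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch t statistic and approximate degrees of freedom for two means."""
    var_a = sd_a ** 2 / n_a  # squared standard error of mean A
    var_b = sd_b ** 2 / n_b  # squared standard error of mean B
    t = (mean_a - mean_b) / math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1)
    )
    return t, df


# Men: mean 1.56; Women: mean 1.34. Standard deviations and bases are made up.
t, df = welch_t(1.56, 0.9, 150, 1.34, 0.8, 160)

# Large-sample normal approximation to the two-sided probability.
probability = math.erfc(abs(t) / math.sqrt(2))
print(round(t, 2), round(df), probability <= 0.05)
```

With these assumed figures the t value is about 2.27, which is significant at the 95% level but not at the 99% level.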

Mean or average F tests

If you use format TTF, Companion will perform an F test on all the columns within each group. This test is used to establish whether the group of columns (for example, Area) affects the row mean or average, without looking at all the individual column pairs.
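
As a rough illustration, an F statistic for a group of columns can be calculated from each column's mean, standard deviation and base. The sketch below assumes a standard one-way analysis of variance; whether Companion uses exactly this form is not stated here, and the function name and figures are hypothetical.

```python
def f_statistic(groups):
    """groups is a list of (mean, sd, n); returns (F, df_between, df_within)."""
    total_n = sum(n for _, _, n in groups)
    grand_mean = sum(mean * n for mean, _, n in groups) / total_n
    # Variation of the column means around the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for mean, _, n in groups)
    # Variation of respondents within each column.
    ss_within = sum((n - 1) * sd ** 2 for _, sd, n in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical Area columns: North, Mid, South.
f, df1, df2 = f_statistic([(1.40, 0.80, 100), (1.60, 0.90, 100), (1.45, 0.85, 100)])
print(round(f, 2), df1, df2)
```

The resulting value would then be compared with the F distribution for the two degrees-of-freedom figures to obtain a probability.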

Other software

Data from Companion can be output to other specialised software for further statistical tests.