What is statistics?

Statistics is an investigation of data.
To use statistics, we must first collect data and then organize them in the way that is most convenient for the case at hand, so that we can use them to draw conclusions about the given situation.

To understand the fundamental principles of statistics we must begin by clarifying and defining certain concepts:

• Frequency: Indicates the number of times a certain element appears in some set.
• Mode: The element that appears most frequently in a set.
• Relative frequency: Indicates the number of times a certain element appears in a set in relation to the total number of elements in the set.
• Mean or average: It is calculated by adding the values of the elements in a set and dividing the total by the number of elements in the set.

There are many ways to display statistical parameters, including tables, lists, graphs, charts, diagrams and more.

We know that the word statistics can sound a bit threatening and incomprehensible, perhaps because we don't usually use it in our daily conversations.

Statistics really is a kind of language of its own, but after you read this article you will see how, in a matter of minutes, you will know all you need to know about this topic to be able to solve exercises without blinking an eye.

Don't believe us? Stay with us!

Let's see then, what is statistics?

Statistics is an investigation of data.

No, we do not expect you to investigate natural phenomena, but simply to carry out an analysis that allows us to discover important things.

Important for life itself, but mainly for the next school test.

How will we find the data we need?

The first step is to collect and organize data.

Imagine doing some research on the grades obtained in a test.
The first thing to do is to gather all the data, that is, all the grades received on that test.
If, instead of looking at each grade separately, we sort them, we will be able to draw conclusions about the test.

Here are some options for the best way to sort the data:

Data Lists

We will simply write down all the data obtained in a list.
For example:

List of test scores.

$90,80,70,100,80,50,60,90,70,80$

Table

We will draw a table with two columns.
The first column holds the data (the grade) and the second holds the number of times this data was received. In our example: the number of students who obtained the grade.

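For example, for the list of test scores above:

Grade | Number of students
50 | 1
60 | 1
70 | 2
80 | 3
90 | 2
100 | 1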

Bar or column chart

This type of diagram is also called a discrete graph.

Do not be afraid of the word diagram, as this is just a geometric drawing that you can easily plot on the Cartesian plane.

On the $X$ axis you write your data.
On the $Y$ axis, the number of times this data appears.
In our example, the number of students who got the grade.

A tip from us to create a diagram easily:

Imagine the table with the data as if it were a table of $X$ and $Y$ values.
Label the grade column with the letter $X$ and the number-of-students column with the letter $Y$.
Place the points on the Cartesian plane and then draw the corresponding bars.
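
If you want to try this idea without drawing anything, here is a minimal sketch in Python. It assumes the frequencies of our example and draws the bars sideways with the character # so that they fit in plain text. The name bar_rows is a hypothetical helper chosen for this illustration.

```python
# Frequencies from the article's example: grade -> number of students.
frequency_table = {50: 1, 60: 1, 70: 2, 80: 3, 90: 2, 100: 1}


def bar_rows(table):
    """Build one text row per grade; the bar length equals the frequency."""
    return [f"{grade:>3} | {'#' * count}" for grade, count in sorted(table.items())]


rows = bar_rows(frequency_table)
print("\n".join(rows))
```

Each row plays the role of one bar: the grade is the $X$ value and the number of # characters is the $Y$ value.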

Pie chart

It's time to bring the round cake pan.
Just kidding... don't run to the kitchen!

The pie chart is also called a pizza chart, cake chart or cheese chart, because of its resemblance to them.
The size of each slice represents the amount of each piece of information.
The more times the data item appears, the larger the slice and vice versa.
In our example, the more students who receive that grade, the larger the slice that represents it.
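
To see how large each slice should be, here is a short sketch in Python. It relies on an assumption that the article does not state explicitly: a full circle measures $360$ degrees, so each slice gets the share of those degrees that matches its share of the data. The name slice_angles is chosen only for this illustration.

```python
# Frequencies from the article's example: grade -> number of students.
frequency_table = {50: 1, 60: 1, 70: 2, 80: 3, 90: 2, 100: 1}

total = sum(frequency_table.values())

# Assumption: the whole pie is 360 degrees, split in proportion to frequency.
slice_angles = {grade: 360 * count / total for grade, count in frequency_table.items()}

for grade in sorted(slice_angles):
    print(f"grade {grade}: {slice_angles[grade]:.0f} degrees")
```

The grade obtained by the most students gets the widest slice, as described above.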

Graph

As with the bar chart, you can easily create a line graph if you imagine the data as a table of $X$ and $Y$ values.
Again, draw the two axes, mark the data (the grade) on the $X$ axis and the quantity (the number of students who obtained that grade) on the $Y$ axis, and mark the corresponding points.
This time, instead of drawing rectangles, just draw a line through all the points.

For example

Important note:

The way you choose to organize your data does not change anything: all the ways are valid and represent the same thing.
Just pay attention to see if you are asked for a specific way of representation.
Perfect! You have organized the data in a very good and professional way! Now we can move on to analyzing the data and drawing the required conclusions.

Statistical frequency

The statistical frequency represents the number of times each piece of data is obtained.
Hint: Remember that the word frequent is synonymous with common/habitual/usual. When you see the word "frequent", remember to ask: how common was the data?
In our example, it is the number of times a specific grade was earned, that is, the number of students who earned that grade.

The frequency of the grade $100$ is $1$.
The frequency of the grade $90$ is $2$.
The frequency of the grade $80$ is $3$.
So on and so forth...
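
If you prefer to let a computer do the counting, the following is a minimal sketch in Python, assuming the list of test scores from our example. It uses Counter from the standard library; the variable names are chosen only for this illustration.

```python
from collections import Counter

# Test scores from the example list in this article.
scores = [90, 80, 70, 100, 80, 50, 60, 90, 70, 80]

# Counter maps each grade to the number of times it appears (its frequency).
frequency = Counter(scores)

# Print the grades from lowest to highest together with their frequencies.
for grade in sorted(frequency):
    print(grade, frequency[grade])
```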

Relative Frequency in Statistics

Relative frequency in statistics describes the relationship between the frequency and the amount of data.
Imagine it this way: if the frequency of the grade $50$ is $2$, i.e., two students got the grade $50$,
but the number of students is $4$, we can deduce that half of the class failed the exam.
However, if the number of students is $20$, we can conclude that only one tenth of the students failed the exam (provided, of course, that all the other grades were higher).

To calculate the relative frequency of a particular piece of data,
we will take the frequency of the grade and divide it by the amount of data collected.

In our example:

The relative frequency of the grade $100$ is: $\frac{1}{10}$ or $10\%$.
The relative frequency of the grade $90$ is: $\frac{2}{10}$ or $20\%$.
The relative frequency of the grade $80$ is: $\frac{3}{10}$ or $30\%$.
And so on...

Hint: Do you want to make sure you did it correctly?
The sum of the relative frequencies of all the data is always $1$, or $100\%$.
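
The calculation and the check from the hint can be sketched in Python as follows. This is only an illustration that assumes the list of test scores from our example. It uses Fraction from the standard library so that the fractions stay exact (note that Python prints them in lowest terms, so $\frac{2}{10}$ appears as 1/5).

```python
from collections import Counter
from fractions import Fraction

# Test scores from the example list in this article.
scores = [90, 80, 70, 100, 80, 50, 60, 90, 70, 80]
frequency = Counter(scores)
total = len(scores)

# Relative frequency = frequency of the grade / amount of data collected.
relative_frequency = {
    grade: Fraction(count, total) for grade, count in frequency.items()
}

for grade in sorted(relative_frequency):
    share = relative_frequency[grade]
    print(f"{grade}: {share} = {float(share):.0%}")

# Check from the hint: all the relative frequencies together add up to 1.
assert sum(relative_frequency.values()) == 1
```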

Main Indexes in Statistics

Mode, range, mean/average and median.

Mode

It is the most common value, the one that appears the greatest number of times.
To find the mode, look at which value appears most frequently.

In our example we will ask: Which score was obtained the most times?
The answer is $80$.
Therefore, the mode is $80$. Its frequency is $3$, which is the highest among all the frequencies presented.

Note:

Several modes can appear in a single investigation.
If another score in our example had also been obtained $3$ times, we would mark it as a mode as well.
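
Here is a small sketch in Python that finds the mode and also handles the case of several modes. It assumes the list of test scores from our example. The function name modes and the extra grade of $90$ in the second call are hypothetical, added only to show a tie.

```python
from collections import Counter

# Test scores from the example list in this article.
scores = [90, 80, 70, 100, 80, 50, 60, 90, 70, 80]


def modes(values):
    """Return every value whose frequency equals the highest frequency."""
    frequency = Counter(values)
    highest = max(frequency.values())
    return sorted(value for value, count in frequency.items() if count == highest)


print(modes(scores))          # [80]
print(modes(scores + [90]))   # hypothetical extra grade, 80 and 90 tie: [80, 90]
```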

Range

The interval between the minimum value and the maximum value.
In our example, the grades range from $50$ to $100$:
$50$ is the lowest grade and $100$ is the highest, so the range spans $100-50=50$ points.
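
In Python, the two ends of the range can be found with the built-in functions min and max. The sketch below assumes the list of test scores from our example. It also computes the difference between the two ends, which is a common way to express the range as a single number (this goes slightly beyond the definition given here).

```python
# Test scores from the example list in this article.
scores = [90, 80, 70, 100, 80, 50, 60, 90, 70, 80]

lowest = min(scores)       # the minimum value
highest = max(scores)      # the maximum value
spread = highest - lowest  # width of the interval between them

print(f"The grades go from {lowest} to {highest}, a spread of {spread} points.")
```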

Mean / Average

The value that represents by itself the whole investigated set.
How shall we calculate it?

With a simple equation

The sum of the data will be: the first data value times its frequency + the second data value times its frequency + the third data value times its frequency, and so on...
In other words, we multiply each data value by its frequency and then add the products.
We divide the result by the number of data and, in this way, we find the average.

In our example:
let's calculate the sum of the data using the frequencies found above and divide by the number of data.

$\frac{50\times 1 + 60\times 1 + 70\times 2 + 80\times 3 + 90\times 2 + 100\times 1}{1+1+2+3+2+1}=\frac{770}{10}=77$

Then, the mean score is $77$.
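
The same calculation can be sketched in Python. The example assumes the frequencies of our test scores, written as a dictionary that maps each grade to the number of students who obtained it. The variable names are chosen only for this illustration.

```python
# Frequencies from the article's example: grade -> number of students.
frequency_table = {50: 1, 60: 1, 70: 2, 80: 3, 90: 2, 100: 1}

# Multiply each grade by its frequency and add the products.
weighted_sum = sum(grade * count for grade, count in frequency_table.items())

# The number of data is the sum of all the frequencies.
number_of_data = sum(frequency_table.values())

mean = weighted_sum / number_of_data
print(weighted_sum, number_of_data, mean)
```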

Median

The median is the data value that holds the central position. How can you remember it?
Think of it this way: half of the data are greater than or equal to it and half are less than or equal to it.
The median we are looking for is in the middle.

How do we find the median?

First step

Let's arrange the list of data in increasing order and check whether the amount of data is even or odd.

For example

$50,60,70,70,80,80,80,90,90,100$

If the amount of data is even, there is no single element with half of the data greater than it and half of the data less than it. Therefore,
we add $1$ to the amount of elements and then divide by $2$.
That is: $10+1=11$
$11\div 2=5.5$
The result $5.5$ tells us that the median lies between the elements at locations $5$ and $6$.

Second step

We will calculate the mean of the two central elements, i.e., those at locations $5$ and $6$.
$80+80=160$
$160\div 2=80$
Our median is $80$.

Hint: If the amount of data in the table is very large, there is no need to write them all out in a sorted list.
You can look at the table and see which element is in which location.

Here we see that:
$50$ takes the first place,
$60$ the second,
$70$ the third and the fourth
and $80$ takes the fifth, sixth and seventh places.
We have reached the places we wanted (fifth and sixth), so there is no need to continue: we can calculate the mean of these two elements, which gives us the median.
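
This way of walking through the table can be sketched in Python. The example assumes the frequencies of our test scores. The function value_at_position is a hypothetical helper written for this illustration: it adds up the frequencies until it reaches the requested place.

```python
# Frequencies from the article's example: grade -> number of students.
frequency_table = {50: 1, 60: 1, 70: 2, 80: 3, 90: 2, 100: 1}


def value_at_position(table, position):
    """Return the value found at a given place (counting from 1) in the sorted data."""
    reached = 0
    for value in sorted(table):
        reached += table[value]  # last place occupied by this value
        if position <= reached:
            return value
    raise ValueError("position is beyond the amount of data")


fifth = value_at_position(frequency_table, 5)
sixth = value_at_position(frequency_table, 6)
median_from_table = (fifth + sixth) / 2
print(fifth, sixth, median_from_table)
```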

If the amount of data is odd, we do the same:
add $1$ to the amount of elements and divide by $2$.
However, this time we do not need to calculate the mean of the two central elements since there is only one.

For example, if we had $11$ grades,
we would calculate it like this:
$11+1=12$
$12\div 2=6$
Attention: the median is not $6$ but the grade located at the sixth place in your sorted list. $6$ is only the location of the median.
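
Both cases, even and odd, can be put together in a short Python sketch. It assumes the list of test scores from our example. The function name median and the extra grade of $75$ used to get $11$ grades are hypothetical, chosen only for this illustration.

```python
# Test scores from the example list in this article.
scores = [90, 80, 70, 100, 80, 50, 60, 90, 70, 80]


def median(values):
    """Return the median following the steps described in the article."""
    ordered = sorted(values)   # first step: increasing order
    amount = len(ordered)
    if amount % 2 == 1:
        # Odd amount: the median sits at location (amount + 1) / 2.
        location = (amount + 1) // 2
        return ordered[location - 1]   # Python lists count from 0
    # Even amount: take the mean of the two central elements.
    lower = ordered[amount // 2 - 1]
    upper = ordered[amount // 2]
    return (lower + upper) / 2


print(median(scores))          # 10 grades: mean of the 5th and 6th
print(median(scores + [75]))   # hypothetical 11th grade: the 6th element
```

Note that the function returns the grade found at the central location, not the location itself.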

To sum up...

Now that you know everything you need to solve exercises in statistics, you just have to practice.
It is not without reason that we said statistics is a language.
Do not be afraid or worry about it: you will see that, little by little, the word statistics will stop scaring you. On the contrary, now that you understand the subject, you may even be happy to find it in an exam.

On the Tutorela blog you will find a variety of articles about mathematics.
