What is Data?

Data is a collection of raw facts that can be processed to produce information. Data can take the form of numbers, words, symbols, or even pictures. For example:

Data in different forms:

Numbers: 10, 20, 2006, 29, 06, etc.
Words: Something, Learning, Thankyou, etc.
Symbols

Consider the data above, such as "Something" and "Learning". Can you understand anything from it?

Because this data has not been processed into information, it tells us nothing on its own.

Data vs. Information

Data consists of raw facts, values, and figures; in itself it does not mean anything. Once it is structured, processed, and presented as a response to a question, it becomes information.
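As a minimal sketch of this idea, the Python snippet below takes three raw numbers (29, 6, 2006) and, assuming purely for illustration that they stand for a day, a month, and a year, structures them as a date in order to answer a question. The date interpretation and the function name `weekday_of` are assumptions made for this example.

```python
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Raw data: three numbers with no meaning on their own.
raw = [29, 6, 2006]

def weekday_of(day: int, month: int, year: int) -> str:
    """Structure raw numbers as a date and answer: which weekday was it?"""
    structured = date(year, month, day)      # structuring gives each number a role
    return DAY_NAMES[structured.weekday()]   # weekday() returns 0 for Monday

# Information: the answer to a specific question about the data.
information = weekday_of(*raw)
print(f"Raw data: {raw}")
print(f"Information: that date fell on a {information}")
```

The same three numbers are unchanged; only the structure and the question asked of them turn them into information.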


Difference between Data and Information


Data          Information
something     other
any           or
any           or

Data Processing

Data processing refers to organizing and performing various operations on collected data to produce meaningful information.
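The flow from collected data to meaningful information can be sketched in Python as follows. The sensor names, the readings, and the function name `process_readings` are hypothetical; the point is only that organizing (grouping) the data and performing an operation on it (averaging) yields something meaningful.

```python
from collections import defaultdict

# Input: collected raw data as (sensor, temperature in degrees Celsius) pairs.
# All values are made up for illustration.
readings = [("north", 31.0), ("south", 38.5), ("north", 29.0), ("south", 40.5)]

def process_readings(pairs):
    """Organize readings by sensor, then compute the average per sensor."""
    grouped = defaultdict(list)
    for sensor, temp in pairs:        # organizing: group values by sensor
        grouped[sensor].append(temp)
    # operation: reduce each group to its mean
    return {sensor: sum(temps) / len(temps) for sensor, temps in grouped.items()}

# Output: information derived from the raw readings.
averages = process_readings(readings)
for sensor, avg in sorted(averages.items()):
    print(f"{sensor}: average {avg:.1f} C")
```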

[Figure: input, process, output]

Data Analysis

Data analysis is about finding useful patterns in data based on the goals or purpose of the data analysis activity.

Data analysis is very important for Machine Learning.

Machines analyze historical and currently available data to make predictions and arrive at conclusions that guide action.

Processes Involved in Data Analysis:

  1. Inspecting the data
  2. Segregating and filtering useful data
  3. Analyzing the data and building a model of it to discover meaningful information
  4. Drawing conclusions according to the goal
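The four steps above can be sketched on a toy example. The records below (hours studied and exam score) are invented, and a hand-written least-squares line is used as one possible "model"; it is an assumption chosen for brevity, not the only way to carry out step 3.

```python
# Hypothetical records: (hours studied, exam score); None marks a missing score.
records = [(1, 52), (2, 54), (3, None), (4, 58), (5, 60)]

# 1. Inspect the data: how many records are there, and how many are incomplete?
missing = sum(1 for _, score in records if score is None)
print(f"{len(records)} records, {missing} with a missing score")

# 2. Segregate and filter: keep only the complete records.
clean = [(h, s) for h, s in records if s is not None]

# 3. Analyze and model: fit score = slope * hours + intercept by least squares.
n = len(clean)
mean_h = sum(h for h, _ in clean) / n
mean_s = sum(s for _, s in clean) / n
slope = (sum((h - mean_h) * (s - mean_s) for h, s in clean)
         / sum((h - mean_h) ** 2 for h, _ in clean))
intercept = mean_s - slope * mean_h

# 4. Conclude according to the goal: estimate the score for 6 hours of study.
predicted = slope * 6 + intercept
print(f"Model: score = {slope:.2f} * hours + {intercept:.2f}")
print(f"Estimated score for 6 hours: {predicted:.1f}")
```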

Datasets

Artificial Intelligence requires a huge amount of data to produce acceptable output. When people pool their data, a large volume of raw data can be collected.

In Machine Learning, a common task is the study and construction of algorithms that can learn and make predictions on data.

Such algorithms work by building a practical model from input data and using it to make data-driven decisions.

Data is usually organized into datasets for the purpose of analysis.

A dataset is a collection of related information that is composed of separate elements but can be manipulated by a computer as a unit.

There are two sets of data:

1. Training Data
2. Testing Data
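A minimal Python sketch of dividing one dataset into these two sets is shown below. The function name `split_dataset`, the 80/20 proportion, and the fixed seed are assumptions made for illustration. The usual purpose of the split (a general convention, not something stated above) is to fit a model on the training data and then check it on testing data it has not seen.

```python
import random

def split_dataset(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (training, testing) lists."""
    shuffled = list(rows)                    # copy, so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)    # fixed seed keeps the split repeatable
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

dataset = list(range(10))                    # stand-in for ten data records
training, testing = split_dataset(dataset)
print(f"{len(training)} training rows, {len(testing)} testing rows")
```

Every record ends up in exactly one of the two sets, so no testing record is also used for training.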

Training Dataset

Testing Dataset

Types of Data

At the highest level, data can be divided into two types:

1. Quantitative Data

2. Qualitative Data

Quantitative Data

Quantitative data consists of:


1. Numbers and things you can measure objectively
2. Dimensions such as Height, Width, and Length
3. Temperature and Humidity
4. Prices
5. Area and Volume
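Because quantitative values are numbers, arithmetic on them is meaningful. The short Python sketch below uses hypothetical dimensions and prices to derive an area, a volume, and an average; all values and variable names are invented for illustration.

```python
# Hypothetical measurements of a box, in metres.
height, width, length = 0.5, 2.0, 3.0

floor_area = width * length          # square metres
volume = height * width * length     # cubic metres

# Hypothetical prices: quantitative values can be summed and averaged.
prices = [10.0, 20.0, 30.0]
average_price = sum(prices) / len(prices)

print(f"Area: {floor_area} m^2, volume: {volume} m^3, average price: {average_price}")
```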

This is only the highest-level classification of data.

Quantitative and qualitative data can each be further classified into various types, as discussed in the subsequent sections.

Quantitative Data
|----- Continuous Data
|----- Discrete Data

Continuous Data

Discrete Data

Qualitative Data

Qualitative data consists of:


1. Characteristics and descriptors that can't be easily measured, but can be observed subjectively
2. Examples such as smells, tastes, textures, attractiveness, and color
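Qualitative values act as labels rather than measurements, so averaging them is not meaningful, but counting how often each label occurs is. The Python sketch below illustrates this with made-up color observations; the list contents and variable names are hypothetical.

```python
from collections import Counter

# Hypothetical observations of color: labels, not measurements.
colors = ["red", "green", "red", "yellow", "red"]

# Counting occurrences is a meaningful operation on labels.
counts = Counter(colors)
most_common_color, frequency = counts.most_common(1)[0]
print(f"Most frequent color: {most_common_color} ({frequency} of {len(colors)})")
```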

Binomial Data

Nominal Data

Ordinal Data