What is Data?
Data is a collection of raw facts that can be processed to produce information. Data can take the form of numbers, words, symbols,
or even pictures. For example:
Numbers | 10, 20, 2006, 29, 06 etc |
Words | Something, Learning, Thankyou etc |
Symbols |
Consider the data above, such as "something" and "learning". Can you understand anything from it?
Because this data has not been processed into information, it tells us nothing on its own.
Data vs. Information
Data consists of raw facts, values, and figures; in itself, it does not mean anything. Once it is structured, processed, and presented as a response to a question, it becomes information.
Data | Information |
something | other |
any | or |
any | or |
Data Processing
Data processing refers to organizing and performing various operations on collected data to produce meaningful information.
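As a rough illustration of this idea, the Python sketch below takes a few raw numbers and processes them into a summary that answers a question ("how did the group score?"). The function name `summarize_marks` and the sample values are hypothetical, chosen only for this example.

```python
from statistics import mean

def summarize_marks(marks):
    """Turn a raw list of marks into a small summary (information)."""
    return {
        "count": len(marks),
        "average": mean(marks),
        "highest": max(marks),
        "lowest": min(marks),
    }

# Raw data: on their own, these numbers do not mean anything.
raw_marks = [10, 20, 29, 6]  # hypothetical values

# Processing: organize the raw values and compute something meaningful.
summary = summarize_marks(raw_marks)

# Information: the processed result, presented as an answer to a question.
print(f"{summary['count']} students scored {summary['average']} marks on average.")
```

The raw list and the printed sentence hold the same facts; only the second is structured as a response to a question.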
Data Analysis
Data analysis is about finding useful patterns in data based on the goals or purpose of the data analysis activity.
Data analysis is very important for Machine Learning.
Machines analyze historical and available data to make predictions and arrive at conclusions for taking action.
Processes Involved in Data Analysis:
- Inspecting the Data
- Segregating and filtering useful data
- Analyzing the data and building a model of it to discover meaningful information
- Drawing conclusions according to the goal
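The four steps above can be sketched in a few lines of Python. This is only a toy example under stated assumptions: the temperature readings are made up, and the "model" is simply the average change between consecutive readings, not a real Machine Learning model.

```python
# Hypothetical temperature readings in degrees Celsius; None marks a missing value.
readings = [21.5, 22.0, None, 23.5, 24.0, None, 25.5]

# 1. Inspect the data: how many values are there, and how many are missing?
total = len(readings)
missing = sum(1 for r in readings if r is None)

# 2. Segregate and filter: keep only the usable readings.
clean = [r for r in readings if r is not None]

# 3. Analyze and model: use the average change between consecutive
#    readings as a very simple trend model (an assumption for this sketch).
changes = [later - earlier for earlier, later in zip(clean, clean[1:])]
avg_change = sum(changes) / len(changes)

# 4. Conclude as per the goal: here the goal is to know whether
#    the temperature is rising.
conclusion = "rising" if avg_change > 0 else "not rising"

print(f"Inspected {total} readings, {missing} missing.")
print(f"Average change per reading: {avg_change}, so the temperature is {conclusion}.")
```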
Datasets
Artificial Intelligence requires a huge amount of data in order to provide an acceptable output. When people collectively pool their data, we can collect a huge amount of raw data.
In Machine Learning, a common task is the study and construction of algorithms that can learn and make predictions on data.
Such algorithms function by making data driven decisions through building a practical model from input data.
Data is usually organized into datasets for the purpose of analysis.
A dataset is a collection of related sets of information that is composed of separate elements but can be manipulated by a computer as a unit.
In Machine Learning, a dataset is usually split into two sets:
1. Training Data
2. Testing Data
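In common practice, training data is what a model learns from, while testing data is held back to check how well the model performs on examples it has not seen. A minimal sketch of such a split is shown below; the function name `split_dataset`, the 80/20 ratio, and the fixed seed are assumptions for illustration, not values given in this text.

```python
import random

def split_dataset(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records and split it into training and testing sets."""
    shuffled = list(records)               # copy, so the original order is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    test_size = int(len(shuffled) * test_fraction)
    testing = shuffled[:test_size]
    training = shuffled[test_size:]
    return training, testing

# Hypothetical dataset of ten records, identified here by the numbers 0 to 9.
dataset = list(range(10))

training_data, testing_data = split_dataset(dataset)
print("Training records:", len(training_data))
print("Testing records:", len(testing_data))
```

Shuffling before splitting helps avoid a split that depends on the order in which the data was collected.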
Training Dataset
Testing Dataset
Types of Data
At the highest level, data can be divided into two types:
1. Quantitative Data
2. Qualitative Data
Quantitative Data
1. | Numbers and things you can measure objectively |
2. | Dimensions such as Height, Width, and Length |
3. | Temperature and Humidity |
4. | Prices |
5. | Area and Volume |
This is only the highest-level classification of data.
Both quantitative and qualitative data can be further classified into various types, as discussed in the subsequent sections.
Quantitative Data
|----- Continuous Data
|----- Discrete Data
Continuous Data
Discrete Data
Qualitative Data
1. | Characteristics and descriptors that can't be easily measured, but can be observed subjectively |
2. | Such as Smells, Tastes, Textures, Attractiveness, and Color |
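To make the distinction concrete, the sketch below labels each field of a single observation as quantitative or qualitative. It uses the Python value type as a stand-in for "can be measured objectively", which is a simplification: numeric codes such as postal codes would be mislabeled. The function name `data_type` and the sample observation are hypothetical.

```python
from numbers import Number

def data_type(value):
    """Label a value as quantitative (numeric) or qualitative (descriptive)."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(value, Number) and not isinstance(value, bool):
        return "quantitative"
    return "qualitative"

# Hypothetical observation that mixes both kinds of data.
observation = {
    "height_cm": 172.5,   # a dimension, measured objectively
    "price": 499,         # a price
    "color": "red",       # a descriptor, observed subjectively
    "texture": "smooth",  # a descriptor, observed subjectively
}

labels = {field: data_type(value) for field, value in observation.items()}
for field, label in labels.items():
    print(f"{field}: {label}")
```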