Understanding Variable Types in Data Analysis

   Understanding variable types is an important first step in data analysis, because the type of a variable determines which summaries, plots, and models make sense for it. Data preparation in machine learning likewise starts with understanding what the data contains and the type of each variable in it. Once we know the variable types, we can make initial transformations to represent the data in a better way.

   In this post, we'll briefly learn some of the key variable types in statistics and data analysis.
   Generally, data variables can be classified into quantitative and qualitative variable types.

    • Quantitative variables express numerical values acquired through counting or measuring. They can be discrete (such as counts) or continuous (such as measurements). Counts, percentages, and other numeric values are examples of quantitative data.
    • Qualitative variables express categories such as names, symbols, colors, and labels. They are categorical, and they take values from a discrete set.
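
   As a rough illustration of this split, the sketch below labels a column as quantitative when every value is a number and as qualitative otherwise. The function name `variable_kind` and the sample columns are made up for this example, and the rule is only a first approximation.

```python
from numbers import Number


def variable_kind(values):
    """Return 'quantitative' if every value is numeric, else 'qualitative'."""
    # bool is a subclass of int in Python, so it is excluded explicitly:
    # True/False act as categories, not as measurements.
    numeric = all(
        isinstance(v, Number) and not isinstance(v, bool) for v in values
    )
    return "quantitative" if numeric else "qualitative"


# Hypothetical columns, invented for illustration.
columns = {
    "age": [23, 35, 41],
    "height_m": [1.62, 1.80, 1.75],
    "color": ["red", "blue", "red"],
}

kinds = {name: variable_kind(vals) for name, vals in columns.items()}
print(kinds)
# {'age': 'quantitative', 'height_m': 'quantitative', 'color': 'qualitative'}
```

   Keep in mind that numbers used as labels (postal codes, product IDs) are still qualitative, so a type check like this cannot replace knowing what the column means.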

   Quantitative variables can be further categorized into ratio and interval types.

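    • Ratio type is interval data with a natural (true) zero point, such as height, weight, or temperature in kelvin. A value of 0 means the quantity is absent, so ratios are meaningful: 20 kg is twice as heavy as 10 kg.
    • Interval type represents values with equally spaced, meaningful intervals but no true zero, such as temperature in degrees Celsius [-10, 0, 2, 20]. Differences are meaningful here, but ratios are not: 20 °C is not "twice as hot" as 10 °C.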
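
   Temperature is a handy way to see the difference between the two types. The sketch below compares the same two readings in degrees Celsius (no true zero) and in kelvin (true zero). The readings are arbitrary example values.

```python
import math


def celsius_to_kelvin(c):
    """Convert degrees Celsius to kelvin."""
    return c + 273.15


a_c, b_c = 10.0, 20.0  # arbitrary example readings in Celsius
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences agree on both scales: intervals are meaningful.
diff_c = b_c - a_c
diff_k = b_k - a_k
print(math.isclose(diff_c, diff_k))  # True

# Ratios disagree: only the kelvin ratio has a physical meaning.
ratio_c = b_c / a_c  # 2.0, an artifact of where 0 °C happens to sit
ratio_k = b_k / a_k  # about 1.035
print(ratio_c, round(ratio_k, 3))
```

   The difference survives the change of scale while the ratio does not, which is why ratios should only be computed on ratio-type variables.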
    
Qualitative variables can be further categorized as below.

    • Nominal data refers to categories without any order, such as a name, a type of car, or a product model.
    • Ordinal type refers to categories with a natural order, such as "beginner/intermediate/advanced" or "cold/medium/hot". The order is meaningful, but the distance between levels is not necessarily equal.
    • Dichotomous (binary) type is limited to only two categories, e.g. "1/0", "yes/no", or "true/false".
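
   These three types are usually prepared differently before modeling. The following is a minimal sketch in plain Python, assuming small in-memory lists. The helper `one_hot` and the sample values are invented for this example, and libraries such as pandas or scikit-learn offer their own encoders.

```python
# Ordinal: map each level to its rank so the order is preserved.
levels = ["beginner", "intermediate", "advanced"]
rank = {level: i for i, level in enumerate(levels)}
skills = ["advanced", "beginner", "intermediate"]
skill_codes = [rank[s] for s in skills]
print(skill_codes)  # [2, 0, 1]


# Nominal: one-hot encode, so no artificial order is introduced.
def one_hot(values):
    """Return the sorted categories and one 0/1 row per value."""
    categories = sorted(set(values))
    rows = [[int(v == c) for c in categories] for v in values]
    return categories, rows


cars = ["sedan", "truck", "sedan"]
categories, encoded = one_hot(cars)
print(categories)  # ['sedan', 'truck']
print(encoded)     # [[1, 0], [0, 1], [1, 0]]

# Dichotomous: a single 0/1 flag is enough.
answers = ["yes", "no", "yes"]
flags = [int(a == "yes") for a in answers]
print(flags)  # [1, 0, 1]
```

   Integer ranks suit ordinal data because the comparison `rank[a] < rank[b]` is meaningful, while the same trick applied to nominal data would suggest an order that does not exist.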

   Variables can also be described as "discrete" or "continuous".

    • A discrete variable takes only distinct, countable values, such as the number of items in an order.
    • A continuous variable can take any value within a range, such as a height or a duration.
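
   A quick, imperfect way to guess which of the two you are looking at is to check whether every value is a whole number. The function `looks_discrete` and the sample lists below are hypothetical, and the check is only a heuristic.

```python
def looks_discrete(values):
    """Heuristic: True if every value is a whole number."""
    return all(float(v).is_integer() for v in values)


# Hypothetical samples, invented for illustration.
orders_per_day = [3, 0, 7, 2]   # counts
durations_s = [1.25, 0.8, 3.0]  # measurements in seconds

print(looks_discrete(orders_per_day))  # True
print(looks_discrete(durations_s))     # False
```

   A continuous measurement that was rounded to whole numbers when recorded (for example, age in years) would pass this check, so the result is a hint and not a definition.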

   In this post, we've briefly covered the main variable types used in statistics, data analysis, and machine learning. Thank you for reading!
