The theory of data similarity

Data similarity is central to measuring the relationships between data items in data analysis. This research topic covers the definition of similarity measures, the computation of similarity, the properties of similarity measures, and evaluation criteria for similarity functions. Constructing a theory of similarity addresses core problems in data mining and big data analysis, and progress in this direction will influence the development of data technology.
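As a rough illustration of what defining a similarity measure and checking its properties can look like, here is a minimal Python sketch. The two measures (Jaccard similarity for sets, cosine similarity for numeric vectors) and the helper `check_properties` are illustrative choices, not measures prescribed by the text above. The properties tested (symmetry, self-similarity equal to 1, values in [0, 1]) are common candidates and are assumed here only for the sake of the example.

```python
import math


def jaccard(a, b):
    """Jaccard similarity of two sets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumed convention: two empty sets count as identical
    return len(a & b) / len(a | b)


def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def check_properties(sim, items, tol=1e-9):
    """Empirically test candidate properties of `sim` on a sample of items.

    This only checks the given sample; passing is evidence, not a proof.
    """
    pairs = [(a, b) for a in items for b in items]
    return {
        "symmetric": all(abs(sim(a, b) - sim(b, a)) <= tol for a, b in pairs),
        "self_similarity_is_one": all(abs(sim(a, a) - 1.0) <= tol for a in items),
        "bounded_0_1": all(-tol <= sim(a, b) <= 1.0 + tol for a, b in pairs),
    }
```

For example, `check_properties(jaccard, [{1, 2}, {2, 3}, {1, 2, 3}])` reports all three properties as holding on that sample, while cosine similarity fails the `bounded_0_1` check as soon as the sample contains vectors pointing in opposite directions, which shows why the properties of a measure need to be stated explicitly.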


Data measurements and data algebra

A complete and correct theory of data computing is vital to data science. The relational database management system (RDBMS) worked well when data fit naturally into tables, but it was known from the beginning that the relational model of data was incomplete. The model's limitations became obvious mainly through the difficulties encountered when using an RDBMS with particular data structures. This topic aims to construct algebraic systems for various types of data.


The research methods for data science

The basic research methods for data science include data exploration, data experiments and data perception. Data exploration examines the characteristics and structure of data sets, so that we can assess their value and select suitable methods for analyzing them. Data experiments check and verify hypotheses and laws of nature or of datanature. Data perception presents data in perceptible forms through the five senses: vision, hearing, touch, smell and taste.
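A first pass at data exploration can be as simple as profiling each attribute of a data set. The sketch below is a minimal illustration under stated assumptions: records are given as a list of Python dictionaries, missing values are represented by `None` or an absent key, and values are hashable scalars. The function name `profile` and the chosen statistics are hypothetical, not part of any method described above.

```python
import statistics


def profile(rows):
    """Summarise each column of a list of dict records.

    Assumes missing values are None (or an absent key) and that
    all present values are hashable scalars.
    """
    # Gather the values seen for each column name.
    columns = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)

    summary = {}
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        info = {
            "count": len(present),
            "missing": len(rows) - len(present),
            "distinct": len(set(present)),
        }
        # Add numeric statistics only when every present value is a number.
        is_numeric = present and all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in present
        )
        if is_numeric:
            info["min"] = min(present)
            info["max"] = max(present)
            info["mean"] = statistics.fmean(present)
        summary[name] = info
    return summary
```

A summary like this gives a quick view of completeness and value ranges, which is the kind of information needed to judge a data set's value and to choose an analysis method.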