In addition, you need to learn software for data acquisition, analysis and processing, mathematical modeling software, and programming languages. The resulting knowledge structure is that of an interdisciplinary talent (someone with both domain expertise and data thinking).
Take Renmin University of China as an example:
Basic courses: mathematical analysis, advanced algebra, introduction to general physical mathematics and information science, data structures, introduction to data science, introduction to programming, and programming practice.
Compulsory courses: discrete mathematics, probability and statistics, algorithm analysis and design, data computing intelligence, introduction to database systems, fundamentals of computer systems, parallel architecture and programming, and unstructured big data analysis.
Elective courses: introduction to data science algorithms, special topics in data science, data science practice, practical Internet development technology, sampling techniques, statistical learning, regression analysis, and stochastic processes.
Big data jobs:
1. Big data system architect
Big data platform construction, system design and infrastructure.
Skills: computer architecture, network architecture, programming paradigm, file system, distributed parallel processing, etc.
2. Big data system analyst
Working in a specific industry, this role uses big data technology to manage, analyze and apply data across the data security life cycle.
Skills: artificial intelligence, machine learning, mathematical statistics, matrix computation, optimization methods.
3. Hadoop development engineer
Solves big data storage problems.
4. Data analyst
Professionals in various industries who specialize in collecting, organizing and analyzing industry data, and who carry out industry research, evaluation and forecasting based on that data. In their work, they use tools to extract, analyze and present data in order to realize its commercial value.
5. Data mining engineer
Data mining means discovering patterns in massive data. It requires some mathematical background, such as linear algebra, advanced algebra, convex optimization and probability theory. Commonly used languages are Python, Java, C and C++; I mostly use Python or Java myself. Sometimes I write programs with MapReduce and then process the data with Hadoop or Hyp. If you use Python, you can combine it with Spark.
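For readers who have not met MapReduce, here is a minimal sketch of the idea in plain Python, using word counting as the example. It does not use Hadoop or Spark; it only imitates the map, shuffle and reduce steps in a single process. The function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are made up for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key (here, sum the counts)."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in order over an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, for illustration only.
    sample = ["big data big ideas", "data mining"]
    print(word_count(sample))
```

On a real cluster the map and reduce steps run in parallel on many machines and the framework handles the shuffle; the sketch only shows how the work is divided.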