Course details
Knowledge Discovery in Databases
ZZN Acad. year 2020/2021 Winter semester 5 credits
Data warehouses. Data mining techniques: association rules, classification and prediction, clustering. Mining unconventional data: data streams, time series and sequences, graphs, spatial and spatio-temporal data, multimedia. Text and web mining. Carrying out a data mining project using an available data mining tool.
Guarantor
Course coordinator
Language of instruction
Completion
Time span
- 39 hrs lectures
- 13 hrs projects
Assessment points
- 51 pts final exam (written part)
- 15 pts mid-term test (written part)
- 34 pts projects
Department
Lecturer
Instructor
Subject specific learning outcomes and competences
- Students get a broad, yet in-depth overview of the field of data mining and knowledge discovery.
- They are able both to use and to develop knowledge discovery tools.
- Students learn terminology in Czech and English.
- Students gain experience in solving projects in a small team.
- Students improve their ability to present and defend the results of projects.
Learning objectives
To familiarize students with data modelling methods and algorithms for discovering knowledge from data.
Why is the course taught
Because ever larger amounts of data are stored in databases and other data sources, there is a need to discover new knowledge that cannot be obtained by querying the data alone. Building on the knowledge and skills from the course UPA, which covers the data mining process and data preparation before modelling, students therefore need to become acquainted with data modelling methods and algorithms. These draw on methods and techniques from various areas, such as statistics and machine learning.
Prerequisite knowledge and skills
- Knowledge of the basic steps of the data mining process and of data preparation methods for the data modelling step (covered in the course UPA - Data Storage and Preparation).
- Basic knowledge of probability and statistics.
- Knowledge of database technology at the level of a bachelor's course.
Study literature
- Han, J., Kamber, M.: Data Mining: Concepts and Techniques. Third Edition. Morgan Kaufmann Publishers, 2012, 703 p., ISBN 978-0-12-381479-1.
Fundamental literature
- Han, J., Kamber, M.: Data Mining: Concepts and Techniques. Third Edition. Morgan Kaufmann Publishers, 2012, 703 p., ISBN 978-0-12-381479-1.
- Han, J., Kamber, M.: Data Mining: Concepts and Techniques. Second Edition. Elsevier Inc., 2006, 770 p., ISBN 1-55860-901-3.
Syllabus of lectures
- Data Warehouse and OLAP Technology for knowledge discovery.
- Mining frequent patterns and associations - basic concepts, efficient and scalable frequent itemset mining methods.
- Multi-level association rules, association mining and correlation analysis, constraint-based association rules.
- Predictive modelling - basic concepts, classification methods - decision tree, Bayesian classification, rule-based classification.
- Classification by means of neural networks, SVM classifier, Random forests.
- Other classification and regression methods. Evaluation of quality of classification and regression.
- Cluster analysis - basic concepts, types of data in cluster analysis. Partitioning-based and hierarchical clustering. Other clustering methods. Evaluation of quality of clustering.
- Outlier analysis. Mining in biological data.
- Introduction to mining data streams and time series.
- Introduction to mining in sequences, graphs, spatio-temporal data, moving object data and multimedia data.
- Text mining.
- Mining the Web. Process mining.
- Introduction to big data analytics.
Syllabus - others, projects and individual work of students
- Carrying out a data mining project using an available data mining tool.
Progress assessment
A mid-term test, formulation of a data mining task, presentation of the project.
Exam prerequisites:
Course credit requires completing the project, defending the project results, and obtaining at least 24 points for activities during the semester.
Controlled instruction
- Mid-term written exam: there is no resit; excused absences are handled by the guarantor.
- Formulation of the data mining task by the prescribed deadline; excused absences are handled by the assistant.
- Presentation of the project results by the prescribed deadline; excused absences are handled by the assistant.
- Final exam: at least 20 points must be obtained from the final exam; otherwise, no points are awarded for it. Excused absences are handled by the guarantor.
Course inclusion in study plans
- Programme IT-MGR-2, field MBI, MIN, 2nd year of study, Compulsory
- Programme IT-MGR-2, field MBS, any year of study, Compulsory-Elective group S
- Programme IT-MGR-2, field MGM, 2nd year of study, Elective
- Programme IT-MGR-2, field MIS, 2nd year of study, Compulsory-Elective group N
- Programme IT-MGR-2, field MMI, MMM, any year of study, Elective
- Programme IT-MGR-2, field MPV, any year of study, Compulsory-Elective group D
- Programme IT-MGR-2, field MSK, 2nd year of study, Compulsory-Elective group M
- Programme MITAI, field NADE, NCPS, NEMB, NGRI, NHPC, NIDE, NMAL, NMAT, NNET, NSEC, NSEN, NSPE, NVER, NVIZ, any year of study, Elective
- Programme MITAI, field NBIO, NISY, any year of study, Compulsory
- Programme MITAI, field NISD, 2nd year of study, Compulsory