Algorithms: text mining and web crawling (WF-R-PS-STAS)
1. Introduction to Text Mining algorithms, basic information about the method.
2. Using algorithms to count words in documents and assign weights to them: raw frequency and inverse document frequency - part one
3. Using algorithms to count words in documents and assign weights to them: raw frequency and inverse document frequency - part two
4. Presentation of the results by means of Principal Component Analysis.
5. Reporting the results of Principal Component Analysis.
6. Converting the database of words into numerical data.
7. Recoding of numerical data into new numerical variables.
8. Web Crawling - basic information
9. Reporting the results of Web Crawling
10. Combining qualitative data analysis methods using TM algorithms with other algorithms: decision trees.
11. Reporting the results of combining TM with decision trees
12. Combining qualitative data analysis methods using TM algorithms with other algorithms: Generalized k-means Cluster Analysis.
13. Reporting the results of combining TM with cluster analysis
14. Combining qualitative data analysis methods using TM algorithms with other algorithms: neural networks.
15. Reporting the results of combining TM with neural networks
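Topics 2 and 3 above (counting words and weighting them by raw frequency or inverse document frequency) can be illustrated with a minimal sketch. It is not the course's own procedure: the function names and sample documents are hypothetical, and the weighting shown is the plain variant, raw count multiplied by log(N / df). The software used in class may apply a different formula.

```python
import math
from collections import Counter

def raw_counts(docs):
    """Return one Counter of raw word frequencies per document."""
    return [Counter(doc.lower().split()) for doc in docs]

def idf_weights(counts):
    """Inverse document frequency: log(N / number of documents containing the word)."""
    n_docs = len(counts)
    # Iterating over a Counter yields each distinct word once per document.
    doc_freq = Counter(word for c in counts for word in c)
    return {word: math.log(n_docs / df) for word, df in doc_freq.items()}

def weighted_matrix(docs):
    """Build a document-by-word matrix of raw count * IDF (assumed variant)."""
    counts = raw_counts(docs)
    idf = idf_weights(counts)
    vocab = sorted(idf)
    rows = [[c[w] * idf[w] for w in vocab] for c in counts]
    return vocab, rows

# Hypothetical sample documents, for illustration only.
docs = [
    "text mining finds patterns in text",
    "web crawling collects text from the web",
    "decision trees classify documents",
]
vocab, rows = weighted_matrix(docs)
for doc, row in zip(docs, rows):
    print(doc, [round(x, 2) for x in row])
```

A word that occurs in every document receives a weight of zero under this variant, which is why inverse document frequency tends to highlight words that distinguish documents from one another. The resulting numeric matrix is the kind of input that Principal Component Analysis, decision trees, or cluster analysis can then work on.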
E-Learning
Group of university-wide courses
Subject level
Learning outcome code/codes
Type of subject
Course coordinators
Learning outcomes
KNOWLEDGE:
- PhD students correctly use the terminology of the Text Mining and Web Crawling methods.
SKILLS:
- PhD students carry out analyses using Text Mining algorithms and Principal Component Analysis, and search for data using Web Crawling.
COMPETENCES:
- PhD students correctly interpret the results of the analyses performed.
Description of ECTS credits
Participation in classes: 30 hours
Preparation for classes, preparation of reports, and reading of literature: 30 hours
Assessment criteria
To pass the course, students must submit two final reports presenting results obtained with the Web Crawling and Text Mining methods.
Bibliography
Elder, J., Hill, T., Miner, G., Nisbet, B., Delen, D., & Fast, A. (2012). Practical Text Mining and Statistical Analysis for Non-structured Text Data Applications. Oxford: Elsevier.
Nisbet, R., Elder, J., & Miner, G. (2009). Handbook of statistical analysis and data mining applications. Burlington, MA: Academic Press (Elsevier).
Szymańska, A. (2017). Wykorzystanie algorytmów Text Mining do analizy danych tekstowych w psychologii [Usage of text mining algorithms to analyze textual data in psychology]. Socjolingwistyka, 33, 99–116.
Additional information
Additional information (registration calendar, class instructors, location and schedule of classes) may be available in the USOSweb system: