Artificial Intelligence Algorithms in Big Data (lecture) WSE-BD-ASIwBD(I)-w

1. Introduction to Big Data – history and development of Big Data, definition and key challenges, applications across various industries.
2. Data acquisition in Big Data – web crawling algorithms, data collection from the internet, information extraction mechanisms.
3. Web scraping – methods of retrieving data from websites, scraping libraries, ethical and legal issues.
4. Transformer architecture – overview of the structure and functioning of models based on the attention mechanism.
5. Generative Pretrained Transformer (GPT) – analysis of the language model, its applications, and development.
6. BERT and its role in text data processing – applications in NLP, structure, and functioning of the model.
7. Text mining algorithms – information extraction from documents, sentiment analysis, text categorization.
8. Inductive algorithms: classification trees – basics of decision tree theory, examples of applications.
9. Regression trees – regression algorithms and predictive analysis based on data.
10. Artificial neural networks – fundamentals of operation, structure of perceptrons, and deep learning.
11. Kohonen networks – self-organizing feature maps and their applications.
12. Support Vector Machines (SVM) – classification and regression using SVM, applications in Big Data.
13. SVM – classification – detailed discussion of classification methods.
14. SVM – regression – application of Support Vector Machines in predictive analysis.
15. Summary – key conclusions, discussion of the practical applications of the methods learned.

Term 2025/26_Z:

General university course group

not applicable

Description of student workload in ECTS

Direct participation in classes: 30 hours
Participation in assessments outside of class: 2 hours
Participation in consultations: 15 hours
Total: 2 ECTS

Independent work:
Preparation for classes (reading, written work, translation, etc.): 10 hours
Preparation for assessment (e.g. reading, presentation, project, etc.): 20 hours
Total: 1 ECTS

Subject level

elementary

Type of subject

obligatory

Course coordinators

Agnieszka Szymańska

Learning outcomes

Knowledge:

Knows basic and advanced concepts related to Big Data and data acquisition techniques (web crawling, web scraping).

Understands the architecture of transformer-based models, including GPT and BERT, and their applications in text analysis.

Is familiar with major machine learning algorithms used in Big Data, such as decision trees, SVM, and neural networks.

Skills:

Can acquire and preprocess data from the Internet using basic information extraction techniques.

Is able to apply Text Mining methods and basic NLP tools to text analysis.

Can build and evaluate simple classification or regression models based on available data.

Social Competences:

Understands the importance of ethical and responsible data acquisition and processing.

Can critically assess analytical results obtained through AI algorithms.

Is prepared to further develop knowledge and skills in the area of Big Data and artificial intelligence.

Assessment criteria

The basis for passing the course is a test covering the knowledge acquired during the lecture. To pass the test, a student must provide 60% correct answers.

Bibliography

Elder, J., Hill, T., Miner, G., Nisbet, B., Delen, D., & Fast, A. (2012). Practical Text Mining and Statistical Analysis for Non-structured Text Data Applications. Oxford: Elsevier.

Nisbet, R., Elder, J., & Miner, G. (2009). Handbook of statistical analysis and data mining applications. Burlington, MA: Academic Press (Elsevier).

Szymańska, A. (2017a). Wykorzystanie algorytmów Text Mining do analizy danych tekstowych w psychologii [Usage of text mining algorithms to analyze textual data in psychology]. Socjolingwistyka, 33, 99–116.

Szymańska, A. (2017b). Wykorzystanie Analizy Skupień Metodą Data Mining Do Wykreślania Profili Osób Badanych [Using cluster analysis by the data mining method to draw profiles of study participants]. Studia Psychologiczne, 55, 26–42. https://doi.org/10.2478/V1067-010-0160-1

Additional information

Additional information (registration calendar, class instructors, location and schedule of classes) may be available in the USOSweb system:

Description of WSE-BD-ASIwBD(I)-w in USOSweb