Introduction to big data. Characteristics of big data. Big data and data science. Relational databases and big data. Distributed data systems. Hadoop ecosystem.
Big data management. Structured and semi-structured data models. Non-relational (NoSQL) data models. Data models and database systems for big data. Domain-specific languages for big data. Monitoring big data systems.
Big data processing. Querying and retrieval.
Paradigms for computing with data. Processing pipelines and aggregators. Basic algorithmic building blocks and patterns. Hadoop. Spark.
Data analytics with big data. Data analytics tools. Basic statistics. Clustering. Associations. Predictive modeling. Spark machine learning library MLlib.
Big data and graph analytics. NoSQL graph databases for big data. Neo4j graph database. Graph querying with Cypher. Basic graph analytics with Neo4j and Cypher.
Practical aspects of big data analytics. Processing heterogeneous data. Processing data streams.
- Lecturer: Matjaž Kukar