This repo holds the materials, lectures, and scripts for the Boston University course "Tools and Techniques for Data Mining and Applications". You can find more information about the course here. It is based on the materials developed by Evimaria Terzi and Harry Mavroforakis in Fall 2015, available [here](https://github.com/dataminingapp/dataminingapp-lectures).
Lecture 2 - Getting Started | [Pandas](http://nbviewer.ipython.org/github/datascience16/lectures/blob/master/Lecture2/Getting-to-know-your-data-with-Pandas.ipynb)
Lecture 3 - Distance Functions | Slides
Lecture 7 - Hierarchical Clustering
[Lecture 8 - EM](http://nbviewer.ipython.org/github/datascience16/lectures/blob/master/Lecture8/EM.ipynb)
Lecture 9 - Other clustering algorithms | [Density-slides](https://github.com/datascience16/lectures/blob/master/Lecture9/density-based-clustering.pdf?raw=true)
Lecture 11 - SVD in practice | Web scraping slides
Lecture 12 - Classification | [More classification methods](https://github.com/datascience16/lectures/blob/master/Lecture13/NBSVM.pdf?raw=true) | [SVM](http://nbviewer.ipython.org/github/datascience16/lectures/blob/master/Lecture14/SVM.ipynb)
Lecture 13 - Classification II | [Slides](https://github.com/datascience16/lectures/blob/master/Lecture13/kNNHigh.pdf?raw=true) | [RandomP](http://nbviewer.ipython.org/github/datascience16/lectures/blob/master/Lecture13/RandomP.ipynb)
Lecture 14 - Linear Regression
Lecture 15 - Logistic Regression
Lecture 16 - Linear Regression II
Lecture 17 - Recommender Systems
Lecture 18 - Introduction to graph analysis
Lecture 19 - Node Centralities | [Centrality-slides](https://github.com/datascience16/lectures/blob/master/Lecture19/Centrality-Measures.pdf?raw=true)
Lecture 20 - Community detection | [Cuts-slides](https://github.com/datascience16/lectures/blob/master/Lecture20/cuts.pdf?raw=true)
Lecture 22 - Map Reduce Graph Algorithms
Lecture 23 - Computing Triangles slides | [Spark Slides](https://github.com/datascience16/lectures/blob/master/Lecture23/spark.pdf?raw=true)
The homeworks of this course can be found at this repository.