lectures/polars.md
Lines changed: 16 additions & 8 deletions
@@ -65,6 +65,10 @@ as [statsmodels](https://www.statsmodels.org/) and [scikit-learn](https://scikit
 
 This lecture will provide a basic introduction to polars.
 
+```{tip}
+**Why use Polars over pandas?** The main reason is performance. As a general rule, it is recommended to have 5 to 10 times as much RAM as the size of the dataset to carry out operations in pandas, compared to 2 to 4 times for Polars. In addition, Polars is between 10 and 100 times as fast as pandas for common operations. A great article comparing Polars and pandas can be found [in this JetBrains blog post](https://blog.jetbrains.com/pycharm/2024/07/polars-vs-pandas/).
+```
+
 Throughout the lecture, we will assume that the following imports have taken
 place
 
@@ -89,7 +93,6 @@ A `DataFrame` is a two-dimensional object for storing related columns of data.
 
 Let's start with Series.
 
-
 We begin by creating a series of four random observations
 
 ```{code-cell} ipython3
@@ -98,11 +101,11 @@ s
 ```
 
 ```{note}
-You may notice the above series has no indexes, unlike in [](pandas:series).
+You may notice the above series has no indices, unlike in [pd.Series](pandas:series).
 
 This is because Polars is column-centric, and accessing data is predominantly managed through filtering and boolean masks.
 
-Here is [an interesting blog post discussing this in more detail](https://medium.com/data-science/understand-polars-lack-of-indexes-526ea75e413)
+Here is [an interesting blog post discussing this in more detail](https://medium.com/data-science/understand-polars-lack-of-indexes-526ea75e413).
```
 
 Polars `Series` are built on top of Apache Arrow arrays and support many similar