{
"cells": [
{
"cell_type": "raw",
"metadata": {
"ExecuteTime": {
"end_time": "2020-01-02T10:50:31.982740Z",
"start_time": "2020-01-02T10:50:31.976911Z"
}
},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Machine Learning with vaex.ml\n",
"\n",
"If you want to try out this notebook with a live Python kernel, use mybinder:\n",
"\n",
"\n",
"\n",
"\n",
"The `vaex.ml` package brings some machine learning algorithms to `vaex`. If you installed the individual subpackages (`vaex-core`, `vaex-hdf5`, ...) instead of the `vaex` metapackage, you may need to install it by running `pip install vaex-ml`, or `conda install -c conda-forge vaex-ml`.\n",
"\n",
"The API of `vaex.ml` stays close to that of [scikit-learn](https://scikit-learn.org/stable/), while providing better performance and the ability to efficiently perform operations on data that is larger than the available RAM. This page is an overview and a brief introduction to the capabilities offered by `vaex.ml`."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"ExecuteTime": {
"end_time": "2021-04-13T10:14:59.411079Z",
"start_time": "2021-04-13T10:14:57.668212Z"
}
},
"outputs": [],
"source": [
"import vaex\n",
"vaex.multithreading.thread_count_default = 8\n",
"import vaex.ml\n",
"\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will use the well known [Iris flower](https://en.wikipedia.org/wiki/Iris_flower_data_set) and Titanic passenger list datasets, two classical datasets for machine learning demonstrations."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"ExecuteTime": {
"end_time": "2021-04-13T10:15:00.780624Z",
"start_time": "2021-04-13T10:14:59.413189Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"
"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"df.scatter(df.petal_length, df.petal_width, c_expr=df.class_);"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Preprocessing \n",
"\n",
"### Scaling of numerical features\n",
"\n",
"`vaex.ml` packs the common numerical scalers:\n",
"\n",
"* `vaex.ml.StandardScaler` - Scale features by removing their mean and dividing by their variance;\n",
"* `vaex.ml.MinMaxScaler` - Scale features to a given range;\n",
"* `vaex.ml.RobustScaler` - Scale features by removing their median and scaling them according to a given percentile range;\n",
"* `vaex.ml.MaxAbsScaler` - Scale features by their maximum absolute value.\n",
" \n",
"The usage is quite similar to that of `scikit-learn`, in the sense that each transformer implements the `.fit` and `.transform` methods."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"ExecuteTime": {
"end_time": "2021-04-13T10:15:01.969065Z",
"start_time": "2021-04-13T10:15:01.862449Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
#
sepal_length
sepal_width
petal_length
petal_width
class_
scaled_petal_length
scaled_petal_width
scaled_sepal_length
scaled_sepal_width
\n",
"\n",
"\n",
"
0
5.9
3.0
4.2
1.5
1
0.25096730693923325
0.39617188299171285
0.06866179325140277
-0.12495760117130607
\n",
"
1
6.1
3.0
4.6
1.4
1
0.4784301228962429
0.26469891297233916
0.3109975341387059
-0.12495760117130607
\n",
"
2
6.6
2.9
4.6
1.3
1
0.4784301228962429
0.13322594295296575
0.9168368863569659
-0.3563605663033572
\n",
"
3
6.7
3.3
5.7
2.1
2
1.1039528667780207
1.1850097031079545
1.0380047568006185
0.5692512942248463
\n",
"
4
5.5
4.2
1.4
0.2
0
-1.341272404759837
-1.3129767272601438
-0.4160096885232057
2.6518779804133055
\n",
"
...
...
...
...
...
...
...
...
...
...
\n",
"
145
5.2
3.4
1.4
0.2
0
-1.341272404759837
-1.3129767272601438
-0.7795132998541615
0.8006542593568975
\n",
"
146
5.1
3.8
1.6
0.2
0
-1.2275409967813318
-1.3129767272601438
-0.9006811702978141
1.726266119885101
\n",
"
147
5.8
2.6
4.0
1.2
1
0.13723589896072813
0.0017529729335920385
-0.052506077192249874
-1.0505694616995096
\n",
"
148
5.7
3.8
1.7
0.3
0
-1.1706752927920796
-1.18150375724077
-0.17367394763590144
1.726266119885101
\n",
"
149
6.2
2.9
4.3
1.3
1
0.30783301092848553
0.13322594295296575
0.4321654045823586
-0.3563605663033572
\n",
"\n",
"
"
],
"text/plain": [
"# sepal_length sepal_width petal_length petal_width class_ scaled_petal_length scaled_petal_width scaled_sepal_length scaled_sepal_width\n",
"0 5.9 3.0 4.2 1.5 1 0.25096730693923325 0.39617188299171285 0.06866179325140277 -0.12495760117130607\n",
"1 6.1 3.0 4.6 1.4 1 0.4784301228962429 0.26469891297233916 0.3109975341387059 -0.12495760117130607\n",
"2 6.6 2.9 4.6 1.3 1 0.4784301228962429 0.13322594295296575 0.9168368863569659 -0.3563605663033572\n",
"3 6.7 3.3 5.7 2.1 2 1.1039528667780207 1.1850097031079545 1.0380047568006185 0.5692512942248463\n",
"4 5.5 4.2 1.4 0.2 0 -1.341272404759837 -1.3129767272601438 -0.4160096885232057 2.6518779804133055\n",
"... ... ... ... ... ... ... ... ... ...\n",
"145 5.2 3.4 1.4 0.2 0 -1.341272404759837 -1.3129767272601438 -0.7795132998541615 0.8006542593568975\n",
"146 5.1 3.8 1.6 0.2 0 -1.2275409967813318 -1.3129767272601438 -0.9006811702978141 1.726266119885101\n",
"147 5.8 2.6 4.0 1.2 1 0.13723589896072813 0.0017529729335920385 -0.052506077192249874 -1.0505694616995096\n",
"148 5.7 3.8 1.7 0.3 0 -1.1706752927920796 -1.18150375724077 -0.17367394763590144 1.726266119885101\n",
"149 6.2 2.9 4.3 1.3 1 0.30783301092848553 0.13322594295296575 0.4321654045823586 -0.3563605663033572"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"features = ['petal_length', 'petal_width', 'sepal_length', 'sepal_width']\n",
"scaler = vaex.ml.StandardScaler(features=features, prefix='scaled_')\n",
"scaler.fit(df)\n",
"df_trans = scaler.transform(df)\n",
"df_trans"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The output of the `.transform` method of any `vaex.ml` transformer is a _shallow copy_ of a DataFrame that contains the resulting features of the transformations in addition to the original columns. A shallow copy means that this new DataFrame just references the original one, and no extra memory is used. In addition, the resulting features, in this case the scaled numerical features are _virtual columns,_ which do not take any memory but are computed on the fly when needed. This approach is ideal for working with very large datasets."
]
},
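{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the arithmetic concrete, here is a minimal plain-Python sketch of standard scaling: subtract the mean, then divide by the standard deviation. The helper `standard_scale` is hypothetical and not part of `vaex.ml`, and the use of the population standard deviation is an assumption. It is fit on five sample values only, so the numbers will not match the table above, which was fit on all 150 rows."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import statistics\n",
"\n",
"def standard_scale(values):\n",
"    # Hypothetical helper: (x - mean) / std, using the population standard deviation.\n",
"    mean = statistics.fmean(values)\n",
"    std = statistics.pstdev(values)\n",
"    return [(x - mean) / std for x in values]\n",
"\n",
"# petal_length of the first five rows shown in the table above\n",
"petal_length = [4.2, 4.6, 4.6, 5.7, 1.4]\n",
"scaled = standard_scale(petal_length)\n",
"print(scaled)"
]
},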
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Encoding of categorical features\n",
"\n",
"`vaex.ml` contains several categorical encoders:\n",
"\n",
"* `vaex.ml.LabelEncoder` - Encoding features with as many integers as categories, startinfg from 0;\n",
"* `vaex.ml.OneHotEncoder` - Encoding features according to the one-hot scheme;\n",
"* `vaex.ml.MultiHotEncoder` - Encoding features according to the multi-hot scheme (binary vector);\n",
"* `vaex.ml.FrequencyEncoder` - Encode features by the frequency of their respective categories;\n",
"* `vaex.ml.BayesianTargetEncoder` - Encode categories with the mean of their target value;\n",
"* `vaex.ml.WeightOfEvidenceEncoder` - Encode categories their weight of evidence value.\n",
" \n",
" The following is a quick example using the Titanic dataset."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"ExecuteTime": {
"end_time": "2021-04-13T10:15:03.177265Z",
"start_time": "2021-04-13T10:15:03.143397Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
#
pclass
survived
name
sex
age
sibsp
parch
ticket
fare
cabin
embarked
boat
body
home_dest
\n",
"\n",
"\n",
"
0
1
True
Allen, Miss. Elisabeth Walton
female
29
0
0
24160
211.338
B5
S
2
nan
St Louis, MO
\n",
"
1
1
True
Allison, Master. Hudson Trevor
male
0.9167
1
2
113781
151.55
C22 C26
S
11
nan
Montreal, PQ / Chesterville, ON
\n",
"
2
1
False
Allison, Miss. Helen Loraine
female
2
1
2
113781
151.55
C22 C26
S
--
nan
Montreal, PQ / Chesterville, ON
\n",
"
3
1
False
Allison, Mr. Hudson Joshua Creighton
male
30
1
2
113781
151.55
C22 C26
S
--
135
Montreal, PQ / Chesterville, ON
\n",
"
4
1
False
Allison, Mrs. Hudson J C (Bessie Waldo Daniels)
female
25
1
2
113781
151.55
C22 C26
S
--
nan
Montreal, PQ / Chesterville, ON
\n",
"\n",
"
"
],
"text/plain": [
" # pclass survived name sex age sibsp parch ticket fare cabin embarked boat body home_dest\n",
" 0 1 True Allen, Miss. Elisabeth Walton female 29 0 0 24160 211.338 B5 S 2 nan St Louis, MO\n",
" 1 1 True Allison, Master. Hudson Trevor male 0.9167 1 2 113781 151.55 C22 C26 S 11 nan Montreal, PQ / Chesterville, ON\n",
" 2 1 False Allison, Miss. Helen Loraine female 2 1 2 113781 151.55 C22 C26 S -- nan Montreal, PQ / Chesterville, ON\n",
" 3 1 False Allison, Mr. Hudson Joshua Creighton male 30 1 2 113781 151.55 C22 C26 S -- 135 Montreal, PQ / Chesterville, ON\n",
" 4 1 False Allison, Mrs. Hudson J C (Bessie Waldo Daniels) female 25 1 2 113781 151.55 C22 C26 S -- nan Montreal, PQ / Chesterville, ON"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = vaex.datasets.titanic()\n",
"df.head(5)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"ExecuteTime": {
"end_time": "2021-04-13T10:15:05.014900Z",
"start_time": "2021-04-13T10:15:04.289615Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
#
pclass
survived
name
sex
age
sibsp
parch
ticket
fare
cabin
embarked
boat
body
home_dest
label_encoded_embarked
embarked_missing
embarked_C
embarked_Q
embarked_S
embarked_0
embarked_1
embarked_2
frequency_encoded_embarked
mean_encoded_embarked
woe_encoded_embarked
\n",
"\n",
"\n",
"
0
1
True
Allen, Miss. Elisabeth Walton
female
29
0
0
24160
211.338
B5
S
2
nan
St Louis, MO
1
0
0
0
1
1
0
0
0.698243
0.337472
-0.696431
\n",
"
1
1
True
Allison, Master. Hudson Trevor
male
0.9167
1
2
113781
151.55
C22 C26
S
11
nan
Montreal, PQ / Chesterville, ON
1
0
0
0
1
1
0
0
0.698243
0.337472
-0.696431
\n",
"
2
1
False
Allison, Miss. Helen Loraine
female
2
1
2
113781
151.55
C22 C26
S
--
nan
Montreal, PQ / Chesterville, ON
1
0
0
0
1
1
0
0
0.698243
0.337472
-0.696431
\n",
"
3
1
False
Allison, Mr. Hudson Joshua Creighton
male
30
1
2
113781
151.55
C22 C26
S
--
135
Montreal, PQ / Chesterville, ON
1
0
0
0
1
1
0
0
0.698243
0.337472
-0.696431
\n",
"
4
1
False
Allison, Mrs. Hudson J C (Bessie Waldo Daniels)
female
25
1
2
113781
151.55
C22 C26
S
--
nan
Montreal, PQ / Chesterville, ON
1
0
0
0
1
1
0
0
0.698243
0.337472
-0.696431
\n",
"\n",
"
"
],
"text/plain": [
" # pclass survived name sex age sibsp parch ticket fare cabin embarked boat body home_dest label_encoded_embarked embarked_missing embarked_C embarked_Q embarked_S embarked_0 embarked_1 embarked_2 frequency_encoded_embarked mean_encoded_embarked woe_encoded_embarked\n",
" 0 1 True Allen, Miss. Elisabeth Walton female 29 0 0 24160 211.338 B5 S 2 nan St Louis, MO 1 0 0 0 1 1 0 0 0.698243 0.337472 -0.696431\n",
" 1 1 True Allison, Master. Hudson Trevor male 0.9167 1 2 113781 151.55 C22 C26 S 11 nan Montreal, PQ / Chesterville, ON 1 0 0 0 1 1 0 0 0.698243 0.337472 -0.696431\n",
" 2 1 False Allison, Miss. Helen Loraine female 2 1 2 113781 151.55 C22 C26 S -- nan Montreal, PQ / Chesterville, ON 1 0 0 0 1 1 0 0 0.698243 0.337472 -0.696431\n",
" 3 1 False Allison, Mr. Hudson Joshua Creighton male 30 1 2 113781 151.55 C22 C26 S -- 135 Montreal, PQ / Chesterville, ON 1 0 0 0 1 1 0 0 0.698243 0.337472 -0.696431\n",
" 4 1 False Allison, Mrs. Hudson J C (Bessie Waldo Daniels) female 25 1 2 113781 151.55 C22 C26 S -- nan Montreal, PQ / Chesterville, ON 1 0 0 0 1 1 0 0 0.698243 0.337472 -0.696431"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"label_encoder = vaex.ml.LabelEncoder(features=['embarked'])\n",
"one_hot_encoder = vaex.ml.OneHotEncoder(features=['embarked'])\n",
"multi_hot_encoder = vaex.ml.MultiHotEncoder(features=['embarked'])\n",
"freq_encoder = vaex.ml.FrequencyEncoder(features=['embarked'])\n",
"bayes_encoder = vaex.ml.BayesianTargetEncoder(features=['embarked'], target='survived')\n",
"woe_encoder = vaex.ml.WeightOfEvidenceEncoder(features=['embarked'], target='survived')\n",
"\n",
"df = label_encoder.fit_transform(df)\n",
"df = one_hot_encoder.fit_transform(df)\n",
"df = multi_hot_encoder.fit_transform(df)\n",
"df = freq_encoder.fit_transform(df)\n",
"df = bayes_encoder.fit_transform(df)\n",
"df = woe_encoder.fit_transform(df)\n",
"\n",
"df.head(5)"
]
},
{
"cell_type": "markdown",
"metadata": {
"ExecuteTime": {
"end_time": "2020-01-02T13:09:43.742926Z",
"start_time": "2020-01-02T13:09:43.676031Z"
}
},
"source": [
"Notice that the transformed features are all included in the resulting DataFrame and are appropriately named. This is excellent for the construction of various diagnostic plots, and engineering of more complex features. The fact that the resulting (encoded) features take no memory, allows one to try out or combine a variety of preprocessing steps without spending any extra memory. "
]
},
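{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a hedged illustration of what two of these encoders compute, the plain-Python sketch below derives a frequency encoding and an unsmoothed mean-target encoding for a small made-up sample. The sample values and variable names are invented for this example, and leaving out any prior or smoothing is a simplifying assumption, so this should not be read as the `vaex.ml` implementation."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from collections import Counter, defaultdict\n",
"\n",
"# Made-up sample, not the Titanic data\n",
"embarked = ['S', 'S', 'C', 'Q', 'S', 'C']\n",
"survived = [True, False, True, False, False, True]\n",
"\n",
"# Frequency encoding: replace each category by its share of the rows.\n",
"counts = Counter(embarked)\n",
"frequency = {cat: n / len(embarked) for cat, n in counts.items()}\n",
"\n",
"# Mean target encoding (no smoothing): replace each category by the mean of the target.\n",
"totals = defaultdict(float)\n",
"for cat, target in zip(embarked, survived):\n",
"    totals[cat] += target\n",
"target_mean = {cat: totals[cat] / counts[cat] for cat in counts}\n",
"\n",
"freq_encoded = [frequency[cat] for cat in embarked]\n",
"mean_encoded = [target_mean[cat] for cat in embarked]\n",
"print(freq_encoded)\n",
"print(mean_encoded)"
]
},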
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Feature Engineering\n",
"\n",
"### KBinsDiscretizer\n",
"\n",
"With the `KBinsDiscretizer` you can convert a continous into a discrete feature by binning the data into specified intervals. You can specify the number of bins, the strategy on how to determine their size:\n",
"\n",
"* \"uniform\" - all bins have equal sizes;\n",
"* \"quantile\" - all bins have (approximately) the same number of samples in them;\n",
"* \"kmeans\" - values in each bin belong to the same 1D cluster as determined by the `KMeans` algorithm."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"ExecuteTime": {
"end_time": "2021-04-13T10:15:07.793286Z",
"start_time": "2021-04-13T10:15:07.742886Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/jovan/vaex/packages/vaex-core/vaex/ml/transformations.py:1089: UserWarning: Bins whose width are too small (i.e., <= 1e-8) in age are removed.Consider decreasing the number of bins.\n",
" warnings.warn(f'Bins whose width are too small (i.e., <= 1e-8) in {feat} are removed.'\n"
]
},
{
"data": {
"text/html": [
"
\n",
"\n",
"
#
pclass
survived
name
sex
age
sibsp
parch
ticket
fare
cabin
embarked
boat
body
home_dest
label_encoded_embarked
embarked_missing
embarked_C
embarked_Q
embarked_S
frequency_encoded_embarked
mean_encoded_embarked
woe_encoded_embarked
binned_age
\n",
"\n",
"\n",
"
0
1
True
Allen, Miss. Elisabeth Walton
female
29
0
0
24160
211.338
B5
S
2
nan
St Louis, MO
1
0
0
0
1
0.698243
0.337472
-0.696431
0
\n",
"
1
1
True
Allison, Master. Hudson Trevor
male
0.9167
1
2
113781
151.55
C22 C26
S
11
nan
Montreal, PQ / Chesterville, ON
1
0
0
0
1
0.698243
0.337472
-0.696431
0
\n",
"
2
1
False
Allison, Miss. Helen Loraine
female
2
1
2
113781
151.55
C22 C26
S
--
nan
Montreal, PQ / Chesterville, ON
1
0
0
0
1
0.698243
0.337472
-0.696431
0
\n",
"
3
1
False
Allison, Mr. Hudson Joshua Creighton
male
30
1
2
113781
151.55
C22 C26
S
--
135
Montreal, PQ / Chesterville, ON
1
0
0
0
1
0.698243
0.337472
-0.696431
0
\n",
"
4
1
False
Allison, Mrs. Hudson J C (Bessie Waldo Daniels)
female
25
1
2
113781
151.55
C22 C26
S
--
nan
Montreal, PQ / Chesterville, ON
1
0
0
0
1
0.698243
0.337472
-0.696431
0
\n",
"\n",
"
"
],
"text/plain": [
" # pclass survived name sex age sibsp parch ticket fare cabin embarked boat body home_dest label_encoded_embarked embarked_missing embarked_C embarked_Q embarked_S frequency_encoded_embarked mean_encoded_embarked woe_encoded_embarked binned_age\n",
" 0 1 True Allen, Miss. Elisabeth Walton female 29 0 0 24160 211.338 B5 S 2 nan St Louis, MO 1 0 0 0 1 0.698243 0.337472 -0.696431 0\n",
" 1 1 True Allison, Master. Hudson Trevor male 0.9167 1 2 113781 151.55 C22 C26 S 11 nan Montreal, PQ / Chesterville, ON 1 0 0 0 1 0.698243 0.337472 -0.696431 0\n",
" 2 1 False Allison, Miss. Helen Loraine female 2 1 2 113781 151.55 C22 C26 S -- nan Montreal, PQ / Chesterville, ON 1 0 0 0 1 0.698243 0.337472 -0.696431 0\n",
" 3 1 False Allison, Mr. Hudson Joshua Creighton male 30 1 2 113781 151.55 C22 C26 S -- 135 Montreal, PQ / Chesterville, ON 1 0 0 0 1 0.698243 0.337472 -0.696431 0\n",
" 4 1 False Allison, Mrs. Hudson J C (Bessie Waldo Daniels) female 25 1 2 113781 151.55 C22 C26 S -- nan Montreal, PQ / Chesterville, ON 1 0 0 0 1 0.698243 0.337472 -0.696431 0"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"kbdisc = vaex.ml.KBinsDiscretizer(features=['age'], n_bins=5, strategy='quantile')\n",
"df = kbdisc.fit_transform(df)\n",
"df.head(5)"
]
},
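{
"cell_type": "markdown",
"metadata": {},
"source": [
"The two simplest strategies can be sketched in plain Python. The example below computes bin edges for the \"uniform\" and \"quantile\" strategies on a made-up sample. The helper `bin_index`, the sample ages, and the choice of `statistics.quantiles` with the inclusive method are assumptions made for illustration and may differ from how `KBinsDiscretizer` places its edges."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import statistics\n",
"\n",
"ages = [0.9, 2, 25, 29, 30, 35, 40, 48, 58, 63]  # made-up sample\n",
"n_bins = 5\n",
"\n",
"# 'uniform': equal-width bins between the minimum and the maximum.\n",
"lo, hi = min(ages), max(ages)\n",
"uniform_edges = [lo + i * (hi - lo) / n_bins for i in range(n_bins + 1)]\n",
"\n",
"# 'quantile': interior edges at the 20%, 40%, 60% and 80% quantiles.\n",
"quantile_edges = [lo] + statistics.quantiles(ages, n=n_bins, method='inclusive') + [hi]\n",
"\n",
"def bin_index(value, edges):\n",
"    # Hypothetical helper: index of the bin containing value.\n",
"    # The last bin includes its right edge.\n",
"    for i in range(len(edges) - 2):\n",
"        if value < edges[i + 1]:\n",
"            return i\n",
"    return len(edges) - 2\n",
"\n",
"print([bin_index(a, uniform_edges) for a in ages])\n",
"print([bin_index(a, quantile_edges) for a in ages])"
]
},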
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### GroupBy Transformer\n",
"\n",
"The `GroupByTransformer` is a handy feature in `vaex-ml` that lets you perform a groupby aggregations on the training data, and then use those aggregations as features in the training and test sets."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"ExecuteTime": {
"end_time": "2021-04-13T10:15:09.682863Z",
"start_time": "2021-04-13T10:15:09.591867Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
#
pclass
survived
name
sex
age
sibsp
parch
ticket
fare
cabin
embarked
boat
body
home_dest
label_encoded_embarked
embarked_missing
embarked_C
embarked_Q
embarked_S
frequency_encoded_embarked
mean_encoded_embarked
woe_encoded_embarked
binned_age
age_mean
age_std
fare_mean
fare_std
\n",
"\n",
"\n",
"
0
1
True
Allen, Miss. Elisabeth Walton
female
29
0
0
24160
211.338
B5
S
2
nan
St Louis, MO
1
0
0
0
1
0.698243
0.337472
-0.696431
0
39.1599
14.5224
87.509
80.3226
\n",
"
1
1
True
Allison, Master. Hudson Trevor
male
0.9167
1
2
113781
151.55
C22 C26
S
11
nan
Montreal, PQ / Chesterville, ON
1
0
0
0
1
0.698243
0.337472
-0.696431
0
39.1599
14.5224
87.509
80.3226
\n",
"
2
1
False
Allison, Miss. Helen Loraine
female
2
1
2
113781
151.55
C22 C26
S
--
nan
Montreal, PQ / Chesterville, ON
1
0
0
0
1
0.698243
0.337472
-0.696431
0
39.1599
14.5224
87.509
80.3226
\n",
"
3
1
False
Allison, Mr. Hudson Joshua Creighton
male
30
1
2
113781
151.55
C22 C26
S
--
135
Montreal, PQ / Chesterville, ON
1
0
0
0
1
0.698243
0.337472
-0.696431
0
39.1599
14.5224
87.509
80.3226
\n",
"
4
1
False
Allison, Mrs. Hudson J C (Bessie Waldo Daniels)
female
25
1
2
113781
151.55
C22 C26
S
--
nan
Montreal, PQ / Chesterville, ON
1
0
0
0
1
0.698243
0.337472
-0.696431
0
39.1599
14.5224
87.509
80.3226
\n",
"\n",
"
"
],
"text/plain": [
" # pclass survived name sex age sibsp parch ticket fare cabin embarked boat body home_dest label_encoded_embarked embarked_missing embarked_C embarked_Q embarked_S frequency_encoded_embarked mean_encoded_embarked woe_encoded_embarked binned_age age_mean age_std fare_mean fare_std\n",
" 0 1 True Allen, Miss. Elisabeth Walton female 29 0 0 24160 211.338 B5 S 2 nan St Louis, MO 1 0 0 0 1 0.698243 0.337472 -0.696431 0 39.1599 14.5224 87.509 80.3226\n",
" 1 1 True Allison, Master. Hudson Trevor male 0.9167 1 2 113781 151.55 C22 C26 S 11 nan Montreal, PQ / Chesterville, ON 1 0 0 0 1 0.698243 0.337472 -0.696431 0 39.1599 14.5224 87.509 80.3226\n",
" 2 1 False Allison, Miss. Helen Loraine female 2 1 2 113781 151.55 C22 C26 S -- nan Montreal, PQ / Chesterville, ON 1 0 0 0 1 0.698243 0.337472 -0.696431 0 39.1599 14.5224 87.509 80.3226\n",
" 3 1 False Allison, Mr. Hudson Joshua Creighton male 30 1 2 113781 151.55 C22 C26 S -- 135 Montreal, PQ / Chesterville, ON 1 0 0 0 1 0.698243 0.337472 -0.696431 0 39.1599 14.5224 87.509 80.3226\n",
" 4 1 False Allison, Mrs. Hudson J C (Bessie Waldo Daniels) female 25 1 2 113781 151.55 C22 C26 S -- nan Montreal, PQ / Chesterville, ON 1 0 0 0 1 0.698243 0.337472 -0.696431 0 39.1599 14.5224 87.509 80.3226"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"gbt = vaex.ml.GroupByTransformer(by='pclass', agg={'age': ['mean', 'std'],\n",
" 'fare': ['mean', 'std'],\n",
" })\n",
"df = gbt.fit_transform(df)\n",
"df.head(5)"
]
},
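{
"cell_type": "markdown",
"metadata": {},
"source": [
"Conceptually, the transformer learns one aggregate per group during `fit` and looks it up for every row during `transform`. The following is a minimal plain-Python sketch of that idea, using made-up rows and hypothetical variable names rather than the `vaex.ml` internals. Because the aggregates are computed on the training rows only, the test rows receive values that were learned without looking at them."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from collections import defaultdict\n",
"from statistics import fmean\n",
"\n",
"# Made-up rows: (pclass, fare)\n",
"train_rows = [(1, 200.0), (1, 100.0), (2, 20.0), (2, 30.0), (3, 8.0)]\n",
"test_rows = [(1, 150.0), (3, 7.5)]\n",
"\n",
"# 'fit': aggregate on the training rows only.\n",
"fares_by_class = defaultdict(list)\n",
"for pclass, fare in train_rows:\n",
"    fares_by_class[pclass].append(fare)\n",
"fare_mean = {pclass: fmean(fares) for pclass, fares in fares_by_class.items()}\n",
"\n",
"# 'transform': look up the stored aggregate for every row of any dataset.\n",
"train_feature = [fare_mean[pclass] for pclass, _ in train_rows]\n",
"test_feature = [fare_mean[pclass] for pclass, _ in test_rows]\n",
"print(train_feature)\n",
"print(test_feature)"
]
},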
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### CycleTransformer\n",
"\n",
"The `CycleTransformer` provides a strategy for transforming cyclical features, such as angles or time. This is done by considering each feature to be describing a polar coordinate system, and converting it to Cartesian coorindate system. \n",
"This is shown to help certain ML models to achieve better performance."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"ExecuteTime": {
"end_time": "2021-04-13T10:55:09.248159Z",
"start_time": "2021-04-13T10:55:09.225352Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
#
days
days_x
days_y
\n",
"\n",
"\n",
"
0
0
1
0
\n",
"
1
1
0.62349
0.781831
\n",
"
2
2
-0.222521
0.974928
\n",
"
3
3
-0.900969
0.433884
\n",
"
4
4
-0.900969
-0.433884
\n",
"
5
5
-0.222521
-0.974928
\n",
"
6
6
0.62349
-0.781831
\n",
"\n",
"
"
],
"text/plain": [
" # days days_x days_y\n",
" 0 0 1 0\n",
" 1 1 0.62349 0.781831\n",
" 2 2 -0.222521 0.974928\n",
" 3 3 -0.900969 0.433884\n",
" 4 4 -0.900969 -0.433884\n",
" 5 5 -0.222521 -0.974928\n",
" 6 6 0.62349 -0.781831"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = vaex.from_arrays(days=[0, 1, 2, 3, 4, 5, 6])\n",
"cyctrans = vaex.ml.CycleTransformer(n=7, features=['days'])\n",
"cyctrans.fit_transform(df)"
]
},
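{
"cell_type": "markdown",
"metadata": {},
"source": [
"The values in the table above are consistent with mapping each value `v` to `(cos(2*pi*v/n), sin(2*pi*v/n))`. The sketch below reproduces them in plain Python; `cycle_transform` is a hypothetical helper written for this illustration, not the `vaex.ml` implementation. Note that days 0 and 6, which are far apart as integers, end up next to each other on the circle."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import math\n",
"\n",
"def cycle_transform(value, n):\n",
"    # Treat value as an angle on a circle with period n and return (x, y).\n",
"    angle = 2 * math.pi * value / n\n",
"    return math.cos(angle), math.sin(angle)\n",
"\n",
"for day in range(7):\n",
"    x, y = cycle_transform(day, n=7)\n",
"    print(day, round(x, 6), round(y, 6))"
]
},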
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Dimensionality reduction \n",
"\n",
"### Principal Component Analysis\n",
"\n",
"The [PCA](https://en.wikipedia.org/wiki/Principal_component_analysis) implemented in `vaex.ml` can scale to a very large number of samples, even if that data we want to transform does not fit into RAM. To demonstrate this, let us do a PCA transformation on the Iris dataset. For this example, we have replicated this dataset thousands of times, such that it contains over **1 billion** samples."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"ExecuteTime": {
"end_time": "2021-04-13T10:27:00.511609Z",
"start_time": "2021-04-13T10:15:10.667961Z"
},
"tags": [
"skip-ci"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Number of samples in DataFrame: 1,005,000,000\n"
]
}
],
"source": [
"df = vaex.datasets.iris_1e9()\n",
"n_samples = len(df)\n",
"print(f'Number of samples in DataFrame: {n_samples:,}')"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"ExecuteTime": {
"end_time": "2021-04-13T10:31:54.111826Z",
"start_time": "2021-04-13T10:27:00.539429Z"
},
"tags": [
"skip-ci"
]
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "92d9ff2d39464ba1acdf6bf812e079e5",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"HBox(children=(FloatProgress(value=0.0, max=1.0), Label(value='In progress...')))"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "e17a5474ac84415cbccc65d9c14d05ad",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"HBox(children=(FloatProgress(value=0.0, max=1.0), Label(value='In progress...')))"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "f037bdb78f6a43818da3be78bb89a45f",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"HBox(children=(FloatProgress(value=0.0, max=1.0), Label(value='In progress...')))"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"features = ['petal_length', 'petal_width', 'sepal_length', 'sepal_width']\n",
"pca = vaex.ml.PCA(features=features, n_components=4)\n",
"pca.fit(df, progress='widget')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The PCA transformer implemented in `vaex.ml` can be fit in well under a minute, even when the data comprises 4 columns and 1 billion rows. "
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"ExecuteTime": {
"end_time": "2021-04-13T10:33:28.471868Z",
"start_time": "2021-04-13T10:33:28.433622Z"
},
"tags": [
"skip-ci"
]
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
#
sepal_length
sepal_width
petal_length
petal_width
class_
PCA_0
PCA_1
PCA_2
PCA_3
\n",
"\n",
"\n",
"
0
5.9
3.0
4.2
1.5
1
-0.5110980605065719
0.10228410590320294
0.13232789125239366
-0.05010053260756789
\n",
"
1
6.1
3.0
4.6
1.4
1
-0.8901604456484571
0.03381244269907491
-0.009768028904991795
0.1534482059864868
\n",
"
2
6.6
2.9
4.6
1.3
1
-1.0432977809309918
-0.2289569106597803
-0.41481456509035997
0.03752354509774891
\n",
"
3
6.7
3.3
5.7
2.1
2
-2.275853649246034
-0.3333865237191275
0.28467815436304544
0.062230281630705805
\n",
"
4
5.5
4.2
1.4
0.2
0
2.5971594768136956
-1.1000219282272325
0.16358191524058419
0.09895807321522321
\n",
"
...
...
...
...
...
...
...
...
...
...
\n",
"
1,004,999,995
5.2
3.4
1.4
0.2
0
2.6398212682948925
-0.3192900674870881
-0.1392533720548284
-0.06514104909063131
\n",
"
1,004,999,996
5.1
3.8
1.6
0.2
0
2.537573370908207
-0.5103675457748862
0.17191840236558648
0.19216594960009262
\n",
"
1,004,999,997
5.8
2.6
4.0
1.2
1
-0.2288790498772652
0.4022576190683287
-0.22736270650701024
-0.01862045442675292
\n",
"
1,004,999,998
5.7
3.8
1.7
0.3
0
2.199077961161723
-0.8792440894091085
-0.11452146077196179
-0.025326942106218664
\n",
"
1,004,999,999
6.2
2.9
4.3
1.3
1
-0.6416902782168139
-0.019071177408365406
-0.20417287674016232
0.02050967222367117
\n",
"\n",
"
"
],
"text/plain": [
"# sepal_length sepal_width petal_length petal_width class_ PCA_0 PCA_1 PCA_2 PCA_3\n",
"0 5.9 3.0 4.2 1.5 1 -0.5110980605065719 0.10228410590320294 0.13232789125239366 -0.05010053260756789\n",
"1 6.1 3.0 4.6 1.4 1 -0.8901604456484571 0.03381244269907491 -0.009768028904991795 0.1534482059864868\n",
"2 6.6 2.9 4.6 1.3 1 -1.0432977809309918 -0.2289569106597803 -0.41481456509035997 0.03752354509774891\n",
"3 6.7 3.3 5.7 2.1 2 -2.275853649246034 -0.3333865237191275 0.28467815436304544 0.062230281630705805\n",
"4 5.5 4.2 1.4 0.2 0 2.5971594768136956 -1.1000219282272325 0.16358191524058419 0.09895807321522321\n",
"... ... ... ... ... ... ... ... ... ...\n",
"1,004,999,995 5.2 3.4 1.4 0.2 0 2.6398212682948925 -0.3192900674870881 -0.1392533720548284 -0.06514104909063131\n",
"1,004,999,996 5.1 3.8 1.6 0.2 0 2.537573370908207 -0.5103675457748862 0.17191840236558648 0.19216594960009262\n",
"1,004,999,997 5.8 2.6 4.0 1.2 1 -0.2288790498772652 0.4022576190683287 -0.22736270650701024 -0.01862045442675292\n",
"1,004,999,998 5.7 3.8 1.7 0.3 0 2.199077961161723 -0.8792440894091085 -0.11452146077196179 -0.025326942106218664\n",
"1,004,999,999 6.2 2.9 4.3 1.3 1 -0.6416902782168139 -0.019071177408365406 -0.20417287674016232 0.02050967222367117"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_trans = pca.transform(df)\n",
"df_trans"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Recall that the transformed DataFrame, which includes the PCA components, takes no extra memory. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Incremental PCA\n",
"\n",
"The PCA implementation in vaex is very fast, but more so for \"tall\" DataFrames, i.e. DataFrames that have many rows, but not many columns. For DataFrames that have hundreds of columns, it is more efficient to use an Incremental PCA method. `vaex.ml` provides a convenient method that essentialy wraps `sklearn.decomposition.IncrementalPCA`, the fitting of which is more efficient for \"wide\" DataFrames. \n",
"\n",
"The usage is practically identical to the regular PCA method. Consider the following example:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"ExecuteTime": {
"end_time": "2021-04-13T10:44:40.861332Z",
"start_time": "2021-04-13T10:44:38.804288Z"
}
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "7d86be352dbd45fdb0bf34fce0bebd13",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"HBox(children=(FloatProgress(value=0.0, max=1.0), Label(value='In progress...')))"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"
"
],
"text/plain": [
"# sepal_length sepal_width petal_length petal_width class_ prediction_kmeans\n",
"0 5.9 3.0 4.2 1.5 1 0\n",
"1 6.1 3.0 4.6 1.4 1 0\n",
"2 6.6 2.9 4.6 1.3 1 0\n",
"3 6.7 3.3 5.7 2.1 2 1\n",
"4 5.5 4.2 1.4 0.2 0 2\n",
"... ... ... ... ... ... ...\n",
"145 5.2 3.4 1.4 0.2 0 2\n",
"146 5.1 3.8 1.6 0.2 0 2\n",
"147 5.8 2.6 4.0 1.2 1 0\n",
"148 5.7 3.8 1.7 0.3 0 2\n",
"149 6.2 2.9 4.3 1.3 1 0"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import vaex.ml.cluster\n",
"\n",
"df = vaex.datasets.iris()\n",
"\n",
"features = ['petal_length', 'petal_width', 'sepal_length', 'sepal_width']\n",
"kmeans = vaex.ml.cluster.KMeans(features=features, n_clusters=3, max_iter=100, verbose=True, random_state=42)\n",
"kmeans.fit(df)\n",
"\n",
"df_trans = kmeans.transform(df)\n",
"df_trans"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"K-Means is an unsupervised algorithm, meaning that the predicted cluster labels in the transformed dataset do not necessarily correspond to the class label. We can map the predicted cluster identifiers to match the class labels, making it easier to construct diagnostic plots."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"ExecuteTime": {
"end_time": "2020-07-14T15:58:55.795681Z",
"start_time": "2020-07-14T15:58:55.783702Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
#
sepal_length
sepal_width
petal_length
petal_width
class_
prediction_kmeans
predicted_kmean_map
\n",
"\n",
"\n",
"
0
5.9
3.0
4.2
1.5
1
0
1
\n",
"
1
6.1
3.0
4.6
1.4
1
0
1
\n",
"
2
6.6
2.9
4.6
1.3
1
0
1
\n",
"
3
6.7
3.3
5.7
2.1
2
1
2
\n",
"
4
5.5
4.2
1.4
0.2
0
2
0
\n",
"
...
...
...
...
...
...
...
...
\n",
"
145
5.2
3.4
1.4
0.2
0
2
0
\n",
"
146
5.1
3.8
1.6
0.2
0
2
0
\n",
"
147
5.8
2.6
4.0
1.2
1
0
1
\n",
"
148
5.7
3.8
1.7
0.3
0
2
0
\n",
"
149
6.2
2.9
4.3
1.3
1
0
1
\n",
"\n",
"
"
],
"text/plain": [
"# sepal_length sepal_width petal_length petal_width class_ prediction_kmeans predicted_kmean_map\n",
"0 5.9 3.0 4.2 1.5 1 0 1\n",
"1 6.1 3.0 4.6 1.4 1 0 1\n",
"2 6.6 2.9 4.6 1.3 1 0 1\n",
"3 6.7 3.3 5.7 2.1 2 1 2\n",
"4 5.5 4.2 1.4 0.2 0 2 0\n",
"... ... ... ... ... ... ... ...\n",
"145 5.2 3.4 1.4 0.2 0 2 0\n",
"146 5.1 3.8 1.6 0.2 0 2 0\n",
"147 5.8 2.6 4.0 1.2 1 0 1\n",
"148 5.7 3.8 1.7 0.3 0 2 0\n",
"149 6.2 2.9 4.3 1.3 1 0 1"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_trans['predicted_kmean_map'] = df_trans.prediction_kmeans.map(mapper={0: 1, 1: 2, 2: 0})\n",
"df_trans"
]
},
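{
"cell_type": "markdown",
"metadata": {},
"source": [
"The mapper above was written by hand. One way to derive such a mapping, sketched here under the assumption that each cluster should take the class label that is most common among its members, is a simple majority vote. The lists below copy the ten rows visible in the table above; the variable names are made up for this example."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from collections import Counter\n",
"\n",
"# prediction_kmeans and class_ for the ten rows shown in the table above\n",
"cluster = [0, 0, 0, 1, 2, 2, 2, 0, 2, 0]\n",
"label = [1, 1, 1, 2, 0, 0, 0, 1, 0, 1]\n",
"\n",
"# For each cluster id, pick the class label that occurs most often in it.\n",
"mapper = {}\n",
"for c in sorted(set(cluster)):\n",
"    labels_in_cluster = [l for k, l in zip(cluster, label) if k == c]\n",
"    mapper[c] = Counter(labels_in_cluster).most_common(1)[0][0]\n",
"\n",
"print(mapper)"
]
},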
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can construct simple scatter plots, and see that in the case of the Iris dataset, K-Means does a pretty good job splitting the data into 3 classes."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"ExecuteTime": {
"end_time": "2020-07-14T15:58:57.379045Z",
"start_time": "2020-07-14T15:58:57.198955Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/jovan/vaex/packages/vaex-core/vaex/viz/mpl.py:205: UserWarning: `scatter` is deprecated and it will be removed in version 5.x. Please use `df.viz.scatter` instead.\n",
" warnings.warn('`scatter` is deprecated and it will be removed in version 5.x. Please use `df.viz.scatter` instead.')\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA1gAAAFgCAYAAACmKdhBAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAACGSUlEQVR4nOzdd3wcxfnH8c/cXlVv7pViuiku2NRA6L13kgAJhBpaevulJySEHiCUBBJ6IEAglJCEUG2MMRgDpjf3bqvrys7vjzvJOt2efbLudCrf9+vlF/bcaPZZGe+jZ3d2xlhrERERERERkZ7zFTsAERERERGRgUIFloiIiIiISJ6owBIREREREckTFVgiIiIiIiJ5ogJLREREREQkT1RgiYiIiIiI5IkKLJEcGGO+b4y5Ld99cxjLGmO27ObX7GOMWZiP44uISP9ljLnDGPOL1O/3Msa810vHVe6SQU0Flgw6xpgzjDHzjDHNxpilxpibjDFVG/oaa+2vrLVfy2X87vQVEZHBzRjzqTGmxRjTaIxZZoz5szGmLN/Hsda+YK3dOod4zjDGvJjv44sMJiqwZFAxxlwOXAF8C6gEpgPjgGeMMcEsX+PvvQhFRGQQOsJaWwZMAqYCP+zaQblIpP9QgSWDhjGmAvgpcJG19ilrbcxa+ylwIski6/RUv58YYx40xtxljKkHzki13dVprC8bYz4zxqwyxvwodQdy/05ff1fq9+NTUyW+Yoz53Biz0hjzg07j7GqMmWGMWWuMWWKMuSFboedxPjWpO52LjTFrjDGPZOn3XWPMR8aYBmPMO8aYYzp9tqUx5jljzLpUbPen2o0x5mpjzPLUZ28aY3ZIfRYyxlyZOp9lxpibjTGR1Gd1xpjHU+ez2hjzgjFG1xkRkRxYaxcBTwLt11trjLnAGPMB8EGq7XBjzBup6+zLxpgd27/eGLOLMWZO6np/PxDu9FnaFDxjzBhjzN+NMStSuewGY8y2wM3AbqknamtTfbNe91OffyuVwxYbY87a0Dkqd8lgoP95ZDDZnWSy+XvnRmttI8mEdkCn5qOAB4Eq4O7O/Y0x2wE3AqcBI0g+CRu1kWPvCWwN7Af8OJXEABLApUAdsFvq8/NzPJ+/AiXA9sBQ4Oos/T4C9krF+VPgLmPMiNRnPwf+BVQDo4HrU+0HAnsDW5H8HpwErEp9dkWqfWdgS5Ln/uPUZ5cDC4EhwDDg+4DN8XxERAY1Y8wY4FDg9U7NRwPTgO2MMZOAPwFfB2qBPwL/SBUPQeARkrmhBvgbcFyW4zjA48BnwHiS1/H7rLXzgXOBGdbaMmttVepLsl73jTEHA98kmUMnAPtv5DSVu2TAU4Elg0kdsNJaG/f4bEnq83YzrLWPWGtda21Ll77HA49Za1+01kZJXqA3diH+qbW2xVo7F5gL7ARgrX3NWjvTWhtPPU37I/CFjZ1IKskcApxrrV2Tehr3nFdfa+3frLWLU+dyP8m7oLumPo6RfHo30lrbaq19sVN7ObANYKy18621S4wxBjgbuNRau9pa2wD8Cji509eNAMalYnrBWqskJSKyYY+knha9CDxH8rra7tep620LyevvH621r1hrE9baO4E2ktPdpwMB4JrU9fdB4NUsx9sVGAl8y1rb1OX6nyaH6/6JwJ+ttW9Za5uAn2Q7SeUuGSxUYMlgshKoM97z2EekPm+3YAPjjOz8ubW2mfV3yLJZ2un3zUAZgDFmq9S0hKUmOR3xV6QXetmMAVZba9dsrKNJTmdsn06yluTUk/ZjfBswwCxjzNvtUzustf8FbgD+ACwzxtxiklMsh5C88/hap/GeSrUD/A74EPiXMeZjY8x3czgXEZHB7mhrbZW1dpy19vwuN/Y656NxwOXt19/UNXgMybw0EljUpTD4LMvxxgCfZbnh2NXGrvtpOXEDx2w/rnKXDHgqsGQwmUHyTt+xnRuNMaUk76j9p1Pzhu5cLSE5JaH96yMkp2psip
uAd4EJ1toKktMSTA5ftwCoMRtZ/dAYMw64FbgQqE1N93ir/RjW2qXW2rOttSNJTjm50aSW1rXWXmetnUxyGsdWJBcGWQm0ANunfhiostZWpl7OxlrbYK293Fq7OXAEcJkxZr9ufD9ERCRd53y0APhlp+tvlbW2xFp7L8ncNCr1tKbd2CxjLgDGZrnh2DX/bfC6nzrumByO2X5c5S4Z8FRgyaBhrV1Hch739caYg40xAWPMeJLz1BeSnBeeiweBI4wxu6fmvP+U3IoiL+VAPdBojNkGOC+XL7LWLiH53tiNxpjq1Lns7dG1lGSyXAFgjDmT1MvTqT+fYIxpLxbXpPomjDFTjTHTjDEBoAloBRLWWpdk0rvaGDM0NcYoY8xBqd8fnnr52KTOK5H6JSIiPXcrcG7q+myMMaXGmMOMMeUkbyLGgW8YY/zGmGNZP6Wuq1kkC6PfpMYIG2P2SH22DBidym9s7LoPPEByMajtjDElwP9lC165SwYLFVgyqFhrf0vyKdGVJC+ir5C8o7aftbYtxzHeBi4C7iOZoBqA5SSfjnXXN4FTU2PcCtzfja/9Esl54++mjn+JR6zvAL8nmXiXAROBlzp1mQq8YoxpBP4BXGyt/QSoSMWzhuR0j1Ukv2cA3yE5lWJmalrjv0ku4AHJF5z/DTSmjnmjtfZ/3TgnERHJwlo7m+S7RDeQvD5/CJyR+ixKcobGGanPTqLLok6dxkmQfFKzJfA5yZuMJ6U+/i/wNrDUGNM+dT7rdd9a+yRwTerrPkz9d0OUu2TAM3qHT6RnTHJDyLUkp/l9UuRwRERERKSI9ARLZBMYY44wxpSk3t+6EpgHfFrcqERERESk2FRgiWyao4DFqV8TgJO1pKuIiIiIaIqgiIiIiIhInugJloiIiIiISJ547X/Q59XV1dnx48cXOwwRESmQ1157baW1dsjGe/ZdylUiIgNbtlzVLwus8ePHM3v27GKHISIiBWKM+azYMfSUcpWIyMCWLVdpiqCIiIiIiEieqMASERERERHJExVYIiIiIiIieaICS0REREREJE9UYImIiIiIiOSJCiwREREREZE8UYElIiIiIiKSJyqwRERERERE8qSgBZYxZowx5lljzHxjzNvGmIs9+uxjjFlnjHkj9evHhYxJRKQnrLsOt/7nuMt3x12+F27D1Vjb2v1xWv+Lu/Jo3GVTcVd/CRudi7UWt+ke3BUH4C6bhrvmYmz88wKchbRTnhKRgchGX8NddVoyx6w8Dtv2XPfHcJtxG36Hu3xP3OV74Nb/Gus2YhNLcdd9N5mnlu+L23gb1sYLcBb9l7/A48eBy621c4wx5cBrxphnrLXvdOn3grX28ALHIiLSI9ZGsatOhMRCIJZsbPoTNvoK1NyLMSancdzmR6D+/4CWZEP0FezqL0FoX2j73/r2tqex0Zeg7nGMMzy/JyPtlKdEZECx0VnY1V8DUjf/4vOway7CVl6BL3JIbmNYF7v6dIh/ALQlG5vvxra9AO4asGuBBNg10HgdNv4OpuqqApxN/1TQJ1jW2iXW2jmp3zcA84FRhTymiEjBtD4DiWV0FFcAtEH8XYjNzmkIa11ovIKOImr94ND2VJd2F2wLtumOnkQtG6A8JSIDja2/go7iqkMrNPwGa21ug0RnQOJjOoqrZCMkPgNbDyTSx259RjMuOum1d7CMMeOBXYBXPD7ezRgz1xjzpDFm+96KSUSkO2zsTaDZ44M4xLo+8Mg2SD249dk+9GiLQTS34k16RnlKRAaE+Pve7e5y0gumDYi9Bdarb4z0m4wpJgDx+TkGOPD1SoFljCkDHgIusdZ2/cliDjDOWrsTcD3wSJYxzjHGzDbGzF6xYkVB4xUR8eSMBSKZ7SYATo4PPUxZsr/3hx5tPvBvlmOAsqnykadS4yhXiUhxOUO9200JEMxxjNFgwh4f+AEns9kmcs+Dg0DBCyxjTIBk0rrbWvv3rp9ba+uttY2p3z8BBIwxdR79brHWTr
HWThkyZEihwxYRyWAiR6aKo86FkA9MOYS+kNsYxg8lXyajUDMRcLYmM/mFMKVf3fSgZaPyladSnytXiUhxlV5I5s3ACJR+FWNy/NE/vH8yL6WVCiZVpHW9SehP3gj06+F+u0KvImiA24H51lrPN9+MMcNT/TDG7JqKaVUh4xIR2RTGV46puRf825G8i+eHwGRM7X2YrE+lPMYpuxhKTieZAENgKqHsO5jaeyF8IMkiKwi+kZjqGzCBbQpyPqI8JSIDj6/kGCi/PHnzj1CyKCo9C1N6bs5jGBPC1NwHgZ1I5rsABCZiah/A1NwBzmbJNgIQ2htT8+ecF3oaDEzOL7ttyuDG7Am8AMwD3FTz94GxANbam40xFwLnkVzJqQW4zFr78obGnTJlip09W+8kiEjxWHcd4GB8ZZs+ho2C2wC+KoxxOrW3gG0BUz1oE5Yx5jVr7ZReOE5B8hQoV4lIcVkbB3ct+CowJsepgV7juA2AxfgqurSvBkIYX2mP4uzPsuWqgi7Tbq19Ee+XCjr3uQG4oZBxiIjkm/FV9nwMEwSn1qM9kpqaIYWmPCUiA5UxfnA8ZzN3bxxfeZb2mh6PPVD12iqCIiIiIiIiA50KLBERERERkTwp6BRBEZF8sG4ztulGaHkk2RA+HFN2YY/ef9oY122GNRdAbCbggjMeqq7HF9iqYMcUEZH+a97yZfzu5Rd4a/kyRpSV841pu3HQFhMKeky37QVY9/3UHld+CB8OFb/G59MzlGLSd19E+jRrXezq06HpjmQCcZdD813Y1adgbWKjX7/JVh4AsZdI7lZvIfEJrDoSN76scMcUEZF+6a3lyzj5wft48fPPWNvayvyVK7js6Se49603C3ZMN/oarPkquMtIblQfg9aHYfXJBTum5EYFloj0bdGXIfExEO3cCIkF0PZcQQ7ptv4XXK9NYl1o+FVBjikiIv3X715+gZZ4PK2tJR7ndy+9QMJ1s3xVD637kXd7/A3c+OLCHFNyogJLRPq22Ftg2zLbbTPE3y7MMdv+t4F43ijMMUVEpN96a/lyz/aWeJxVLc2FOWhiQfbPohvdSUIKSAWWiPRtzmgwYY8PSsA3qjDH9G/gPStndGGOKSIi/dao8ixLmRuoDHnlsDzY0HYh/m0Lc0zJiQosEenbwgekCqzOlysDJgThgwtzzMipJHeo91D+3cIcU0RE+q1vTNuNsD997biI388pO+xIyF+gNeXKL/Nu9w3BF9y+MMeUnKjAEpE+zZgQpuZ+COxIsujxg38HTO29GF9JQY7p8/mg9hEw1Z1aA1DxK3zBiQU5poiI9F/7b74lP/nCF6kKhwk5DuFUcfW9Pb9QsGP6IsdC6fmk/TjvGw11jxXsmJIbLdMuIn2e8Y/F1D6AdesBi9nQtIg88QUmwLBXUqsGNuHzb17wY4qISP914vYTOW7b7Vnd0kJFKFS4J1ed+MovwS39BiQ+BKcOn6+m4MeUjVOBJSL9hvFV9Poxff5hvX5MERHpnxyfjyGlpb16TJ/PBz7t0diXaIqgiIiIiIhInqjAEhERERERyRMVWCLS51m3Gbfh97jL98Jdvidu/RVYtxGbWIa77nu4y3fDXfFF3MY/YW0CG3sPd/U5uMum4a48HNvyRHKctpdxV52Iu2xX3FWnYKOvbkIs9bj1v8Rdvgfu8r1xG67F2tbs/WPzcFefkYrlGGzrf7P3tRa36T7cFQfiLpuOu/ZSbPzzbsdYDDaxFHfdd1J/F/vhNt2BtYlihyUi0mts7G3c1WelrvdHYVufSba3PYe78rhk++ovYaNvJK/3zQ/grjgo2b7mYmz8c6yN4jZcj7v8C7jLd8dd9zOsu7b7sUTn4K46PTn2quOxbS9k72td3Ka/4K7YP5V7vo1NLMneP7Ecd933O+Xe2/vN9d62Pou78tjU38VXsNG5BTmOsdYWZOBCmjJlip09e3axwxCRXmCti111PMQ/ANo3HA6CMw7sGnDXAO0X9jCEpkN0Ft
gWoP36FoHIYdDyONC5GApjqm/ChPbIMZYoduWRkFgIRFOtIQjsgKm5B2NMev/Ym9hVp2cck4r/w1dyXMb4bv2voOX+VOwAPjBlmLp/Ypy++y6YdddiVx4C7lrS/i7Ch+CrumKTxjTGvGatnZKvGItBuUpk8LCxd7CrTgFaOrWGIXwEtD5GRh4IHQBt/+7U3wemFPxbQ2we6/NdAJzhmLonMCaUWyzRV7Grv5p5zMrf4otkbm/irvsBtHSO0QFTgRnyJKbLohnWrceuPDh1vY93Os/98VVdlVN8xeI2/wPqf0jGzwE1d2KCu2zSmNlylZ5giUjfFp0BiY9Zn2wAopD4BNx1rP+BHqAV2p7vUlwBtEDLQ6RfVJP9bcNvco+l9d+QWMr64opkXPH5EHsto7ttuNLzmDT8Fmvd9L7uami+t1NxBeCCbcE2/Tn3GIvANt8HbhMZfxetT2ATi4oVlohIr7ENvye9uILkdfBBPPNA2+Nd+rtgmyH2Oun5LgbuKmh9ohuxXOF9zIZf0/XBik0shZZHu/RPgG3GNt2dOXbz38BtZH1xlRq79Zk+PePCWguNv8H754Ar8348FVgi0rfF5oFt8/ggDsQ82i3pxVXndq9hPsw5FBubCzR7fBCH2FuZ7bF3sgzUBHZtZhwm6NE5BtE+/hQk+iqZSQswAYjN7/VwRER6nVcOALLmHs/2BOBmNttmbHRON2J5z7vdXU7GtTo2P0vuaUtd27uOvYHrfTxLzusLbEPqqZuHAsStAktE+jZnNHhOi/DjfQkzHm0b4KvrRixjgIjHIQPgjPLoPzzbQGDKusQxAmzUo68P/ONzj7EY/JvhueuHTXh/X0REBhpnRDe/IFuu8sprYXDGdyOWoVkOGQG65FNndPImYeYg4LX/o5Pteu/27eu9KclSSAK+/E/BV4ElIn1b+IBUUuh8uTJACRmJAn+qCOraHgb/jqlxOotA6Xk5h2IiRyaLqTTJ96QI7ZPZv+wiMguyMJSciulyoTf+MRCcCnRNACFM6VdzjrEYTMmXgK7flwD4J2AC2xYjJBGRXmXKLsTzeu/fMfnfNBFwtsbreo+vlowfz40fU3JM7sGUnu8RSwRKv4ox6WObwAQIbEfGNdwEMaVfzhjalJzmUaj4wT8O/DvkHmMvM8YPJV/G6/uS/LvLLxVYItKnGRPC1NwPgYkkE0AA/Dtg6h7A1NyeXOyCYLI9uCem9gGo+CWYGpKFVggiR0PN3VB6dvIlYkLJ/5ZdgCk5JfdYfBWYmnvAv+36WAK7YGrvw2QUXmDCB0H5d8FUpmKJJIur8m96j191HYT3T51PEHwjMNXX9fkixfjHYWpuTRW3qb+L0F6YmtuKHZqISK8w4f2h4gedrvdhKDkJau6Bkq+Q/ME+BKYcyi/H1N4H4YNYf70fhqm6DlP7IASmsj7fbYWpuStjsYkNxhI5FsovTR6LUPLmYulXMFluKJrqW1I3CQPJWJzRmKqbMR5PsIx/NKb6tszcW/PnjIWe+hpTdjGUnM76v4sKKP8WJnJ4/o+lVQRFpL+w7jrAYnxV69usBXc1mBDGV9ap3U2+GOwrx5hwp/Zoch62r9qzKMo9ljWAg/FVbLyvjSdXO/RVZjy58h67Ofmys6+2zyeszrL9XWwKrSIoIv2RtYlU7qlMW/Vvfe6pST5NaW93m5Pv5frq0q731m0AYt0qrDJjiSevyb6qHHNPU3KhpRxyTz6v970t+XexLvVzgMd0x27Ilqt6NqqISC8yvsrMNmPAqfVo94EzxKM9mH1+erdiqc69r/F7xpJ97BKSUyD7l2x/FyIig4UxjmeOyZZ7sl3vja88D7H4u5XvjK8UKM1x7P57vU/+XeSekzeFpgiKiIiIiIjkiQosERERERGRPFGBJSLiwdoYbuMtuMu/iLt8d9x1P8YmVmXvH1+Iu/Zy3OW74a44ELfpnowNHfsTaxO4TXfirtg/eU7rvpfckFJERPoMG/8Md83FuMum46
44BLf5oQ3mHtv6L9yVR+Eum4a7+mvYbPs19hM2sQh37Tf7XO7VO1giIh7s2kuh7Xk6NlRseRDb9j+oezI1T71T38QK7KpjkhsZ4gKroOEKbOJjTMUPezny/LD1P4CWJ4GWZEPLI53Ov6qIkYmICCSLC7vq2OQiGbiQWA31P8MmPsOUX5bR3226Bxp+Q0dei76AXfUq1N7X51er9WITK7ArjwFbT3ru/QhT8aOixqYnWCIiXdj4h+nFFQBxcNdhWx7J7N98R3LVP9xOrS3QfN8Gn3r1VTaxCFoep6O4AiABbhO2+f5ihSUiIp3YxluTq/51zT1Nf06tQtipr41D4+9Jz2sWaMU2XFPwWAvBNv8lS+69v+i5VwWWiEhXsbfBOB4ftEDs1czm6KtALLPdhCD+Xr6jK7zY/Cw73rdCdFavhyMiIh5irwHxzHYTgPjH6W3uCrAeeQoL8TcLEV3hRWcB0cx2E4L4u70eTmcqsEREunJGkbyz11UQnM0ym/2b4Xk5tbHUWP2MMxJIeHzgB//4Xg5GREQ8OWMBj/2qbAyc4eltviq88xrgG5nnwHpJH869KrBERLoKTAbfCDJeUzV+TMlJGd1NyVkkd7TvLAjBXTD+cYWKsmBMYDtwtgS6bsQcwJR8uRghiYhIF6b0HCDUpTUEoT0xzrD0viYCkeOAcJf+EUzZhQWMsnCy5t7ATpgi3wxUgSUi0oUxBlPzVwhOI1lkJJ9cmeo7MF3vCgImsDWm+g+poiyU/JrQvpiqP/Ry5Pljam6H0J6sP/8xmJpb+2XBKCIyEJngTpiqq8A3lGTuCUL4QEzl7737V/wASo5P9Q2BqYKKH2LC+/Ze0HlkAlthqm/MzL3VNxU7NExfWMqwu6ZMmWJnz55d7DBEZBCwbgPYKCaHHeutteCuBFOSsdJgf2XdRrCt4KvFGI+pKAVijHnNWjul1w5YAMpVItIbrHVTuacM4yvJoX8ruOvAV4fxfN+4fylm7s2Wq7RMu4jIBhhfee59jQFnSAGj6X3GVwaUFTsMERHJwhgfOEO70T8MTtepgv1XX8y9miIoIiIiIiKSJyqwRERERERE8kRTBEVk0LBtM7GN10D8U/BvgSm/NLliYOvj2KabwV0FgSmY8svAGYtt+jO03Au2DUL7Y8q+ASaIbbwBWv8JOBA5HlN2DsZ0XckpdczYW9iGqyD2DjgjMWUX9bkXim3r09jGGyGxLLn6UvllmMDWxQ5LRGTQsTaKbboVmh8EohA+GFN2EWCxDddD29NACEpOwJR+FRJLsA1XQ2wmmBoo/RomcgzE5mIbr4LY++Afgym7BBPaI8sxXWzzPdB8J9hGCO2NKbvUc1GnYrFuA7bxD6nc64PIcZiyr2fNvcWmRS5EZFCwrc9i115M+i72YQgfCm1Pgm1JtfnARCCwC0Rnd+rvB18dEAF3Ies3Fg5BYEdMzV0Zi0DY2DzsqtO6HDMCFf+Hr+TYvJ/jpnCb/gqNV3Y6fwMmgqn5GyYwoWhxaZELERmM3NVndck9geSeVtYFdznrc08YAjtB/B2wTYCbao9A+EBofZqMfFf5O3yRgzKPue6H0PIY0J4HHDAVmCFPYnw1eT/H7rI2hl15FCQ+Iz33TsTU3N2rCzB1lS1XaYqgiAwKtuFXpCcbkn9ufbhTcQHgJv8cndGlfxzc1V2KK4A2iL8NsTkex7zS45gt0HBFctWnIrM2Co1Xdzl/C7YF23ht0eISERmMbGwexF4jPW/EkrML3BWk555WiM0G28z64gqgBVr/gWe+a/gVXR+s2MRSaHmE9cUVQAJsc/KpVl/Q9h9wF5OZe99Jfg/6IBVYIjLgWeum7nx5furR5pKesNpFSb/Atw8Rh9i8zPbY21kO2QR2bZZ4elFiKdiExwcWYm/0djQiIoNb7C3wnFkWTf3qygWyXMO9uMvJKLxi88F03awXoA3aZm0g2N5jo3NThWTXD+LJ71kfpAJLRAY8Y3zJDRW7xevy6OD56q
oJgDPKo3u2+esOmD6w9LmvBu9CEu/zERGRwnFGgue+VH6S+aerbv4YbyIkN+TtfMzRyUIlMxjwb9a98QvE+MekYu/6QSD5PeuDVGCJyOBQeg7Q9QIdAf92ZCQcQuAbRkYxZYJA171DfGBKIbRPxiFN2YUeSSEMJadgPO8Y9i7jK4PIUWSeUxhTdkExQhIRGbyCe4KvioxiyoTIzFMO+GrJvH6HwL+NR+6JQOlZyRuOnYcOTIDAtkCgyzGDmNIvb8JJFED4cDLi68i9fWvRqHYqsERkUDClX4XSr4IpAcLJC3PZeVBzb3KhC4IkC6taqLwCU/s3CE4jeVEPgrMZpvpOTO294N861R6AwM6YmvswpuvFH0z4YCj7NpiK5DEJQ8nJmPJv9t6Jb4Sp+DFEjiaZvMNgqqHiJ5jQ3kWOTERkcDHGwdTcC4FJrM89E5KLKNX+FZwtkm0Ekive1j4Ilb8H31CS1/BgcoGLmnuh7OLUTIlwstgq+TKm9Hzv41bfAqG9U8cMgW8UpuomjH+L3jnxjTC+CkzNPcnCsSP37oSpubdP3Kz0olUERWRQsTaaXKzCV5N2YbZuM9gG8A1Ju8Nn3QawbRinLn0cdzXgYHyVORwznlwC3lfVZ5eUtbYV3HXgq8N4TlHpXVpFUEQGM+uuAxvHOLXp7YlVYAIYX8X6NusmF8Ew5RhfSaf2WCr31ORUiFi3Mbnoka+uqCvzbUgy9/owvqpihwJkz1XaB0tEBhVjgp7vRiWTUolHezlQ7tGe+9K1xvjBGdadMHudMWFwuk41ERGRYsh2865rwQWp94w9cowxgQ28C+x1zDKgD7wfvAF9Ydn4XGiKoIiIiIiISJ6owBIREREREcmTgk4RNMaMAf4CDCe5FvAt1tpru/QxwLXAoUAzcIa1NnPHThEZUGzrM9jGG8FdAv6dMOWXgn8rbPO90Hwn2EYIfQFTdjH4KrGNf4SWhwELkSMxpeclN0JsvBbank0uXlFyOqbk9D7xDlEx2fin2IarIPZqcu596dchfES35tRb62Kb74bmvyT3Hwntgyn7BqaPT3XsLuUpEcmmMRrlhlkz+Mf77+IzhmO32Z7zp+7K6pYWrprxEs9//ikVwRBn7jKZU3bYEROfj224GuLzwDcCU3YRJvxFbNuLyVyVWAD+bTBll2KCOxX79IrOtjyObfojuCuTi3aUX4rxb969MRKLsQ3XQPTF5IJSJWdiSk4s+jtkBV3kwhgzAhhhrZ1jjCkHXgOOtta+06nPocBFJBPXNOBaa+20DY2rF4dF+je36V5o+A3rd443YMIQ3Bvanu/U7gAV4B8B8Y+AtlR7EJzxycUq7FqgfQ+PCIQPxFf1u146k77HxhdgVx2V2pQxtceViUDp2fjKLsx5HHfd96DlCdb/XfiTi3TUPdErLxf31iIXhcpToFwl0p/FXZcj7v0rn6xdQzSR3Mw35DhsVVvHovp61rW1kkj9DB3x+7lg5wrO3fwakhv5tv9sHYbwUdD6KOkb/IYxNXdggpN674T6GLfxFmj8A+tzjA9MBFP7CMY/LqcxbGIlduWhYOtZv6djBEpOwFfxwwJEnSlbriroFEFr7ZL2u3zW2gZgPtB198qjgL/YpJlAVSrhicgAZG0cGn/P+osqgE2uXNT2TJf2BNAI8Q9YX1wBRCHxCdh1rC+uSH5t61PY+IKCxd/X2aY/Jr+XnTcQti3QeAvWbcptjMRiaHmc9L+LOLgN2Ob78xlu0SlPiYiX/37yEQvr13UUVwBtiQTzV66gIdrWUVwBtMTjbFdyN5YW1hdXAK3Q+gDpxVWy3Tb8tpDh92nWtnQprgBcsC3YpptyH6f5zvSbiZAcs/m+5GqLRdRr72AZY8YDuwCvdPloFND5p6GFZCY3ERko3OVgo1k+9HqiHkv9yrHdBCD+Tmb7YBF9jWRh2oXxQ+LT3MaIvZP8PmZog+isHgTXtylPiUi7uUuX0hTLzDFx1yXmuhntE6uX4z0pLc
tMsfj8HsXXr8U/B+NVgripHJaj6CzA4+cJE4L4u5saXV70SoFljCkDHgIusdbWd/3Y40sy/m80xpxjjJltjJm9YsWKQoQpIr3BVJE14Xh/Ad6vi/rJ2O0ewCbAGbkpkQ0MzhjvdhsFX47vTzkj8SzS8IN//CYG1rflI0+lxlGuEhkARldWEvFn5h6/Mfg83u9Z0tzN5c19Qzc1tP7PGQLW68Yp2XOYF/94PEsZGwOnuPfACl5gGWMCJJPW3dbav3t0WQh0/m6OBhZ37WStvcVaO8VaO2XIkCGFCVZECs74SiByDNB1z6VI8r0quj45CYEpJf1yZZKLWmT0TRUA/h3yGXK/Ysq+Tub3NgShfTM2S846RmA7cLYgs7ANYEq+nIco+5Z85SlQrhIZKA6fsDVBx0m7u+IzhrJQiKCTfnMv4PPxxJIDgEiXUcLgn+jRHoHSC/IfdD9hfDUQ3h8IdfkkjCk7N/dxSs4Cum6gHITAjpgi3wwsaIGVWnnpdmC+tfaqLN3+AXzZJE0H1llrlxQyLhEpLlPxQ4gcS/LiGgZTCRU/wNTeD6E9SRZOQfCNxFTfjKn9G/i3T7UHwL8tpvY+TM1tqbtdoWR7cHdMzZ+LvnpQMZngZKj8LfjqSH5fghA+CFPVvfn+puZ2CLb/XYTAGY2p/mPOLx/3F8pTIuKlPBTi/uNPZpu6IQR8DgGfj4lDh/Hwiafxx8OOYmR5OSHHIeBz2Gf8Zpy7x/eg/HvJfEYYCEHJCVBzN5ScRrLICoMph/JL8JUcXdTzKzZT+RuIHEoyT4XAVwuVv8YEd819jMDWmOobwDecjp8DQvtgqnN/j6tQCr2K4J7AC8A81r+B9n1gLIC19uZUcrsBOJjk8rdnWms3uOySVmYSGRisbQF3HfiGpC2tbt3G5IurviFpxZJ1VwPpO7lba8FdkVx9yFfee8H3cda6yffdTEXyqeGmjuM2JBfJ6PJ3UWi9uIpgQfIUKFeJDBSrmpvxGUN1ZP2TKGsty5uaKAkEKA+FOrXHk8uO+6owJtypvQ3cNeCrxXi+4zo4Wbc5uQqgbyjG872sHMawNpXvSjG+bk7V7KFsuaqg+2BZa1/Ee+565z4WGLzPSUUGMWMi4HSdOkHqApl5kexcWK0fw4AziOeyZ2GMD5zhPR/HVw4M3MJVeUpENqa2JPMmlTGGYWUeecr4Pa+9xoTyck0eaJI3ADf9JiC0/xzQt/Zo7LVVBEVERERERAY6FVgiIiIiIiJ5UtApgiIi3WGti22+D5rvBNsIob0xZRdjTRDWXACx15Md/TtC1R/w+b1XabPxD7ENVyX7+4Zhys7DhA/qxTPZNNZabMtD0HR7chPl4B6Y8kswRV5uVkRE1lvS0MBVM1/i+c8+pTwU5KydJ3PyDjvy5Afv8X/P/Zc1LS2E/X6+OmkKl03fw3OM5PX+YWi+PfluVnA3TNklGH83likvEptYgW28Htr+k1zRt+R0TMnpae9SD3YFXeSiUPTisMjA5K77EbT8g/W7uztABdBK+o7vAGEYOhufL32JVhv/ELvq+OTCDO1bFZkIlF2Or7RvLzHu1l8Bzfew/lx9YMoxdf/EDLL3zHprkYtCUq4SGXhWNTdz0N13sK61lUTqZ+iI38/kkSN58fPPM/oft+32/O6AgzPa3YaroOlO0q/3ZZi6xzF9+F0t6zZgVx4C7mognmqNQPhAfFW/K2ZoRZEtV2mKoIj0CTaxFFoeJr2QSgD1ZBZXAK3QdHPmOA03gG0lbR9Y2wKNV2Otx47vfYR110DzXaSfqwu2Gdt8R5GiEhGRzu6c+zpN0WhHcQXQEo97FlcAf5//NtF4PK3NuvXQ9Gc8r/dNtxcg6vyxzQ+AW8/64gqgBVqfwsa9vweDkQosEekbYvPBdN0wEJJFVhbRVzzGeZ31q213ZiHhuTds3xD/IMv5x6BtVq+HIyIimWYtXkhbYgN5qQsLvL96VXpj/EPwXKo9DtE+fr2PvUpyVkkXJg
Dxd3o9nL5KBZaI9A3OSLC5J63k14z1aMvyvpKNJzcy7Kt8I8DzCZsPBtjmviIi/dX4yiqcbu4JOLaiMr3BGQ425tHTgNPHr/fOeJIb0HflJvO4ACqwRKSPMIGtIbAVmRfuEN7bFBkouyyztex8INylNQzhw/v0RsTGPwaCk4GuT7GCmNKvFiMkERHp4qxdJhN00hdzCPh8DCst9ey/TW0dFeH0nGSckRCcRub1PoQp+1oeo80/U3IqmWvk+ZM3PP0TixFSn6QCS0T6DFN9KwT3IFlkhcA3ElN9M1TfRnrRFIKqmzxXETShPaHiJ2CqUl8TgsjhmMqf9cIZ9IypugFC+7L+/Idiqq7FBLYrdmgiIgJsVVvHzYcdxYiyckKOQ8Dn8IXxm/HUaWdwxISt0/vW1PLgCad4jmOqroXQfiSLrBD4hmCqrsIEdiz8SfSA8Y/F1NwKzhiSN0ADEJyOqbkjueGvAFpFUET6IOs2gG1OFhidLthu/CMAfP4tNj6GTYC7AkxFaqf4/sO6jcll6n3DBm3C0iqCItKXWWtZ1tRIaSBIeSjU0d4ajzN/xXLGVVZRU7Lx3LP+ej8UY/rPcw9rLbjLwUQwvopih1M02XKV9sESkT4nOZUvczpfLoVVxxjGSc5z74eMrwwoK3YYIiKShTGG4WWZeSrs97PLiNzfReqv13tjDDjDih1Gn9V/SmUREREREZE+TgWWiIiIiIhInmiKoIgU1KdLb6Mmdj0l/hZaE0GWmzPZfNTlqV3s7wDaksunl/8EX+RAzzGsbcE23gotfwcsRI7ClJ6bt3er3HW/gJb7gSj4hkLlbzDBadimv0DLvcmNikMHYsovxPhqvGNMLMY2XAvRF8FUQOmZmMgJWd+hsm0vYBuvg8RC8G+PKb8UE9i+W3Fbd01yY+W2p8GEIXIypvQMjNGlXUQkVy3RKMc+cC/vrV4JQEUoxC2HH8W4ymrO++ejvLl8GQaYPGIUtxx+VMaqgO3mr1zB719+kTeXLWVkRQXf2HU3vrjZ5nmJ0Y1/Bmu/AfH3AJNcEKrqOoy7Ett4LURngK8GU3o2hI/Mnnta/4ttvAHcJRDYCVN2aXIVX6++thXbeEuX3Pt1jM97xcRsbPRVbMM1kPgY/BMwZRdjgpO79w3oZ7TIhYgUzCeLrmes73oA2q/11kKjO5Zyx2PH98rr8UUOSmuy1sWuPjm5ETFtqdYQ+LfA1D6UfNeqB9zVX4XoC5kfBKZCbB7rN1T0J19CrvtnRnKxiZXYlYeBrWf9xsgRKDkJX8X3M4/Z8jis+z7pmzWGMTV/xQR3yilua1uSx0wsA9r3UwlDaC981X/IaYy+TItciEhv2f7Ga2mJxzPagz4fUTd94/qyYJA3zrkAny99Etg7K5Zzwt/upTUep/0n64jfz0++8EVO2L5ny5e77lpYvjvQJUZTBSTANgHtcUag9Cx85RdnjtN8P9T/kvW5xyQXqai5P6PIylfutW3PYddcREa+q74ZE9o9pzH6smy5SlMERaRghtqbMWZ9cQXJ35f5PIorgIafZrZFZ0D8fdZf4En+PvEptD3fo/hcd7V3cQUeu9XHwV2DbXk0o6ttvjOV4DpvlNwCzfdi3dXpfa2Fhl91GRugFdvwu5xjt83/gMQq1hdXyTFoewEb+yDncUREBrO/v/O2Z3EFZBRXAI3RKHfOfT2j/Xcvv5BWXAG0xOP8+qXnSXiM0y31vyWjuAKwa7sUVwAt0HRbcnXCzl1tHBp+R3rusWBbsI1XZ44dnZmX3GvTCrp2rdiGX+c8Rn+kAktECibseO1UvwHuqsy22DywbZntthnib21aYO3aXunmF7SkCq8uoq8A0cx2E4TYe+ltth7cdd7Dx9/JPZTYq8l4Mo7pg/i83McRERnEHnv/3W5/zUsLPstoe3PZMrzmhLXE4qxqad6EyDrxyjsdPIo3E4DUtibru60A65GnsBB70+OY2XOvjeWWY6y1yYLMS/zDnMbor1RgiU
jBJGx393DymNfujEy+X5QhAr4RmxLWeoGtuvsF4IzLbHbG43k5tTFwusRoSsj6+qtvaO6hOONIblDZlQFf7ksEi4gMZtsMydywfmPGVVZntI0o815q3QCVIe93tnLmG929/jaWuYS6rwo8S0DA57GliTMSTCiznQima17LwhiTmsbodUzv95kHChVYIlIwC6J70/U1T2shZgPeX1B6ZmZb+ECSu8V3LtZM8ulQ+JAexefzbwG+LPt4mBqgyxxzE8CUnJzZtfQsMoudAAR2xPjHp/c1ASg5jcxiMoIpuyDn2E3JiZCxmIUDvjoI7przOCIig9k3p+/Rrf4GuGT6bhntF03bjYg//Zoc9vs5cfsdCPl7uPBQxXeyfOAjM/cEIbgbpss+kMZEIHIMOeee8AGpvl6599DcYy89G4hkHJPSr+c+Rj+kAktECmazMTezoHVHrKXj1+LWzQgOeQmcLpsGh4/EV35JxhjGhDG194J/eyAABMG/Dab23m6vZOSp7rEudwcNRE7H1D0GwampY4bAGYup/lNG0gIwgW0w1delirVQMsbQPpjqmzwPacovg5KTSSavMJhyKL8MEzki57CNMwxTfUfqSVYoGWdgMqbmLozRpV1EJBeO43D3MSfgdFl174ydJnHTYUcSctbfaIv4/fz1mBMo93giddAWE/j+nl+gIhQi4vcTcvwcv+32/GCvfXocoy+wDVT+hmQ+SjFlUPMAVF6VvLFGGAhCeH9Mlcc7VYCp+GGqyAol+5tKqPg+JrxfZl8TxtTe1+Pca0q/BqVngYkAkeQsjrKzMSVfynmM/kirCIpIwUVj9axpfJOq0u0IBddPC3ATq1LLlG+Nz7fxKRTWXQ3WYpzavMfoxleAuxj82+Lzrb8jaN21yXnovqFZl73t6GstuMvAlGJ85Rs9prWt4K4BX13yydYmSB5zBZggxle1SWP0RVpFUER62zvLl7GipZk9R4/F6VRYvbtyBX7jY8vajeeeuOuyrKmRmnCESGDTrusb4kbfAV8pPv/66erWuqncU47xeU9V7MzallTuGZrTth75yL3WRsFdmcp3XtPb+6dsuUqbpYhIwQUDFQyr3jOj3efUQjcu2Nn2oMoHn38IkDkXvztFizEGPJ5wZe8fznxHq5uSx+zGu1siIuJpu6HeU8a3qcv9PS2/z8eo8op8hZTBF9wuo80YX7dyiTERcLpO29tA/zzkXmOCyfe6BgnNIxEREREREckTFVgiIiIiIiJ5oimCIuLJ2ii26c/Q8kByydfwoZiy8zG+wk19cK3l3rfe5I435tAQbWPf8ZtzybTdGZZl+VsRERncPl6zmqtmvMSrixcxtLSU86ZM49AJ3d2Co3tsYim28Tpoew5MBZSciSk5YaPv6crgoQJLRDzZNedDdBYdO7A3/xXb9izUPVawF1R//Oy/efjdd2iJJ3esf+idt/jPJx/xr9PPoCqc+3xxEREZ+D5Zu4aj7rublngM11pWNDfxrWeeZHFDPV+bVJg1cqy7GrvyaLDrgASwAhp/iU28h6n4UUGOKf2PpgiKSAYbewuir9JRXAEQS65S1PpUQY65tLGBh+a/3VFcAcStpaEtyt1vzi3IMUVEpP+6/pUZtKaKq3Yt8TjXvPIybZ1yST7ZprvANpIsrtobW6D5fmxiRUGOKf2PCiwRyRR7E88d320zNvpaQQ759orlBB0no70tEWfmogUFOaaIiPRfry1ZTMJjuyEDLKhfV5iDRl8BopntJgjx9wpzTOl3VGCJSCbfCDCZxU77hruFMLK8grjrZrQ7xjC+qrogxxQRkf5rdIX3O8Ex16WupKQwB/WPw/PHZxvv1jYdMrCpwBKRTKG9ki/udr1EGD+m5JiCHHLbuiFMqK0j4Es/ZtBxOGOnXQpyTBER6b/OnzqNiD99OYGQ43DQFhMK9t6uKTkD6PoecgAC22H8WxbkmNL/qMASkQzG+DG190JgZ5KJJATO5piavxZ0s987jjqWPc
aMI+g4hBw/I8rK+ePhR7NFzabvHi8iIgPTHmPG8csvHkB1OEzE7yfkOBw2YWuu2P/Agh3TBLbCVP8BfMOBMBCE0F6Y6j8W7JjS/xjrMXe1r5syZYqdPXt2scMQGRSsuxpsHOMM7bVj1re10hSNMbysTMveDlLGmNestYVZBqyXKFeJ9I6E67K0qZGqUJjSYGFWue3KWgvuUjClBd2+RPq2bLlKy7SLyAYV8olVNhWhMBWhcK8fV0RE+h/H52NUee8WOcYYcEb06jGl/9AUQRERERERkTxRgSUiIiIiIpInmiIoIp6iiQR/ev017nv7TeIJl0MnbMWFu07POnXv83VruXrmy8xY8Dm1JSWcM3kqR261TV7eoWqNx7h59qv8/d23sRaO2WZbzp0yjZJAoMdj54tNLME2XgdtL4CvEkrOxESO0ztkIiIF9PGa1Vw18yVmL1rE0NJSzps6jUO23Cpr/yc/fJ+bXn2F5U1NTBk5ist224PNq/MzFX7+yhVcPfMl5i5dyqjyCi6aNp19x2+el7HzxbY+i236AySWQGAnTNklmED275dsGi1yISKeznz077yyaAGt8TgAQZ+P0ZWV/POULxPqsizu4oZ6Dr37LzRGo7ipDYojfj9fnzyVb0zbvUdxuNZy4t/u5e0Vy2lLJIDkMrwTaut4+MRTcXzFfxBvEyuxKw8DWw8kUq0RKDkFX8V3ixlav6VFLkRkYz5du4Yj7r2LlngM167PPZdN34OvTsq8fNw+ZzZXzXyJllRe8xlDxB/gsVNO7/F+i/NXLOf4v91HazxG+0/WEb+fn+6zH8dvt0OPxs4Xt/kBqP8l0JJqMWAimJr7MYGtixlav5UtVxX/JxMR6XPmLV/GrE7FFUDUdVna2MhTH32Q0f/m2bNojsc6iiuAlnicm2a/SmPUY8f7bpix8HPeXbWyo7gCaEsk+HjNap7//NMejZ0vtvlOsE2sL64AWqD57uQqjCIiknfXvTKDltj64gqSuefqV16mrVP+AmhLtbd0anetpTUe4/pZM3scy+9efjGtuGqP5dcvPkfCdXs8fk9ZG4eG37G+uAKwYFuwjdcUKaqBSwWWiGSYu3QJXg+3m2MxXl20KKN91uJFxD0SSMDx8dGanhUYc5cuzUiU7bG8uWxpj8bOm+grgEchaYIQe6/XwxERGQxeW7I47cZeOwMsqF+X1vb5unV4TdhOWMvsxZl5rbvmLlvqEQk0x+Ksamnu8fg95q4A2+bxgYXY3F4PZ6BTgSUiGUaWV3hOvQs5fsZWVma0j62o9ExcsUSC4aVlPY6l65REgJJAgBFl5T0aO2+ccXheTm1My/iKiBRItqXZY65LXUlJWtuQ0hJiWZ4kja7o+RLvw8u8c50BKkKhHo/fY6YSPEtAUpsmSz6pwBKRDHuPG09FKITTZYEGv8/Hcdtun9H/3Cm7Eu5SBAUdhz3GjGNYlqSTq4O33JKQ408r4AwQ9DkcNqFvzBk3pWcBXTe3DEBgIsY/vggRiYgMfOdPnUakS+4JOQ4HbbElVeFIWntVOMJBW2xJyHHS2iN+P+dPndbjWC7adbeMWMJ+PydsvwNhf/EXZDK+EogcDXRdqCqCKTu/CBENbCqwRCSD3+fjgeNPZsdhwwn6HEKOw+ZV1dx97AnUdrkrCDBpxEiuPOBgaiMRwn4/QcfhgM234NqDD+txLGF/gL+dcDLbDRlK0HEIOg5b1w3hvuNPojTYtagpDhPYFlN1LfiGkExeQQjtjam+qdihiYgMWHuOHcfP992fqnC4I/ccOmFrrtj/IM/+V+x/EIdO2Jqg4xD2+6kKh/n5vvuzx5hxPY7l4C0n8N099qY8GCLi9xNyHI7bdnt+uNc+PR47X0zFjyByFBACImDKofy7mPD+xQ5twNEqgiKyQauam4m7bk5PohKuy9KmRiqCIcoLMCViRXMTWBhSWpr3sfPBWhfcpWDKML6eTzkZzLSKoIjkKpFahKkyHKYshxtvjdEo61
pbGVZWhj/PK9HGEgmWNzVRHYn0qa1EOrNuM9g14BuKMX0zxv4iW67SPlgiskFeT6yycXy+rHPi82FISd8srNoZ4wNnZLHDEBEZVByfj1HdeI+qLBjMqRDbFAHH6VYsxWB8JUDuuV26T1MERURERERE8kQFloiIiIiISJ7kPEXQGHMscAUwlOQiXgaw1tqsz0GNMX8CDgeWW2sztrE2xuwDPAp8kmr6u7X2Z7nGJCLe3l25gqtmvsSbS5cyqqKCi3bdjX3Gb9atMRatW8cBd/2Z1tQGvw6GPx11LDsOG84Nr87kyQ/eJ+g4nDpxJ76y0y4sbWzk6pkv8fLCz6mNlPD1yVM5YqttMMZrAXe4auZL3D7nNdricepKSvjZvvtxwOZbYJv/Bs13gm2A0D6YsovAV4FtvBVaHk5+ceQoTOnZqWkOIuspV4n0D9ZaHnh7Hn96Yw71bW3sM34zLp62G8O7uf3Gz5//L39+4/WOP4+rqOTZM77Ga0sWcfWMl/lw9Sq2rKnlkum7M2XkKJ768ANunP0Ky5samTpyFJdO34PNq2s8x25sXcvNM2/n0Q+b8fksx21VwTnTzmZtm8u1r8zgf59+QkUoxFk7T+LE7Sfy3qqVXD3zJeb2IPfKwJDzIhfGmA+BI6y183Me3Ji9gUbgLxtIWt+01h6e65igF4dFNmT+iuUc/7f70naUj/j9/Gyf/Thuu4x/hlltft3vPdtHlpSysq2VaKrwivj9TBk5irnLltIYjeKmrikRf4CvT57KN6btljHG5U8/ycPvvZPRfv/Bq5hc8U/W7zTvT+7d4YyE+PtA+yaJIfBPwNT+DWOcjHGk/9vURS6Uq0T6h58+918eePstWuIxAPzGUBkO8/TpZ1ATye3m2RUvPMcfX8/8Nxby+TA+H62dNqkP+/0cu812PPzu/I5j+owh4g/w2CmnM76qOm2MWCLK0ff8mo/XhWlz/alx42xX08pnjXXUt7UR78h3fvbffEv+/fFHGbn3p/vsx/HdyL3Sv2TLVd2ZIrisOwkLwFr7PLC6O18jIj3z25dfSLvAA7TE4/z6xedJZNlksavz/vlo1s8WNzd1FFftY7+84HOaOhVXyfYYN782i8ZoNO3rW+Nxz+IKLLe82cD64gogDrYe4u+yvrgi+fvExxB9IafzkUFFuUqkj1vR3MS9b73ZUegAxK2lMRrlL3PfyHkcr+IKoM1104orSOaersd0raUlHuP6WTMzxnj2g8f4rCHUUVwlx/Xz1qoI9dHWjuIKknnw8fff7XHulYFjo1MEU9MtAGYbY+4HHqHTTzrW2r/3MIbdjDFzgcUk7xC+3cPxRAa1N5ct9dyrvSkWY3VLS05LnM9YuKBbx3St9Tym3+fjozWr2WnY+l3i31mxPMsohjdWDfVoj3m0AbYZYvMgtE+3YpWBSblKpP+Yv2IFIcdJu1kH0JZI8Mqi7uWf7vDKU661zF68KKP9jSWf0hzPXMI8Zn3eA2UZvzkWY1VLM0NLN77ViQwcubyDdUSn3zcDB3b6swV6krTmAOOstY3GmENJJsQJXh2NMecA5wCMHTu2B4cUGdiGlZaxprU1o90AFTnuTTW8tIz6traNd0zxGUPCY7pxLJFgWJeCbmxlVfbjljR5tBrAAeJd2kvANyLnGGXAU64S6SdGlpcT83iq4xjDuA3kiELxWlZ9VEUVEWcFLYn0IstvXBLWyVZjZehO7pWBY6NTBK21Z1przwRua/99p7bbe3Jwa229tbYx9fsngIAxpi5L31ustVOstVOGDBnSk8OKDGgX7robEX/6vZOw38+J2+9AyJ/bujZ3HXti1s+CpC9a4RhDTaSEsJP+LlTIcdhjzLiMF5brSkrYvMtc93ZnbbOEzPs+ITAlkHZcAyYA4UM2ciYyWChXifQfW9bUsm3dEAJdNvkNOA5n7jI553F2HDos62dd82DE72fbuiGEnMz286dMy/j6I7Y7moDP0vm5lMGlNBAn2CXfBXw+xl
dVeebe47fbnrBfm/kONt15B+v6HNtyZowZblJLjBljdk3Fs6onY4oMdodO2Irv7LE35cEgEb+fkONw7Dbb8YO99sl5jLqSEi6fvntG+4GbbcG9J5zMuMoqQo5D0Oew8/ARPHzSqVx54CHUhCOE/X6CjsN+m23BtQcf5jn+IyefnlFkHTFha46efA0EdwMCQAh8wzDVN2JqHwD/NkAw+cu/FabmHoyvb288LEWhXCXSD9x+5DHsOXYcQcch7PczrLSUGw89kq1rPe9deHrk5NMZEs5cEOPJU77E1yZNIeL3E/EHiPgDfHWXKTx4wskcMmECQcch4vdTFQrz8333Z8+x4zLGqIjUct8x+zOhsomgL07Ql2C76iYeOv4IbjrsKIaVlnbkuz3HjuOhE0/1zL0/3HvfHn2fpH/a6CqCxpjdgN2BS4CrO31UARxjrd1pA197L7APUAcsA/6P5E9OWGtvNsZcCJxHcu5PC3CZtfbljQWtlZlENi6WSLC8qYnqSISSwKbfPXt0/jusaWnhtIk7EkiNY61lWVMjQcdJW+0p4bosbWykPBTKaUrEiqZGPl+3jm3rhlASDHa0W3cd2CbwjUhb5t0mVgIW4+jJwEDX3VUElatE+qd1ra00xqKMKCvHl2Vbj41Z2dzM/W/NZeqoMew6anRHe1s8zormJoaUlKbN4GiMRlnX2sqwsjL8vo0/a1hW/yk+4zCkfExHm2stSxobKAsEqQyHO9pjiQTLmhqpiZT0KPdK/5AtV+VSYH2BZOI5F7i500cNwGPW2g/yGGdOlLRERAa2TSiwlKtERKRXZctVG30hw1r7HPCcMeYOa+1nBYlORESkB5SrRESkr8hlmfbHSL3hZzwe3Vprj8x/WCIiIrlTrhIRkb4ilyXFrkz991hgOHBX6s+nAJ8WICaRQSvhutw1by53vfkGzbEYB26xJRftOj3nXe03heu6/N9z/+XBd94i5rqMqajgygMOYfLIUZ7917W2csOrM3nyw/cJOX5O2WFHzth5EnMWL+Zb/36KRQ31BHwOJ+8wkR/ttQ9vrVjO1TNfYv7KFWxWVc3F03Zn+ugxnmP3JdYmsM13Q/M9YFsgfCCm7HyMz3sFRCk65SqRXrKovp6rZ77ESws+pyYS4exJUzlq6208b27ky8drVnHhE4/z/upVOMaw/+ZbcPWBhxLMsjrua0sWcc3Ml/lg9Som1NRyyfTdmTxiFFe89Dx/mfs6bYkEw0rL+M3+B7L76LHcPW8uf03l3gO22JJvFDj35ouNL8Q2XgfRGeCrxZR+DcKHFfTvQjZuo+9gdXQ05nlr7d4ba+sNmtcuA9WlTz/Bvz76gJbUDvQBn48hJaU8ffoZlHZaBCKfTvzbfcxekrnJ4j9OPp0duiyB2xqPcejdf2FxY0PHBpFhv58dhw5jlsdGjdvXDeXjtas7zqe9/3UHH8b+m2+Z5zPJL3ftpdD6X5JrGgAEwBmGqX0c4+v7Sbe/6+47WJ2+TrlKpICWNTZy8N130hht69j/MOL387VJU7h0+h4FO+aef74lY7/FMRWVPHfG1zL6v7TgM85+7BFau+Se7eqGMGfpkoz+e4wZy5wlizNy71Onn0FZgXJvPtjEUuzKw8E2Aql9xUwESs/GV3ZhUWMbLLLlqu4s0z7EGLN5pwE3A7SUl0iefLp2DU99+H5aMRJzXda0tvD3+W8X5JgL69d5FlcA3/vPvzLaHn//PZY3NXUUVwCt8bhncQXw9srlaefT3v9nzz/bg6gLz8Y/htZ/s764AohBYhW25dFihSW5Ua4SKaBb57xKcyyaVuy0xOPc8tqr3dqgvjt+9vx/PTezX1C/jpcXfJ7R/vPnnk0rriCZe7yKK4CXFnzumXsffOetHkZeWLbxVrDNdBRXkJxx0XgL1m0sWlzSvQLrUuB/xpj/GWP+BzxLcjlcEcmDecuXeS4X2xKPM2PhgoIc83+ffpL1s4/WrM5om7VoIc3xWI+Pu7ihgbYuya
9Pic0D4zXtpAVis3o9HOkW5SqRApq5aCEx181oDzoOH6xeWZBjvrZ4cdbP/v3xhxltH6zu+TZ1LfE4ryxa2ONxCio2i+TuEV0YP8Q/6vVwZL1c3sECwFr7lDFmArBNqulda21hblWIDEIjysrxmrDbvkN8IWxbl/3GflUonNE2trKSkOPQ1ukJ1qYo8QcIOk6PxigoZ2SWD4LgZG5IKX2HcpVIYY2tqGT+iuUZ+SqaSDC8rLwgxxxRXs7y5ibPz7asqc1oq45EWN3S4tE7dwGfj3GVVT0ao+Cc0RB/H7r+bdgYOMM8v0R6x0afYBljvpj677HAYcAWqV+HpdpEJA8mjxjJiLJynC4vpvp9Pk6dmHWP1J4dc+Qoz0IK4NLpu2e0nbj9RJwuT9kcY6jMsqlwid9PpMsLyBG/nzN3mdS3X8ANTAbfEKBLEWgcTMlJRQlJNky5SqR3nDN5KuEu1/Wg47DrqNGMKq8oyDG/v+cXPNsDPh8nbz8xo/3rk6d65p7yLO9T1UYinrn3tALl3nwxpecAXfNvEILTMM7wYoQkKblMEWz/v/oIj1+HFygukUHHGMPdx57A1JGjCToOIcfPqPIK/nTksYyuqCzYcf956pcZWlK6Pg7g7F0mc4JH0hpaWsZfjz6+40lW0Oew0/AR/PPUL3PmTpPonJ6Gl5bx7y+dxflTp1ESCFDiDxD2+zlt4k58Y9fdCnY++WCMD1PzVwhMAoJACJzRmOrbMM6IYocn3pSrRHrBzsNH8PsDDqE2EiHs9xN0HPYZtxk3HHJEwY45ddRofrTXPmlFUEUwxKMnnY7PY2r913aZwlm7TCbi91PiDyRv7O08mf995auM7lQEGuCk7XbgiVO/0pF7w/71uXdMZeFybz6Y4C5QeQWYaiACBCG0L6bqmiJHJjmvItiXaGUmGehWtzTTGo8zoqy81570LKxfx5KGBnYaNjzrsrftrLUsbWwk6DjUlqxfUS8ajzN32VJGllcwqmJ9EmuLx1ne1ERdSQmRQKBg51AI1l0Ntg18w/v2U7cBZlNXEexLlKtkIEu4LksaG6gIhajIMhMi31zXZe6ypVSGQ2xenTk1sKvWeIwVTc0MKS0h7F+fe5Y0NLCwfh0Thw1Pexq3uqWZlnickb2Ye/PB2gQkloCvEuMrzDRN8ZYtV+X8DpYx5iNgJvAC8Ly19p08xicinRRj743RFZU5PykzxjCiPPMiHvT7mTpqdEZ7yO/v83cCszG+mmKHIN2gXCXSOxyfr6CzK7z4fD52GZHtHdlMYX/AM/eMKC/3zGH9Yd8rL8Y44M/MvVI83VlFcDvgj0AtcKUx5mNjzMOFCUtERGSTKFeJiEhRdafASgCx1H9dYBmwvBBBiYiIbCLlKhERKaqcpwgC9cA84CrgVmttzzcZEBlAPlm7hqtnvMTsxYsYWlbG+VN25cAtJuRl7M/WruHUvz/AksbkxoFjKyp56MRTWNLYyKVPP8Ena9fg9/k4cqtt+M1+B/LMJx/x42f/zaqWFiJ+P+dO2ZULpk7n1jmzuWHWDJpiMarDYX60974csdU2/OP9d7nltVdZ09rCHmPGccn03RlaUsqf35jD/W/PI5pIcPhWW3PelGmA5Q+zXuGJD98n5DicMnFHvrLTJM89vESKQLlKJAtrLY9/8B63vPYqq5qb2W3MWC6dvnvepvrd+OpMrn1lBjHXxWcMJ28/kV988QBumDWDP772Ki3xOHWREn66734csNkWfPffT/OPD94j7rpsVlXN1QcdyrjKSs5/4jFeWbgQi2WX4SO48bCj8Bm4ftZMnvnoQyKBAF/acWdOm7gTn9ev45qZL/PqooVpuXfOksVc88rLfLBqFVvW1HDp9D2Y1I3phSI9kfMiF8aYo4A9gV2BKPAyyfnt/ylceN704rD0NZ+uXcOR991FcyyGm/o3FfH7+dbue3HGzpN6NHZjNMpON1+fseeIz5iOY3U2uryChQ31Ge1b19Tynsfmi/tttj
kvd9rF3jGG8mCI7YcO5bUli2lNtQd9DmMqK4i7LksaG4mm9sIK+/3sPW48Nx92VI/OU6SzTV3kQrlKJLvrZ83g5tmzOq73PmMoDwZ54tSveL6T1B1XvvwCN87O3Ih9RGkZS5oaM9pHlZezqKEho700EKAplr6hfcjxUx0Ksaq1pWOT44jfz55jxzNj4ecZufe4bbfnwflvd+QvSOaqW484mj3GaC9DyZ9suSrnW87W2kettd8Cvg48AZwBPJ63CEX6setemUFLpws8JHeB//2MF2mLe+yy3g3ffuYpzw2IvYorwLO4AjyLK4D/fPJxR7IFSFhLYzTKKwsXpiWnqJtgQX09SxrWF1cArfE4z3/2Ke+uXJHD2YgUlnKViLfGaJSbOhVXkMwjzbEYt8x5tcfj3/ya9xhexRXgWVwBGcUVQFsizsqW5o7iCpI59j8ff0hzNDP33j1vblr+gmSu+vnzz270PETyIecCyxjzUGp1pmuBUuDLQHWhAhPpT2YvWUQiW8FTv65HY89ZsqhHX78p4tYlYd2M9mgiQdRNZLQb4M1lS3shMpENU64S8fbh6lWeU7ljrsvMhQt6PH62m375EvcY3wKuxy3IbJF8sEozhqV3dOcdrN8Ac6y1mT9dAcaYA6y1z+QnLJH+ZVR5BQvrM58cxVw3bZ+oTTG0tIzlzc09GmNT+IzJKBrbN3ns2u4zPkaUae8N6ROUq0Q8DCstI5bwvkE2th9so2HILJyMMXRnP9fqSCSvMYlk050pgq9mS1gpV+QhHpF+6fwp04h02Zw35DgctMWWVIV7dkG/Yv+DutU/22ITIcfxbB9SUkKwy2dhx6EqHMHXZaPFkN+f0dcxhqpwmN3HjO1WnCKFoFwl4m1EeTnTR4/NvN77/ZwzeWqPx580fIRne7YfNLu7MFKwS3/HGIaVlnnm3m1qh2S0R/x+zpnU8/MUyUU+l/3qP1tei+TZXuPG85MvfJHKUIiIP0DQcTh4y626XRx52XbIUL6z+15pbQa4Yv8D+cauu+Hr9E+vOhzhmdPP4KAttkzrv0VVNS+ceQ7b1NWltX9h3Hj+ddoZfGHceIKOQ8TvpzYS4aoDD+Xhk05l4tBhBB2HkOMwrrKKu489kb8ecwJjKyoJOclia8dhw7nv+JNwtIqg9A/KVTJoXX/I4ew7frOO631NOMJv9z+YySNG9Xjs+487ifGVVWlt1eEwL531dbaoSp+le9AWW/LM6WdQ3ekGpA/DN3bdjTuOOpaQs744Cvh8XH/wYfz56OMZWV5OOHWjb8rIUTx80qn8dJ/90nPvFhN48ISTOXPnyUT8fkr8ASJ+P2fuPImvTer2ujkimyTnVQQ3OpAxc6y1PVsuLUdamUn6qrjrsqShgapwmPJQKO/j/++Tjwk6fnYfu/5pUdx1eX3JYoaWljKuUxJrjEZ5Z/lyNquuYkhpWUf7yuZmPl69mm2H1FEeCne0r2ttpb6tjZHl5WnF0ormJmKJBCPKyjGpJ1rWWpY0NhB0/NT1cAqkiJdNXUUwh3GVq2TQq29rZV1r5vU+H1Y1N/Higs+YNHwUYzpNPVzW2Mgna1ezw9DhlAWDHe2frV3D8qYmdhkxMu2p1jsrlpFwLROHDe9os9ayuLGBsONPm37fnnsrw2EqOuXe1niM5U1NDC0tJewP5PU8RSB7rlKBJSIifY4KLBER6et6vEx7Dj7N41giIiKF8GmxAxARkYFto6sIGmOO3dDn1tq/p/67wX4iIiKFolwlIiJ9RS7LtB+xgc8s8Pc8xSIy6CVcl3veepO/vvkGzbEoB20+gQt3nU7I7+fm2bN49L35OMZw/HY78NVdJhPye/8Tfm/VSq6e8RJvLlvK6IpKLtp1OnuNG+/Z11rLY++/yy1zZrOmpYU9x47lG9N2Z1R5RbdiX1i/jmtmvszLCz+nLlLC1yfvymFbbd3db4HIplKuEuklixrquXbmy7y04HNqIhHOnjSFI7bahndWLOeaV2bw9v
JljKuq4uJpuzN99BjPMay1PDT/bf70+musa2tjn/GbcfG03Rja6Z3hzla3NPOHWa/w9McfUBII8qUdd+a0iTtlrHa7Mf/66ANumj2LZU2NTB05mkun7874Km2VJ/mVt3ewepPmtctAdfm/nuSpD9+nJbUDfcDnY0hJKRXhMJ+sWU1bag+TsN/PzsNHcPcxJ3QsPNFu/soVnPC3e2mJxTr2DAn7/fzyiwdwzDbbZRzz6pkvc9ucVzuO6RhDeSjEU6d9JWui62pJQwOH3nMnDdFox2aTEb+f86dO44Kp0zflWyGDXKHewepNylUyEC1rbOSQe+6koa2tY0/EiN/PEVttw2Pvv0trPJ6We64+8FAO2nJCxji/eP5Z7n3rzY7c4/f5qAqHefq0MzL2q2qKRjn47jtZ3tRIzHU7jnnIlltx5YGH5Bz7n9+Yw5Uvv9BxTJ8xlAQCPHbylxhXVdXN74RInt7BMsYcZoz5tjHmx+2/8heiyOD22dq1PPHBex0XfkhuVLyyuYmPV68vrgBa43HeXLaU2UsWZYxz5csvpBVX7f1/+cL/OoqfdvVtbdzy2qy0YyaspSka5dY5uf9g+MfXZtEUi6WN3xKP84dXX6EpGs15HJF8UK4SKZzbXp9NUzSatuF8SzzO3955i5ZOxRUkc8/PX3g2YzPglc3N3D1vblruibsuDW1t/PXNNzKO+dD8t1nd0txRXLUf858fvMfn69bmFHdbPM5VM15MO6ZrLc2xGNfPmpHTGCK5yrnAMsbcDJwEXERyH5ETgHEFiktk0Jm3fKnnxotR1yXqZu6bGkskmLt0aUb7G0uXZOx2D9AUjbGquTmt7YPVKzM2nYRkYffKwgU5xz5z0ULinRJfO7/Px4drVuc8jkhPKVeJFNbMhQvSCp122eZDLWtsojkWS2ubv2K5Z+5pSySYsfDzjPYZCxekFUbt/D4fby7LzINeFtSv82x3reXVxZk3K0V6ojtPsHa31n4ZWGOt/SmwG+A9sVZEum14WblngvIZ41l4BR2HEWXlGe3ZpvUZSNsfBGBYaRnRRGaiNMDoisqM9mzGVHi/rxVLJBhWWprzOCJ5oFwlUkBjKiq7tVt3yO8Q7vK+8PCycs+bco4xjK3MzD3jKisJZNmva0R5Zh70Uhsp8SwMgW6/cyyyMd0psFpS/202xowEYsBm+Q9JZHCaPGIkI8rKcbq8UxX0+dJ2tYdkART2+9l/8y0yxrlo1+lEuiSzsN/Pcdttn7EoxuiKSqaMHEnAl34nMeT38/XJU3OO/euTd804ZtBxmD56LMM9ikCRAlKuEimgsydNycglQcdhy+oaz9zzpR13ztjMeEJtLVvXDckomoKOw5k7T8445mkTd8640egYw/CyciYNH5lT3NWRCAdsvgWhLk/OIn4/503dNacxRHLVnQLrcWNMFfA7YA7JvUTuK0BMIoOSMYa7jjmBySNGEXSSd/xGlpfzp6OO4/7jT2Lz6mpCjkPQcdimbgj3H3+y5yqCh07Ymm/uvhelgSAl/gAhx+GorbflR3vv63ncPxx6JF8YP56g4xDx+6kJR/j9AQez0/AROcc+ZeQofrPfQVSFw0T8AYKOw77jN+P6Qw7f5O+HyCZSrhIpoF1GjOTKAw6mJhwh4vcTdBz2HjeeB084hQumTqckEKAkECDk+Dlp+4lcvtuenuPcfsQx7DZmbEfuGVJSyvWHHME2dUMy+o6prOS2I45hRFk54dQxJ48Yxd3HZi70tCG/3f9gDtxiQuqYASpCIX7yhS+y19jxm/rtEPGU8yqCxpiQtbat/fdAGGhtb+tNWplJBrpVzc20xuOMLC9PSx5LGhpwfCan1f2iiQRLGxuoiZRQFgxutP+61lbWtbUyqrwi425jrhKuy6KGeipDYSrD4U0aQwQ2fRVB5SqR3pHtet8Wj7O0sZEhpaWUBAIbHWdNSwuN0SijKio2uuS6tZbFDQ2E/X5qS0o2Ofb6tjbWtrYwoqycgMe7YC
K5yparctkHq90MYBJAKlG1GWPmtLeJSP5kSxy5zjWH5FSLsZVVOfevDPe8KHJ8vm4dU6QAlKtEekG2633I7+/WkufVkUjGsuzZGGMYleWd3+6oCIUy3kkWyaeNFljGmOHAKCBijNkFOt5trAA2/faBiIhInihXiYhIX5HLE6yDgDOA0cBVndrrge8XICYREZHuUq4SEZE+YaMFlrX2TuBOY8xx1tqHeiEmkZwtaqjnuldm8OLnn1FbUsI5k6Zw2IStu/XSa7E889GH3Dj7FZY3NbHrqNFcMm13xlRWcs+8ufz1zTdojsU4aIsJXDB1Ws7TJ0QGK+Uq6cv+8/FH3Dj7FZY2NjJl5Cgumb47m1VVFzusjWpoa+Pm12bx2PvvEnQcTt5+R76y0y6saG7i2vbcG4lw9uSpHN5Pcq9Ib+jOIhfDgV8CI621hxhjtgN2s9beXsgAvejFYQFY1tjIIXffSUO0rWNH+Yg/wNcnT+Ub03YrcnQb9uc35nDlyy90bJzoGENJIMBuo8fywuefdrQHfD6Glpbx1GlfoTSHhSpEBooeLHKhXCV9yl1vvsGvX3yu47ruM4aIP8A/Tjm9TxdZbfE4R9z7Vz6vX0c0kdzsPuL3M2XEKOatWEZDW+fc60/l3t2LGbJIr8uWq7qzVNifgaeB9g0H3gcu6XloIpvm1jmv0hSLdlzgAVriMW6aPYuGtl5fMCxnbfE4V814MW1X+oS1NEVj/PvjD9PaY67L6pZmHn73nWKEKtIfKVdJnxFNJPhtp5tpAK61tMZjXPfKjCJGtnFPffQBixsbOoorgJZ4nBkLF9DYqbhqb79p9izq+3DuFelN3Smw6qy1DwAugLU2DiQ2/CUihTNz0ULPXdkDjo8PV68qQkS5+XzdOs92F4vX8+SWeJyXF3xe2KBEBg7lKukzFjXU43rMFEpYy+zFC4sQUe5eXbSQ5lgsoz1hXeIe5xR0HD5YvbI3QhPp87pTYDUZY2oh+TOgMWY64P2TokgvGFNRgdds71giwdCyje8TVSx1JSWehSHgOX894PN1a8lbkUFOuUr6jNpIhHiW6/2Isp4vN15IoysqCXnsEZVtr6poIsHw0ty3EhEZyLpTYF0G/APY3BjzEvAX4KKCRCWSg7MnTSXsT1+nJeg47DpqNKPK+27iqo5E2G+zzTMSV8TvZ2hJKU6X5OX3+Th1h516M0SR/ky5SvqMilCYg7eYQMhJz1URv5/zp04rUlS5OW677fF32XTeZwyV4XBG/go6DlNGjsrLHlUiA0F3Cqx3gIeBV4FlwK0k57aLFMWkESP57f4HUx2OEPEHCDoOe48bz/WHHFHs0DbqygMO4YDNtyToOJT4A1SEQvzkC1/k4ZNOY5cRIwk6DmG/nxFl5dx2xDGMqawsdsgi/YVylfQpv9n/QA7eYv31vjwY4od778s+4zcrdmgbNKSklL8ecwJjKyoJ+/0EHYcdhgzl4RNP4/cHHpKWe/caO54/HHpksUMW6TO6s4rgAyT3E7k71XQKUG2tPaFAsWWllZmks4TrsrC+nspwiKpw/1rOvL6tlTUtrYwsLyfQ6Y7gyuZmWuMxRpVXaNlbGZR6sIqgcpX0SfVtbaxpacm43vd11loWNzQQcJKr2rbrz7lXJF+y5apcNhput7W1tvM8pWeNMXN7HppIzzj9+B2lilCYilA4o72upKQI0YgMCMpV0idVhEJUhELFDqPbjDGeU//6c+4VKbTuTBF8PfWyMADGmGnAS/kPSUREZJMpV4mISFF15wnWNODLxpj29aLHAvONMfMAa63dMe/RiYiIdI9ylYiIFFV3CqyDuzu4MeZPwOHAcmvtDh6fG+Ba4FCgGTjDWjunu8cR6QnXdfnhs//m4XffIe66bFFdwzUHHcY2Q4Z49v9kzRq+8dRjvLtyJY7xcfCWE7jywEMyVltq98bSJVwz82U+WL2KrWpruWTa7uw0fES3YmyOxbjltVd59L35+IzhxO124MxdJr
O2tYXrZ83kf59+TEUozNd2mcLR22yr97ZkMFOukgHp3ZUruOTpJ/ho9Sr8Ph9Hbb0tv/riAfiy5J47577ONTNfoiEapSYc4Yd778ORW2/r2Tfuutz15hvcM28ubYkEh03YinOnTOv2lMb5K5ZzzSsv89by5YyvquLiabuz66jRvPD5p9wwayYL6+vZefhwLpm2BxNqa7v9PRDpL3Je5GKTBjdmb6AR+EuWpHUoyeVzDyV51/Faa+1G1y3Vi8OST0fc+1feXrE8rc0A//7SWWxWXZ3WvqyxkT3/fEvaDvYA4yur+O9Xvpox9ssLPudrjz1Mazze0Rb2+7n9iGPYbczYnOJLuC5H3383H65eRVsiuV9q2PEzcegwPlm3hrWtrR37rET8fk6buBPf32ufnMYW6as2dZGLTTyWcpX0aQvWrWXfO2+n645a29TV8cSpX8nof9XMl7hh1syM9l9/8QBO2iHzIe55/3yU5z/7lJZUrgr6HEZXVvDPU75MyJ/bvfi5y5Zy6kP30xqP054hw34/p2w/kXvfnteRB33GEPb7efCEU9imzvtGpkh/kS1XdecdrG6z1j4PrN5Al6NIJjRrrZ0JVBljundrX6QH3l25IqO4guQOpT969pmM9p89/9+M4grg03VrmbVwYWb/5/6bVlwBtMbj/OKF/+Uc47Offswna9d0FFcArYk4ry9bwrpOxRVASzzOX998g5XNzTmPLzLYKVdJX/f9/z6TUVwBvLtyJe+uWJHW5rouN736iuc4XrnnvVUrea5TcQUQdRMsbWzkiQ9y3+Hg1y88R0un4gqS+e7Oua+n5UHXWlpiMX4/Q69GysBV0AIrB6OABZ3+vDDVJtIr/vPJR1k/e2v5soy22YsXZ+3/9EcfpP3ZWssHq1d59n1v1cocI4Q5S5bQHItltCdcl5ibmXKDjsP8lZlFo4hsMuUqKSqvfNTu6Y/Tc09jNOp5IxCgySOXvLF0CV6TyptjMWYuWuDxSZYYV3jH6FUYWmDOkuz5VKS/K3aB5fVv2vOqYIw5xxgz2xgze0WXuzUim2rLmuxzwGsimUulDystzdp/i5qatD8bY6gKZy7BDmRt9zKivJyIxxQNxxjPf0Bx12V4aXnO44vIRilXSVHVeuSjdhNq6tL+XBIMZu3reLyfO7y0DJ9He9BxGOOxPPumxOhlQ/lUpL8rdoG1EBjT6c+jAc9bGtbaW6y1U6y1U4ZkWXxApLsO2mKCZ/EC8K099spo++6eX/DsG/D5OHn7iRntZ0+amjF+xO/n7ElTc47xyK22wenyErMBSoPBjLnxAZ+PreuG6OVhkfxSrpKi+rZHPoLkO06HTtgqrc3v8zFp+EjP/oduuVVG255jx1EZCmcUWX6fjxM98lo250+d5pnvtq0bQtij/YKp0xEZqIpdYP2D5HK6JrVvyTpr7ZIixySDzGOnnE5lp5WSDHDhrtM5xCMR7T5mLN/dY6+0RFQWDPLQiad6ruR0zuSpfHmnXQj7/ZQEAoT9fr680y6cPSn3d/crw2HuPfZENquqJuT4CTkOW9cN4aETT+W6gw+jNlJCxB8g6DhMHz2G2484pnvfABHZGOUqKaoDt5jAxdN2S3uUWhkK8Y+TTvfsf8+xJ7BNXfqTremjRnP1QYdm9HV8Pu47/iQmDh1G0HEI+/2MrqjgzqOPY2hpWc4xnrjdDqkiK0BJIEDIcTh+ux342wmncMRW2xByHEoCAUoDQS6bvgeHbbV1zmOL9DeFXkXwXmAfoA5YBvwfEACw1t6cWvr2BpLL6jYDZ1prN7rkklZmkkL4eM0qljY2seuo0VmXXG/nui6zFy+mIhzKaRWklliMpU2NDC8tIxIIbFJ81loWNzbgGMPwsvVTAF1rWbBuHeWhoOe0RpH+qJdXEVSukn4h7rrMXryQukgpW+YwU2FFUyPvrlrJxKHDqApHcujfRFsizqjyik3e7qMtHmdxYwNDS0op7TRdsaGtjZUtzYwsK895ZU
KRvi5bripogVUoSloiIgNbbxZYhaJcJSIysBVlmXYREREREZHBRAWWiIiIiIhInqjAEhERERERyRMVWCIiIiIiInmiAktERERERCRPVGCJiIiIiIjkiQosERERERGRPFGBJSIiIiIikicqsERERERERPJEBZaIiIiIiEieqMASERERERHJExVYIiIiIiIieaICS0REREREJE9UYImIiIiIiOSJCiwREREREZE8UYElIiIiIiKSJyqwRERERERE8kQFloiIiIiISJ6owBIREREREckTFVgiIiIiIiJ5ogJLREREREQkT1RgiYiIiIiI5IkKLBERERERkTxRgSUiIiIiIpInKrBERERERETyRAWWiIiIiIhInqjAEhERERERyRMVWCIiIiIiInmiAktERERERCRPVGCJiIiIiIjkiQosERERERGRPFGBJSIiIiIikicqsERERERERPJEBZaIiIiIiEieqMASERERERHJE3+xAxisrLU8e99LPHT1Y9SvamT6YZM55fvHUDO8utihiYiIALBuZT33/vrvvPzobEorSzj24sPY/0t7Y4wpdmgiIn2WCqwi+dMP7uGR65+ktakNgMf/+C+ee3AGt827iora8iJHJyIig13TuibOm/xt1ixbRzwaB+C6C27l/dc+4oJrzypydCIifZemCBbBupX1/P2af3YUVwDxWIKmtU08+oenihiZiIhI0j9v/Q/1Kxs6iiuA1qY2/nnLv1m5eHURIxMR6dtUYBXBR298SiAUyGiPtsaY8+83ixCRiIhIutf/8yZtLdGM9kDIz/uzPypCRCIi/YMKrCKoHVVDPBbPaPf5DMM3G1qEiERERNINHz8Un5P5Y4KbcKkbVVOEiERE+gcVWEUwbtvRbLbDWJyAk9YeCAc59uLDihSViIjIekdfdAiBUPqr2o7fx4jNhzFh0uZFikpEpO9TgVUkv3j8e+y493YEQgHCZWEq6sr5zl8uUtISEZE+Ydx2Y/jR/ZdRPayScGmIQCjAdrtvw2+e/qFWERQR2QCtIlgklXUV/PaZH7Nm2Voa1zYxcsvhOI6z8S8UERHpJdMOm8x9i25h8YdLKamIaCsREZEcqMAqsuphVVQPqyp2GCIiIp58Ph+jtxpZ7DBERPoNTREUERERERHJExVYIiIiIiIieVLwAssYc7Ax5j1jzIfGmO96fL6PMWadMeaN1K8fFzqmviwei/PQ1Y/xtR0u5axtL+aeXz1Ea3Pbxr9QREQ2ifJU9y36cAlXfOV6vrT5BXzziz/htWfmFjskEZE+o6DvYBljHOAPwAHAQuBVY8w/rLXvdOn6grX28ELG0h9Ya/nxUVfw5vPv0Nac3Nzx7l8+xEuPvMp1M36pRTBERPJMear7Fr6/mAumfpfW5jbchMvST5fz7qwPueiGr3LQGfsWOzwRkaIr9BOsXYEPrbUfW2ujwH3AUQU+Zr81/5UPmPfC/I7iCiDaEmPBu4t45Z9zihiZiMiApTzVTXf8+H5am1pxE25HW1tzGzdffieJeKKIkYmI9A2FLrBGAQs6/Xlhqq2r3Ywxc40xTxpjti9wTH3W/BnvE49lJqeWxlbeevHdIkQkIjLgKU9101svvovr2oz2eFucFQtXFSEiEZG+pdAFltdOhF2vynOAcdbanYDrgUc8BzLmHGPMbGPM7BUrVuQ3yj6idmQ1gVDmrM1QJMjQsXVFiEhEZMDLW56CwZOrvCQSLuU1Zb0cjYhI31PoAmshMKbTn0cDizt3sNbWW2sbU79/AggYYzKqCWvtLdbaKdbaKUOGDClkzEWz+1FTCYQCmC7p3vE7fPHUPYsTlIjIwJa3PJX6fMDnqlO+dwyhklBaWzAcYO/jp1NaUVKkqERE+o5CF1ivAhOMMZsZY4LAycA/Oncwxgw3JllSGGN2TcU0KOcYBMNBrnruZ4zdbgzBcIBQSZARmw/jt//+MRU15cUOT0RkIFKe6qY9j5nGWb86hUhZmEhZmEAowB7HTO
PSW75e7NBERPqEgq4iaK2NG2MuBJ4GHOBP1tq3jTHnpj6/GTgeOM8YEwdagJOttZmTuweJcduO5rZ5V7HssxUk4glGbD4M0/WRloiI5IXy1KY59huHcfg5B7Dkk+VUD6vUTUARkU5Mf8wRU6ZMsbNnzy52GCIiUiDGmNestVOKHUdPKFeJiAxs2XJVwTcaFhERERERGSxUYImIiIiIiORJQd/BGqhWLVnDPb/6O7Ofep3KIRWccPmR7HXc9Kz9b7jodh7/4zMk4glCJSHOu+YrHPrV/XnugZd58OrHaVjVwLTDJnHK946lvKaMf9z4NE/e9h8SCZf9v7Q3x158GOEuKza1W/bZCu76xYPMffZtakdVc/J3jmHaoZMKdeoiItJPvPbMXO799cMs/3wlO+y5Daf/6HhGbjHcs+/aFeu4bO8fs+C9xWBgzNajuOq5n2KM4f7fPsJLj7xKWWUJR3/jUPY/fW8Wf7SUu3/xEPNemM/w8UM55XvHMGn/HbPG8vI/XuWB3/2D1UvWMGn/iZz6g+MYOkbbj4jIwKR3sLppzfJ1nDPxMhrWNpFIbQocLg1x0reP4vQfnZDR/weH/4pZT7ye0b7LfhOZP/N9WpvaAPAHHMpryxm/3Wjemfk+bc1RAIKRAJvtMJZrX/4ljuOkjbH88xV8fZdv0dLQQiLuAhAqCXHOb0/nyPMPzut5i4j0Jr2D1TNP3/Es1194O23NyRzjc3yES0PcOPsKRm05Iq1vIpHg0PApuIn0nwd8jo/q4ZXUr2ggFo0DyXy313HTeemRWbQ2teEm1ueei/7wVQ76yr4ZsTx41WPc8eP7O2Jx/A4lFRFumXsldaNq837uIiK9Re9g5cnfr3mcpnXNHcUVQGtTG/f++hGa1jWl9Y1Go57FFcDr/5nXUVwBxGMJGlY38uYL8zuKK4BoS4zP5y/yHOfuX/49rbgCaGtu47bv3UO0LbbJ5ygiIv1XIp7g5svv7ChoANyES2tjK3/5yQMZ/a87/9aM4qr9a9YsXdtRXEEy3/37rudpaWjpKK4gmXv+ePlfSMQTaWO0NrelFVft8TU3tHD/FY/26DxFRPoqFVjdNOff89KSTbtAyM8n8z5Pa3t35ofdGjsejeN2SU4ALY2tzHthfkb7G8++lVZcdbbogyXdOraIiAwMyz9fSdwjT7mu5c3n3slon/PveVnH8iq8rLV4TX6JtcZYsTB9e7AF7y7C8Wf+qJGIJXj9v9mPKyLSn6nA6qZh4+rw2pYqFo1TO7ImrW30ViMyO26AMQafk/lXEooEGTo2c676kNHeUysSsThVQyu7dWwRERkYKmrLst58qx1Vk9k2srpb42fbmzGRcCmvKUtrqx5W6XlTEmCIR14TERkIVGB10/GXH0kwEkxr8wcctp6yBSM2H5bWXjO8mtKqEs9xSsoj+APp71QFwwHCZeGMAs7xO3zx1D0zxjj5u8cQ6rL4RSAUYMpBO1OtAktEZFAqrSxlz2OnEQwH0trDJSFO/d6xGf0vv/38rGMFQulrYTl+H0NG12bkwWA4wN7HT6e0Ij3n1Y2qZacvbJcxTqgkxEnfOiqn8xER6W9UYHXTdtO34rJbz6WsupRwWZhAKMBO+2zPTx7+lmf/O967jkhZOK1t5JbD+fP717LDntsSCCWLqoraMr5954Vc+9IvGbvtaIKRIKFIkOGbDeW3//4xFTXlGWNPOXAnzr3qK5SUR4ikYtn1kF347l8vKsi5i4hI/3DZreey25FTCIQCRMrDRMrCnPmrU9j9qKkZfcdsNZKv/vrUjPazrzidH//tm1QOqSBcGiIQCrDt9K24fuavOOuXpxApCxMpDxMIB9j96F259Jave8byw/suZdIBOyVjKQtTWlnChdefxc777pD38xYR6Qu0iuAmisfiLP5oGeXVpVQPq9po/w/f+IR3ZrzPbkdOYUinVZNWL11D49pmRm05HMe//onW0k+X4yZcRmw+LOt0jHaxaIzFHy2jak
gFlXUVm3xOIiJ9hVYRzI/6VQ2sWb6OEZsNJRgObrBvIpHguftfxjiGvY/frWPl2kQiweIPlxIpj1DXaSp8tDXKkk+WUz2s0vMmYFdrV6xj3coGRm4xjEAwsNH+IiJ9XbZcpQJLRET6HBVYIiLS12mZdhERERERkQJTgSUiIiIiIpIn/o13ka4+fOMTfnvGDXw+fxHBUIDDzz2Qc377JR79w5Pc8u27iLZEcQIOR190COde+RXefP4d7vnlQyz+aBnb7rYVp//wOMZsPcpz7HgszmM3Pc0Tt/0HN+FywJe+wDEXH0ooEvLsLyIi4uX+3z7Cfb95hJbGVoaNq+OyW89jh7224bsH/oK5/3sbay3Vwyr5+ePfZbMdxvHI9U/yrzueBWM46Ix9OOrCQwiGvN+VWvzRUu7+xUPMe2E+w8YP4ZTvHcuk/Sb28hmKiPRNegermz59ewHn7Hg5Xb9vIzYfypKPl2f032b6BD558zPamqMA+BwfoUiQ62b8ivHbj0nra63lB4f9mjeff6dj1/tgJMjmE8dyzUu/6HjhWERkoNM7WD3z+6/dyFN/ejajvaQiQnN9S0b7VlO24LO3F9DWksxVoUiQrXfdkiv/+5OMhZYWfrCEC6Z+h9amNtxEcr+tUEmQb9x4Ngd+eZ/8n4yISB+ld7Dy5Kqzb8oorgDP4grg3ZkfdBRXAG7CpbWpldu/d3dG3/mvfMC8F9YXVwDRliifvbOQWU+8nofoRURkoIu2Rnnqz5nFFeBZXAF8MOfjjuIKoK0lyvuzP2bu/97O6Hvnj++jtbG1o7gCaGuOcvNld5KIJ3oYvYhI/6cCq5s+fvPzHo9hLbw9472M9vkz3icRy0xOLY2tzHthfo+PKyIiA9/8mR9ANyenWDfzC6Itbbz9cmaumvfCu7ge/WOtMVYsXNW9A4uIDEAqsLqprKpk451yUOOxd1btyGr8oczX4kKRIEPH1uXluCIiMrCN2HxoXsYJRoLUdtr3ql3NiCrP/omES3lNWV6OLSLSn6nA6qYv/d8J3h9k2Qs4EAoQjKRv7hguDXHyd4/J6Lv7UVMJhAJ03VfY8Tt88ZQ9NyVcEREZZIaOHcKQbt6U65qnAPwBP184YXpG+ynfO5ZQSfrCS8FwgL2Om0ZpRX5uQoqI9GcqsLrpsLMP4IjzDkorqMpryrh9/tUZT5lKyiPc/dkf2Pu46QRCASLlYcKlIU79/rHsd9peGWMHw0Gu+t9PGbPNKIKRIKGSIMPHD+GKZ35ERW15oU9NREQGiBtnX0Hd6PSnT7sfPZVfPvkDjC/9Lt4+J+3OtS/9gpFbDidUEiQUCTJyy+Fc+exPiJRFMsbe69hpnPHzkwmXhYmUhwmEA+x21FQuveXcgp6TiEh/oVUEN1Frcyuv//dtho2tZfMdx3e0L/lkGbOemMPEvbdl84nr2+tXN7Bm6VqGjR9KuGTjS64v+WQZbsJl5BbDM1ZwEhEZ6LSKYH4s+mAJn7+7kIl7bUtZ1frpe7OeeI3lC1az35f3JpLaBsRay5KPl2GMYfhmQzeae6KtUZZ8vIzqYVW6CSgig1K2XKUCS0RE+hwVWCIi0tdpmXYREREREZECU4ElIiIiIiKSJ5lrgg9wMx9/jft/9wirl6xl8gE7cur3j6VuVK1n3zXL1/Lr065l3gvv4vMZdjtyKt++4wJeemQWvzvrRmKtMQC23nVLbpj5ay7e8/u88/IHHV8/dGwdd396E1/f+Zt8/OZnHe0HfHlvLr31XM7a+hKWfprcoNjnGC677Tz2O3UvHrv5Xzx5239wEy77n743x1x8KKGI93tbyz9fwV2/eIi5/3ubulE1nPTto9n1kF3y9e0SEZFe1lTfzN9+/xjPP/AyoZIQR5x3EAeftS8+n/c90ecfnMGNl/6ZtcvqKa8t4+zfnMaBX9mX7x/6S1596g0guRrt2b89ncPO3Z9jqs4gHl2/5+LZvz2daYfuwtd3+XbHXoxOwOG2t6/m47mf8s
tTrsGNJzcVrh1Vw18/voEVn6/i7l88xFsvzmfouCGc+v1j2eWLE7OeU+fcO2n/iZz6/eMYMto794qI9HeD6h2sB69+nDt+dB9tzW1AMuGUVpbwx7lXUtdlr4/W5laOG/JVop12tgcorymlYXVT5uCG7m3smKX/lpM2Y8G7i2hrTh43GAmy+cSxXPPSL3AcJ63v8s9X8PVdvkVLQwuJVPILlYQ453df4sjzDupGMCIifctgfQcr2hrl3EnfZukny4m1JW/ihUtC7HX8dL59x4UZ/Z+47d9cfc4fM9orasupX9WwaYFvhPEZImVhWpvacBPtuSfIN248mwO/vE9G/79f+zh/+kF67i2pjHDL3N9n5F4Rkf5k0L+D1drcllZcASTiCZrrm3ngd49m9L/jx/dnFFeAd3EF3SuuNtD/wzmfdBRXANGWKJ+9s5BXn3wjo+/dv/x7WnEF0Nbcxu3fvZtoKjGLiEj/8ex9L7FiwcqO4gqS+eu5B15m4QdLMvrfdOmdnuMUqrgCsK6lub6lo7gCaGuOcvNld5KIJ9L6tja3pRVX0J57W7j/ikcKFqOISDENmgLr8/kLcfyZpxuPJXj93/My2uf8a25vhJWTlsZW3nz+nYz2N559K624amdJLs0rIiL9y+v/mUdrU1tGu+N3mD/j/Yz21qbW3ggrJ7HWGCsWrkprW/DuIs/cm4gleP0/mblXRGQgGDQFVvWwKuLRuOdnQ8ZkzgOv7UNzw0ORIEPH1GW0Z5u/nojFqRpaWeiwREQkz4aNG4I/6GR+YAy1I6szm319Z5/ERMKlvKYsra16WCWxbuReEZGBYNAUWENG17LDXtsSCKav6xEqCXLit47K6H/2b07rrdDS+AN+uu7t6Pgdvnjqnhl9T/rO0YS6bFocCPmZfMBOVKvAEhHpdw49e3/8/vQ85fMZyqtL2Wnf7TP6Tzt0Um+FliYYCab/ORxgr+OmUVpRktZeN6qWHT1zb8gz94qIDASDpsAC+NH9l7HzfhMJhAJEysOUVpZwwXVnsfO+O2T03XzH8Vz0h6/hc9Z/iwIhP79+8geM3W5URv+f/uNb3sf826UZbT7Hx+V/Oi+jfei4Om5+/beM3noUoUiQUEmIYeOG8Jt//YiK2vKM/lMP2pmvX/klIuVhIuVhAqEAUw7ame/e9Y0Nfh9ERKRvGjZuCD999DvUjKgiXBoiGA6w+U7j+f3/fpqx0BHA//39m2w1dYu0tnHbj+Gm13+b8XSremglh593YMYYkYoIkw7YMaN98kE7MWabkRnt3/jD1zjz5ycTLgsTKY8QCAXY7YgpXHrLuZ7n9KMHLmOX/dfn3pKKCBdce+YGVx0UEenPBtUqgu3WLFvLupUNjJownEAwsMG+rusy55k3CZWGmLjnth3ta1eu5eFrnmCLncez9/G7d7S/9Ogr3HfFo+xz8h4c943DOtqf+ev/ePnR2Rx/+RFsv9vWHe0P/+FJPnvzM8785SlU1q1/6rTkk2Uk4i6jthyO6fpIq4toW4zFHy6lckiFnlyJyIAwWFcRbOe6LgvfX0K4JMjQsUM22n/l4tW8+8oHbDV587T+s56aw7zn3+Wwc/Zn+PihHe1XfvUPLPxwCd+//3KGDk9OPYxGo9xw4e34fA7nX3cGwWDyKdW6dU3c8b27GbP1SI69+PCOMdpa2ljy8XKqh1VSWVex0Ri7k3tFRPqDbLlqUBZYIiLStw32AktERPq+Qb9Mu4iIiIiISKGpwBIREREREckT/8a7DHyu6/LUn57lsZuepq0lyj4n7c7xlx1BSXmkW+N8Nn8hd//iQd579SNGTxjBaT88ju1225qfn/h7nn9oJtjkAhdf+cmJnPqD4wp0NiIiMhCtWLiKe371EK//5y1qR1Zz4reO6vYqgtZa/nvPizx8/RM0rWtmj6OncuK3jqKtuY3zp36XtcvWAVA1rJKbXv8ddcMzl4YXEZEN0ztYwBVfuZ4XHnqlY6f5YDjAiC2Gc+PsKw
iGcnsR96O5n3LJnj8i2tKG6ya/p6FIkJqR1Sz5aFlG/1O/fwxn/uLUvJ2DiMhAonew0q1ctIpzdvomzfUtJOIJILnU+Vd/fSrHXHRozuPcdOmfeeK2/3RsZhwI+akcUsHKhas9+z8ZvTdj2XgREUnSO1hZLHhvEc8/OLOjuAKItsZY9ulynv/bjJzHueXbf6W1qbWjuAJoa4l6FlcA9/76kU2OWUREBpd7f/MIzQ3riyuAtuY2/vSDe2lradvAV663cvFqHrv5mY7iCiDWFmfVYu/iCuBnJ1y16UGLiAxSg77Aevvl9/H5MpdBb21qY85/3sx5nPkzP+jWcfvjk0MRESmON/47j0QskdHu8xkWvLc4pzHef/UjAqHMp1HWzf41855/J+cYRUQkadAXWLUjq/H5Mr8NgaCfYeM2vvdIu6ohmRsBi4iI5EPd6FrP9ng0TvWwqpzGqBlRlTbLIheVdcptIiLdNegLrEn7T6SkMpKx470TcDjkrC/mPM5J3zmacEkorS0UCWKyfIdHbz2y27GKiMjgdNK3jiLUJccEgn52+sL21I7IbSGKraduydAxtfic9MTkDzpZv+b7917a/WBFRAa5QV9gOY7DVf/7GZtNHEswHCBcGqJmRBU/e/Q7DB2b+xOsQ7+2P8ddfjihSJCS8giBUIAvnLg7t79zbUYyK6su5dZ5v8/3qYiIyAA1af8dOe/qr1BSHiFSHiYQCrDL/hP5wX2X5DyGMYYrnvkxW0/dIpnvysJU1JXz4799kyPOOzCj/1EXHMRWkzbP41mIiAwOWkWwk6WfLifaGmP0ViM8pw3moqWxhaWfrqB2ZDUVNeunVrz2zFzm/OdNDvjSFxi//dh8hSwiMiBpFUFv0bYYiz5YQtWQipynBnpZvmAlLQ0tjN56JI6TfIIVj8f52+8fw+czHHfp4Vo9UERkI7LlKhVYIiLS56jAEhGRvk7LtIuIiIiIiBSYCiwREREREZE8KXiBZYw52BjznjHmQ2PMdz0+N8aY61Kfv2mMmVTomERERNopT4mISD4VtMAyxjjAH4BDgO2AU4wx23XpdggwIfXrHOCmQsYkIiLSTnlKRETyrdBPsHYFPrTWfmytjQL3AUd16XMU8BebNBOoMsaMKHBcIiIioDwlIiJ5VugCaxSwoNOfF6bauttHRESkEJSnREQkrwpdYBmPtq7rwufSB2PMOcaY2caY2StWrMhLcCIiMujlLU+BcpWIiBS+wFoIjOn059HA4k3og7X2FmvtFGvtlCFDhuQ9UBERGZTylqdAuUpERAq80bAxxg+8D+wHLAJeBU611r7dqc9hwIXAocA04Dpr7a4bGXcF8FkPw6sDVvZwjP5isJyrznNg0XkOPN0513HW2oJXKIXKU6mvU67Knc5z4Bks56rzHFi6e56eucqfv3gyWWvjxpgLgacBB/iTtfZtY8y5qc9vBp4gmbQ+BJqBM3MYt8dJ1xgz22vn5YFosJyrznNg0XkOPH3xXAuVp1Jfq1yVI53nwDNYzlXnObDk6zwLWmABWGufIJmcOrfd3On3Frig0HGIiIh4UZ4SEZF8KvhGwyIiIiIiIoPFYC6wbil2AL1osJyrznNg0XkOPIPpXPNlsHzPdJ4Dz2A5V53nwJKX8yzoIhciIiIiIiKDyWB+giUiIiIiIpJXKrBERERERETyZNAVWMaYPxljlhtj3ip2LIVkjBljjHnWGDPfGPO2MebiYsdUCMaYsDFmljFmbuo8f1rsmArJGOMYY143xjxe7FgKyRjzqTFmnjHmDWPM7GLHUyjGmCpjzIPGmHdT/1Z3K3ZM+WaM2Tr199j+q94Yc0mx4+rrlKsGFuWqgWew5ClQrtqk8QbbO1jGmL2BRuAv1todih1PoRhjRgAjrLVzjDHlwGvA0dbad4ocWl4ZYwxQaq1tNMYEgBeBi621M4scWkEYYy4DpgAV1trDix1PoRhjPgWmWGsH9KaGxpg7gRestbcZY4JAibV2bZ
HDKhhjjENyM99p1tqebsA7oClXKVf1Z4MhVw2WPAXKVZsyxqB7gmWtfR5YXew4Cs1au8RaOyf1+wZgPjCquFHln01qTP0xkPo1IO8aGGNGA4cBtxU7Fuk5Y0wFsDdwO4C1NjqQE1bKfsBHKq42TrlqYFGukv5KuWrTDLoCazAyxowHdgFeKXIoBZGaivAGsBx4xlo7IM8TuAb4NuAWOY7eYIF/GWNeM8acU+xgCmRzYAXw59RUmtuMMaXFDqrATgbuLXYQ0jcpVw0Y1zA4ctVgyFOgXLVJVGANcMaYMuAh4BJrbX2x4ykEa23CWrszMBrY1Rgz4KbTGGMOB5Zba18rdiy9ZA9r7STgEOCC1HSpgcYPTAJustbuAjQB3y1uSIWTmlZyJPC3YscifY9y1cAwyHLVYMhToFy1SVRgDWCped4PAXdba/9e7Hj+v727C9l7juM4/v4wbEONLJHWPISQsBHmwOOipEhqIXMgRUrynDwcKFJOyGNEeYplirS1EkVbHtaYtZ0QbXmWsgNN9HVw/ZarO9vtWtfVf/d1v18n1//+/Z++98Hd5/7+/7///xq1dsv6feDCbisZiUXAJW3O92vAuUle6rak0amq79rnT8By4LRuKxqJLcCWvqvYy+iF2Li6CFhbVT92XYh2L2bVWJk2WTVNcgrMql1igzWm2gO1zwEbq+rRrusZlSRzk8xpy7OA84FNnRY1AlV1V1UdVlXz6d26fq+qruq4rJFIsm972J02DWExMHZvUquqH4DNSY5pQ+cBY/Vg/wRLcHqgJjCrxst0yarpklNgVu2qGUMoZEpJ8ipwNnBQki3AfVX1XLdVjcQi4GpgfZvzDXB3Vb3bXUkjcQjwYnvjyx7A61U1tq+FnSYOBpb3/u9iBvBKVa3otqSRuQl4uU1J+Bq4tuN6RiLJbOAC4Pqua5kqzCqzSru16ZRTYFYNfqzp9pp2SZIkSRoVpwhKkiRJ0pDYYEmSJEnSkNhgSZIkSdKQ2GBJkiRJ0pDYYEmSJEnSkNhgSZIkSdKQ2GBJI5RkaZJD/8d2LyS5fCfr30+ycMi1zUlyQ9/PZyfxe1kkaRoxp6Ths8GSRmspMGlwdWQOcMNkG0mSxtpSzClpqGywpAEkmZ9kU5IXk3yRZFmS2UkWJPkgyWdJViY5pF3pW0jv28/XJZmV5N4knyT5MskzaV8DP2ANi5OsTrI2yRtJ9mvj3yR5oI2vT3JsG5+bZFUbfzrJt0kOAh4Cjmy1PdIOv1/7nTYleXlX6pMkdceckrpngyUN7hjgmao6EfgduBF4DLi8qhYAzwMPVtUy4FPgyqo6qar+AB6vqlOr6gRgFnDxICdugXMPcH5VndKOf0vfJr+08SeBW9vYfcB7bXw5MK+N3wl81Wq7rY2dDNwMHAccASwapD5J0m7BnJI6NKPrAqQpaHNVfdSWXwLuBk4AVrULaXsC3+9g33OS3A7MBg4ENgBvD3Du0+mFykftXHsDq/vWv9k+PwMua8tnAZcCVNWKJL/t5PgfV9UWgCTrgPnAhwPUJ0nqnjkldcgGSxpcTfh5K7Chqs7Y2U5JZgJPAAuranOS+4GZA547wKqqWrKD9dva59/8+/c9yPSJbX3L/ceQJE0d5pTUIacISoObl2R7SC0B1gBzt48l2SvJ8W39VmD/trw9pH5p89F3+DamnVgDLEpyVDvX7CRHT7LPh8AVbfvFwAH/UZskaXyYU1KHbLCkwW0ErknyBb3pE4/RC6GHk3wOrAPObNu+ADzVpjFsA54F1gNvAZ8MeuKq+pneG59ebedfAxw7yW4PAIuTrAUuojctZGtV/UpvCseXfQ8PS5KmPnNK6lCqJt5FlrQjSeYD77SHf6eEJPsAf1fVX+3q5ZNVdVLHZUmSRsCckrrnvFVp/M0DXk+yB/AncF3H9UiS1M+c0ljxDpa0G0myHDh8wvAdVbWyi3okSepnTkmTs8GSJEmSpCHxJReSJEmSNCQ2WJIkSZI0JDZYki
RJkjQkNliSJEmSNCT/ANuTvVl5/HO8AAAAAElFTkSuQmCC\n",
"text/plain": [
"
"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"fig = plt.figure(figsize=(12, 5))\n",
"\n",
"plt.subplot(121)\n",
"df_trans.scatter(df_trans.petal_length, df_trans.petal_width, c_expr=df_trans.class_)\n",
"plt.title('Original classes')\n",
"\n",
"plt.subplot(122)\n",
"df_trans.scatter(df_trans.petal_length, df_trans.petal_width, c_expr=df_trans.predicted_kmean_map)\n",
"plt.title('Predicted classes')\n",
"\n",
"plt.tight_layout()\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As with any algorithm implemented in `vaex.ml`, K-Means can be used on billions of samples. Fitting takes **under 2 minutes** when applied on the oversampled Iris dataset, numbering over **1 billion** samples."
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"ExecuteTime": {
"end_time": "2020-07-14T15:58:58.284463Z",
"start_time": "2020-07-14T15:58:58.280028Z"
},
"tags": [
"skip-ci"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Number of samples in DataFrame: 1,005,000,000\n"
]
}
],
"source": [
"df = vaex.datasets.iris_1e9()\n",
"n_samples = len(df)\n",
"print(f'Number of samples in DataFrame: {n_samples:,}')"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"ExecuteTime": {
"end_time": "2020-07-14T15:59:20.061389Z",
"start_time": "2020-07-14T15:58:58.855735Z"
},
"tags": [
"skip-ci"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Iteration 0, inertia 838974000.0037192\n",
"Iteration 1, inertia 535903134.000306\n",
"Iteration 2, inertia 530190921.4848897\n",
"Iteration 3, inertia 528931941.03372437\n",
"Iteration 4, inertia 528931941.0337243\n",
"CPU times: user 2min 37s, sys: 1.26 s, total: 2min 39s\n",
"Wall time: 19.9 s\n"
]
}
],
"source": [
"%%time\n",
"\n",
"features = ['petal_length', 'petal_width', 'sepal_length', 'sepal_width']\n",
"kmeans = vaex.ml.cluster.KMeans(features=features, n_clusters=3, max_iter=100, verbose=True, random_state=31)\n",
"kmeans.fit(df)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Supervised learning\n",
"\n",
"While `vaex.ml` does not yet implement any supervised machine learning models, it does provide wrappers to several popular libraries such as [scikit-learn](https://scikit-learn.org/), [XGBoost](https://xgboost.readthedocs.io/), [LightGBM](https://lightgbm.readthedocs.io/) and [CatBoost](https://catboost.ai/). \n",
"\n",
"The main benefit of these wrappers is that they turn the models into `vaex.ml` transformers. This means the models become part of the DataFrame _state_ and thus can be serialized, and their predictions can be returned as _virtual columns_. This is especially useful for creating various diagnostic plots and evaluating performance metrics at no memory cost, as well as building ensembles. \n",
"\n",
"### `Scikit-Learn` example\n",
"\n",
"The `vaex.ml.sklearn` module provides convenient wrappers to the `scikit-learn` estimators. In fact, these wrappers can be used with any library that follows the API convention established by `scikit-learn`, i.e. implements the `.fit` and `.transform` methods.\n",
"\n",
"Here is an example:"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"ExecuteTime": {
"end_time": "2020-07-14T15:59:30.707188Z",
"start_time": "2020-07-14T15:59:30.385719Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
#
sepal_length
sepal_width
petal_length
petal_width
class_
prediction
\n",
"\n",
"\n",
"
0
5.9
3.0
4.2
1.5
1
1
\n",
"
1
6.1
3.0
4.6
1.4
1
1
\n",
"
2
6.6
2.9
4.6
1.3
1
1
\n",
"
3
6.7
3.3
5.7
2.1
2
2
\n",
"
4
5.5
4.2
1.4
0.2
0
0
\n",
"
...
...
...
...
...
...
...
\n",
"
145
5.2
3.4
1.4
0.2
0
0
\n",
"
146
5.1
3.8
1.6
0.2
0
0
\n",
"
147
5.8
2.6
4.0
1.2
1
1
\n",
"
148
5.7
3.8
1.7
0.3
0
0
\n",
"
149
6.2
2.9
4.3
1.3
1
1
\n",
"\n",
"
"
],
"text/plain": [
"# sepal_length sepal_width petal_length petal_width class_ prediction\n",
"0 5.9 3.0 4.2 1.5 1 1\n",
"1 6.1 3.0 4.6 1.4 1 1\n",
"2 6.6 2.9 4.6 1.3 1 1\n",
"3 6.7 3.3 5.7 2.1 2 2\n",
"4 5.5 4.2 1.4 0.2 0 0\n",
"... ... ... ... ... ... ...\n",
"145 5.2 3.4 1.4 0.2 0 0\n",
"146 5.1 3.8 1.6 0.2 0 0\n",
"147 5.8 2.6 4.0 1.2 1 1\n",
"148 5.7 3.8 1.7 0.3 0 0\n",
"149 6.2 2.9 4.3 1.3 1 1"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from vaex.ml.sklearn import Predictor\n",
"from sklearn.ensemble import GradientBoostingClassifier\n",
"\n",
"df = vaex.datasets.iris()\n",
"\n",
"features = ['petal_length', 'petal_width', 'sepal_length', 'sepal_width']\n",
"target = 'class_'\n",
"\n",
"model = GradientBoostingClassifier(random_state=42)\n",
"vaex_model = Predictor(features=features, target=target, model=model, prediction_name='prediction')\n",
"\n",
"vaex_model.fit(df=df)\n",
"\n",
"df = vaex_model.transform(df)\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"One can still train a predictive model on datasets that are too big to fit into memory by leveraging the on-line learners provided by `scikit-learn`. The `vaex.ml.sklearn.IncrementalPredictor` conveniently wraps these learners and provides control on how the data is passed to them from a `vaex` DataFrame. \n",
"\n",
"Let us train a model on the oversampled Iris dataset which comprises over 1 billion samples."
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"ExecuteTime": {
"end_time": "2020-07-14T16:08:08.898670Z",
"start_time": "2020-07-14T15:59:33.194286Z"
},
"tags": [
"skip-ci"
]
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "88b195fa8e9b4086a999e5da0b53a6a6",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"HBox(children=(FloatProgress(value=0.0, max=1.0), Label(value='In progress...')))"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"
\n",
"\n",
"
#
sepal_length
sepal_width
petal_length
petal_width
class_
prediction
\n",
"\n",
"\n",
"
0
5.9
3.0
4.2
1.5
1
1
\n",
"
1
6.1
3.0
4.6
1.4
1
1
\n",
"
2
6.6
2.9
4.6
1.3
1
1
\n",
"
3
6.7
3.3
5.7
2.1
2
2
\n",
"
4
5.5
4.2
1.4
0.2
0
0
\n",
"
...
...
...
...
...
...
...
\n",
"
1,004,999,995
5.2
3.4
1.4
0.2
0
0
\n",
"
1,004,999,996
5.1
3.8
1.6
0.2
0
0
\n",
"
1,004,999,997
5.8
2.6
4.0
1.2
1
1
\n",
"
1,004,999,998
5.7
3.8
1.7
0.3
0
0
\n",
"
1,004,999,999
6.2
2.9
4.3
1.3
1
1
\n",
"\n",
"
"
],
"text/plain": [
"# sepal_length sepal_width petal_length petal_width class_ prediction\n",
"0 5.9 3.0 4.2 1.5 1 1\n",
"1 6.1 3.0 4.6 1.4 1 1\n",
"2 6.6 2.9 4.6 1.3 1 1\n",
"3 6.7 3.3 5.7 2.1 2 2\n",
"4 5.5 4.2 1.4 0.2 0 0\n",
"... ... ... ... ... ... ...\n",
"1,004,999,995 5.2 3.4 1.4 0.2 0 0\n",
"1,004,999,996 5.1 3.8 1.6 0.2 0 0\n",
"1,004,999,997 5.8 2.6 4.0 1.2 1 1\n",
"1,004,999,998 5.7 3.8 1.7 0.3 0 0\n",
"1,004,999,999 6.2 2.9 4.3 1.3 1 1"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from vaex.ml.sklearn import IncrementalPredictor\n",
"from sklearn.linear_model import SGDClassifier\n",
"\n",
"df = vaex.datasets.iris_1e9()\n",
"\n",
"features = ['petal_length', 'petal_width', 'sepal_length', 'sepal_width']\n",
"target = 'class_'\n",
"\n",
"model = SGDClassifier(learning_rate='constant', eta0=0.0001, random_state=42)\n",
"vaex_model = IncrementalPredictor(features=features, target=target, model=model, \n",
" batch_size=500_000, partial_fit_kwargs={'classes':[0, 1, 2]})\n",
"\n",
"vaex_model.fit(df=df, progress='widget')\n",
"\n",
"df = vaex_model.transform(df)\n",
"df"
]
},
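{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the batching idea behind the incremental example above more concrete, here is a minimal pure-Python sketch. It is not how `vaex.ml` or `scikit-learn` are implemented: `iter_batches` and `OnlineCentroids` are hypothetical names, the learner is a toy nearest-centroid classifier, and the rows are made-up values loosely inspired by Iris. The point it illustrates is that an on-line learner only ever sees one batch at a time through a `partial_fit`-style method.\n",
"\n",
"```python\n",
"def iter_batches(rows, batch_size):\n",
"    # Yield consecutive chunks so only one chunk is processed at a time.\n",
"    for start in range(0, len(rows), batch_size):\n",
"        yield rows[start:start + batch_size]\n",
"\n",
"\n",
"class OnlineCentroids:\n",
"    # Toy on-line learner (hypothetical): a running sum and count per class.\n",
"    def __init__(self):\n",
"        self.sums = {}\n",
"        self.counts = {}\n",
"\n",
"    def partial_fit(self, X, y):\n",
"        # Update the running statistics using this batch only.\n",
"        for x, label in zip(X, y):\n",
"            if label not in self.sums:\n",
"                self.sums[label] = [0.0] * len(x)\n",
"                self.counts[label] = 0\n",
"            self.sums[label] = [s + v for s, v in zip(self.sums[label], x)]\n",
"            self.counts[label] += 1\n",
"        return self\n",
"\n",
"    def centroid(self, label):\n",
"        n = self.counts[label]\n",
"        return [s / n for s in self.sums[label]]\n",
"\n",
"    def predict(self, X):\n",
"        # Assign each sample to the class with the nearest centroid.\n",
"        def sq_dist(a, b):\n",
"            return sum((ai - bi) ** 2 for ai, bi in zip(a, b))\n",
"        return [min(self.counts, key=lambda c: sq_dist(x, self.centroid(c))) for x in X]\n",
"\n",
"\n",
"# Made-up rows: (petal_length, petal_width, class_).\n",
"rows = [(1.4, 0.2, 0), (1.6, 0.2, 0), (1.7, 0.3, 0),\n",
"        (4.2, 1.5, 1), (4.6, 1.4, 1), (4.0, 1.2, 1),\n",
"        (5.7, 2.1, 2), (6.0, 2.5, 2), (5.5, 1.8, 2)]\n",
"\n",
"model = OnlineCentroids()\n",
"for batch in iter_batches(rows, batch_size=4):\n",
"    X = [row[:2] for row in batch]\n",
"    y = [row[2] for row in batch]\n",
"    model.partial_fit(X, y)\n",
"\n",
"predictions = model.predict([(1.5, 0.25), (4.3, 1.3), (5.9, 2.2)])\n",
"print(predictions)\n",
"```\n",
"\n",
"Because the learner stores only per-class sums and counts, its memory use does not grow with the number of rows, which is the property that makes batch-wise training practical for data that does not fit in memory."
]
},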
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `XGBoost` example\n",
"\n",
"Libraries such as `XGBoost` provide more options such as validation during training and early stopping for example. We provide wrappers that keeps close to the native API of these libraries, in addition to the `scikit-learn` API. \n",
"\n",
"While the following example showcases the `XGBoost` wrapper, `vaex.ml` implements similar wrappers for `LightGBM` and `CatBoost`."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"ExecuteTime": {
"end_time": "2020-07-14T16:08:44.463784Z",
"start_time": "2020-07-14T16:08:43.893355Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[13:41:31] WARNING: /home/conda/feedstock_root/build_artifacts/xgboost_1607604574104/work/src/learner.cc:1061: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'multi:softmax' was changed from 'merror' to 'mlogloss'. Explicitly set eval_metric if you'd like to restore the old behavior.\n"
]
},
{
"data": {
"text/html": [
"
\n",
"\n",
"
#
sepal_length
sepal_width
petal_length
petal_width
class_
xgboost_prediction
\n",
"\n",
"\n",
"
0
5.9
3.0
4.2
1.5
1
1.0
\n",
"
1
6.1
3.0
4.6
1.4
1
1.0
\n",
"
2
6.6
2.9
4.6
1.3
1
1.0
\n",
"
3
6.7
3.3
5.7
2.1
2
2.0
\n",
"
4
5.5
4.2
1.4
0.2
0
0.0
\n",
"
...
...
...
...
...
...
...
\n",
"
80,395
5.2
3.4
1.4
0.2
0
0.0
\n",
"
80,396
5.1
3.8
1.6
0.2
0
0.0
\n",
"
80,397
5.8
2.6
4.0
1.2
1
1.0
\n",
"
80,398
5.7
3.8
1.7
0.3
0
0.0
\n",
"
80,399
6.2
2.9
4.3
1.3
1
1.0
\n",
"\n",
"
"
],
"text/plain": [
"# sepal_length sepal_width petal_length petal_width class_ xgboost_prediction\n",
"0 5.9 3.0 4.2 1.5 1 1.0\n",
"1 6.1 3.0 4.6 1.4 1 1.0\n",
"2 6.6 2.9 4.6 1.3 1 1.0\n",
"3 6.7 3.3 5.7 2.1 2 2.0\n",
"4 5.5 4.2 1.4 0.2 0 0.0\n",
"... ... ... ... ... ... ...\n",
"80,395 5.2 3.4 1.4 0.2 0 0.0\n",
"80,396 5.1 3.8 1.6 0.2 0 0.0\n",
"80,397 5.8 2.6 4.0 1.2 1 1.0\n",
"80,398 5.7 3.8 1.7 0.3 0 0.0\n",
"80,399 6.2 2.9 4.3 1.3 1 1.0"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from vaex.ml.xgboost import XGBoostModel\n",
"\n",
"df = vaex.datasets.iris_1e5()\n",
"df_train, df_test = df.ml.train_test_split(test_size=0.2, verbose=False)\n",
"\n",
"features = ['petal_length', 'petal_width', 'sepal_length', 'sepal_width']\n",
"target = 'class_'\n",
"\n",
"params = {'learning_rate': 0.1,\n",
" 'max_depth': 3, \n",
" 'num_class': 3, \n",
" 'objective': 'multi:softmax',\n",
" 'subsample': 1,\n",
" 'random_state': 42,\n",
" 'n_jobs': -1}\n",
"\n",
"\n",
"booster = XGBoostModel(features=features, target=target, num_boost_round=500, params=params)\n",
"booster.fit(df=df_train, evals=[(df_train, 'train'), (df_test, 'test')], early_stopping_rounds=5)\n",
"\n",
"df_test = booster.transform(df_train)\n",
"df_test"
]
},
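{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `early_stopping_rounds=5` argument in the XGBoost example above asks training to stop once the validation metric has not improved for 5 rounds. The sketch below illustrates that rule in plain Python, under the assumption that lower is better and that only strict improvements count. The function name `run_with_early_stopping` and the loss values are made up for illustration; they are not part of `vaex.ml` or `XGBoost`.\n",
"\n",
"```python\n",
"def run_with_early_stopping(val_losses, early_stopping_rounds):\n",
"    # Return (best_round, last_round_run) for a sequence of validation losses.\n",
"    best_loss = float('inf')\n",
"    best_round = -1\n",
"    last_round = -1\n",
"    for round_index, loss in enumerate(val_losses):\n",
"        last_round = round_index\n",
"        if loss < best_loss:\n",
"            # Strict improvement: remember this round as the best so far.\n",
"            best_loss = loss\n",
"            best_round = round_index\n",
"        elif round_index - best_round >= early_stopping_rounds:\n",
"            # No improvement for `early_stopping_rounds` rounds: stop.\n",
"            break\n",
"    return best_round, last_round\n",
"\n",
"\n",
"# Made-up validation losses; the late drop to 0.50 is never reached.\n",
"val_losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.60, 0.65, 0.63, 0.50]\n",
"best_round, last_round = run_with_early_stopping(val_losses, early_stopping_rounds=3)\n",
"print(best_round, last_round)\n",
"```\n",
"\n",
"With these values the loop stops at round 5 and reports round 2 as the best one. The trade-off is visible in the toy data: a small patience saves training time but can miss a later improvement."
]
},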
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `CatBoost` example\n",
"\n",
"The CatBoost library supports summing up models. With this feature, we can use CatBoost to train a model using data that is otherwise too large to fit in memory. The idea is to train a single CatBoost model per chunk of data, and than sum up the invidiual models to create a master model. To use this feature via `vaex.ml` just specify the `batch_size` argument in the `CatBoostModel` wrapper. One can also specify additional options such as the strategy on how to sum up the individual models, or how they should be weighted."
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"ExecuteTime": {
"end_time": "2020-07-14T16:09:54.623370Z",
"start_time": "2020-07-14T16:08:46.494467Z"
},
"tags": [
"skip-ci"
]
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "bce3f89da0d24245969e3416310865f2",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"HBox(children=(FloatProgress(value=0.0, max=1.0), Label(value='In progress...')))"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"
\n",
"\n",
"
#
sepal_length
sepal_width
petal_length
petal_width
class_
catboost_prediction
\n",
"\n",
"\n",
"
0
5.9
3.0
4.2
1.5
1
array([1])
\n",
"
1
6.1
3.0
4.6
1.4
1
array([1])
\n",
"
2
6.6
2.9
4.6
1.3
1
array([1])
\n",
"
3
6.7
3.3
5.7
2.1
2
array([2])
\n",
"
4
5.5
4.2
1.4
0.2
0
array([0])
\n",
"
...
...
...
...
...
...
...
\n",
"
80,399,995
5.2
3.4
1.4
0.2
0
array([0])
\n",
"
80,399,996
5.1
3.8
1.6
0.2
0
array([0])
\n",
"
80,399,997
5.8
2.6
4.0
1.2
1
array([1])
\n",
"
80,399,998
5.7
3.8
1.7
0.3
0
array([0])
\n",
"
80,399,999
6.2
2.9
4.3
1.3
1
array([1])
\n",
"\n",
"
"
],
"text/plain": [
"# sepal_length sepal_width petal_length petal_width class_ catboost_prediction\n",
"0 5.9 3.0 4.2 1.5 1 array([1])\n",
"1 6.1 3.0 4.6 1.4 1 array([1])\n",
"2 6.6 2.9 4.6 1.3 1 array([1])\n",
"3 6.7 3.3 5.7 2.1 2 array([2])\n",
"4 5.5 4.2 1.4 0.2 0 array([0])\n",
"... ... ... ... ... ... ...\n",
"80,399,995 5.2 3.4 1.4 0.2 0 array([0])\n",
"80,399,996 5.1 3.8 1.6 0.2 0 array([0])\n",
"80,399,997 5.8 2.6 4.0 1.2 1 array([1])\n",
"80,399,998 5.7 3.8 1.7 0.3 0 array([0])\n",
"80,399,999 6.2 2.9 4.3 1.3 1 array([1])"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from vaex.ml.catboost import CatBoostModel\n",
"\n",
"df = vaex.datasets.iris_1e8()\n",
"df_train, df_test = df.ml.train_test_split(test_size=0.2, verbose=False)\n",
"\n",
"features = ['petal_length', 'petal_width', 'sepal_length', 'sepal_width']\n",
"target = 'class_'\n",
"\n",
"params = {\n",
" 'leaf_estimation_method': 'Gradient',\n",
" 'learning_rate': 0.1,\n",
" 'max_depth': 3,\n",
" 'bootstrap_type': 'Bernoulli',\n",
" 'subsample': 0.8,\n",
" 'sampling_frequency': 'PerTree',\n",
" 'colsample_bylevel': 0.8,\n",
" 'reg_lambda': 1,\n",
" 'objective': 'MultiClass',\n",
" 'eval_metric': 'MultiClass',\n",
" 'random_state': 42,\n",
" 'verbose': 0,\n",
"}\n",
"\n",
"booster = CatBoostModel(features=features, target=target, num_boost_round=23, \n",
" params=params, prediction_type='Class', batch_size=11_000_000)\n",
"booster.fit(df=df_train, progress='widget')\n",
"\n",
"df_test = booster.transform(df_train)\n",
"df_test"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `Keras` example\n",
"\n",
"`Keras` is the most popular high-level API to building neural network models with tensorflow as its backend. Neural networks can have very diverse and complicated architectures, and their training loops can be both simple and sophisticated. This is why, at least for now, we leave the users to train their `keras` models as they normaly would, and in `vaex-ml` provides a simple wrapper for serialization and lazy evaluation of those models. In addition, `vaex-ml` also provides a convenience method to turn a DataFrame into a generator, suitable for training of `Keras` models. See the example below."
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2021-08-14 23:47:55.800260: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory\n",
"2021-08-14 23:47:55.800282: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Recommended \"steps_per_epoch\" arg: 516.0\n",
"Recommended \"steps_per_epoch\" arg: 65.0\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"2021-08-14 23:47:57.111408: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero\n",
"2021-08-14 23:47:57.111910: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory\n",
"2021-08-14 23:47:57.111974: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcublas.so.11'; dlerror: libcublas.so.11: cannot open shared object file: No such file or directory\n",
"2021-08-14 23:47:57.112032: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcublasLt.so.11'; dlerror: libcublasLt.so.11: cannot open shared object file: No such file or directory\n",
"2021-08-14 23:47:57.112093: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcufft.so.10'; dlerror: libcufft.so.10: cannot open shared object file: No such file or directory\n",
"2021-08-14 23:47:57.112150: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcurand.so.10'; dlerror: libcurand.so.10: cannot open shared object file: No such file or directory\n",
"2021-08-14 23:47:57.112206: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusolver.so.11'; dlerror: libcusolver.so.11: cannot open shared object file: No such file or directory\n",
"2021-08-14 23:47:57.112261: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusparse.so.11'; dlerror: libcusparse.so.11: cannot open shared object file: No such file or directory\n",
"2021-08-14 23:47:57.112317: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory\n",
"2021-08-14 23:47:57.112327: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1835] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.\n",
"Skipping registering GPU devices...\n",
"2021-08-14 23:47:57.112682: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA\n",
"To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Epoch 1/11\n",
" 11/516 [..............................] - ETA: 2s - loss: 1.7922 "
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"2021-08-14 23:47:57.326751: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"516/516 [==============================] - 3s 6ms/step - loss: 0.2172 - val_loss: 0.1724\n",
"Epoch 2/11\n",
"516/516 [==============================] - 3s 6ms/step - loss: 0.1736 - val_loss: 0.1715\n",
"Epoch 3/11\n",
"516/516 [==============================] - 3s 6ms/step - loss: 0.1729 - val_loss: 0.1705\n",
"Epoch 4/11\n",
"516/516 [==============================] - 3s 6ms/step - loss: 0.1725 - val_loss: 0.1707\n",
"Epoch 5/11\n",
"516/516 [==============================] - 3s 6ms/step - loss: 0.1722 - val_loss: 0.1708\n",
"Epoch 6/11\n",
"516/516 [==============================] - 3s 6ms/step - loss: 0.1720 - val_loss: 0.1701\n",
"Epoch 7/11\n",
"516/516 [==============================] - 3s 6ms/step - loss: 0.1718 - val_loss: 0.1697\n",
"Epoch 8/11\n",
"516/516 [==============================] - 3s 6ms/step - loss: 0.1717 - val_loss: 0.1706\n",
"Epoch 9/11\n",
"516/516 [==============================] - 3s 6ms/step - loss: 0.1715 - val_loss: 0.1698\n",
"Epoch 10/11\n",
"516/516 [==============================] - 3s 6ms/step - loss: 0.1714 - val_loss: 0.1702\n",
"Epoch 11/11\n",
"516/516 [==============================] - 3s 6ms/step - loss: 0.1713 - val_loss: 0.1701\n",
"INFO:tensorflow:Assets written to: /tmp/tmp14gsptzz/assets\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"2021-08-14 23:48:31.519641: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.\n"
]
},
{
"data": {
"text/html": [
"
\n",
"\n",
"
#
id
x
y
z
vx
vy
vz
E
L
Lz
FeH
minmax_scaled_x
minmax_scaled_y
minmax_scaled_z
minmax_scaled_vx
minmax_scaled_vy
minmax_scaled_vz
keras_pred
\n",
"\n",
"\n",
"
0
23
0.137403
-5.07974
1.40165
111.828
62.8776
-88.121
-134786
700.236
576.698
-1.7935
0.375163
0.72055
0.397008
0.570648
0.56065
0.414253
array([-1.6143968], dtype=float32)
\n",
"
1
31
-1.95543
-0.840676
1.26239
-259.282
20.8279
-148.457
-134990
676.813
-258.7
-0.623007
0.365132
0.738746
0.395427
0.266912
0.5249
0.357964
array([-1.509573], dtype=float32)
\n",
"
2
22
2.33077
-0.570014
0.761285
-53.4566
-43.377
-71.3196
-177062
196.209
-131.573
-0.889463
0.385676
0.739908
0.389737
0.43537
0.470313
0.429927
array([-1.5752358], dtype=float32)
\n",
"
3
26
0.777881
-2.83258
0.0797214
256.427
202.451
-12.76
-125176
884.581
883.833
-1.65996
0.378233
0.730196
0.381998
0.688994
0.679314
0.484558
array([-1.6558373], dtype=float32)
\n",
"
4
1
3.37429
2.62885
-0.797169
300.697
153.772
83.9173
-97150.4
681.868
-271.616
-1.6496
0.390678
0.753639
0.372041
0.725228
0.637928
0.574749
array([-1.6719546], dtype=float32)
\n",
"\n",
"
"
],
"text/plain": [
" # id x y z vx vy vz E L Lz FeH minmax_scaled_x minmax_scaled_y minmax_scaled_z minmax_scaled_vx minmax_scaled_vy minmax_scaled_vz keras_pred\n",
" 0 23 0.137403 -5.07974 1.40165 111.828 62.8776 -88.121 -134786 700.236 576.698 -1.7935 0.375163 0.72055 0.397008 0.570648 0.56065 0.414253 array([-1.6143968], dtype=float32)\n",
" 1 31 -1.95543 -0.840676 1.26239 -259.282 20.8279 -148.457 -134990 676.813 -258.7 -0.623007 0.365132 0.738746 0.395427 0.266912 0.5249 0.357964 array([-1.509573], dtype=float32)\n",
" 2 22 2.33077 -0.570014 0.761285 -53.4566 -43.377 -71.3196 -177062 196.209 -131.573 -0.889463 0.385676 0.739908 0.389737 0.43537 0.470313 0.429927 array([-1.5752358], dtype=float32)\n",
" 3 26 0.777881 -2.83258 0.0797214 256.427 202.451 -12.76 -125176 884.581 883.833 -1.65996 0.378233 0.730196 0.381998 0.688994 0.679314 0.484558 array([-1.6558373], dtype=float32)\n",
" 4 1 3.37429 2.62885 -0.797169 300.697 153.772 83.9173 -97150.4 681.868 -271.616 -1.6496 0.390678 0.753639 0.372041 0.725228 0.637928 0.574749 array([-1.6719546], dtype=float32)"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import vaex.ml.tensorflow\n",
"import tensorflow.keras as K\n",
"\n",
"df = vaex.example()\n",
"df_train, df_valid, df_test = df.split_random([0.8, 0.1, 0.1], random_state=42)\n",
"\n",
"features = ['x', 'y', 'z', 'vx', 'vy', 'vz']\n",
"target = 'FeH'\n",
"\n",
"# Scaling the features\n",
"df_train = df_train.ml.minmax_scaler(features=features)\n",
"features = df_train.get_column_names(regex='^minmax_')\n",
"\n",
"# Apply preprocessing to the validation\n",
"state_prep = df_train.state_get()\n",
"df_valid.state_set(state_prep)\n",
"\n",
"# Generators for the train and validation sets\n",
"gen_train = df_train.ml.tensorflow.to_keras_generator(features=features, target=target, batch_size=512)\n",
"gen_valid = df_valid.ml.tensorflow.to_keras_generator(features=features, target=target, batch_size=512)\n",
"\n",
"# Create and fit a simple Sequential Keras model\n",
"nn_model = K.Sequential()\n",
"nn_model.add(K.layers.Dense(3, activation='tanh'))\n",
"nn_model.add(K.layers.Dense(1, activation='linear'))\n",
"nn_model.compile(optimizer='sgd', loss='mse')\n",
"nn_model.fit(x=gen_train, validation_data=gen_valid, epochs=11, steps_per_epoch=516, validation_steps=65)\n",
"\n",
"# Serialize the model\n",
"keras_model = vaex.ml.tensorflow.KerasModel(features=features, prediction_name='keras_pred', model=nn_model)\n",
"df_train = keras_model.transform(df_train)\n",
"\n",
"# Apply all the transformations to the test set\n",
"state = df_train.state_get()\n",
"df_test.state_set(state)\n",
"\n",
"# Preview the results\n",
"df_test.head(5)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `River` example\n",
"\n",
"`River` is an up-and-coming library for online learning, and provides a variety of models that can learn incrementally. While most of the `river` models currently support per-sample training, few do support mini-batch training which is extremely fast - a great synergy to do machine learning with vaex."
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"ExecuteTime": {
"end_time": "2021-04-13T11:12:20.713420Z",
"start_time": "2021-04-13T11:12:20.695920Z"
},
"tags": [
"skip-ci"
]
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "385a30c0435042b0a69ec5e8ef3c3a48",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"HBox(children=(FloatProgress(value=0.0, max=1.0), Label(value='In progress...')))"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"
\n",
"\n",
"
#
sepal_length
sepal_width
petal_length
petal_width
class_
prediction_raw
\n",
"\n",
"\n",
"
0
5.9
3.0
4.2
1.5
1
1.2262451850482554
\n",
"
1
6.1
3.0
4.6
1.4
1
1.3372106202149072
\n",
"
2
6.6
2.9
4.6
1.3
1
1.3080263625894342
\n",
"
3
6.7
3.3
5.7
2.1
2
1.8246442870772779
\n",
"
4
5.5
4.2
1.4
0.2
0
-0.1719159051653813
\n",
"
...
...
...
...
...
...
...
\n",
"
200,999,995
5.2
3.4
1.4
0.2
0
-0.06961837848289065
\n",
"
200,999,996
5.1
3.8
1.6
0.2
0
-0.04133966888449841
\n",
"
200,999,997
5.8
2.6
4.0
1.2
1
1.1380612859534056
\n",
"
200,999,998
5.7
3.8
1.7
0.3
0
-0.005633275295105093
\n",
"
200,999,999
6.2
2.9
4.3
1.3
1
1.2171097577656713
\n",
"\n",
"
"
],
"text/plain": [
"# sepal_length sepal_width petal_length petal_width class_ prediction_raw\n",
"0 5.9 3.0 4.2 1.5 1 1.2262451850482554\n",
"1 6.1 3.0 4.6 1.4 1 1.3372106202149072\n",
"2 6.6 2.9 4.6 1.3 1 1.3080263625894342\n",
"3 6.7 3.3 5.7 2.1 2 1.8246442870772779\n",
"4 5.5 4.2 1.4 0.2 0 -0.1719159051653813\n",
"... ... ... ... ... ... ...\n",
"200,999,995 5.2 3.4 1.4 0.2 0 -0.06961837848289065\n",
"200,999,996 5.1 3.8 1.6 0.2 0 -0.04133966888449841\n",
"200,999,997 5.8 2.6 4.0 1.2 1 1.1380612859534056\n",
"200,999,998 5.7 3.8 1.7 0.3 0 -0.005633275295105093\n",
"200,999,999 6.2 2.9 4.3 1.3 1 1.2171097577656713"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from vaex.ml.incubator.river import RiverModel\n",
"from river.linear_model import LinearRegression\n",
"from river import optim\n",
"\n",
"\n",
"df = vaex.datasets.iris_1e9()\n",
"df_train, df_test = df.ml.train_test_split(test_size=0.2, verbose=False)\n",
"\n",
"features = ['petal_length', 'petal_width', 'sepal_length', 'sepal_width']\n",
"target = 'class_'\n",
"\n",
"river_model = RiverModel(features=features,\n",
" target=target,\n",
" model=LinearRegression(optimizer=optim.SGD(0.001), intercept_lr=0.001),\n",
" prediction_name='prediction_raw',\n",
" batch_size=500_000)\n",
"river_model.fit(df_train, progress='widget')\n",
"river_model.transform(df_test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Metrics\n",
"\n",
"`vaex-ml` also provides several of the most common evaluation metrics for classification and regression tasks. These metrics are implemented in `vaex-ml` and thus are evaluated out-of-core, so you do not need to materialize the target and predicted columns. \n",
"\n",
"Here is a list of the currently supported metrics:\n",
"\n",
"- Classification (binary, and macro-average for multiclass problems):\n",
" - Accuracy\n",
" - Precision\n",
" - Recall\n",
" - F1-score\n",
" - Confusion matrix\n",
" - Classification report (a convenience method, which prints out the accuracy, precision, recall, and F1-score at the same time)\n",
" - Matthews Correlation Coeficient\n",
"- Regression\n",
" - Mean Absolute Error\n",
" - Mean Squared Error\n",
" - R2 Correlation Score\n",
"\n",
"Here is a simple example:"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
" Classification report:\n",
"\n",
" Accuracy: 0.933\n",
" Precision: 0.928\n",
" Recall: 0.928\n",
" F1: 0.928\n",
" \n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/jovan/vaex/packages/vaex-core/vaex/dataframe.py:5516: UserWarning: It seems your column class_ is already ordinal encoded (values between 0 and 2), automatically switching to use df.categorize\n",
" warnings.warn(f'It seems your column {column} is already ordinal encoded (values between {min_value} and {max_value}), automatically switching to use df.categorize')\n",
"/home/jovan/vaex/packages/vaex-core/vaex/dataframe.py:5516: UserWarning: It seems your column pred is already ordinal encoded (values between 0 and 2), automatically switching to use df.categorize\n",
" warnings.warn(f'It seems your column {column} is already ordinal encoded (values between {min_value} and {max_value}), automatically switching to use df.categorize')\n"
]
}
],
"source": [
"import vaex.ml.metrics\n",
"from sklearn.linear_model import LogisticRegression\n",
"\n",
"df = vaex.datasets.iris()\n",
"df_train, df_test = df.split_random([0.8, 0.2], random_state=55)\n",
"\n",
"features = ['petal_length', 'petal_width', 'sepal_length', 'sepal_width']\n",
"target = 'class_'\n",
"\n",
"model = LogisticRegression(random_state=42)\n",
"vaex_model = Predictor(features=features, target=target, model=model, prediction_name='pred')\n",
"\n",
"vaex_model.fit(df=df_train)\n",
"\n",
"df_test = vaex_model.transform(df_test)\n",
"\n",
"print(df_test.ml.metrics.classification_report(df_test.class_, df_test.pred, average='macro'))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## State transfer - pipelines made easy\n",
"\n",
"Each `vaex` DataFrame consists of two parts: _data_ and _state_. The _data_ is immutable, and any operation such as filtering, adding new columns, or applying transformers or predictive models just modifies the _state_. This is extremely powerful concept and can completely redefine how we imagine machine learning pipelines. \n",
"\n",
"As an example, let us once again create a model based on the Iris dataset. Here, we will create a couple of new features, do a PCA transformation, and finally train a predictive model. "
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"ExecuteTime": {
"end_time": "2020-07-14T16:10:19.919524Z",
"start_time": "2020-07-14T16:10:19.873625Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"