This repository was archived by the owner on Sep 3, 2022. It is now read-only.

Commit 0d595d1

brandondutra authored and qimingj committed

markdown edits to iris sample (#84)

* punctuation
* removed refs to there being cloud notebooks too
1 parent 91dbafd commit 0d595d1

File tree

1 file changed: +9 -13 lines


samples/ML Toolbox/Classification/Iris/1 Local End to End.ipynb

Lines changed: 9 additions & 13 deletions
@@ -13,9 +13,7 @@
     "source": [
     "# About this notebook\n",
     "\n",
-    "This notebook uses the datalab structured data package for building and running a Tensorflow classification model locally. This notebook uses the classic <a href=\"https://en.wikipedia.org/wiki/Iris_flower_data_set\">Iris flower data set.</a>\n",
-    "\n",
-    "In the notebooks that follow, an example of running preprocessing, training, and prediction using the Google Cloud Machine Learning Engine services are given. Note that running the cloud versions of preprocessing, training, and prediction take longer than the local versions. The performance advantage of using the cloud applies to very large data sets, and you don't see it with this sample because the data is small and run time is dominated by setup overhead."
+    "This notebook uses the datalab structured data package for building and running a TensorFlow classification model locally. It uses the classic <a href=\"https://en.wikipedia.org/wiki/Iris_flower_data_set\">Iris flower data set</a>."
     ]
    },
    {
@@ -29,7 +27,7 @@
     "cell_type": "markdown",
     "metadata": {},
     "source": [
-    "Lets look at the versions of datalab_structured_data and TF we have. Make sure TF and SD are 1.0.0"
+    "Let's look at the versions of TensorFlow and the structured data package we have. Make sure TF and SD are 1.0.0"
     ]
    },
    {
@@ -74,7 +72,7 @@
     "cell_type": "markdown",
     "metadata": {},
     "source": [
-    "This notebook will write files during preprocessing, training, and prediction. Please give a root folder you wish to use."
+    "This notebook will write files during preprocessing, training, and prediction into a folder called 'iris_notebook_workspace'. Edit the next cell if you want to write files to a different location."
     ]
    },
    {
@@ -90,8 +88,7 @@
     "# already exists, it will be deleted.\n",
     "LOCAL_ROOT = './iris_notebook_workspace'\n",
     "\n",
-    "# No need to edit anything else in this cell. But if you do, you \n",
-    "# might need to chagne the global variables in the cloud notebooks.\n",
+    "# No need to edit anything else in this cell.\n",
     "LOCAL_PREPROCESSING_DIR = os.path.join(LOCAL_ROOT, 'preprocessing')\n",
     "LOCAL_TRAINING_DIR = os.path.join(LOCAL_ROOT, 'training')\n",
     "LOCAL_BATCH_PREDICTION_DIR = os.path.join(LOCAL_ROOT, 'batch_prediction')\n",
@@ -407,7 +404,6 @@
     " } \n",
     "]\n",
     "\n",
-    "# Write schema to a file so that the cloud notebooks can use it.\n",
     "file_io.write_string_to_file(\n",
     "    LOCAL_SCHEMA_FILE,\n",
     "    json.dumps(schema, indent=2))"
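The hunk above keeps the cell that serializes the column schema to a JSON file with TensorFlow's `file_io`. A minimal standard-library sketch of the same round trip is below; the column names and types are illustrative assumptions, not the notebook's exact Iris schema.

```python
# Sketch of the schema-writing step, using only the standard library in
# place of tensorflow's file_io. Column names/types are illustrative.
import json
import os
import tempfile

schema = [
    {"name": "key", "type": "INTEGER"},
    {"name": "sepal_length", "type": "FLOAT"},
    {"name": "flower", "type": "STRING"},  # target column
]

workspace = tempfile.mkdtemp()
schema_file = os.path.join(workspace, "schema.json")
with open(schema_file, "w") as f:
    f.write(json.dumps(schema, indent=2))

# Read the file back to confirm the JSON round trip.
with open(schema_file) as f:
    loaded = json.load(f)
```

In the notebook itself, `file_io.write_string_to_file` is used so the same call works for both local paths and GCS paths.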
@@ -466,7 +462,7 @@
     "cell_type": "markdown",
     "metadata": {},
     "source": [
-    "The output of preprocessing is a numerical_analysis file that contains analysis from the numerical columns, and a vocab file from each categorical column. The files preoduced by preprocessing are consumed in training, and you should not have to worry about these files. Just for fun, lets look at them."
+    "The output of analyze is a stats file that contains analysis of the numerical columns, and a vocab file for each categorical column. The files produced by analyze are consumed in training, and you should not have to worry about these files. Just for fun, let's look at them."
     ]
    },
    {
@@ -558,7 +554,7 @@
     "cell_type": "markdown",
     "metadata": {},
     "source": [
-    "The files in the output folder of preprocessing are consumed by the trainer. Training requires a transform config file to describe what transforms to apply on the data. The key and target transform are the only required transform, a default transform will be applied to every other column if it is not listed in the transforms."
+    "The files in the output folder of analyze are consumed by the trainer. Training requires a features file to describe what transforms to apply to the data. The key and target transforms are the only required ones; a default transform will be applied to every other column that is not listed in the features dict."
     ]
    },
    {
@@ -576,7 +572,7 @@
     "    \"flower\": {\"transform\": \"target\"}\n",
     "  }\n",
     "\n",
-    "# Write the features to a file so that the cloud notebooks can use the same features.\n",
+    "# Write the features to a file.\n",
     "file_io.write_string_to_file(\n",
     "    LOCAL_FEATURES_FILE,\n",
     "    json.dumps(features, indent=2)\n",
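The markdown above says only the key and target transforms must be listed, and every other column falls back to a default transform. A small sketch of that defaulting behavior is below; the default transform names (`identity` for numeric columns, `one_hot` for string columns) are illustrative assumptions, not the package's documented defaults.

```python
# Sketch of the defaulting rule described above: any schema column missing
# from the features dict receives a default transform. Default names here
# ("identity", "one_hot") are assumed for illustration.
schema = [
    {"name": "key", "type": "INTEGER"},
    {"name": "sepal_length", "type": "FLOAT"},
    {"name": "flower", "type": "STRING"},
]

features = {
    "key": {"transform": "key"},
    "flower": {"transform": "target"},
}

ASSUMED_DEFAULTS = {"INTEGER": "identity", "FLOAT": "identity", "STRING": "one_hot"}

for col in schema:
    # setdefault leaves explicitly listed columns (key, flower) untouched.
    features.setdefault(col["name"], {"transform": ASSUMED_DEFAULTS[col["type"]]})
```

After the loop, `sepal_length` carries the assumed numeric default while the key and target entries are unchanged.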
@@ -861,7 +857,7 @@
     "cell_type": "markdown",
     "metadata": {},
     "source": [
-    "Local batch prediction runs prediction on batched input data. This is ideal if the input dataset is very large or you have limited available main memory. However, for very large datasets, it is better to run batch prediction using the Google Cloud Machine Learning Engine services. Two output formats are supported, csv and json. The output may also be shardded. Another feature of batch prediction is the option to run evaluation--prediction on data that contains the target column. Like local_predict, the input data must batch the schema used for training."
+    "Local batch prediction runs prediction on batched input data. This is ideal if the input dataset is very large, or you have limited available main memory. However, for very large datasets, it is better to run batch prediction using the Google Cloud Machine Learning Engine services. Two output formats are supported: csv and json. The output may also be sharded. Another feature of batch prediction is the option to run evaluation--prediction on data that contains the target column. Like local_predict, the input data must match the schema used for training."
     ]
    },
    {
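The corrected markdown above describes batched, sharded prediction in general terms. A toy sketch of that idea, independent of the datalab package's API, is below: rows are read in fixed-size batches, a stand-in model predicts each batch, and results are spread round-robin across output shards. `predict_batch` is a hypothetical placeholder, not the package's function.

```python
# Toy illustration of batched, sharded prediction. predict_batch is a
# stand-in model (it just echoes the first field of each row), not the
# datalab structured data package's API.
import io

def predict_batch(rows):
    # Stand-in "model": predict the first field of each input row.
    return [row[0] for row in rows]

def batch_predict(reader, batch_size, num_shards):
    shards = [io.StringIO() for _ in range(num_shards)]
    batch, n = [], 0
    for row in reader:
        batch.append(row)
        if len(batch) == batch_size:
            for pred in predict_batch(batch):
                shards[n % num_shards].write(pred + "\n")  # round-robin shard
                n += 1
            batch = []
    for pred in predict_batch(batch):  # flush the final partial batch
        shards[n % num_shards].write(pred + "\n")
        n += 1
    return [s.getvalue() for s in shards]

data = [["5.1", "3.5"], ["4.9", "3.0"], ["6.2", "2.9"]]
out = batch_predict(iter(data), batch_size=2, num_shards=2)
# out is a list of two shard strings
```

Because only `batch_size` rows are held in memory at once, the same loop works for inputs far larger than main memory, which is the point the markdown cell makes.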
@@ -1139,7 +1135,7 @@
     "cell_type": "markdown",
     "metadata": {},
     "source": [
-    "As everything was written to LOCAL_ROOT, we can simply remove this folder. If you want to delete those files, uncomment and run the next cell. If you want to run any Service notebook, don't delete LOCAL_ROOT."
+    "As everything was written to LOCAL_ROOT, we can simply remove this folder. If you want to delete those files, uncomment and run the next cell."
     ]
    },
    {
