This repository was archived by the owner on Dec 29, 2022. It is now read-only.
5 changes: 2 additions & 3 deletions docs/concepts.md
@@ -27,10 +27,9 @@ An encoder reads in "source data", e.g. a sequence of words or an image, and pro

## Decoder

A decoder is a generative model that is conditioned on the representation created by the encoder. For example, a Recurrent Neural Network decoder may learn generate the translation for an encoded sentence in another language. For a list of available decoder, see the [Decoder Reference](decoders/).
A decoder is a generative model that is conditioned on the representation created by the encoder. For example, a Recurrent Neural Network decoder may learn to generate the translation for an encoded sentence in another language. For a list of available decoder, see the [Decoder Reference](decoders/).


## Model

A model defines how to put together an encoder and decoder, and how to calculate and minize the loss functions. It also handles the necessary preprocessing of data read from an input pipeline. Under the hood, each model is implemented as a [model_fn passed to a tf.contrib.learn Estimator](https://www.tensorflow.org/api_docs/python/tf/contrib/learn/Estimator). For a list of available models, see the [Models Reference](models/).

A model defines how to put together an encoder and decoder, and how to calculate and minimize the loss functions. It also handles the necessary preprocessing of data read from an input pipeline. Under the hood, each model is implemented as a [model_fn passed to a tf.contrib.learn Estimator](https://www.tensorflow.org/api_docs/python/tf/contrib/learn/Estimator). For a list of available models, see the [Models Reference](models/).
4 changes: 1 addition & 3 deletions docs/encoders.md
@@ -39,7 +39,7 @@ An encoder that pools over embeddings, as described in [https://arxiv.org/abs/16
| --- | --- | --- |
| `pooling_fn` | `tensorflow.layers.average_pooling1d` | The 1-d pooling function to use, e.g. `tensorflow.layers.average_pooling1d`. |
| `pool_size` | `5` | The pooling window, passed as `pool_size` to the pooling function. |
| `strides` | `1` | The stride during pooling, passed as `strides` the pooling function. |
| `strides` | `1` | The stride during pooling, passed as `strides` to the pooling function. |
| `position_embeddings.enable` | `True` | If true, add position embeddings to the inputs before pooling. |
| `position_embeddings.combiner_fn` | `tensorflow.add` | Function used to combine the position embeddings with the inputs. For example, `tensorflow.add`. |
| `position_embeddings.num_positions` | `100` | Size of the position embedding matrix. This should be set to the maximum sequence length of the inputs. |
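
As a rough illustration of how the parameters in this table interact, the following NumPy sketch adds position embeddings to the inputs and then average-pools over the time axis. It is not the library's implementation: the helper names `average_pool_1d` and `pooling_encoder_sketch`, the "valid" padding behaviour, and the random stand-in data are all assumptions made for the example.

```python
import numpy as np


def average_pool_1d(inputs, pool_size=5, strides=1):
    """Average-pool a [time, depth] array over the time axis.

    Assumption: "valid" padding, i.e. only full windows are pooled.
    """
    time_steps = inputs.shape[0]
    windows = [
        inputs[start:start + pool_size].mean(axis=0)
        for start in range(0, time_steps - pool_size + 1, strides)
    ]
    return np.stack(windows)


def pooling_encoder_sketch(embedded, position_embeddings, pool_size=5, strides=1):
    """Hypothetical stand-in for a pooling encoder with position embeddings."""
    seq_len = embedded.shape[0]
    # Mirrors `position_embeddings.combiner_fn` being an element-wise add.
    combined = embedded + position_embeddings[:seq_len]
    return average_pool_1d(combined, pool_size=pool_size, strides=strides)


rng = np.random.default_rng(0)
embedded = rng.normal(size=(12, 8))               # 12 tokens, embedding depth 8
position_embeddings = rng.normal(size=(100, 8))   # num_positions = 100
outputs = pooling_encoder_sketch(embedded, position_embeddings)
print(outputs.shape)
```

The slice `position_embeddings[:seq_len]` is why `position_embeddings.num_positions` should be at least the maximum input sequence length: a longer input would have no embedding rows to draw from.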
@@ -56,5 +56,3 @@ hidden layer before the logits as the feature representation.
| --- | --- | --- |
| `resize_height` | `299` | Resize the image to this height before feeding it into the convolutional network. |
| `resize_width` | `299` | Resize the image to this width before feeding it into the convolutional network. |


4 changes: 2 additions & 2 deletions docs/index.md
@@ -12,9 +12,9 @@ We built tf-seq2seq with the following goals in mind:

- **Usability**: You can train a model with a single command. Several types of input data are supported, including standard raw text.

- **Reproducibility**: Training pipelines and models are configured using YAML files. This allows other to run your exact same model configurations.
- **Reproducibility**: Training pipelines and models are configured using YAML files. This allows others to run your exact same model configurations.

- **Extensibility**: Code is structured in a modular way and that easy to build upon. For example, adding a new type of attention mechanism or encoder architecture requires only minimal code changes.
- **Extensibility**: Code is structured in a modular way and that's easy to build upon. For example, adding a new type of attention mechanism or encoder architecture requires only minimal code changes.

- **Documentation**: All code is documented using standard Python docstrings, and we have written guides to help you get started with common tasks.

2 changes: 1 addition & 1 deletion docs/inference.md
@@ -82,7 +82,7 @@ python -m bin.infer \
...
```

By default, this script generates an `attention_score.npy` array file and one attention plot per example. The array file can be [loaded used numpy](https://docs.scipy.org/doc/numpy/reference/generated/numpy.load.html) and will contain a list of arrays with shape `[target_length, source_length]`. If you only want the raw attention score data without the plots you can enable the `dump_atention_no_plot` parameter.
By default, this script generates an `attention_score.npy` array file and one attention plot per example. The array file can be [loaded used numpy](https://docs.scipy.org/doc/numpy/reference/generated/numpy.load.html) and will contain a list of arrays with shape `[target_length, source_length]`. If you only want the raw attention score data without the plots you can enable the `dump_attention_no_plot` parameter.
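
A minimal sketch of reading such a file with NumPy is shown below. It does not run the inference script; instead it writes a small stand-in file first so the example is self-contained. The assumption that the arrays are stored as a pickled object array (because their shapes differ per example) is mine and is not confirmed by the documentation above.

```python
import os
import tempfile

import numpy as np

# Stand-in data (hypothetical): two examples with different
# [target_length, source_length] shapes.
fake_scores = np.empty(2, dtype=object)
fake_scores[0] = np.full((3, 4), 0.25)
fake_scores[1] = np.full((2, 5), 0.20)

path = os.path.join(tempfile.mkdtemp(), "attention_score.npy")
np.save(path, fake_scores, allow_pickle=True)

# Object arrays can only be loaded when allow_pickle=True is passed.
scores = np.load(path, allow_pickle=True)

for index, attention in enumerate(scores):
    target_length, source_length = attention.shape
    # For each target position, the source position with the highest score.
    best_source = attention.argmax(axis=1)
    print(index, target_length, source_length, best_source.tolist())
```

Only load pickled arrays from files you trust, since unpickling can execute arbitrary code.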



2 changes: 1 addition & 1 deletion docs/nmt.md
@@ -91,7 +91,7 @@ export TRAIN_STEPS=1000000

## Alternative: Generate Toy Data

Training on real-world translation data can take a very long time. If you do not have access to a machine with a GPU but would like to play around with a smaller dataset, we provide a way to generate toy data. The following command will generate a dataset where the target sequences are reversed source sequences. That is, the model needs to learn the reverse the inputs. While this task is not very useful in practice, we can train such a model quickly and use it as as sanity-check to make sure that the end-to-end pipeline is working as intended.
Training on real-world translation data can take a very long time. If you do not have access to a machine with a GPU but would like to play around with a smaller dataset, we provide a way to generate toy data. The following command will generate a dataset where the target sequences are reversed source sequences. That is, the model needs to learn the reverse of the inputs. While this task is not very useful in practice, we can train such a model quickly and use it as as sanity-check to make sure that the end-to-end pipeline is working as intended.

```
DATA_TYPE=reverse ./bin/data/toy.sh
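
To make the reverse task concrete, here is a small Python sketch that builds source/target pairs in which the target is the reversed source. It is not the `toy.sh` script: the function name `make_reverse_example`, the digit vocabulary, and the sequence lengths are assumptions chosen for illustration.

```python
import random


def make_reverse_example(rng, vocab, min_len=3, max_len=8):
    """Return a (source, target) pair where target is the reversed source."""
    length = rng.randint(min_len, max_len)
    source_tokens = [rng.choice(vocab) for _ in range(length)]
    target_tokens = list(reversed(source_tokens))
    return " ".join(source_tokens), " ".join(target_tokens)


rng = random.Random(0)                   # fixed seed for repeatable data
vocab = [str(i) for i in range(10)]      # hypothetical token vocabulary
pairs = [make_reverse_example(rng, vocab) for _ in range(3)]

for source, target in pairs:
    print(source, "->", target)
```

Because the correct output is fully determined by the input, a model that fails to reach near-perfect accuracy on data like this usually points to a problem in the pipeline rather than in the data.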
2 changes: 1 addition & 1 deletion docs/tools.md
@@ -20,7 +20,7 @@ To run training on characters you must pass set `source_delimiter` and `target_d

## Visualizing Beam Search

If you use the `DumpBeams` inference task (see [Inference](inference/) for more details) you can inspect the beam search data by loading the array using numpy, or generate beam search visualizations using the `generate_beam_viz.py` script. This required the `networkx` module to be installed.
If you use the `DumpBeams` inference task (see [Inference](inference/) for more details) you can inspect the beam search data by loading the array using numpy, or generate beam search visualizations using the `generate_beam_viz.py` script. This requires the `networkx` module to be installed.

```
python -m bin.tools.generate_beam_viz \
12 changes: 6 additions & 6 deletions seq2seq/test/hooks_test.py
@@ -39,16 +39,16 @@ class TestPrintModelAnalysisHook(tf.test.TestCase):
def test_begin(self):
model_dir = tempfile.mkdtemp()
outfile = tempfile.NamedTemporaryFile()
tf.get_variable("weigths", [128, 128])
tf.get_variable("weights", [128, 128])
hook = hooks.PrintModelAnalysisHook(
params={}, model_dir=model_dir, run_config=tf.contrib.learn.RunConfig())
hook.begin()

with gfile.GFile(os.path.join(model_dir, "model_analysis.txt")) as file:
file_contents = file.read().strip()
file_contents = tf.compat.as_text(file.read()).strip()

self.assertEqual(file_contents.decode(), "_TFProfRoot (--/16.38k params)\n"
" weigths (128x128, 16.38k/16.38k params)")
self.assertEqual(file_contents, "_TFProfRoot (--/16.38k params)\n"
" weights (128x128, 16.38k/16.38k params)")
outfile.close()


@@ -94,7 +94,7 @@ def test_sampling(self):
outfile = os.path.join(self.sample_dir, "samples_000000.txt")
with open(outfile, "rb") as readfile:
self.assertIn("Prediction followed by Target @ Step 0",
readfile.read().decode("utf-8"))
tf.compat.as_text(readfile.read()))

# Should not trigger for step 9
sess.run(tf.assign(global_step, 9))
@@ -108,7 +108,7 @@ def test_sampling(self):
outfile = os.path.join(self.sample_dir, "samples_000010.txt")
with open(outfile, "rb") as readfile:
self.assertIn("Prediction followed by Target @ Step 10",
readfile.read().decode("utf-8"))
tf.compat.as_text(readfile.read()))


class TestMetadataCaptureHook(tf.test.TestCase):
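
The test changes in this file replace direct `.decode()` calls with `tf.compat.as_text`. The diff does not state the motivation, but a plausible one is Python 2/3 compatibility: a file read in text mode under Python 3 returns `str`, which has no `.decode()` method, while `as_text` accepts both bytes and text. The sketch below is a pure-Python stand-in, named `as_text_sketch` here, that approximates that behaviour so it can run without TensorFlow. It is not TensorFlow's code.

```python
def as_text_sketch(bytes_or_text, encoding="utf-8"):
    """Approximate tf.compat.as_text: return text for bytes or text input."""
    if isinstance(bytes_or_text, str):
        # Already text: return unchanged instead of failing on .decode().
        return bytes_or_text
    if isinstance(bytes_or_text, bytes):
        return bytes_or_text.decode(encoding)
    raise TypeError("Expected binary or unicode string, got %r" % (bytes_or_text,))


# A file opened with "rb" yields bytes; a text-mode read yields str.
from_binary_file = b"Prediction followed by Target @ Step 0"
from_text_file = "_TFProfRoot (--/16.38k params)"

print(as_text_sketch(from_binary_file))
print(as_text_sketch(from_text_file))
```

With a helper like this, the same assertion works whether the reader returned bytes or text, which is what the updated tests rely on.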