Two ways to generate text from a trained encoder-decoder model at serving time are:

1. Greedy Search: At each step the model picks the single most probable next token and feeds it back as input to the next step. It's called "greedy" because it makes the locally best choice at every step in the hope that these choices add up to the best overall sequence, which is not guaranteed.

2. Beam Search: This method extends greedy search. Instead of committing to the single most probable token at each step, the model keeps the several highest-scoring partial sequences, extends each of them at the next step, and finally returns the sequence with the highest overall probability among those it kept. The number of partial sequences retained at each step is set by a parameter called the "beam width".
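To make the difference concrete, here is a minimal sketch of both decoding methods in Python. It does not use a real encoder-decoder: `TOY_MODEL` and `next_log_probs` are hypothetical stand-ins for a trained decoder, with hand-picked probabilities chosen so that the two methods disagree. The function names and the `beam_width` and `max_len` parameters are likewise assumptions made for illustration.

```python
import math

EOS = "<eos>"

# Hypothetical toy "decoder": maps a prefix (tuple of tokens) to
# next-token probabilities. A real model would compute these instead.
TOY_MODEL = {
    (): {"A": 0.6, "B": 0.4},
    ("A",): {"C": 0.55, "D": 0.45},
    ("B",): {"C": 0.9, "D": 0.1},
}


def next_log_probs(prefix):
    """Return {token: log-probability} for the token following `prefix`."""
    # Any prefix not in the table ends the sequence with probability 1.
    probs = TOY_MODEL.get(tuple(prefix), {EOS: 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}


def greedy_search(max_len=10):
    """Pick the single most probable token at every step."""
    seq, score = [], 0.0
    for _ in range(max_len):
        log_probs = next_log_probs(seq)
        tok = max(log_probs, key=log_probs.get)
        seq.append(tok)
        score += log_probs[tok]
        if tok == EOS:
            break
    return seq, score


def beam_search(beam_width=2, max_len=10):
    """Keep the `beam_width` best partial sequences at every step."""
    beams = [([], 0.0)]  # (tokens so far, summed log-probability)
    finished = []
    for _ in range(max_len):
        # Extend every live hypothesis by every possible next token.
        candidates = []
        for seq, score in beams:
            for tok, lp in next_log_probs(seq).items():
                candidates.append((seq + [tok], score + lp))
        # Keep only the top `beam_width` extensions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            if seq[-1] == EOS:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break
    # Return the best hypothesis seen, finished or not.
    return max(finished + beams, key=lambda c: c[1])


if __name__ == "__main__":
    for name, (seq, score) in [
        ("greedy", greedy_search()),
        ("beam (width 2)", beam_search(beam_width=2)),
    ]:
        print(f"{name}: {' '.join(seq)}  p={math.exp(score):.2f}")
```

On this toy table, greedy search commits to "A" first (probability 0.6) and ends with the sequence A C at overall probability 0.33, while beam search with a width of 2 also keeps "B" alive and finds B C at 0.36. With a beam width of 1 the sketch reduces to greedy search. It omits refinements that practical decoders often add, such as length normalization of the scores.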

Question

What are two ways to generate text from a trained encoder-decoder model at serving time?

- Teacher forcing and attention
- Teacher forcing and beam search
- Greedy search and attention
- Greedy search and beam search
