Implementing seq2seq with beam search

2024/10/4 3:31:31

I'm implementing a seq2seq model based on the example code that TensorFlow provides, and I want to get the top-5 decoder outputs to use for reinforcement learning.

However, the example implements the translation model with an attention decoder, so I need to implement beam search myself to get the top-k results.

Here is the part of the code I have implemented so far (it is added to translate.py).

Reference: https://github.com/tensorflow/tensorflow/issues/654

    with tf.Graph().as_default():
        beam_size = FLAGS.beam_size # Number of hypotheses in beam
        num_symbols = FLAGS.tar_vocab_size # Output vocabulary size
        embedding_size = 10
        num_steps = 5
        embedding = tf.zeros([num_symbols, embedding_size])
        output_projection = None
        log_beam_probs, beam_symbols, beam_path = [], [], []

        def beam_search(prev, i):
            if output_projection is not None:
                prev = tf.nn.xw_plus_b(prev, output_projection[0], output_projection[1])
            probs = tf.log(tf.nn.softmax(prev))
            if i > 1:
                probs = tf.reshape(probs + log_beam_probs[-1], [-1, beam_size * num_symbols])
            best_probs, indices = tf.nn.top_k(probs, beam_size)
            indices = tf.stop_gradient(tf.squeeze(tf.reshape(indices, [-1, 1])))
            best_probs = tf.stop_gradient(tf.reshape(best_probs, [-1, 1]))
            symbols = indices % num_symbols      # which word in vocabulary
            beam_parent = indices // num_symbols # which hypothesis it came from
            beam_symbols.append(symbols)
            beam_path.append(beam_parent)
            log_beam_probs.append(best_probs)
            return tf.nn.embedding_lookup(embedding, symbols)

        # Setting up graph.
        inputs = [tf.placeholder(tf.float32, shape=[None, num_symbols]) for i in range(num_steps)]
        for i in range(num_steps):
            beam_search(inputs[i], i+1)

        input_vals = tf.zeros([1, beam_size], dtype=tf.float32)
        input_feed = {inputs[i]: input_vals[i][:beam_size, :] for i in xrange(num_steps)}
        output_feed = beam_symbols + beam_path + log_beam_probs
        session = tf.InteractiveSession()
        outputs = session.run(output_feed, feed_dict=input_feed)

        print("Top_5 Sentences ")
        for predicted in enumerate(outputs[:5]):
            print(list(predicted))
            print("\n")

The input_feed part raises an error:

ValueError: Shape (1, 12) must have rank 1

Is there a problem in my beam-search code?

Answer

A tried and true demo:

    # -*- coding: utf-8 -*-
    from __future__ import unicode_literals, print_function
    from __future__ import absolute_import
    from __future__ import division

    import tensorflow as tf

    tf.app.flags.DEFINE_integer('beam_size', 4, 'beam size for beam search decoding.')
    tf.app.flags.DEFINE_integer('vocab_size', 40, 'vocabulary size.')
    tf.app.flags.DEFINE_integer('batch_size', 5, 'the batch size.')
    tf.app.flags.DEFINE_integer('num_steps', 10, 'the number of decoding steps.')
    tf.app.flags.DEFINE_integer('embedding_size', 50, 'the embedding size.')

    FLAGS = tf.app.flags.FLAGS

    with tf.Graph().as_default():
        batch_size = FLAGS.batch_size
        beam_size = FLAGS.beam_size  # Number of hypotheses in beam
        vocab_size = FLAGS.vocab_size  # Output vocabulary size
        num_steps = FLAGS.num_steps
        embedding_size = FLAGS.embedding_size

        embedding = tf.random_normal([vocab_size, embedding_size], -2, 4, dtype=tf.float32, seed=0)
        output_projection = [
            tf.random_normal([embedding_size, vocab_size], mean=2, stddev=1, dtype=tf.float32, seed=0),
            tf.random_normal([vocab_size], mean=0, stddev=1, dtype=tf.float32, seed=0),
        ]

        # Offset of each batch element's first hypothesis in the flattened
        # (batch_size*beam_size) layout, repeated beam_size times.
        index_base = tf.reshape(
            tf.tile(tf.expand_dims(tf.range(batch_size) * beam_size, axis=1), [1, beam_size]), [-1])
        log_beam_probs, beam_symbols = [], []

        def beam_search(prev, i):
            if output_projection is not None:
                prev = tf.nn.xw_plus_b(prev, output_projection[0], output_projection[1])
            # (batch_size*beam_size, embedding_size) -> (batch_size*beam_size, vocab_size)

            log_probs = tf.nn.log_softmax(prev)

            if i > 1:
                # total probability
                log_probs = tf.reshape(
                    tf.reduce_sum(tf.stack(log_beam_probs, axis=1), axis=1) + log_probs,
                    [-1, beam_size * vocab_size])
            # (batch_size*beam_size, vocab_size) -> (batch_size, beam_size*vocab_size)

            best_probs, indices = tf.nn.top_k(log_probs, beam_size)
            # (batch_size, beam_size)
            indices = tf.squeeze(tf.reshape(indices, [-1, 1]))
            best_probs = tf.reshape(best_probs, [-1, 1])
            # (batch_size*beam_size)

            symbols = indices % vocab_size       # which word in vocabulary
            beam_parent = indices // vocab_size  # which hypothesis it came from
            beam_symbols.append(symbols)

            # (batch_size*beam_size, num_steps)
            real_path = beam_parent + index_base
            # get rid of the previous probability
            if i > 1:
                pre_sum = tf.reduce_sum(tf.stack(log_beam_probs, axis=1), axis=1)
                pre_sum = tf.gather(pre_sum, real_path)
            else:
                pre_sum = 0
            log_beam_probs.append(best_probs-pre_sum)

            # adapt the previous symbols according to the current symbol
            if i > 1:
                for j in range(i)[:0:-1]:
                    beam_symbols[j-1] = tf.gather(beam_symbols[j-1], real_path)
                    log_beam_probs[j-1] = tf.gather(log_beam_probs[j-1], real_path)

            return tf.nn.embedding_lookup(embedding, symbols)
            # (batch_size*beam_size, embedding_size)

        # Setting up graph.
        init_input = tf.placeholder(tf.float32, shape=[batch_size, embedding_size])
        next_input = init_input
        for i in range(num_steps):
            next_input = beam_search(next_input, i+1)

        seq_rank = tf.stack(values=beam_symbols, axis=1)
        seq_rank = tf.reshape(seq_rank, [batch_size, beam_size, num_steps])
        # (batch_size*beam_size, num_steps)

        init_in = tf.random_uniform([batch_size], minval=0, maxval=vocab_size, dtype=tf.int32, seed=0),
        init_emb = tf.squeeze(tf.nn.embedding_lookup(embedding, init_in))

        session = tf.InteractiveSession()
        init_emb = init_emb.eval()
        seq_rank = session.run(seq_rank, feed_dict={init_input: init_emb})
        best_seq = seq_rank[:, 1, :]
        for i in range(batch_size):
            print("rank %s" % i, end=": ")
            print(best_seq[i])

It is simplified from the beam search in my seq2seq model. Written for Python 2.7 and TF 1.4.
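If the index bookkeeping in the demo is hard to follow, the same idea can be sketched in plain Python without TensorFlow. This is only an illustration for a single sentence (no batching), not the code from my model: `beam_step`, `beam_search_py`, and `toy_model` are hypothetical names, and the toy model's probabilities are made up. It shows the two tricks the demo relies on: flattening the (hypothesis, word) scores so one top-k picks the best extensions, and recovering the parent hypothesis and word with `//` and `%`.

```python
import math


def beam_step(beam_log_probs, step_log_probs, beam_size):
    """One beam-search step for a single sentence.

    beam_log_probs: cumulative log-probability of each live hypothesis.
    step_log_probs: for each hypothesis, a list of log-probs over the vocabulary.
    Returns (scores, parents, symbols) of the beam_size best extensions, best first.
    """
    vocab_size = len(step_log_probs[0])
    # Flatten every (hypothesis, word) pair into one list of total scores.
    flat = [beam_log_probs[h] + step_log_probs[h][w]
            for h in range(len(beam_log_probs))
            for w in range(vocab_size)]
    # Indices of the highest totals (plays the role of tf.nn.top_k).
    top = sorted(range(len(flat)), key=lambda k: flat[k], reverse=True)[:beam_size]
    scores = [flat[k] for k in top]
    parents = [k // vocab_size for k in top]  # which hypothesis it came from
    symbols = [k % vocab_size for k in top]   # which word in vocabulary
    return scores, parents, symbols


def beam_search_py(next_log_probs, beam_size, num_steps):
    """next_log_probs(prefix) must return log-probs over the vocabulary."""
    prefixes, scores = [[]], [0.0]
    for _ in range(num_steps):
        step = [next_log_probs(p) for p in prefixes]
        scores, parents, symbols = beam_step(scores, step, beam_size)
        # Reorder the surviving prefixes by parent, then append the new word.
        prefixes = [prefixes[p] + [s] for p, s in zip(parents, symbols)]
    return list(zip(prefixes, scores))


def toy_model(prefix):
    """Made-up 3-word model that prefers repeating the previous word."""
    if not prefix:
        probs = [0.5, 0.3, 0.2]
    else:
        probs = [0.2, 0.2, 0.2]
        probs[prefix[-1]] = 0.6
    return [math.log(p) for p in probs]


if __name__ == "__main__":
    for seq, score in beam_search_py(toy_model, beam_size=2, num_steps=3):
        print(seq, math.exp(score))
```

The reordering line in `beam_search_py` corresponds to the `tf.gather(..., real_path)` calls in the demo: once a step picks its parents, every earlier symbol has to be rearranged to follow the surviving hypotheses. In the batched TensorFlow version, `index_base` adds `batch * beam_size` to each parent so the gather indexes into the flattened (batch_size*beam_size) layout.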

