Returns a lookup table that maps a Tensor
of indices into strings.
tf.contrib.lookup.index_to_string_table_from_file(
vocabulary_file,
vocab_size=None,
default_value='UNK',
name=None,
key_column_index=TextFileIndex.LINE_NUMBER,
value_column_index=TextFileIndex.WHOLE_LINE,
delimiter='\t'
)
This operation constructs a lookup table to map int64 indices into string
values. The table is initialized from a vocabulary file specified in
vocabulary_file, where the whole line is the value and the zero-based line
number is the index.
Any input that does not have a corresponding index in the vocabulary file
(an out-of-vocabulary entry) is assigned the default_value.
The underlying table must be initialized by calling
session.run(tf.compat.v1.tables_initializer()) or session.run(table.init())
once.
To specify multi-column vocabulary files, use key_column_index, value_column_index, and delimiter. Each column index is interpreted as follows:
- TextFileIndex.LINE_NUMBER means use the line number starting from zero, expects data type int64.
- TextFileIndex.WHOLE_LINE means use the whole line content, expects data type string.
- A value >=0 means use the index (starting at zero) of the split line based on delimiter.
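The column-index rules can be illustrated with a small pure-Python sketch. It only imitates the documented semantics and is not TensorFlow's implementation: `build_index_to_string`, `lookup`, and the sentinel constants `LINE_NUMBER` and `WHOLE_LINE` are hypothetical names introduced here, and the three-line tab-separated file is made up for the demonstration.

```python
import os
import tempfile

# Hypothetical stand-ins for TextFileIndex.LINE_NUMBER and
# TextFileIndex.WHOLE_LINE; the numeric values are illustrative only.
LINE_NUMBER = -1
WHOLE_LINE = -2


def _column(line, line_number, column_index, delimiter):
    """Pick one field of a line according to the column-index rules."""
    if column_index == LINE_NUMBER:
        return line_number                       # zero-based line number
    if column_index == WHOLE_LINE:
        return line                              # whole line content
    return line.split(delimiter)[column_index]   # zero-based split field


def build_index_to_string(path, key_column_index=LINE_NUMBER,
                          value_column_index=WHOLE_LINE, delimiter="\t"):
    """Read a vocabulary file into a dict that maps int keys to values."""
    table = {}
    with open(path, encoding="utf-8") as f:
        for line_number, raw in enumerate(f):
            line = raw.rstrip("\n")
            key = int(_column(line, line_number, key_column_index, delimiter))
            table[key] = _column(line, line_number, value_column_index, delimiter)
    return table


def lookup(table, indices, default_value="UNK"):
    """Map each index to its value; unknown indices get default_value."""
    return [table.get(i, default_value) for i in indices]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "vocab.tsv")
    with open(path, "w", encoding="utf-8") as f:
        f.write("7\temerson\n9\tlake\n11\tpalmer\n")
    # Keys from column 0, values from column 1.
    by_column = build_index_to_string(path, key_column_index=0,
                                      value_column_index=1)
    # Defaults: key is the line number, value is the whole line.
    by_line = build_index_to_string(path)

print(lookup(by_column, [9, 5]))  # ['lake', 'UNK']
print(lookup(by_line, [1, 5]))    # ['9\tlake', 'UNK']
```

With explicit column indices, the key 9 comes from the first field of the second line. With the defaults, the same line is reached through its line number 1 and the whole line is returned, delimiter included.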
Sample Usages:
If we have a vocabulary file "test.txt" with the following content:
emerson
lake
palmer
indices = tf.constant([1, 5], tf.int64)
table = tf.contrib.lookup.index_to_string_table_from_file(
    vocabulary_file="test.txt", default_value="UNKNOWN")
values = table.lookup(indices)
...
tf.compat.v1.tables_initializer().run()
values.eval() ==> ["lake", "UNKNOWN"]
Args:
- vocabulary_file: The vocabulary filename, may be a constant scalar Tensor.
- vocab_size: Number of the elements in the vocabulary, if known.
- default_value: The value to use for out-of-vocabulary indices.
- name: A name for this op (optional).
- key_column_index: The column index from the text file to get the key values from. The default is to use the line number, starting from zero.
- value_column_index: The column index from the text file to get the value values from. The default is to use the whole line content.
- delimiter: The delimiter to separate fields in a line.
Returns:
The lookup table that maps int64 index Tensors to their associated string values.
Raises:
- ValueError: when vocabulary_file is empty.
- ValueError: when vocab_size is invalid.