While reproducing some code recently, I needed the Field class from torchtext.data. This post records the problem I ran into and how I solved it.
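- Note: the torchtext version should not be too new. Newer releases of torchtext.data no longer contain Field (which is a class, not a method): starting with torchtext 0.9.0 it was moved to torchtext.legacy.data, and in 0.12.0 the legacy module was removed entirely.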
Takeaway: when reproducing someone else's code, also reproduce the package versions of the environment they used.
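One way to tolerate the version difference is to try the known import locations in order. The sketch below is only an illustration and not part of the code being reproduced: `import_first_available` is a hypothetical helper, and the module list assumes that Field lives in torchtext.data before 0.9 and in torchtext.legacy.data from 0.9 through 0.11.

```python
import importlib


def import_first_available(attr_name, module_names):
    """Return (module_name, attribute) from the first module that provides
    attr_name, or None if no candidate module has it."""
    for module_name in module_names:
        try:
            module = importlib.import_module(module_name)
        except (ImportError, OSError):
            # ImportError: module not installed or not present in this version.
            # OSError: can occur when a binary package fails to load.
            continue
        if hasattr(module, attr_name):
            return module_name, getattr(module, attr_name)
    return None


# Assumed candidate locations for Field, oldest layout first.
found = import_first_available("Field", ["torchtext.data", "torchtext.legacy.data"])
if found is None:
    print("Field is not available; an older torchtext (< 0.12) is probably needed")
else:
    print(f"Using Field from {found[0]}")
```

If neither location works, pinning torchtext to the version used by the original authors is the more reliable fix.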
- Run the following code:
from torchtext.data import Field

# tokenize_en (a tokenizer function) and max_length (an int) are
# defined elsewhere in the code being reproduced.
SRC = Field(tokenize=tokenize_en,
            init_token='<sos>',
            eos_token='<eos>',
            fix_length=max_length,
            lower=True,
            batch_first=True,
            sequential=True)
TRG = Field(tokenize=tokenize_en,
            init_token='<sos>',
            eos_token='<eos>',
            fix_length=max_length,
            lower=True,
            batch_first=True,
            sequential=True)
print(SRC.vocab.stoi["<sos>"])
print(TRG.vocab.stoi["<sos>"])
Error message:
print(SRC.vocab.stoi["<sos>"]) # 2
AttributeError: 'Field' object has no attribute 'vocab'
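So I looked at the definition of the Field class for functions related to building the vocabulary, and found that build_vocab() is the one that creates it. build_vocab() is defined as follows: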
class Field(RawField):
    ...
    def build_vocab(self, *args, **kwargs):
        """Construct the Vocab object for this field from one or more datasets.
        Arguments:
            Positional arguments: Dataset objects or other iterable data
                sources from which to construct the Vocab object that
                represents the set of possible values for this field. If
                a Dataset object is provided, all columns corresponding
                to this field are used; individual columns can also be
                provided directly.
            Remaining keyword arguments: Passed to the constructor of Vocab.
        """
        counter = Counter()
        sources = []
        for arg in args:
            if isinstance(arg, Dataset):
                sources += [getattr(arg, name) for name, field in
                            arg.fields.items() if field is self]
            else:
                sources.append(arg)
        for data in sources:
            for x in data:
                if not self.sequential:
                    x = [x]
                try:
                    counter.update(x)
                except TypeError:
                    counter.update(chain.from_iterable(x))
        specials = list(OrderedDict.fromkeys(
            tok for tok in [self.unk_token, self.pad_token, self.init_token,
                            self.eos_token] + kwargs.pop('specials', [])
            if tok is not None))
        self.vocab = self.vocab_cls(counter, specials=specials, **kwargs)
    ...
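The important detail is that the vocab attribute is assigned only inside build_vocab(), so it does not exist on a freshly constructed Field. The following sketch reproduces that pattern with a hypothetical stand-in class, `MiniField`; it is a simplification for illustration and not torchtext's implementation.

```python
from collections import Counter


class MiniField:
    """Hypothetical stand-in for Field: `vocab` exists only after build_vocab()."""

    def __init__(self):
        # As with the behavior seen in the error, no `vocab` attribute is set here.
        self.sequential = True

    def build_vocab(self, *sources):
        counter = Counter()
        for data in sources:
            for tokens in data:
                counter.update(tokens)
        # The attribute is created here, on first call.
        self.vocab = counter


field = MiniField()
try:
    field.vocab
except AttributeError as err:
    message = str(err)
    print(message)

field.build_vocab([["a", "b"], ["a"]])
print(field.vocab["a"])  # 2, because "a" appears twice in the toy data
```

Accessing the attribute before the build step raises the same kind of AttributeError as in the traceback; after the build step the access succeeds.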
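Solution: call SRC.build_vocab() and TRG.build_vocab() after the Field objects are defined. Called without arguments, build_vocab() builds a vocabulary that contains only the special tokens, which is enough for this check; to include real words, pass the dataset(s) as positional arguments, as the docstring describes. The end of the program becomes:
SRC.build_vocab()
TRG.build_vocab()
print(SRC.vocab.stoi["<sos>"])  # output: 2
print(TRG.vocab.stoi["<sos>"])  # output: 2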
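The index 2 follows from the order in which build_vocab() collects the special tokens. The sketch below replays just that step outside of torchtext. It assumes Field's default unk_token and pad_token are "<unk>" and "<pad>" and that the Vocab places special tokens first; `special_token_indices` is a hypothetical helper name.

```python
from collections import OrderedDict


def special_token_indices(init_token, eos_token,
                          unk_token="<unk>", pad_token="<pad>"):
    """Mimic how build_vocab() orders and de-duplicates special tokens."""
    specials = list(OrderedDict.fromkeys(
        tok for tok in [unk_token, pad_token, init_token, eos_token]
        if tok is not None))
    # Assumption: specials occupy the first indices of the vocabulary.
    return {tok: idx for idx, tok in enumerate(specials)}


indices = special_token_indices("<sos>", "<eos>")
print(indices)  # {'<unk>': 0, '<pad>': 1, '<sos>': 2, '<eos>': 3}
```

Because "<unk>" and "<pad>" come first, "<sos>" lands at index 2 and "<eos>" at index 3, regardless of whether any data was passed.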
With that, the program runs without errors.