Change num_workers and the file seq2seq.md #1375
Open
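When `num_workers > 0`, every worker subprocess reads the entire dataset in full. The dataset is therefore loaded several times, which wastes compute resources and produces duplicated output at run time. In addition, when the `multiprocessing` module is used on Windows, subprocesses must be created under the protection of an `if __name__ == '__main__':` guard. I suggest changing `num_workers` to 0.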
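To illustrate the point above, here is a minimal sketch of a data loader whose worker count depends on the platform and whose entry point sits behind the `__main__` guard. The helper `build_loader`, the toy tensors, and the policy in `get_dataloader_workers` (0 on Windows, 2 elsewhere) are assumptions made for this example and are not the code changed in this PR.

```python
import sys

import torch
from torch.utils.data import DataLoader, TensorDataset


def get_dataloader_workers():
    """Assumed policy for this sketch: no worker processes on Windows."""
    return 0 if sys.platform.startswith('win') else 2


def build_loader(batch_size=4, num_workers=0):
    """Hypothetical helper: wrap a toy dataset in a DataLoader."""
    # Toy data standing in for the real dataset.
    features = torch.arange(16, dtype=torch.float32).reshape(8, 2)
    labels = torch.arange(8)
    dataset = TensorDataset(features, labels)
    return DataLoader(dataset, batch_size=batch_size, shuffle=False,
                      num_workers=num_workers)


if __name__ == '__main__':
    # Only the parent process runs this block. Worker processes that
    # re-import this module skip it, so nothing here is repeated per worker.
    loader = build_loader(num_workers=get_dataloader_workers())
    for X, y in loader:
        print(X.shape, y.shape)
```

With `num_workers=0` the batches are produced in the main process, so no subprocess is created at all; that is the simplest way to avoid the duplicated loading and output described above.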
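In the seq2seq file, the base class's `forward` method is called through `super(MaskedSoftmaxCELoss, self).forward(...)`. Because the overriding `forward` takes an extra `valid_len` parameter, this triggers a signature-mismatch warning. So instead of inheriting directly from `nn.CrossEntropyLoss` and overriding its `forward` method, this change makes `nn.CrossEntropyLoss` an internal member of the `MaskedSoftmaxCELoss` class and calls the `forward` method of that internal `CrossEntropyLoss` instance from `MaskedSoftmaxCELoss.forward`.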
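The composition approach described above could look roughly like the following. This is a sketch, not the exact diff in this PR: the attribute name `ce` and the local `sequence_mask` helper are assumptions made to keep the example self-contained, and the inner loss is invoked as `self.ce(...)`, which dispatches to its `forward`.

```python
import torch
from torch import nn


def sequence_mask(X, valid_len, value=0):
    """Set entries beyond each sequence's valid length to `value`."""
    maxlen = X.size(1)
    steps = torch.arange(maxlen, dtype=torch.float32, device=X.device)
    mask = steps[None, :] < valid_len[:, None]
    X = X.clone()  # avoid modifying the caller's tensor
    X[~mask] = value
    return X


class MaskedSoftmaxCELoss(nn.Module):
    """Softmax cross-entropy loss that ignores padded time steps."""

    def __init__(self):
        super().__init__()
        # Hold the loss as a member instead of inheriting from it, so
        # forward() is free to take the extra valid_len argument.
        self.ce = nn.CrossEntropyLoss(reduction='none')

    def forward(self, pred, label, valid_len):
        # pred: (batch, steps, vocab); label: (batch, steps);
        # valid_len: (batch,)
        weights = sequence_mask(torch.ones_like(label), valid_len)
        # CrossEntropyLoss expects the class dimension second.
        unweighted_loss = self.ce(pred.permute(0, 2, 1), label)
        return (unweighted_loss * weights).mean(dim=1)
```

Because `MaskedSoftmaxCELoss` now subclasses `nn.Module` rather than `nn.CrossEntropyLoss`, its three-argument `forward` no longer overrides a two-argument base method, which is what the warning was about.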