Why and how attention works in neural nets

What does it mean for a machine to "pay attention"? Is it possible for dead transistors to do something that seems so alive? Possibly. ML researchers have been developing neural architectures featuring so-called "attention" mechanisms, which are proving useful across many applications of ML, especially tasks with sequence-style inputs or outputs, such as text. Attention…