> We can see that the scheme that uses sequential batching actually performs worse than the CPU alone, whereas the new approach using DuHL achieves a 10× speed-up over the CPU.
I had to scroll down to the graph to realize they're talking about SVMs, not deep learning.
This could be pretty cool. Training an SVM has usually been "load ALL the data and go", and sequential implementations are almost nonexistent. Even if this ran at 1× or 0.5× the speed and didn't require the entire dataset in memory at once, it would be a big win.
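To be fair, there is one well-known way to train an SVM without holding all the data: fit a *linear* SVM by SGD on the hinge loss, feeding one mini-batch at a time. This is not the DuHL scheme from the article, just a minimal pure-NumPy sketch of the streaming idea (all names and hyperparameters here are illustrative):

```python
import numpy as np

def hinge_sgd_step(w, b, X_batch, y_batch, lr=0.01, lam=1e-4):
    """One SGD step on the L2-regularized hinge loss.
    Labels y_batch must be in {-1, +1}."""
    margins = y_batch * (X_batch @ w + b)
    active = margins < 1  # points violating the margin contribute a subgradient
    grad_w = lam * w - (y_batch[active, None] * X_batch[active]).sum(axis=0) / len(y_batch)
    grad_b = -y_batch[active].sum() / len(y_batch)
    return w - lr * grad_w, b - lr * grad_b

# Toy linearly separable data; in a real out-of-core setting each
# batch would be read from disk instead of sliced from one array.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1.0, -1.0)

w, b = np.zeros(2), 0.0
for epoch in range(20):
    for i in range(0, len(X), 50):  # only one 50-sample batch "in memory" at a time
        w, b = hinge_sgd_step(w, b, X[i:i + 50], y[i:i + 50])

acc = np.mean(np.sign(X @ w + b) == y)
```

The catch, and presumably why the article's result is interesting, is that this trick only covers linear SVMs; kernel SVMs are the case where solvers traditionally want the whole Gram matrix around.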