It's unclear to me why the 'wait multiple cycles' part is such a deal-breaker. Even when a GPU is used in the standard way, cycles get spent on the different stages of a computation. More significantly, a spiking neuron doesn't (necessarily) have to run at a low clock rate, or at any clock rate at all, since the same integration-through-time approach could work asynchronously, or with jitter, etc. It's a pretty robust & low-power design (and evolution found it, who would have guessed?)
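
To make the "no clock needed" point concrete, here is a minimal sketch of an event-driven leaky integrate-and-fire neuron. It is only an illustration under my own assumptions: the function name `lif_spike_times`, the time constant `tau`, the threshold, the input weight of 0.3 and the ±2 ms jitter are all made-up values, not taken from any particular hardware or paper. The state is only touched when an input spike arrives, and the leak between events is computed from the elapsed time, so there is no global tick anywhere.

```python
import math
import random


def lif_spike_times(events, tau=20.0, threshold=1.0):
    """Event-driven leaky integrate-and-fire neuron (illustrative sketch).

    events: iterable of (time_ms, weight) input spikes, in any order.
    Returns the list of times at which the neuron emits a spike.
    """
    v = 0.0          # membrane potential
    last_t = None    # time of the previous input event
    out = []
    for t, w in sorted(events):
        if last_t is not None:
            # Leak: decay the potential over the elapsed time, whatever it is.
            v *= math.exp(-(t - last_t) / tau)
        v += w       # integrate the incoming spike
        last_t = t
        if v >= threshold:
            out.append(t)
            v = 0.0  # reset after firing
    return out


# Assumed input: one spike every 5 ms, each with weight 0.3.
regular = [(5.0 * i, 0.3) for i in range(200)]

# Same input with up to +/-2 ms of timing jitter on every spike.
rng = random.Random(0)
jittered = [(t + rng.uniform(-2.0, 2.0), w) for t, w in regular]

n_regular = len(lif_spike_times(regular))
n_jittered = len(lif_spike_times(jittered))
print(n_regular, n_jittered)
```

With these made-up numbers the regular train fires on every sixth input, giving 33 output spikes. The jittered train should land in the same ballpark rather than falling apart, which is the kind of robustness I mean: the output depends on how much input arrived recently, not on inputs lining up with a clock edge.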