Part 1 is here:

How To Make A Rhythm Game (Technical Guide)

This is a high-level guide to our approach to syncing sounds, originally written for someone else who was syncing sounds in a separate mini-game they made.

I'll have to rewrite it so it's more readable as a standalone document. For now it references our games a bit too much. But it still might be useful, so I'm making it available.

todos: show working example of PlayScheduled being used


In the first part, we looked at how to make visual things not drift from the music. We did it by always using reference points and incrementing them, rather than using any 'current' timestamps.

That worked for visuals. For sounds it's not completely sufficient. The reason is that timing errors between two sounds (a metronome tick played on the beat of a song) are noticeable at a much finer scale than timing errors between a sound and a visual (a cube flashing on the beat).

Because the game loop itself (e.g. the Update() function) runs at 60fps, even in the flashing-cube situation there's technically a delay of up to 1/60s between the beat and seeing the flash. We don't really notice it, though, because all the visuals are already running at 60fps.

With sound vs sound, this up-to-1/60s variability is a lot more obvious! Even if you compensate for latency by triggering all sounds early, you can't get rid of the unevenness. It becomes really obvious if you have a single script where all it does is play a tick sound at 300bpm, like a machine gun. Here's the same working, 'correct' example we had in Part 1, but playing a sound instead:

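	// Fragment of a MonoBehaviour. Conductor.songposition is the song clock from Part 1.
	public AudioSource tickSound; // assumed to be an AudioSource holding a short tick clip

	float lastbeat; // this is the 'moving reference point'
	float crotchet; // length of one beat, in seconds
	float bpm = 140;

	void Start(){
		lastbeat = 0;
		crotchet = 60 / bpm;
	}

	void Update(){
		// Only checked once per frame, so the tick can fire up to one frame late.
		if (Conductor.songposition > lastbeat + crotchet) {
			tickSound.Play();
			lastbeat += crotchet;
		}
	}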

The code might look like it should play the sound at a constant interval. But you'll hear some stuttering, like it's being played by someone struggling to keep time. This is because the sound can only be triggered at the start of a frame. So if the game's frames are at 0ms, 20ms, 40ms, 60ms, ... and the beat happens to fall at 50ms, there will be a 10ms delay until it's triggered.
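To see the unevenness in numbers, here is a minimal sketch in plain C# (no Unity needed) that replays the same check as the Update() loop above against idealised, perfectly even frames. The class and method names are made up for illustration, and real frames are less regular than this, so treat it as a rough model rather than a measurement.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the game code.
public static class FrameJitterDemo
{
    // For each beat, returns how late (in ms) a once-per-frame check fires it.
    public static List<double> TriggerDelaysMs(double bpm, double fps, int beats)
    {
        double crotchet = 60.0 / bpm;     // seconds per beat
        double frameLength = 1.0 / fps;   // seconds per frame (assumed perfectly even)
        var delaysMs = new List<double>();
        double lastBeat = 0.0;            // the moving reference point

        for (long frame = 0; delaysMs.Count < beats; frame++)
        {
            double songPosition = frame * frameLength;

            // Same condition as in Update(): fires on the first frame after the beat.
            if (songPosition > lastBeat + crotchet)
            {
                lastBeat += crotchet;
                delaysMs.Add((songPosition - lastBeat) * 1000.0);
            }
        }
        return delaysMs;
    }
}
```

Calling FrameJitterDemo.TriggerDelaysMs(140, 60, 6) gives a different delay for each beat, each somewhere between 0 and one frame (about 16.7ms at 60fps). That beat-to-beat variation is the stutter you hear. Passing 200 for fps shrinks the spread to at most 5ms.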

The second problem is latency. While visuals can react within 1 frame, audio architectures on PCs normally aren't built for low latency. When you trigger a sound, there's typically around 200ms of delay before you hear it (or as little as 20ms if you use particular low-latency drivers, but that's pretty complicated to do).

When you trigger a sound on beat and expect it to play immediately, it's like you're asking someone to press the piano key the moment you tell them to press it. It can't work like that. There's a delay for them to think about your instruction and react to it, and by the time they press it, it's already slightly offbeat. The proper solution is that you tell them ahead of time to press the key on the downbeat that's coming up, so they can prepare for it! (See solution 4 below)

Solutions

  1. Just increase the framerate of the game to like 200fps. This will help with the inconsistent beats problem because the times are checked more often, but wouldn't help much with the latency problem.

  2. Decreasing audio latency as much as possible. While this will still leave you with a delay of around 20ms, it will at least make button presses feel more responsive. But the delay will still definitely be noticeable if there's e.g. a drum sound played automatically that you intend to be on-beat, and you only trigger it on-beat.

  3. Baking all sound effects into the track itself. I actually did this for ADOFAI, for the game jam, as a cheap and quick way to get those hitsounds on beat. It was also because the next approach required too much time (eventually it took a few days, and I needed to do it to allow the level editor to work).

  4. The solution that requires more work: scheduling all audio ahead of time (how far ahead depends on the latency of your system, but I found around 0.6s was sufficient). Doing this is a little complicated. You might imagine an invisible 'ghost' timeline running ahead of your actual timeline, and triggering those PlayScheduled sounds from the ghost timeline as it moves.
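For solutions 1 and 2, the Unity settings involved might look roughly like the sketch below. The component name and the buffer size of 256 samples are assumptions for illustration. What latency you actually end up with depends on the platform and audio driver, and the frame rate is only a request that the hardware may not reach.

```csharp
using UnityEngine;

// Hypothetical setup component; attach it to an object in the first scene.
public class TimingSettings : MonoBehaviour
{
    void Awake()
    {
        // Solution 1: stop waiting for vsync and ask for a higher frame rate,
        // so the beat check in Update() runs more often.
        QualitySettings.vSyncCount = 0;
        Application.targetFrameRate = 200;

        // Solution 2: request a smaller DSP buffer to reduce output latency.
        // Do this before any audio plays, since Reset interrupts playing audio.
        AudioConfiguration config = AudioSettings.GetConfiguration();
        config.dspBufferSize = 256; // in samples; illustrative value, smaller risks crackling
        AudioSettings.Reset(config);
    }
}
```

Neither setting removes the delay entirely, which is why the scheduling in solution 4 is still needed.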

To do this scheduling in Unity, you'd use the AudioSource.PlayScheduled function. It lets you specify beforehand the exact time, in seconds on the audio clock (AudioSettings.dspTime), at which you want your sound to start playing.
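Here is a minimal sketch of what that can look like for a metronome tick. The component name, the field names, the one-second start delay and the pool of tick AudioSources are all assumptions made for illustration (this is not the code from our games). The 0.6s lookahead is the figure mentioned above. It reads AudioSettings.dspTime directly rather than going through the Conductor from Part 1.

```csharp
using UnityEngine;

// Hypothetical component: schedules a tick on every beat, ahead of time.
public class ScheduledTicker : MonoBehaviour
{
    public AudioSource music;         // the song
    public AudioSource[] tickVoices;  // several sources, each with the tick clip assigned
    public float bpm = 140f;
    public double lookahead = 0.6;    // how far ahead of "now" to schedule, in seconds

    double songStartDsp;  // audio-clock time at which the song starts
    double crotchet;      // seconds per beat
    int nextBeat = 1;     // first beat that has not been scheduled yet
    int nextVoice = 0;    // which pooled source to use next

    void Start()
    {
        crotchet = 60.0 / bpm;

        // Start the song on the audio clock too, slightly in the future,
        // so its start time is known exactly.
        songStartDsp = AudioSettings.dspTime + 1.0;
        music.PlayScheduled(songStartDsp);
    }

    void Update()
    {
        if (tickVoices == null || tickVoices.Length == 0) return;

        // The 'ghost' timeline: where the audio clock will be `lookahead` seconds from now.
        double ghostTime = AudioSettings.dspTime + lookahead;

        // Schedule every beat the ghost timeline has reached but we have not scheduled yet.
        // Beat times are computed from the song start, so errors do not accumulate.
        while (songStartDsp + nextBeat * crotchet <= ghostTime)
        {
            double beatDsp = songStartDsp + nextBeat * crotchet;

            // Rotate through the pool so a newly scheduled tick does not
            // replace one that is still waiting to play on the same source.
            tickVoices[nextVoice].PlayScheduled(beatDsp);
            nextVoice = (nextVoice + 1) % tickVoices.Length;
            nextBeat++;
        }
    }
}
```

Update() still only runs once per frame, but that no longer matters: each frame just hands the audio system a start time well in the future, and the audio system starts the tick at that time itself. The pool needs enough sources to cover every tick that can be pending or still sounding at once. At 300bpm with a 0.6s lookahead that is at least four.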