Lip-sync Based on Volume of Wav Files (Native)

Updated: 02/17/2021

This page is for Cubism version 4.2 or earlier.

Summary

The Cubism SDK for Native sample performs real-time lip-sync driven by the volume of the audio data in a wav file.
This sample is provided with the Cubism 4 SDK for Native R2 or later.

How to Use

Preparation

In the model's settings file (.model3.json), list the wav file associated with each motion. For each motion entry, set the "Sound" key to the path of the corresponding wav file.

{
	・
	・
	・
		"Motions": {
			"Idle": [
			  {"File":"motions/haru_g_idle.motion3.json" ,"FadeInTime":0.5, "FadeOutTime":0.5, "Sound": "sounds/haru_normal_05.wav"},
			  {"File":"motions/haru_g_m15.motion3.json" ,"FadeInTime":0.5, "FadeOutTime":0.5, "Sound": "sounds/haru_normal_06.wav"}
			],
			"TapBody": [
			  {"File":"motions/haru_g_m06.motion3.json" ,"FadeInTime":0.5, "FadeOutTime":0.5, "Sound": "sounds/haru_normal_01.wav"},
			  {"File":"motions/haru_g_m09.motion3.json" ,"FadeInTime":0.5, "FadeOutTime":0.5, "Sound": "sounds/haru_normal_02.wav"},
			  {"File":"motions/haru_g_m20.motion3.json" ,"FadeInTime":0.5, "FadeOutTime":0.5, "Sound": "sounds/haru_normal_03.wav"},
			  {"File":"motions/haru_g_m26.motion3.json" ,"FadeInTime":0.5, "FadeOutTime":0.5, "Sound": "sounds/haru_normal_04.wav"}
			]
		},
	・
	・
	・
}

Place the wav file in the location specified above; if the path to the wav file is incorrect, lip-sync will not be performed.
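
Because a wrong path fails silently, it can help to verify the file before starting a motion. The following is a minimal sketch, not part of the SDK: SoundFileExists is a hypothetical helper that joins the model directory and the "Sound" value by plain concatenation, as the sample does, and checks that the file can be opened.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (not an SDK function).
// Joins modelHomeDir and soundPath by simple concatenation and reports
// whether the resulting file can be opened for reading.
// If resolved is non-null, it receives the joined path for logging.
bool SoundFileExists(const std::string& modelHomeDir,
                     const std::string& soundPath,
                     std::string* resolved)
{
    if (soundPath.empty())
    {
        return false; // No "Sound" key was set for this motion.
    }

    const std::string path = modelHomeDir + soundPath;
    if (resolved != nullptr)
    {
        *resolved = path;
    }

    std::ifstream file(path, std::ios::binary);
    return file.good();
}
```

Logging the resolved path when this returns false makes a misplaced wav file easier to spot than a model whose mouth simply does not move.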

Start of volume acquisition

In the sample, information is acquired from a wav file via the LAppWavFileHandler class.
Executing LAppWavFileHandler::Start() reads audio data from a wav file and initializes the internal state necessary to perform lip-sync.

CubismMotionQueueEntryHandle LAppModel::StartMotion(const csmChar* group, csmInt32 no, csmInt32 priority, ACubismMotion::FinishedMotionCallback onFinishedMotionHandler)
{
    ・
    ・
    ・
    //voice
    csmString voice = _modelSetting->GetMotionSoundFileName(group, no);
    if (strcmp(voice.GetRawString(), "") != 0)
    {
        csmString path = voice;
        path = _modelHomeDir + path;
        LAppPal::PrintLog("[APP]start lipsync:%s .", path.GetRawString());
        _wavFileHandler.Start(path);
    }
    ・
    ・
    ・ 
}

Status update and volume acquisition

Calling LAppWavFileHandler::Update() measures the volume of the portion of the audio that corresponds to the elapsed time. The measured volume can then be obtained with LAppWavFileHandler::GetRms(). Apply that volume to the model as the lip-sync value with the CubismModel::AddParameterValue function.

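In the sample, the acquired volume is multiplied by 0.8 to obtain the lip-sync value; the 0.8 is passed as the third argument (the weight) of AddParameterValue. The volume does not have to come from LAppWavFileHandler: a value obtained from a sound library or another source can be set as the lip-sync value instead.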

void LAppModel::Update()
{
    ・
    ・
    ・
    // Lip-sync settings
    if (_lipSync)
    {
        // For real-time lip-sync, get the volume from the system and enter a value in the range between 0 and 1.
        csmFloat32 value = 0.0f;

        // Status update / RMS value acquisition
        _wavFileHandler.Update(deltaTimeSeconds);
        value = _wavFileHandler.GetRms();

        for (csmUint32 i = 0; i < _lipSyncIds.GetSize(); ++i)
        {
            _model->AddParameterValue(_lipSyncIds[i], value, 0.8f);
        }
    }
    ・
    ・
    ・
}
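
To make the effect of the 0.8 weight concrete, here is a minimal sketch of the arithmetic. It assumes that the call adds value × weight to the current parameter value and that the result is kept within the parameter's range. Parameter and AddWeighted are hypothetical stand-ins for illustration, not SDK types or functions.

```cpp
#include <algorithm>

// Hypothetical stand-in for one model parameter (for example, mouth open).
struct Parameter
{
    float value;
    float minimum;
    float maximum;
};

// Adds value * weight to the current value, then clamps to [minimum, maximum].
// This models the assumed behavior of the weighted add used in the sample.
void AddWeighted(Parameter& parameter, float value, float weight = 1.0f)
{
    const float sum = parameter.value + value * weight;
    parameter.value = std::max(parameter.minimum, std::min(parameter.maximum, sum));
}
```

With a parameter at 0 and a range of 0 to 1, a volume of 0.5 at weight 0.8 yields about 0.4. Under this assumption the value is added on top of whatever the motion has already written for that frame, which is why a weight below 1 softens the mouth movement.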

Additional Information

  • The sample does not have the ability to play audio on the device.
  • LAppWavFileHandler::GetRms() returns the current volume value in the range between 0 and 1.
    • The unit for volume is RMS (root mean square).
    • The value is calculated as an average over the audio of all channels.
      • For example, for stereo audio, the value is the average over both the left and right channels.
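
The measurement described above can be sketched as follows. This is an illustration under stated assumptions, not the sample's LAppWavFileHandler: VolumeTracker and WindowRms are hypothetical names, samples are assumed to be already normalized to the range -1 to 1, and the channels are combined by pooling their squared samples, which is one possible reading of "average of all channels".

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// RMS over samples [begin, end) pooled across every channel.
// Returns 0 when the range contains no samples.
float WindowRms(const std::vector<std::vector<float>>& channels,
                std::size_t begin, std::size_t end)
{
    double sumSquares = 0.0;
    std::size_t count = 0;
    for (const std::vector<float>& channel : channels)
    {
        const std::size_t last = std::min(end, channel.size());
        for (std::size_t i = begin; i < last; ++i)
        {
            sumSquares += static_cast<double>(channel[i]) * channel[i];
            ++count;
        }
    }
    return count == 0 ? 0.0f : static_cast<float>(std::sqrt(sumSquares / count));
}

// Hypothetical tracker: each Update() measures only the samples that fall
// inside the time elapsed since the previous call.
struct VolumeTracker
{
    std::vector<std::vector<float>> channels; // one vector per channel
    unsigned sampleRate = 44100;
    double elapsedSeconds = 0.0;
    std::size_t offset = 0; // first sample not yet measured
    float rms = 0.0f;

    void Update(double deltaSeconds)
    {
        elapsedSeconds += deltaSeconds;
        const std::size_t target =
            static_cast<std::size_t>(elapsedSeconds * sampleRate);
        rms = WindowRms(channels, offset, target);
        offset = target;
    }
};
```

Once the audio data is exhausted the window is empty and the sketch reports 0, so the mouth closes when the sound ends.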

Restrictions

The sample supports loading the following wav files. Lip-sync will not be performed if a file that is not in one of the supported formats is loaded.

  • Microsoft WAV (Little Endian format)
  • Linear PCM
    • Wav files encoded in μ-law, ADPCM, etc. are not supported.
  • Number of channels: Mono/Stereo
  • Supported bit depths: 8, 16, 24-bit signed integer
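
The restrictions above can be checked before a file is handed to the lip-sync code. The following is a minimal sketch based on the standard RIFF/WAVE layout, not the sample's own loader. IsSupportedWav is a hypothetical name, and the function assumes the whole file has already been read into memory.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Little-endian readers for the RIFF header fields.
static std::uint16_t ReadU16(const std::uint8_t* p)
{
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

static std::uint32_t ReadU32(const std::uint8_t* p)
{
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Returns true if the buffer is a RIFF/WAVE file whose "fmt " chunk declares
// linear PCM (format tag 1), 1 or 2 channels, and 8, 16, or 24 bits per sample.
bool IsSupportedWav(const std::vector<std::uint8_t>& data)
{
    if (data.size() < 12) return false;
    if (std::memcmp(data.data(), "RIFF", 4) != 0) return false;
    if (std::memcmp(data.data() + 8, "WAVE", 4) != 0) return false;

    std::size_t pos = 12; // first chunk after the RIFF header
    while (pos + 8 <= data.size())
    {
        const std::uint8_t* chunk = data.data() + pos;
        const std::uint32_t size = ReadU32(chunk + 4);

        if (std::memcmp(chunk, "fmt ", 4) == 0)
        {
            if (size < 16 || data.size() - (pos + 8) < 16) return false;
            const std::uint16_t formatTag = ReadU16(chunk + 8);
            const std::uint16_t channelCount = ReadU16(chunk + 10);
            const std::uint16_t bitsPerSample = ReadU16(chunk + 22);
            return formatTag == 1 &&
                   (channelCount == 1 || channelCount == 2) &&
                   (bitsPerSample == 8 || bitsPerSample == 16 || bitsPerSample == 24);
        }

        // Skip this chunk; chunk bodies are padded to an even length.
        const std::size_t advance = 8 + static_cast<std::size_t>(size) + (size & 1u);
        if (advance > data.size() - pos) return false;
        pos += advance;
    }
    return false; // no "fmt " chunk found
}
```

A file encoded in μ-law, for example, carries format tag 7 instead of 1 and is rejected by this check.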