Phoneme Level Prosody Encoder
PhonemeLevelProsodyEncoder

Bases: Module
Phoneme Level Prosody Encoder Module
This class encodes phoneme-level prosody in the speech synthesis pipeline.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| preprocess_config | PreprocessingConfig | Configuration for preprocessing. | required | 
| model_config | AcousticModelConfigType | Acoustic model configuration. | required | 
Returns:
| Type | Description | 
|---|---|
| torch.Tensor | The encoded tensor after applying masked fill. | 
Source code in models/tts/delightful_tts/reference_encoder/phoneme_level_prosody_encoder.py
forward(x, src_mask, mels, mel_lens, encoding)

The forward pass of the PhonemeLevelProsodyEncoder. The input tensors are passed through the reference encoder, then an attention mechanism, and finally a bottleneck layer.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| x | Tensor | Input tensor of shape [N, seq_len, encoder_embedding_dim]. | required | 
| src_mask | Tensor | The mask tensor which contains  | required | 
| mels | Tensor | The mel-spectrogram with shape [N, Ty/r, n_mels*r], where r=1. | required | 
| mel_lens | Tensor | The lengths of each sequence in mels. | required | 
| encoding | Tensor | The relative positional encoding tensor. | required | 
Returns:
| Type | Description | 
|---|---|
| Tensor | Output tensor of shape [N, seq_len, bottleneck_size]. | 
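
The data flow described above (reference encoder, attention, bottleneck, masked fill) can be illustrated with a minimal sketch. This is not the module's actual implementation: `ProsodyEncoderSketch` is a hypothetical stand-in that uses a GRU in place of the reference encoder and standard multi-head attention in place of the real attention layer, and it omits the `encoding` argument entirely. It also assumes that `src_mask` is a boolean tensor of shape [N, seq_len] in which `True` marks padded phoneme positions, which the documentation above does not state.

```python
import torch
from torch import nn


class ProsodyEncoderSketch(nn.Module):
    """Hypothetical stand-in for illustration; not the library's implementation."""

    def __init__(self, n_mels: int, d_model: int, bottleneck_size: int, n_heads: int = 2):
        super().__init__()
        # Assumption: a GRU over mel frames stands in for the reference encoder.
        self.reference_encoder = nn.GRU(n_mels, d_model, batch_first=True)
        # Assumption: plain multi-head attention stands in for the real attention
        # layer (no relative positional encoding is used in this sketch).
        self.attention = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Projects attended features down to the bottleneck size.
        self.bottleneck = nn.Linear(d_model, bottleneck_size)

    def forward(self, x, src_mask, mels, mel_lens):
        # mels: [N, T_mel, n_mels] -> frame-level states [N, T_mel, d_model]
        frames, _ = self.reference_encoder(mels)
        # True marks padded mel frames, derived from mel_lens.
        positions = torch.arange(mels.size(1), device=mels.device)
        mel_mask = positions[None, :] >= mel_lens[:, None]
        # Phoneme states (queries) attend over mel-frame states (keys/values).
        attended, _ = self.attention(x, frames, frames, key_padding_mask=mel_mask)
        # [N, seq_len, d_model] -> [N, seq_len, bottleneck_size]
        out = self.bottleneck(attended)
        # Zero out padded phoneme positions (assumed mask convention: True = padding).
        return out.masked_fill(src_mask.unsqueeze(-1), 0.0)


torch.manual_seed(0)
N, seq_len, T_mel, n_mels, d_model, bottleneck_size = 2, 4, 7, 80, 16, 8
model = ProsodyEncoderSketch(n_mels, d_model, bottleneck_size)

x = torch.randn(N, seq_len, d_model)      # [N, seq_len, encoder_embedding_dim]
mels = torch.randn(N, T_mel, n_mels)      # [N, Ty/r, n_mels*r] with r=1
mel_lens = torch.tensor([7, 5])           # valid mel frames per sequence
src_mask = torch.tensor([[False, False, False, False],
                         [False, False, True, True]])

out = model(x, src_mask, mels, mel_lens)
print(out.shape)  # torch.Size([2, 4, 8])
```

The final `masked_fill` is what guarantees that padded phoneme positions carry no prosody information downstream, regardless of what the attention and bottleneck layers produce for them.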
Source code in models/tts/delightful_tts/reference_encoder/phoneme_level_prosody_encoder.py