cool stuff.
What I do not understand is why the log shows a completely different system prompt and voice configuration in the Session created event.
Session created: {'event_id': 'event_3Czc83n1Bng7ud9sDFUQ08', 'type': 'session.created', 'session': {'id': 'sess_1PyAjEiSSsjZ7iHebf21tS', 'model': 'cascaded', 'modalities': ['text', 'audio'], 'instructions': 'You are a speech chat assistant.\nYour personality is: The AI is engaging, informative, and empathetic.\nThe AI is curious and is always interested in learning more about the human.\nThe AI is calm and polite. You are great at asking questions and is a sensitive listener.\nYou are relentlessly curious, but always polite and never probes too much.\nYou are pretty smart but humble, and tries to keep things informal. Overall, You are pretty Zen.\nYou are also: Relaxed, informal, chatty. Fun, and sometimes funny. Occasionally cheeky, and light-hearted.\nDo not use markdown format, plain text is preferred.', 'voice': {'name': 'en-US-AvaNeural', 'type': 'azure-standard', 'temperature': None, 'custom_lexicon_url': None, 'prefer_locales': None, 'style': None, 'pitch': None, 'rate': None, 'volume': None}, 'input_audio_format': 'pcm16', 'output_audio_format': 'pcm16', 'input_audio_transcription': {'model': 'azure-fast-transcription', 'language': None, 'prompt': None, 'custom_model': False, 'phrase_list': None}, 'turn_detection': {'type': 'server_vad', 'threshold': 0.5, 'prefix_padding_ms': 300, 'silence_duration_ms': 200, 'create_response': True, 'interrupt_response': True, 'end_of_utterance_detection': None}, 'tools': [], 'tool_choice': 'auto', 'temperature': 0.8, 'max_response_output_tokens': None, 'input_audio': None, 'input_audio_sampling_rate': None, 'animation': None, 'input_audio_noise_reduction': None, 'input_audio_echo_cancellation': None, 'avatar': None, 'output_audio_timestamp_types': None, 'agent': None}}
That has nothing to do with the settings in
self.default_session_config
The system prompt also differs from the one set in your code.
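
To pin down exactly which fields differ, a small comparison helper might be useful. This is only a sketch: `diff_config` is a hypothetical helper, the `expected` dict is a made-up stand-in for whatever `self.default_session_config` actually contains, and `reported` is trimmed from the `session` payload in the log above.

```python
from typing import Any


def diff_config(
    expected: dict[str, Any], actual: dict[str, Any], prefix: str = ""
) -> dict[str, tuple[Any, Any]]:
    """Return {dotted_key: (expected, actual)} for each key in `expected` that differs."""
    mismatches: dict[str, tuple[Any, Any]] = {}
    for key, want in expected.items():
        path = f"{prefix}{key}"
        got = actual.get(key)
        if isinstance(want, dict) and isinstance(got, dict):
            # Recurse into nested sections such as 'voice' or 'turn_detection'.
            mismatches.update(diff_config(want, got, prefix=f"{path}."))
        elif want != got:
            mismatches[path] = (want, got)
    return mismatches


# Hypothetical values standing in for self.default_session_config.
expected = {
    "instructions": "my configured prompt",
    "voice": {"name": "my-configured-voice"},
    "temperature": 0.8,
}

# Trimmed from the 'session' object of the session.created event in the log.
reported = {
    "instructions": "You are a speech chat assistant.",
    "voice": {"name": "en-US-AvaNeural", "type": "azure-standard"},
    "temperature": 0.8,
}

mismatches = diff_config(expected, reported)
for path, (want, got) in mismatches.items():
    print(f"{path}: configured={want!r} reported={got!r}")
```

Running the same comparison against any later session event the service sends (assuming one arrives after the configuration is applied) would show whether the configured values are picked up at that point or never at all.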