Conference protocol
This document describes the planned evolution of conference management (audio/video). The goal is to improve the current implementation, which simply merges SIP calls together and provides a grid view, into one where participants are listed, can be muted independently, and the video layout can be changed (for example, to show only one participant).
Definitions
Host: The user who mixes the audio/video streams for the others
Participant: Every user in the conference, including the host
Disclaimer
This document only describes the first steps for now: identifying the participants and sending their positions in the video mixer to all participants.
Improve on layouts
Currently, Jami can only show a grid view of the users. We want to be able to show only one member of the conference (for example, the one sharing their screen).
Possible layouts
GRID: Every member is shown with the same height/width
ONE_BIG_WITH_SMALL: One member is enlarged and the others are shown as small previews
ONE_BIG: One member takes up the whole rendered screen
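The three layouts can be modelled as an enumeration. The sketch below is only illustrative: `Layout` and `parseLayout` are hypothetical names, and the pairing of each name with an integer code (0, 1, 2) is an assumption based on the descriptions given for `setConferenceLayout`.

```cpp
// Hypothetical enum; the integer codes are assumed to match setConferenceLayout.
enum class Layout : int {
    GRID = 0,               // every member has the same height/width
    ONE_BIG_WITH_SMALL = 1, // one member enlarged, the others as small previews
    ONE_BIG = 2             // one member fills the whole rendered screen
};

// Convert a raw integer into a Layout.
// Returns false (and leaves `out` untouched) for unknown values.
inline bool parseLayout(int value, Layout& out)
{
    switch (value) {
    case 0: out = Layout::GRID; return true;
    case 1: out = Layout::ONE_BIG_WITH_SMALL; return true;
    case 2: out = Layout::ONE_BIG; return true;
    default: return false;
    }
}
```

Validating the integer before passing it on keeps an out-of-range value from reaching the mixer.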
New API
Two new methods are available in CallManager to manage the conference layout:
/**
* Change the conference layout
* @param confId
* @param layout 0 = matrix (GRID), 1 = one big, others small (ONE_BIG_WITH_SMALL), 2 = one big (ONE_BIG)
*/
void setConferenceLayout(const std::string& confId, int layout);
/**
* Change the active participant (used in layout != matrix)
* @param confId
* @param participantId If participantId not found, the local video will be shown
*/
void setActiveParticipant(const std::string& confId, const std::string& participantId);
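The fallback described for `participantId` (show the local video when the participant is not found) can be sketched as follows. This is not the daemon's code: `resolveActiveParticipant` and the `localId` parameter are hypothetical, introduced only to illustrate the rule.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Return the participant to display: the requested one if it is part of the
// conference, otherwise the local video (identified here by `localId`).
inline std::string resolveActiveParticipant(const std::vector<std::string>& participants,
                                            const std::string& requested,
                                            const std::string& localId)
{
    const bool found = std::find(participants.begin(), participants.end(), requested)
                       != participants.end();
    return found ? requested : localId;
}
```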
Implementation
The implementation is fairly straightforward. Everything is managed by conference.cpp (to link participants to sources) and video_mixer.cpp (to render the requested layout).
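To give an idea of what rendering a GRID layout involves, the sketch below splits a frame into equally sized cells. It is not the code from video_mixer.cpp: `Cell` and `gridCells` are hypothetical, and the near-square arrangement (ceil(sqrt(n)) columns) is an assumption chosen for illustration.

```cpp
#include <cmath>
#include <vector>

// Position and size of one participant in the mixed frame, in pixels.
struct Cell {
    int x, y, w, h;
};

// Split a width x height frame into n equally sized cells, filled row by row.
inline std::vector<Cell> gridCells(int n, int width, int height)
{
    std::vector<Cell> cells;
    if (n <= 0)
        return cells;
    const int cols = static_cast<int>(std::ceil(std::sqrt(static_cast<double>(n))));
    const int rows = (n + cols - 1) / cols; // enough rows to hold n cells
    const int w = width / cols;
    const int h = height / rows;
    for (int i = 0; i < n; ++i)
        cells.push_back(Cell{(i % cols) * w, (i / cols) * h, w, h});
    return cells;
}
```

With three participants in a 1280x720 frame, this yields a 2x2 grid whose last cell stays empty.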
Syncing Conference Information
Note: Currently, the word participant is used for a callId mixed into a conference. This can cause some problems for the API at first and must be fixed in the future.
The goal is to notify all participants of the metadata of the rendered video, that is, which participants are in the conference and where their video is located.
Layout Info
The layout is stored as a VectorMapStringString for clients and internally as a vector, with the following format:
Layout = {
{
"uri": "participant", "x":"0", "y":"0", "w": "0", "h": "0", "isModerator": "true"
},
{
"uri": "participant1", "x":"0", "y":"0", "w": "0", "h": "0", "isModerator": "false"
}
(...)
}
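On the client side, such a structure can be converted into typed values. The sketch below assumes that VectorMapStringString corresponds to `std::vector<std::map<std::string, std::string>>`; `ParticipantInfo`, `intField`, and `toParticipantInfos` are hypothetical helpers, not part of the API.

```cpp
#include <exception>
#include <map>
#include <string>
#include <vector>

// Assumption: the alias used by the API corresponds to this type.
using VectorMapStringString = std::vector<std::map<std::string, std::string>>;

// Hypothetical typed view of one layout entry.
struct ParticipantInfo {
    std::string uri;
    int x = 0, y = 0, w = 0, h = 0;
    bool isModerator = false;
};

// Read an integer field, defaulting to 0 when it is missing or not numeric.
inline int intField(const std::map<std::string, std::string>& m, const std::string& key)
{
    auto it = m.find(key);
    if (it == m.end())
        return 0;
    try {
        return std::stoi(it->second);
    } catch (const std::exception&) {
        return 0;
    }
}

// Convert every string map into a ParticipantInfo.
inline std::vector<ParticipantInfo> toParticipantInfos(const VectorMapStringString& infos)
{
    std::vector<ParticipantInfo> out;
    for (const auto& m : infos) {
        ParticipantInfo p;
        auto uri = m.find("uri");
        if (uri != m.end())
            p.uri = uri->second;
        p.x = intField(m, "x");
        p.y = intField(m, "y");
        p.w = intField(m, "w");
        p.h = intField(m, "h");
        auto mod = m.find("isModerator");
        p.isModerator = (mod != m.end() && mod->second == "true");
        out.push_back(p);
    }
    return out;
}
```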
New API
A new method (in CallManager) and a new signal are available to get the current conference information and to receive updates, respectively:
VectorMapStringString getConferenceInfos(const std::string& confId);
void onConferenceInfosUpdated(const std::string& confId, const VectorMapStringString& infos);
Implementation
The Conference object (which only exists if we mix calls, meaning that we are the master, i.e. the host) manages the information for the whole conference, based on the LayoutInfos of each Call object. getConferenceInfos retrieves the information directly from this object.
So, every Call object now has a LayoutInfo and, when it is updated, asks the Conference object to update its information.
The master of a conference sends its infos via the SIP channel as a message with the following MIME type:
application/confInfo+json
So, if a call receives some confInfo, we know that this call is a member of a conference.
To summarize, Call manages received layouts and Conference manages sent layouts.
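As an illustration of the sending side, the sketch below serializes conference information into a JSON body that could be sent with the application/confInfo+json MIME type. The exact wire format is not specified in this document: the array-of-objects shape is an assumption that mirrors the Layout structure, and `jsonEscape` and `toConfInfoJson` are hypothetical helpers.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Assumption: the alias used by the API corresponds to this type.
using VectorMapStringString = std::vector<std::map<std::string, std::string>>;

// Escape quotes, backslashes and newlines (a minimal subset; other control
// characters are not handled in this sketch).
inline std::string jsonEscape(const std::string& s)
{
    std::string out;
    for (char c : s) {
        switch (c) {
        case '"': out += "\\\""; break;
        case '\\': out += "\\\\"; break;
        case '\n': out += "\\n"; break;
        default: out += c;
        }
    }
    return out;
}

// Serialize the infos as a JSON array of objects with string values.
inline std::string toConfInfoJson(const VectorMapStringString& infos)
{
    std::string out = "[";
    for (std::size_t i = 0; i < infos.size(); ++i) {
        if (i > 0)
            out += ",";
        out += "{";
        bool first = true;
        for (const auto& kv : infos[i]) {
            if (!first)
                out += ",";
            first = false;
            out += "\"" + jsonEscape(kv.first) + "\":\"" + jsonEscape(kv.second) + "\"";
        }
        out += "}";
    }
    out += "]";
    return out;
}
```

A real implementation would more likely rely on a JSON library; the hand-written serializer only keeps the example self-contained.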
Moderators
A conference can be controlled (for now, only the layout can be controlled). For now, a moderator is added only if it is on the same device as the host.
Implementation
To change a layout, the moderator can send a payload with “application/confOrder+json” as type:
{
"layout": int
}
To set a participant as active, the moderator can send a payload with “application/confOrder+json” as type:
{
"activeParticipant": "uri"
}
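The two payloads above can be built with small helpers such as the ones sketched below. `CONF_ORDER_MIME`, `layoutOrder`, and `activeParticipantOrder` are hypothetical names; only the MIME type and the JSON keys come from this document, and how the message is actually sent over the SIP channel is left out.

```cpp
#include <string>

// MIME type named in this document for moderator orders.
const char* const CONF_ORDER_MIME = "application/confOrder+json";

// Build the payload asking the host to switch to the given layout.
inline std::string layoutOrder(int layout)
{
    return "{\"layout\":" + std::to_string(layout) + "}";
}

// Build the payload asking the host to make `uri` the active participant.
// Only quotes and backslashes are escaped in this sketch.
inline std::string activeParticipantOrder(const std::string& uri)
{
    std::string escaped;
    for (char c : uri) {
        if (c == '"' || c == '\\')
            escaped += '\\';
        escaped += c;
    }
    return "{\"activeParticipant\":\"" + escaped + "\"}";
}
```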
Future
Control moderators on a conference
Moderators can hang up
Moderators can mute/unmute
Multiple master management
Separate streams to allow more controls?