From SIP to RTP (Part 5) – Trunks & surroundings

Definition of Trunks
Trunk lines are the phone lines coming into the PBX from the telephone provider. Trunking saves cost, because there are usually fewer trunk lines than extension lines, since it is unusual in most offices to have all extension lines in use for external calls at once.

Att.: Definition partially taken from Wikipedia (http://en.wikipedia.org/wiki/Trunking)
Att.: As a rule of thumb, a ratio of 1:5 (trunks to extensions) is usually sufficient.
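
As a rough illustration of the 1:5 rule of thumb, the following Python sketch estimates how many trunk lines an office might need. The function name and the rounding-up choice are assumptions made for this example, not part of any standard.

```python
import math

def trunks_needed(extensions: int, extensions_per_trunk: int = 5) -> int:
    """Estimate trunk lines from the 1:5 rule of thumb, rounding up."""
    if extensions < 0 or extensions_per_trunk <= 0:
        raise ValueError("extensions must be >= 0 and the ratio must be > 0")
    return math.ceil(extensions / extensions_per_trunk)

# An office with 48 extensions would need about 10 trunk lines.
print(trunks_needed(48))
```

Real sizing depends on the actual call traffic of the office, so treat the result only as a starting point.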

Similarly, a SIP trunk is a service offered by an ITSP (Internet Telephony Service Provider) that lets a business with a PBX installed call outside the enterprise network, to any phone on the public network (SIP or not), over the same connection it uses for Internet access.

In other words, if Bob, who uses a SIP PBX, wants to call Ada, and Ada’s phone is an old-fashioned analog phone, Bob’s PBX must use a trunk line and a service offered by an ITSP.

NAT & SIP
It is impossible to talk about SIP and SDP/RTP without mentioning NAT and the problems it can introduce.

Att.: If the PBX, the phones, and the other related devices are all on the same LAN, NAT is not involved and you can ignore these problems. Very often, though, the PBX uses a trunk connected to an ITSP, and that connection traverses a NAT device: in this case NAT interferes with the process.

NAT (Network Address Translation or Network Address Translator) is the translation of an Internet Protocol address (IP address) used within one network (e.g. the internal LAN) to a different IP address known within another network (e.g. the WAN, that is the “external network”). Typically, an office maps the local inside addresses of the hosts that access the Internet to one or more global outside IP addresses, and maps the global IP addresses on incoming packets back into local IP addresses. NAT reduces the number of global IP addresses a company needs to connect to the Internet, down to a single IP address: this address is often the one used by the router that connects the computers to the Internet.

The simplest type of NAT provides a one to one translation of IP addresses (basic NAT or one-to-one NAT). In this type of NAT only the IP addresses, IP header checksum and any higher level checksums that include the IP address need to be changed. The rest of the packet can be left untouched (at least for basic TCP/UDP functionality, some higher level protocols may need further translation). Basic NATs can be used when there is a requirement to interconnect two IP networks with incompatible addressing.

However, it is common to hide an entire IP address space, usually consisting of private IP addresses, behind a single IP address (or in some cases a small group of IP addresses) in another (usually public) address space. To avoid ambiguity in the handling of returned packets, a one-to-many NAT must alter higher level information such as TCP/UDP ports in outgoing communications and must maintain a translation table so that return packets can be correctly translated back. The terms for this kind of NAT are NAPT (network address and port translation), PAT (port address translation), IP masquerading, NAT Overload and many-to-one NAT.

Att.: Since this is the most common type of NAT it is often referred to simply as NAT.

As described, the method enables communication through the router only when the conversation originates in the masqueraded network, since this establishes the translation tables. For example, a web browser in the masqueraded network can browse a website outside, but a web browser outside could not browse a web site in the masqueraded network. However, most NAT devices today allow the network administrator to configure translation table entries for permanent use. This feature is often referred to as “static NAT” or port forwarding and allows traffic originating in the “outside” network to reach designated hosts in the masqueraded network.
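
The translation table described above can be sketched in a few lines of Python. This is only a toy model: the class name, the starting port 40000 and the addresses are assumptions for illustration, and a real NAT device also tracks the protocol, the remote endpoint and entry timeouts.

```python
class Napt:
    """Toy one-to-many NAT: many private (ip, port) pairs share one public IP."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outgoing = {}  # (private_ip, private_port) -> public_port
        self.incoming = {}  # public_port -> (private_ip, private_port)

    def translate_outgoing(self, private_ip, private_port):
        """Called for traffic that originates inside; creates the table entry."""
        key = (private_ip, private_port)
        if key not in self.outgoing:
            self.outgoing[key] = self.next_port
            self.incoming[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.outgoing[key]

    def add_static(self, public_port, private_ip, private_port):
        """Static NAT / port forwarding: an entry configured by the administrator."""
        self.incoming[public_port] = (private_ip, private_port)

    def translate_incoming(self, public_port):
        """Return the inside host for a packet from outside, or None if unknown."""
        return self.incoming.get(public_port)
```

A packet arriving from outside on a port with no entry gets None, which is exactly why a call coming from the ITSP cannot reach the PBX unless the PBX spoke first or a static entry exists.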

SIP and SDP/RTP are good protocols, but things tend to break down when NAT gets involved. SIP packets themselves generally move about without too much trouble as they ‘hop’ from one server to another; RTP sessions (voice transport) are more troublesome. The reason is that NAT rewrites the address and port in the IP and UDP/TCP headers but leaves the addresses and ports written inside the SIP/SDP payload unchanged, so the devices end up with inconsistent information.

Either both clients need to be aware that they are behind a NAT, replace their local IP addresses with their public IPs in their Session Description messages (the messages that specify the IP address and port to use for the voice stream) and open the appropriate firewall ports, or something has to modify the SIP packets en route.
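
To make the substitution concrete, here is a minimal Python sketch that rewrites the address in the o= and c= lines of a session description. The function name and the addresses in the usage lines are assumptions for illustration; a real device would also have to fix the SIP headers (e.g. Contact and Via) and learn its public IP in some way.

```python
def rewrite_sdp_address(sdp: str, private_ip: str, public_ip: str) -> str:
    """Replace the private address in the o= and c= lines with the public one."""
    rewritten = []
    for line in sdp.splitlines():
        # Only the origin and connection lines carry the address we care about.
        if line.startswith(("o=", "c=")) and line.endswith(" " + private_ip):
            line = line[: -len(private_ip)] + public_ip
        rewritten.append(line)
    return "\r\n".join(rewritten) + "\r\n"

offer = (
    "v=0\r\n"
    "o=- 1 1 IN IP4 10.10.10.32\r\n"
    "s=-\r\n"
    "c=IN IP4 10.10.10.32\r\n"
    "t=0 0\r\n"
    "m=audio 35302 RTP/AVP 8 101\r\n"
)
print(rewrite_sdp_address(offer, "10.10.10.32", "203.0.113.5"))
```

The port in the m= line is left untouched here, so this only works if the NAT forwards that same port to the client.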

Alternatively, it is possible to use a NAT device equipped with a SIP proxy (e.g. siproxd) that intercepts the SIP/SDP/RTP packets, checks the IP addresses they carry, substitutes the wrong values, retransmits the packets, and “opens the port” in the NAT for the incoming audio stream.

Att.: Very often everything works even if the SIP UA does not modify the IP address in its SIP/SDP messages and the NAT device has no SIP proxy: it depends on the kind of NAT the LAN is using, and on whether the receiver of the SIP/SDP message can handle private local IP addresses in it.

Products known as Back-to-Back User Agents (e.g. Asterisk) can actually proxy RTP traffic: Asterisk can modify SIP packets to direct the caller and the destination to establish an RTP session with itself, rather than with each other. This is useful in situations where two SIP clients may not have direct access to each other, most commonly when one or both of the SIP clients are behind a NAT.

The topic of SIP and NAT is difficult, and truly understanding it requires in-depth study and a lot of documentation. In general, whenever possible, the simplest way to avoid these problems is to give the PBX a public IP address for its connection to the ITSP, but this raises security concerns.

Otherwise, here is some advice.
– Configure the PBX to replace its local IP address with its public IP address in its Session Description messages and related messages.
– Configure the PBX to periodically send an OPTIONS packet to the ITSP.
– If you have several devices that connect to an external ITSP using SIP, change the source port used by the protocol: every device must use its own unique port.
– If you can configure the router, create static NAT entries that forward to the PBX all the ports used by the SIP protocol and the RTP stream.

PREVIOUS POST: From Sip to RTP (Part 4) – Invite & Register friendship
NEXT POST:  From SIP to RTP (Part 6) – The phone is ringing….

From Sip to RTP (Part 4) – Invite & Register friendship

INVITE
Session Call Establishment

Sip: Alice wants to call Bob

Call Flow

  1. The UAC (Alice) sends an INVITE message to Bob (UAS).
  2. The UAS receives the request and responds using 100 Trying.
  3. The UAS sends a 180 Ringing response to the UAC when the phone begins ringing.
  4. Once the call is picked up, the UAS sends a 200 Ok message to the UAC.
  5. The UAC sends an ACK request to confirm the 200 Ok response was received.

Note: The ACK method completes what is known as the three-way handshake: the confirmation that a session has been successfully established. In SIP, INVITE is the only method where this occurs, because of the long time that often passes between the INVITE itself and the 200 OK response (when the user can’t find the phone, is running to the phone, etc.). So the ACK is important: it tells the callee that the caller hasn’t hung up and has accepted the call.
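
The call flow above can be sketched from the caller’s point of view as a small Python function. The function name and the wording of the actions are assumptions for this example; it only models the order of the messages, not timers or retransmissions.

```python
def uac_handle_responses(status_codes):
    """List what the caller (UAC) does for each response to its INVITE."""
    actions = ["send INVITE"]
    for code in status_codes:
        if 100 <= code < 200:
            # 100 Trying, 180 Ringing: nothing is final yet.
            actions.append(f"{code}: provisional, keep waiting")
        elif 200 <= code < 300:
            actions.append(f"{code}: answered, send ACK")
            break
        else:
            # A final failure response to an INVITE is acknowledged as well.
            actions.append(f"{code}: failed, send ACK")
            break
    return actions

for step in uac_handle_responses([100, 180, 200]):
    print(step)
```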

Real Example
Alice -> Bob

INVITE sip:41@pbx.company-alice-and-bob.com;user=phone SIP/2.0
Via: SIP/2.0/TLS 10.10.10.32:2061;branch=z9hG4bK-9gg3wzak;rport
From: "Alice" <sip:40@pbx.company-alice-and-bob.com>;tag=g5ua0i7fz6
To: <sip:41@pbx.company-alice-and-bob.com;user=phone>
Call-ID: 3c2812339279-zvojwzvof6we
CSeq: 1 INVITE
Max-Forwards: 70
Contact: <sip:40@10.10.10.32:2061;transport=tls;line=i339wesg>;reg-id=1
P-Key-Flags: resolution="31x13", keys="4"
User-Agent: snom360/7.3.14
Allow: INVITE, ACK, CANCEL, BYE, REFER, OPTIONS, NOTIFY, SUBSCRIBE, PRACK, MESSAGE, INFO
Content-Length: 452

Bob -> Alice

SIP/2.0 100 Trying
Via: SIP/2.0/TLS 10.10.10.32:2061;branch=z9hG4bK-9gohvng3wzak;rport=2061
From: "Alice" <sip:40@pbx.company-alice-and-bob.com>;tag=g5ua0i7fz6
To: <sip:41@pbx.company-alice-and-bob.com;user=phone>;tag=eed7a3b4e0
Call-ID: 3c2812339279-zvojwzvof6we
CSeq: 1 INVITE
Content-Length: 0

Bob -> Alice

SIP/2.0 180 Ringing
Via: SIP/2.0/TLS 10.10.10.34:5060;branch=z9hG4bKe299f160c512cb066a3a536253aa4d44;rport=5061
From: "Bob" <sip:41@pbx.company-alice-and-bob.com>;tag=ozac09qwnh
To: "Alice" <sip:40@pbx.company-alice-and-bob.com>;tag=1521860827
Call-ID: f6f17567@pbx
CSeq: 28592 INVITE
Contact: <sip:41@10.10.10.31:1037;transport=tls;line=zs4m8lei>;reg-id=1
Require: 100rel
RSeq: 1
Allow: INVITE, ACK, CANCEL, BYE, REFER, OPTIONS, NOTIFY, SUBSCRIBE,PRACK, MESSAGE, INFO
Allow-Events: talk, hold, refer, call-info
Content-Length: 0

Alice -> Bob

SIP/2.0 200 Ok
Via: SIP/2.0/TLS 10.10.10.34:5060;branch=z9hG4bK-08d0f56ed1e8d82deb32779c5a2cc55b;rport=5061
From: "Alice" <sip:40@pbx.company-alice-and-bob.com>;tag=1521860827
To: "Bob" <sip:41@pbx.company-alice-and-bob.com>;tag=ozac09qwnh
Call-ID: f6f17567@pbx
CSeq: 28593 PRACK
Contact: <sip:41@10.10.10.31:1037;transport=tls;line=zs4m8lei>;reg-id=1
Content-Length: 0

Bob -> Alice

ACK sip:41@10.10.10.31:1037;transport=tls;line=zs4m8lei SIP/2.0
Via: SIP/2.0/TLS 10.10.10.34:5060;branch=z9hG4bK-6dcf1018159b8e96b7b6d62a758d77fd;rport
From: "Bob" <sip:41@pbx.company-alice-and-bob.com>;tag=ozac09qwnh
To: "Alice" <sip:40@pbx.company-alice-and-bob.com>;tag=1521860827
Call-ID: f6f17567@pbx
CSeq: 28592 ACK
Max-Forwards: 70
Contact: <sip:41@10.10.10.34:5060;transport=tls>
Content-Length: 0

Att.: The “user=phone” parameter indicates that the user portion of the URI (the part to the left of the @ sign) should be treated as a telephone number, as in a tel URI: so 40 is the number assigned to Alice’s phone, and 41 to Bob’s phone. Generally, when you dial a telephone number on a keypad, it is converted into a SIP URI of the form sip:nnnnn@domain;user=phone (in this case Alice dialed 41 on the keypad of her phone to call Bob).
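
To see how such a URI breaks down, here is a small Python sketch that splits a SIP URI into user, host, port and parameters. The function name is an assumption for this example, and the parser is deliberately simplified: it ignores passwords, URI headers (the part after “?”) and IPv6 addresses.

```python
def parse_sip_uri(uri: str) -> dict:
    """Split a simple SIP URI of the form sip:user@host:port;param=value."""
    scheme, rest = uri.split(":", 1)
    rest, *param_parts = rest.split(";")
    user, _, hostport = rest.rpartition("@")
    host, _, port = hostport.partition(":")
    params = {}
    for part in param_parts:
        name, _, value = part.partition("=")
        params[name] = value
    return {
        "scheme": scheme,
        "user": user,
        "host": host,
        "port": int(port) if port else None,
        "params": params,
    }

uri = parse_sip_uri("sip:41@pbx.company-alice-and-bob.com;user=phone")
# The user part is a dialed number when user=phone is present.
print(uri["user"], uri["params"].get("user") == "phone")
```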

REGISTER
During a typical SIP session the proxy servers do the job of finding the exact location of the recipient, so they must know the IP address of the recipient UA. What actually happens is that every user registers its current location with a REGISTRAR server: the application sends a message called REGISTER informing the server of its present location. The registrar stores this binding (between the user and its present address) in a location server, which other proxies use to locate the user.

The registration process is very important because it binds the UA to the IP address at which the UA itself answers INVITE and other requests. Registration can use authentication (realm, username, password), but it is not mandatory.

Real Example

Alice -> Registrar Server

REGISTER sip:10.10.10.110:5060 SIP/2.0
Via: SIP/2.0/UDP 10.10.10.81:5060;branch=z9hG4bK97f7825d2
Max-Forwards: 70
Content-Length: 0
To: 40 <sip:40@10.10.10.110:5060>
From: 40 <sip:40@10.10.10.110:5060>;tag=60b2a3eb3e07b6e
Call-ID: b72007f19f018538b3ac254fa026dbb8@10.10.10.81
CSeq: 1507731226 REGISTER
Contact: 40 <sip:40@10.10.10.81:5060;transport=udp>;expires=120
Allow-Events: talk,hold,conference
Allow:NOTIFY,REFER,OPTIONS,INVITE,ACK,CANCEL,BYE,INFO
Authorization:Digest response="368b75e6f3d7244fbbf01da27b271feb",username="40",realm="asterisk",nonce="3b2d808b",algorithm=MD5,uri="sip:10.10.10.110:5060"
User-Agent: Aastra 9112i/1.4.3.1001 Brcm Callctrl/1.5.1.0 MxSF/v3.2.8.45

Registrar Server -> Alice

SIP/2.0 401 Unauthorized
Via: SIP/2.0/UDP 10.10.10.81:5060;branch=z9hG4bK97f7825d2;received=10.10.10.81
From: 40 <sip:40@10.10.10.110:5060>;tag=60b2a3eb3e07b6e
To: 40 <sip:40@10.10.10.110:5060>;tag=as2efd2925
Call-ID: b72007f19f018538b3ac254fa026dbb8@10.10.10.81
CSeq: 1507731226 REGISTER
User-Agent: FPBX-2.8.1(1.4.40)
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY, INFO
Supported: replaces
WWW-Authenticate: Digest algorithm=MD5, realm="asterisk", nonce="7e318af4"
Content-Length: 0

Alice -> Registrar Server

REGISTER sip:10.10.10.110:5060 SIP/2.0
Via: SIP/2.0/UDP 10.10.10.81:5060;branch=z9hG4bK824b13bb5
Max-Forwards: 70
Content-Length: 0
To: 40 <sip:40@10.10.10.110:5060>
From: 40 <sip:40@10.10.10.110:5060>;tag=60b2a3eb3e07b6e
Call-ID: b72007f19f018538b3ac254fa026dbb8@10.10.10.81
CSeq: 1507731227 REGISTER
Contact: 40 <sip:40@10.10.10.81:5060;transport=udp>;expires=120
Allow-Events: talk,hold,conference
Allow:NOTIFY,REFER,OPTIONS,INVITE,ACK,CANCEL,BYE,INFO
Authorization:Digest response="50d379545b2fd91e1a132eb42b120cf0",username="40",realm="asterisk",nonce="7e318af4",algorithm=MD5,uri="sip:10.10.10.110:5060"
User-Agent: Aastra 9112i/1.4.3.1001 Brcm Callctrl/1.5.1.0 MxSF/v3.2.8.45

Registrar Server -> Alice

SIP/2.0 200 OK
Via: SIP/2.0/UDP 10.10.10.81:5060;branch=z9hG4bK824b13bb5;received=10.10.10.81
From: 40 <sip:40@10.10.10.110:5060>;tag=60b2a3eb3e07b6e
To: 40 <sip:40@10.10.10.110:5060>;tag=as2efd2925
Call-ID: b72007f19f018538b3ac254fa026dbb8@10.10.10.81
CSeq: 1507731227 REGISTER
User-Agent: FPBX-2.8.1(1.4.40)
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY, INFO
Supported: replaces
Expires: 120
Contact: <sip:40@10.10.10.81:5060;transport=udp>;expires=120
Date: Tue, 29 Nov 2011 00:29:34 GMT
Content-Length: 0

The authentication process checks that both communicating parties know a shared password. With the 401 Unauthorized response the server rejects the client’s registration and sends back a digest challenge composed of an algorithm type, a realm and a nonce. The client then sends a new registration request, this time with a digest response composed of “username”, “realm”, “nonce”, “uri”, “response” and the algorithm: the response value is computed using the algorithm type, the nonce, the realm and the password. The server checks the response using the password for that user (the shared secret), and if everything is correct it sends a 200 OK message.
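
The computation of the “response” value can be sketched in Python as follows. It follows the basic HTTP Digest scheme with MD5 and no “qop” parameter, which matches the challenge in the trace above (the challenge carries no qop). The function names are assumptions, and the password "secret" is purely hypothetical: the real password of extension 40 is not part of the trace, so the printed value is not expected to match the one captured.

```python
import hashlib

def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def digest_response(username, realm, password, method, uri, nonce):
    """Digest response without qop: MD5(HA1:nonce:HA2)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # who I am, plus the secret
    ha2 = md5_hex(f"{method}:{uri}")                 # what I am asking for
    return md5_hex(f"{ha1}:{nonce}:{ha2}")

# Values taken from the REGISTER example, except the hypothetical password.
print(digest_response("40", "asterisk", "secret",
                      "REGISTER", "sip:10.10.10.110:5060", "7e318af4"))
```

The server performs the same computation with the password it has stored and compares the two values, so the password itself never travels on the network.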

As you have already seen, in a typical SIP session the caller does not initially know the address of the callee: the proxy servers do the job of finding the exact location of the recipient (its IP address). This works because every user registers its current location with a REGISTRAR server through the REGISTER message, and the registrar stores this binding (between the user and its present IP address) in a location server that other proxies use to locate the user.

Att.: The Contact field in INVITE and other SIP messages is related to the same field used in the REGISTER method.

Att.: The ‘Expires’ field gives the duration, in seconds, for which this registration (the binding UA<->IP address) is valid. So the UA has to refresh its registration from time to time.
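
The binding and its expiry can be pictured with a tiny Python model of a registrar. The class and method names are assumptions for this example, and time is passed in explicitly (as seconds) to keep the sketch deterministic; a real registrar also handles several contacts per user, authentication and de-registration.

```python
class Registrar:
    """Toy location service: user address -> (contact, expiry time)."""

    def __init__(self):
        self.bindings = {}

    def register(self, user, contact, expires, now):
        """Store or refresh the binding; it is valid for `expires` seconds."""
        self.bindings[user] = (contact, now + expires)

    def lookup(self, user, now):
        """Return the contact to send an INVITE to, or None if unknown or expired."""
        entry = self.bindings.get(user)
        if entry is None or now >= entry[1]:
            return None
        return entry[0]

registrar = Registrar()
registrar.register("sip:40@10.10.10.110", "sip:40@10.10.10.81:5060", expires=120, now=0)
print(registrar.lookup("sip:40@10.10.10.110", now=60))   # still registered
print(registrar.lookup("sip:40@10.10.10.110", now=130))  # expired without a refresh
```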

OPTIONS
The SIP method OPTIONS allows a UA to query another UA or a proxy server about its capabilities. This lets a client discover information about the supported SIP methods, codecs, etc. without calling the other party.

For example, before a client inserts a field into an INVITE listing an option that it is not certain the destination UAS supports, the client can query the destination UAS with an OPTIONS to see if this option is returned in a Supported header field. All UAs MUST support the OPTIONS method!

Another use is to check the availability of a UA. For example, it is possible to bind a UA statically to an IP address: in this case the UA does not register itself. With the OPTIONS message it is possible to verify that the UAS is online.

Att.: UAs that register themselves will receive the OPTIONS message too!

Real Example

The PBX sends an OPTIONS message to a UA

OPTIONS sip:202@10.10.10.81:5060;transport=udp SIP/2.0
Via: SIP/2.0/UDP 10.10.10.110:5060;branch=z9hG4bK026408ff;rport
From: "Unknown" <sip:Unknown@10.10.10.110>;tag=as264d051f
To: <sip:202@10.10.10.81:5060;transport=udp>
Contact: <sip:Unknown@10.10.10.110>
Call-ID: 787c4d983b006a5a3011bd356140ed15@10.10.10.110
CSeq: 102 OPTIONS
User-Agent: FPBX-2.8.1(1.4.40)
Max-Forwards: 70
Date: Tue, 29 Nov 2011 00:29:34 GMT
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY, INFO
Supported: replaces
Content-Length: 0

Response from UA

SIP/2.0 200 OK
Call-ID: 787c4d983b006a5a3011bd356140ed15@10.10.10.110
CSeq: 102 OPTIONS
From: "Unknown" <sip:Unknown@10.10.10.110>;tag=as264d051f
To: <sip:202@10.10.10.81:5060>;tag=4fe041f88ba9bed
Via: SIP/2.0/UDP 10.10.10.110:5060;branch=z9hG4bK026408ff;rport
Content-Length: 0
Allow:NOTIFY,REFER,OPTIONS,INVITE,ACK,CANCEL,BYE,INFO
Contact: <sip:202@10.10.10.81:5060;transport=udp>
Supported: replaces
User-Agent: Aastra 9112i/1.4.3.1001 Brcm Callctrl/1.5.1.0 MxSF/v3.2.8.45

In this case the UA reports that it exists and that it supports the SIP methods below.

Allow:NOTIFY,REFER,OPTIONS,INVITE,ACK,CANCEL,BYE,INFO
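
Reading the capabilities out of such a response is simple. The following Python sketch extracts the method list from the Allow header; the function name is an assumption for this example and only the Allow header is handled.

```python
def allowed_methods(message: str) -> set:
    """Return the set of SIP methods listed in the Allow header, if any."""
    for line in message.splitlines():
        name, separator, value = line.partition(":")
        if separator and name.strip().lower() == "allow":
            return {m.strip().upper() for m in value.split(",") if m.strip()}
    return set()

response = (
    "SIP/2.0 200 OK\r\n"
    "CSeq: 102 OPTIONS\r\n"
    "Allow:NOTIFY,REFER,OPTIONS,INVITE,ACK,CANCEL,BYE,INFO\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)
methods = allowed_methods(response)
print("INVITE" in methods, "SUBSCRIBE" in methods)
```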

Att.: Generally the PBX sends OPTIONS regularly to all UAs, whether they are registered or not (IP address statically bound to the UA). In Asterisk this behavior is controlled by the qualify parameter. If you turn on qualify in the configuration of a SIP device in sip.conf, Asterisk sends a SIP OPTIONS request regularly to check that the device is still online. If the device does not answer within the configured period (in ms), Asterisk considers the device offline for future calls. This status can be checked in Asterisk with the command sip show peers, which provides status information for peers that have qualify=yes (the status information includes a column showing the delay in the response to the OPTIONS message, which is a measure of the connection latency between device and PBX).

Att.: OPTIONS can also help with NAT problems, by keeping the connection from Asterisk to a peer behind NAT open. I will write about a SIP PBX protected by a firewall/NAT in future posts.

PREVIOUS POST: From Sip to RTP (Part 3) – B2BUA… What ?!
NEXT POST:  From SIP to RTP (Part 5) – Trunks and surroundings

From Sip to RTP (Part 3) – B2BUA… What ?!

Def. of Back-to-Back User Agent (B2BUA).
A B2BUA essentially bolts two user agents together in a back-to-back fashion, similar to two people standing back to back. A B2BUA establishes a two-legged call, keeping the SIP server in the middle of the call to orchestrate the details.

One side of the session acts as the SIP UA server that receives the calls; the other side acts as the SIP UA client that establishes the other leg of the call. This “middle position” of the SIP server allows the system to execute difficult call scenarios, like recording a call, stepping out of the voicemail system (by pressing 0), barging into a call, and many other call scenarios that are very hard to do without this center position.

Att.: Typically Asterisk (and most of the pbx) acts like a B2BUA, although you can configure it to behave differently.

In short, a SIP call through a B2BUA can be described as follows: both Alice and Bob register with the B2BUA PBX.

If Alice wants to call Bob, she sends an INVITE message (a call request) to the B2BUA PBX, which then sends an INVITE message to Bob’s IP phone (the B2BUA PBX knows the IP address of Bob’s IP phone: we will look at this step in detail).

When Bob’s IP phone accepts the INVITE (Bob answers the call), it sends back an OK message to the B2BUA PBX, which propagates it to Alice’s IP phone. Alice then sends an ACK to the B2BUA PBX, which propagates it to Bob’s IP phone. A media session then carries the voice: Alice’s IP phone → B2BUA PBX → Bob’s IP phone for Alice’s voice, and Bob’s IP phone → B2BUA PBX → Alice’s IP phone for Bob’s voice. In every call the B2BUA PBX is in the middle!

Att.: It is important to note that the PBX only proxies RTP media traffic when it has to, and when configured to do so: proxying RTP traffic is CPU/RAM intensive.
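
The “two legs” idea can be sketched in Python as a table that links the call coming from Alice to a separate call placed towards Bob. Everything here (class name, dictionary keys, the way the new Call-ID is built) is an assumption for illustration and not how Asterisk is implemented.

```python
import uuid

class B2bua:
    """Toy back-to-back user agent: every incoming call gets its own outgoing leg."""

    def __init__(self, host="pbx"):
        self.host = host
        self.legs = {}  # Call-ID of one leg -> Call-ID of the other leg

    def on_invite(self, invite, callee_contact):
        """Act as UAS towards the caller and as UAC towards the callee."""
        outgoing = dict(invite)
        outgoing["request_uri"] = callee_contact
        # The second leg is a new call, so it gets its own Call-ID.
        outgoing["call_id"] = f"{uuid.uuid4().hex[:8]}@{self.host}"
        self.legs[invite["call_id"]] = outgoing["call_id"]
        self.legs[outgoing["call_id"]] = invite["call_id"]
        return outgoing

    def other_leg(self, call_id):
        """Find where to relay a message received on one leg."""
        return self.legs[call_id]
```

Because the PBX owns both legs, it can record, transfer or barge into the call, which a plain proxy that only forwards messages cannot do.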

SIP Language
SIP shares some common characteristics with HTTP and SMTP. Like the latter two, SIP is an ASCII text-based protocol, which makes it easy to read and troubleshoot. The text below is a SIP trace that shows a user inviting another user to a session.


Users are identified by a SIP address, known as a Uniform Resource Identifier (URI). A SIP URI is similar to an email address and is typically built around the user’s phone number or host name (e.g., sip:[your_number]@companyA.3tsistemi.it). This allows users to be redirected to another phone as easily as they would be redirected to another web page.

SIP communication consists of two types of SIP messages: methods and responses. Methods are sent from the client to the server and indicate the purpose of the request. The following methods are the most important and common: there are others, but these are the ones you must know when troubleshooting a SIP PBX environment.

INVITE
Establishes a session

ACK
Confirms that the final response to an INVITE request was received

BYE
Ends a session

CANCEL
Cancels a session that is still being established

REGISTER
Registers an IP phone with a registrar server (normally incorporated in the PBX, which needs to know the IP address of the phone, that is where to send the SIP messages)

OPTIONS
Queries the capabilities of the other side.

Responses are sent from the server to the client and indicate the status of the transaction. Responses are delivered in integer form (from 100 to 699) and are categorized as follows.
1xx Informational responses
2xx Success responses
3xx Redirection responses
4xx Request failures
5xx Server errors
6xx Global failures
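
The classification depends only on the first digit of the code, as this small Python sketch shows (the function and dictionary names are assumptions for the example):

```python
RESPONSE_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Request failure",
    5: "Server error",
    6: "Global failure",
}

def response_class(code: int) -> str:
    """Map a SIP status code (100-699) to its class."""
    if not 100 <= code <= 699:
        raise ValueError(f"not a SIP status code: {code}")
    return RESPONSE_CLASSES[code // 100]

print(response_class(180), response_class(401))
```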

SIP messages consist of the following three parts:

First line
In a request, the first line carries the SIP URI of the destination, which is typically built around the user’s phone number.

The first line also indicates either the purpose of the request or the response given by the callee.

Message body
SIP requests and responses can both contain message bodies.

The content of the message body is usually a session description and contains syntax as shown in the message below.

 

Att.: Some SIP messages have no message body, only headers (e.g. CANCEL, ACK).

Headers
SIP header fields provide additional information about the message. Common headers are shown below.

Via
Path taken by the request so far

Call-ID
A unique identifier for the call

CSeq
A sequence number used to order the requests within a call and to match each response to its request.

Contact
A URI at which the user agent can be reached directly.

User-Agent
Identifies the user agent and the version of software it runs.

Message Body
Normally contains the SDP messages.
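
The structure of a SIP message (first line, headers, then an empty line and the optional body) can be shown with a short Python sketch. The function name is an assumption, and the parser is simplified: it does not handle header lines folded over several lines or compact header names.

```python
def parse_sip_message(raw: str):
    """Split a SIP message into (first line, list of headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")  # an empty line ends the headers
    lines = head.split("\r\n")
    first_line = lines[0]
    headers = []
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return first_line, headers, body

raw = (
    "OPTIONS sip:202@10.10.10.81:5060;transport=udp SIP/2.0\r\n"
    "Call-ID: 787c4d983b006a5a3011bd356140ed15@10.10.10.110\r\n"
    "CSeq: 102 OPTIONS\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)
first_line, headers, body = parse_sip_message(raw)
print(first_line)
print(dict(headers)["CSeq"])
```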

PREVIOUS POST: From Sip to RTP (Part 2) – This is straight talking !
NEXT POST: From Sip to RTP (Part 4) – Invite & Register friendship

From Sip to RTP (Part 2) – This is straight talking !

In this post I will discuss the interaction between the SIP and SDP/RTP protocols, with a bottom-up approach.

First, an important note: the Session Initiation Protocol is used ONLY to initiate a session between two endpoints. The SIP protocol does not carry any voice or video data (stream) itself; it only allows two or more endpoints to set up a connection to transfer that traffic (voice or video) between each other via another protocol, the Real-time Transport Protocol (RTP).

Streaming Audio: the Real-time Transport Protocol (RTP)
The Real-time Transport Protocol (RTP) is an application-level protocol that delivers real-time data between two end systems. This is done in such a way that the receiving end system is able to reconstruct the original data stream sent by the other end system, even if the packets are delayed or arrive out of order.

If packets are lost on the way, the protocol is able to detect it, but it does not support requests for retransmission of any data: every RTP packet contains a sequence number used to detect lost and out-of-order packets.
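
How a receiver can use the sequence numbers is sketched below in Python. The function name is an assumption, and the sketch ignores the fact that real RTP sequence numbers are 16 bits wide and wrap around from 65535 to 0.

```python
def analyse_sequence(received):
    """Return (missing, out_of_order) for a list of received sequence numbers."""
    if not received:
        return [], []
    # Numbers between the lowest and the highest that never arrived.
    missing = sorted(set(range(min(received), max(received) + 1)) - set(received))
    # Packets that arrived after a packet with a higher number.
    out_of_order = []
    highest = received[0]
    for seq in received[1:]:
        if seq < highest:
            out_of_order.append(seq)
        else:
            highest = seq
    return missing, out_of_order

missing, late = analyse_sequence([1, 2, 4, 3, 6])
print(missing, late)
```

In this run packet 5 never arrived and packet 3 arrived late; the receiver can reorder 3 if it is still in time, and must conceal the loss of 5.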

The reason for not supporting retransmission in the protocol is that it would most likely take too long to request that the source resend the lost RTP packet and for this copy to arrive. A better solution, for the case of audio at least, is to extrapolate sound from previous audio samples to make up for the lost ones, or just ignore the lost data and go on as if nothing has happened (the duration of the audio in one packet is relatively short, and the loss of sound for that short period of time will not have a major influence on the quality).

The topic of retransmission is a major reason for not using TCP (TCP protocol, which is a reliable connection oriented protocol, uses retransmissions as a way to guarantee the delivery of the data handed to the TCP layer from the application layer).

Therefore RTP normally uses UDP as the default transmission protocol because that does not provide any reliability features. UDP in turn uses IP, with best effort delivery to encapsulate its data.

Att.: Def. of best effort delivery = Describes a network protocol in which the network does not provide any guarantees that data is delivered.

Below we summarize the processing and encapsulation of the audio for an IP telephony session before it is sent from a host over a network connection.
1) The sound from the microphone is sampled at regular intervals. The application bundles a number of samples together, compresses them and encapsulates them into an RTP packet. Typically the data for 20 ms of sound is encapsulated into one RTP packet (to summarize this step: the voice is transformed into a stream of bytes).
2) Every RTP packet is encapsulated into a UDP datagram and transmitted to the destination.
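
To get a feeling for the numbers, the following Python sketch works out the size of one packet and the resulting bandwidth for 20 ms packets. It assumes the G.711 codec (PCMA/PCMU: 8000 samples per second, one byte per sample) and IPv4 without options; other codecs give different payload sizes.

```python
SAMPLE_RATE_HZ = 8000    # G.711 samples the voice 8000 times per second
BYTES_PER_SAMPLE = 1     # each G.711 sample is 8 bits
PACKET_MS = 20           # audio carried by one RTP packet

RTP_HEADER = 12          # fixed RTP header, in bytes
UDP_HEADER = 8
IPV4_HEADER = 20         # without IP options

payload_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * PACKET_MS // 1000
packets_per_second = 1000 // PACKET_MS
packet_bytes = payload_bytes + RTP_HEADER + UDP_HEADER + IPV4_HEADER
bitrate_bps = packet_bytes * 8 * packets_per_second

print(payload_bytes, packets_per_second, packet_bytes, bitrate_bps)
```

So each direction of a G.711 call sends 50 packets per second of 200 bytes each at the IP level, about 80 kbit/s, before counting the link-layer overhead.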

Att.: There are several methods to sample the sound from the microphone and compress the resulting stream of bytes: each method is a different codec.

The Session Description Protocol (SDP)
The Session Description Protocol (SDP) has three main objectives that need to be achieved before an IP telephony session between a caller and a callee can begin.

First, you need to tell the other party what kind of media you want to receive: audio, video, or both. Second, you need to say how you want the media to be encoded, so that you can understand what is being sent (which codec is in use). Third, you need to inform the other party of the address and UDP port you want the media to be delivered to.

For this to work, the device on the other side also has to send you a session description with its own information, or else you will not be able to send any media to it. A typical session description looks like the one below. SDP is entirely textual!

v=0
o=gptucci 955720785595 955720785595 IN IP4 135.138.242.8
s=Basic Session
c=IN IP4 135.138.242.8
t=955720785595 0
m=audio 2328 RTP/AVP 8 0 96 98 99 97
a=rtpmap:96 SC6/6000
a=rtpmap:98 SC6/3000
a=rtpmap:99 RT24/2400
a=rtpmap:97 VR15/1500

Later we will look at the SDP session in detail, but for now we can go through the most important fields.

The origin field

o=<username> <session id> <version> <network type> <address type> <address>

The parameters of the origin field will together form a unique identifier for the current SDP session.

The connection field

c=<network type> <address type> <connection address>

The purpose of the connection field is to give the port number in the media field (see below) an address to be associated with.

The media field

m=<media> <port> <transport> <fmt list>

The purpose of the media field is to let the other party in the session know what kind of media (audio or video) the recipient of the SDP should deliver, to what port on the associated connection address (see above) the media should be delivered, and in what way the media should be coded. The SDP example above uses two standard codecs, denoted 8 and 0 in the media field (respectively PCMA and PCMU). The same media field also declares four non-standard codecs, denoted 96, 97, 98 and 99. The non-standard codecs are defined in the attribute fields that follow, one for each codec number.
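
Since SDP is plain text, the fields discussed above are easy to extract. Here is a minimal Python sketch that reads the connection address, the media line and the rtpmap attributes from the example session; the function name and the dictionary keys are assumptions, and the sketch expects a single m= line.

```python
def parse_sdp(sdp: str) -> dict:
    """Extract where and how the sender of this SDP wants to receive media."""
    info = {"address": None, "media": None, "port": None,
            "transport": None, "payload_types": [], "rtpmap": {}}
    for line in sdp.splitlines():
        kind, _, value = line.partition("=")
        if kind == "c":
            # c=<network type> <address type> <connection address>
            info["address"] = value.split()[2]
        elif kind == "m":
            # m=<media> <port> <transport> <fmt list>
            media, port, transport, *formats = value.split()
            info["media"] = media
            info["port"] = int(port.split("/")[0])
            info["transport"] = transport
            info["payload_types"] = [int(f) for f in formats]
        elif kind == "a" and value.startswith("rtpmap:"):
            number, encoding = value[len("rtpmap:"):].split(None, 1)
            info["rtpmap"][int(number)] = encoding
    return info

example = """v=0
o=gptucci 955720785595 955720785595 IN IP4 135.138.242.8
s=Basic Session
c=IN IP4 135.138.242.8
t=955720785595 0
m=audio 2328 RTP/AVP 8 0 96 98 99 97
a=rtpmap:96 SC6/6000
a=rtpmap:98 SC6/3000
a=rtpmap:99 RT24/2400
a=rtpmap:97 VR15/1500
"""
session = parse_sdp(example)
print(session["address"], session["port"], session["payload_types"])
```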

SIP
The Session Initiation Protocol (SIP) is a signaling protocol for setting up sessions between clients over a network, e.g. the Internet.

Att.: These sessions do not necessarily have to be Internet telephony sessions: SIP could just as well be used for setting up gaming sessions or for distance learning where a lecture is streamed out to the participants.

The SIP sessions are set up by using a three-way handshake procedure (much like TCP).

Sip: Alice wants to call Bob

When client A (Alice) wants to set up an IP telephony call session with client B (Bob), A sends an INVITE request to B. The INVITE message contains a payload (the data inside the INVITE request) with a description of the session A wants to set up with B. If A wants to set up an IP voice telephony session, the session description in the payload lists the audio encodings A “can understand” and specifies the port on which A wants to receive the RTP audio data. The protocol used to convey session descriptions is the Session Description Protocol (SDP). All SDP messages are transmitted inside the payload of SIP messages (this will become clearer later on)!

When B accepts the call, his user agent sends a message with a response code of 200. Any 2xx response means that the message was successfully received, understood, and accepted. In the response, client B adds his codec capabilities and the port numbers where he wants A to send the RTP data (using an SDP packet). The final part of the three-way handshake occurs when A sends an acknowledgement to B. By sending an ACK the caller confirms that it has received the response from the callee. After the setup procedure is completed, the conversation can begin using RTP.
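
The choice of a common codec can be pictured as the intersection of the two lists of payload type numbers. The Python sketch below keeps the order preferred by the caller; the function name is an assumption, and comparing bare numbers is a simplification that is only safe for the standard payload types (non-standard ones should be compared by their rtpmap names).

```python
def common_formats(offered, supported):
    """Payload types from the caller's offer that the callee also supports."""
    return [payload_type for payload_type in offered if payload_type in supported]

# The caller lists what it can receive; the callee answers with the subset it accepts.
offer = [18, 3, 97, 8, 0, 101]
callee_supports = {8, 101}
print(common_formats(offer, callee_supports))
```

If the result is empty, the two sides have no codec in common and the call cannot carry audio.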

SDP in SIP
I am repeating myself, but this is very important!

The SIP protocol is used to initiate a session between two endpoints: it does not carry any voice or video data (stream) itself; it only allows two endpoints to set up a connection (using SDP encapsulated in SIP messages) to transfer that traffic (voice or video) between each other via another protocol, the Real-time Transport Protocol (RTP).

Here is a real example of an INVITE message, showing the structure of the most important SIP message (Alice is calling her friend Bob).

Att.: In Asterisk it is possible to debug all the SIP messages with the following commands from console.

set verbose 0
set debug 0
sip set debug

 

1 = This is the SIP request line, which tells us what kind of SIP message this is. This particular packet is a SIP INVITE request for the extension below.

532453@79.14.212.52 (calling request)SIP/2.0

Att.: 79.14.212.52 is the IP address of the SIP proxy, more commonly the IP address of the SIP PBX; 532453 is Bob’s number.

2 = The Via header contains a list of all SIP proxy servers that this packet has passed through, including the initiating client.

We have seen that the SIP protocol can be, and usually is, routed through one or more SIP proxy servers before reaching its destination: it is very similar to how email is transmitted, in that multiple email servers are usually involved in the delivery process, each forwarding the message in its original form. Each email server adds a Received header to the message, to track the route the message has taken. SIP uses a Via header to track the SIP proxies that the message has passed through to get to its destination.

Att.: The Via field indicates the path taken by the request so far. This prevents request looping and ensures replies take the same path as the requests, which assists in firewall traversal and other unusual routing situations.
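
The way the Via list grows on the way out and is consumed on the way back can be sketched in Python. The function names and host names are assumptions for illustration, and real Via headers also carry the transport and a branch parameter.

```python
def add_via(vias, hop):
    """Each hop that forwards a request puts its own Via on top of the list."""
    return [hop] + vias

def is_loop(vias, me):
    """A proxy that finds itself in the list has already seen this request."""
    return me in vias

def return_path(vias):
    """Hops visited by the response: top Via first, each hop removing its own."""
    hops = []
    remaining = list(vias)
    while remaining:
        hops.append(remaining[0])  # the response is sent to the topmost Via
        remaining = remaining[1:]  # that hop strips its own Via before forwarding
    return hops

vias = add_via([], "alice-phone")
vias = add_via(vias, "proxy1")
vias = add_via(vias, "proxy2")
print(vias)               # what Bob's phone sees in the request
print(return_path(vias))  # the route taken by Bob's response
```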

3 = The “To” header specifies the SIP packet’s destination

4 = The “From” header specifies who sent the SIP packet

5 = This particular packet is a SDP packet, meaning it contains a Session Description Protocol message that contains information the remote client needs to open an RTP session for this call.

6 = The IP address of the SIP client that created this packet

7 = The IP address the destination SIP client should contact to open an RTP session.

8 = The key pieces of information in this header are audio, 35302 and RTP/AVP. The audio component signifies that this is an audio call, 35302 specifies the port on which the client wants to receive the RTP stream (the IP address is the one specified in 7), and RTP/AVP specifies that the Real-time Transport Protocol will be used for the session. The numbers at the end of this header represent the codecs that this client supports: the SIP client at the other end must support at least one of them in order to make a successful connection.

In more depth: the key information in this header is how the audio will flow from the UAS (which receives the INVITE message and is the called party) to the UAC (which transmits this INVITE message and is the caller).

In the INVITE message we can see the following.

c=IN IP4 193.227.104.23
t=0 0
m=audio 35302 RTP/AVP 18 3 97 8 0 101

This means that the voice stream (carried by RTP) must be sent to IP 193.227.104.23, port 35302.

This is the response to this INVITE message.

The OK message carries the information about the other voice stream, for the flow caller->called.

c=IN IP4 79.14.212.52
t=0 0
m=audio 19340 RTP/AVP 8 101

This means that the other voice stream must be sent to IP 79.14.212.52, port 19340.

Att.: Usually the stream is transmitted from the same port where the other stream is received.

Alice’s voice is sent from ip 193.227.104.23 port 35302 to 79.14.212.52 port 19340 (Bob’s loudspeaker), and Bob’s voice is sent from ip 79.14.212.52 port 19340 to 193.227.104.23 port 35302 (Alice’s loudspeaker).

Att.: The voice is “transmitted” as bits produced by a codec: the other party must use the same codec to receive the stream and turn the bit flow back into voice. There are different kinds of codecs: the numbers at the end of the header illustrated above (m=audio 19340 RTP/AVP 8 101) identify the formats the client supports. Here 8 is the only audio codec (usually there are more values), while 101 is not a voice codec but a dynamic payload type, typically used for telephony events (DTMF digits). The SIP client at the other end must support at least one of the listed codecs in order to make a successful connection. To simplify:

m=<media> <port>/<number of ports> <proto> <fmt>

where proto is the transport protocol (here RTP/AVP) and fmt is the list of payload type numbers, that is the codecs. Here 8 = PCMA (alaw) and 101 defines a payload type = telephony. The standard payload type numbers are defined in the IETF RFCs for the RTP audio/video profile.

The stream is transmitted using the RTP protocol, but the messages that specify which IP address and port to use are SDP.

Att.: Unlike SIP, which listens on port 5060 (usually UDP, as in an Asterisk environment, but it can be TCP), RTP uses a dynamic port range (and is only ever UDP): in Asterisk the default range is 10000-20000 and can be changed in the file rtp.conf.

PREVIOUS POST: From SIP to RTP (Part 1) – Overview
NEXT POST:  From Sip to RTP (Part 3) – B2BUA… What ?!