SMS Resource

SMS Encoding Guide and Segment Calculator

Compare GSM-7, ASCII, Latin-1, and UCS-2 behavior, then calculate exact segment counts for any message payload.

Practical rule: encoding determines payload size, segment count, and delivery cost profile.

Encoding Comparison

An SMS payload is limited to 140 bytes. Concatenated (multipart) SMS carries a User Data Header (UDH) in every segment, and that overhead reduces per-segment capacity.

Encoding	Typical Use	Single Segment	Concatenated Segment	Unit
GSM-7	Default SMS alphabet, most Western-language traffic	160	153	septets (7-bit chars)
ASCII	7-bit ASCII characters sent in 8-bit payload mode	140	134	bytes
Latin-1	Extended Western European characters in 8-bit mode	140	134	bytes
UCS-2	Multilingual/unicode messaging	70	67	16-bit code units
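The capacities in the table follow from the 140-byte limit. The sketch below reproduces them, assuming a 6-byte UDH per concatenated segment (an assumption inferred from the 140 vs. 134 byte figures above); `capacity` is an illustrative name, not part of any library.

```python
PAYLOAD_BYTES = 140  # SMS payload limit stated above
UDH_BYTES = 6        # assumption: UDH size implied by 140 - 134 in the table


def capacity(bits_per_unit: int, concatenated: bool = False) -> int:
    """Return how many whole units fit in one segment."""
    usable_bytes = PAYLOAD_BYTES - (UDH_BYTES if concatenated else 0)
    return usable_bytes * 8 // bits_per_unit


for name, bits in [("GSM-7", 7), ("ASCII/Latin-1", 8), ("UCS-2", 16)]:
    print(name, capacity(bits), capacity(bits, concatenated=True))
# GSM-7 160 153
# ASCII/Latin-1 140 134
# UCS-2 70 67
```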

Segment Calculator

Inputs: Message; Encoding (Auto or a specific encoding).

Outputs: Segments; Units Used; Per Segment Units; Remaining; Resolved encoding; Payload bytes; Character count.
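The segment arithmetic behind a calculator like this can be sketched as follows. This is a minimal illustration, not the page's actual implementation: `LIMITS` and `segment_stats` are hypothetical names, the limits are copied from the comparison table, and the sketch ignores the edge case where a 2-septet character or a surrogate pair would straddle a segment boundary.

```python
import math

# (single-segment limit, concatenated per-segment limit) from the comparison table
LIMITS = {
    "GSM-7": (160, 153),
    "ASCII": (140, 134),
    "Latin-1": (140, 134),
    "UCS-2": (70, 67),
}


def segment_stats(units_used: int, encoding: str) -> tuple:
    """Return (segments, per_segment_units, remaining_units)."""
    single, concat = LIMITS[encoding]
    if units_used <= single:
        # Assumption: an empty message still counts as one segment.
        segments, per_segment = 1, single
    else:
        segments, per_segment = math.ceil(units_used / concat), concat
    remaining = segments * per_segment - units_used
    return segments, per_segment, remaining


print(segment_stats(160, "GSM-7"))  # (1, 160, 0)
print(segment_stats(161, "GSM-7"))  # (2, 153, 145)
print(segment_stats(71, "UCS-2"))   # (2, 67, 63)
```

Note that one unit over the single-segment limit drops the per-segment capacity for every segment, which is why 161 septets needs two segments of 153 rather than one of 160 plus one extra.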

Cost Estimator (Editable Assumptions)

Inputs: Base Rate / Segment (USD); Estimated Carrier Fee / Segment (USD); Fixed Fee / Message (USD).

Output: Estimated Message Cost (USD). The per-encoding comparison reports the following columns:

Encoding	Valid for Message	Segments	Units	Payload Bytes	Est. Cost (USD)
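One plausible way to combine the three editable inputs is shown below. The formula is an assumption (per-segment charges multiplied by the segment count, plus the fixed per-message fee); the page does not state how its estimator combines them, and the rates in the example are made-up placeholders, not real pricing.

```python
def estimate_cost(segments: int, base_rate: float,
                  carrier_fee: float, fixed_fee: float = 0.0) -> float:
    """Assumed formula: segments * (base + carrier fee) + fixed fee, in USD."""
    return segments * (base_rate + carrier_fee) + fixed_fee


# Hypothetical rates for a 2-segment message.
cost = estimate_cost(segments=2, base_rate=0.0075,
                     carrier_fee=0.0030, fixed_fee=0.0010)
print(f"${cost:.4f}")  # $0.0220
```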

GSM-7 Notes

Each GSM-7 extension-table character consumes 2 septets, an escape septet plus the character itself: `^`, `{`, `}`, `\`, `[`, `]`, `~`, `|`, `€`.
If a message includes characters outside GSM-7 tables, GSM-7 is not a valid encoding for that payload.
GSM-7 fits 160 septets in a single segment and 153 septets per segment when concatenated. These equal character counts only when every character comes from the basic table.
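The rules above can be sketched as a septet counter. This is a minimal illustration: `gsm7_septets` is a hypothetical helper, the tables are transcribed from the character lists in this guide (plus the space character), and the input is assumed to use precomposed characters such as `é` rather than combining sequences.

```python
from typing import Optional

# Basic table as listed in this guide, plus the space character (0x20).
GSM7_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension table as listed in this guide (2 septets each).
GSM7_EXTENSION = set("^{}\\[]~|€")


def gsm7_septets(text: str) -> Optional[int]:
    """Return the septet count, or None if GSM-7 cannot encode the text."""
    total = 0
    for ch in text:
        if ch in GSM7_BASIC:
            total += 1
        elif ch in GSM7_EXTENSION:
            total += 2  # escape septet + character septet
        else:
            return None  # outside the GSM-7 tables: encoding not valid
    return total


print(gsm7_septets("Hello"))  # 5
print(gsm7_septets("[x]"))    # 5
print(gsm7_septets("你好"))    # None
```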

UCS-2 Notes

UCS-2 uses 2 bytes per 16-bit code unit.
Single-segment limit is 70 code units, then 67 per concatenated segment.
Characters outside the BMP (for example, many emoji) cannot be represented in a single 16-bit unit. Practical implementations send them as surrogate pairs, so each one consumes 2 code units (4 bytes).
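Counting code units this way is straightforward in Python, because encoding to UTF-16 produces exactly one 16-bit unit per BMP character and a surrogate pair for everything else. A minimal sketch (`ucs2_code_units` is an illustrative name):

```python
def ucs2_code_units(text: str) -> int:
    """Count 16-bit code units, with non-BMP characters as surrogate pairs."""
    # "utf-16-be" emits no byte-order mark, so bytes / 2 == code units.
    return len(text.encode("utf-16-be")) // 2


for sample in ["hello", "你好", "😀"]:
    units = ucs2_code_units(sample)
    print(len(sample), units, units * 2)  # characters, code units, payload bytes
# 5 5 10
# 2 2 4
# 1 2 4
```

The last line shows why character count and unit count can differ: a single emoji is one character but two code units.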

Supported Character Sets

Use these references to validate payload compatibility before selecting an encoding in production.

GSM-7 Character Set

Basic table characters (1 septet each):

@ £ $ ¥ è é ù ì ò Ç \n Ø ø \r Å å Δ _ Φ Γ Λ Ω Π Ψ Σ Θ Ξ Æ æ ß É
(space) ! " # ¤ % & ' ( ) * + , - . /
0 1 2 3 4 5 6 7 8 9 : ; < = > ?
¡ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z Ä Ö Ñ Ü §
¿ a b c d e f g h i j k l m n o p q r s t u v w x y z ä ö ñ ü à

Extension table characters (2 septets each): ^ { } \ [ ] ~ | €

ASCII Character Set

Supported range: U+0000 to U+007F (7-bit ASCII).

Common printable range: U+0020 to U+007E, including letters, digits, punctuation, and symbols.

Latin-1 (ISO-8859-1) Character Set

Supported range: U+0000 to U+00FF.

Includes ASCII plus Western European accented characters such as Á É Í Ó Ú Ñ Ç Ö Ü ß æ ø.
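For the two 8-bit encodings, compatibility can be checked by attempting the encode and catching the failure. A minimal sketch using Python's built-in `ascii` and `latin-1` codecs; `fits_charset` is a hypothetical helper name.

```python
def fits_charset(text: str, charset: str) -> bool:
    """Return True if every character in text is encodable in charset."""
    try:
        text.encode(charset)
    except UnicodeEncodeError:
        return False
    return True


print(fits_charset("Hello", "ascii"))    # True
print(fits_charset("Ñandú", "ascii"))    # False
print(fits_charset("Ñandú", "latin-1"))  # True
print(fits_charset("€", "latin-1"))      # False (U+20AC is above U+00FF)

# When the text fits, each character occupies exactly one payload byte.
print(len("Ñandú".encode("latin-1")))    # 5
```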

UCS-2 Character Set

Supports 16-bit Unicode code units in the Basic Multilingual Plane (BMP), roughly U+0000 to U+FFFF.

Characters outside the BMP (for example, many emoji) are encoded as surrogate pairs in practical implementations and consume 2 code units each.
