Deploying ejabberd for Scalable XMPP Messaging

Overview

ejabberd is a robust, Erlang-based XMPP server designed for high availability and horizontal scalability. This guide walks through planning, deploying, and operating ejabberd for production-grade, scalable XMPP messaging.

1. Architecture choices

Single node (development / small scale): Simple setup; minimal ops overhead.
Clustered ejabberd (recommended for scale): Multiple Erlang nodes forming a cluster; user sessions and routing distributed across nodes.
Load-balanced frontends + clustered backends: Use stateless XMPP frontends (ejabberd nodes) behind TCP/HTTP load balancers; optionally route traffic by client type.
Database-backed state (optional): Use external RDBMS or NoSQL for roster/offline storage if you need strong persistence and cross-node sharing.

2. Capacity planning (quick method)

Estimate concurrent users: Decide target concurrent XMPP connections (C).
Memory per connection: ~50–150 KB (depends on features, MUC, presence).
CPU: Erlang handles many lightweight processes; plan for cores proportional to message throughput.
Storage: Size disks for message history and archive (MAM) retention requirements.
Example: For 100k concurrent users, budget ~5–15 GB RAM for connections, plus headroom for apps and OS.
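The arithmetic behind that example can be sketched as a small calculator. This is only a rough sizing aid: the function name and the use of decimal units (1 GB = 1,000,000 KB) are assumptions made for illustration, and the 50–150 KB range is the per-connection estimate quoted above, not a measured value.

```python
def estimate_connection_ram_gb(concurrent_users, kb_per_conn_low=50, kb_per_conn_high=150):
    """Return (low, high) RAM estimates in GB for connection state only."""
    kb_per_gb = 1_000_000  # decimal units, matching the rough figures in the text
    low = concurrent_users * kb_per_conn_low / kb_per_gb
    high = concurrent_users * kb_per_conn_high / kb_per_gb
    return low, high


low_gb, high_gb = estimate_connection_ram_gb(100_000)
print(f"Connection RAM budget: {low_gb:.1f}-{high_gb:.1f} GB")
# prints "Connection RAM budget: 5.0-15.0 GB"
```

The result covers connection state only; headroom for the OS, the database, and other applications still has to be added on top.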

3. Deployment components

ejabberd nodes: Erlang VM instances running ejabberd.
Load balancer: TCP load balancer (HAProxy in TCP mode) or XMPP-aware proxy for TLS termination and session stickiness.
Database: Mnesia (default, distributed), or PostgreSQL/MySQL for external storage.
Message queue / pubsub: Use ejabberd’s internal pubsub or integrate with external systems for analytics.
Monitoring: Prometheus + Grafana, ejabberd_stats, logs, and alerts.

4. Installation and basic configuration

Install Erlang (compatible version) and ejabberd from packages or source.
Key config file: ejabberd.yml. Set:
- hosts: domain(s) served.
- listen: ports for client-to-server (5222), BOSH/HTTP, and WebSocket.
- auth_method: internal, external, or SQL.
- acl, modules: enable mod_mam, mod_muc, mod_http_upload, mod_offline as needed.

Example relevant snippets (conceptual):

hosts:
  - "chat.example.com"

listen:
  -
    port: 5222
    module: ejabberd_c2s
    starttls_required: true
    max_stanza_size: 65536

5. Clustering ejabberd

Ensure Erlang cookie is the same across nodes (/etc/ejabberd/ejabberdctl.cfg or ~/.erlang.cookie).

Start nodes with unique names (ejabberd@node1, ejabberd@node2).

Join nodes: use ejabberdctl join_cluster ejabberd@node1 from node2.

Use Mnesia for distributed state; consider schema fragmentation for large clusters.

Test cluster: run ejabberdctl list_cluster and ejabberdctl status on each node.

6. Persistence options

Mnesia (default): Fast, distributed, suited for clustered Erlang apps. Ensure disk I/O and replication planning.

SQL backends: Configure auth_method and sql options for PostgreSQL/MySQL to store rosters, vCard, archive, and MAM. Use external DB for heavy persistence and easier backups.

7. Security and TLS

Use valid TLS certificates (Let’s Encrypt or wildcard). Configure TLS ciphers and enforce STARTTLS.

Enable rate limits, connection limits, and consider fail2ban for brute-force protection.

Regularly update Erlang and ejabberd for security patches.
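To make the rate-limit idea above concrete, here is a generic token bucket sketch. It is not ejabberd's implementation, and its internal algorithm may differ; the class name, the byte-based units, and the numbers are assumptions chosen for illustration. It only shows how an average rate with a bounded burst can be enforced per connection.

```python
class TokenBucket:
    """Allow `rate` units per second on average, with bursts up to `burst`."""

    def __init__(self, rate, burst):
        self.rate = float(rate)
        self.burst = float(burst)
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0             # time of the previous check, in seconds

    def allow(self, cost, now):
        """Return True if `cost` units may pass at time `now` (seconds)."""
        elapsed = max(0.0, now - self.last)
        # Refill according to elapsed time, never beyond the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if cost <= self.tokens:
            self.tokens -= cost
            return True
        return False


# Hypothetical limits: 1000 bytes/s on average, bursts up to 2000 bytes.
bucket = TokenBucket(rate=1000, burst=2000)
print(bucket.allow(1500, now=0.0))  # True: the initial burst covers it
print(bucket.allow(1500, now=0.1))  # False: only about 600 tokens left
print(bucket.allow(1500, now=1.5))  # True: the bucket has refilled
```

Passing the clock in as `now` keeps the sketch deterministic; a real limiter would read a monotonic clock instead.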

8. Load balancing and session affinity

For TCP: HAProxy in TCP mode with source IP stickiness or consistent hashing.

For WebSocket/BOSH: HTTP load balancers with session affinity cookies or sticky routes.

Terminate TLS at edge or pass-through depending on security and scaling needs.
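The consistent-hashing option mentioned above can be illustrated with a small hash ring. A load balancer does this internally, so the sketch is for intuition only: the class, the replica count, and the example IP addresses are assumptions, and the node names simply reuse the ejabberd@nodeN naming from the clustering section.

```python
import bisect
import hashlib


class HashRing:
    """Map keys (e.g. client IPs) to nodes using a consistent hash ring."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas  # virtual points per node, for smoother spread
        self._ring = []           # sorted list of (hash, node) points
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def add(self, node):
        for i in range(self.replicas):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [point for point in self._ring if point[1] != node]

    def lookup(self, key):
        if not self._ring:
            raise ValueError("no nodes in ring")
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_left(self._ring, (self._hash(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["ejabberd@node1", "ejabberd@node2", "ejabberd@node3"])
clients = [f"203.0.113.{i}" for i in range(1, 201)]
before = {ip: ring.lookup(ip) for ip in clients}
ring.remove("ejabberd@node2")
after = {ip: ring.lookup(ip) for ip in clients}
moved = [ip for ip in clients if before[ip] != after[ip]]
# Only clients that were on the removed node get remapped.
print(all(before[ip] == "ejabberd@node2" for ip in moved))  # prints True
```

The property shown at the end is the reason to prefer consistent hashing over a plain modulo: when a node leaves, clients on the surviving nodes keep their assignment.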

9. High availability and failover

Use clustering and multiple nodes across availability zones.

Use external DB replicas and backup strategies for persistence.

Configure client reconnection settings and shorter session timeouts to recover quickly on node failover.
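On the client side, reconnection is commonly paired with capped exponential backoff and random jitter, so that a node failure does not cause every client to reconnect at the same instant. The following is a minimal sketch under that assumption; the function name and the base and cap values are illustrative and do not come from any particular XMPP client library.

```python
import random


def reconnect_delays(attempts, base=1.0, cap=60.0, rng=None):
    """Yield one delay (seconds) per attempt: capped exponential backoff with full jitter."""
    rng = rng or random.Random()
    for attempt in range(attempts):
        # The ceiling doubles each attempt until it reaches the cap.
        ceiling = min(cap, base * (2 ** attempt))
        # Full jitter: pick uniformly between 0 and the current ceiling.
        yield rng.uniform(0, ceiling)


# A seeded generator makes the schedule reproducible for testing.
for n, delay in enumerate(reconnect_delays(6, rng=random.Random(42)), start=1):
    print(f"attempt {n}: wait {delay:.2f}s")
```

A lower cap shortens recovery time after failover at the cost of a sharper reconnect spike on the surviving nodes, so the values are worth validating during load testing.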

10. Monitoring and tuning

Monitor: active connections, message rates, queue lengths, Mnesia disk activity, process memory.

Tune:

Erlang VM arguments (for example, ERL_OPTIONS in ejabberdctl.cfg) for heap and schedulers.

max_stanza_size, rate limits, and timeouts in ejabberd.yml.

Collect metrics with ejabberd_prometheus or custom exporters; visualize in Grafana.

11. Common production modules to enable

mod_mam (message archive)

mod_muc (multi-user chat)

mod_offline (offline messages)

mod_vcard, mod_privacy, mod_blocking, mod_http_upload

12. Deployment checklist

Provision servers and set time sync (NTP).

Install Erlang and ejabberd compatible versions.

Configure ejabberd.yml (hosts, listeners, auth).

Set up TLS certificates and security policies.

Configure clustering and Mnesia/SQL backend.

Set up load balancer with proper affinity.

Enable monitoring and logging.

Perform load testing and tune parameters.

Roll out gradually and monitor for issues.

Conclusion

Deploying ejabberd for scalable XMPP requires planning around clustering, persistence, TLS, and load balancing. Start with a small cluster, enable the required modules, monitor resource usage under load, and iterate configuration based on real traffic. This approach delivers a resilient, scalable XMPP messaging platform.
