Thirdlane PBX HA beta with support for Asterisk 1.8 is available

Submitted by mattdarnell on Tue, 09/21/2010 Permalink

I would much rather have seen you drop support for 1.6.X.

We will be using 1.4 until 1.8.10 if the Asterisk release/bug schedule stays the same as it has been since 0.8.

-Matt

Submitted by thirdlane on Tue, 09/21/2010 Permalink

Matt,

Thanks for the feedback - this is exactly what we are looking for.

Who knows, we may support all flavors for a while if that is what customers want. Nothing makes you learn what's needed faster than announcing you are dropping something :).

That said - I won't know exactly until I check out VoIP telephony in Hawaii :)

Submitted by brian on Fri, 09/24/2010 Permalink

Hi Alex,

Would really like to get some information on how the PBX expects DRBD and Linux-HA to be set up.

We are using this scenario with our proxies and would really like to start doing the same with Asterisk.

Thanks,
Brian

Submitted by brian on Wed, 10/20/2010 Permalink

Hi Alex/Erik,

Just about to take a look at this and I'm getting a 404 on the beta link.

root@berlin ~/Downloads/tl # wget http://www.thirdlane.com/downloads/thirdlane-en-st-2.1-beta-i386.iso
--2010-10-20 11:01:42-- http://www.thirdlane.com/downloads/thirdlane-en-st-2.1-beta-i386.iso
Resolving www.thirdlane.com... 72.52.64.63
Connecting to www.thirdlane.com|72.52.64.63|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2010-10-20 11:01:43 ERROR 404: Not Found.

Could you post the correct link here?

Thanks
Brian

Submitted by dozment on Thu, 10/21/2010 Permalink

Sorry I'm late here, but I agree with Matt. We're going to skip 1.6, and will eventually go to 1.8. However, for the near future we will stick with 1.4.

Submitted by eeman on Fri, 10/22/2010 Permalink

brian: some problems were discovered, an updated beta is about to be released.

dozment:
I totally understand. 1.8.0 was released yesterday but, given the history of releases, it's probably going to be a mess for the first 12 releases (approx. 6 months). This release should prove to be interesting since it's the first release to support IPv6, and the remaining IPv4 space does not have much time left.

What's more interesting is that you've by now heard of 'devices' to convert between IPv4 and IPv6. That, ladies and gentlemen, is just another form of NAT (network address translation), and we know what a pain that can be.

Let's hope that 1.8.x becomes production grade for the volume and scale of MTE before everyone switches to IPv6 :-)

Submitted by matt44 on Thu, 11/11/2010 Permalink

Hello,

With this new version, we can set up two TL servers (active/passive). Do we need one license or two?

regards

Matt

Submitted by thirdlane on Thu, 11/11/2010 Permalink

You would need a special license (price to be determined) with 2 MAC addresses.

Submitted by thirdlane on Thu, 11/11/2010 Permalink

We are not getting enough feedback - do you find that it works as expected, does not work at all, or has no one tried it :) ?

Submitted by mattdarnell on Thu, 11/11/2010 Permalink

Alex,

Everyone I have talked to is excited about it but nobody knows how to get it going.

Are there any documents on how to set up the HA features?

-Matt

Submitted by compusource (not verified) on Tue, 02/01/2011 Permalink

I was going to build a two-machine HA cluster from scratch myself using heartbeat but NOT using DRBD; I'd use rsync instead, so it would be ALMOST realtime but a cinch performance- and administration-wise. I'll get this new beta HA ISO on a machine and see if we can duplicate our existing stuff on there, then move on to getting HA services working well. For the time being, I was planning MAC and IP takeover. How does this ISO expect failover and failback to work? Heartbeat would run over a crossover cable connected between the servers. Any further guidance or docs would be useful.

Submitted by eeman on Tue, 02/01/2011 Permalink

failback is a bad decision.

Submitted by thirdlane on Tue, 02/01/2011 Permalink

To configure the active/passive HA pair you need to do the following:

1) Install from the HA ISO (currently 2.1) on 2 servers
2) Connect the servers' second NICs with a crossover cable
3) Go to Cluster Management -> Registered PBX Servers (on either server)
4) Click on Register PBX Server and register the other server; make sure it is Reachable
5) Select master and slave in the drop downs
6) Click on "Configure HA cluster"
7) Change host names, enter the Cluster IP address and other network information
8) Click on configure, watch the progress messages on the top of the screen and wait :)
9) After you get a message about servers rebooting, wait long enough for both master and slave to reboot - this may vary
10) Click on the link in the message - you should be connected to the PBX Manager via the cluster address
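
Step 4 depends on the other server being reachable. Before registering it, a quick check from either server's shell can rule out basic network problems. The sketch below is not part of Thirdlane; it only tests whether a TCP port accepts connections. The address in the usage comment is hypothetical, and port 10000 is an assumption based on the PBX Manager cluster URL quoted later in this thread.

```python
import socket


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Hypothetical usage: 192.168.100.11 stands in for the peer server's address,
# and 10000 is assumed to be the PBX Manager port.
#   is_reachable("192.168.100.11", 10000)
```

A failed check here only says the port did not answer; it does not say why the GUI reports a server as unreachable.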

Submitted by compusource (not verified) on Tue, 02/01/2011 Permalink

Thanks Alex! We'll try this and let you know how it goes, but it may take a tick, as it requires actually traveling to the colo.

@Erik Do you mean that failback without manual intervention is a bad idea, or that failing back is a bad idea, period? That is, would you recommend waiting for another failure before services are sent back to the originally active/then failed/now passive machine? I figured it would be better to get stuff back on the primary, lest your failback process be tested at the worst possible time: when the one working server you had goes down.

Submitted by eeman on Wed, 02/02/2011 Permalink

No, in heartbeat failback is defined as: when the resources on the primary server become available again, return services to the primary box.

This would play out as follows:

It's 11am and your primary box experiences a failure, failing over to the secondary or slave server.

At 2pm you have identified and corrected the hardware failure and power the primary server back online. This happens to be the busiest part of your day and you have over 260 concurrent calls among 4000 available handsets.

At 2:05pm the primary server starts heartbeat and issues an hb_takeover to regain services, thereby dropping all 260 calls that were in progress. Once service is regained your support desk becomes flooded with service complaints.

It's best to just run on your secondary server until a time when you can manually issue hb_takeover from the primary server (like in the middle of the night).
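
The policy described above can be sketched as a small decision function. This is only an illustration of the reasoning, not how heartbeat or the Thirdlane ISO implements it: the `ClusterState` type, the maintenance window and the call threshold are all hypothetical. Heartbeat itself has an `auto_failback` directive in ha.cf; how the HA ISO sets it is not stated in this thread.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class ClusterState:
    primary_healthy: bool   # has the repaired primary come back up?
    active_node: str        # "primary" or "secondary"
    active_calls: int       # concurrent calls on the active node


def should_fail_back(state, now, auto_failback=False,
                     window=(time(1, 0), time(4, 0)), max_calls=0):
    """Decide whether services should move back to the primary right now.

    window and max_calls are hypothetical knobs for a manual policy;
    the window must not cross midnight in this simple sketch.
    """
    if state.active_node == "primary" or not state.primary_healthy:
        return False  # nothing to fail back to, or already there
    if auto_failback:
        return True   # takes over immediately, whatever the call load
    in_window = window[0] <= now <= window[1]
    return in_window and state.active_calls <= max_calls


busy = ClusterState(primary_healthy=True, active_node="secondary", active_calls=260)
print(should_fail_back(busy, time(14, 5), auto_failback=True))   # automatic failback at 2:05pm
print(should_fail_back(busy, time(14, 5), auto_failback=False))  # manual policy at 2:05pm
```

With automatic failback the 2:05pm case returns True, which is exactly the situation that drops the calls in progress; with the manual policy it returns False until the quiet window.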

Submitted by caw on Fri, 02/04/2011 Permalink

Hi Alex-

I followed your instructions, but things got messed up somewhere...

As soon as I clicked "Configure" after making changes to the cluster settings, I received the following messages:

cannot stat /dev/drbd0 No such file or directory
mount special device /dev/drbd0 does not exist

I saw above where you said, "The ISO installer now allows to setup a DRBD partition required for data replication." Was there a step I missed in the install or does that mean that there are API type hooks in the installation that will allow me to install DRBD later?

I received no other errors, but /etc/httpd/ was empty and /etc/asterisk/ only had defaults.txt.

One other real quick question: When I configured the cluster I was asked for the Virtual IP. I'm assuming that I have to create eth0:0 myself and TL won't do it for me, correct?

Submitted by caw on Tue, 02/08/2011 Permalink

Ok, so I reinstalled the two servers in my lab (yet again) and attempted to get HA set up before doing any customizations.

After the initial install, I configured eth1 and connected both with a x-over cable. I then configured eth0:0 with my virtual IP. Finally I ran `rpm -Uvh /var/thirdlane_load/drbd/*.rpm' to get drbd installed and started the service on both boxes.

I made more progress this time. I saw no errors, and about 90 minutes later the sync was complete and the boxes rebooted after the following message:

"Processing final steps of cluster configuration. Master and slave servers are going to be rebooted shortly. Once the reboot is completed you can login to PBX Manager at the Cluster URL https://192.168.100.10:10000/session_login.cgi"

Now, HTTPD on the master box won't start and throws "/etc/httpd/conf/httpd.conf: No such file or directory", and the slave box won't access the network via its primary interface.

Any suggestions would be most helpful.

Submitted by jawaidbazyar on Wed, 02/16/2011 Permalink

Hi,

I have the Thirdlane MTE HA Beta with Asterisk 1.6 installed on two vmware virtual machines. Installation from the ISO went without issue, and I pretty quickly went through your instructions for configuring the cluster.

I was able to trigger failovers by powering down my primary, then turning the primary back on to automagically restore service. As eeman was saying earlier, automatically flipping back to the primary is a bad idea, and we need to know how to turn that off.

Going through this I have developed some questions. There are many, and I apologize in advance for that :-)

Primary Questions:

Can I back up my current tenant settings from a non-HA Asterisk 1.4 system and restore them onto an HA Asterisk 1.6 system?

Alternatively, what would be the best way to migrate an existing MTE system to an HA MTE system?

What K/sec should I get during DRBD synchronization? I am only getting 2000 KByte/sec (2 MByte/sec) between 2 virtual machines, which is very slow. Would a dedicated Gig port get 100 MByte/sec? Or is this something that just takes a long time the first time?
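
As a rough sanity check on the sync-rate question, the time for a full initial sync is simply device size divided by sustained throughput. The sketch below assumes a 4 GB device (the size reported in this post), treats the reported 2000 K/sec as KiB per second, and ignores protocol overhead and any resync rate limit configured in DRBD, so the results are lower bounds rather than predictions. The function name is made up for this example.

```python
KIB = 1024
GIB = 1024 ** 3


def full_sync_seconds(device_bytes, rate_bytes_per_sec):
    """Lower bound on the time to copy a whole device at a sustained rate."""
    if rate_bytes_per_sec <= 0:
        raise ValueError("rate must be positive")
    return device_bytes / rate_bytes_per_sec


device = 4 * GIB                # assumed size of the replicated device
observed = 2000 * KIB           # the rate seen between the two virtual machines
hoped_for = 100 * 1024 * KIB    # the 100 MByte/sec hoped for on a dedicated Gig port

print(f"observed rate:  {full_sync_seconds(device, observed) / 60:.1f} minutes")
print(f"hoped-for rate: {full_sync_seconds(device, hoped_for):.1f} seconds")
```

Under these assumptions the observed rate works out to roughly 35 minutes for the first sync and the hoped-for rate to well under a minute, so the initial sync is a one-time cost that depends mostly on the link and disks.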

The DRBD Device is only 4GB. Does this scale up automagically, or does it require manual scaling?
Would this limit voicemail storage size?
Could we use a separate SAN GFS filesystem for voicemail?

Instead of DRBD, can we use a SAN? Both devices mounted to the same SAN GFS block device. No "syncing" needed.

Where is DRBD data actually stored?

In Cluster Management, Slave shows Unreachable and down. Is that normal?

Does DRBD do any fencing? Or would fencing be independent of that?

Misc Questions

DNS Domain setting: What is this used for?

/etc/asterisk is blank on main system. Will files on DRBD be linked-to from here? Or are config files all in a different place in the HA system?

Is there a tool to migrate individual tenants from another system?

Spelling error: "HA Cluster Conigurator"

Submitted by eeman on Thu, 02/17/2011 Permalink

The HA Beta is not meant for production systems. It was meant as a beta to test a self-installing application that creates one scenario for clustering. Deviations from this scenario (such as your SAN) are not part of the scope of the HA ISO, but rather a custom-built cluster with different components. The Thirdlane MTE dual-MAC license will work with SAN storage for the cluster but no, it's not part of the ISO. Fencing is another component that would require you to either do it yourself or hire someone to do that portion. As you may or may not know, fencing with heartbeat is handled very much case by case. Every remote management device I have ever had to deal with has required its own scripts to be written for that device. This got even more complicated when utilizing redundant power supplies connected to redundant power sources for increased fault tolerance.

If you want to know what DRBD is, I suggest you peruse www.drbd.org. There's a lot of information that will explain how it works and how it replicates.

Submitted by rolavarria on Thu, 02/17/2011 Permalink

Do you know if there'll be a release of the MTE HA version for production use, and if so, when will it be?

Thanks

Submitted by George on Sat, 02/19/2011 Permalink

When is this going to be available to current customers?

Submitted by thirdlane on Mon, 02/21/2011 Permalink

There are a few reasons we did not make this a production release yet.

Firstly, we would have liked to release it with Asterisk 1.8, but as of today there is a known bug related to multiple parking lots, https://issues.asterisk.org/view.php?id=18553 (thanks Dave for finding the bug and reporting it to Digium), which we would like to see fixed before release. Another important reason is that we are still collecting feedback. There has not been enough, and since we would like to see how our approach to redundancy works in the field, we are cautious about calling it a production release. So if you'd like to speed up this process, install the beta ISO in a test environment, ask us for help if you need to, and really give it a try.

Submitted by jawaidbazyar on Mon, 02/21/2011 Permalink

Hi Alex,

When you get a chance, can you and/or your partner address my questions above? They will assist greatly in my evaluation of the HA version.

Regards,

Jawaid

Submitted by rolavarria on Fri, 02/25/2011 Permalink

I've been reading the DRBD user's guide and this is what it says in chapter 8 (http://www.drbd.org/users-guide/ch-pacemaker.html):

"Pacemaker is the direct, logical successor to the Heartbeat 2 cluster stack, and as far as the cluster resource manager infrastructure is concerned, a direct continuation of the Heartbeat 2 codebase. Since the intial stable release of Pacemaker, Heartbeat 2 can be considered obsolete and Pacemaker should be used instead."

... and in chapter 9 (http://www.drbd.org/users-guide/ch-heartbeat.html):

"This chapter talks about DRBD in combination with the legacy Linux-HA cluster manager found in Heartbeat 2.0 and 2.1. That cluster manager has been superseded by Pacemaker and the latter should be used whenever possible — please see Chapter 8, Integrating DRBD with Pacemaker clusters for more information. This chapter outlines legacy Heartbeat configurations and is intended for users who must maintain existing legacy Heartbeat systems for policy reasons."

Why did you choose heartbeat instead of pacemaker for the beta release of HA?

Thanks!

Submitted by eeman on Fri, 02/25/2011 Permalink

Because the needs are entirely simple. Technically the configuration files are those of Heartbeat 1 even though the binaries are Heartbeat 2. Of course you haven't seen the pricing yet. I'm sure we could look into using pacemaker if you are willing to accept $10k licensing fees for the cluster license alone. That's in addition to the $3k per machine licensing cost of the MTE management interface.

Submitted by rolavarria on Fri, 02/25/2011 Permalink

I'm not sure I got your answer right...

So, regarding the 10k licensing fee you're talking about, would it have to be paid to you guys for developing the whole cluster interface? I ask because I looked at http://www.clusterlabs.org/wiki/License and it says that pacemaker is free software.

Maybe I didn't explain myself well in my first question. If I were to implement it on my own, does heartbeat have any advantages over pacemaker, besides ease of configuration?

Is there another reason for choosing heartbeat over pacemaker? What would you recommend?
I'm considering doing the whole HA on my own instead of using the PBX HA release.

Thanks again!

Submitted by eeman on Fri, 02/25/2011 Permalink

Yes, the extra licensing fee is not licensing for the underlying software but for the development of the management application etc.

The PBX HA is just one of many possible scenarios to perform HA. It came about from demand by a specific subset of customers who have poor unix skills and for whatever reason did not want to hire professional services to assemble their cluster. HA clustering of MTE was accomplished long before this ISO release. I believe those demanding an ISO version somehow thought they were going to get the milk for free. They are, in fact, in for a huge realization that there is a separate cost associated with the cluster component. The price might come in less expensive than paying for professional services, but then again they are locked into a single cluster scenario. If they do not like this scenario, they should not opt for the purchase.

If you are a competent unix/linux administrator with some clustering experience you will be able to build a cluster for yourself that is superior to what the PBX HA version provides. This is due to the fact that your cluster will be entirely tailored to your needs whereas bulk ISO releases are somewhat bland, watered down versions so that 'one size fits many'.

When taking your own approach your choices are quite expansive. There is no rule that you need to use DRBD at all. Given the financial backing you could opt for an iSCSI SAN on a backside 10Gbps Ethernet segment. As far as the piece that monitors availability of resources, feel free to use Heartbeat v1, v2, or anything else.

Submitted by nimit on Tue, 08/28/2012 Permalink

Respected Sir,

Alex Epshteyn

I am Nimit Gajjar from the VOIP OFFICE company. We are currently your customer, and I am looking for a failover solution for my T-L boxes. I plan to configure my servers as below.

1. USA server: T-L Box 1, real IP with domain name (Location 1)
2. Canada server: T-L Box 2, real IP with domain name (Location 2)

If my USA server goes down, all of my customers' phones should automatically connect, without any changes, to my Canada server (T-L Box 2, Location 2).

I also want to use only one domain name for this.

So is this possible with our Thirdlane box or not?

Regards
Nimit Gajjar
VOIP OFFICE.

Submitted by eeman on Tue, 08/28/2012 Permalink

It's not a Thirdlane issue; it's a network issue you face.

1) You need a MINIMUM 45Mbps point-to-point connection between the two PBXs to replicate data. If you think you can go cheap and just install a T1, then you are in for a painful awakening when the load on the primary PBX backs up so high that it stops processing calls.

2) Your providers at both locations have to have the ability to use THE EXACT SAME IP ADDRESS at both locations. If your PBX's IP is 69.64.22.12, then that has to be able to become the IP on your standby PBX, regardless of where you host it.

If you cannot meet these two requirements then your answer is a resounding no.
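
To put the first requirement in numbers, here is a minimal sketch of how unreplicated data piles up when the primary writes faster than the link can carry. A T1 runs at 1.544 Mbps; the 45 Mbps figure is the minimum given above. The 10 Mbps write rate is purely a made-up illustration, not a measurement of any Thirdlane system, and the model ignores protocol overhead and however the replication layer applies back-pressure.

```python
T1_BPS = 1_544_000        # T1 line rate in bits per second
LINK_45_BPS = 45_000_000  # the 45 Mbps minimum from the post above


def backlog_bits(write_bps, link_bps, seconds):
    """Bits still waiting to be replicated after `seconds` of sustained writes."""
    # If the link keeps up, nothing accumulates.
    return max(0, write_bps - link_bps) * seconds


WRITE_BPS = 10_000_000    # hypothetical sustained write rate on the primary

for name, link in (("T1", T1_BPS), ("45 Mbps", LINK_45_BPS)):
    megabytes = backlog_bits(WRITE_BPS, link, 60) / 8 / 1_000_000
    print(f"{name}: {megabytes:.1f} MB behind after one minute")
```

In this model the T1 falls about 63 MB behind every minute while the 45 Mbps link keeps up, which is the kind of backlog the first point warns about.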

Submitted by nimit on Wed, 08/29/2012 Permalink

Respected Sir,

Thank you very much for your reply. I am very happy with this forum.

So, in your view, what is the best way to build a failover HA solution for Asterisk servers?

Thanks
Nimit Gajjar
