
Datasync redundancy

SOLVED
UnboundID tamAtping
UnboundID
Hi all,

 

Directory can be deployed in a master-master configuration; DataGov is master-slave, with an algorithm to determine the best master.
 
How is DataSync deployed? It makes sense that there's only 1 operating DataSync server. I was hoping it can be deployed as active-passive, so if the active server fails we can promote the passive server. There is metadata that needs to be retained in case of a failed sync server, such as changelog numbers.
 
Thanks!
Tam
10 REPLIES
UnboundID PhilipP
UnboundID
Solution

Re: Datasync redundancy

You got it right. Sync is Active/Passive.
There is a regular handshake to inform the passive instance(s) that the
"master" is still active. That handshake includes basic data on the last
change number for each endpoint.

"Promotion" is automatic. You specify the order if there is more than one
passive instance.
UnboundID ArnoL
UnboundID
Solution

Re: Datasync redundancy

Hi Tam,

  this is already supported.

When you set up the second Sync server, just point it to the first server.

Done.

 

It will wait as a passive node until the master dies.

It automatically picks up where the master left things off.

Each sync pipe has state information propagated from master to slaves for that purpose.

UnboundID tamAtping
UnboundID

Re: Datasync redundancy

Thanks guys. Exactly what I need to know prior to a meeting I'm about to join!

 

Cheers

UnboundID tamAtping
UnboundID

Re: Datasync redundancy

Hi guys

 

How does the promotion work? Could we ever have a scenario where we could have 2 active syncs?

 

Say the passive node can't talk to the active node, and the active can't talk to the passive. How does either one know whether the master has actually failed, and thus whether the passive should be promoted?

 

Alternatively, is there a way to disable automatic promotion and make it manual?

 

Thanks

UnboundID ArnoL
UnboundID

Re: Datasync redundancy

However unlikely, it _is_ possible to have two (or more) instances of sync active at the same time.

 

That said, even if or when that happens, the instances would end up attempting the same reconciliation operations between the source(s) and destination(s) with slightly off timings.

The first instance would effect reconciliation.

The remaining instances would end up observing the reconciliation already effected and move on.

 

There is no way to make promotion manual and I truly believe it is better that way.

UnboundID tamAtping
UnboundID

Re: Datasync redundancy

Our customer has had to deal with concurrent creates in the past, and it seems to be an issue they are passionate about, even though it is a rare occurrence and highly dependent on network issues.

 

To hypothesize, let's say we have 2 datacenters, each with one DataSync engine: the active one in Site A and the passive one in Site B. It wouldn't be uncommon for the sites to lose connectivity for a host of network-related reasons. Would we, in this instance, have 2 active DataSync engines running?

 

From the eyes of the customer, it would be better if the passive datasync engine did not activate in this scenario. 

 

Thanks

 

UnboundID ArnoL
UnboundID

Re: Datasync redundancy

In that case, if the passive instance is so passive it actually doesn't automatically kick in, I am not sure they need a "passive" instance at all then.

 

If the two sites were partitioned but Sync continued to operate normally, replication would catch up between sites when the partition is resolved, assuming the destination is Ping Directory.

 

If they want to be able to resume synchronization manually from a certain point in time, Sync supports this: you can manually set the timestamp or change number to resume at, assuming the Sync source type supports it. Out of the box, this scenario is supported for Sun DSEE and Ping Directory.

This would allow them to control how to sync after a network partition where the active instance of Sync would have become unavailable.

By doing this, they are effectively giving up High Availability for Synchronization though.
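
As a rough illustration of what resuming at a chosen change number means, here is a small Python sketch. It is not the Sync tooling itself: the `Change` tuple shape, the `changes_from` helper, and the sample changelog are hypothetical and exist only to show the idea of replaying everything after a manually chosen start point.

```python
from typing import Iterable, List, Tuple

# Hypothetical shape of a changelog record: (change number, description).
Change = Tuple[int, str]


def changes_from(changelog: Iterable[Change], start_after: int) -> List[Change]:
    """Return the changes numbered above start_after, in changelog order."""
    return sorted(c for c in changelog if c[0] > start_after)


changelog = [
    (103, "add uid=b"),
    (101, "add uid=a"),
    (102, "modify uid=a"),
]

# The operator decides that everything up to change 101 was already synced.
pending = changes_from(changelog, start_after=101)
```

Picking `start_after` too high would skip changes, which is one reason the manual approach trades availability for control.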

UnboundID tamAtping
UnboundID

Re: Datasync redundancy

Thanks 

 

You're right, it's not technically "passive". It's more of a stand-by. The only reason the stand-by instance needs to be on is to synchronize timestamps, changelog numbers, and other metadata. Otherwise, they'd be happy to keep the stand-by instance off.

 

In their view, high availability is not as important as control. 

 

Historically, they've had issues with concurrency where 2+ records were created at exactly the same time. They'd have 2 Sun DS records, from what I've been told, and this causes them a lot of grief. It must have occurred at extreme load.

 

They're concerned that 2 sync processes would trigger at the same time, with one sync server replicating information to Sun DS #1 and the other to Sun DS #2, and that there might be issues reconciling the two during directory replication.

 

 

The workaround solution seems like it would suffice. Is there an easy way to retain the timestamp/change number settings of the active server?

 

Cheers!

UnboundID PhilipP
UnboundID

Re: Datasync redundancy

I think your customer needs to understand better how sync works.
Even if two (or more) sync servers run at the same time, the end result is
just more work being done than is necessary.

It will not arbitrarily create new entries (unless it is misconfigured).

It ALWAYS reads the source entry and the destination entry, compares them
and applies any diff between the two.

If it goes to create an entry and finds it already exists, it will do
nothing (but log an error) by default.
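
The read, compare, and apply-the-diff behaviour described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not how Sync is actually implemented: the `reconcile` and `create` helpers and the dictionary-based "directories" are hypothetical, and deletes and attribute removal are not modeled.

```python
import logging
from typing import Dict

log = logging.getLogger("sync-sketch")
Entry = Dict[str, str]


def create(dn: str, entry: Entry, dest: Dict[str, Entry]) -> bool:
    # A create that finds the entry already present changes nothing
    # and only logs an error.
    if dn in dest:
        log.error("entry %s already exists; leaving it untouched", dn)
        return False
    dest[dn] = dict(entry)
    return True


def reconcile(dn: str, source: Dict[str, Entry], dest: Dict[str, Entry]) -> str:
    src = source.get(dn)
    if src is None:
        return "skip"  # deletes are not modeled in this sketch
    dst = dest.get(dn)
    if dst is None:
        return "created" if create(dn, src, dest) else "no-op"
    # Compare source and destination, then apply only the difference.
    diff = {k: v for k, v in src.items() if dst.get(k) != v}
    if not diff:
        return "no-op"
    dst.update(diff)
    return "modified"


source = {"uid=tam": {"cn": "Tam", "mail": "tam@example.com"}}
dest: Dict[str, Entry] = {}

first = reconcile("uid=tam", source, dest)    # first instance does the work
second = reconcile("uid=tam", source, dest)   # second instance finds nothing to do
late_create = create("uid=tam", source["uid=tam"], dest)  # racing create is refused
```

Running the same reconciliation twice leaves a single destination entry, which is the sense in which a second active instance only does redundant work.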

Playing games with the way it is designed to work and manually doing stuff
is much more likely to cause problems.

Philip