
Working with Replication Conflicts

UnboundID KevinL

Overview

This KB article focuses on replication conflicts, which can occur in several situations during replication. Because replication in the UnboundID products is eventually consistent, operations performed on different servers in the topology may conflict with each other. This article assumes a good working knowledge of replication in the UnboundID Data Store. If you do not have that, it would be helpful to first read the Managing Replication chapter in the UnboundID Data Store Administration Guide. The documentation also covers replication conflicts, so this KB focuses on how to create them and then fix them, so that you can get familiar with understanding and working with these issues.

 

Here are some high level points about Replication Conflicts:

 

  • Each Directory Server is responsible for handling conflict resolution at the point at which it receives changes from a Replication Server.
  • If a conflict arises in which an attribute is modified at the same time on two different replicas, the replication server uses the most recent change based on its timestamp. The older modification is ignored, and the latest change is the one that should be reflected across all servers in the topology.
  • If a conflict arises in which two entries with the same DN are added at the same time on two different replicas, the replication conflict resolution will keep the oldest entry DN (earliest createTimeStamp) and rename the most recent entry DN by adding the entryUUID attribute to the RDN of the entry. In addition an objectClass value is added to the entry to flag this entry as a replication conflict.  Replication conflicts are not visible to standard LDAP client operations.
  • Any unresolved conflict generates an administrative alert, which is logged in the errors log or sent out via other alerting mechanisms such as SNMP or SMTP.
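
The "earliest createTimeStamp wins" rule for conflicting adds can be illustrated with a small shell sketch. This is only an illustration of the ordering, not the server's implementation: the helper name older_timestamp is hypothetical, and it assumes both values use the same fixed-width UTC format, so that plain string ordering matches chronological ordering.

```shell
# Hypothetical helper: print whichever of two createTimeStamp values is earlier.
# Assumes both values use the same fixed-width UTC format (YYYYMMDDhhmmss.SSSZ),
# so that byte-wise string ordering matches chronological ordering.
older_timestamp() {
  printf '%s\n%s\n' "$1" "$2" | LC_ALL=C sort | head -n 1
}

# Two sample createTimeStamp values; the earlier one identifies the entry
# that would keep its original DN.
older_timestamp 20160521022353.823Z 20160521022151.614Z
```

The entry carrying the timestamp that this prints keeps its DN; the other entry is the one that would be renamed and flagged as a conflict.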

 

The topology we are going to use with this KB article will be 3 instances of DS that are installed on the same physical server. These will be small instances so not a lot of memory is required.


 

Replication Conflicts

In this KB we will simulate the effect of adding an entry to two servers in the topology at the same time (or as close to it as possible) and see how replication handles this scenario.

 

In order to accomplish this we will actually have to stop the directory servers one at a time to simulate a network outage between the two servers so that replication cannot occur when the adds happen.

 

 

  1. Shutdown DS2 & DS3 instances:

    ds2/bin/stop-ds
    ds3/bin/stop-ds

     

  2. Add the following entry to DS1

    cd ds1
    bin/ldapmodify -p 1189 -D "cn=directory manager" -w secret123 -a <<+
    dn: uid=user.2002,ou=People,dc=example,dc=com
    objectClass: top
    objectClass: person
    objectClass: organizationalPerson
    objectClass: inetOrgPerson
    uid: user.2002
    givenName: Romina
    cn: Romina Valerio
    sn: Valerio
    mail: user.2002@example.com
    userPassword: password123
    +

     

  3. Shutdown DS1

    bin/stop-ds

     

  4. Startup DS2

    cd ../ds2
    bin/start-ds

     

  5. Add the same entry to DS2 (copy all the lines together, paste them, and then press Enter).

    bin/ldapmodify -p 1289 -D "cn=directory manager" -w secret123 -a <<+
    dn: uid=user.2002,ou=People,dc=example,dc=com
    objectClass: top
    objectClass: person
    objectClass: organizationalPerson
    objectClass: inetOrgPerson
    uid: user.2002
    givenName: Romina
    cn: Romina Valerio
    sn: Valerio
    mail: user.2002@example.com
    userPassword: password123
    +

     

  6. Start DS1

    cd ../ds1
    bin/start-ds

     

  7. Tail the error log on DS2

    cd ../ds2
    tail -30 logs/errors
    
    [20/May/2016:21:25:00.035 -0500] threadID=199 category=EXTENSIONS severity=SEVERE_ERROR msgID=1880359005 msg="Administrative alert type=replication-unresolved-conflict id=5761ff92-3298-4cc5-be9c-81e2c9ed00fe class=com.unboundid.directory.server.replication.plugin.ReplicationDomain msg='An unresolved conflict was detected for DN uid=user.2002,ou=People,dc=example,dc=com. The conflicting entry has been renamed to entryuuid=6201e4a6-b312-440f-89f2-032b4db6c72b+uid=user.2002,ou=People,dc=example,dc=com'"

     

  8. Now you can search for that entry in the directory server using the --control return-conflict-entries option.

    bin/ldapsearch -p 1189 -D "cn=directory manager" -w secret123 --control return-conflict-entries -b ou=people,dc=example,dc=com -s sub "(uid=user.2002)" "*" createTimeStamp 
    
    dn: uid=user.2002,ou=people,dc=example,dc=com
    objectClass: top
    objectClass: person
    objectClass: organizationalPerson
    objectClass: inetOrgPerson
    givenName: Romina
    uid: user.2002
    cn: Romina Valerio
    sn: Valerio
    userPassword: {SSHA}9iq1r7/L0V0W99xOv52fIBpcUf5tFZ8p+7JNhA==
    mail: user.2002@example.com
    createTimeStamp: 20160521022151.614Z
    
    dn: entryuuid=6201e4a6-b312-440f-89f2-032b4db6c72b+uid=user.2002,ou=people,dc=example,dc=com
    objectClass: top
    objectClass: person
    objectClass: organizationalPerson
    objectClass: inetOrgPerson
    objectClass: ds-sync-conflict-entry
    givenName: Romina
    uid: user.2002
    cn: Romina Valerio
    sn: Valerio
    userPassword: {SSHA}51gxEorGe1WnL0gcsf0BpSCKHxSNUB2nd5YvSQ==
    mail: user.2002@example.com
    createTimeStamp: 20160521022353.823Z

     

    You can see that the renamed entry has the entryUUID added to its RDN. The presence of the ds-sync-conflict-entry objectClass also means that the server hides this entry from normal operations. If you run the same search without the --control option, it returns only the first entry above. You can also search with the filter "(objectClass=ds-sync-conflict-entry)" to bring back just the replication conflict entries. The search above, though, also lets you see the original entry (earliest createTimeStamp).

    You can also see that the passwords are hashed to different values. This is the reason that the server could not automatically resolve this replication conflict. You will also see that this replication conflict exists on all of your servers, and that it is the same entry on each server. This is because of the way that we added the entries, and it is probably the normal case. There can be scenarios where a replication conflict entry exists on only one of the servers and not on the others.

  9. In addition, you can search the data in the directory server to find conflict entries without relying on the information in the error logs. You can request just the "ds-sync-conflict" attribute, which tells you which entry in the normal data this conflict entry is in conflict with.

    bin/ldapsearch -p 1189 -D "cn=directory manager" -w secret123 --control return-conflict-entries -b ou=people,dc=example,dc=com -s sub "objectclass=ds-sync-conflict-entry" "ds-sync-conflict"

    dn: entryuuid=6201e4a6-b312-440f-89f2-032b4db6c72b+uid=user.2002,ou=people,dc=example,dc=com
    ds-sync-conflict: uid=user.2002,ou=people,dc=example,dc=com

    With this information you can now compare each of these entries with each other to determine which entry needs to be retained and which should be removed or modified.
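
If you prefer to start from the errors log, the renamed DN can be pulled out of the alert text with standard tools. The following is a minimal sketch, assuming the alert message has the same wording as the sample in step 7; the helper name conflict_dns is hypothetical and is not part of the product.

```shell
# Hypothetical helper: read a server errors log on stdin and print the DN that
# each replication-unresolved-conflict alert says the entry was renamed to.
# Assumes the message wording matches the sample alert shown in step 7, and
# that the renamed DN itself contains no single-quote character.
conflict_dns() {
  sed -n "/replication-unresolved-conflict/s/.*has been renamed to \([^']*\).*/\1/p"
}

# Usage, from an instance directory:
#   conflict_dns < logs/errors
```

Each DN it prints can then be used as the search base or delete target in the steps that follow.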

Replication Conflict Repair

Now that we have determined there is a replication conflict entry, we will need to do something to resolve this. Since we know in this case that both entries are the same we can simply remove the replication conflict entry.

  1. Let's get the DN of the conflict entry that we want to delete.

    bin/ldapsearch -p 1189 -D "cn=directory manager" -w secret123 --control return-conflict-entries -b ou=people,dc=example,dc=com -T -s sub "(objectclass=ds-sync-conflict-entry)" dn
    
    dn: entryuuid=6201e4a6-b312-440f-89f2-032b4db6c72b+uid=user.2002,ou=people,dc=example,dc=com

     

    Note that we added the -T option, which tells ldapsearch not to wrap the results, and that we added dn to the end of the search because we only need the DN of the entry.

  2. It is best to check each instance in your topology and make sure that this conflict entry exists on each server. In some cases this conflict entry may only exist on one server. In that case we would only need to delete this entry from the one instance. 
  3. Because we are simply going to delete this entry, and the conflict entry may reside on other instances, we do not have to use any special replication repair controls.


    bin/ldapdelete -p 1189 -D "cn=directory manager" -w secret123 <<+
    entryuuid=6201e4a6-b312-440f-89f2-032b4db6c72b+uid=user.2002,ou=people,dc=example,dc=com
    +
    
    Processing DELETE request for entryuuid=6201e4a6-b312-440f-89f2-032b4db6c72b+uid=user.2002,ou=people,dc=example,dc=com
    DELETE operation successful for DN entryuuid=6201e4a6-b312-440f-89f2-032b4db6c72b+uid=user.2002,ou=people,dc=example,dc=com
    


    Do not include the dn: prefix on the line that names the entry to be deleted.

    ** If you want to delete this entry from only this one instance, without replicating the delete operation (because the entry does not exist on the other servers), add the "--control replication-repair" option to the above command. **

  4. Now the entry is deleted from the directory and the delete operation also replicated to other servers in the case that it existed there.

  5. Now let's do a final check to make sure there are no conflicts in the directory.

    bin/ldapsearch -p 1189 -D "cn=directory manager" -w secret123 --control return-conflict-entries -b ou=people,dc=example,dc=com -T -s sub "(objectclass=ds-sync-conflict-entry)" dn
    


    This search should return no entries.  You should also run this search across all other servers to ensure that they are all clean.

  6. Let's also do a search against our monitor entry to see what it shows now.

    bin/ldapsearch -p 1189 -D "cn=directory manager" -w secret123 -b "cn=monitor" -s sub "(&(objectClass=ds-replica-monitor-entry)(base-dn=dc=example,dc=com))" "*"
    
    dn: cn=Replica dc_example_dc_com,cn=monitor
    resolved-naming-conflicts: 1
    conflict-entry-count: 0
    


    So another way to find out whether there are any replication conflicts in your systems is to check the conflict-entry-count for each backend in your directory server.
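
The same check can be repeated on every instance with a short loop. This is a sketch only: the helper names conflict_entry_count and check_all_servers are hypothetical, the ports are the three LDAP ports used in this article, and the ldapsearch arguments are the ones from the monitor search above.

```shell
# Hypothetical helper: read ldapsearch LDIF output on stdin and print the
# value of conflict-entry-count from each replica monitor entry.
conflict_entry_count() {
  sed -n 's/^conflict-entry-count: *//p'
}

# Sketch of a per-server check using the three LDAP ports from this article.
# It is only defined here, not invoked; run it from an instance directory
# such as ds1, with all three servers started.
check_all_servers() {
  for port in 1189 1289 1389; do
    count=$(bin/ldapsearch -p "$port" -D "cn=directory manager" -w secret123 \
      -b "cn=monitor" -s sub \
      "(&(objectClass=ds-replica-monitor-entry)(base-dn=dc=example,dc=com))" \
      conflict-entry-count | conflict_entry_count)
    echo "port $port: conflict-entry-count=$count"
  done
}
```

A value other than 0 on any port indicates that the instance still holds at least one conflict entry.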


Automatic Replication Conflict Resolution

In this section we are going to create an entry on both servers as we did above, only this time the entries won't have passwords. In this case the entries should compare as identical on both servers.

 

  1. Shutdown DS2

    cd ../ds2
    bin/stop-ds

     

  2. Add the following entry to DS1

    cd ../ds1
    
    bin/ldapmodify -p 1189 -D "cn=directory manager" -w secret123 -a <<+
    dn: uid=user.3001,ou=People,dc=example,dc=com
    objectClass: top
    objectClass: person
    objectClass: organizationalPerson
    objectClass: inetOrgPerson
    uid: user.3001
    givenName: Romina
    cn: Romina Valerio
    sn: Valerio
    mail: user.3001@example.com
    +
    

     

  3. Stop DS1

    bin/stop-ds

     

  4. Start DS2

    cd ../ds2
    bin/start-ds

     

  5. Add the entry to DS2

    bin/ldapmodify -p 1289 -D "cn=directory manager" -w secret123 -a <<+
    dn: uid=user.3001,ou=People,dc=example,dc=com
    objectClass: top
    objectClass: person
    objectClass: organizationalPerson
    objectClass: inetOrgPerson
    uid: user.3001
    givenName: Romina
    cn: Romina Valerio
    sn: Valerio
    mail: user.3001@example.com
    +

     

  6. Start DS1

    cd ../ds1
    bin/start-ds

     

  7. Check the error logs.

    tail -30 logs/errors


    You should not see any messages about replication conflicts.

  8. Search for the entry you added using the --control option.

    bin/ldapsearch -p 1189 -D "cn=directory manager" -w secret123 --control return-conflict-entries -b ou=people,dc=example,dc=com -s sub "(uid=user.3001)" "*" createTimeStamp 
    
    dn: uid=user.3001,ou=people,dc=example,dc=com
    objectClass: top
    objectClass: person
    objectClass: organizationalPerson
    objectClass: inetOrgPerson
    givenName: Romina
    uid: user.3001
    cn: Romina Valerio
    sn: Valerio
    mail: user.3001@example.com
    createTimeStamp: 20160521023830.870Z

     

  9. Now we can search some of the monitoring information to see what the server statistics show with respect to replication conflicts. We will search the cn=monitor branch and look for the entries that represent the Replica view of the server. In this case these are the entries that have the objectClass ds-replica-monitor-entry. We also add the base-dn of the backend we want to see this information for; otherwise we would see all backends that are replicated (which we might want to).

    bin/ldapsearch -p 1189 -D "cn=directory manager" -w secret123 -b "cn=monitor" -s sub "(&(objectClass=ds-replica-monitor-entry)(base-dn=dc=example,dc=com))" conflict-entry-count resolved-naming-conflicts
    


    This information is stored on the Replica entry in the monitor backend.

    dn: cn=Replica dc_example_dc_com,cn=monitor
    base-dn: ou=people,dc=example,dc=com
    resolved-naming-conflicts: 1
    conflict-entry-count: 1
    


    So you can see that the server is tracking these stats which you can report on as well.

    To see all of the attributes of the Replica monitor entry you can issue the following search.

    bin/ldapsearch -p 1189 -D "cn=directory manager" -w secret123 -b "cn=monitor" -s sub "(&(objectClass=ds-replica-monitor-entry)(base-dn=dc=example,dc=com))" "*"
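
Because the automatic-resolution test depends on both servers receiving exactly the same entry, it can help to generate the LDIF once rather than pasting it twice. The following is a minimal sketch; the helper name make_user_ldif is hypothetical, and the attributes simply mirror the user.3001 entry used above (without a userPassword).

```shell
# Hypothetical helper: print an LDIF add record for a test user, mirroring
# the attributes of the user.3001 entry in this section (no userPassword).
make_user_ldif() {
  user_id=$1
  cat <<EOF
dn: uid=${user_id},ou=People,dc=example,dc=com
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
uid: ${user_id}
givenName: Romina
cn: Romina Valerio
sn: Valerio
mail: ${user_id}@example.com
EOF
}

# Usage, piping the same record to each server in turn:
#   make_user_ldif user.3001 | bin/ldapmodify -p 1189 -D "cn=directory manager" -w secret123 -a
```

Generating the record from one place removes the risk of a stray difference between the two adds, which would otherwise change the outcome of the test.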
    

Appendix: Creating Data Store configuration for this use case.

 

For the purposes of this article, the following steps create a deployment of 3 UnboundID Directory Servers to illustrate how to handle and work with replication conflicts as discussed above. These instances are all installed on the same physical server, so they require different ports for both LDAP and replication. We will use the following ports for this example:

 

DS1:  LDAP = 1189 / Replication = 1190

DS2:  LDAP = 1289 / Replication = 1290

DS3:  LDAP = 1389 / Replication = 1390

 

  1. Unzip the software for our 3 directory servers.

    cd /{install_folder}
    unzip -qq UnboundID-DS-5.2.0.2.zip
    mv UnboundID-DS ds1
    cp -r ds1 ds2
    cp -r ds1 ds3

    Note: You can only copy the binary folder as above before running the setup command on the server. At this point the software is not configured, so copying it is the same as unzipping the software again for each directory instance.
  2. Now we will run the setup command to create the initial configuration for each data store. On the first data store we will load 2000 example entries into the backend database.

    cd /{install_folder}/ds1
    ./setup --cli --acceptLicense --baseDN dc=example,dc=com \
    --no-prompt --ldapPort 1189 --sampleData 2000 \
    --rootUserDN "cn=directory manager" --rootUserPassword secret123 \
    --maxHeapSize 512m
  3. On the subsequent servers we will only add the base entry of dc=example,dc=com, as we will be initializing these servers from ds1 when we set up replication.

    cd /{install_folder}/ds2
    ./setup --cli --acceptLicense --baseDN dc=example,dc=com \
    --no-prompt --ldapPort 1289 --addBaseEntry \
    --rootUserDN "cn=directory manager" --rootUserPassword secret123 \
    --maxHeapSize 512m

    cd /{install_folder}/ds3
    ./setup --cli --acceptLicense --baseDN dc=example,dc=com \
    --no-prompt --ldapPort 1389 --addBaseEntry \
    --rootUserDN "cn=directory manager" --rootUserPassword secret123 \
    --maxHeapSize 512m
  4. For the first replication setup, we will enable replication between ds1 and ds2

    cd /{install_folder}/ds1

    bin/dsreplication enable \
    --host1 `hostname -f` --port1 1189 --replicationPort1 1190 \
    --bindDN1 "cn=Directory Manager" --bindPassword1 secret123 --location1 SITE1 \
    --host2 `hostname -f` --port2 1289 --replicationPort2 1290 \
    --bindDN2 "cn=Directory Manager" --bindPassword2 secret123 --location2 SITE1 \
    --baseDN dc=example,dc=com --adminUID admin \
    --adminPassword secret123 --no-prompt --ignoreWarnings
  5. Next we will add ds3 to the replication topology:

    bin/dsreplication enable \
    --host1 `hostname -f` --port1 1189 --replicationPort1 1190 \
    --bindDN1 "cn=Directory Manager" --bindPassword1 secret123 --location1 SITE1 \
    --host2 `hostname -f` --port2 1389 --replicationPort2 1390 \
    --bindDN2 "cn=Directory Manager" --bindPassword2 secret123 --location2 SITE2 \
    --baseDN dc=example,dc=com --adminUID admin \
    --adminPassword secret123 --no-prompt --ignoreWarnings
  6. Check the replication status:

    bin/dsreplication status -p 1189 --adminUID admin \
        -w secret123 --no-prompt --showall --displayservertable

    Note that we used the global admin account, because this is guaranteed to have access to all servers in the topology.

    Take a look at the replication log (logs/replication)

    You will see notification that generation IDs are not equal, and so replication is currently suspended.

    You will also notice that admin and schema data has been exported. This is used to sync both directories with identical data in these backends. You don't have to set up replication for schema and admin data; it's automatic.

  7. The next step is to initialize ds2 with the user data from ds1.

    bin/dsreplication initialize \
    --hostSource `hostname -f` --portSource 1189 \
    --hostDestination `hostname -f` --portDestination 1289 \
    --baseDN dc=example,dc=com --adminUID admin \
    --adminPassword secret123 --no-prompt

    Check logs/replication on ds2 and you will see the messages indicating initialization.

    Re-run the dsreplication status command. You will now see both servers having the same generation ID and entry count.

  8. We will now test replication. Start by modifying user.1999 to replace the attribute description with text indicating a modification made from ds1:

    ./bin/ldapmodify -p 1189 -D "cn=directory manager" -w secret123 <<+
    dn: uid=user.1999,ou=People,dc=example,dc=com
    changeType: modify
    replace: description
    description: This is a modify done on DS1
    +
    
  9. Check that this change has replicated to ds2:

    ./bin/ldapsearch -p 1289 -D "cn=directory manager" -w secret123 \
        -b dc=example,dc=com -s sub "(uid=user.1999)" dn description
  10. Next modify user.1998, indicating a change made from ds2:

    ./bin/ldapmodify -p 1289 -D "cn=directory manager" -w secret123 <<+
    dn: uid=user.1998,ou=People,dc=example,dc=com
    changeType: modify
    replace: description
    description: This is a modify done on DS2
    +
  11. Use ldapsearch to check on ds1 to ensure the change was reflected there:

    ./bin/ldapsearch -p 1189 -D "cn=directory manager" -w secret123 \
        -b dc=example,dc=com -s sub "(uid=user.1998)" dn description
  12. Now let's add our 3rd server into the replication topology:

    ./bin/dsreplication enable --host1 `hostname -f` --port1 1189 \
    --bindDN1 "cn=Directory Manager" --bindPassword1 secret123 \
    --replicationPort1 1190 \
    --host2 `hostname -f` --port2 1389 \
    --bindDN2 "cn=Directory Manager" --bindPassword2 secret123 \
    --replicationPort2 1390 --baseDN dc=example,dc=com --adminUID admin \
    --adminPassword secret123 --no-prompt --ignoreWarnings
  13. Run dsreplication status again to see that the third server has been added to the topology (but not yet initialized).

  14. The next step is to initialize ds3 with the user data from ds1.

    bin/dsreplication initialize \
    --hostSource `hostname -f` --portSource 1189 \
    --hostDestination `hostname -f` --portDestination 1389 \
    --baseDN dc=example,dc=com --adminUID admin \
    --adminPassword secret123 --no-prompt
  15. Run dsreplication status again. You will now see all three instances with the same generation ID and the same entry count.
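
Because replication is eventually consistent, a check run immediately after a modify may not see the change yet. The following sketch shows one way to poll for it; the helper name retry is hypothetical and is not part of the product, and the attempt count and delay are arbitrary.

```shell
# Hypothetical helper: run a command until it succeeds or the attempts run out.
#   retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while :; do
    "$@" && return 0
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Usage: wait up to about 10 seconds for the DS1 modify to appear on DS2.
#   retry 10 1 sh -c 'bin/ldapsearch -p 1289 -D "cn=directory manager" -w secret123 \
#     -b dc=example,dc=com -s sub "(uid=user.1999)" description \
#     | grep -q "This is a modify done on DS1"'
```

Polling with a bounded number of attempts keeps a scripted check from either failing too early or hanging when replication is not working.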


 

 

VERSION 10112016
Copyright © UnboundID 2016