Hello,
I have pinned down a duplication issue caused by a limitation in the current SlowSync implementation.
This type of sync will be forced if the client and server sync anchors do not match, for example if the
previous sync was interrupted.
To understand the problem one must first look at the steps involved:
After detecting that SlowSync is needed, the content_map entries for the involved client are all deleted.
Next, the client uploads all of its entries to the server, and every single client entry is passed to the corresponding egwsync_search($content, $contenttype) function.
In this function, the search($content) function of the application that stores the data (addressbook, infolog, calendar) is called.
At this point we run into the problem:
The search function will return the contentid of the first (and only the first!) matched entry.
Say, for example, there are very similar (or even duplicate) entries that the syncing user can read
(and that will match): only the first one found will be returned. First of all, the found content
might not be the entry that actually corresponds to the one on the client. Secondly, in the second
step of SlowSync the server sends the client all items that are not in the content_map (which had
been cleared before the search started), so the unmatched entries go to the client again.
At this point we have a duplicate on the client side, plus an entry on the client that has no row
in the content_map on the server (only the last matched content-id is kept in the table, to keep
entries unique).
Every subsequent SlowSync will now add more duplicates to the client. And if such a duplicate
(which has no entry in the map table) is modified on the client side, even a standard TwoWaySync
will add a duplicate on the server.
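To make this concrete, here is a minimal, self-contained simulation of the behaviour (the data and
the search_first() stand-in are made up; the real search() lives in the respective application):

    <?php
    // Server holds two duplicate contacts; the client holds matching copies.
    $server = array(17 => 'John Doe', 42 => 'John Doe');
    $client = array('C1' => 'John Doe', 'C2' => 'John Doe');

    // Today's search(): returns the FIRST matching id only.
    function search_first(array $server, $content)
    {
        foreach ($server as $id => $c) {
            if ($c === $content) return $id;
        }
        return false;
    }

    // SlowSync step 1: the content_map for this client is cleared.
    $map = array();   // content_id => locid (one row per content_id)

    // SlowSync step 2: every uploaded client entry is matched via search.
    // (In this toy both entries match, so the result is never false.)
    foreach ($client as $locid => $content) {
        $map[search_first($server, $content)] = $locid;   // C2 overwrites C1!
    }

    // SlowSync step 3: every server entry without a map row goes to the client.
    foreach ($server as $id => $content) {
        if (!isset($map[$id])) {
            echo "sending id $id to the client -> duplicate\n";   // id 42
        }
    }

Note that after this run C1 no longer has a map row either, which is exactly why the next SlowSync
adds yet another duplicate.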
I believe the solution to this is as follows:
-do NOT delete the content_map in the first step
-for each uploaded entry, see if the entry referenced by its content_map row is equal to the sent entry.
If yes, update the content_map; if not, delete the map entry (it is no longer correct) and
import the record as a new record (with a new content_map entry)
-after all records are processed, delete all content_map entries that were not touched.
-send the entries that are not in the content_map down to the client.
This will stop the duplication while still making sure that no information is lost; a minimal
sketch of the idea follows below.
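Again as a self-contained simulation (the data, the ids and the locid => content-id layout of the
map are assumptions on my side; the real code would of course operate on the content_map table):

    <?php
    // Starting state: the map is kept, NOT cleared. C9 is a stale row.
    $server = array(17 => 'John Doe', 42 => 'John Doe');
    $client = array('C1' => 'John Doe', 'C2' => 'John Doe', 'C3' => 'Jane Roe');
    $map    = array('C1' => 17, 'C2' => 42, 'C9' => 99);  // locid => content_id

    $next_id = 100;
    $seen    = array();
    foreach ($client as $locid => $content) {
        $ok = isset($map[$locid])
            && isset($server[$map[$locid]])
            && $server[$map[$locid]] === $content;
        if (!$ok) {
            // stale or missing mapping: import as a new record and remap
            unset($map[$locid]);
            $server[$next_id] = $content;
            $map[$locid] = $next_id++;
        }
        $seen[$locid] = true;
    }

    // Drop map rows for client ids that were not reported in this sync.
    $map = array_intersect_key($map, $seen);

    // Finally, send server entries that still have no map row to the client.
    $mapped = array_flip($map);
    foreach ($server as $id => $content) {
        if (!isset($mapped[$id])) {
            echo "sending id $id to the client\n";
        }
    }

Running this, nothing is re-sent to the client: the two mapped entries are verified in place,
'Jane Roe' is imported exactly once, and the stale C9 row is purged.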
Additionally, I have looked at the current Horde code; they seem to do the same.
The only question for me now is how the search function could be modified without having to
go through the entire groupware.
Basically, the search functions will either need to accept a second parameter (the content-id) and
return just a single matching guid (or false if there is no match), or they will need to return an
array of all matching entries. Both options are sketched below.
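As a rough sketch against a toy in-memory store (the $contentid parameter and the search_all()
name are my assumptions, not existing API):

    <?php
    class toy_addressbook
    {
        private $entries = array(17 => 'John Doe', 42 => 'John Doe');

        // Option 1: an optional second parameter restricts the search to one
        // known entry; returns that single matching id, or false otherwise.
        public function search($content, $contentid = false)
        {
            if ($contentid !== false) {
                return (isset($this->entries[$contentid])
                        && $this->entries[$contentid] === $content)
                    ? $contentid : false;
            }
            foreach ($this->entries as $id => $c) {
                if ($c === $content) return $id;   // today: first hit only
            }
            return false;
        }

        // Option 2: return ALL matching ids (an empty array if none match).
        public function search_all($content)
        {
            return array_keys($this->entries, $content, true);
        }
    }

Either way, the SlowSync code could then verify a concrete mapping instead of blindly taking the
first hit.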
Any ideas?
–Philip