@davjand: looking at your code, it seems to be equal in functionality to Database Synchronizer. Can you explain how your implementation is any different?

I've not looked through the code for it, but it may not be that different. The main reasons for approaching it as I did are that it:

  • Can be applied automatically across different environments with a few simple scripts (see my ci-workspace repo)
  • Uses multiple timestamped files, so you won't end up having to do manual git merges, as git sees them as separate files (see the hypothetical layout after this list)
  • Keeps an entire history of the project, allowing any developer to jump into the repo and instantly have the entire working db setup
  • In conjunction with the last point, enables simple local=>local merging if you've got multiple developers
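
To illustrate the timestamped-files point, a hypothetical log directory might look something like this (names and paths are invented; each developer's changes land in their own files, so git never has to merge inside a file):

    workspace/db-log/
        1370347201_dev-a.sql      (developer A's change, 1pm)
        1370350801_dev-b.sql      (developer B's change, 2pm)
        1370354401_dev-a.sql      (developer A's next change, 3pm)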

I'm not saying that CDI or dbsync wouldn't let you do this, but I was having difficulty implementing a complete continuous integration process with them: local=>local sync, local=>dev and local=>release.

Both CDI and DBSync allow you to implement a complete continuous integration process. Their only weak spot is the co-operation between devs (local=>local sync). Basically, you've solved this by adding timestamps to the files instead of having one single file.

The problem with this approach (as with any attempt to solve this, it seems) is that it is unreliable. Consider this scenario:

  1. At 1pm developer A creates a field in a section.
  2. At 2pm developer B syncs with A and adjusts the newly created field.
  3. At 3pm developer A, who did not sync, adjusts the field differently from developer B.
  4. At 4pm developer B, who did not sync, adjusts the field differently from developer A.
  5. At 4pm developer A tries to sync with developer B.

So now developer B has made two changes and developer A has made one change to the same field, an hour apart and without syncing.

Now with your timestamp approach this creates a problem. One option is that your implementation takes all files and applies the available statements in order; this will disregard the changes from A (the changes from step 4 are the last and will prevail).

It is also possible that your approach saves the timestamp of the last change and only applies the changes after this timestamp. This will also cause a problem, because it will disregard step 2 (developer A made changes after developer B did), which could be incompatible with step 4.
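
To make the first variant concrete, here is a minimal sketch (PHP, with an invented log directory and no CDI specifics) of replaying every logged statement in timestamp order; step 4 runs after step 3, so step 3's change is silently overwritten:

    <?php
    // Minimal sketch, not CDI's actual implementation: replay every logged
    // statement in timestamp order. Whatever ran last simply wins, so the
    // step 4 change silently overwrites the step 3 change.
    $db = new mysqli('localhost', 'user', 'password', 'symphony');

    $files = glob(__DIR__ . '/db-log/*.sql');   // hypothetical log directory
    sort($files);                               // file names start with a timestamp

    foreach ($files as $file) {
        $statements = array_filter(array_map('trim', explode(";\n", file_get_contents($file))));
        foreach ($statements as $sql) {
            $db->query($sql);
        }
    }

    // A variant that remembers the last-applied timestamp and skips anything
    // older would instead drop step 2 entirely, which is the second problem above.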

Either way, this might cause a data integrity problem, especially if you run this on ACC or PROD. Or am I missing something here?

OK... so I've come up with two other possibilities to avoid the merge problem:

  1. Use a single line per contributor
  2. Use a central location to store the CDI log file (perhaps a Gist?)

However, this still does not solve the problem of the execution order. Some more thoughts on that matter:

  1. Empower the developer to solve the order issue with a UI in case of timestamp conflicts
  2. Allow for a lock on structural changes (requires a central locking file). This way, only one developer can make structural changes at a time, and other developers need to sync before they can take over the lock.

Oh well, time to have a good night's sleep about this :)

Yeah, you're correct about the potential for errors in that situation! Something I forgot to mention is that my plan for dealing with conflicts is simply that this should be a team issue and not really a software one.

I completely agree with you that the locking solution could work. My approach to this would be to have different modes whereby the staging server (in my setup at least) would co-ordinate the locking. Any other install has to co-ordinate with this Symphony install to check that no one else has locked the db. If someone has, you have to wait for them to unlock it, make sure you're in sync, and then you can make changes.
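
A minimal sketch of what that could look like, assuming the lock is just a single file on the staging install and every other environment checks it before making structural changes (the path and helper names are all made up):

    <?php
    // Sketch only: a single lock file on the staging server decides who may
    // make structural changes. Opening the file with mode 'x' fails if it
    // already exists, so only one developer can hold the lock at a time.
    define('CDI_LOCK_FILE', '/var/www/staging/manifest/cdi-structure.lock'); // hypothetical path

    function acquire_structure_lock($developer) {
        $handle = @fopen(CDI_LOCK_FILE, 'x');
        if ($handle === false) {
            return false;   // someone else holds the lock: sync up and wait
        }
        fwrite($handle, $developer . ' ' . date('c'));
        fclose($handle);
        return true;
    }

    function release_structure_lock() {
        @unlink(CDI_LOCK_FILE);
    }

    // Remote installs would call a tiny endpoint on staging that wraps these
    // two functions, rather than touching the file directly.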

However, there is another option. It is much more complicated, but it could potentially achieve a complete software approach. It's only an idea, so it probably isn't the most concrete:

Basically you'd have to log Symphony operations as opposed to (or as well as) MySQL queries. So, for instance, when a field is created, the field's ID, name, section and params could be logged in a JSON array (maybe with the MySQL as well), along with the operation (CRUD, basically). It might also need to store a reference to the last query run.
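
For instance, a single logged operation might look something like this (the field names and values are just a guess at what would be needed, not an agreed format):

    {
        "operation": "create",
        "type": "field",
        "id": 42,
        "name": "Published Date",
        "section": "articles",
        "params": { "type": "date", "required": "yes" },
        "timestamp": "2013-06-04T14:02:58+00:00",
        "last_query": "INSERT INTO `sym_fields` (...) VALUES (...)"
    }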

You could then write an updating script that analyses the current situation and performs the diff as necessary. For instance, if it can see that multiple update operations have been performed on the same thing, it would accept the later timestamp; or if a delete has been performed on a section, then any later operations on that section would be nullified.
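
A rough sketch of that analysis step, assuming operations in the JSON shape above (the helper and its two rules are only illustrative):

    <?php
    // Sketch: collapse a merged operation log before replaying it.
    // Rule 1: if something was deleted, drop every later operation on it.
    // Rule 2: if something was updated more than once, keep only the latest update.
    function collapse_operations(array $ops) {
        usort($ops, function ($a, $b) {
            return strcmp($a['timestamp'], $b['timestamp']);
        });

        $deletedAt = array();   // "type:id" => timestamp of the delete
        foreach ($ops as $op) {
            if ($op['operation'] === 'delete') {
                $deletedAt[$op['type'] . ':' . $op['id']] = $op['timestamp'];
            }
        }

        $latestUpdate = array();
        $kept = array();
        foreach ($ops as $op) {
            $key = $op['type'] . ':' . $op['id'];
            if (isset($deletedAt[$key]) && $op['timestamp'] > $deletedAt[$key]) {
                continue;                      // nullified by an earlier delete
            }
            if ($op['operation'] === 'update') {
                $latestUpdate[$key] = $op;     // a later update replaces an earlier one
                continue;
            }
            $kept[] = $op;
        }

        $result = array_merge($kept, array_values($latestUpdate));
        usort($result, function ($a, $b) {
            return strcmp($a['timestamp'], $b['timestamp']);
        });
        return $result;
    }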

It would be a mammoth task to say the least, but I think a locking mechanism would be far more efficient. I'm only working with a small team, so I'm kind of using trust in place of a software solution :)

perhaps a Gist?

Oh, god, don't do that! That's public!

Good point :D

There's actually a much bigger problem with my approach, it turns out.

Say I create a section (this becomes the first logged query). The query to create a section says: create a new section record.

The next set of queries creates the fields in the section; these are all tied to the ID returned by the first query. This ID is generated by MySQL auto-increment.

However, say someone else creates a section first: this throws the auto-increment off, so any subsequent queries on the newly created section now relate to the wrong section.
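
Roughly, the logged queries look like this (table and column names are simplified, not the exact Symphony schema):

    -- Logged on developer A's machine: the new section happens to get id 5 there
    INSERT INTO `sym_sections` (`name`, `handle`) VALUES ('Articles', 'articles');

    -- The field queries are logged with that literal id baked in
    INSERT INTO `sym_fields` (`label`, `type`, `parent_section`) VALUES ('Title', 'input', 5);

    -- If another instance created a section in the meantime, its auto_increment
    -- is further along, the section is inserted there as id 6, and
    -- `parent_section` = 5 now points at the wrong section.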

I fear that this is a problem with trying to use any of our solutions local=>local.

The DB locking mentioned earlier solves the problem, but I'm going to go away and think about how best to combat this.

Well, starting with Symphony 2.4 the settings for Sections and Pages will be stored in files on the filesystem, so perhaps you should gear your efforts towards other issues with CDI?

Good to hear there's progress (sort of) here!

I've been doing some thinking lately, too, and I think there are a few (read: two) solutions to the problems we are facing right now. There is one I saw mentioned earlier which I really like:

In this solution it's not the queries that are logged, but the actions themselves. An action can be "added field to section x" or "removed section y". These actions are harder to log, because they require listening to pretty much every delegate in the core (still very possible, though!). They can then be merged dumbly, by just executing them in the same order they were executed on the clients, or smartly, by really looking at the changes, detecting conflicts and asking the importer to select the action to keep.
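
A rough sketch of what recording those actions from an extension could look like (the delegate name, context keys and log location are assumptions, not checked against the core; the point is the shape of the logged action rather than the wiring):

    <?php
    // Sketch only: an extension that records actions ("created section x")
    // rather than raw SQL, by hooking a backend delegate.
    class extension_action_log extends Extension
    {
        public function getSubscribedDelegates()
        {
            return array(
                array(
                    'page'     => '/blueprints/sections/',
                    'delegate' => 'SectionPostCreate',   // assumed delegate name
                    'callback' => 'logSectionCreate'
                )
            );
        }

        public function logSectionCreate($context)
        {
            $entry = array(
                'operation' => 'create',
                'type'      => 'section',
                'handle'    => $context['section']->get('handle'), // assumed context shape
                'timestamp' => date('c')
            );
            // Hypothetical log location; one file per action keeps merges trivial
            file_put_contents(MANIFEST . '/action-log/' . microtime(true) . '.json', json_encode($entry));
        }
    }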

An interesting side-effect of this is that it's possible to build a user-friendly UI, where you can actually select actions that make sense. A "Create new section" message makes sense to pretty much everyone, while "INSERT INTO sym_sections ..." might not.

@davjand: Both Nick Dunn and I have tackled the auto-increment problem before; there should be a solution for this.

@vlad: moving the sections / pages / etc. schema to XML files would make CDI obsolete, since any changes can then be done using version control. This would make life so much easier, so I fully support any progress on this :)

@Huib: I also like this idea, but I think it will be a very large effort to create a meta language for these kinds of operations. In light of the move to XML-based configuration, this might not be very cost-efficient?

Regarding the XML settings, this is the main discussion on GitHub.

Done, no more thread hijacking.

Remie, that is indeed one thing we need to keep in mind. I do think the merging logic will be by far the most difficult part here; getting the files to merge is easy (either via delegates, or just by grabbing them directly).

For instance, imagine this: there are two developers, Bob and Carl. Bob has set up a "bootstrap" install that contains pretty much all the sections and fields the project will need, but it still needs a bit of polish. Now, Carl has imported Bob's installation as a starting point and renamed one section to make things easier for the client to understand.

If you now blindly merge both XML files, you will end up with both the original and the renamed section in the final installation. This is not something that I'd like to see: it would be much better if the extension knew the section had been renamed and only included the new one.
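
To make that concrete (the XML layout and file paths below are invented, since the 2.4 on-disk format wasn't settled here): Bob's file describes the original section, Carl's describes the renamed one, and a dumb merge keeps both because nothing records that the second is the first one renamed:

    <!-- Bob's bootstrap install: workspace/sections/news-items.xml (hypothetical path) -->
    <section handle="news-items">
        <name>News Items</name>
    </section>

    <!-- Carl's install: the same section, renamed -->
    <section handle="articles">
        <name>Articles</name>
    </section>

    <!-- A blind merge of the two directories ends up with both files,
         i.e. both sections, in the final installation -->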

So, even if we store sections in XML, this extension will still need to step in and add meta-data (like a timestamp, or specific actions) to make it work flawlessly.

@remie I've been using CDI single-user just fine, but we've just had our first foray into a multiple-user setup, and our sys-admin made some fancy configuration on live/staging to increment the IDs differently on each server, e.g. one server uses even IDs and the other odd. So when we push the files using CDI it usually just does an INSERT, and we end up with different page IDs. However, the UPDATE lines still include the page IDs as they are on my local, messing up the whole update system.

PS: this also happens when we push to live in a different order than the pages were created; the update scripts reference our local environment, so I would be overriding the pages with any update queries... I was wondering whether it's possible to fix the ID of the insert statements, maybe using the last insert ID.

I was wondering whether it's possible to fix the ID of the insert statements, maybe using the last insert ID.

How do you envision this?

Not so sure yet, but I think we could possibly check if the query is an insert and, if the query was successful, modify it with the last ID?
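
One way that could look, as a sketch only (not how CDI currently applies its log; table and column names are simplified, and the bare-number rewrite shows why this is fragile):

    <?php
    // Sketch: replay logged entries and remap the page ids baked into later
    // queries to the ids this database actually hands out for the INSERTs.
    $db = new mysqli('localhost', 'user', 'password', 'symphony');

    $loggedEntries = array(   // would normally come from the CDI log
        array('sql' => "INSERT INTO `sym_pages` (`title`, `handle`) VALUES ('About', 'about')", 'logged_id' => 12),
        array('sql' => "UPDATE `sym_pages` SET `sortorder` = 3 WHERE `id` = 12",                'logged_id' => null),
    );

    $idMap = array();   // logged id => id on this instance
    foreach ($loggedEntries as $entry) {
        $sql = $entry['sql'];
        foreach ($idMap as $old => $new) {
            // Crude: a bare numeric replace can easily hit the wrong number,
            // which is a big part of why this is hard to do reliably.
            $sql = preg_replace('/\b' . $old . '\b/', (string) $new, $sql);
        }
        $db->query($sql);
        if ($entry['logged_id'] !== null && stripos($sql, 'INSERT') === 0) {
            $idMap[$entry['logged_id']] = $db->insert_id;   // this instance's last insert id
        }
    }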

The solution we had to the auto-increment problem doesn't solve the issue @davjand is describing. The problem we solved was that logging the queries back into a MySQL table was itself modifying the last_insert_id() (or whatever it's called) that Symphony was expecting.

@davjand's problem is more about syncing two separate instances, which is a problem that neither Database Synchroniser nor CDI attempts to resolve. This is more a problem of having two separate instances of Symphony (one per developer) and how to merge the two together. This is why, for Database Synchroniser to work within a team situation, everyone in the team must connect to the same database to avoid the concurrency/merging issues.

If Symphony used GUIDs instead of auto-incremented numeric IDs then merging would be slightly easier, but not trivial. How do you handle conflicts where two or more developers have modified the same section or field?

Having the whole team work from one database is the only viable solution until everything is stored as flat XML files. Even then, someone would need to merge files if a conflict arose.

Pretty easy to solve this if you all think outside the box a little: it just means a regimented workflow, as I would expect for a process like this anyway.

Just have one server instance of the site install as a CDI master that all devs can create/edit sections on. When it's complete, tell every other dev to update their CDI slaves; then, when dev is done, update the CDI slave that is staging the site, and then production.

I can tell you, you won't find a way around this auto_increment stuff until the structure is moved to XML and we have UUIDs for the sections and fields. Our DB schema is just too abstract to figure out a way. Even parsing out the IDs and replacing them on each instance won't work, as all the separate DBs will differ massively.

This doesn't just go for sections, fields and pages, by the way. Everything that Symphony's database contains would need to use GUIDs for this to work: extensions, delegates, section associations, authors. Given that extensions themselves will create and destroy their own tables and records too, a regimented approach as @designermonkey describes (if each developer works in isolation) or sharing one instance (if all developers work centrally) are really the only ways.

I think we should drop the effort of getting CDI to work in a multi-user environment and put our money on moving the structure to XML files.

@brendo / @nickdunn / @kanduvisla How can we speed up the implementation of this?
