MySQL/PHP serialized array

DarkStreet

Expert Member
Joined
Jan 18, 2007
Messages
1,284
Trust me, I know your pain. I'm busy doing updates to a huge system developed by a Pakistan company and while I do this I tell all staff to NOT TALK TO ME AT ALL as it gets me in such a pissed off mood having to fix other people's crap.
 

murraybiscuit

Executive Member
Joined
Oct 10, 2008
Messages
6,483
Trust me, I know your pain. I'm busy doing updates to a huge system developed by a Pakistan company and while I do this I tell all staff to NOT TALK TO ME AT ALL as it gets me in such a pissed off mood having to fix other people's crap.

and people wonder why coders are always so pi55ed off!
 

guest2013-1

guest
Joined
Aug 22, 2003
Messages
19,800
throw me a few records and i'll try help you out.
my only issue that LIKE > REGEXP in terms of performance, but it's not as flexible obviously.
http://thingsilearn.wordpress.com/2008/02/28/mysql-query-speed-regexp-vs-like/

i don't know t-sql, but i would be interested to know how it could do a better job of searching through an unindexed column with a whack of text?

I'll try both ways, the piece in the entire serialized array may look like this:

s:12:"IMAGE_Column";s:50:"http://content.etilize.com/images/100/11970303.jpg"

No idea how the regex would look like for that (the s: bit is the actual size of the array/column and can vary) and I imagine doing a LIKE would only do it on the entire serialized array unless I do some kind of left() or right() to chop the data somehow from the end position of the actual column
 

DarkStreet

Expert Member
Joined
Jan 18, 2007
Messages
1,284
throw me a few records and i'll try help you out.
my only issue that LIKE > REGEXP in terms of performance, but it's not as flexible obviously.
http://thingsilearn.wordpress.com/2008/02/28/mysql-query-speed-regexp-vs-like/
i don't know t-sql, but i would be interested to know how it could do a better job of searching through an unindexed column with a whack of text?

Regular expressions will always be more resource intensive, whether in MySQL or scripting languages - you just need to know when to use them. In this instance, the best method would probably be reg exp as LIKE will more than likely result in records that shouldn't be included but it's not safe to say unless we've seen an extract of the database that indicates what the serialized data looks like. You may find using LIKE is a very real possibility if the data can be extracted easily without the possibility of unwanted records being included.

T-SQL is a proprietary extension to SQL and has some pretty decent features which were very helpful back when I was using ASP and MS SQL stored procedures but I doubt it would have been much help in this case.
 

murraybiscuit

Executive Member
Joined
Oct 10, 2008
Messages
6,483
acid, is "manufacturer" at least a common field throughout or can that also vary depending on the csv file?
 

guest2013-1

guest
Joined
Aug 22, 2003
Messages
19,800
acid, is "manufacturer" at least a common field throughout or can that also vary depending on the csv file?

Manufacturer is setup in the Token/Column table (see above for explanation), they then "map" certain csv columns to those specific tokens/columns they setup in the system. When the data then saves, it saves it with the token/column they chose with the data. It varies per installation and not per csv file. The csv files can be different in terms of column setup, because you just map the "right" columns whenever you want.

The example I gave in my reply to you is what it looks like more or less. And yes, you can only have 1 token/column per data column from the CSV file.

I didn't know that you could use % anywhere you want to in mysql, so I came up with this:

SELECT * FROM `products` WHERE `product_data` LIKE "%Manufacturer%;s:%:%Sony%"

And it seems to work pretty well. I had to go read the regular expressions website you gave to realize that MySQL is able to do sort of pattern matching with like as well and is faster (I just did under 1 sec on all those records returning 311 positive hits)
 

murraybiscuit

Executive Member
Joined
Oct 10, 2008
Messages
6,483
+1 for wildcards.
the LIKE syntax is actually much better than the regex because this isn't really a complex pattern.
problem solved. that wasn't too hard after all :D
 

Other Pineapple Smurf

Honorary Master
Joined
Jun 21, 2008
Messages
14,593
Please do not doubt the power of MySQL! If you know your DBs you can do magic with it. Also spend a lot of time on creating the correct partial indexes for this table. A guy at work was pulling his hair for over two weeks due to how slow MySQL was, turned out he never used the correct indexing on his DB.

One advantage with indexing is that you can take an existing schema and tweek it without any cascading effects on code - correction, the app will kick ass.
 
Top