mysql question for you experts

guest2013-1

guest
Joined
Aug 22, 2003
Messages
19,800
Reaction score
13
When I do this:

select count(productID) from product inner join subject on subject.subjectid = product.subjectid where subject = 'yo momma'

(as an example)

I see the query in the connection list go through, but then I see this:

Sending to Net - select * from product

Then after a while

Sending data - select * from product

Now, nowhere do I specify "*" and I always have a where statement because the product table has 6 or 7 million records in it.

This obviously slows things down quite dramatically...

Any ideas why mysql does a "select * from" whenever you do a count?
 
I'm guessing but maybe because a count(*) is the same as a select * in terms of the work the db engine actually has to do...
 
Yea, I thought so too, but I'm actually doing a count on the productid itself, so not wild-carding it.

I must admit that I had count(distinct productID) for some stupid reason in there, so managed to shave off a couple of seconds. Strange thing is, on my dev machine it's nice and fast, but on the live server it's slow as hell. I really have no idea WHAT to check, on average I run 4-6 seconds for a full search and count but on the server it times out OR takes 2+ min

Same edition mysql installed, both running on 64bit windows IIS7 (not php, asp connecting to mysql)... could it be the connector that might be different? *shrug* I feel stupid now. Usually could figure out WTF is going on already
 
They definitely aren't the same thing. Where are you viewing this connection list, in mytop?

Executing COUNT(*) isn't really a performance concern in MyISAM or InnoDB (unlike SELECT(*)). In many cases it's faster than COUNT(col) - if col isn't covered by an index or can contain NULLs - as it's almost always able to use a good covering index.

Regarding your performance issues, it almost certainly isn't the database driver/connector. However you can run the query natively through MySQL on your dev PC and the server and compare performance - take the connector out of the equation. You should also run an EXPLAIN query on both servers to make sure that you're hitting your indexes on the production server. I'm also assuming that your dev server and production server both have 6 million rows as that would obviously affect query time....
 
Executing COUNT(*) isn't really a performance concern in MyISAM or InnoDB (unlike SELECT(*)).

I should have said - There's nothing wroing with executing COUNT(*).

There is a marked difference in performance when using COUNT in InnoDB and MyISAM due to their locking strategies. (MyISAM keeps the track of the number of records in a table). When dealing with large databases (particularly those running on InnoDB), I'd recommend using triggers to increment/decrement a stored count field.
 
Last edited:
It's a mix between MyIsam and Innodb, the product table currently on Innodb (I didn't design the database, the ERD is non-existent and the normalizing of data is terrible)

The reason why I ask is because with the same setup (both dev & live currently has the same amount of records), i get significant differences in speed between the 2.

The dev machine I'm working on has the same installation/setup as the server AFAIK, except the server has 2 gig more ram (I have 4gig on my dev machine) and dual quad core CPU's (I only have a silly one, can't remember it's name...)

Then the other obvious difference is the OS (but both running IIS7 so not much there)

Natively (sql browser), when I run the queries, the one on my machine runs between 4.85 seconds - 6.2 seconds, whereas the one on the live machine runs 22.34 seconds - 59.89 seconds

Same query, same database (unless someone changed something on the live machine) using it's indexes etc. I initially had some success shaving off about 1 second on my queries by using "like" instead of "match against" but that didn't do much to improve it on the server itself. The other thing could be the hard drive? I have a 10000 rpm SATA laptop drive (which kicks several kinds of asses) on my dev machine whereas I think the server only has a 7200rpm SATA in there (no raid I think)

I'm REALLY confused with this and why it's so slow. I even went and duplicated the "my.ini" from my machine over to the server (the server does run the previous password hash or whatever, whereas my dev doesn't). Max ram mysql uses is about 600-700mb even though the buffer for innodb is set to 4 gig (the db table is only 1gb big)

"subject" (in my example) is a column name.

I really feel dumb... usually I would have figured this out already but can't seem to see where the hell the bottle neck is. I do see some huge spike in Network traffic at some point on the Nic (using "localhost" as the connection) so it might be something to do with how Web Africa setup their webserver? because 127.0.0.1 reverses to xxxx.wadns.net (xxxx = pc name of client). So it might be doing an outbound send to some router through the nic and then comes back in again??

I'll try and figure out how to use explain to see, pity I can't do much on the live server now, OSS workbench is pretty awesome but not on installed on live. Dev team who put this together said there's nothing they could do in terms of speed on the search for this client, unless he bought a caching server.... and I'm like "yea, if you have money you could throw it at new hardware, but ****ty best-practices...or lack thereof, will always mean you have to upgrade instead of code/plan properly)

Thanks for the advice so far. However, re: the count on the InnoDB table, I'm not doing a count on the entire table each time (or maybe it is?), it's to do paging. I need the max records returned by the search query so I can setup the paging to work client side. If you know of a nifty way to do that from inside mysql itself it'd be awesome. I wrote a SQL Server SP which wraps around any other stored procedure or table select that could do paging by itself without hassle, but mysql I'm a bit of a n00b...
 
select count(productID) from product inner join subject on subject.subjectid = product.subjectid where subject = 'yo momma'

Have you tried or see any performance difference with using this? > select count(product.productID) from product <rest of query>
 
Going back to a point mentioned above - you're absolutely sure the Product table is InnoDB on the production server, and that subjectid is indexed? MyISAM tables do not automatically index foreign keys. Also, have you tried rebuilding the indexes if they are indeed present?
 
Another thought, have you compared the performance of your query with, for example:

select count(*) from product where subjectid in ( select subjectid from subject where subject = 'yo momma' )

Essentially the same thing, but perhaps a little less expensive than a join.

Also, I assume, whatever the "subject" in "subject = 'yo momma" is, that's indexed too, right?
 
Disregard the SQL rewrite, so used to working in Oracle, where it performs better - according to online resources, MySQL actually handles joins better than subqueries, long as the foreign keys are indexed.
 
I'll try what you suggested anyway because I'm not sure why there's such a big difference between my dev and my live.

The products table is definitely InnoDB, but the table that holds the "subject" is MyIsam (all of them that I inner join is MyIsam)

Something I did notice is that my w3wp.exe process handling the website on live goes over 1.1gb in memory suddenly when doing a query, indicating that it might not only be the server but the site performance as well. AGAIN, the clone I have of live differs wildly in performance to on my machine... could it be a corrupt database or something? And why would w3wp process bump so much in ram each time when there's a limit 20 on the sql statement? Acts as if I'm sending through a massive amount of data....
 
Acid... probably doesn't have anything to do with your query, but it may trigger a thought or something. I noticed my my w3wp.exe RAM usage spiking in a query I was working on a few years ago. Also didn't return a lot of rows and I was stumped for a few weeks. It happened intermittently. It turned out that my query was spewing a lot of messages (prints and exceptions) while it was running, and that w3wp.exe was reading these and storing them while the query was running. I can't remember the exact circumstances, but it wasn't part of the result set, it was the stuff that usually pops up in the "messages" window when u run a query in sql server management studio.

Point is, there could be other data passing between the DB and your w3wp process that isn't result set related.

Good Luck solving the problem.
 
Well, I'd say best place to start would be with the EXPLAIN plans, just to eliminate any obvious factors like unused / missing indexes, malformed SQL, etc. Just insert "explain" in front of your query in a GUI or from the command line on both the dev and production servers and paste the output.

Additionally, is the MySQL server running on the webserver machine? If so, what kind of load is the machine under? If it's already heavily taxed, the above could simply be the result of a lack of resources.
 
Thanks guys,

Will monitor what you said shogun, however, I can't see why it would give back any other message except the result code. Would I see something in query analyzer?

I've been doing the "explain" and "describe" thing most of the morning and have read up on some nice explaining on indexes, how they work WHY they work and what most of the "explain" stuff means. I've now sussed out most of the speed issues when it comes to searching (some searches still comes up at around 40 seconds, but the rest is < 7 seconds now and even though 40 seconds is long it's quite an improvement on the 3-4 min - time out we used to get)

I also have a solid idea WHY this specific search is running 40 seconds as well, and things are looking good. Let's see if I can optimize more

FYI, the difference between using MATCH AGAINST and LIKE was significant enough for me to change the code
 
After a lot of tinkering and learning how to do a stored routine, I managed to cut down on the "count" query (for paging) that happened right before the actual search query, which is about 6-10 seconds faster. After learning how to use explain and seeing what a difference indexes can make (and not like the other programmers, just slappin'em on and hoping for the best) and using a clever dynamic way of implementing limit in the stored routine, I now run an average search of 5.9 seconds. Down from 50 seconds or timeout.

I'm betting I could shave off another 2 seconds at least if I had the opportunity to use a well-formed normalized database, but it was fun to see how these idiots implemented things and wrote CODE to work around a fundamental database issue. Client is happy and I'm walking away with a bonus :p And they said it couldn't be done... lols... amateurs

Greatly appreciate everyone who took the time to respond. I owe you guys a few beers.
 
Oh, and I had to clear 2 redundant inner joins that wasn't necessary at all from the entire thing... waste of processing power really. I'm convinced MySQL has a very bright future as a database engine... SQL Server should start doing the "we are free please use us" thing :p
 
Thanks guys,


FYI, the difference between using MATCH AGAINST and LIKE was significant enough for me to change the code

LIKE being the slower of the two right?

What version of mysql was running on the server and your machine? I had a scenario a while back where mysql 4 was significantly slower that mysql 5?
 
LIKE being the slower of the two right?

What version of mysql was running on the server and your machine? I had a scenario a while back where mysql 4 was significantly slower that mysql 5?

Actually in this case MATCH AGAINST was the slower of the 2. I've found that sometimes (especially in big databases, the fulltext indexing queries are sometimes slower for a reason, I think it has to do with inefficient indexes being created overall)

Running MySQL 5.1.046 community I think
 
Top
Sign up to the MyBroadband newsletter
X