Wednesday, 11 September 2013

How to optimize a query when grouping on a date / unix timestamp?

How to optimize a query when grouping on a date / unix timestamp?

I'm not very used to MySQL, but I think it can be much faster than it is.
Here's my table:
CREATE TABLE `crashes` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`added_date` int(11) NOT NULL,
`status` int(11) NOT NULL,
`issue_id` varchar(32) NOT NULL,
`report_id` text NOT NULL,
`app_version_code` text NOT NULL,
`app_version_name` text NOT NULL,
`package_name` varchar(80) NOT NULL,
`package_name_id` tinyint(4) NOT NULL,
`file_path` text NOT NULL,
`phone_model` text NOT NULL,
`android_version` text NOT NULL,
`build` text NOT NULL,
`brand` text NOT NULL,
`product` text NOT NULL,
`total_mem_size` int(11) NOT NULL,
`available_mem_size` int(11) NOT NULL,
`custom_data` text NOT NULL,
`stack_trace` text NOT NULL,
`initial_configuration` text NOT NULL,
`crash_configuration` text NOT NULL,
`display` text NOT NULL,
`user_comment` text NOT NULL,
`user_app_start_date` text NOT NULL,
`user_crash_date` text NOT NULL,
`dumpsys_meminfo` text NOT NULL,
`dropbox` text NOT NULL,
`logcat` text NOT NULL,
`eventslog` text NOT NULL,
`radiolog` text NOT NULL,
`is_silent` text NOT NULL,
`device_id` text NOT NULL,
`installation_id` text NOT NULL,
`user_email` text NOT NULL,
`device_features` text NOT NULL,
`environment` text NOT NULL,
`settings_system` text NOT NULL,
`settings_secure` text NOT NULL,
`shared_preferences` text NOT NULL,
`application_log` text NOT NULL,
`media_codec_list` text NOT NULL,
`thread_details` text NOT NULL,
`user_ip` text NOT NULL,
PRIMARY KEY (`id`),
KEY `package_name_id` (`package_name_id`)
) ENGINE=MyISAM AUTO_INCREMENT=202364 DEFAULT CHARSET=utf8
As you can see it's filled with 200k rows. I'd like to retrieve the row
added_date (unix timestamp, int(11)), and number of rows on that day.
So I select the date, day (as Y-M-D), and count:
SELECT date_format(from_unixtime(added_date), '%Y-%c-%d') as date,
added_date, count(*) as nb_crashes FROM crashes WHERE package_name =
'net.bicou.redmine' GROUP BY date ORDER BY date ASC
It's very slow! Almost 1.5sec on a dedicated mysql server on my host.
So I figured I could optimize the thing a little bit: I added a
package_name_id that is a tinyint, which is unique per package_name (I
have 5 different package_name values on that 200k rows). I made that an
INDEX so that MySQL can browse it faster.
Result: 0.9sec. That's much better, but still off the performance I'm
expecting!
How could I optimize the thing? I guess creating the date on each row,
then grouping is very expensive. However I don't know how I could make
this faster...
Edit:
Here's what I did to update the table:
ALTER TABLE `crashes` ADD `temp` DATETIME NOT NULL
UPDATE crashes SET temp = FROM_UNIXTIME( added_date )
ALTER TABLE `crashes` ADD INDEX ( `temp` )
Here's the updated query:
SELECT added_date, count(*) as nb_crashes FROM crashes WHERE
package_name_id=3 GROUP BY year(temp),month (temp),dayofmonth(temp) ORDER
BY temp ASC
I've still got execution times of around one second... Did I do something
wrong?

No comments:

Post a Comment