Announcement

Announcement Module
Collapse
No announcement yet.

Notification of table crashes

Page Title Module
Move Remove Collapse
X
Conversation Detail Module
Collapse
  • Filter
  • Time
  • Show
Clear All
new posts

  • Notification of table crashes

    I can't seem to find info on a tool or methodology for being easily notified when MySQL has table crashes.

    It seems wrong for the web application or whatever is connecting to MySQL to have to have built in handlers for every query to detect whether some table has crashed and notify someone about it... shouldn't there be some low level support for this right in MySQL?

    We turned on error logging and can get some of the crashed tables to show up in the error log, but even then you would have to run something to watch the log and notify you. That would be better than nothing, but I don't find documentation on this kind of thing either in MySQL's documentation, or being mentioned in other sources.

    Are MySQL System Administrators being slighted by the developers?

    Am I missing something obvious?

  • #2
    dangby

    We turned on error logging and can get some of the crashed tables to show up in the error log, but even then you would have to run something to watch the log and notify you. That would be better than nothing, but I don't find documentation on this kind of thing either in MySQL's documentation, or being mentioned in other sources.


    Normal Network Monitoring operation to read log files and send alerts whenever a configured string occurs.
    And there are also monitoring plugins that issue SQL queries against the database. So that you can test with a query and see that the database works.
    Your web application should not have care about this as you said yourself.

    dangby wrote on Tue, 21 April 2009 21:12


    Am I missing something obvious?

    Why do you have so many table crashes?

    I would focus on trying to figure out why you have so many table crashes (if that is a big problem for you). Try to find the root cause and see if you can fix it. Either if it is a OS thing or a newer version of MySQL that could be the solution.

    Comment


    • #3
      We don't have that many table crashes, but we do have one a month or something, which is usually caused by a power failure or some such thing.

      However, it still seems obvious to me that there isn't as much support built right into MySQL for notifications of the crashes. As I said, we've had them crash and not even be logged in the error log. Also, the fast/quick checks that are available via mysqladmin or whatever it is are limited in what they report. It seems like it ought to be one of the first requirements of writing a database that the logging/notification of errors be handled correctly... I just don't think that MySQL has realized that they've left out that big feature. The logging ought to always work if there is a problem. You'd think there would even be support for configuring automatic repair attempts with notification of the result.

      Comment


      • #4
        I hear you arguing your case, but most people fix the real problem which are the crashing OS due to power outage.

        Very few applications return to an entirely consistent state after a OS crash. The reason for this is that you don't want to pay the performance penalty. If the state shall be consistent after reboot you need to write everything down to disk all the time and that is very slow compared to caching it and writing sequentially later (the limitation on a normal disk is about 150 times per second due to seek times, which is an limitation that many people using InnoDB (read below) has hit).

        Then you can of course argue about logging and yes if it finds an error then I definitely agree with that it should write it.
        But I don't want the database to perform a thorough consistency check on all tables at startup since that would take to long time.

        As for the automatic repair you don't always want that because if you have a large server performing a lot queries in a lot of databases. Starting an unscheduled automatic repair (which very often can be a very heavy operation) can bring the entire server to it's knees and you ending up with all applications running against that server

        So I'm sorry to say, and you can blame developers for this but fixing the real problem which is the power outages is the right course.


        All that said I'm guessing that you are running MyISAM tables, and if that is the case, then I can suggest you to move to InnoDB tables instead to get a more robust database.
        But depending on how you perform backups etc you might have to do some other changes in this area due to this change.

        Comment


        • #5
          I understand that if the power goes off that I can expect a crash or two, but that isn't the issue. Of course the power shouldn't go out, but even at a data center where they say they provide 5 nines of availability it happens, and I have to deal with it.

          The logging is the issue I'm talking about, or the lack of notification by MySQL when there are problems. Logging should be consistent. It doesn't matter how my crashes happen. I might be crashing my tables on purpose, but MySQL should notify me if they do crash, at least if I tell it to.

          I don't recommend some unscheduled repair that would check all tables on boot. I just recommend that if MySQL knows about an error, it always logs it. Additionally, it would be nice if there were an option that controls whether or not repairs are attempted when a failure occurs.

          I've used InnoDB and have seen many drawbacks with it to really like it. Replication and moving data between servers, as well as recovering from errors is way harder with innodb than with MyISAM.

          I'm not a full-time System Administrator with time to manage my database, I just want it to work. I'm just saying that it seems like a no-brainer that there would be a few more features in MySQL that let it just work for projects/companies that don't have full-time sysadmins.

          Comment


          • #6
            dangby wrote on Wed, 22 April 2009 20:36


            I've used InnoDB and have seen many drawbacks with it to really like it. Replication and moving data between servers, as well as recovering from errors is way harder with innodb than with MyISAM.


            All true, but if your main problem is power outages then I think you should benefit from the added stability with the transaction log and the doublewrite.
            But yes you will not be able to copy the tables in the same simple fashion as MyISAM.

            dangby wrote on Wed, 22 April 2009 20:36


            I'm not a full-time System Administrator with time to manage my database, I just want it to work. I'm just saying that it seems like a no-brainer that there would be a few more features in MySQL that let it just work for projects/companies that don't have full-time sysadmins.

            I think it pretty much does that already.
            Have you tried Oracle or MSSQL server and what they think about regular power failures?
            I've been working as a DBA for both and what I really like about MySQL is the simplicity.

            You can say that you want a car that should _just_work_, but for that to happen you may not run out of fuel (power failure) on a regular basis and you will have to leave it to a mechanic (DBA) for service at regular intervals. Or if you are mechanically inclined you perform the service yourself.

            Point being nothing works indefinitely without some sort of service and especially if you mistreat it.

            You see you can't convince me that you shouldn't fix the power outages.

            Comment


            • #7
              I strongly suggest InnoDB as well. I can't really comment on your experience with replication and crash recovery, other than to say that they are exactly the opposite of the experience I have had.

              MyISAM can take absolutely forever to repair after a crash, while InnoDB's recovery is automatic and almost entirely failproof unless there is a hardware failure.

              I don't know what could make replication harder on InnoDB; its transactional behavior there again makes it much more robust in case of things like interrupted queries. In InnoDB, they roll back and don't get logged and replayed on the slave, whilst in MyISAM they will cause the slave to halt.

              It's true that you can't copy an InnoDB table by copying its files, although XtraDB does relieve that problem.

              In general, if you have corrupted tables, something is seriously wrong and you should fix that root cause first.

              Comment


              • #8
                Funny, I'm searching once again for information on how to monitor table crashes and found my own thread as the top result.

                The ironic thing about this is that I have since switched to InnoDB, and guess what, the virtual server was powered off, and many InnoDB tables crashed. InnoDB's automatic recovery did try and recover, but the log mentioned that there might be a MySQL bug and I might have to recover from a backup. The log didn't mention which tables it recovered, but it did output a ton of other unreadable junk, which I'm sure would prove useful to someone who wanted to spend three days on the project and who might be an InnoDB developer themself. My point again is that MySQL's (and InnoDB's) built-in tools for administrators is not very user friendly (not even for high-level administrators). They need someone from a marketing team to help them point them in the right direction.

                Anyway, the database would still run fine most of the time... inserting new data and reading back that old data, but when I'd try to delete data from the time period of when the power outage happened, MySQL would just plain crash, which just shouldn't happen. By process of elimination, I could figure out what table it was working on at the time and cause the crash as many times as I wanted by running any of the commands to DELETE records, repair table, optimize table, and sometimes check table... definitely a bug that MySQL would crash with those commands.

                My main point though is that when MySQL comes back up after its crash, it ought to at least be able to know what it was doing when it crashed and report what failed. I understand this would involve the overhead of logging something with every action, and it would probably be an expensive operation to have that feature. I'm not sure what the answer is, but there needs to be more visibility into issues with table crashes.

                I shouldn't be able to just run an app for months not knowing that there is actually some bad data in there that could compromise my ability to do things like restore from a backup, for example. The backup data could be corrupted and I wouldn't know it because MySQL didn't tell me there was bad data in there.

                The underlying bug may be because I'm storing gzipped encoded text into blobs, and that kind of things probably isn't being tested as often, but all I want is a way to know whether tables have crashed at some point. If MySQL runs into an instance where it is about to die because of bad data, it at least ought to spit out some stuff about what it was doing into the log before it croaks. CHECK TABLE should actually work on InnoDB tables, but it doesn't really tell you for sure that something is corrupted, and in my case, it sometimes caused a MySQL crash.

                Rant complete, but I'm sure this isn't the last time I'll search for what other administrators do to watch the health of their databases. And I'm pretty sure that the tools I'm looking for aren't built yet.

                Comment


                • #9
                  The tools you need are simply to test your backups by restoring them and running mysqlcheck to verify the absence of corruption in any tables. It is impossible to overstate the importance of doing this. I suggest at least once a week.

                  Comment

                  Working...
                  X