MySQL at Twitter
MySQL is the persistent storage technology behind most Twitter data: the interest graph, timelines, user data and the Tweets themselves. Due to our scale, we push MySQL a lot further than most companies. Of course, MySQL is open source software, so we have the ability to change it to suit our needs. Since we believe in sharing knowledge and that open source software facilitates innovation, we have decided to open source our MySQL work on GitHub under the BSD New license.
The objectives of our work thus far have primarily been to improve the predictability of our services and make our lives easier. Some of the work we’ve done includes:
- Add additional status variables, particularly from the internals of InnoDB. This allows us to monitor our systems more effectively and understand their behavior better when handling production workloads.
- Optimize memory allocation on large NUMA systems: allocate InnoDB's buffer pool fully on startup, fail fast if memory is not available, and ensure consistent performance over time even when the server is under memory pressure.
- Reduce unnecessary work through improved server-side statement timeout support. This allows the server to proactively cancel queries that run longer than a millisecond-granularity timeout.
- Export and restore the InnoDB buffer pool using a safe and lightweight method. This enables us to build tools to support rolling restarts of our services with minimal pain.
- Optimize MySQL for SSD-based machines, including page-flushing behavior and reduction in writes to disk to improve lifespan.
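To make the statement timeout item above a little more concrete, here is a toy Python model of the idea. It is only a sketch and not the server code: the `Statement` class, the `cancel_expired` function and the `timeout_ms` field are hypothetical names chosen for illustration. It assumes each running statement carries a timeout in milliseconds and that something periodically checks for statements that have run past it.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Statement:
    """Hypothetical record for one running statement (illustration only)."""
    sql: str
    timeout_ms: int  # assumed per-statement timeout, in milliseconds
    started: float = field(default_factory=time.monotonic)  # seconds
    cancelled: bool = False

    def expired(self, now: float) -> bool:
        # Compare elapsed time in milliseconds against the timeout.
        return (now - self.started) * 1000.0 >= self.timeout_ms


def cancel_expired(statements, now=None):
    """Mark every statement past its timeout as cancelled.

    Returns the list of statements cancelled by this call.
    """
    now = time.monotonic() if now is None else now
    victims = [s for s in statements if not s.cancelled and s.expired(now)]
    for s in victims:
        s.cancelled = True
    return victims
```

For example, with one statement allowed 5 ms and another allowed 500 ms, a check made 50 ms after both started would cancel only the first. The benefit is that the server stops spending resources on a query whose caller has most likely given up already.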
If you want to learn more about our usage of MySQL, we will be speaking about Gizzard, our sharding and replication framework on top of MySQL, at the Percona Live MySQL Conference and Expo on April 12th. Finally, contact us on GitHub or file an issue if you have questions.
On behalf of the Twitter DBA and DB development teams,
- Jeremy Cole (@jeremycole)
- Davi Arnaut (@darnaut)