مشاركة مميزة
Launch HN: QuestDB (YC S20) – Fast open source time series database https://ift.tt/2EjYre2
- Get link
- X
- Other Apps
Launch HN: QuestDB (YC S20) – Fast open source time series database Hey everyone, I’m Vlad and I co-founded QuestDB ( https://questdb.io ) with Nic and Tanc. QuestDB is an open source database for time series, events, and analytical workloads with a primary focus on performance ( https://ift.tt/2DmHzQo ). It started in 2012 when an energy trading company hired me to rebuild their real-time vessel tracking system. Management wanted me to use a well-known XML database that they had just bought a license for. This option would have required to take down production for about a week just to ingest the data. And a week downtime was not an option. With no more money to spend on software, I turned to alternatives such as OpenTSDB but they were not a fit for our data model. There was no solution in sight to deliver the project. Then, I stumbled upon Peter Lawrey’s Java Chronicle library [1]. It loaded the same data in 2 minutes instead of a week using memory-mapped files. Besides the performance aspect, I found it fascinating that such a simple method was solving multiple issues simultaneously: fast write, read can happen even before data is committed to disk, code interacts with memory rather than IO functions, no buffers to copy. Incidentally, this was my first exposure to zero-GC Java. But there were several issues. First, at the time It didn’t look like the library was going to be maintained. Second, it used Java NIO instead of using the OS API directly. This adds overhead since it creates individual objects with sole purpose to hold a memory address for each memory page. Third, although the NIO allocation API was well documented, the release API was not. It was really easy to run out of memory and hard to manage memory page release. I decided to ditch the XML DB and then started to write a custom storage engine in Java, similar to what Java Chronicle did. This engine used memory mapped files, off-heap memory and a custom query system for geospatial time series. Implementing this was a refreshing experience. I learned more in a few weeks than in years on the job. Throughout my career, I mostly worked at large companies where developers are “managed” via itemized tasks sent as tickets. There was no room for creativity or initiative. In fact, it was in one’s best interest to follow the ticket's exact instructions, even if it was complete nonsense. I had just been promoted to a managerial role and regretted it after a week. After so much time hoping for a promotion, I immediately wanted to go back to the technical side. I became obsessed with learning new stuff again, particularly in the high performance space. With some money aside, I left my job and started to work on QuestDB solo. I used Java and a small C layer to interact directly with the OS API without passing through a selector API. Although existing OS API wrappers would have been easier to get started with, the overhead increases complexity and hurts performance. I also wanted the system to be completely GC-free. To do this, I had to build off-heap memory management myself and I could not use off-the-shelf libraries. I had to rewrite many of the standard ones over the years to avoid producing any garbage. As I had my first kid, I had to take contracting gigs to make ends meet over the following 6 years. All the stuff I had been learning boosted my confidence and I started performing well at interviews. This allowed me to get better paying contracts, I could take fewer jobs and free up more time to work on QuestDB while looking after my family. I would do research during the day and implement this into QuestDB at night. I was constantly looking for the next thing, which would take performance closer to the limits of the hardware. A year in, I realised that my initial design was actually flawed and that it had to be thrown away. It had no concept of separation between readers and writers and would thus allow dirty reads. Storage was not guaranteed to be contiguous, and pages could be of various non-64-bit-divisible sizes. It was also very much cache-unfriendly, forcing the use of slow row-based reads instead of fast columnar and vectorized ones.Commits were slow, and as individual column files could be committed independently, they left the data open to corruption. Although this was a setback, I got back to work. I wrote the new engine to allow atomic and durable multi-column commits, provide repeatable read isolation, and for commits to be instantaneous. To do this, I separated transaction files from the data files. This made it possible to commit multiple columns simultaneously as a simple update of the last committed row id. I also made storage dense by removing overlapping memory pages and writing data byte by byte over page edges. This new approach improved query performance. It made it easy to split data across worker threads and to optimise the CPU pipeline with prefetch. It unlocked column-based execution and additional virtual parallelism with SIMD instruction sets [2] thanks to Agner Fog’s Vector Class Library [3]. It made it possible to implement more recent innovations like our own version of Google SwissTable [4]. I published more details when we released a demo server a few weeks ago on ShowHN [5]. This demo is still available to try online with a pre-loaded dataset of 1.6 billion rows [6]. Although it was hard and discouraging at first, this rewrite turned out to be the second best thing that happened to QuestDB. The best thing was that people started to contribute to the project. I am really humbled that Tanc and Nic left our previous employer to build QuestDB. A few months later, former colleagues of mine left their stable low-latency jobs at banks to join us. I take this as a huge responsibility and I don’t want to let these guys down. The amount of work ahead gives me headaches and goosebumps at the same time. QuestDB is deployed in production, including into a large fintech company. We’ve been focusing on building a community to get our first users and gather as much feedback as possible. Thank you for reading this story - I hope it was interesting. I would love to read your feedback on QuestDB and to answer questions. [1] https://ift.tt/11g71v6 [2] https://ift.tt/39KJE6k [3] https://ift.tt/2CUWTH9 [4] https://ift.tt/30TsPDL... [5] https://ift.tt/37TmfQV [6] https://ift.tt/2EiOfCz July 28, 2020 at 03:57PM
- Get link
- X
- Other Apps
Popular posts from this blog
How do you build your body in 1 week?
How do you build your body in 1 week? Ready to see real changes in just seven days? This guide shows how to kick start your body's muscle development . You'll learn about targeted workouts, smart nutrition, and wellness habits. Even in a week, small adjustments can spark noticeable improvements in strength and energy. Edit Full screen View original Delete Body health This plan helps lay the groundwork for lasting progress by focusing on exercise routines, meal planning, and rest. Every step is designed to maximize your body’s potential in a short time frame. Key Takeaways Body...
استفتاء - A REFERNDUM.
استفتاء . ******* استفتاء حيران ... العقل أم الجنان ؟.. ألمس نار الحنان .. وتحيط بكل جانب .. بأدق ما في الزوايا .. تسكن مني الحنايا.. تدعو والقلب راغب .. يملأ حولي المكان .. ويدق مثل الكمان .. فتفيض فيها الحِسان .. أتُراها لا تُبالي ؟؟.. أتحس حقا بِبالي ؟؟.. أرفض منها العناد .. فيعاتبني الفؤاد .. ويواسيني خيالي .. ما جدوى أن تُحاسب ؟.. تبغي مثلي الأمان .. إذّ يجمعنا أوان .. لا نخشاه الزمان .. وتقلبات الليالي .. أبلغ منها منالي .. وتُكللنا مواكب .. وتنال مني كتاب .. لِتُوَحدُنا رواية .. وتُصورُنا مرايا .. ونظل في كل آن .. نجمان يسبحان .. ويُقِرُنا الاثنان .. العقلُ والجِنان . بقلمي السيد أحمد إبراهيم 16\11\1994 ترجمة قصيدة استفتاء. A referendum. ******* Wondered referendum ... The mind or the heavens? Touch the fondness of fire. Surrounded by each side. As exact in the corners. Live in all my curves. Invites the heart wants. Fill around the place. Beats like a violin. It is overflowing with grace. is she Don't care?? It fell off my mind ???. I reject such stubbornness. Then m...
H. Welton Flynn: A Pioneer in Public Service
H. Welton Flynn: A Pioneer in Public Service By Jeremy Menzies This February for Black History Month we're highlighting one of San Francisco's most important figures in transportation, H. Welton Flynn. Flynn is best known for his key leadership at the SF Public Utilities Commission and as the inaugural chairman of the SFMTA Board of Directors. Mr. Flynn was the first African American appointed to a city commission and served the longest term of any commissioner in San Francisco’s history. He served under six different mayors, being elected to the office of President or Chairman more than a dozen times. Welton Flynn in 1978 during the inaugural run of Muni light rail in the Market Street Subway. Flynn’s leadership on the Public Utilities Commission was crucial during this era of change and adoption of new technology in San Francisco transit. Flynn's long tenure in transit began with his appointment to the Public Utilities Commission, which oversaw Muni operations, in 19...
Comments
Post a Comment