Show HN: QuestDB with Python, Pandas and SQL in a Jupyter notebook – no install

53 points by bluestreak 2 years ago | 15 comments
  • amunra__ 2 years ago
    Hi, I'm Adam Cimarosti, one of the core engineers at QuestDB.

    We built play.questdb.io to make it easy for anyone to try our database. No installation.

    There's a Jupyter Lab notebook, data, sample code, queries and graphs.

    We'd love to hear what you think.

    • jzelinskie 2 years ago
      This is really cool -- congrats on the launch! Similarly, the team over at AuthZed has created a playground for SpiceDB[0], by using WebAssembly and Monaco.

      We debated for hours whether or not to go the notebook route. I'm sure y'all did something similar; would you care to share your reasons for going with the notebook?

      [0]: https://play.authzed.com

      • amunra__ 2 years ago
        We actually do have a web-based demo at https://demo.questdb.io/, preloaded with millions of rows of data.

        That one focuses on SQL queries though.

        The notebook in https://play.questdb.io/ offers a more rounded experience to try out any and all of our features.

        You can use the notebook to try out data ingestion, dropping partitions, and other operations that simply aren't possible in a more sandboxed environment.

        The other part is that we value our Python users and wanted to provide an example of how to use our database in conjunction with other tools commonly used in the data science space to slice and dice time series data.
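        As an illustration, ingesting a Pandas DataFrame from Python might look roughly like this. This is a minimal sketch: it assumes the `questdb` client package and a local server listening on the default ILP port 9009, and the `trades` table and column names are made up for the example -- check the client docs for the exact API.

```python
# Sketch: ingesting a Pandas DataFrame into QuestDB from Python.
# Assumptions: the `questdb` client package (pip install questdb) and a
# local QuestDB instance on the default ILP port 9009. Table and column
# names here are hypothetical.
import pandas as pd

# A small frame of hypothetical trade events, keyed by a timestamp column.
df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2023-01-02 09:30:00", "2023-01-02 09:30:01"]),
    "symbol": ["AAPL", "MSFT"],
    "price": [130.15, 239.00],
})

def ingest(frame: pd.DataFrame) -> None:
    # Deferred import so the sketch can be read without the client installed.
    from questdb.ingress import Sender  # API per the 1.x client docs
    with Sender("localhost", 9009) as sender:
        # `at` names the column to use as the designated timestamp.
        sender.dataframe(frame, table_name="trades", at="timestamp")

# ingest(df)  # uncomment with a running QuestDB instance
```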

    • Cilvic 2 years ago
      Reading your pitch here, I'd love to have a vague idea of what QuestDB is and why I should care.
      • amunra__ 2 years ago
        Most databases store the latest state of something. We don't. We ingest events. After all, life is a function of time :-) The whole world ticks and we take those ticks and store them. If part of your application tracks anything happening over time (trades, ocean pollution levels, ships moving, rocket simulation metrics, or whatever else), then it makes sense to store those events in a time series database.

        What we provide, primarily, is two basic pieces of functionality:

        (1) Taking in lots of events FAST. Our ingestion rate is high, and we also integrate with things like Kafka and Pandas -- see the notebook. Each of our time series tables (we support regular ones too) comes with a designated timestamp column.

        (2) Specialized SQL to make sense of data that changes over time, such as grouping and resampling by time. Take a look at our docs for things like SAMPLE BY, LATEST ON, ASOF JOIN, LT JOIN and more.

        On disk, we also guarantee that all records are sorted by time, which gives us great query performance for these time-based queries.

        PS. We're also wire-compatible with PostgreSQL.
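        Putting those two points together: because the server speaks the Postgres wire protocol, a time-resampling query can be run from Python with an ordinary Postgres driver. The sketch below uses psycopg2 and QuestDB's default Postgres-wire settings (port 8812, user "admin", password "quest", database "qdb"); the `trades` table and its columns are illustrative assumptions.

```python
# Sketch: querying QuestDB over its PostgreSQL wire protocol.
# The table/column names are hypothetical; the connection defaults
# (port 8812, admin/quest, db "qdb") are QuestDB's documented defaults.

# SAMPLE BY resamples rows into time buckets -- here, hourly average price.
QUERY = """
SELECT timestamp, avg(price) AS avg_price
FROM trades
SAMPLE BY 1h;
"""

def run_query():
    # Deferred import so the sketch can be read without the driver installed.
    import psycopg2
    conn = psycopg2.connect(
        host="localhost", port=8812,
        user="admin", password="quest", dbname="qdb",
    )
    try:
        with conn.cursor() as cur:
            cur.execute(QUERY)
            return cur.fetchall()
    finally:
        conn.close()

# rows = run_query()  # requires a running QuestDB instance
```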

        • jmholla 2 years ago
          I was once in the market for time series databases, but all I could find required downsampling older data. I don't know if this has changed, and to be fair I haven't been looking for quite some time, but does yours allow keeping data at the captured precision in perpetuity (or until my hard drive fills up)? My guess from the way you describe your approach is yes, but I wanted to check.
          • doubleg72 2 years ago
            Thanks for the detailed reply. I was curious as well. How does this compare with InfluxDB? I was actually looking into a way to store my own financial data of US equities for backtesting and experimentation a while back. I never did get any further than the planning phase, but this seems like it would be almost ideal for that use case.
            • Wonnk13 2 years ago
              So I guess it would be fair to say you compete with Timescale and Clickhouse as a timeseries database?
            • nwsm 2 years ago
              I take it you did not visit the link?