Database support

From CatchChallenger wiki
Jump to: navigation, search

Multiple database connector (mysql, sqlite, postgresql, ...) is supported or:

  • Qt sync connector to be universal (SQLite is more to have single player mode)
  • C custom connector to be optimized and asynchrone to never slow down if the database respond slowly, this is mostly based on epoll for linux

Usage

Used into CatchChallenger:

  • Fixed data length index (as login hash)
  • Index on integer
  • Partial index (most player don't have clan, then index when the value is 0 is useless)

Not used into CatchChallenger:

  • Relational SQL part:
    • It will be better for good dev, but bad dev manipulate too much data to do stupid thinks
    • More hard to read, understand
    • CatchChallenger read and serialize/unserialise binary, why not do that's?
    • Some engine/Database don't support it (MyISAM)

Minimal usage of:

  • Index on arbitrary length of data (pseudo and clan name)

Support

Qt support C async connector Partial index NoSQL Second index Blob index Replication
SQLite Yes No No No Yes Yes No
MySQL/MariaDB/Percona Yes Yes No No Yes Yes Yes
PostgreSQL Yes Yes Yes No Yes Yes Yes
Oracle Yes No  ??? No Yes Yes Yes
File (not done, into the roadmap) Yes Yes Yes Yes Partial Partial Yes

Prepared statement

  • Performance: PostgreSQL using LLVM JIT: 5~7x of speed up included on TPC-H. Prepared statement into CatchChallenger free the CPU for the compression to even more performance, skip parsing part and interpreting. This is very useful on tiny CPU as ARM or Intel Atom.
    • On PostgreSQL 9.5.2 on Raspberry pi 1 (128KB L2 CPU cache), it's 3x improvement on database CPU time with CatchChallenger allinone CLI epoll server. Only only few query time reduction, on specific query (can be 2x as can be 1x -> no improvement).
  • Security: Have minor benefit on security, because mostly number is manipulated. But as it don't have negative impact on the rest, is good.
  • Network: This allow exploit the native protocol to improve the performance and reduce the bandwidth used.
  • Bug prevent: Detect the query syntax error at the startup of the server

Performance

If the disk is 2MB/s, the database to be secure will write at this speed, if it usafe it write at the memory speed 100MB/s while the 256MB is not full, mean: 256MB/(100MB/s-2MB/s)=2.6s

This is the best case, without take care of slow down, hdd seak time, barrier problem, sync problem, ...

Query sort in cluster

Query-short-cluster.png

In case of not sort, it's more easy to split the user query (more intuitive), and more balanced

  • Cache don't scale: More user = more cache miss
  • Hdd don't scale: Need seek on 100% of the disk
  • Space don't scale: need have all the data

In case of sort, the sort should be fast to not introduce latency. You need analyse the query to statistically balance it correctly. You need partion your data to improve the used data ratio into a same zone.

  • Cache scale: But for CPU cache and memory: 10x more server it's like 10x cache and memory
  • Hdd scale: Seek on sort part of the disk, 1/4 if 4x server and full content on each server and partitioning, mean 4x less seek time
  • Space scale: scale with the server count