Friday, February 24, 2017

SQL Relay 1.0.1 Release Announcement

Version 1.0.1 of SQL Relay, the powerful connection pool, load balancer, query router, and database proxy, is now available.

This release features lots of small improvements, mainly for issues discovered during attempts to make it compliant with Redhat's packaging guidelines, and for a set of issues reported by users.



Thread Fix

This is fairly important.

There was a long-standing bug in the sqlr-listener process that caused multiple threads to access the same variable, without coordination. The symptom was that the sqlr-listener process might crash, randomly, every few days. The variable was a buffer used to build log/notification messages, so the error could only occur if logging or notifications were configured, but it could happen if there were logger or notifications tags in the config file, even if the modules themselves were disabled. The workaround was to comment out the entire logger and notification tag groups, or to configure the sqlr-listener to be multi-process instead of multi-thread (using sessionhandler="process").

None of that is necessary now though. The bug is fixed. Local variables are used to build the messages in the parts of the sqlr-listener that can be run by multiple threads.

sqlrsh Fixes

You can now hit ctrl-D to abort sqlrsh without generating a segfault. Also, errors are now printed to stderr and if an error occurs (in non-interactive mode) then sqlrsh exits with status 1 instead of 0.

ODBC Bug

1.0.0 featured an ODBC fix which itself created a new bug. Between the two, the first line of the result set could be duplicated or skipped. Both bugs should be fixed now.

PDO on 32-bit Systems

The PHP PDO driver compiles correctly for PHP 7 on 32-bit systems now.

Ruby Updates

Ruby semi-recently consolidated Fixnum and Bignum classes into the Integer class. The Ruby API has been updated to support this change.

Router Sequence Fix

There was a bug that made it necessary to start all of the instances that a router instance would route to prior to starting the router itself. Eg. if you had 2 instances: routermaster and routerslave, and a third instance: router, then routermaster and routerslave had to be started prior to router. This wouldn't always happen if the instances were configured to start at boot though. Now, they can be started in any order.

Normalize Translation

Fixed a bug in the "normalize" query translation that caused it to remove spaces before unary ! and - when they followed a keyword. For example, it will now return "select !a,!b from mytable" instead of "select!a,!b from mytable".

Upates to the Drop-in Replacement Libraries

As recommended by rpmlint, the .so links aren't installed for these libraries any more. Now, the major version must be included in the LD_PRELOAD command. Eg: LD_PRELOAD=/usr/local/firstworks/lib/libmysql51sqlrelay.so.4

The Drop-in Replacement Library for MySQL now pays attention to the SQLR_MYSQL_MAP_DATE_TO_DATETIME environment variable. If it's set to "yes" then DATE types are mapped to DATETIME. This is useful if you're mapping a MySQL app to an Oracle database, have NLS_DATE_FORMAT set to include the time, and your app truncates the time if the type of the column is DATE instead of DATETIME.

configure Script Options

The --disable-wall and --disable-werror options have been added to the configure script to work around quirks on systems who's header files generate warnings or particular errors.

The --with-abs-max-connections option has been added to set the absolute maximum number of connections that can be configured at compile time. It defaults to 4096, up from 512 in previous releases.

The --with-fortify option has been added to enable -O -D_FORTIFY_SOURCE=2 -fstack-protector-strong compiler options.

The --enable-oracle-at-runtime (and similar options for sap, db2, and informix) have been added to enable runtime loading of the database libraries rather than linking them in at compile time.

Error Messages

Various error messages have been improved to suggest solutions.

Erlang Reference

An API reference exists for the Erlang API now.

Cache Manager

The mysterious sqlr-cachemanager process isn't enabled by default any more. This process manages the client-side result set cache, only needs to be run on SQL Relay clients, and only if an app caches result sets. As such, it is installed when building from source, but isn't enabled by default. Also, in the RPM distribution it has been split out into its own RPM. The docs have also been updated to explain when to run the process.

"run", "log", and "cache" directories and file names

This is more of an internal change, but...

In the past there were various directories under .../var/sqlrelay/tmp that contained files used at runtime by the various instances of SQL Relay. These have been consolidated into .../var/run/sqlrelay. The "log" directory has been changed from .../var/sqlrelay/log to .../var/log/sqlrelay. The "cache" directory has been changed from .../var/sqlrelay/cache to .../var/cache/sqlrelay. These changes were motivated by Redhat packaging guidelines, but I think they have roots in the Linux Standard Base.

Sockets created by the sqlr-listener process to communicate with the sqlr-connection processes now have .sock extensions, and 700 permissions rather than erroneous 711 permissions). Files used by ftok() to get IPC ids now have .ipc extensions. "Up-down" files used to indicate whether the database is up now have .up extensions. The sockseq file is no longer used. Instead, the file names of sockets used to resume suspended sessions are based on the process id of the sqlr-connection process.

Full Changelog follows:

  • updated sqlr-bench to create "db"-bench.png
  • added edoc-generated API docs for the Erlang API
  • updated some module-load error verbage
  • added initial support for loading Oracle, SAP/Sybase, DB2, and Informix libraries on-demand at runtime
  • overhauled spec file
  • fixed PDO driver for PHP 7 on 32-bit platforms
  • added permissions to all open() with O_CREAT calls
  • added --with-fortify configure option
  • fixed a segfault on ctrl-D in sqlrsh fixed uncoordinated access to shared debugstr among listener threads by using local instances instead
  • added ping-loop to router connection module's logIn() method to manage cases where routed-to instances are started after the router
  • moved .../var/sqlrelay/tmp/*/* to .../var/run/sqlrelay
  • added .sock extension to sockets, .ipc extension to ipc key files, .pid extension to pid files, and .up extension to files which indicate that the db is up
  • changed mode of listener-connection sockets to 700 from 711
  • moved .../var/sqlrelay/log to .../var/log/sqlrelay moved .../var/sqlrelay/cache to .../var/cache/sqlrelay
  • libmysql*sqlrelay.so and libpqsqlrelay.so are no longer installed
  • replaced sockseq-generated unix socket with pid-based unix socket
  • updated ruby api to support unification of Fixnum and Bignum classes into Integer class
  • added --with-abs-max-connections configure script parameter
  • increased default absolute max connections to 4096 (from 512)
  • updated "can't open ... file" errors to mention user/group
  • added SQLR_MYSQL_MAP_DATE_TO_DATETIME option to mysql drop-in replacement library
  • fixed normalize translation not to remove spaces before unary - and !
  • updated sqlrsh to set error code correctly and write errors to stderr
  • sqlr-cachemanager is no longer enabled at boot by default
  • fixed another subtle bug in the ODBC driver that caused the first row to be duplicated if columns were not bound
  • added --disable-wall and --disable-werror configure options

Rudiments 1.0.4 Release Announcement

Version 1.0.4 of Rudiments, the C++ class library for developing systems and applications, is now available.


This release primarily replaces the dependency on readline with a dependency on libedit. I'd used readline directly in SQL Relay for years and one day decided it would be better if it were pushed up into Rudiments as a prompt class. However, I didn't notice that readline was GPL rather than LGPL, so using rudiments with readline wasn't necessarily GPL compliant. Actually, I guess it would only not be compliant if the application it was ultimately linked into wasn't a GPL application... But, to eschew any such doubts, Rudiments now uses libedit (which has a BSD license) instead of readline.

Some other, extremely obscure, features were added too.

Full Changelog follows.

  • added support for EOT to codetree
  • added loop-avoidance to xmldomevents class
  • added alias support to codetree class
  • replaced readline with libedit
  • included implementation of history_truncate_file for versions of libedit that don't provide it
  • added --disable-wall and --disable-werror configure options

Sunday, January 22, 2017

Rudiments 1.0.3 Release Announcement

Version 1.0.3 of Rudiments, the C++ class library for developing systems and applications, is now available.


This release mainly fixes some packaging issues, and a really subtle issues in the prompt class. The file::open() method also fails with EINVAL if it is used to create a file but no permissions are given, which would otherwise cause the method to crash on systems where FORTIFY_SOURCE is used. No pressing need to update unless these issues sound familiar.

Sunday, December 25, 2016

Rudiments 1.0.2 Release Announcement

Version 1.0.2 of Rudiments, the C++ class library for developing systems and applications, is now available.


There was actually an unannounced 1.0.1 as well. 1.0.1 mainly contained fixes for OpenSSL 1.1.x. 1.0.2 just adds another SSL tweak that's required by Fedora for packaging. No pressing need to upgrade unless you're having trouble building on a platform that uses OpenSSL 1.1.x.

Wednesday, December 21, 2016

SQL Relay 1.0.0 Benchmarks

Introduction

There are a lot of reasons to use SQL Relay, but one of the most popular is just to improve throughput, usually measured in queries-per-second.

Logging in to most databases takes a noticable amount of time...

So much time, in fact, that transient apps (such as web-based applications) which start up, connect, query, disconnect, and shut down to deliver each bit of content, often take longer to log in than to run the queries. As a result, overall throughput (queries-per-second) is generally low unless many queries are run per connection.

SQL Relay maintains a pool of already-logged-in database connections, and logging in to a connection pool is substantially faster than logging into the database directly.

As a result, overall throughput (queries-per-second) should be maximized, and shouldn't depend nearly as much on the number of queries that are run per connection.

In theory.

Anecdotally this is the case and informal testing has always confirmed it. But, I wanted hard data and pretty graphs to prove it. Also, it had been a long time since I'd done any formal performance testing. For all I knew, I'd done something recently to make everything run horribly.


sqlr-bench

Of course there are popular tools for benchmarking apps and databases. JBench, for example. But they all use intermediate layers - JDBC, ODBC, web pages... I wanted raw comparisons: the database's native API vs. SQL Relay's native API. I also wanted it to be easy to run from the command-line, and automatable.

Months ago, I'd begun writing a program called "sqlr-bench" which aimed to run a battery of tests against the database directly, run the same tests through SQL Relay, collect stats, and produce a graph. It had to support every database that SQL Relay supported. It needed a dozen or more tunable paramters... It was a fairly ambitious project. I started and stopped work on it several times but I eventually got it working well enough.

It creates a table, populates it, and runs a series of selects. First it runs 1 select per connection, then 2 per connection, then 3, etc. It times all of this and calculates queries-per-second as a function of queries-per-connection. The whole process is done first against SQL Relay, and then directly against the database. The data is then correlated and gnuplot is used to produce a graph.

Various parameters are configurable, but the defaults are:

  • rows: 256
  • colums: 16
  • column size: 32
  • rows to fetch at once: 10
  • samples: 10
  • queries: 30

Initial Results

I ran the tests against each database, and against instances of SQL Relay running on the local machine (over a unix socket).

Theoretically I figured I should see...

  • ...against the database directly: The number of queries-per-second should increase steadily as the number of queries-per-connection increases. It should start off to be pretty bad, then get better and better, and eventually level off so that running more queries-per-connection doesn't make any more difference.
  • ...against SQL Relay: The number of queries-per-second should be independent of the number of queries-per-connection, and should be roughly the same as the maximum number of queries-per-second that can be run against the database directly.

Performance against the database directly was exactly as I expected, and it generally seemed to level off at about 20 queries-per-connection.

SQL Relay actually does have a small ramp-up period (2 or 3 queries per-connection is slower than 4 or more), which I didn't expect, but the number of queries-per-second were otherwise uncorrelated with the number of queries-per-connection, which I did.

Overall though, SQL Relay did not immediately perform as well as I expected. In fact, for most databases, there was a "break-even" between 5 and 10 queries per-connection.

While apps that use SQL Relay would still likely perform better than if run directly against the database (how many web pages run more than 5 queries that return 256 rows each), I really expected it to perform better.


Tweaks

Thus began days of profiling and analyzation. It appeared that, indeed, over the years new features had steadily introduced inefficiencies.

In most cases, things just needed to be reused instead of reallocated. Everything from database cursors to byte arrays. I replaced a few regular expressions with simple string compares, and made a few case-insensitive string compares case-sensitive. There were also some alignment and padding issues that prevented optimized strcpy/memcpy from being used. My socket buffer sizes were sub-optimal...

Lots of little changes.


Results

In the end, I was pleased with the results. Most of them, anyway.

Informix:

Oh yeah, that looks great.

MySQL:

Still looking good.

Oracle:

Not bad, but why the falloff?

DB2:

It's good where it counts, but why does DB2 eventually outperform SQL Relay?

SAP/Sybase:

Again, it's good where it counts, but throughput is generally low. And again, why the falloff?

PostgreSQL:

My goodness, PostgreSQL is fast!

I checked the code over and over to make sure my test program was legitimately running queries and not just erroring out or something. It was. Why is it so much faster than everything else? Strace showed that the socket is in non-blocking mode and that they do a poll() then read(), basically blocking manually rather than doing a blocking read. My preliminary tests with a similar strategy didn't show much improvement with that over a blocking read though. Non-blocking poll()/read() is a good way to multiplex reads through a single thread, but if you're only reading from one socket at a time, I wouldn't expect it to help at all.

Very strange.

SQLite:

Well, I guess I shouldn't really be disappointed that SQL Relay is slower than SQLite :)

FreeTDS:

82 queries-per-second??? Does SQL Server limit throughput?


What now?

The falloffs are strange. It's odd that it's so inconsistent. There doesn't appear to be a memory leak. Heap fragmentation?

Performance against DB2 and SAP is odd. It shouldn't break even so quickly.

Performance against PostgreSQL is the most strange. Whatever they're doing, SQL Relay needs to be doing it too.

And what's the deal with MS SQL Server?

These things deserve investigation.

I ran tests with SQL Relay on a server between the client and database too. The results were similar, but there was more of a ramp-up. I should publish those results too.


Test Environment

To perform these tests, I configured 2 Fedora Linux VMware VM's. One for the databases and one to run SQL Relay. Both VM's had 1 CPU. The database VM had 2Gb of RAM. The SQL Relay VM had 512MB.

The DB VM ran modern versions of all databases, in their default configurations, without any performance tweaks.

I built Rudiments and SQL Relay without debug and with -O3 optimizations like: CPPFLAGS="-O3" ./configure

The SQL Relay configurations were all pretty standard too. For example, the Oracle configuration was something like:

 <instance id="oracletest" port="9000" socket="/tmp/test.socket" dbase="oracle">
  <users>
   <user user="test" password="test"/>
  </users>
  <connections>
   <connection string="user=...;password=...;oracle_sid=..."/>
  </connections>
 </instance>

For the PostgreSQL tests I disabled typemangling.

The sqlr-bench program can be found in the test/bench directory of the SQL Relay distribution. It must be run from that directory. I ran it with default parameters. There's a benchall.sh script there too, which starts and stops the appropriate SQL Relay instances and runs sqlr-bench. I actually used that script rather than running the tests individually.

Tuesday, December 20, 2016

SQL Relay 1.0.0 Release Announcement

Version 1.0.0 of SQL Relay, the powerful connection pool, load balancer, query router, and database proxy, is now available.

At great length: 1.0.0!

The client APIs has been stable for a long time. The server APIs have been stable for long enough. Recent efforts during the past few releases updated the internal structure such that significant internal changes can be made without affecting the API or ABI. Yes, it's definitely time for a 1.0.0 release.

Of course, no project is ever feature-complete. There are still plenty of tasks in the backlog, but it should be possible to implement most, or all, of them without breaking the API and ABI.

New Features

The most noticeable new feature in this release is that all DB abstraction layer drivers support the same set of connect-string options, including TLS and Kerberos options, as well as columnnamecase, resultsetbuffersize, dontgetcolumninfo, nullsasnulls, and lazyconnect options. The options are named subtly differetly for each driver, following the conventions for the DB abstraction layer, but all options are present.

Changes

This release mainly features a bunch of internal changes though. Most improve performance or stability in one way or another.

The most outwardly noticeable change is that column names are case-sensitive now when getting a field by name. The docs never promised case insensitivity, and the performance improvement is notable. Also, there are options for upper and lower-casing column names, if you need them to be converted one way or the other.

Another semi-noticeable changes is the removal of calls to mysql_stmt_store_result/mysql_stmt_num_rows in the mysql database connection module. Actually the removal of them isn't the noticeable part. Rather, that since they have been removed, the row count is no longer available immediately, if a result-set-buffer-size is being used. This is consistent with almost all other databases though, and by default, no result-set-buffer-size is used.

Another, nearly unnoticeable change, is that by default now, connections start with 1 cursor, but will scale up, on-demand, to 5 cursors. Of course, maxcursors can be set higher or lower than 5, but it defaults to 5.

Most other changes were fixes for subtle bugs or tweaks to improve performance.

Full ChangeLog follows:

  • updated md5 pwdenc to use rudiments md5 class
  • solved a long-standing hang that could occur if the database password expired while sqlrelay was running, and the sqlr-scaler attempted to start new connections
  • added convertcase, convertcasedoublequotes, and removedoublequotes parameters to normalize translation module
  • added ColumnCaseName parameter to ODBC driver
  • fixed a subtle bug in the ODBC driver that caused the first row to be skipped if columns were not bound
  • added -backtrace option for debugging
  • fixed mysql connection module so it can use the statement API on Windows
  • fixed semaphore reference-before-init bug in sqlrservercontroller
  • added support for db, debug, columnnamecase, dontgetcolumninfo, nullsasnulls, and lazyconnect connect-string options to all drivers (though in some they are camel-case and others lower-case)
  • in DB-abstraction-layer drivers, the debug, dontgetcolumninfo, nullsasnulls, lazyconnect, krb, and tls connect-string options now support any yes/no equivalent such as yes, Yes, y, Y, true, True, 1, (and similar for no)
  • fixed a regression that could cause crashes in the C#/ADO driver when binding null or date values
  • added all parameters to ODBC config screen
  • removed calls to mysql_stmt_store_result/mysql_stmt_num_rows from mysql connection to improve performance. mysql connection doesn't know the total number of rows prior to full fetch now (which is consistent with most other databases).
  • updated postgresql connection to use an unnamed cursor instead of naming it and having to deallocate it over an over
  • tweaked skipWhitespaceAndComments and related methods to use charstring::isNullOrEmpty and character::isWhitespace
  • replaced regular expression comparisons in sap/freetds connections with simpler charstring comparisons
  • updated db2 connection to reuse the statement instead of recreating it with each prepare
  • made column-lookup by name case sensitive to improve performance
  • fixed nagle-disable and sizing of socket read/write buffers
  • changed default maxcursors to 5
  • fixed an uninitialized variable in the firebird connection
  • python api's are installed in dist-packages subdirectory on systems that have that instead of site-packages
  • added route-entire-session option to router modules

Rudiments 1.0.0 Release Announcement

Version 1.0.0 of Rudiments, the C++ class library for developing systems and applications, is now available.


Yes, that's right, 1.0.0. It's been a very long time coming, but Rudiments has finally become stable enough to call it version 1. The last few releases have been aggressively targeting this. Now it's time.

So, is Rudiments complete? Hardly. But, the core API has been stable enough, for long enough. Also, the general design has proven to make it easy enough to modify the internals quite a bit without changing the API or ABI. For the most part, anyway.

So what's new in this release? Lots of little fixes. Several features that I've been putting off adding. Several bugs that I've been putting off fixing.

Probably the most significant change is the addition of an avltree class and use of it in various places.

Other big changes include refactoring of the thread class such that the thread and process classes both follow the same spawn/detach/exit/wait paradigm, and refactoring of the filesystem class so that it and the file class both follow the same open/getCurrentProperties/close paradigm.

An md5 class was also added. It and the sha1 class are now children of a hash class.

Full ChangeLog follows:

  • images directory and contents are installed with the docs now
  • added md5 class
  • moved sendSignal/raiseSignal to process class
  • implemented process::backtrace() for windows
  • added missing charstring::getLastOfSet(OrEnd) methods
  • refactored thread class to match process class paradigm
  • added avltree class
  • updated dictionary to use an avltree
  • update xmldom to use a dictionary for its string cache
  • file::getContents(fd) gets the entire file contents now, even if the current position in the file isn't the beginning
  • fixed detection of whether timed semaphore ops are supported on platforms that just implement semtimedop() to return -1 and set ENOSYS
  • fake certs aren't generated now (in tests) on Windows, if ssl is disabled, or if openssl isn't found
  • added charstring::isYes/isNo
  • fixed high/low-bits calculation bugs in memorymap and sharedmemory for Windows
  • fixed a maxsize bug in memorymap for Windows
  • changed filesystem paradigm to match file paradigm - open/close/getCurrentProperties
  • added O_BINARY to various open()'s in file class
  • fixed threadmutex::wait() return value on Windows
  • fixed timezone offset during DST for systems with the "timezone" external variable
  • filesystem class returns NULL consistently for methods that return const char * but that aren't implemented by the OS
  • sys::getSuggested* return 1024 if they are unsupported by the OS now
  • added dlerror() calls before dlopen/dlsym/dlclose to clear stale errors
  • added fs type-name to fs type-id mapping for systems with statvfs
  • added type-name and type-id for SCO htfs
  • if unsetenv doesn't exist, then environment::remove() checks to see if the variable exists at all before setting it to an empty string now, to avoid creating it unnecessarily
  • xmldomnode writes out numeric entites like È correctly now
  • fixed groupentry::initialize(NULL) on UnixWare
  • implemented missing unixsocketserver::close()
  • fixed filesystem::open(fd) for Windows 8+
  • fixed file::changeOwner() for Windows 8+
  • initialized timeremaining in snooze::nanosnooze for systems that don't set it at all if the full sleep occurred (eg. arm64 linux)
  • renamed linkedlistutil_* to node_*
  • added commandline::getValue/found that take arg and abbr parameters
  • added supported() methods to sharedmemory/semaphoreset classes
  • renamed supportsXXX() methods to supported()
  • unix sockets are faked on syllable now
  • tweaked dynamicarray::find() to greatly reduce the number of comparisons it does and improve performance significantly
  • various performance tweaks to memorypool
  • updated memory pool to 8-byte-align allocations
  • renamed filedescriptor::get/setTcpRead/WriteBufferSize to
  • filedescriptor::get/setSocketRead/WriteBufferSize
  • fixed *entry::getAliasList() to return NULL under all circumstances if there are no aliases
  • added missing libraries to the library link command