Monday, August 12, 2019

SQL Relay 1.6.0 Release Announcement

Version 1.6.0 of SQL Relay, the powerful connection pool, load balancer, query router, and database proxy, is now available.

This release mainly addresses some recently discovered regressions, but also adds some internal features that required the minor version to be bumped.



ChangeLog follows:

  • added begin, commit, rollback events
  • fixed array_init() calls for php-7.3
  • integrated my_bool fix for mysql 8.0.1+
  • mysql sslmode=require/prefer + bad sslca/sslcapath generates warning rather than error now (like the mysql cli)
  • refactored various routines that parse bind variables out of queries
  • added bindvariabledelimiters config option to define supported bind variable delimiters
  • added fakeinputbindvariablesunicodestrings config option
  • added bind variable delimiters config methods to c++ api
  • replay trigger can now run a query (e.g. "show engine innodb status") and log the results to a file when a replay condition occurs
  • replay trigger doesn't log/replay selects by default now (but this is configurable)
  • updated normalize translation to support queries containing binary data
  • fixed a backslash-escape bug in the normalize translation
  • refactored some sqlrclient api private methods
  • refactored various bind-manipulation/detection methods
  • sqlr-listener creates tmpdir now on start, if it doesn't exist (because this is often in /run, which is often a tmpfs)
  • postgresql connection module forces re-fetch of column data after execute now
  • everything uses charstring::isYes/isNo now, instead of direct comparisons against "yes" or "no"
  • fixed subtle sqlexecdirect bug
  • fixed subtle sqlserver max-varchar bind length bug
  • fixed various subtle sqlserver bugs where column-info isn't valid until after execute
  • odbc connection module sets column precision = column length if column precision = -1
  • when using odbc on front and back end, the object type works in SQLTables now
  • result set translations work with "show databases/tables/etc." queries with an ODBC backend now
  • increased oid buffer sizes in postgresql connection
  • fixed typemangling->tablemangling typo in postgresql connection - tablemangling should work without typemangling now
  • fixed a '...\\''...' parsing bug
  • non-odbc connection modules now return odbc-compatible(ish) table lists
  • client info is no longer reset during endSession
  • fixed a bug that could cause sqlite "show tables like '...'" to crash
  • fixed odbc unicode null user/password bug
  • fixed PyString_AsString for python 3.<3
  • fixed bug that caused some MSSQL lobs to sometimes be returned as nulls when using ODBC on the backend
  • fixed bug that caused some MSSQL date fields to get returned as garbage
  • fixed a few older sqlrclient compatibility bugs
  • fixed SQLFetch parameter type mismatch in ODBC api
  • removed a non-c++17-compliant "register" from custom_nw logger
  • added support for nodejs 12
  • SQLDriverConnect can take an inline DSN now
  • fixed odbc maxcolumncount=-1 crash
  • odbc, db2, and informix set bind format error now

Rudiments 1.2.1 Release Announcement

Version 1.2.1 of Rudiments, the C++ class library for developing systems and applications, is now available.

This is a minor bug-fix release. ChangeLog follows:

  • charstring::isYes includes "on" and charstring::isNo includes "off"
  • tabs are url-encoded correctly now
  • "unsafe" characters are url-encoded now
  • httpEscape uses character::isAlphanumeric now (to improve performance)
  • some file-descriptor-passing tweaks for modern FreeBSD
  • fixed some json parsing bugs
  • updated default_md=sha256 in ca.cnf to generate ca.pem in tests
  • fixed a possible double-free in listener::cleanUp

Wednesday, April 17, 2019

SQL Relay 1.5.2 Release Announcement

Version 1.5.2 of SQL Relay, the powerful connection pool, load balancer, query router, and database proxy, is now available.

This patch release features support for PHP 7.3 and MySQL 8.0 and some internal updates.

Thursday, March 28, 2019

SQL Relay 1.5.0 Release Announcement

Version 1.5.0 of SQL Relay, the powerful connection pool, load balancer, query router, and database proxy, is now available.

This release includes some significant new features, and the usual number of obscure bug fixes.





Notable Changes and Improvements


MySQL Front-End Module

The most notable feature of this release is the integration of the MySQL Front-End Module into the standard (free) SQL Relay distribution. Previously this module was only available as part of the SQL Relay Enterprise Modules, and had to be purchased and installed separately.

The MySQL Front-End Module allows SQL Relay to speak the MySQL client-server protocol and effectively act as a transparent proxy for MySQL applications. This enables MySQL apps to take advantage of other SQL Relay features, such as Connection Pooling, Throttling, High Availability, Query Routing, Query Translation, Query Filtering, and Connection Schedules. Since SQL Relay supports a variety of database backends, the app can also be redirected to any of these databases, instead of the MySQL database it was originally written to use.

MySQL Performance

Several tweaks were made to the MySQL connection module as well. Most significantly, mysql_stmt_reset() is only called when needed now, significantly improving performance when api=stmt is used (the default).

SHA Auth Modules

sha1 and sha256 auth modules have been added. Basically, this means that sha1 or sha256 password hashes can be stored in the sqlrelay.conf file, rather than plaintext passwords. Previously, md5 was the only hash supported.
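As a rough illustration of what such a hash looks like, here is a minimal Python sketch that computes hex digests with the standard hashlib module. The function name password_hash is made up for this example, and whether sqlrelay.conf expects exactly this hex-encoded form is an assumption; check the auth module documentation before relying on it.

```python
import hashlib

# Algorithms mentioned in the announcement: md5 (older), sha1 and sha256 (new).
SUPPORTED = ("md5", "sha1", "sha256")


def password_hash(password, algorithm="sha256"):
    """Return the hex digest of a password (hypothetical helper)."""
    if algorithm not in SUPPORTED:
        raise ValueError("unsupported algorithm: " + algorithm)
    return hashlib.new(algorithm, password.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # "examplepassword" is just a placeholder value.
    print(password_hash("examplepassword", "sha256"))
```

A sha256 digest is 64 hex characters long and a sha1 digest is 40, so the two are easy to tell apart in a config file.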

NodeJS 11 Support

NodeJS 11 is now supported.

Row Fetch Error Reporting

Errors that occur during a row-fetch are now reported. This resolves a long-standing oversight. Basically, in the past, if a row-fetch failed, then the cursor was presumed to have fetched the last row. In practice, row-fetch errors are somewhat rare. Result set translation modules make them more likely though, so error reporting has been implemented. Of course, it should have always been there...


Less Notable, But Still Important Changes and Improvements


Various ODBC Improvements

ODBC uses different begin-transaction queries for different databases now.

Various improvements were also made to unicode-related functionality.

ODBC gets column info after prepare (before execute) in all cases now.

Various MySQL Improvements

Re-added a conditional mysql_stmt_reset call to automatically recover from "Commands out of sync" errors.

Made mapping of 0-scale decimals to longlong's in the protocol module optional, and disabled by default.

Fixed varchar type mapping (from char to varstring).

Fixed some subtle bind-related bugs in the protocol module.

Database names can apparently contain characters which are otherwise invalid in MySQL identifiers. So, the connection module quotes database name when selecting the db now.

MySQL queries can contain raw binary data, preceded by the _binary introducer. So, bytestring::copy() is used now when copying queries between buffers, rather than charstring::copy().

Perl DBI Driver Fixes

The Perl DBI driver allows leading delimiters in bind variable names now.

Fixed bind_inout_param -> bind_param_inout in Perl DBI docs.

Various Firebird Fixes

Fixed an int64 datatype decimal problem, and a blob error that could cause SQL Relay to hang.


Even Less Notable, Or Behind-The-Scenes Changes and Improvements

countBindVariables correctly ignores @@'s now.

Added bulk load, bind variable translation, and module data frameworks. These are actually quite powerful, but not yet wrung out or documented.

Added error handling to translation frameworks.

Added per-tx/session memory pools, and migrated sqlrtranslations memory pool to per-session pool.

All tx-queries are intercepted now (including autocommit on/off) and "in-transaction" is now tracked. Added a controller-level endTransaction() method and endTransaction/endSession methods to all modules.

Added an (incomplete) replay trigger module.

Fixed a systemctl enable bug that affected some Ubuntu platforms.

Rudiments 1.2.0 Release Announcement

Version 1.2.0 of Rudiments, the C++ class library for developing systems and applications, is now available.

This release adds a few minor features, and fixes a few minor bugs...

The jsonsax/dom classes handle escaping correctly now.

The url class features a getError() method which returns more detailed error information than the error class. For example, if there's a protocol error, as opposed to an operating-system-level error, then url::getError() returns it.

A sha256 class has been added. The sha1, sha256, and md5 classes now prefer to use libcrypto implementations, if they are available, as they might be hardware accelerated, but fall back to internal implementations if they are not available.

hash::getHash() returns binary data now, for all hashes. Previously sha1/256 returned binary data and md5 returned a string.

charstring::hexEncode()/hexDecode(), and charstring::before()/between()/after() methods have been added to the charstring class.
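The difference between a binary digest and its hex-encoded string form, which is the distinction behind the hash::getHash() change and the new hexEncode()/hexDecode() methods, can be illustrated in Python. This sketch uses Python's hashlib and bytes methods as stand-ins; it is not the Rudiments API, and the helper names are made up.

```python
import hashlib


def hex_encode(raw):
    """Turn binary data into a printable hex string."""
    return raw.hex()


def hex_decode(text):
    """Turn a hex string back into the original binary data."""
    return bytes.fromhex(text)


if __name__ == "__main__":
    raw = hashlib.md5(b"hello").digest()   # binary: 16 bytes, not printable
    text = hex_encode(raw)                 # string: 32 hex characters
    print(len(raw), len(text))             # prints: 16 32
    print(hex_decode(text) == raw)         # prints: True
```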

The url class now supports setting the user agent and other headers for http urls. It also supports making http POST requests.

The various container classes (linkedlist, avltree, dictionary, etc.) all support remove/clearAnd(Array)Delete methods. The removeAnd(Array)Delete methods remove the node from the container and delete the value stored in the node as well. The clearAnd(Array)Delete methods operate similarly, removing all nodes.

Tuesday, October 30, 2018

Accessing Teradata from SQL Relay

Intro

Teradata, the company, has been around, in one form or another, since 1979. Their EDW-focused database has been around since the mid-80's. That's years before "data-warehouse" was even a term. I always love when something has been around longer than the word for what the thing is.

In the late 90's I remember reading that Wal-Mart was data-mining more than a terabyte database of customer info. At the time, I couldn't even get my mind around how much data that was. I couldn't even imagine the storage array. Turns out, that was a Teradata database, and it had apparently been running for years (since 1992) when I finally heard about it. In the early 90's, I'm sure the term "terabyte" technically existed, but I doubt many people had actually spoken the word.

Pretty amazing!

I recently needed to get SQL Relay talking to a Teradata database. A quick search revealed an Express version and ODBC drivers for every conceivable platform, so I figured it would be straightforward. It turned out to be reasonably straightforward, but I did hit a few snags worth mentioning to anyone else pursuing the same goal.

Details follow...


Teradata Express

Teradata Express is available as a VMware image. You have to create an account to download it, but once you do, you're presented with a 7z file that you can extract on linux as follows:

7za x TDExpress16.20.12.01_Sles11_20180620112938.7z

It expands into a directory named TDE16.20.12.01 which contains a vmx file and a couple of vmdk files. Running it is as simple as aiming VMware Player or Workstation at the vmx and clicking Start. It requires 4G of memory to run though, so make sure to shut down or suspend whatever else you need to in order to free that up before you actually click Start, or your system will be paging for the next 10 minutes.

When you start it up, VMware will ask you if you moved or copied the VM. In this case, it's safe to click either "I Moved It" or "I Copied It". "I Copied It" will just assign the NIC a new virtual MAC address.

The OS is apparently SuSE Enterprise 11, and VMware tools is already installed. It boots up to a super primitive-looking X login, but after you log in, the desktop appears to be Gnome 2 with a few SuSE customizations.

The root password is "root". Poking around, there appear to be other logins too, but I haven't tried any of them.

The database starts by default. If you manage to stop it, there's an icon on the desktop to restart it.

The database also comes configured with a user named "dbc" with password "dbc".

There's an icon for the Teradata Studio Express on the desktop. It's a full-featured graphical database shell, like Toad, or RazorSQL, or other similar tools.

It appears to be based on Eclipse too, which I thought was neat. I'd long heard that Eclipse isn't just an IDE, but rather a framework for building generic IDEs, and Eclipse-proper is more like a flagship-example of the technology. I'd never actually seen (or noticed) another Eclipse-framework-based tool though, until this one.

You can create a new database connection to the local server using:

  • Connection Profile Type: Teradata
  • Database Server Name: localhost
  • User Name[Domain]: dbc
  • Password: dbc

After connecting, you can run queries and play around, in general.

If you want to create another user, you can log in as dbc and run a command like:

create user testuser as password=testpassword perm=524288000 spool=524288000;

The perm and spool arguments are required. The perm parameter specifies the maximum amount of permanent storage allocated to the user. The spool parameter specifies the amount of "spool space" allocated to the user. It's not immediately clear what either of these parameters means in practice. I don't remember where I even got those sizes from, probably from some example online, but I haven't run into any problems with them yet, in my Express instance.

One quirk though... After running the command, the UI replaces the password with *'s, and then displays a red X to the right of the query. Hovering over the red X pops up an error. This may lead you to think that the query failed. It didn't. Rather, the error is just complaining about the *'s being invalid SQL. It kind-of makes sense, but it's confusing.

To drop a user, log in as dbc and run a command like:

drop user testuser;

Various Teradata command line tools are also installed, like bteq (the command line db shell), fastload, and tdload. If you want to write some programs, ODBC and JDBC drivers are also installed, and the VM comes with gcc 4.3 and Oracle Java 8. ODBC doesn't appear to be configured though.

I'm sure that the database has various limits imposed on it. I don't know what they are offhand, or whether I've even run up against them or not. If you're just interested in getting familiar with the technology though, it's a pretty good environment.

The virtual NIC is configured to grab an address from DHCP by default, but you'll probably want to give it a static IP if you plan on hitting the database from another machine. Just navigate to Computer -> Control Center -> Network Settings to access the network configuration tool. The only quirky bit is that the VM is apparently configured without a hostname, and the tool is rather adamant that you give it one.

In case you want to access the VM remotely, the ssh server is running, and root logins aren't disabled.


Installing Teradata ODBC

I wanted to access the database from SQL Relay on a remote machine - another VM, running Fedora 26. The most straightforward way to do this seemed to be to install and configure the Teradata ODBC Driver for Linux and then configure SQL Relay to use that.

The driver seems semi-straightforward to install. BUT! It tends to inadvertently sabotage your existing ODBC infrastructure, if you have one. So, it actually requires a bit of special handling.

The problem is that it contains its own copies of:

  • /lib/libodbc.so
  • /lib/libodbcinst.so
  • /lib64/libodbc.so
  • /lib64/libodbcinst.so

I guess this is so that it doesn't have to depend on an existing unixODBC installation. I don't know. All I know is, that if you install the software, it will overwrite those files, and the new files tend to break previously-working ODBC configurations.

So, if you have an existing unixODBC installation, then to safely install the Teradata driver alongside it, you have to move those files out of the way, do the installation, then move them back. The Teradata driver appears to work fine with the libraries provided by unixODBC, at least in Fedora 26.

If you have an existing unixODBC installation, here's what you have to do to install the Teradata Driver:

If you have 32-bit unixODBC installed:

cd /lib
sudo mv libodbc.so libodbc.so.save
sudo mv libodbcinst.so libodbcinst.so.save

If you have 64-bit unixODBC installed:

cd /lib64
sudo mv libodbc.so libodbc.so.save
sudo mv libodbcinst.so libodbcinst.so.save

If you're running some version of Linux other than Fedora, then you'll have to find and move the appropriate libraries for your platform.

To actually install the driver:

tar xfz tdodbc1620__linux_indep.16.20.00.36-1.tar.gz
cd tdodbc1620
sudo ./setup_wrapper.sh

Hit return to allow it to install in /opt

Afterwards, it will run unattended and install tdodbc1620-16.20.00.36-1.noarch.rpm. Oddly, this rpm is marked "noarch" but it actually installs binaries for both x86 and x64 platforms.

Post-install, you have to move the newly installed libodbc.so and libodbcinst.so links out of the way, as follows:

cd /lib
sudo mv libodbc.so libodbc.so.teradata
sudo mv libodbcinst.so libodbcinst.so.teradata
cd /lib64
sudo mv libodbc.so libodbc.so.teradata
sudo mv libodbcinst.so libodbcinst.so.teradata

Then, if you have 32-bit unixODBC installed:

cd /lib
sudo mv libodbc.so.save libodbc.so
sudo mv libodbcinst.so.save libodbcinst.so

Or, if you have 64-bit unixODBC installed:

cd /lib64
sudo mv libodbc.so.save libodbc.so
sudo mv libodbcinst.so.save libodbcinst.so

And that is it. The ODBC driver for Teradata is now installed.


Configuring Teradata ODBC

Configuring the Teradata ODBC Driver is a lot simpler than installing it.

You don't have to add anything to /etc/odbcinst.ini. Just append a DSN like the following to /etc/odbc.ini:

[teradata]
# This key is not necessary and is only to give a description of the data source.
Description=Teradata Database ODBC Driver 16.20

# Driver: The location where the ODBC driver is installed to.
Driver=/opt/teradata/client/16.20/lib64/tdataodbc_sb64.so

# Required: These values can also be specified in the connection string.
DBCName=192.168.123.101
UID=testuser
PWD=testpassword

# Optional
AccountString=
CharacterSet=ASCII
DatasourceDNSEntries=
DateTimeFormat=IAA
DefaultDatabase=
DontUseHelpDatabase=0
DontUseTitles=1
EnableExtendedStmtInfo=1
EnableReadAhead=1
IgnoreODBCSearchPattern=0
LogErrorEvents=0
LoginTimeout=20
MaxRespSize=65536
MaxSingleLOBBytes=0
MaxTotalLOBBytesPerRow=0
MechanismName=
NoScan=0
PrintOption=N
retryOnEINTR=1
ReturnGeneratedKeys=N
SessionMode=System Default
SplOption=Y
TABLEQUALIFIER=0
TCPNoDelay=1
TdmstPortNumber=1025
UPTMode=Not set
USE2XAPPCUSTOMCATALOGMODE=0
UseDataEncryption=0
UseDateDataForTimeStampParams=0

The most important parameters are:

  • DBCName - the hostname or IP of the database (192.168.123.101 in my case, but may be different in your environment)
  • UID - the username to log in to the database with (testuser in my case, but could also be dbc or another user)
  • PWD - the password corresponding to the UID

The rest of the parameters do various things which can be researched online, but aren't critical to change for general operation.
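Since a missing or empty DBCName, UID, or PWD is the most likely mistake in a DSN like this one, a quick sanity check can help. This is a minimal sketch using Python's standard configparser module; the function name missing_dsn_keys and the choice of "required" keys (the three above plus Driver) are assumptions made for this example, not anything mandated by Teradata or unixODBC.

```python
import configparser

# Keys this sketch treats as mandatory: the three the post calls most
# important, plus Driver. This list is an assumption, not a Teradata rule.
REQUIRED_KEYS = ("Driver", "DBCName", "UID", "PWD")


def missing_dsn_keys(odbc_ini_text, dsn):
    """Return the required keys that are absent or empty in the given DSN."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case; configparser lowercases by default
    parser.read_string(odbc_ini_text)
    if not parser.has_section(dsn):
        raise KeyError("DSN not found: " + dsn)
    section = parser[dsn]
    return [key for key in REQUIRED_KEYS if not section.get(key, "").strip()]


if __name__ == "__main__":
    # In practice, read this text from /etc/odbc.ini instead.
    sample = """
[teradata]
# comment lines like this one are ignored
Driver=/opt/teradata/client/16.20/lib64/tdataodbc_sb64.so
DBCName=192.168.123.101
UID=testuser
PWD=
"""
    print(missing_dsn_keys(sample, "teradata"))  # prints ['PWD']
```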

Once configured, you can connect to the database using isql, provided by unixODBC.

$ isql teradata
+---------------------------------------+
| Connected!                            |
|                                       |
| sql-statement                         |
| help [tablename]                      |
| quit                                  |
|                                       |
+---------------------------------------+
SQL> select 1
+-----+
| 1   |
+-----+
| 1   |
+-----+
SQLRowCount returns 1
1 rows fetched
SQL>

If that works, then the Teradata ODBC Driver has been configured successfully.


Configuring SQL Relay

The ultimate goal here is to access Teradata through SQL Relay. After getting everything else working, that last bit is pretty simple. Assuming SQL Relay is already installed, all you have to do is update the configuration file (either sqlrelay.conf or a file in sqlrelay.conf.d) with a teradata instance, as follows:

<?xml version="1.0"?>
<instances>

    <instance id="teradataexample" dbase="odbc">
        <users>
            <user user="exampleuser" password="examplepassword"/>
        </users>
        <connections>
            <connection string="dsn=teradata;user=testuser;password=testpassword;autocommit=yes"/>
        </connections>
    </instance>

</instances>

In the connection tag, the dsn option must match the DSN defined in /etc/odbc.ini, and the user/password options must match the UID/PWD defined in that DSN.
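That matching rule is easy to get wrong by hand, so here is a small Python sketch that checks it. It assumes the connection string is a simple semicolon-separated list of key=value pairs, as in the example above; the function names are made up, and values that themselves contain a semicolon are not handled.

```python
def parse_connect_string(connect_string):
    """Split 'key=value;key=value' into a dict with lowercased keys."""
    options = {}
    for part in connect_string.split(";"):
        if not part.strip():
            continue  # tolerate a trailing semicolon
        key, _, value = part.partition("=")
        options[key.strip().lower()] = value.strip()
    return options


def find_mismatches(connect_string, dsn_name, dsn_uid, dsn_pwd):
    """Return the connection-string options that disagree with the DSN."""
    options = parse_connect_string(connect_string)
    expected = {"dsn": dsn_name, "user": dsn_uid, "password": dsn_pwd}
    return [key for key, value in expected.items() if options.get(key) != value]


if __name__ == "__main__":
    string = "dsn=teradata;user=testuser;password=testpassword;autocommit=yes"
    # An empty list means the string agrees with the DSN values.
    print(find_mismatches(string, "teradata", "testuser", "testpassword"))  # prints []
```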

The user/password defined in the user tag are the user/password that you'll use to log into SQL Relay itself.

To start it up:

sqlr-start -id teradataexample

To access the database:

$ sqlrsh -host localhost -user exampleuser -password examplepassword
sqlrsh - Version 1.4.0
 Connected to: localhost:9000 as exampleuser

 type help; for help.

0> create table test (col1 int, col2 varchar(200));
 Rows Returned   : 0
 Fields Returned : 0
 Elapsed Time    : 0.063005 sec

0> insert into test values (1,'hello');
 Rows Returned   : 0
 Fields Returned : 0
 Elapsed Time    : 0.029303 sec

0> select * from test;
col1 col2 
==========
1    hello

 Rows Returned   : 1
 Fields Returned : 2
 Elapsed Time    : 0.030262 sec

0> drop table test;
 Rows Returned   : 0
 Fields Returned : 0
 Elapsed Time    : 0.077361 sec

0> quit;

To shut it down:

sqlr-stop -id teradataexample

If all of that worked, then you can now access Teradata from SQL Relay.


Quirks

SQL Relay -> ODBC -> Teradata is generally usable, but ODBC drivers are quirky, as a rule, so there are probably things that don't work correctly. I've already discovered a few esoteric ones and I'll be updating the ODBC connection module with Teradata-specific workarounds as appropriate.

If you run into anything weird, please report it to support@firstworks.com.

Thanks!

Tuesday, October 23, 2018

Wrangling mydumper and Auto-Increment Columns

One of my client's apps has a large MySQL database that I regularly have to get a dump of and reload a local database from the dump for dev and testing. To expedite this process, I recently started using the excellent mydumper/myloader in place of the venerable mysqldump. It does shave considerable time off of the reload process, but I got strange results the first few times I tried to use it.

Basically, there were a few tables that look like this in production:

col1  |  col2
------+----------------
  0   |  some value
  1   |  some other value
  2   |  some third value
 ...  |  ...and so on...

When I'd use mysqldump to dump them, and source the dump to load my local db, it would reliably dump/load the same data, in the same order.

When I'd use mydumper/myloader, I'd get something like:

col1  |  col2
------+----------------
  1   |  some other value
  2   |  some third value
 ...  |  ...and so on...
  20  |  some value

The first row (with col1 = 0) would end up at the end of the table and col1 would have the wrong value. The app really cares too, so I had to figure out what was happening, and how to fix it.

This took a while, but the problem ultimately came down to:

  • col1 is an auto-increment column
  • mydumper and mysqldump both output instructions to:
    • create the table
    • reset the next-auto-increment value to whatever it was in the database that is being dumped
    • insert rows, using the exact values from the database that is being dumped
  • by default, if you insert a 0 into an auto-increment column, mysql substitutes the next auto-increment value for the 0
  • mysqldump outputs one big .sql file to create tables and insert rows, and it includes instructions to disable this behavior
  • mydumper outputs lots of individual files (for each table - a script to create the table, and a binary full of data to load into it) but none of them contain instructions to disable this behavior

So, basically if you source a script created by mysqldump, if col1 = 0, then you get a 0 for col1, but if you use myloader to load the output of mydumper, if col1 = 0, then you get whatever the next auto-increment value is. For all other values of col1, both work as expected.
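The difference can be reproduced with a toy model. The following Python sketch is not MySQL; it is a simplified simulation of an auto-increment column, with a made-up class and function, that follows the rules described above: a 0 is replaced by the next auto-increment value unless NO_AUTO_VALUE_ON_ZERO is in effect.

```python
class AutoIncrementTable:
    """Toy model of a table whose col1 is an auto-increment column."""

    def __init__(self, next_value=1, no_auto_value_on_zero=False):
        self.next_value = next_value
        self.no_auto_value_on_zero = no_auto_value_on_zero
        self.rows = []

    def insert(self, col1, col2):
        # NULL (None) always triggers auto-generation; 0 does too, unless
        # NO_AUTO_VALUE_ON_ZERO is in effect.
        if col1 is None or (col1 == 0 and not self.no_auto_value_on_zero):
            col1 = self.next_value
        self.rows.append((col1, col2))
        # An explicit value at or above the counter moves the counter past it.
        self.next_value = max(self.next_value, col1 + 1)
        return col1


def load_dump(dump_rows, next_value, no_auto_value_on_zero):
    """Replay dumped rows into a fresh table and return them ordered by col1."""
    table = AutoIncrementTable(next_value, no_auto_value_on_zero)
    for col1, col2 in dump_rows:
        table.insert(col1, col2)
    return sorted(table.rows)


if __name__ == "__main__":
    dump = [(0, "some value"), (1, "some other value"), (2, "some third value")]
    # Like sourcing a mysqldump script: the 0 is kept.
    print(load_dump(dump, 3, no_auto_value_on_zero=True))
    # Like myloader with the default sql_mode: the 0 becomes 3.
    print(load_dump(dump, 3, no_auto_value_on_zero=False))
```

With the mode off, the row that had col1 = 0 ends up with the next auto-increment value and sorts to the end of the table, which is the symptom described earlier.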

Working around the problem was tricky. There's no obvious way to add a SET SESSION sql_mode='...,NO_AUTO_VALUE_ON_ZERO' to the mydumper binary data file. I thought about adding it to each of the table-create scripts, but I wasn't sure if the data-load would necessarily be run in the same session as the create, so the SET SESSION might not even be in effect when the data was being loaded.

I could have run a SET GLOBAL sql_mode='...,NO_AUTO_VALUE_ON_ZERO' prior to the dump, but I reboot the system all the time and I'd certainly forget to re-run it the next time, and even if I remembered, I'd forget exactly what it was that I needed to run. The only sure-fire solution was to alter the sql_mode of the server itself.

This involved adding a line to /etc/mysql/my.cnf, in the [mysqld] section, like this:

[mysqld]
...
sql-mode = NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION,NO_AUTO_VALUE_ON_ZERO

...and bouncing the server. Actually, there's probably some way to get the server to reload the options, but I could easily bounce mine, so I did.

The NO_AUTO_VALUE_ON_ZERO part was the operative part, but I had to add the other options because they were apparently already in effect, according to:

select @@sql_mode
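Building the new sql-mode value from the output of that query is mechanical: keep whatever modes are already in effect and append NO_AUTO_VALUE_ON_ZERO if it is missing. A minimal Python sketch, with a made-up helper name:

```python
def add_sql_mode(current, mode="NO_AUTO_VALUE_ON_ZERO"):
    """Append a mode to a comma-separated sql_mode string, without duplicates."""
    modes = [m.strip() for m in current.split(",") if m.strip()]
    if mode.upper() not in (m.upper() for m in modes):
        modes.append(mode)
    return ",".join(modes)


if __name__ == "__main__":
    # The argument is whatever "select @@sql_mode" returned on your server.
    print(add_sql_mode("NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION"))
    # prints NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION,NO_AUTO_VALUE_ON_ZERO
```

The result is the string that goes after "sql-mode =" in the [mysqld] section.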

The sketchy part of this solution is that I'm not 100% sure that somewhere deep in that app (or in another app that uses the same DB), there isn't some query that relies on NO_AUTO_VALUE_ON_ZERO being disabled. So, arguably, this isn't the best solution. I don't know of a better one though, so for now I'm going with this.

Ideally, it would be great if mydumper handled this, per-session, itself. I think I'll submit a feature request...

(Update: looks like there's already an open issue for this: https://github.com/maxbube/mydumper/issues/142)