Database name sanitizing not done well

1) Hyphens are replaced by underscores. The code claims that this is because Mysql does not like hyphens in database names, yet the Mysql 5.0 documentation allows them. 2) The checks are incomplete. For example, there is a maximum length on Mysql database names, and the names must not contain characters that the file system disallows (so /\ are not allowed). Less relevant but still possible is a code-255 character which is disallowed as well. 3). Checking and sanitizing are distributed across multiple files: the code that calls fix_database_name repeats some (but not all) of the checks and changes that fix_database_name does.

I propose removing these checks, and relying on the database itself to determine what is a valid name. For concrete database creation, errors are reported anyway. For the database prefix, simply test-create databases (e.g. if the prefix is "pre", create databases "pre" and "prex"). If it works, the prefix is ok; if a database with that name already exists, the prefix is ok, too; if the database does not exist and cannot be created, the prefix is not ok. (Just make sure to clean up the test-created databases afterwards.)

Status: 
Closed (fixed)

Comments

1) We chose not to allow dashes as even though MySQL supports them, they are pain to work with in SQL as they need to be quoted to avoid the dash being interpreted as a special character.

2) I will add a check on name length in the next Virtualmin release. Non alpha-numeric characters should be blocked already though.

3) Good idea .. I will consolidate that code into a single function.

Relying on the DB to report errors is a little messy and user-unfriendly though. I'd prefer to have Virtualmin do the checks..

Maybe I'm overlooking something - what's the problem with CREATE DATABASE my-own-database (or CREATE DATABASE "my-own-database" if using ANSI mode)?

Potential problems that I see are doubling any quote characters in the database name, and stripping out any NULs just to to stay on the paranoid side... both easy to do, so what else?

I agree that using database error codes can be messy. That's why my own database code bases any decisions just on whether the SQL succeeded or failed, and avoids inspecting the actual error code at (almost) all cost.

Letting the database do the checks would be more accurate, and less work in the long run as Virtualmin would never have to adapt to changing database name rules. (I can easily imagine future version of Mysql accepting longer names, or Postgresql imposing different restrictions than Mysql... well, the code says there already are differences.)

I did some more testing, and found that DBs with a - in their names are actually easier to work with than I first feared .. so I will allow them in the next Virtualmin release.

Also, the max DB name length will be computed from the mysql.db table's Db column, rather than being hard-coded.

I did some more testing, and found that DBs with a - in their names are actually easier to work with than I first feared .. so I will allow them in the next Virtualmin release.

Also, the max DB name length will be computed from the mysql.db table's Db column, rather than being hard-coded.

I did some more testing, and found that DBs with a - in their names are actually easier to work with than I first feared .. so I will allow them in the next Virtualmin release.

Also, the max DB name length will be computed from the mysql.db table's Db column, rather than being hard-coded.

I still don't understand what's the problem with letting the database check database name validity.

Virtualmin is running commands to create the database anyway, and check whether creation succeeded; why the additional checks?

Asking mysql.db.db isn't going to buy you much - if the column width gets increased, it's still possible that not all of Mysql's codebase can handle the longer names already. In the (admittedly unlikely) case that Mysql decides to use CLOBs, you may get a bogus field length (LOBs have a long history of quirks and metadata bugs); to make sure that retrieving the length value works as expected, you'd have to manually check that with each new release of Mysql.
I guess it's easier to just hardcode the limit and check the docs whether the limit increased...

If you want to be really thorough, plan to check against the list of reserved words. (Which is version-dependent.)

The reason to do my own checks is to display a more friendly error message - sure I could just try creating whatever name the user enters, but that would mean displaying a potentially confusing MySQL error message.

Ah, I found an interesting bit that may explain the particular form of Virtualmin's database name checks.

In the Java world, the DatabaseMetaData interface gives access to all sorts of database specifics, such as whether the database supports and/or preserves letter case in table identifiers, and similar nitty-gritty details.

The function DatabaseMetaData#getExtraNameCharacters() returns a String, and is documented like this:
> Retrieves all the "extra" characters that can be used in unquoted identifier names (those beyond a-z, A-Z, 0-9 and _).

Not relevant if the database name is quoted, as Virtualmin does in its quotestring subs. (Note that quotestring doesn't cover all bases, neither the mysql nor the pgsql version doubles quotes inside the name it's supposed to quote. If that's a factor, and DBI is an option, consider using DBI's quote_identifier.)

Automatically closed -- issue fixed for 2 weeks with no activity.