Because all SQL servers implement
different parts of SQL, it takes work to write portable SQL applications.
Very simple SELECTs and INSERTs are easy to port, but the more features you
need, the harder it gets; and if the application also has to be fast with
many databases, it gets harder still!
To make a complex application portable,
you need to decide which SQL servers it should work with.
You can use the MySQL crash-me
program/web page (http://www.mysql.com/information/crash-me.php) to find the
functions, types, and limits you can use with a selection of database servers.
Crash-me is still far from testing everything possible, but it is already
quite comprehensive, with about 450 things tested.
For example, you shouldn't use column
names longer than 18 characters if you want to be able to use Informix or DB2.
Both the MySQL benchmarks and the crash-me
program are very database-independent. By taking a look at how we have handled
this, you can get a feeling for what it takes to write a database-independent
application. The benchmarks themselves can be found in the `sql-bench'
directory of the MySQL source distribution. They are written in Perl with the
DBI database interface (which solves the access part of the problem).
See
http://www.mysql.com/information/benchmarks.html for the results from this
benchmark.
As you can see in these results, all
databases have some weak points. That is, they have different design
compromises that lead to different behavior.
If you strive for database
independence, you need to get a good feeling for each SQL server's bottlenecks.
MySQL is VERY fast at retrieving and updating records, but has a problem
mixing slow readers and writers on the same table. Oracle, on the other hand,
has a big problem when you try to access rows that you have recently updated
(until they are flushed to disk). Transactional databases in general are not
very good at generating summary tables from log tables, because in this case
row locking is almost useless.
To get your application really
database-independent, you need to define an easily extendable interface through
which you manipulate your data. As C++ is available on most systems, it makes
sense to use a C++ class interface to the databases.
If you use a feature that is specific to
one database (like the REPLACE command in MySQL), you should code a method that
implements the same feature (more slowly) for the other SQL servers. With MySQL
you can use the /*! */ syntax to add MySQL-specific keywords to a query; the
code inside /*! */ is treated as a comment (ignored) by most other SQL servers.
Both ideas are sketched below.
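A minimal sketch of both ideas, in PHP for illustration (the helper name,
table, and columns are invented; the text itself shows no code here):

    <?php
    // Hypothetical sketch: build the SQL for a "replace" operation portably.
    // On MySQL we can use the REPLACE statement directly; on other SQL
    // servers we emulate it (more slowly) with DELETE + INSERT.
    function replace_row_sql($server, $id, $name)
    {
        if ($server == 'mysql') {
            return array("REPLACE INTO names (id, name) VALUES ($id, '$name')");
        }
        // Portable fallback: same effect, two statements
        return array("DELETE FROM names WHERE id = $id",
                     "INSERT INTO names (id, name) VALUES ($id, '$name')");
    }

    // The /*! */ syntax: MySQL executes the hidden keywords, while most
    // other SQL servers treat them as an ordinary comment and ignore them.
    $create = "CREATE TABLE lookup (id INT NOT NULL, name CHAR(32)) /*! TYPE=MyISAM */";
    ?>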
If REAL high performance is more
important than exactness, as in some Web applications, one possibility is to
create an application layer that caches all results to give you even higher
performance. By letting old results 'expire' after a while, you can keep the
cache reasonably fresh. This is quite nice under extremely high load, when
you can dynamically increase the cache and set the expire timeout higher
until things get back to normal.
In this case the table creation
information should include the initial size of the cache and how often each
table should normally be refreshed.
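As a rough illustration, here is a minimal sketch of such a cache in PHP
(the function name, cache file location, and the old mysql_* calls are
assumptions, not code from the text):

    <?php
    // Hypothetical sketch of a result cache with an expire timeout. Cached
    // rows are served until they are $ttl seconds old; under extreme load
    // you could raise $ttl to serve staler (but cheaper) answers.
    function cached_query($query, $ttl)
    {
        $file = "/tmp/sqlcache-" . md5($query);
        if (file_exists($file) && time() - filemtime($file) < $ttl) {
            // Fresh enough: serve the cached rows without touching the server
            return unserialize(file_get_contents($file));
        }
        // Expired or missing: run the query and refill the cache
        $result = mysql_query($query);
        $rows = array();
        while ($row = mysql_fetch_array($result)) {
            $rows[] = $row;
        }
        $fp = fopen($file, "w");
        fwrite($fp, serialize($rows));
        fclose($fp);
        return $rows;
    }
    ?>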
What Have We Used MySQL For?
During MySQL's initial development, the
features of MySQL were made to fit our largest customer, which handles data
warehousing for a couple of the biggest retailers in Sweden.
From all the stores, we get weekly
summaries of all bonus card transactions, and we are expected to provide useful
information that helps the store owners find out how their advertising
campaigns affect their customers.
The data is quite large (about 7
million summary transactions per month), and we hold data for 4-10 years that
we need to present to the users. We got weekly requests from the customers,
who wanted 'instant' access to new reports from this data.
We solved this by storing all
information per month in compressed 'transaction' tables. We have a set of
simple macros (scripts) that generate summary tables grouped by different
criteria (product group, customer id, store ...) from the transaction tables.
The reports are Web pages that are dynamically generated by a small Perl script
that parses a Web page template, executes the SQL statements in it, and inserts
the results. We would have used PHP or mod_perl instead, but they were not
available at that time.
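The macros themselves are not shown in the text, but the core of such a
summary step is a single grouped INSERT ... SELECT; a hypothetical sketch
(all table and column names invented):

    <?php
    // Hypothetical sketch: roll a monthly transaction table up into a
    // summary table grouped by product group and store.
    mysql_query("INSERT INTO summary_by_product_group (product_group, store, total)
                 SELECT product_group, store, SUM(amount)
                 FROM trans_1999_06
                 GROUP BY product_group, store");
    ?>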
For graphical data we wrote a simple
tool in C that can produce GIFs based on the result of a SQL query (with some
processing of the result). This tool is likewise executed dynamically from the
Perl script that parses the HTML files.
In most cases a new report can be
created simply by copying an existing script and modifying the SQL query in it.
In some cases, we need to add more fields to an existing summary table or
generate a new one, but this is also quite simple, as we keep all transaction
tables on disk. (Currently we have at least 50G of transaction tables and 200G
of other customer data.)
We also let our customers access the
summary tables directly through ODBC, so that advanced users can experiment
with the data themselves.
We haven't had any problems handling
this with a quite modest Sun Ultra SPARCstation (2x200 MHz). We recently
upgraded one of our servers to a 2-CPU 400 MHz UltraSPARC, and we are now
planning to start handling transactions at the product level, which would mean
a ten-fold increase in data. We think we can keep up with this just by adding
more disks to our systems.
We are also experimenting with
Intel-Linux to get more CPU power more cheaply. Now that we have the
binary portable database format (new in Version 3.23), we will start to use
this for some parts of the application.
Our initial feeling is that Linux
will perform much better under low-to-medium load, while Solaris will perform
better once the load gets high because of extreme disk I/O, but we don't yet
have anything conclusive about this. After some discussion with a Linux kernel
developer, we suspect this might be a side effect of Linux giving so many
resources to the batch job that interactive performance gets very low. This
makes the machine feel very slow and unresponsive while big batches are
running. Hopefully this will be handled better in future Linux kernels.
What is the difference between
mysql_fetch_array and mysql_fetch_object?
mysql_fetch_array — fetches a result row
as an associative array, a numeric array, or both.
mysql_fetch_object — fetches a result
row as an object.
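A short sketch contrasting the two calls on the same result set (the table
and column names are assumptions):

    <?php
    $result = mysql_query("SELECT id, name FROM users");

    $row = mysql_fetch_array($result);   // array: both index styles by default
    echo $row[0];                        // numeric index
    echo $row['name'];                   // associative index

    $obj = mysql_fetch_object($result);  // next row as an object
    echo $obj->name;                     // columns become properties
    ?>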
What are the different table types
present in MySQL?
MyISAM: the default table type, based on the
Indexed Sequential Access Method (see the CREATE TABLE sketch after this list).
ISAM: the older table type that MyISAM replaces.
HEAP: fast in-memory data access, but data is
lost if there is a crash; cannot have BLOB, TEXT, or AUTO_INCREMENT
fields.
BDB: supports transactions using
COMMIT and ROLLBACK; slower than the others.
InnoDB: also supports transactions, like BDB.
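As a minimal sketch, the table type is chosen with the TYPE= option of
CREATE TABLE (table and column names here are invented):

    <?php
    // HEAP table: in-memory, so only fixed-size CHAR columns here, since
    // HEAP cannot store BLOB or TEXT fields.
    mysql_query("CREATE TABLE session_cache (k CHAR(32) NOT NULL, v CHAR(255)) TYPE=HEAP");
    // MyISAM is the default, but the type can also be given explicitly.
    mysql_query("CREATE TABLE orders (id INT NOT NULL, item CHAR(64)) TYPE=MyISAM");
    ?>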