Month: July 2008

Scripts for loading up DBGEN TPC-H data

If you’re interested in creating a TPC-H schema for testing purposes, then the following scripts may be of use to you:

Unix Scripts:
Multi Load TPCH
Load TPCH Stream

SQL*Loader Control files:
REGION Table
NATION Table
SUPPLIER Table
CUSTOMER Table
PART Table
PARTSUPP Table
ORDERS Table
LINEITEM Table

You may wish to read and check them before you use them – they’re not exactly rocket science but they seem to do the job.

I have all the files in the same directory for simplicity sake.

I then use them to create a scale factor 1 target TPC-H schema using the following calls:

# “l” loads the REGION and NATION tables
On completion of the registration cheap online levitra process a mail will be sent to you confirming your request. The plant’s ability to increase Facts about viagra samples sale viagra sexual desire and improve sexual function. One medicine that has stood out in the market, and these equipment are utilized by skilled spebuy generic cialis http://davidfraymusic.com/events/meyerson-hall/ts in different fields. Thus, next time do buy whenever you are out for grocery purchase female viagra davidfraymusic.com shopping. ./multi_load_tpch.sh 1 l “tpch/tpch@test” 1

# “s” loads the SUPPLIER table
./multi_load_tpch.sh 1 s “tpch/tpch@test” 10

# “c” loads the CUSTOMER table
./multi_load_tpch.sh 1 c “tpch/tpch@test” 10

# “p” loads the PART and PARTSUPP tables
./multi_load_tpch.sh 1 p “tpch/tpch@test” 10

# “o” loads the ORDERS and LINEITEM tables
./multi_load_tpch.sh 1 o “tpch/tpch@test” 10

Obviously, you need to change the connection string to match your environment.

Caveats:

  1. Obviously, they are supplied as is – use at your own discretion and risk.
  2. You need to have created the target schema tables and made sure they are empty as the SQL*Loader control files use APPEND.
  3. Bear in mind that choosing too high a number of parallel streams (the last parameter in the calls) will overload your machine so try and balance it against the available system resources.

Bugs, issues or questions, please get in touch…enjoy.

Creating a TPC-H schema with DBGEN on HP-UX

I wanted to try out this HammerOra product from Steve Shaw, both at work and on my box at home…but after playing with it at home, I realised that it takes quite some time to build even a small (scale factor 1) TPC-H schema…I know it runs serially, but I’m still not quite sure why it’s that slow (on my system that is), but Steve does say it can take a while and that you might wish to consider using the TPC utility DBGEN to generate and load the schema quicker…particularly if you use some manual parallelisation.

Given that I also need to use this tool to help with some benchmarking at work, I decided to try to get DBGEN to run on a HP-UX box today and had one or two problems which I managed to sort out. The machine in question is an RP8420 running HP-UX B.11.11.

DBGEN is a utility that allows you to create a series of flat files which contain the data for a TPC-H schema. You can then use SQL*Loader to load these into appropriately constructed tables in an Oracle database – any database actually…but I only care about Oracle of course 😉

The utility can be called with various parameters including making the target datasets in smaller “child” files which can be created in a manually parallelised fashion to speed the whole process up. You have to download the DBGEN reference data set from the TPC website (lower right hand side).

This reference data set contains the ANSI C source code which makes the DBGEN executable (and QGEN also…but more on that another day)…unfortunately it’s just the source code, so that means you need to compile it yourself…which of course leads to the first problem…that I know diddly squat about C…yeah I know, not very manly! Luckily I can sometimes follow instructions (which come with the reference data set)…

1. Copy makefile.suite to makefile
2. Edit makefile and make the following amendments (in red):

 

################
## CHANGE NAME OF ANSI COMPILER HERE
################
CC = gcc
# Current values for DATABASE are: INFORMIX, DB2, TDAT (Teradata)
# SQLSERVER, SYBASE
# Current values for MACHINE are: ATT, DOS, HP, IBM, ICL, MVS,
# SGI, SUN, U2200, VMS, LINUX, WIN32
# Current values for WORKLOAD are: TPCH
DATABASE= ORACLE
MACHINE = HP
WORKLOAD = TPCH

That’s it for the makefile.

Now, as I mentioned, the ORACLE database is not a listed database variant in the DBGEN C code – it’s got all the other popular RDBMS which I find quite bizarre…I’m sure there’s a reason, but I can’t think of one. To get around this, as per Chapter 5 in “Pro Oracle Database 10g RAC on Linux” by Steve Shaw and Julian Dyke, I added a section to the tpcd.h for the ORACLE database:

 
#ifdef ORACLE
#define GEN_QUERY_PLAN “”
#define START_TRAN “”
#define END_TRAN “”
The pill is formulated using ingredients that are clinically tested to ensure they contain the same as the branded order viagra product used to treat erectile dysfunction or male impotence. Why the ED sufferers cost of viagra pills aspire to have sildenafil jelly rather than tablets? Let’s have a look at the major reasons to prescribe kamagra medicines to the ED patients. Erection difficulties buy levitra australia are also caused by the affect of surgery, including spinal, neurological or pelvic surgeries. You would have surely heard about kamagra- an excellent treatment for male erection disorders, available as tablets, jellies and soft tablets. bulk cialis #define SET_OUTPUT “”
#define SET_ROWCOUNT “”
#define SET_DBASE “”
#endif

I thought that was it – but it still would’t compile, giving the error:

config.h:213:2: #error Support for a 64-bit datatype is required in this release

Looking at the config.h – and bearing in mind I’m no C programmer – it struck me as odd that all bar the HP machine section, had stuff about DSS_HUGE and 64 bits…so I took a punt and copied some lines (in red) from the IBM section into the HP one to see if it worked…and it did. The HP section now looks like this:

 

#ifdef HP
#define _INCLUDE_POSIX_SOURCE
#define STDLIB_HAS_GETOPT
#define DSS_HUGE long long
#define HUGE_FORMAT “%lld”
#define HUGE_DATE_FORMAT “%02lld”
#define RNG_A 6364136223846793005ull
#define RNG_C 1ull
#endif /* HP */

Typing make at the command prompt then compiles the code and produces the dbgen executable…which I then spent a few hours playing with to create a scale factor 1 TPC-H set of files.

 

My next problem was one of my own making really in that I copied the CREATE TABLE statements for the TPC-H target tables from HammerOra’ TCL script for TPC-H creation, but unfortunately, the column ordering is slightly different in those DDL statements as compared to the DBGEN output files…which meant that I created the tables OK, but since I’d copied the column ordering to make the SQL*Loader control files, I got errors when I tried to load some of the files as the column order is different in one or two cases.

 

I then downloaded the TPC-H specification document which has, amongst other things, the data model, from which I cross checked the column ordering of the data model against the columns in the output files and then managed to rerun the data in without any further issues.

 

Tomorrow I’ll try running HammerOra against the target TPC-H schema and make some shell scripts to try and automate most of the process so we can build different scale factor schemae and do so in a manually parallelised fashion – scripts the amiable Scotsman created for his parallel testing a while back should give me a good start with that.

 

PC for manly men?

So, after suffering a hard disk failure, I figured it was time to buy a new PC for (Oracle) research purposes and the choice seemed to boil down to:

  1. Buy a Dell or HP high end PC from their website and pay serious money for it.
  2. Pick a proper server off Ebay – cheaper but may have pitfalls including warranty, dodgy sellers and delivery. (it was interesting to look for E10K Sun boxes on there)
  3. Spec a PC myself and get a box shifter to build and ship it.
  4. Upgrade my current PC with a selection of new bits.

I chose #3 and bought it from a company call Scan. I’m pretty happy with the result and the service I received although due to some DOA parts it took a little longer than I’d hoped…at least they had the problems to deal with instead of me!

#1 is expensive and you’re sort of limited to the options they offer. I costed up something similar to what I’ve ended up buying and it was nearer 3,000 pounds rather than the 1800 ish that I paid.

#2 is cheaper than #1 but these type of machines are really noisy, albeit solid pieces of kit and more akin to what I’d work on during my day job.

#4 and #3 are similar except for who gets the grief of making all the new bits work together, and every time I try to build things myself I get grief with it (usually parts arriving DOA or incompatible with each other). Scan had to deal with a DOA motherboard and CPU amongst other things…rather them than me.

So, #3 it was…and it arrived a few days ago.

Specification is:

In order to retain, the quality of erections Kamagra is available in the online shops where it can be ordered cialis low cost via a phone call to 1-844-844-2435. cheapest tadalafil online Male erectile problem is well-known as inability to keep or sustain healthy erections is now very easily available in various Sex Treatment Centers in Noida where anyone can consult or take treatment from a best sexologist who can provide better solution for any kind of disease. If one finds it difficult to achieve or sustain an erection for satisfactory sexual activity. Our pharmacy store purchase viagra is a blessing for all men suffering from mild cases of erectile dysfunction. It is the contemporary version of viagra pill price .
Pictures (Click on them for bigger images) below:

bigbox_closed_800x600.jpg
bigbox_open_800x600.jpg

Not quite an “enterprise server” and I’m sure it pales into insignificance against any of the kit Kevin uses, but pretty quick.

I’ve configured it for dual boot of Vista 64 Ultimate and Oracle Enterprise Linux 5 (using EasyBCD) and I’m about to start doing some installations and benchmarking…should be fun.

I installed VirtualBox on the Vista 64 OS and it’s working very nicely…well, it’s set up and working…we’ll find out how nicely it’s working when I install Oracle and HammerOra and give it a bit of a kicking!

Who knows, I might even find time to blog about it!