Thursday, August 1, 2013

TAF for Oracle RAC

During the PoC of the IBM SVC Stretched Cluster for Oracle RAC, some of the tests simulated an Oracle RAC node failure or an entire site failure.

We had configured an Oracle SCAN IP on the cluster, which meant that client connections would be evenly balanced between the two nodes of the Oracle RAC cluster.

We also configured Oracle TAF (Transparent Application Failover).
FAILOVER CONCEPTS from
http://www.oracle.com/technetwork/database/features/oci/taf-10-133239.pdf
Failover allows a database to recover on another system within a cluster. The figure illustrates a typical database cluster configuration. Although the example shows a two-system cluster, larger clusters can be constructed. In a cold failover configuration, only one active instance can mount the database at a time. With Oracle Real Application Clusters, multiple instances can mount the database, speeding recovery from failures.

The failure of one instance will be detected by the surviving instances, which will assume the workload of the failed instance. Clients connected to the failed instance will migrate to a surviving instance. The mechanics of this migration will depend upon the cluster configuration. The Transparent Application Failover feature will automatically reconnect client sessions to the database and minimize disruption to end-user applications.

Here is the tnsnames.ora file with TAF configured that we used for the PoC:

On the client machine

-sh-4.1$ pwd
/home/oracle
-sh-4.1$ cat tnsnames.ora
# tnsnames.ora Network Configuration File: /u01/app/ora11/product/11.2.0/dbhome_1/network/admin/tnsnames.ora
# Generated by Oracle configuration tools.

svccon =
  (description=
    (address=(protocol=tcp)(host=192.168.45.244)(port=1521))
    (address=(protocol=tcp)(host=192.168.45.245)(port=1521))
    (load_balance=yes)
    (connect_data=
      (server=dedicated)
      (service_name=svcdb)
      (failover_mode=
        (type=select)
        (method=basic)
        (retries=180)
        (delay=5)
      )
    )
  )
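The failover_mode settings above mean that, after a failure, the client retries the connection up to 180 times with 5 seconds between attempts (roughly 15 minutes) before giving up. As a rough illustration of that retry behaviour (this is not Oracle's actual client code; `connect` is a hypothetical callable standing in for the OCI reconnect):

```python
import time

def reconnect_with_retries(connect, retries=180, delay=5):
    """Mimic TAF basic failover: call `connect` up to `retries` times,
    sleeping `delay` seconds between failed attempts."""
    last_error = None
    for _ in range(retries):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            time.sleep(delay)
    raise last_error

# Example: a fake connect that succeeds on the third attempt.
attempts = {"n": 0}
def flaky_connect():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("instance down")
    return "session"

print(reconnect_with_retries(flaky_connect, retries=5, delay=0))  # session
```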

Tuesday, July 23, 2013

Incompatible version of libocijdbc[Jdbc:112020, Jdbc-OCI:112030]

I fixed the "Incompatible version of libocijdbc[Jdbc:112020, Jdbc-OCI:112030]" error message that we were seeing while starting Swingbench with the Oracle OCI driver.

Here is what I did to fix the issue:
I installed an older version of the Oracle Client on the machine, and then did the following:

bash-4.2$ pwd
/swingbench/swingbench/lib
bash-4.2$ cp /oh/product/11.2.0/client_2/instantclient/ojdbc5.jar .
bash-4.2$
bash-4.2$ ls -lrt
total 33888
-rw-r--r--    1 oracle   dba          479413 Mar  7 19:52 ucp.jar
-rw-r--r--    1 oracle   dba        11512178 Mar  7 19:52 swingbench.jar
-rw-r--r--    1 oracle   dba           20349 Mar  7 19:52 simplefan.jar
-rw-r--r--    1 oracle   dba           70569 Mar  7 19:52 ons.jar
-rw-r--r--    1 oracle   dba         2152849 Mar  7 19:52 ojdbc6.jar
drwxr-xr-x    2 oracle   dba             256 Mar  7 19:52 launcher
-rw-r--r--    1 oracle   dba          999966 Mar  7 19:52 ant.jar
-rw-r--r--    1 oracle   dba         2095661 Mar  7 19:53 ojdbc5.jar

bash-4.2$ cp ojdbc5.jar ojdbc6.jar

That did it.

Thursday, June 20, 2013

ORA-15032: not all alterations performed ORA-15027: active use of diskgroup "DATA" precludes its dismount

I was trying to start "fresh", so I tried to dismount the ASM diskgroups before trying to drop them. While trying to dismount using the ASM Configuration Assistant, I got the following error messages.

ORA-15032: not all alterations performed
ORA-15027: active use of diskgroup "DATA" precludes its dismount


The fix is to drop the database first, and only then dismount the ASM diskgroup.

Thursday, April 11, 2013

Slow connection between the Oracle RAC nodes

While setting up our Oracle RAC hardware running AIX 7.1, we noticed that a simple 'rsh' between the nodes took a really long time. In this case we were trying to rsh from one node to the other over the private interconnect, which used 10 GigE NICs.

As you can see here, something like "rsh 192.168.150.22 'date'" took an awfully long time.

bash-4.2# time rsh 192.168.150.22 'date'
Thu Apr 11 18:30:41 CDT 2013

real    1m11.372s
user    0m0.002s
sys     0m0.002s

bash-4.2#

The issue was that we did not have a 'hosts' entry in the /etc/netsvc.conf file. The /etc/netsvc.conf file specifies the ordering of name resolution; the order given there for the 'hosts' keyword overrides the default ordering.

After we added the following line to the /etc/netsvc.conf, everything worked as expected.
hosts=local,bind

Here is the rsh after adding the hosts line to /etc/netsvc.conf:

bash-4.2# time rsh 192.168.150.22 'date'
Thu Apr 11 18:31:58 CDT 2013

real    0m0.107s
user    0m0.002s
sys     0m0.002s
bash-4.2#
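Conceptually, `hosts=local,bind` tells the resolver to consult the local /etc/hosts table first and fall back to DNS (bind) only on a miss, which is why the lookup no longer waits on slow DNS queries. A minimal Python sketch of that ordering (the host names and addresses here are made up for illustration):

```python
def resolve(hostname, order, local_hosts, dns):
    """Resolve `hostname` by consulting sources in the order given,
    the way AIX honours the 'hosts' line in /etc/netsvc.conf."""
    sources = {"local": local_hosts, "bind": dns}
    for source in order:
        address = sources[source].get(hostname)
        if address is not None:
            return address
    return None

# With hosts=local,bind the local table wins and DNS is never consulted.
local_hosts = {"racnode2-priv": "192.168.150.22"}   # illustrative /etc/hosts entry
dns = {}                                            # the slow/unreachable DNS server
print(resolve("racnode2-priv", ["local", "bind"], local_hosts, dns))  # 192.168.150.22
```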

Monday, January 14, 2013

Mapping Storage Array(SAN) Volumes to AIX host

I have an older post where I talk about mapping SAN volumes to a Linux host and identifying the volumes that were created on the array.

In this post I will show how to identify volumes created on a Storwize V7000 storage array from the AIX host.


From the Storwize V7000 GUI we can see that I created one volume, test_UID, with UID 60050768027F0001F00000000000016C; the volume was then mapped to an AIX host.

On the AIX host, run the lspv command; this lists the volumes currently available on the host. If you don't see the newly created volume, run the cfgmgr command from the command line, then run lspv again and you will see the new volume that was created and mapped to the host.

 isvp14_ora> lspv
hdisk0          00f62a6b98742dad                    rootvg          active
hdisk59         00f62a6bf28975ed                    swapa           active
hdisk230        none                                None

Running the lsattr command on the AIX host, as shown below, displays a lot of information about the LUN, including the unique_id: 3321360050768027F0001F00000000000016C04214503IBMfcp

From the highlighted part of the unique_id field we can see that this LUN maps to the newly created volume on the storage array.

isvp14_ora> lsattr -El hdisk229
PCM             PCM/friend/fcpother                                 Path Control Module              False
algorithm       fail_over                                           Algorithm                        True
clr_q           no                                                  Device CLEARS its Queue on error True
dist_err_pcnt   0                                                   Distributed Error Percentage     True
dist_tw_width   50                                                  Distributed Error Sample Time    True
hcheck_cmd      test_unit_rdy                                       Health Check Command             True
hcheck_interval 60                                                  Health Check Interval            True
hcheck_mode     nonactive                                           Health Check Mode                True
location                                                            Location Label                   True
lun_id          0x14000000000000                                    Logical Unit Number ID           False
lun_reset_spt   yes                                                 LUN Reset Supported              True
max_retry_delay 60                                                  Maximum Quiesce Time             True
max_transfer    0x40000                                             Maximum TRANSFER Size            True
node_name       0x50050768020000d2                                  FC Node Name                     False
pvid            none                                                Physical volume identifier       False
q_err           yes                                                 Use QERR bit                     True
q_type          simple                                              Queuing TYPE                     True
queue_depth     8                                                   Queue DEPTH                      True
reassign_to     120                                                 REASSIGN time out value          True
reserve_policy  single_path                                         Reserve Policy                   True
rw_timeout      30                                                  READ/WRITE time out value        True
scsi_id         0x10900                                             SCSI ID                          False
start_timeout   60                                                  START unit time out value        True
unique_id       3321360050768027F0001F00000000000016C04214503IBMfcp Unique device identifier         False
ww_name         0x50050768022000d2                                  FC World Wide Name               False

We can use the script below to show the UIDs of all the LUNs on the AIX host.

#!/usr/bin/ksh

for disk in $(lsdev -Cc disk | awk '{print $1}')
do
    echo $disk:
    lsattr -EHl $disk -a unique_id
    echo ---------------------------------------
done
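To go a step further, the volume UID can be pulled out of the unique_id string programmatically. The sketch below assumes, as in this example, that the UID is the 32-hex-digit identifier beginning with IBM's 6005076 prefix embedded in the unique_id; treat it as an illustration, not a parser for every AIX unique_id format:

```python
import re

def extract_volume_uid(unique_id):
    """Pull the 32-hex-digit volume UID out of an AIX unique_id string.
    Assumes a Storwize/SVC volume whose UID starts with the 6005076 prefix."""
    match = re.search(r"6005076[0-9A-F]{25}", unique_id.upper())
    return match.group(0) if match else None

uid = extract_volume_uid("3321360050768027F0001F00000000000016C04214503IBMfcp")
print(uid)  # 60050768027F0001F00000000000016C
```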






Friday, November 30, 2012

Configuring MongoDB Replication on RedHat Linux nodes

In this post I’ll look at configuring replication in MongoDB. In my setup I have 2 x RedHat Linux nodes that will act as the Primary and Secondary for my MongoDB replication set.

I began by downloading MongoDB on both the nodes of the replication set.

[root@isvx7 ~]# cd mongodb
[root@isvx7 mongodb]# ls
mongodb-linux-x86_64-2.2.0      mongodb-linux-x86_64-2.2.0.tar
[root@isvx7 mongodb]#

My MongoDB replication set will be called liverpool, and “anfield1” and “anfield2” will be the two dbpaths on each of the nodes respectively.

[root@isvx7 ~]# mkdir anfield1
[root@isvx3 ~]# mkdir anfield2

Next I started the mongod server, passing it the replication set name and the dbpath. I also used port 27001 and limited the oplogSize to 50 MB.

[root@isvx7 ~]# mongod --replSet liverpool --dbpath anfield1 --port 27001 --oplogSize 50
<some related configuration messages>
Fri Nov 30 16:13:36 [rsStart] replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)
Fri Nov 30 16:13:46 [rsStart] replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)

Next I did the same thing on the other node, i.e. isvx3.

[root@isvx3 ~]# mongod --replSet liverpool --dbpath anfield2 --port 27001 --oplogSize 50
<some related configuration messages>
Fri Nov 30 15:17:17 [rsStart] replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)
Fri Nov 30 15:17:27 [rsStart] replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)

Right now both these nodes know that they are part of the “liverpool” replication set, but that is about it. They don’t know who else is part of the same replication set.

So when I connect to the mongod server on isvx7, it tells me that it is neither the master nor a secondary.

[root@isvx7 ~]# mongo --port 27001
MongoDB shell version: 2.2.0
connecting to: 127.0.0.1:27001/test
> db.isMaster()
{
        "ismaster" : false,
        "secondary" : false,
        "info" : "can't get local.system.replset config from self or any seed (EMPTYCONFIG)",
        "isreplicaset" : true,
        "maxBsonObjectSize" : 16777216,
        "localTime" : ISODate("2012-11-30T23:22:44.236Z"),
        "ok" : 1
}
>

This means that I need to run replSetInitiate to initiate the replication set on the nodes, but before that we need to create the config document on node isvx7.

> cfg = { _id : "liverpool", members : [ { _id:0, host:"isvx7:27001" }, { _id:1, host:"isvx3:27001" } ] }
{
        "_id" : "liverpool",
        "members" : [
                {
                        "_id" : 0,
                        "host" : "isvx7:27001"
                },
                {
                        "_id" : 1,
                        "host" : "isvx3:27001"
                }
        ]
}
>

Now we will initiate the replica set on node isvx7 with the config document that we created.

> rs.initiate(cfg)
{
        "info" : "Config now saved locally.  Should come online in about a minute.",
        "ok" : 1
}
>

When we now do the db.isMaster() on isvx7, we see that it is the master/primary.

> db.isMaster()
{
        "setName" : "liverpool",
        "ismaster" : true,
        "secondary" : false,
        "hosts" : [
                "isvx7:27001",
                "isvx3:27001"
        ],
        "primary" : "isvx7:27001",
        "me" : "isvx7:27001",
        "maxBsonObjectSize" : 16777216,
        "localTime" : ISODate("2012-11-30T23:51:11.561Z"),
        "ok" : 1
}
liverpool:PRIMARY>

Here is what I see when I start the shell on the secondary node.

[root@isvx3 ~]#  mongo --port 27001
MongoDB shell version: 2.2.0
connecting to: 127.0.0.1:27001/test
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see
        http://docs.mongodb.org/
Questions? Try the support group
        http://groups.google.com/group/mongodb-user
liverpool:SECONDARY>
liverpool:SECONDARY> db.isMaster()
{
        "setName" : "liverpool",
        "ismaster" : false,
        "secondary" : true,
        "hosts" : [
                "isvx3:27001",
                "isvx7:27001"
        ],
        "primary" : "isvx7:27001",
        "me" : "isvx3:27001",
        "maxBsonObjectSize" : 16777216,
        "localTime" : ISODate("2012-12-01T00:03:49.245Z"),
        "ok" : 1
}
liverpool:SECONDARY>
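The cfg document passed to rs.initiate() is plain JSON, so it is easy to generate for any list of members. A small Python sketch that builds the same structure used above (purely illustrative):

```python
def make_replset_config(name, hosts):
    """Build a MongoDB replica-set config document like the `cfg`
    passed to rs.initiate(): one member per host, _ids 0..n-1."""
    return {
        "_id": name,
        "members": [{"_id": i, "host": h} for i, h in enumerate(hosts)],
    }

cfg = make_replset_config("liverpool", ["isvx7:27001", "isvx3:27001"])
print(cfg)
```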

Friday, October 26, 2012

Install and configure MongoDB on Linux

I decided to play around with MongoDB a bit, so I downloaded it from http://www.mongodb.org/downloads as follows

[root@isvx7 ~]# mkdir mongodb
[root@isvx7 ~]# cd mongodb
[root@isvx7 mongodb]# wget http://fastdl.mongodb.org/linux/mongodb-linux-x86_64-2.2.0.tgz
--2012-10-26 11:15:58--  http://fastdl.mongodb.org/linux/mongodb-linux-x86_64-2.2.0.tgz
Resolving fastdl.mongodb.org... 54.240.190.202, 54.240.190.201, 54.240.190.172, ...
Connecting to fastdl.mongodb.org|54.240.190.202|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 56286069 (54M) [application/x-tar]
Saving to: `mongodb-linux-x86_64-2.2.0.tgz'

100%[======================================>] 56,286,069  5.97M/s   in 9.3s

2012-10-26 11:16:07 (5.75 MB/s) - `mongodb-linux-x86_64-2.2.0.tgz' saved [56286069/56286069]

[root@isvx7 mongodb]#
[root@isvx7 mongodb]# gunzip mongodb-linux-x86_64-2.2.0.tgz
[root@isvx7 mongodb]# ls
mongodb-linux-x86_64-2.2.0.tar
[root@isvx7 mongodb]# tar -xvf mongodb-linux-x86_64-2.2.0.tar
mongodb-linux-x86_64-2.2.0/GNU-AGPL-3.0
mongodb-linux-x86_64-2.2.0/README
mongodb-linux-x86_64-2.2.0/THIRD-PARTY-NOTICES
mongodb-linux-x86_64-2.2.0/bin/mongodump
mongodb-linux-x86_64-2.2.0/bin/mongorestore
mongodb-linux-x86_64-2.2.0/bin/mongoexport
mongodb-linux-x86_64-2.2.0/bin/mongoimport
mongodb-linux-x86_64-2.2.0/bin/mongostat
mongodb-linux-x86_64-2.2.0/bin/mongotop
mongodb-linux-x86_64-2.2.0/bin/mongooplog
mongodb-linux-x86_64-2.2.0/bin/mongofiles
mongodb-linux-x86_64-2.2.0/bin/bsondump
mongodb-linux-x86_64-2.2.0/bin/mongoperf
mongodb-linux-x86_64-2.2.0/bin/mongosniff
mongodb-linux-x86_64-2.2.0/bin/mongod
mongodb-linux-x86_64-2.2.0/bin/mongos
mongodb-linux-x86_64-2.2.0/bin/mongo
[root@isvx7 mongodb]# ls
mongodb-linux-x86_64-2.2.0  mongodb-linux-x86_64-2.2.0.tar
[root@isvx7 mongodb]# cd mongodb-linux-x86_64-2.2.0
[root@isvx7 mongodb-linux-x86_64-2.2.0]# ls
bin  GNU-AGPL-3.0  README  THIRD-PARTY-NOTICES
[root@isvx7 mongodb-linux-x86_64-2.2.0]# ls -l
total 56
drwxr-xr-x 2 root  root   4096 Oct 26 11:17 bin
-rw------- 1 admin admin 34520 Aug 13 10:38 GNU-AGPL-3.0
-rw------- 1 admin admin  1359 Aug 13 10:38 README
-rw------- 1 admin admin 11527 Aug 21 06:34 THIRD-PARTY-NOTICES
[root@isvx7 mongodb-linux-x86_64-2.2.0]# cd bin
[root@isvx7 bin]# ls
bsondump  mongodump    mongoimport  mongorestore  mongostat
mongo     mongoexport  mongooplog   mongos        mongotop
mongod    mongofiles   mongoperf    mongosniff
[root@isvx7 bin]#

By default MongoDB uses /data/db for its data files. Below I check that I have enough space on / to create /data/db and put some files in it.

[root@isvx7 ~]# df -h /
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3             255G   76G  166G  32% /
[root@isvx7 ~]# mkdir -p /data/db
[root@isvx7 ~]#

Next, I start the mongod server under strace.

[root@isvx7 bin]# strace -o /root/output.txt ./mongod
./mongod --help for help and startup options
Fri Oct 26 15:07:52 [initandlisten] MongoDB starting : pid=31518 port=27017 dbpath=/data/db/ 64-bit host=isvx7.storage.tucson.ibm.com
Fri Oct 26 15:07:52 [initandlisten]
Fri Oct 26 15:07:52 [initandlisten] ** WARNING: You are running on a NUMA machine.
Fri Oct 26 15:07:52 [initandlisten] **          We suggest launching mongod like this to avoid performance problems:
Fri Oct 26 15:07:52 [initandlisten] **              numactl --interleave=all mongod [other options]
Fri Oct 26 15:07:52 [initandlisten]
Fri Oct 26 15:07:52 [initandlisten] ** WARNING: /proc/sys/vm/zone_reclaim_mode is 1
Fri Oct 26 15:07:52 [initandlisten] **          We suggest setting it to 0
Fri Oct 26 15:07:52 [initandlisten] **          http://www.kernel.org/doc/Documentation/sysctl/vm.txt
Fri Oct 26 15:07:52 [initandlisten]
Fri Oct 26 15:07:52 [initandlisten] db version v2.2.0, pdfile version 4.5
Fri Oct 26 15:07:52 [initandlisten] git version: f5e83eae9cfbec7fb7a071321928f00d1b0c5207
Fri Oct 26 15:07:52 [initandlisten] build info: Linux ip-10-2-29-40 2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 20 17:48:28 EST 2009 x86_64 BOOST_LIB_VERSION=1_49
Fri Oct 26 15:07:52 [initandlisten] options: {}
Fri Oct 26 15:07:52 [initandlisten] journal dir=/data/db/journal
Fri Oct 26 15:07:52 [initandlisten] recover : no journal files present, no recovery needed
Fri Oct 26 15:07:54 [initandlisten] preallocateIsFaster=true 28.32
Fri Oct 26 15:07:56 [initandlisten] preallocateIsFaster=true 27.74
Fri Oct 26 15:07:59 [initandlisten] preallocateIsFaster=true 30.82
Fri Oct 26 15:07:59 [initandlisten] preallocateIsFaster check took 7.423 secs
Fri Oct 26 15:07:59 [initandlisten] preallocating a journal file /data/db/journal/prealloc.0
Fri Oct 26 15:08:11 [initandlisten] preallocating a journal file /data/db/journal/prealloc.1
Fri Oct 26 15:08:23 [initandlisten] preallocating a journal file /data/db/journal/prealloc.2
Fri Oct 26 15:08:36 [websvr] admin web console waiting for connections on port 28017
Fri Oct 26 15:08:36 [initandlisten] waiting for connections on port 27017

Here is what got created under /data/db

[root@isvx7 ~]# cd /data/db
[root@isvx7 db]# ls
journal  mongod.lock
[root@isvx7 db]# ls -l
total 8
drwxr-xr-x 2 root root 4096 Oct 26 15:08 journal
-rwxr-xr-x 1 root root    6 Oct 26 15:07 mongod.lock
[root@isvx7 db]# cd journal/
[root@isvx7 journal]# ls
j._0  prealloc.1  prealloc.2
[root@isvx7 journal]# ls -lh
total 3.1G
-rw------- 1 root root 1.0G Oct 26 15:08 j._0
-rw------- 1 root root 1.0G Oct 26 15:08 prealloc.1
-rw------- 1 root root 1.0G Oct 26 15:08 prealloc.2
[root@isvx7 journal]#

Let us look at output.txt to see what happens when mongod runs.
  • Here we see that it checks whether the /data/db directory is present, and then checks whether a mongod.lock file exists. It sees that it does not, so it goes on to create a mongod.lock file and writes 31518 into it, which is the process id of mongod, i.e. "root     31518 31517  0 15:07 pts/2    00:00:04 ./mongod"
write(1, "Fri Oct 26 15:07:52 [initandlist"..., 48) = 48
stat("/data/db/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat("/data/db/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat("/data/db/mongod.lock", 0x7fffc6e703b0) = -1 ENOENT (No such file or directory)
open("/data/db/mongod.lock", O_RDWR|O_CREAT, 0777) = 4
flock(4, LOCK_EX|LOCK_NB)               = 0
ftruncate(4, 0)                         = 0
write(4, "31518\n", 6)                  = 6
fsync(4)                                = 0
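The sequence in the trace (open/create mongod.lock, take an exclusive non-blocking flock, truncate, write the pid, fsync) is the classic single-instance lock-file pattern. A minimal Python sketch of the same steps (the path is just an example):

```python
import fcntl
import os

def acquire_pidfile(path):
    """Replicate mongod's mongod.lock sequence: open/create the file,
    take an exclusive non-blocking flock, truncate, write our pid, fsync."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o777)
    fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)  # fails if another holder exists
    os.ftruncate(fd, 0)
    os.write(fd, f"{os.getpid()}\n".encode())
    os.fsync(fd)
    return fd  # keep the fd open: closing it releases the lock

fd = acquire_pidfile("/tmp/demo-mongod.lock")
print(open("/tmp/demo-mongod.lock").read().strip())
```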
  • Here you see that it checks whether /data/db/journal is present, and creates it since it does not exist.
write(1, "Fri Oct 26 15:07:52 [initandlist"..., 65) = 65
stat("/data/db/journal", 0x7fffc6e704d0) = -1 ENOENT (No such file or directory)
mkdir("/data/db/journal", 0777)         = 0
stat("/data/db/journal", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
open("/data/db/journal", O_RDONLY|O_NONBLOCK|O_DIRECTORY) = 5
fcntl(5, F_SETFD, FD_CLOEXEC)           = 0
getdents(5, /* 2 entries */, 32768)     = 48
getdents(5, /* 0 entries */, 32768)     = 0
close(5)                                = 0
  • Here we see it checks to see if files prealloc.0, prealloc.1, and prealloc.2 are present.
stat("/data/db/journal/prealloc.0", 0x7fffc6e703f0) = -1 ENOENT (No such file or directory)
stat("/data/db/journal/prealloc.1", 0x7fffc6e703f0) = -1 ENOENT (No such file or directory)
stat("/data/db/journal/prealloc.2", 0x7fffc6e703f0) = -1 ENOENT (No such file or directory)
stat("/data/db/journal/prealloc.0", 0x7fffc6e704f0) = -1 ENOENT (No such file or directory)
stat("/data/db/journal/prealloc.1", 0x7fffc6e704f0) = -1 ENOENT (No such file or directory)
stat("/data/db/journal/tempLatencyTest", 0x7fffc6e703c0) = -1 ENOENT (No such file or directory)
  • Next it creates the tempLatencyTest file, and writes to it as shown below.
open("/data/db/journal/tempLatencyTest", O_WRONLY|O_CREAT|O_DIRECT|O_NOATIME, 0600) = 5
open("/data/db/journal", O_RDONLY)      = 6
fsync(6)                                = 0
close(6)                                = 0
lseek(5, 0, SEEK_CUR)                   = 0
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
fdatasync(5)                            = 0
lseek(5, 0, SEEK_CUR)                   = 8192
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
fdatasync(5)                            = 0
lseek(5, 0, SEEK_CUR)                   = 16384
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
fdatasync(5)                            = 0
...........................................................
...........................................................
...........................................................
lseek(5, 0, SEEK_CUR)                   = 393216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
fdatasync(5)                            = 0
lseek(5, 0, SEEK_CUR)                   = 401408
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
fdatasync(5)                            = 0
close(5)                                = 0

  • Here we see that the tempLatencyTest file is deleted.
write(1, "Fri Oct 26 15:07:54 [initandlist"..., 67) = 67
stat("/data/db/journal/tempLatencyTest", {st_mode=S_IFREG|0600, st_size=409600, ...}) = 0
lstat("/data/db/journal/tempLatencyTest", {st_mode=S_IFREG|0600, st_size=409600, ...}) = 0
unlink("/data/db/journal/tempLatencyTest") = 0
stat("/data/db/journal/tempLatencyTest", 0x7fffc6e703c0) = -1 ENOENT (No such file or directory)
  • Next we see that the file prealloc.0 is created, and 1 MB is written into it at a time using pwrite, for a total of 1 GB. The same is repeated for files prealloc.1 and prealloc.2.
"Prealloc Files (e.g. journal/prealloc.0)

mongod will create prealloc files in the journal directory under some circumstances to minimize journal write latency. On some filesystems, appending to a file and making it larger can be slower than writing to a file of a predefined size. mongod checks this at startup and if it finds this to be the case will use preallocated journal files. If found to be helpful, a small pool of prealloc files will be created in the journal directory before startup begins. This is a one time initiation and does not occur with future invocations. Approximately 3GB of files will be preallocated (and truly prewritten, not sparse allocated) - thus in this situation, expect roughly a 3 minute delay on the first startup to preallocate these files."
From MongoDb docs: http://www.mongodb.org/display/DOCS/Journaling+Administration+Notes#JournalingAdministrationNotes-PreallocFiles%28e.g.journal%2Fprealloc.0%29

open("/data/db/journal/prealloc.0", O_RDWR|O_CREAT|O_NOATIME, 0600) = 5
pwrite(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576, 0) = 1048576
pwrite(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576, 1048576) = 1048576
pwrite(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576, 2097152) = 1048576
pwrite(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576, 3145728) = 1048576
pwrite(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576, 4194304) = 1048576
pwrite(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576, 5242880) = 1048576
..............................................................
..............................................................
..............................................................
pwrite(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576, 1070596096) = 1048576
pwrite(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576, 1071644672) = 1048576
pwrite(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576, 1072693248) = 1048576
fsync(5)                                = 0
close(5)                                = 0
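The preallocation pattern in the trace, writing 1 MB of zeros at increasing offsets with pwrite and fsyncing at the end, can be sketched in a few lines of Python (scaled down to 4 MB so it runs quickly; mongod writes 1 GB per file):

```python
import os

MB = 1024 * 1024

def preallocate(path, total_bytes, chunk=MB):
    """Prewrite a file with zeros chunk by chunk using pwrite, the way
    mongod builds prealloc.0 (truly prewritten, not sparse-allocated)."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    zeros = b"\0" * chunk
    for offset in range(0, total_bytes, chunk):
        os.pwrite(fd, zeros, offset)
    os.fsync(fd)
    os.close(fd)

preallocate("/tmp/prealloc.demo", 4 * MB)
print(os.path.getsize("/tmp/prealloc.demo"))  # 4194304
```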
  • Here we see that it checks whether the file j._0 exists, and then goes on to rename prealloc.0 to j._0.
"Journal Files (e.g. journal/j._0)

Journal files are append-only and are written to the journal/ directory under the dbpath directory (which is /data/db/ by default).

Journal files are named j._0, j._1, etc. When a journal file reached 1GB in size, a new file is created. Old files which are no longer needed are rotated out (automatically deleted). Unless your write bytes/second rate is extremely high, you should have only two or three journal files.

Note: in more recent versions, the journal files are only 128MB apiece when using the --smallfiles command line option."
From MongoDb docs: http://www.mongodb.org/display/DOCS/Journaling+Administration+Notes#JournalingAdministrationNotes-PreallocFiles%28e.g.journal%2Fprealloc.0%29

stat("/data/db/journal/j._0", 0x7fffc6e6e350) = -1 ENOENT (No such file or directory)
rename("/data/db/journal/prealloc.0", "/data/db/journal/j._0") = 0
  • Here we see that the domain parameter for socket() is PF_INET, i.e. the IPv4 Internet protocol, and SOCK_STREAM provides sequenced, reliable, two-way, connection-based byte streams. We also see that the port used is 27017.
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 7
setsockopt(7, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(7, {sa_family=AF_INET, sin_port=htons(27017), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
listen(7, 128)                          = 0
socket(PF_FILE, SOCK_STREAM, 0)         = 8
unlink("/tmp/mongodb-27017.sock")       = -1 ENOENT (No such file or directory)
setsockopt(8, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(8, {sa_family=AF_FILE, path="/tmp/mongodb-27017.sock"...}, 110) = 0
chmod("/tmp/mongodb-27017.sock", 0777)  = 0
listen(8, 128)                          = 0
write(1, "Fri Oct 26 15:08:36 [initandlist"..., 74) = 74
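The last part of the trace is the standard TCP server setup: socket, SO_REUSEADDR, bind to 0.0.0.0, listen. The same sequence in Python (bound to port 0 here instead of 27017 so the sketch never collides with a running server):

```python
import socket

# Recreate the trace: an IPv4 TCP listener with SO_REUSEADDR, bound to
# all interfaces, with a listen backlog of 128.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("0.0.0.0", 0))  # port 0: let the kernel pick a free port
server.listen(128)
print(server.getsockname())
```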