I decided to play around with MongoDB a bit, so I downloaded it from http://www.mongodb.org/downloads as follows:
[root@isvx7 ~]# mkdir mongodb
[root@isvx7 ~]# cd mongodb
[root@isvx7 mongodb]# wget http://fastdl.mongodb.org/linux/mongodb-linux-x86_64-2.2.0.tgz
--2012-10-26 11:15:58-- http://fastdl.mongodb.org/linux/mongodb-linux-x86_64-2.2.0.tgz
Resolving fastdl.mongodb.org... 54.240.190.202, 54.240.190.201, 54.240.190.172, ...
Connecting to fastdl.mongodb.org|54.240.190.202|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 56286069 (54M) [application/x-tar]
Saving to: `mongodb-linux-x86_64-2.2.0.tgz'
100%[======================================>] 56,286,069 5.97M/s in 9.3s
2012-10-26 11:16:07 (5.75 MB/s) - `mongodb-linux-x86_64-2.2.0.tgz' saved [56286069/56286069]
[root@isvx7 mongodb]#
[root@isvx7 mongodb]# gunzip mongodb-linux-x86_64-2.2.0.tgz
[root@isvx7 mongodb]# ls
mongodb-linux-x86_64-2.2.0.tar
[root@isvx7 mongodb]# tar -xvf mongodb-linux-x86_64-2.2.0.tar
mongodb-linux-x86_64-2.2.0/GNU-AGPL-3.0
mongodb-linux-x86_64-2.2.0/README
mongodb-linux-x86_64-2.2.0/THIRD-PARTY-NOTICES
mongodb-linux-x86_64-2.2.0/bin/mongodump
mongodb-linux-x86_64-2.2.0/bin/mongorestore
mongodb-linux-x86_64-2.2.0/bin/mongoexport
mongodb-linux-x86_64-2.2.0/bin/mongoimport
mongodb-linux-x86_64-2.2.0/bin/mongostat
mongodb-linux-x86_64-2.2.0/bin/mongotop
mongodb-linux-x86_64-2.2.0/bin/mongooplog
mongodb-linux-x86_64-2.2.0/bin/mongofiles
mongodb-linux-x86_64-2.2.0/bin/bsondump
mongodb-linux-x86_64-2.2.0/bin/mongoperf
mongodb-linux-x86_64-2.2.0/bin/mongosniff
mongodb-linux-x86_64-2.2.0/bin/mongod
mongodb-linux-x86_64-2.2.0/bin/mongos
mongodb-linux-x86_64-2.2.0/bin/mongo
[root@isvx7 mongodb]# ls
mongodb-linux-x86_64-2.2.0 mongodb-linux-x86_64-2.2.0.tar
[root@isvx7 mongodb]# cd mongodb-linux-x86_64-2.2.0
[root@isvx7 mongodb-linux-x86_64-2.2.0]# ls
bin GNU-AGPL-3.0 README THIRD-PARTY-NOTICES
[root@isvx7 mongodb-linux-x86_64-2.2.0]# ls -l
total 56
drwxr-xr-x 2 root root 4096 Oct 26 11:17 bin
-rw------- 1 admin admin 34520 Aug 13 10:38 GNU-AGPL-3.0
-rw------- 1 admin admin 1359 Aug 13 10:38 README
-rw------- 1 admin admin 11527 Aug 21 06:34 THIRD-PARTY-NOTICES
[root@isvx7 mongodb-linux-x86_64-2.2.0]# cd bin
[root@isvx7 bin]# ls
bsondump mongodump mongoimport mongorestore mongostat
mongo mongoexport mongooplog mongos mongotop
mongod mongofiles mongoperf mongosniff
[root@isvx7 bin]#
By default, MongoDB uses /data/db for its data files. Below I check whether / has enough free space to create /data/db and hold some files in it.
[root@isvx7 ~]# df -h /
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 255G 76G 166G 32% /
[root@isvx7 ~]# mkdir -p /data/db
[root@isvx7 ~]#
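The same free-space check can be scripted. Below is a minimal Python sketch using shutil.disk_usage. The function name has_room and the 3 GiB threshold are my own choices for illustration (the threshold is taken from the roughly 3 GB of journal preallocation that the MongoDB docs quoted in this post mention); nothing here is something mongod itself runs.

```python
import shutil

GIB = 1024 ** 3


def has_room(path, needed_bytes):
    """Return True if the filesystem holding `path` has at least `needed_bytes` free."""
    return shutil.disk_usage(path).free >= needed_bytes


if __name__ == "__main__":
    # "/" is the filesystem that holds /data/db on the machine in this post.
    free = shutil.disk_usage("/").free
    print("free on /: %.1f GiB" % (free / GIB))
    print("room for about 3 GiB of journal files:", has_room("/", 3 * GIB))
```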
Next, I start the server, mongod, under strace.
[root@isvx7 bin]# strace -o /root/output.txt ./mongod
./mongod --help for help and startup options
Fri Oct 26 15:07:52 [initandlisten] MongoDB starting : pid=31518 port=27017 dbpath=/data/db/ 64-bit host=isvx7.storage.tucson.ibm.com
Fri Oct 26 15:07:52 [initandlisten]
Fri Oct 26 15:07:52 [initandlisten] ** WARNING: You are running on a NUMA machine.
Fri Oct 26 15:07:52 [initandlisten] ** We suggest launching mongod like this to avoid performance problems:
Fri Oct 26 15:07:52 [initandlisten] ** numactl --interleave=all mongod [other options]
Fri Oct 26 15:07:52 [initandlisten]
Fri Oct 26 15:07:52 [initandlisten] ** WARNING: /proc/sys/vm/zone_reclaim_mode is 1
Fri Oct 26 15:07:52 [initandlisten] ** We suggest setting it to 0
Fri Oct 26 15:07:52 [initandlisten] ** http://www.kernel.org/doc/Documentation/sysctl/vm.txt
Fri Oct 26 15:07:52 [initandlisten]
Fri Oct 26 15:07:52 [initandlisten] db version v2.2.0, pdfile version 4.5
Fri Oct 26 15:07:52 [initandlisten] git version: f5e83eae9cfbec7fb7a071321928f00d1b0c5207
Fri Oct 26 15:07:52 [initandlisten] build info: Linux ip-10-2-29-40 2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 20 17:48:28 EST 2009 x86_64 BOOST_LIB_VERSION=1_49
Fri Oct 26 15:07:52 [initandlisten] options: {}
Fri Oct 26 15:07:52 [initandlisten] journal dir=/data/db/journal
Fri Oct 26 15:07:52 [initandlisten] recover : no journal files present, no recovery needed
Fri Oct 26 15:07:54 [initandlisten] preallocateIsFaster=true 28.32
Fri Oct 26 15:07:56 [initandlisten] preallocateIsFaster=true 27.74
Fri Oct 26 15:07:59 [initandlisten] preallocateIsFaster=true 30.82
Fri Oct 26 15:07:59 [initandlisten] preallocateIsFaster check took 7.423 secs
Fri Oct 26 15:07:59 [initandlisten] preallocating a journal file /data/db/journal/prealloc.0
Fri Oct 26 15:08:11 [initandlisten] preallocating a journal file /data/db/journal/prealloc.1
Fri Oct 26 15:08:23 [initandlisten] preallocating a journal file /data/db/journal/prealloc.2
Fri Oct 26 15:08:36 [websvr] admin web console waiting for connections on port 28017
Fri Oct 26 15:08:36 [initandlisten] waiting for connections on port 27017
Here is what got created under /data/db:
[root@isvx7 ~]# cd /data/db
[root@isvx7 db]# ls
journal mongod.lock
[root@isvx7 db]# ls -l
total 8
drwxr-xr-x 2 root root 4096 Oct 26 15:08 journal
-rwxr-xr-x 1 root root 6 Oct 26 15:07 mongod.lock
[root@isvx7 db]# cd journal/
[root@isvx7 journal]# ls
j._0 prealloc.1 prealloc.2
[root@isvx7 journal]# ls -lh
total 3.1G
-rw------- 1 root root 1.0G Oct 26 15:08 j._0
-rw------- 1 root root 1.0G Oct 26 15:08 prealloc.1
-rw------- 1 root root 1.0G Oct 26 15:08 prealloc.2
[root@isvx7 journal]#
Let us look into output.txt to see what happens when mongod is run.
- Here we see that it checks whether the /data/db directory is present, and then checks whether the mongod.lock file is present. It sees that the file is not there, so it creates mongod.lock, takes an exclusive lock on it with flock, and writes 31518 into it, which is the process ID of mongod, i.e. "root 31518 31517 0 15:07 pts/2 00:00:04 ./mongod"
write(1, "Fri Oct 26 15:07:52 [initandlist"..., 48) = 48
stat("/data/db/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat("/data/db/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat("/data/db/mongod.lock", 0x7fffc6e703b0) = -1 ENOENT (No such file or directory)
open("/data/db/mongod.lock", O_RDWR|O_CREAT, 0777) = 4
flock(4, LOCK_EX|LOCK_NB) = 0
ftruncate(4, 0) = 0
write(4, "31518\n", 6) = 6
fsync(4) = 0
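The lock-file sequence in the trace above (open, flock, ftruncate, write, fsync) can be reproduced with a few lines of Python. This is only a sketch of the same system calls, not mongod's actual code; the function name acquire_pid_lock is hypothetical.

```python
import fcntl
import os


def acquire_pid_lock(path):
    """Open or create `path`, lock it exclusively, and record our pid in it.

    Returns the open file descriptor; the lock is held until it is closed.
    Raises BlockingIOError if the lock is already held elsewhere.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o777)
    try:
        # LOCK_NB makes flock fail at once instead of waiting for the holder.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        os.ftruncate(fd, 0)                    # discard any stale pid
        os.write(fd, b"%d\n" % os.getpid())    # e.g. b"31518\n" in the trace
        os.fsync(fd)                           # push the pid to disk
    except BaseException:
        os.close(fd)
        raise
    return fd
```

While the returned descriptor stays open, a second call on the same path fails with BlockingIOError, which is how a second server started on the same data directory could detect that it is already in use.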
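- Here you see that it checks whether /data/db/journal is present. It is not, so mongod creates the directory and then lists its contents with getdents, which returns only two entries ("." and "..").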
write(1, "Fri Oct 26 15:07:52 [initandlist"..., 65) = 65
stat("/data/db/journal", 0x7fffc6e704d0) = -1 ENOENT (No such file or directory)
mkdir("/data/db/journal", 0777) = 0
stat("/data/db/journal", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
open("/data/db/journal", O_RDONLY|O_NONBLOCK|O_DIRECTORY) = 5
fcntl(5, F_SETFD, FD_CLOEXEC) = 0
getdents(5, /* 2 entries */, 32768) = 48
getdents(5, /* 0 entries */, 32768) = 0
close(5) = 0
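As a rough Python equivalent of the trace above: create the journal directory if it is missing, then list what is inside. The helper name ensure_journal_dir is my own. Note that os.listdir leaves out "." and "..", so the two entries getdents reports for a fresh directory show up here as an empty list.

```python
import os


def ensure_journal_dir(dbpath):
    """Create dbpath/journal if it is missing and return the names inside it."""
    journal = os.path.join(dbpath, "journal")
    if not os.path.isdir(journal):
        os.mkdir(journal, 0o777)   # the effective mode is reduced by the umask
    return sorted(os.listdir(journal))
```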
- Here we see it checks to see if files prealloc.0, prealloc.1, and prealloc.2 are present.
stat("/data/db/journal/prealloc.0", 0x7fffc6e703f0) = -1 ENOENT (No such file or directory)
stat("/data/db/journal/prealloc.1", 0x7fffc6e703f0) = -1 ENOENT (No such file or directory)
stat("/data/db/journal/prealloc.2", 0x7fffc6e703f0) = -1 ENOENT (No such file or directory)
stat("/data/db/journal/prealloc.0", 0x7fffc6e704f0) = -1 ENOENT (No such file or directory)
stat("/data/db/journal/prealloc.1", 0x7fffc6e704f0) = -1 ENOENT (No such file or directory)
stat("/data/db/journal/tempLatencyTest", 0x7fffc6e703c0) = -1 ENOENT (No such file or directory)
- Next it creates the tempLatencyTest file, and writes to it as shown below.
open("/data/db/journal/tempLatencyTest", O_WRONLY|O_CREAT|O_DIRECT|O_NOATIME, 0600) = 5
open("/data/db/journal", O_RDONLY) = 6
fsync(6) = 0
close(6) = 0
lseek(5, 0, SEEK_CUR) = 0
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
fdatasync(5) = 0
lseek(5, 0, SEEK_CUR) = 8192
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
fdatasync(5) = 0
lseek(5, 0, SEEK_CUR) = 16384
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
fdatasync(5) = 0
...........................................................
...........................................................
...........................................................
lseek(5, 0, SEEK_CUR) = 393216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
fdatasync(5) = 0
lseek(5, 0, SEEK_CUR) = 401408
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
fdatasync(5) = 0
close(5) = 0
open("/data/db/journal/tempLatencyTest", O_WRONLY|O_CREAT|O_DIRECT|O_NOATIME, 0600) = 5
open("/data/db/journal", O_RDONLY) = 6
fsync(6) = 0
close(6) = 0
lseek(5, 0, SEEK_CUR) = 0
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
fdatasync(5) = 0
lseek(5, 0, SEEK_CUR) = 8192
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
fdatasync(5) = 0
lseek(5, 0, SEEK_CUR) = 16384
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
fdatasync(5) = 0
...........................................................
...........................................................
...........................................................
fdatasync(5) = 0
lseek(5, 0, SEEK_CUR) = 393216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
fdatasync(5) = 0
lseek(5, 0, SEEK_CUR) = 401408
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
fdatasync(5) = 0
close(5) = 0
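My reading of the trace above is that the same scratch file is written twice: the first pass grows the file from empty, and the second pass overwrites blocks that already exist, so comparing the two timings shows whether prewritten files are faster on this filesystem. The Python sketch below repeats that experiment under stated assumptions: it omits O_DIRECT and O_NOATIME (O_DIRECT needs aligned buffers and is not supported on every filesystem), the function names are hypothetical, and it does not try to reproduce whatever threshold mongod applies to the result.

```python
import os
import time

BLOCK = 8192   # bytes per write, as in the trace
COUNT = 50     # 50 * 8192 = 409600 bytes, the st_size stat reports before the unlink


def timed_pass(path):
    """Write COUNT blocks from offset 0, syncing after each; return elapsed seconds."""
    buf = bytes(BLOCK)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(COUNT):
            os.write(fd, buf)
            os.fdatasync(fd)
        return time.perf_counter() - start
    finally:
        os.close(fd)


def compare_append_vs_prewritten(directory):
    """Return (append_seconds, prewritten_seconds, file_size) for a scratch file."""
    path = os.path.join(directory, "tempLatencyTest")
    if os.path.exists(path):
        os.unlink(path)
    try:
        grow = timed_pass(path)      # file is new: every write extends it
        rewrite = timed_pass(path)   # file already full size: writes land in place
        size = os.path.getsize(path)
    finally:
        if os.path.exists(path):
            os.unlink(path)
    return grow, rewrite, size


if __name__ == "__main__":
    import tempfile
    with tempfile.TemporaryDirectory() as d:
        grow, rewrite, size = compare_append_vs_prewritten(d)
        print("append: %.4fs  prewritten: %.4fs  size: %d" % (grow, rewrite, size))
```

The numbers depend heavily on the filesystem and the disk, so a run in a temporary directory says little about the volume that will actually hold the journal; point it at a directory on that volume instead.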
- Here we see that the tempLatencyTest file is deleted.
write(1, "Fri Oct 26 15:07:54 [initandlist"..., 67) = 67
stat("/data/db/journal/tempLatencyTest", {st_mode=S_IFREG|0600, st_size=409600, ...}) = 0
lstat("/data/db/journal/tempLatencyTest", {st_mode=S_IFREG|0600, st_size=409600, ...}) = 0
unlink("/data/db/journal/tempLatencyTest") = 0
stat("/data/db/journal/tempLatencyTest", 0x7fffc6e703c0) = -1 ENOENT (No such file or directory)
- Next we see that the file prealloc.0 is created and filled by pwrite calls of 1 MB each (1048576 bytes) until it totals 1 GB. The same is repeated for files prealloc.1 and prealloc.2.
"mongod will create prealloc files in the journal directory under some circumstances to minimize journal write latency. On some filesystems, appending to a file and making it larger can be slower than writing to a file of a predefined size. mongod checks this at startup and if it finds this to be the case will use preallocated journal files. If found to be helpful, a small pool of prealloc files will be created in the journal directory before startup begins. This is a one time initiation and does not occur with future invocations. Approximately 3GB of files will be preallocated (and truly prewritten, not sparse allocated) - thus in this situation, expect roughly a 3 minute delay on the first startup to preallocate these files."
From MongoDB docs: http://www.mongodb.org/display/DOCS/Journaling+Administration+Notes#JournalingAdministrationNotes-PreallocFiles%28e.g.journal%2Fprealloc.0%29
open("/data/db/journal/prealloc.0", O_RDWR|O_CREAT|O_NOATIME, 0600) = 5
pwrite(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576, 0) = 1048576
pwrite(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576, 1048576) = 1048576
pwrite(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576, 2097152) = 1048576
pwrite(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576, 3145728) = 1048576
pwrite(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576, 4194304) = 1048576
pwrite(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576, 5242880) = 1048576
..............................................................
..............................................................
..............................................................
pwrite(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576, 1070596096) = 1048576
pwrite(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576, 1071644672) = 1048576
pwrite(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576, 1072693248) = 1048576
fsync(5) = 0
close(5) = 0
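The arithmetic in the trace above works out as follows: the last pwrite starts at offset 1072693248, which is 1023 * 1048576, so 1024 writes of 1 MB give exactly 1073741824 bytes (1 GB). A minimal Python sketch of the same preallocation loop follows. The function name preallocate is my own, O_NOATIME is left out for portability, and the demo writes a much smaller file than mongod does.

```python
import os

CHUNK = 1024 * 1024  # 1048576 bytes per pwrite, as in the trace


def preallocate(path, total_bytes, chunk=CHUNK):
    """Fill `path` with zeros using `chunk`-sized pwrite calls, then fsync.

    Writing real zeros, instead of seeking past the end of the file, means
    the blocks are actually written rather than left as a sparse hole.
    """
    zeros = bytes(chunk)
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        offset = 0
        while offset < total_bytes:
            n = min(chunk, total_bytes - offset)
            offset += os.pwrite(fd, zeros[:n], offset)
        os.fsync(fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    import tempfile
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "prealloc.0")
        preallocate(target, 4 * CHUNK)          # 4 MB here, not 1 GB
        print(os.path.getsize(target))
```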
- Here we see that it checks whether the file j._0 exists, and then goes on to rename the file prealloc.0 to j._0.
"Journal files are append-only and are written to the journal/ directory under the dbpath directory (which is /data/db/ by default).
Journal files are named j._0, j._1, etc. When a journal file reached 1GB in size, a new file is created. Old files which are no longer needed are rotated out (automatically deleted). Unless your write bytes/second rate is extremely high, you should have only two or three journal files.
Note: in more recent versions, the journal files are only 128MB apiece when using the --smallfiles command line option."
From MongoDB docs: http://www.mongodb.org/display/DOCS/Journaling+Administration+Notes#JournalingAdministrationNotes-PreallocFiles%28e.g.journal%2Fprealloc.0%29
stat("/data/db/journal/j._0", 0x7fffc6e6e350) = -1 ENOENT (No such file or directory)
rename("/data/db/journal/prealloc.0", "/data/db/journal/j._0") = 0
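The stat-then-rename step in the trace above is cheap because a rename within one directory only changes the name and copies no data, so the 1 GB that was prewritten is reused as is. A small Python sketch of the same two calls, with a hypothetical helper name and the file names from the trace as defaults:

```python
import os


def activate_journal(journal_dir, prealloc_name="prealloc.0", journal_name="j._0"):
    """Rename a preallocated file to the live journal name unless that name is taken.

    Returns True if the rename happened, False if the journal file already existed.
    """
    src = os.path.join(journal_dir, prealloc_name)
    dst = os.path.join(journal_dir, journal_name)
    if os.path.exists(dst):
        return False
    os.rename(src, dst)   # same directory: only the name changes
    return True
```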
- Here we see that the domain parameter for socket() is PF_INET, i.e. the IPv4 Internet protocol family. SOCK_STREAM provides sequenced, reliable, two-way, connection-based byte streams. We also see that the port used is 27017, and that mongod opens a second socket, a Unix domain socket (PF_FILE), at /tmp/mongodb-27017.sock.
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 7
setsockopt(7, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(7, {sa_family=AF_INET, sin_port=htons(27017), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
listen(7, 128) = 0
socket(PF_FILE, SOCK_STREAM, 0) = 8
unlink("/tmp/mongodb-27017.sock") = -1 ENOENT (No such file or directory)
setsockopt(8, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(8, {sa_family=AF_FILE, path="/tmp/mongodb-27017.sock"...}, 110) = 0
chmod("/tmp/mongodb-27017.sock", 0777) = 0
listen(8, 128) = 0
write(1, "Fri Oct 26 15:08:36 [initandlist"..., 74) = 74
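For comparison, here is a Python sketch that sets up the same pair of listeners as the trace above: a TCP socket with SO_REUSEADDR and a backlog of 128, plus a Unix domain socket whose file name includes the port. The function name open_listeners and its sock_dir parameter are assumptions made for this example; AF_INET and AF_UNIX are the Python names for what strace prints as PF_INET and PF_FILE.

```python
import os
import socket


def open_listeners(port, sock_dir, host="0.0.0.0", backlog=128):
    """Open a TCP listener and a Unix domain socket listener, as in the trace.

    Returns (tcp_socket, unix_socket, unix_socket_path).
    """
    tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    tcp.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    tcp.bind((host, port))
    tcp.listen(backlog)

    # With port 0 the kernel picks a free port, so ask which one was bound.
    bound_port = tcp.getsockname()[1]
    path = os.path.join(sock_dir, "mongodb-%d.sock" % bound_port)
    try:
        os.unlink(path)            # remove a stale socket file, if any
    except FileNotFoundError:
        pass
    unix = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    unix.bind(path)
    os.chmod(path, 0o777)
    unix.listen(backlog)
    return tcp, unix, path


if __name__ == "__main__":
    import tempfile
    with tempfile.TemporaryDirectory() as d:
        tcp, unix, path = open_listeners(0, d, host="127.0.0.1")
        print("tcp port:", tcp.getsockname()[1], "unix socket:", path)
        tcp.close()
        unix.close()
```

Passing port 0 and a temporary directory keeps the demo from colliding with a real mongod on 27017 or with /tmp/mongodb-27017.sock.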