Fundamental Specifications of Tokyo Cabinet Version 1
Copyright (C) 2006-2009 Mikio Hirabayashi
Last Update: Sat, 03 Jan 2009 21:41:33 +0900
Table of Contents
- Introduction
- Features
- Installation
- Utility API
- Hash Database API
- B+ Tree Database API
- Fixed-length Database API
- Abstract Database API
- Tips
- File Format
- Frequently Asked Questions
- License
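Introduction
Tokyo Cabinet is a library of routines for managing a database. The database is a simple data file containing records, each of which is a pair of a key and a value. Every key and value is a byte sequence of arbitrary length, so both character strings and binary data can be used. There is no concept of tables or data types. Records are organized in a hash table, a B+ tree, or a fixed-length array.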
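In a hash table database, each key must be unique within the database, so it is impossible to store two or more records with the same key. The following operations are available: storing a record with a key and a value, deleting a record by key, and retrieving a record by key. It is also possible to traverse every key in the database, although the order is arbitrary. These operations are similar to those of the DBM library defined in the UNIX standard and of its followers NDBM and GDBM. Tokyo Cabinet can serve as a better alternative to DBM.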
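In a B+ tree database, records with duplicate keys can be stored. As with the hash table database, records can be stored, retrieved, and deleted by key. Records are stored in the order given by a comparison function that the user specifies. A cursor gives access to each record in ascending or descending order. This mechanism makes prefix (forward-matching) search for strings and range search for numbers possible.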
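In a fixed-length array database, records are stored with unique natural numbers as keys. It is impossible to store two or more records with the same key. In addition, the length of each value is limited to a fixed maximum. The operations provided are almost the same as those of the hash database.
Tokyo Cabinet is written in the C language and is provided as APIs for C, Perl, Ruby, Java, and Lua. It is available on platforms that have APIs conforming to C99 and POSIX. Tokyo Cabinet is free software licensed under the GNU Lesser General Public License.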
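Features
Tokyo Cabinet is the successor of QDBM and improves on it in space efficiency, time efficiency, and usability. This section describes the features of Tokyo Cabinet.
The Front-Runner of the DBM Family
Tokyo Cabinet was developed as the successor of GDBM and QDBM with the following goals. These goals have been achieved, and Tokyo Cabinet can replace conventional DBM products.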
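- Improved space efficiency : smaller database files
- Improved time efficiency : faster processing
- Improved parallelism : better concurrent performance in multi-threaded environments
- Improved usability : a simpler API
- Improved robustness : database files are less likely to be corrupted even in unexpected situations
- 64-bit support : huge memory spaces and database files can be handled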
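Like QDBM, Tokyo Cabinet avoids three restrictions of traditional DBM: a process can handle multiple databases, there is no limit on the size of keys and values, and database files are not sparse. In addition, Tokyo Cabinet avoids three restrictions of QDBM: database files of 2GB or larger can be handled, database files can be shared between environments with different byte orders, and multiple threads can search a database at the same time.
Tokyo Cabinet runs fast. For example, storing one million records takes about 0.7 seconds for a hash database and about 1.6 seconds for a B+ tree database. Tokyo Cabinet databases are also small. For example, the overhead per record is about 16 bytes for a hash database and about 5 bytes for a B+ tree database. Furthermore, the scale of data that Tokyo Cabinet can handle is enormous: database files of up to 8EB (9.22e18 bytes) can be handled.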
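Efficient Implementation of the Hash Database
Tokyo Cabinet uses a hash algorithm to retrieve records. If the bucket array has enough elements, the time complexity of retrieval is O(1). That is, the time required to retrieve a record is constant regardless of the scale of the database. The same holds for storing and deleting. Collisions of hash values are managed by separate chaining, and the data structure of each chain is a binary search tree. Therefore, even if the bucket array has far too few elements, the time complexity of retrieval and similar operations is kept to O(log n).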
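The structure described above can be sketched as a small in-memory model: a bucket array whose elements are the roots of binary search trees. This is only an illustration under assumptions. The names `HashModel', `hm_put', and `hm_get' are hypothetical, the hash function is not the one Tokyo Cabinet uses, and the trees are ordered by plain key comparison without any balancing.

```c
#include <stdlib.h>
#include <string.h>

/* One record in a bucket's chain; the chain is a binary search tree. */
typedef struct Node {
  char *key;
  char *val;
  struct Node *left, *right;
} Node;

typedef struct {
  Node **buckets; /* bucket array: one tree root per bucket */
  size_t bnum;    /* number of buckets */
} HashModel;

/* Simple string hash (illustrative; not Tokyo Cabinet's function). */
static size_t hash_key(const char *key, size_t bnum) {
  size_t h = 5381;
  while (*key) h = h * 33 + (unsigned char)*key++;
  return h % bnum;
}

static char *dup_str(const char *s) {
  size_t n = strlen(s) + 1;
  char *p = malloc(n);
  if (p) memcpy(p, s, n);
  return p;
}

HashModel *hm_new(size_t bnum) {
  HashModel *hm = malloc(sizeof(*hm));
  if (!hm) return NULL;
  hm->buckets = calloc(bnum, sizeof(Node *));
  hm->bnum = bnum;
  if (!hm->buckets) { free(hm); return NULL; }
  return hm;
}

/* Store a record, replacing the value if the key already exists. */
void hm_put(HashModel *hm, const char *key, const char *val) {
  Node **np = &hm->buckets[hash_key(key, hm->bnum)];
  while (*np) {
    int c = strcmp(key, (*np)->key);
    if (c == 0) {
      free((*np)->val);
      (*np)->val = dup_str(val);
      return;
    }
    np = (c < 0) ? &(*np)->left : &(*np)->right;
  }
  Node *n = calloc(1, sizeof(*n));
  if (!n) return;
  n->key = dup_str(key);
  n->val = dup_str(val);
  *np = n;
}

/* Look up a key: pick the bucket, then walk down its tree. */
const char *hm_get(const HashModel *hm, const char *key) {
  const Node *n = hm->buckets[hash_key(key, hm->bnum)];
  while (n) {
    int c = strcmp(key, n->key);
    if (c == 0) return n->val;
    n = (c < 0) ? n->left : n->right;
  }
  return NULL;
}

static void node_free(Node *n) {
  if (!n) return;
  node_free(n->left);
  node_free(n->right);
  free(n->key);
  free(n->val);
  free(n);
}

void hm_del(HashModel *hm) {
  for (size_t i = 0; i < hm->bnum; i++) node_free(hm->buckets[i]);
  free(hm->buckets);
  free(hm);
}
```

With a single bucket every key lands in the same tree, which is the degenerate case the paragraph mentions: lookups then cost a tree descent instead of a scan over all records.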
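Tokyo Cabinet speeds up processing by keeping the whole bucket array in RAM. If the bucket array is in RAM, the region of the file that holds a record can be reached with about one file operation. The bucket array saved in the file is not read into RAM with the `read' call but is mapped directly into RAM with the `mmap' call. Therefore, the preparation time when connecting to a database is very short, and two or more processes can share the memory map.
If the number of elements of the bucket array is about half the number of records stored, the collision rate of hash values is about 56.7%, although it varies somewhat with the nature of the data (about 36.8% if the two numbers are equal, 21.3% if the bucket array is twice as large, 11.5% if four times, and 6.0% if eight times). In that case, a record can be retrieved with two or fewer file operations on average. Taking this as the performance target, a bucket array of 500,000 elements is needed to store one million records. Each element of the bucket array is 4 bytes. That is, a database of one million records can be built if 2MB of RAM is available.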
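The quoted collision rates can be reproduced with a short calculation, assuming hash values are uniformly distributed. Under that assumption the expected number of occupied buckets is bnum * (1 - (1 - 1/bnum)^rnum), and every record beyond the first in a bucket counts as a collision. The function names below are hypothetical and the sketch is not taken from the library.

```c
#include <stdint.h>

/* Expected fraction of records that collide with an earlier record
   when rnum records are hashed uniformly into bnum buckets. */
double collision_rate(uint64_t rnum, uint64_t bnum) {
  double empty = 1.0; /* probability that a given bucket stays empty */
  for (uint64_t i = 0; i < rnum; i++) empty *= 1.0 - 1.0 / (double)bnum;
  double used = (double)bnum * (1.0 - empty); /* expected occupied buckets */
  return ((double)rnum - used) / (double)rnum;
}

/* RAM needed for the bucket array at 4 bytes per element. */
uint64_t bucket_array_bytes(uint64_t bnum) {
  return bnum * 4;
}
```

For one million records, `collision_rate(1000000, 500000)' comes out near 0.567 and `bucket_array_bytes(500000)' is 2,000,000 bytes, matching the figures in the text.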
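Traditional DBM provides two modes for storing records: "insert" and "replace". In the insert mode, if the key duplicates that of an existing record, the existing value is kept. In the replace mode, if the key duplicates that of an existing record, the value is replaced by the new one. In addition to these two, Tokyo Cabinet provides a "concatenate" mode, which appends the specified value to the end of the existing value and stores the result. When the value of a record is treated as an array, the concatenate mode is useful for adding elements.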
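The difference between the three modes can be illustrated on a single record slot. This is a minimal sketch with hypothetical names (`Record', `put_replace', `put_keep', `put_cat'). It models only the semantics described above, not how Tokyo Cabinet stores records.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* A single record slot: val is NULL while the record does not exist. */
typedef struct {
  char *val;
  size_t size;
} Record;

/* Replace mode: always store the new value. */
void put_replace(Record *rec, const void *buf, size_t size) {
  char *nv = malloc(size + 1);
  if (!nv) return;
  memcpy(nv, buf, size);
  nv[size] = '\0';
  free(rec->val);
  rec->val = nv;
  rec->size = size;
}

/* Insert (keep) mode: store only if the record does not exist yet. */
bool put_keep(Record *rec, const void *buf, size_t size) {
  if (rec->val) return false;
  put_replace(rec, buf, size);
  return rec->val != NULL;
}

/* Concatenate mode: append to the existing value, or create the record. */
void put_cat(Record *rec, const void *buf, size_t size) {
  char *nv = realloc(rec->val, rec->size + size + 1);
  if (!nv) return;
  memcpy(nv + rec->size, buf, size);
  rec->size += size;
  nv[rec->size] = '\0';
  rec->val = nv;
}
```

Starting from an empty slot `{NULL, 0}', repeated calls to `put_cat' grow the value like an array, which is the use case the paragraph describes.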
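In general, as a database is updated repeatedly, the available regions in the file become fragmented and the file grows. Tokyo Cabinet deals with this problem by coalescing adjacent free regions for reuse and by providing a database optimization function. When the value of an existing record is overwritten with a larger value, the region of the record must be moved to another position in the file. Because the time complexity of this operation depends on the size of the record, repeatedly extending a value is inefficient. Tokyo Cabinet deals with this problem by alignment: if the increment fits within the padding, the region does not need to be moved.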
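The effect of alignment can be sketched with two small helpers. The sketch assumes that record regions are rounded up to a power-of-two unit. The parameter `apow' and both function names are hypothetical, chosen only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round size up to the next multiple of 2^apow (apow is an assumed
   alignment power). */
uint64_t aligned_size(uint64_t size, unsigned apow) {
  uint64_t unit = (uint64_t)1 << apow;
  return (size + unit - 1) & ~(unit - 1);
}

/* True if a record growing from old_size to new_size still fits in
   the region it already occupies, so it need not be moved. */
bool fits_in_place(uint64_t old_size, uint64_t new_size, unsigned apow) {
  return new_size <= aligned_size(old_size, apow);
}
```

With a 16-byte unit (`apow' of 4), a 10-byte record occupies 16 bytes, so it can grow to 16 bytes in place. Growing to 17 bytes forces a move.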
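Useful Implementation of the B+ Tree Database
Although the B+ tree database is slower than the hash database, its distinguishing feature is that records can be accessed in an order defined by the user. A B+ tree keeps records sorted and groups them into logical pages. A sparse index organized as a B tree, that is, a multiway balanced tree, is maintained over the pages. Therefore, the time complexity of retrieval and similar operations is O(log n). A cursor is provided to access records in order. The cursor can jump to a position specified by a key and can step forward to the next record or back to the previous one from its current position. Because the pages are arranged as a doubly linked list, the time complexity of stepping the cursor forward or backward is O(1).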
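The cursor operations can be modeled on a plain sorted array, which stands in for the sorted records of a B+ tree: jumping is a binary search for the first key not less than the target, and stepping is an index increment. The sketch below uses hypothetical names and shows how a prefix search follows from those two operations. It is not the library's cursor API.

```c
#include <stddef.h>
#include <string.h>

/* Jump: index of the first key that is >= target (binary search). */
size_t cursor_jump(const char *const *keys, size_t num, const char *target) {
  size_t lo = 0, hi = num;
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (strcmp(keys[mid], target) < 0) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

/* Count keys that start with prefix by jumping to the prefix and
   stepping forward until the first key that no longer matches. */
size_t count_prefix(const char *const *keys, size_t num, const char *prefix) {
  size_t plen = strlen(prefix);
  size_t count = 0;
  for (size_t i = cursor_jump(keys, num, prefix);
       i < num && strncmp(keys[i], prefix, plen) == 0; i++) {
    count++;
  }
  return count;
}
```

A range search works the same way: jump to the lower bound, then step forward until a key exceeds the upper bound.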
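The B+ tree database is implemented on top of the hash database described above. Because each page of the B+ tree is stored as a record of the hash database, the B+ tree database inherits the efficient storage management of the hash database. Because the header of each record is small and alignment is applied per page, in most cases the database file is about half the size of that of a hash database. Although many pages must be manipulated to update a B+ tree, Tokyo Cabinet speeds up the process by caching pages and reducing file operations. In most cases the whole sparse index is cached in memory, so each record can be retrieved with one or fewer file operations on average.
A feature for compressing each page when it is stored is also provided. Two compression methods are supported: Deflate of ZLIB and block sorting of BZIP2. Because the records in a page have similar patterns, algorithms such as Lempel-Ziv and BWT can be expected to compress them efficiently. For text data, the size of the database is reduced to about 25% of the original. If the database is large and disk I/O is the bottleneck, enabling compression improves processing speed considerably.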
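Naive Implementation of the Fixed-length Database
The fixed-length database requires that keys be natural numbers and limits the size of values, but it is the most efficient choice when those conditions are acceptable. Records are kept as an array of fixed-length elements, and each record is stored at an offset calculated as a multiple of its key. Therefore, the time complexity of retrieval and similar operations is O(1). The operations provided are almost the same as those of the hash database.
Because the whole database is mapped into memory with the `mmap' call and accessed as a multidimensional array, the overhead of file I/O is minimized. Thanks to this simple structure, the fixed-length database runs even faster than the hash database and also shows outstanding concurrency in multi-threaded environments.
The size of the database is proportional to the range of keys and the limit on the length of values. That is, the smaller the range of keys and the smaller the values, the better the space efficiency. For example, if the maximum key is 1,000,000 and the limit on the length of values is 100 bytes, the size of the database is about 100MB. Because only the regions around records that are actually accessed are loaded into RAM, the size of the database can grow up to the size of virtual memory.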
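The offset and size arithmetic can be written down directly. The sketch below assumes 1-based keys and a header of `hdr_size' bytes in front of the array. Both the header parameter and the function names are hypothetical, and any per-record bookkeeping the real file format may add is ignored.

```c
#include <stdint.h>

/* Byte offset of record `id` (1-based) in an array of fixed-width
   slots that starts after a header of hdr_size bytes. */
uint64_t record_offset(uint64_t id, uint32_t width, uint64_t hdr_size) {
  return hdr_size + (id - 1) * (uint64_t)width;
}

/* Size of the array region needed to cover keys 1..max_id. */
uint64_t array_size(uint64_t max_id, uint32_t width) {
  return max_id * (uint64_t)width;
}
```

For a maximum key of 1,000,000 and a 100-byte width, `array_size' gives 100,000,000 bytes, which is the "about 100MB" quoted above.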
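Practical Functionality
The hash database and the B+ tree database provide a transaction mechanism. A series of operations between the beginning and the end of a transaction can be committed to the database as a unit, or the series of updates can be aborted so that the database is rolled back to its state before the transaction began. Two isolation levels are supported. If all operations on the database are performed inside transactions, the isolation level is serializable; if operations outside transactions are performed at the same time, it is read uncommitted. Durability is ensured by write-ahead logging and shadow paging.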
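Commit and rollback can be illustrated with a toy store that records the old value of a slot before each update inside a transaction. This is a simplified in-memory model under assumptions: the log here holds before-images only, nothing is written to disk, and all names (`Store', `tx_begin', `store_put', `tx_commit', `tx_abort') are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define SLOTS 8   /* number of values in the toy store */
#define LOGMAX 32 /* maximum updates per transaction */

typedef struct {
  int data[SLOTS];
  struct { size_t idx; int old; } log[LOGMAX]; /* before-images */
  size_t lognum;
  bool tran; /* true while a transaction is open */
} Store;

bool tx_begin(Store *s) {
  if (s->tran) return false;
  s->tran = true;
  s->lognum = 0;
  return true;
}

/* Update one slot; inside a transaction, log the old value first. */
bool store_put(Store *s, size_t idx, int val) {
  if (idx >= SLOTS) return false;
  if (s->tran) {
    if (s->lognum >= LOGMAX) return false;
    s->log[s->lognum].idx = idx;
    s->log[s->lognum].old = s->data[idx];
    s->lognum++;
  }
  s->data[idx] = val;
  return true;
}

/* Commit: keep the updates and discard the log. */
bool tx_commit(Store *s) {
  if (!s->tran) return false;
  s->tran = false;
  s->lognum = 0;
  return true;
}

/* Abort: replay the log backwards to restore the old values. */
bool tx_abort(Store *s) {
  if (!s->tran) return false;
  while (s->lognum > 0) {
    s->lognum--;
    s->data[s->log[s->lognum].idx] = s->log[s->lognum].old;
  }
  s->tran = false;
  return true;
}
```

Replaying the log in reverse order matters when the same slot is updated more than once in a transaction: the oldest before-image is applied last.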
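Tokyo Cabinet provides two modes for connecting to a database: "reader" and "writer". A reader can only read, while a writer can both read and write. Exclusive control between processes is performed on the database by file locking. While a writer is connected, no other process can connect as either a reader or a writer. While a reader is connected, other processes can connect as readers but not as writers. This mechanism guarantees data consistency under simultaneous connections in a multitasking environment.
Each function of the Tokyo Cabinet API is reentrant and can be used safely in multi-threaded environments. All operations on different database objects can run fully in parallel. For operations on the same database object, exclusive control is performed with a read-write lock. That is, threads that read can run in parallel, while a thread that writes blocks other readers and writers.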
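Simple but Various Interfaces
Tokyo Cabinet provides a simple API based on object-oriented design. Every database operation is encapsulated in a database object, and programming proceeds by calling functions (methods) such as open, close, put (store), out (delete), and get (retrieve). Because the APIs of the hash database, the B+ tree database, and the fixed-length database are very similar to one another, porting an application from one to another is easy.
A utility API is provided to make it easy to handle records in memory. It packs in functions that are often used in programming, from basic data structures such as lists and maps to a memory pool, string processing, and encoding.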
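The C language API comes in five kinds: the utility API, the hash database API, the B+ tree database API, the fixed-length database API, and the abstract database API. A command line interface corresponding to each API is also provided; these are useful for prototyping, testing, and debugging. In addition to C, Tokyo Cabinet provides APIs for Perl, Ruby, Java, and Lua. The Perl API uses the XS language to call the hash database API, the B+ tree database API, and the fixed-length database API. The Ruby API calls the same three APIs as a Ruby module. The Java API calls them through the Java Native Interface. The Lua API calls them as a Lua module. Interfaces for other languages are expected to be provided by third parties.
When two or more processes need to operate on a database at the same time, or when a database on a remote host needs to be accessed, the remote service is useful. The remote service consists of a database server and its access library, and applications can operate the database server through the remote database API. Because HTTP and the memcached protocol are also supported, the database server can easily be operated from almost any platform.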
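Installation
This section describes how to install Tokyo Cabinet from the source package. For binary packages, see the installation manual of each package.
Preparation
The current version of Tokyo Cabinet can be used on UNIX-like operating systems. At least the following environments are expected to work.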
- Linux 2.4 or later (x86-32/x86-64/PowerPC/Alpha/SPARC)
- Mac OS X 10.3 or later (x86-32/x86-64/PowerPC)
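Installing Tokyo Cabinet from the source package requires `gcc' version 3.1 or later and `make'. They are installed by default on Linux, FreeBSD, and similar systems.
Tokyo Cabinet depends on the following libraries. Install them beforehand.
- zlib : lossless data compression. Version 1.2.3 or later is recommended.
- bzip2 : lossless data compression. Version 1.0.5 or later is recommended.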
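Build and Installation
After extracting the distribution archive of Tokyo Cabinet, change into the directory it creates and perform the installation there.
Run the `configure' script to set up the build environment.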
./configure
Build the programs.
make
Run the self-diagnostic tests of the programs.
make check
Install the programs. This step must be performed as the `root' user.
make install
Result
When the steps above are finished, the following files are installed.
/usr/local/include/tcutil.h
/usr/local/include/tchdb.h
/usr/local/include/tcbdb.h
/usr/local/include/tcfdb.h
/usr/local/include/tcadb.h
/usr/local/lib/libtokyocabinet.a
/usr/local/lib/libtokyocabinet.so.7.4.0
/usr/local/lib/libtokyocabinet.so.7
/usr/local/lib/libtokyocabinet.so
/usr/local/lib/pkgconfig/tokyocabinet.pc
/usr/local/bin/tcutest
/usr/local/bin/tcumttest
/usr/local/bin/tcucodec
/usr/local/bin/tchtest
/usr/local/bin/tchmttest
/usr/local/bin/tchmgr
/usr/local/bin/tcbmgr
/usr/local/bin/tcbtest
/usr/local/bin/tcbmttest
/usr/local/bin/tcftest
/usr/local/bin/tcfmttest
/usr/local/bin/tcfmgr
/usr/local/bin/tcamgr
/usr/local/bin/tcatest
/usr/local/libexec/tcawmgr.cgi
/usr/local/share/tokyocabinet/...
/usr/local/man/man1/...
/usr/local/man/man3/...
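Options of configure
The following options can be specified when running `./configure'.
--enable-debug : build for debugging. Enables debugging symbols, disables optimization, and links statically.
--enable-devel : build for development. Enables debugging symbols, enables optimization, and links dynamically.
--enable-profile : build for profiling. Enables profiling options, enables optimization, and links dynamically.
--enable-static : link statically.
--enable-fastest : optimize for the highest speed.
--enable-off64 : use 64-bit file offsets even in 32-bit environments.
--enable-swab : force byte order conversion.
--enable-uyield : build for detecting race conditions.
--disable-zlib : disable record compression with ZLIB.
--disable-bzip : disable record compression with BZIP2.
--disable-pthread : disable POSIX thread support.
--disable-shared : do not build shared libraries.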
Options such as `--prefix' are also available, as with ordinary UNIX software packages. To install under `/usr' instead of `/usr/local', specify `--prefix=/usr'. Note that in environments where `/usr/local/lib' is not in the library search path, the environment variable `LD_LIBRARY_PATH' should include `/usr/local/lib' when running applications of Tokyo Cabinet.
How to Use the Library
Tokyo Cabinet provides a C language API, which can be used from programs conforming to the C89 (ANSI C) or C99 standard. The headers of Tokyo Cabinet are provided as `tcutil.h', `tchdb.h', `tcbdb.h', `tcfdb.h', and `tcadb.h'; include the ones you need in the source code of the application and then use the functions of the API. The library is provided as `libtokyocabinet.a' and `libtokyocabinet.so', which depend on `libz.so', `libbz2.so', `libpthread.so', `libm.so', and `libc.so', so add `-ltokyocabinet', `-lz', `-lbz2', `-lpthread', `-lm', and `-lc' to the linker options when building an application. The most typical build command is as follows.
gcc -I/usr/local/include tc_example.c -o tc_example \
-L/usr/local/lib -ltokyocabinet -lz -lbz2 -lpthread -lm -lc
Tokyo Cabinet can also be used from C++ programs. Because each header is implicitly wrapped in C linkage (an `extern "C"' block), simply including it is enough.
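Utility API
The utility API is a set of routines for handling records in memory easily. The extensible string, the array list, the hash map, and the ordered tree are especially useful. See `tcutil.h' for the complete specification of the API.
Description
To use the utility API, include `tcutil.h' and the related standard header files. Usually, the following lines are placed near the top of a source file.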
#include <tcutil.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
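A pointer to `TCXSTR' is used as the object for handling an extensible string. An extensible string object is created with the function `tcxstrnew' and deleted with the function `tcxstrdel'. A pointer to `TCLIST' is used as the object for handling an array list. A list object is created with the function `tclistnew' and deleted with the function `tclistdel'. A pointer to `TCMAP' is used as the object for handling a hash map. A map object is created with the function `tcmapnew' and deleted with the function `tcmapdel'. A pointer to `TCTREE' is used as the object for handling an ordered tree. A tree object is created with the function `tctreenew' and deleted with the function `tctreedel'. Every object that is created must be deleted after use; otherwise a memory leak occurs.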
API of Basic Utilities
The constant `tcversion' is the string containing the version information.
extern const char *tcversion;
The variable `tcfatalfunc' is the pointer to the call back function for handling a fatal error.
extern void (*tcfatalfunc)(const char *);
- The argument specifies the error message.
- The initial value of this variable is `NULL'. If the value is `NULL', the default function is called when a fatal error occurs. A fatal error occurs when memory allocation fails.
The function `tcmalloc' is used in order to allocate a region on memory.
void *tcmalloc(size_t size);
- `size' specifies the size of the region.
- The return value is the pointer to the allocated region.
- This function handles failure of memory allocation implicitly. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tccalloc' is used in order to allocate a nullified region on memory.
void *tccalloc(size_t nmemb, size_t size);
- `nmemb' specifies the number of elements.
- `size' specifies the size of each element.
- The return value is the pointer to the allocated nullified region.
- This function handles failure of memory allocation implicitly. Because the region of the return value is allocated with the `calloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcrealloc' is used in order to re-allocate a region on memory.
void *tcrealloc(void *ptr, size_t size);
- `ptr' specifies the pointer to the region.
- `size' specifies the size of the region.
- The return value is the pointer to the re-allocated region.
- This function handles failure of memory allocation implicitly. Because the region of the return value is allocated with the `realloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcmemdup' is used in order to duplicate a region on memory.
void *tcmemdup(const void *ptr, size_t size);
- `ptr' specifies the pointer to the region.
- `size' specifies the size of the region.
- The return value is the pointer to the allocated region of the duplicate.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
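The "additional zero code" convention can be shown with a small stand-alone model of a region copy. The function `dup_region' below is hypothetical and is not the library's implementation. In particular, it returns `NULL' on allocation failure instead of invoking a fatal error handler.

```c
#include <stdlib.h>
#include <string.h>

/* Copy `size` bytes and append one zero byte, so the copy can be read
   as a C string even if the source region was not terminated. */
void *dup_region(const void *ptr, size_t size) {
  char *copy = malloc(size + 1);
  if (!copy) return NULL;
  memcpy(copy, ptr, size);
  copy[size] = '\0';
  return copy;
}
```

The same convention explains why functions in this API that return a region together with its size can also be used with ordinary string functions.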
The function `tcstrdup' is used in order to duplicate a string on memory.
char *tcstrdup(const void *str);
- `str' specifies the string.
- The return value is the allocated string equivalent to the specified string.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcfree' is used in order to free a region on memory.
void tcfree(void *ptr);
- `ptr' specifies the pointer to the region. If it is `NULL', this function has no effect.
- Although this function is just a wrapper of `free' call, this is useful in applications using another package of the `malloc' series.
API of Extensible String
The function `tcxstrnew' is used in order to create an extensible string object.
TCXSTR *tcxstrnew(void);
- The return value is the new extensible string object.
The function `tcxstrnew2' is used in order to create an extensible string object from a character string.
TCXSTR *tcxstrnew2(const char *str);
- `str' specifies the string of the initial content.
- The return value is the new extensible string object containing the specified string.
The function `tcxstrnew3' is used in order to create an extensible string object with the initial allocation size.
TCXSTR *tcxstrnew3(int asiz);
- `asiz' specifies the initial allocation size.
- The return value is the new extensible string object.
The function `tcxstrdup' is used in order to copy an extensible string object.
TCXSTR *tcxstrdup(const TCXSTR *xstr);
- `xstr' specifies the extensible string object.
- The return value is the new extensible string object equivalent to the specified object.
The function `tcxstrdel' is used in order to delete an extensible string object.
void tcxstrdel(TCXSTR *xstr);
- `xstr' specifies the extensible string object.
- Note that the deleted object and its derivatives can not be used anymore.
The function `tcxstrcat' is used in order to concatenate a region to the end of an extensible string object.
void tcxstrcat(TCXSTR *xstr, const void *ptr, int size);
- `xstr' specifies the extensible string object.
- `ptr' specifies the pointer to the region to be appended.
- `size' specifies the size of the region.
The function `tcxstrcat2' is used in order to concatenate a character string to the end of an extensible string object.
void tcxstrcat2(TCXSTR *xstr, const char *str);
- `xstr' specifies the extensible string object.
- `str' specifies the string to be appended.
The function `tcxstrptr' is used in order to get the pointer of the region of an extensible string object.
const void *tcxstrptr(const TCXSTR *xstr);
- `xstr' specifies the extensible string object.
- The return value is the pointer of the region of the object.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string.
The function `tcxstrsize' is used in order to get the size of the region of an extensible string object.
int tcxstrsize(const TCXSTR *xstr);
- `xstr' specifies the extensible string object.
- The return value is the size of the region of the object.
The function `tcxstrclear' is used in order to clear an extensible string object.
void tcxstrclear(TCXSTR *xstr);
- `xstr' specifies the extensible string object.
- The internal buffer of the object is cleared and the size is set to zero.
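How an extensible string behaves under concatenation, size queries, and clearing can be sketched with a minimal stand-alone model. The type `XStr' and the functions `xstr_new', `xstr_cat', `xstr_clear', and `xstr_del' are hypothetical. The initial size of 16 bytes and the doubling growth policy are assumptions of this sketch, not documented behavior of `TCXSTR'.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
  char *ptr;    /* buffer, always terminated by a zero byte */
  size_t size;  /* bytes in use, excluding the terminator */
  size_t asize; /* allocated bytes */
} XStr;

XStr *xstr_new(void) {
  XStr *x = malloc(sizeof(*x));
  if (!x) return NULL;
  x->asize = 16;
  x->ptr = malloc(x->asize);
  if (!x->ptr) { free(x); return NULL; }
  x->size = 0;
  x->ptr[0] = '\0';
  return x;
}

/* Append a region, doubling the buffer until it fits.
   Returns 0 on success and -1 if memory runs out. */
int xstr_cat(XStr *x, const void *buf, size_t size) {
  size_t need = x->size + size + 1;
  if (need > x->asize) {
    size_t nasize = x->asize;
    while (nasize < need) nasize *= 2;
    char *np = realloc(x->ptr, nasize);
    if (!np) return -1;
    x->ptr = np;
    x->asize = nasize;
  }
  memcpy(x->ptr + x->size, buf, size);
  x->size += size;
  x->ptr[x->size] = '\0';
  return 0;
}

/* Clear the content but keep the allocated buffer for reuse. */
void xstr_clear(XStr *x) {
  x->size = 0;
  x->ptr[0] = '\0';
}

void xstr_del(XStr *x) {
  free(x->ptr);
  free(x);
}
```

Keeping a terminator after every append is what lets the region pointer be treated as a character string at any time.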
The function `tcxstrprintf' is used in order to perform formatted output into an extensible string object.
void tcxstrprintf(TCXSTR *xstr, const char *format, ...);
- `xstr' specifies the extensible string object.
- `format' specifies the printf-like format string. The conversion character `%' can be used with such flag characters as `s', `d', `o', `u', `x', `X', `c', `e', `E', `f', `g', `G', `@', `?', `b', and `%'. `@' works as with `s' but escapes meta characters of XML. `?' works as with `s' but escapes meta characters of URL. `b' converts an integer to the string as binary numbers. The other conversion characters work as with each original.
- The other arguments are used according to the format string.
The function `tcsprintf' is used in order to allocate a formatted string on memory.
char *tcsprintf(const char *format, ...);
- `format' specifies the printf-like format string. The conversion character `%' can be used with such flag characters as `s', `d', `o', `u', `x', `X', `c', `e', `E', `f', `g', `G', `@', `?', `b', and `%'. `@' works as with `s' but escapes meta characters of XML. `?' works as with `s' but escapes meta characters of URL. `b' converts an integer to the string as binary numbers. The other conversion characters work as with each original.
- The other arguments are used according to the format string.
- The return value is the pointer to the region of the result string.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
API of Array List
The function `tclistnew' is used in order to create a list object.
TCLIST *tclistnew(void);
- The return value is the new list object.
The function `tclistnew2' is used in order to create a list object with expecting the number of elements.
TCLIST *tclistnew2(int anum);
- `anum' specifies the number of elements expected to be stored in the list.
- The return value is the new list object.
The function `tclistdup' is used in order to copy a list object.
TCLIST *tclistdup(const TCLIST *list);
- `list' specifies the list object.
- The return value is the new list object equivalent to the specified object.
The function `tclistdel' is used in order to delete a list object.
void tclistdel(TCLIST *list);
- `list' specifies the list object.
- Note that the deleted object and its derivatives can not be used anymore.
The function `tclistnum' is used in order to get the number of elements of a list object.
int tclistnum(const TCLIST *list);
- `list' specifies the list object.
- The return value is the number of elements of the list.
The function `tclistval' is used in order to get the pointer to the region of an element of a list object.
const void *tclistval(const TCLIST *list, int index, int *sp);
- `list' specifies the list object.
- `index' specifies the index of the element.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- The return value is the pointer to the region of the value.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. If `index' is equal to or more than the number of elements, the return value is `NULL'.
The function `tclistval2' is used in order to get the string of an element of a list object.
const char *tclistval2(const TCLIST *list, int index);
- `list' specifies the list object.
- `index' specifies the index of the element.
- The return value is the string of the value.
- If `index' is equal to or more than the number of elements, the return value is `NULL'.
The function `tclistpush' is used in order to add an element at the end of a list object.
void tclistpush(TCLIST *list, const void *ptr, int size);
- `list' specifies the list object.
- `ptr' specifies the pointer to the region of the new element.
- `size' specifies the size of the region.
The function `tclistpush2' is used in order to add a string element at the end of a list object.
void tclistpush2(TCLIST *list, const char *str);
- `list' specifies the list object.
- `str' specifies the string of the new element.
The function `tclistpop' is used in order to remove an element of the end of a list object.
void *tclistpop(TCLIST *list, int *sp);
- `list' specifies the list object.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- The return value is the pointer to the region of the removed element.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use. If the list is empty, the return value is `NULL'.
The function `tclistpop2' is used in order to remove a string element of the end of a list object.
char *tclistpop2(TCLIST *list);
- `list' specifies the list object.
- The return value is the string of the removed element.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use. If the list is empty, the return value is `NULL'.
The function `tclistunshift' is used in order to add an element at the top of a list object.
void tclistunshift(TCLIST *list, const void *ptr, int size);
- `list' specifies the list object.
- `ptr' specifies the pointer to the region of the new element.
- `size' specifies the size of the region.
The function `tclistunshift2' is used in order to add a string element at the top of a list object.
void tclistunshift2(TCLIST *list, const char *str);
- `list' specifies the list object.
- `str' specifies the string of the new element.
The function `tclistshift' is used in order to remove an element of the top of a list object.
void *tclistshift(TCLIST *list, int *sp);
- `list' specifies the list object.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- The return value is the pointer to the region of the removed element.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use. If the list is empty, the return value is `NULL'.
The function `tclistshift2' is used in order to remove a string element of the top of a list object.
char *tclistshift2(TCLIST *list);
- `list' specifies the list object.
- The return value is the string of the removed element.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use. If the list is empty, the return value is `NULL'.
The function `tclistinsert' is used in order to add an element at the specified location of a list object.
void tclistinsert(TCLIST *list, int index, const void *ptr, int size);
- `list' specifies the list object.
- `index' specifies the index of the new element.
- `ptr' specifies the pointer to the region of the new element.
- `size' specifies the size of the region.
- If `index' is equal to or more than the number of elements, this function has no effect.
The function `tclistinsert2' is used in order to add a string element at the specified location of a list object.
void tclistinsert2(TCLIST *list, int index, const char *str);
- `list' specifies the list object.
- `index' specifies the index of the new element.
- `str' specifies the string of the new element.
- If `index' is equal to or more than the number of elements, this function has no effect.
The function `tclistremove' is used in order to remove an element at the specified location of a list object.
void *tclistremove(TCLIST *list, int index, int *sp);
- `list' specifies the list object.
- `index' specifies the index of the element to be removed.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- The return value is the pointer to the region of the removed element.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use. If `index' is equal to or more than the number of elements, no element is removed and the return value is `NULL'.
The function `tclistremove2' is used in order to remove a string element at the specified location of a list object.
char *tclistremove2(TCLIST *list, int index);
- `list' specifies the list object.
- `index' specifies the index of the element to be removed.
- The return value is the string of the removed element.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use. If `index' is equal to or more than the number of elements, no element is removed and the return value is `NULL'.
The function `tclistover' is used in order to overwrite an element at the specified location of a list object.
void tclistover(TCLIST *list, int index, const void *ptr, int size);
- `list' specifies the list object.
- `index' specifies the index of the element to be overwritten.
- `ptr' specifies the pointer to the region of the new content.
- `size' specifies the size of the new content.
- If `index' is equal to or more than the number of elements, this function has no effect.
The function `tclistover2' is used in order to overwrite a string element at the specified location of a list object.
void tclistover2(TCLIST *list, int index, const char *str);
- `list' specifies the list object.
- `index' specifies the index of the element to be overwritten.
- `str' specifies the string of the new content.
- If `index' is equal to or more than the number of elements, this function has no effect.
The function `tclistsort' is used in order to sort elements of a list object in lexical order.
void tclistsort(TCLIST *list);
- `list' specifies the list object.
The function `tclistlsearch' is used in order to search a list object for an element using linear search.
int tclistlsearch(const TCLIST *list, const void *ptr, int size);
- `list' specifies the list object.
- `ptr' specifies the pointer to the region of the key.
- `size' specifies the size of the region.
- The return value is the index of a corresponding element or -1 if there is no corresponding element.
- If two or more elements correspond, the index of the first one is returned.
The function `tclistbsearch' is used in order to search a list object for an element using binary search.
int tclistbsearch(const TCLIST *list, const void *ptr, int size);
- `list' specifies the list object. It should be sorted in lexical order.
- `ptr' specifies the pointer to the region of the key.
- `size' specifies the size of the region.
- The return value is the index of a corresponding element or -1 if there is no corresponding element.
- If two or more elements correspond, which one is returned is not defined.
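The contrast between the two searches can be illustrated on a plain array of strings. This is a sketch with hypothetical names (`list_lsearch', `list_bsearch') that follows the documented contract: an index on success, -1 otherwise, and a sorted input for the binary search. Lexical order is assumed here to mean byte-wise comparison, with the shorter region first on a tie.

```c
#include <string.h>

/* Compare two regions in lexical (byte-wise) order. */
static int region_cmp(const void *a, int asiz, const void *b, int bsiz) {
  int min = asiz < bsiz ? asiz : bsiz;
  int c = memcmp(a, b, (size_t)min);
  return c != 0 ? c : asiz - bsiz;
}

/* Linear search: returns the index of the first match, or -1. */
int list_lsearch(const char *const *elems, int num, const char *key) {
  int ksiz = (int)strlen(key);
  for (int i = 0; i < num; i++) {
    if (region_cmp(elems[i], (int)strlen(elems[i]), key, ksiz) == 0) return i;
  }
  return -1;
}

/* Binary search: the list must already be sorted in lexical order. */
int list_bsearch(const char *const *elems, int num, const char *key) {
  int ksiz = (int)strlen(key);
  int lo = 0, hi = num - 1;
  while (lo <= hi) {
    int mid = lo + (hi - lo) / 2;
    int c = region_cmp(elems[mid], (int)strlen(elems[mid]), key, ksiz);
    if (c == 0) return mid;
    if (c < 0) lo = mid + 1;
    else hi = mid - 1;
  }
  return -1;
}
```

The linear search always reports the earliest duplicate because it scans from the front, while the binary search stops at whichever duplicate it happens to probe first.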
The function `tclistclear' is used in order to clear a list object.
void tclistclear(TCLIST *list);
- `list' specifies the list object.
- All elements are removed.
The function `tclistdump' is used in order to serialize a list object into a byte array.
void *tclistdump(const TCLIST *list, int *sp);
- `list' specifies the list object.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- The return value is the pointer to the region of the result serial region.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tclistload' is used in order to create a list object from a serialized byte array.
TCLIST *tclistload(const void *ptr, int size);
- `ptr' specifies the pointer to the region of serialized byte array.
- `size' specifies the size of the region.
- The return value is a new list object.
- Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use.
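The idea of serializing a list into one byte array and reading it back can be sketched as follows. The layout used here, a 4-byte little-endian length in front of each element, is an assumption made for this illustration and should not be taken as the format that `tclistdump' actually produces. The function names are hypothetical as well.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Serialize num strings as [4-byte little-endian length][bytes]...
   The total size is stored into *sp; the caller frees the result. */
void *list_dump(const char *const *elems, int num, int *sp) {
  size_t total = 0;
  for (int i = 0; i < num; i++) total += 4 + strlen(elems[i]);
  unsigned char *buf = malloc(total + 1);
  if (!buf) return NULL;
  unsigned char *wp = buf;
  for (int i = 0; i < num; i++) {
    uint32_t len = (uint32_t)strlen(elems[i]);
    for (int b = 0; b < 4; b++) *wp++ = (unsigned char)(len >> (8 * b));
    memcpy(wp, elems[i], len);
    wp += len;
  }
  *sp = (int)total;
  return buf;
}

/* Read back element `index`; returns a malloc'ed string, or NULL if
   the index is out of range or the data is truncated. */
char *list_load_elem(const void *ptr, int size, int index) {
  const unsigned char *rp = ptr;
  const unsigned char *end = rp + size;
  for (int i = 0; end - rp >= 4; i++) {
    uint32_t len = 0;
    for (int b = 0; b < 4; b++) len |= (uint32_t)rp[b] << (8 * b);
    rp += 4;
    if ((uint32_t)(end - rp) < len) return NULL;
    if (i == index) {
      char *s = malloc(len + 1);
      if (!s) return NULL;
      memcpy(s, rp, len);
      s[len] = '\0';
      return s;
    }
    rp += len;
  }
  return NULL;
}
```

Writing the length byte by byte in a fixed order keeps the serialized array independent of the byte order of the host.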
API of Hash Map
The function `tcmapnew' is used in order to create a map object.
TCMAP *tcmapnew(void);
- The return value is the new map object.
The function `tcmapnew2' is used in order to create a map object with specifying the number of the buckets.
TCMAP *tcmapnew2(uint32_t bnum);
- `bnum' specifies the number of the buckets.
- The return value is the new map object.
The function `tcmapdup' is used in order to copy a map object.
TCMAP *tcmapdup(const TCMAP *map);
- `map' specifies the map object.
- The return value is the new map object equivalent to the specified object.
The function `tcmapdel' is used in order to delete a map object.
void tcmapdel(TCMAP *map);
- `map' specifies the map object.
- Note that the deleted object and its derivatives can not be used anymore.
The function `tcmapput' is used in order to store a record into a map object.
void tcmapput(TCMAP *map, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `map' specifies the map object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value.
- If a record with the same key exists in the map, it is overwritten.
The function `tcmapput2' is used in order to store a string record into a map object.
void tcmapput2(TCMAP *map, const char *kstr, const char *vstr);
- `map' specifies the map object.
- `kstr' specifies the string of the key.
- `vstr' specifies the string of the value.
- If a record with the same key exists in the map, it is overwritten.
The function `tcmapputkeep' is used in order to store a new record into a map object.
bool tcmapputkeep(TCMAP *map, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `map' specifies the map object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the map, this function has no effect.
The function `tcmapputkeep2' is used in order to store a new string record into a map object.
bool tcmapputkeep2(TCMAP *map, const char *kstr, const char *vstr);
- `map' specifies the map object.
- `kstr' specifies the string of the key.
- `vstr' specifies the string of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the map, this function has no effect.
The function `tcmapputcat' is used in order to concatenate a value at the end of the value of the existing record in a map object.
void tcmapputcat(TCMAP *map, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `map' specifies the map object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value.
- If there is no corresponding record, a new record is created.
The function `tcmapputcat2' is used in order to concatenate a string value at the end of the value of the existing record in a map object.
void tcmapputcat2(TCMAP *map, const char *kstr, const char *vstr);
- `map' specifies the map object.
- `kstr' specifies the string of the key.
- `vstr' specifies the string of the value.
- If there is no corresponding record, a new record is created.
The function `tcmapout' is used in order to remove a record of a map object.
bool tcmapout(TCMAP *map, const void *kbuf, int ksiz);
- `map' specifies the map object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- If successful, the return value is true. False is returned when no record corresponds to the specified key.
The function `tcmapout2' is used in order to remove a string record of a map object.
bool tcmapout2(TCMAP *map, const char *kstr);
- `map' specifies the map object.
- `kstr' specifies the string of the key.
- If successful, the return value is true. False is returned when no record corresponds to the specified key.
The function `tcmapget' is used in order to retrieve a record in a map object.
const void *tcmapget(const TCMAP *map, const void *kbuf, int ksiz, int *sp);
- `map' specifies the map object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the region of the value of the corresponding record. `NULL' is returned when no record corresponds.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string.
The function `tcmapget2' is used in order to retrieve a string record in a map object.
const char *tcmapget2(const TCMAP *map, const char *kstr);
- `map' specifies the map object.
- `kstr' specifies the string of the key.
- If successful, the return value is the string of the value of the corresponding record. `NULL' is returned when no record corresponds.
The function `tcmapmove' is used in order to move a record to the edge of a map object.
bool tcmapmove(TCMAP *map, const void *kbuf, int ksiz, bool head);
- `map' specifies the map object.
- `kbuf' specifies the pointer to the region of a key.
- `ksiz' specifies the size of the region of the key.
- `head' specifies the destination: the head if it is true, or the tail otherwise.
- If successful, the return value is true. False is returned when no record corresponds to the specified key.
The function `tcmapmove2' is used in order to move a string record to the edge of a map object.
bool tcmapmove2(TCMAP *map, const char *kstr, bool head);
- `map' specifies the map object.
- `kstr' specifies the string of a key.
- `head' specifies the destination: the head if it is true, or the tail otherwise.
- If successful, the return value is true. False is returned when no record corresponds to the specified key.
The function `tcmapiterinit' is used in order to initialize the iterator of a map object.
void tcmapiterinit(TCMAP *map);
- `map' specifies the map object.
- The iterator is used in order to access the key of every record stored in the map object.
The function `tcmapiternext' is used in order to get the next key of the iterator of a map object.
const void *tcmapiternext(TCMAP *map, int *sp);
- `map' specifies the map object.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the region of the next key, else, it is `NULL'. `NULL' is returned when no record can be fetched from the iterator.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. The order of iteration is assured to be the same as the stored order.
The function `tcmapiternext2' is used in order to get the next key string of the iterator of a map object.
const char *tcmapiternext2(TCMAP *map);
- `map' specifies the map object.
- If successful, the return value is the pointer to the region of the next key, else, it is `NULL'. `NULL' is returned when no record can be fetched from the iterator.
- The order of iteration is assured to be the same as the stored order.
The function `tcmaprnum' is used in order to get the number of records stored in a map object.
uint64_t tcmaprnum(const TCMAP *map);
- `map' specifies the map object.
- The return value is the number of the records stored in the map object.
The function `tcmapmsiz' is used in order to get the total size of memory used in a map object.
uint64_t tcmapmsiz(const TCMAP *map);
- `map' specifies the map object.
- The return value is the total size of memory used in a map object.
The function `tcmapkeys' is used in order to create a list object containing all keys in a map object.
TCLIST *tcmapkeys(const TCMAP *map);
- `map' specifies the map object.
- The return value is the new list object containing all keys in the map object.
- Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use.
The function `tcmapvals' is used in order to create a list object containing all values in a map object.
TCLIST *tcmapvals(const TCMAP *map);
- `map' specifies the map object.
- The return value is the new list object containing all values in the map object.
- Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use.
The function `tcmapaddint' is used in order to add an integer to a record in a map object.
int tcmapaddint(TCMAP *map, const void *kbuf, int ksiz, int num);
- `map' specifies the map object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `num' specifies the additional value.
- The return value is the summation value.
- If the corresponding record exists, the value is treated as an integer and is added to. If no record corresponds, a new record of the additional value is stored.
The function `tcmapadddouble' is used in order to add a real number to a record in a map object.
double tcmapadddouble(TCMAP *map, const void *kbuf, int ksiz, double num);
- `map' specifies the map object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `num' specifies the additional value.
- The return value is the summation value.
- If the corresponding record exists, the value is treated as a real number and is added to. If no record corresponds, a new record of the additional value is stored.
The function `tcmapclear' is used in order to clear a map object.
void tcmapclear(TCMAP *map);
- `map' specifies the map object.
- All records are removed.
The function `tcmapcutfront' is used in order to remove front records of a map object.
void tcmapcutfront(TCMAP *map, int num);
- `map' specifies the map object.
- `num' specifies the number of records to be removed.
The function `tcmapdump' is used in order to serialize a map object into a byte array.
void *tcmapdump(const TCMAP *map, int *sp);
- `map' specifies the map object.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- The return value is the pointer to the region of the serialized result.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcmapload' is used in order to create a map object from a serialized byte array.
TCMAP *tcmapload(const void *ptr, int size);
- `ptr' specifies the pointer to the region of serialized byte array.
- `size' specifies the size of the region.
- The return value is a new map object.
- Because the object of the return value is created with the function `tcmapnew', it should be deleted with the function `tcmapdel' when it is no longer in use.
API of Ordered Tree
The function `tctreenew' is used in order to create a tree object.
TCTREE *tctreenew(void);
- The return value is the new tree object.
The function `tctreenew2' is used in order to create a tree object with a custom comparison function.
TCTREE *tctreenew2(TCCMP cmp, void *cmpop);
- `cmp' specifies the pointer to the custom comparison function.
- `cmpop' specifies an arbitrary pointer to be given as a parameter of the comparison function. If it is not needed, `NULL' can be specified.
- The return value is the new tree object.
- The default comparison function compares keys of two records by lexical order. The functions `tccmplexical' (default), `tccmpdecimal', `tccmpint32', and `tccmpint64' are built-in.
The function `tctreedup' is used in order to copy a tree object.
TCTREE *tctreedup(const TCTREE *tree);
- `tree' specifies the tree object.
- The return value is the new tree object equivalent to the specified object.
The function `tctreedel' is used in order to delete a tree object.
void tctreedel(TCTREE *tree);
- `tree' specifies the tree object.
- Note that the deleted object and its derivatives cannot be used anymore.
The function `tctreeput' is used in order to store a record into a tree object.
void tctreeput(TCTREE *tree, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `tree' specifies the tree object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value.
- If a record with the same key exists in the tree, it is overwritten.
The function `tctreeput2' is used in order to store a string record into a tree object.
void tctreeput2(TCTREE *tree, const char *kstr, const char *vstr);
- `tree' specifies the tree object.
- `kstr' specifies the string of the key.
- `vstr' specifies the string of the value.
- If a record with the same key exists in the tree, it is overwritten.
The function `tctreeputkeep' is used in order to store a new record into a tree object.
bool tctreeputkeep(TCTREE *tree, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `tree' specifies the tree object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the tree, this function has no effect.
The function `tctreeputkeep2' is used in order to store a new string record into a tree object.
bool tctreeputkeep2(TCTREE *tree, const char *kstr, const char *vstr);
- `tree' specifies the tree object.
- `kstr' specifies the string of the key.
- `vstr' specifies the string of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the tree, this function has no effect.
The function `tctreeputcat' is used in order to concatenate a value at the end of the value of the existing record in a tree object.
void tctreeputcat(TCTREE *tree, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `tree' specifies the tree object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value.
- If there is no corresponding record, a new record is created.
The function `tctreeputcat2' is used in order to concatenate a string value at the end of the value of the existing record in a tree object.
void tctreeputcat2(TCTREE *tree, const char *kstr, const char *vstr);
- `tree' specifies the tree object.
- `kstr' specifies the string of the key.
- `vstr' specifies the string of the value.
- If there is no corresponding record, a new record is created.
The function `tctreeout' is used in order to remove a record of a tree object.
bool tctreeout(TCTREE *tree, const void *kbuf, int ksiz);
- `tree' specifies the tree object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- If successful, the return value is true. False is returned when no record corresponds to the specified key.
The function `tctreeout2' is used in order to remove a string record of a tree object.
bool tctreeout2(TCTREE *tree, const char *kstr);
- `tree' specifies the tree object.
- `kstr' specifies the string of the key.
- If successful, the return value is true. False is returned when no record corresponds to the specified key.
The function `tctreeget' is used in order to retrieve a record in a tree object.
const void *tctreeget(TCTREE *tree, const void *kbuf, int ksiz, int *sp);
- `tree' specifies the tree object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the region of the value of the corresponding record. `NULL' is returned when no record corresponds.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string.
The function `tctreeget2' is used in order to retrieve a string record in a tree object.
const char *tctreeget2(TCTREE *tree, const char *kstr);
- `tree' specifies the tree object.
- `kstr' specifies the string of the key.
- If successful, the return value is the string of the value of the corresponding record. `NULL' is returned when no record corresponds.
The function `tctreeiterinit' is used in order to initialize the iterator of a tree object.
void tctreeiterinit(TCTREE *tree);
- `tree' specifies the tree object.
- The iterator is used in order to access the key of every record stored in the tree object.
The function `tctreeiterinit2' is used in order to initialize the iterator of a tree object in front of records corresponding to a key.
void tctreeiterinit2(TCTREE *tree, const void *kbuf, int ksiz);
- `tree' specifies the tree object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- The iterator is set to the first record corresponding to the key, or to the next record in order if no exactly matching record exists.
The function `tctreeiterinit3' is used in order to initialize the iterator of a tree object in front of records corresponding to a key string.
void tctreeiterinit3(TCTREE *tree, const char *kstr);
- `tree' specifies the tree object.
- `kstr' specifies the string of the key.
- The iterator is set to the first record corresponding to the key, or to the next record in order if no exactly matching record exists.
The function `tctreeiternext' is used in order to get the next key of the iterator of a tree object.
const void *tctreeiternext(TCTREE *tree, int *sp);
- `tree' specifies the tree object.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the region of the next key, else, it is `NULL'. `NULL' is returned when no record can be fetched from the iterator.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. The order of iteration is assured to be the ascending order of the keys.
The function `tctreeiternext2' is used in order to get the next key string of the iterator of a tree object.
const char *tctreeiternext2(TCTREE *tree);
- `tree' specifies the tree object.
- If successful, the return value is the pointer to the region of the next key, else, it is `NULL'. `NULL' is returned when no record can be fetched from the iterator.
- The order of iteration is assured to be the ascending order of the keys.
The function `tctreernum' is used in order to get the number of records stored in a tree object.
uint64_t tctreernum(const TCTREE *tree);
- `tree' specifies the tree object.
- The return value is the number of the records stored in the tree object.
The function `tctreemsiz' is used in order to get the total size of memory used in a tree object.
uint64_t tctreemsiz(const TCTREE *tree);
- `tree' specifies the tree object.
- The return value is the total size of memory used in a tree object.
The function `tctreekeys' is used in order to create a list object containing all keys in a tree object.
TCLIST *tctreekeys(const TCTREE *tree);
- `tree' specifies the tree object.
- The return value is the new list object containing all keys in the tree object.
- Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use.
The function `tctreevals' is used in order to create a list object containing all values in a tree object.
TCLIST *tctreevals(const TCTREE *tree);
- `tree' specifies the tree object.
- The return value is the new list object containing all values in the tree object.
- Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use.
The function `tctreeaddint' is used in order to add an integer to a record in a tree object.
int tctreeaddint(TCTREE *tree, const void *kbuf, int ksiz, int num);
- `tree' specifies the tree object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `num' specifies the additional value.
- The return value is the summation value.
- If the corresponding record exists, the value is treated as an integer and is added to. If no record corresponds, a new record of the additional value is stored.
The function `tctreeadddouble' is used in order to add a real number to a record in a tree object.
double tctreeadddouble(TCTREE *tree, const void *kbuf, int ksiz, double num);
- `tree' specifies the tree object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `num' specifies the additional value.
- The return value is the summation value.
- If the corresponding record exists, the value is treated as a real number and is added to. If no record corresponds, a new record of the additional value is stored.
The function `tctreeclear' is used in order to clear a tree object.
void tctreeclear(TCTREE *tree);
- `tree' specifies the tree object.
- All records are removed.
The function `tctreecutfringe' is used in order to remove fringe records of a tree object.
void tctreecutfringe(TCTREE *tree, int num);
- `tree' specifies the tree object.
- `num' specifies the number of records to be removed.
The function `tctreedump' is used in order to serialize a tree object into a byte array.
void *tctreedump(const TCTREE *tree, int *sp);
- `tree' specifies the tree object.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- The return value is the pointer to the region of the serialized result.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tctreeload' is used in order to create a tree object from a serialized byte array.
TCTREE *tctreeload(const void *ptr, int size, TCCMP cmp, void *cmpop);
- `ptr' specifies the pointer to the region of serialized byte array.
- `size' specifies the size of the region.
- `cmp' specifies the pointer to the custom comparison function.
- `cmpop' specifies an arbitrary pointer to be given as a parameter of the comparison function. If it is not needed, `NULL' can be specified.
- The return value is a new tree object.
- Because the object of the return value is created with the function `tctreenew', it should be deleted with the function `tctreedel' when it is no longer in use.
API of On-memory Hash Database
The function `tcmdbnew' is used in order to create an on-memory hash database object.
TCMDB *tcmdbnew(void);
- The return value is the new on-memory hash database object.
- The object can be shared by multiple threads because of the internal mutex.
The function `tcmdbnew2' is used in order to create an on-memory hash database object with a specified number of buckets.
TCMDB *tcmdbnew2(uint32_t bnum);
- `bnum' specifies the number of the buckets.
- The return value is the new on-memory hash database object.
- The object can be shared by multiple threads because of the internal mutex.
The function `tcmdbdel' is used in order to delete an on-memory hash database object.
void tcmdbdel(TCMDB *mdb);
- `mdb' specifies the on-memory hash database object.
The function `tcmdbput' is used in order to store a record into an on-memory hash database object.
void tcmdbput(TCMDB *mdb, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `mdb' specifies the on-memory hash database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value.
- If a record with the same key exists in the database, it is overwritten.
The function `tcmdbput2' is used in order to store a string record into an on-memory hash database object.
void tcmdbput2(TCMDB *mdb, const char *kstr, const char *vstr);
- `mdb' specifies the on-memory hash database object.
- `kstr' specifies the string of the key.
- `vstr' specifies the string of the value.
- If a record with the same key exists in the database, it is overwritten.
The function `tcmdbputkeep' is used in order to store a new record into an on-memory hash database object.
bool tcmdbputkeep(TCMDB *mdb, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `mdb' specifies the on-memory hash database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, this function has no effect.
The function `tcmdbputkeep2' is used in order to store a new string record into an on-memory hash database object.
bool tcmdbputkeep2(TCMDB *mdb, const char *kstr, const char *vstr);
- `mdb' specifies the on-memory hash database object.
- `kstr' specifies the string of the key.
- `vstr' specifies the string of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, this function has no effect.
The function `tcmdbputcat' is used in order to concatenate a value at the end of the existing record in an on-memory hash database.
void tcmdbputcat(TCMDB *mdb, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `mdb' specifies the on-memory hash database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value.
- If there is no corresponding record, a new record is created.
The function `tcmdbputcat2' is used in order to concatenate a string at the end of the existing record in an on-memory hash database.
void tcmdbputcat2(TCMDB *mdb, const char *kstr, const char *vstr);
- `mdb' specifies the on-memory hash database object.
- `kstr' specifies the string of the key.
- `vstr' specifies the string of the value.
- If there is no corresponding record, a new record is created.
The function `tcmdbout' is used in order to remove a record of an on-memory hash database object.
bool tcmdbout(TCMDB *mdb, const void *kbuf, int ksiz);
- `mdb' specifies the on-memory hash database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- If successful, the return value is true. False is returned when no record corresponds to the specified key.
The function `tcmdbout2' is used in order to remove a string record of an on-memory hash database object.
bool tcmdbout2(TCMDB *mdb, const char *kstr);
- `mdb' specifies the on-memory hash database object.
- `kstr' specifies the string of the key.
- If successful, the return value is true. False is returned when no record corresponds to the specified key.
The function `tcmdbget' is used in order to retrieve a record in an on-memory hash database object.
void *tcmdbget(TCMDB *mdb, const void *kbuf, int ksiz, int *sp);
- `mdb' specifies the on-memory hash database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the region of the value of the corresponding record. `NULL' is returned when no record corresponds.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcmdbget2' is used in order to retrieve a string record in an on-memory hash database object.
char *tcmdbget2(TCMDB *mdb, const char *kstr);
- `mdb' specifies the on-memory hash database object.
- `kstr' specifies the string of the key.
- If successful, the return value is the string of the value of the corresponding record. `NULL' is returned when no record corresponds.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcmdbvsiz' is used in order to get the size of the value of a record in an on-memory hash database object.
int tcmdbvsiz(TCMDB *mdb, const void *kbuf, int ksiz);
- `mdb' specifies the on-memory hash database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- If successful, the return value is the size of the value of the corresponding record, else, it is -1.
The function `tcmdbvsiz2' is used in order to get the size of the value of a string record in an on-memory hash database object.
int tcmdbvsiz2(TCMDB *mdb, const char *kstr);
- `mdb' specifies the on-memory hash database object.
- `kstr' specifies the string of the key.
- If successful, the return value is the size of the value of the corresponding record, else, it is -1.
The function `tcmdbiterinit' is used in order to initialize the iterator of an on-memory hash database object.
void tcmdbiterinit(TCMDB *mdb);
- `mdb' specifies the on-memory hash database object.
- The iterator is used in order to access the key of every record stored in the on-memory hash database.
The function `tcmdbiternext' is used in order to get the next key of the iterator of an on-memory hash database object.
void *tcmdbiternext(TCMDB *mdb, int *sp);
- `mdb' specifies the on-memory hash database object.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the region of the next key, else, it is `NULL'. `NULL' is returned when no record can be fetched from the iterator.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use. The order of iteration is assured to be the same as the stored order.
The function `tcmdbiternext2' is used in order to get the next key string of the iterator of an on-memory hash database object.
char *tcmdbiternext2(TCMDB *mdb);
- `mdb' specifies the on-memory hash database object.
- If successful, the return value is the pointer to the region of the next key, else, it is `NULL'. `NULL' is returned when no record can be fetched from the iterator.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use. The order of iteration is assured to be the same as the stored order.
The function `tcmdbfwmkeys' is used in order to get forward matching keys in an on-memory hash database object.
TCLIST *tcmdbfwmkeys(TCMDB *mdb, const void *pbuf, int psiz, int max);
- `mdb' specifies the on-memory hash database object.
- `pbuf' specifies the pointer to the region of the prefix.
- `psiz' specifies the size of the region of the prefix.
- `max' specifies the maximum number of keys to be fetched. If it is negative, no limit is specified.
- The return value is a list object of the corresponding keys. This function never fails; it returns an empty list even if no key corresponds.
- Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use. Note that this function may be very slow because every key in the database is scanned.
The function `tcmdbfwmkeys2' is used in order to get forward matching string keys in an on-memory hash database object.
TCLIST *tcmdbfwmkeys2(TCMDB *mdb, const char *pstr, int max);
- `mdb' specifies the on-memory hash database object.
- `pstr' specifies the string of the prefix.
- `max' specifies the maximum number of keys to be fetched. If it is negative, no limit is specified.
- The return value is a list object of the corresponding keys. This function never fails; it returns an empty list even if no key corresponds.
- Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use. Note that this function may be very slow because every key in the database is scanned.
The function `tcmdbrnum' is used in order to get the number of records stored in an on-memory hash database object.
uint64_t tcmdbrnum(TCMDB *mdb);
- `mdb' specifies the on-memory hash database object.
- The return value is the number of the records stored in the database.
The function `tcmdbmsiz' is used in order to get the total size of memory used in an on-memory hash database object.
uint64_t tcmdbmsiz(TCMDB *mdb);
- `mdb' specifies the on-memory hash database object.
- The return value is the total size of memory used in the database.
The function `tcmdbaddint' is used in order to add an integer to a record in an on-memory hash database object.
int tcmdbaddint(TCMDB *mdb, const void *kbuf, int ksiz, int num);
- `mdb' specifies the on-memory hash database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `num' specifies the additional value.
- The return value is the summation value.
- If the corresponding record exists, the value is treated as an integer and is added to. If no record corresponds, a new record of the additional value is stored.
The function `tcmdbadddouble' is used in order to add a real number to a record in an on-memory hash database object.
double tcmdbadddouble(TCMDB *mdb, const void *kbuf, int ksiz, double num);
- `mdb' specifies the on-memory hash database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `num' specifies the additional value.
- The return value is the summation value.
- If the corresponding record exists, the value is treated as a real number and is added to. If no record corresponds, a new record of the additional value is stored.
The function `tcmdbvanish' is used in order to clear an on-memory hash database object.
void tcmdbvanish(TCMDB *mdb);
- `mdb' specifies the on-memory hash database object.
- All records are removed.
The function `tcmdbcutfront' is used in order to remove front records of an on-memory hash database object.
void tcmdbcutfront(TCMDB *mdb, int num);
- `mdb' specifies the on-memory hash database object.
- `num' specifies the number of records to be removed.
On-memory Tree Database API
The function `tcndbnew' is used in order to create an on-memory tree database object.
TCNDB *tcndbnew(void);
- The return value is the new on-memory tree database object.
- The object can be shared by multiple threads because of the internal mutex.
The function `tcndbnew2' is used in order to create an on-memory tree database object with a custom comparison function.
TCNDB *tcndbnew2(TCCMP cmp, void *cmpop);
- `cmp' specifies the pointer to the custom comparison function.
- `cmpop' specifies an arbitrary pointer to be given as a parameter of the comparison function. If it is not needed, `NULL' can be specified.
- The return value is the new on-memory tree database object.
- The default comparison function compares keys of two records by lexical order. The functions `tccmplexical' (default), `tccmpdecimal', `tccmpint32', and `tccmpint64' are built-in. The object can be shared by multiple threads because of the internal mutex.
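A comparison function receives two key regions and returns a negative, zero, or positive value. The sketch below is self-contained and does not include the library header. The parameter list is assumed to mirror the `TCCMP' type (two regions with their sizes, plus the `cmpop' pointer), and `cmp_lexical' and `cmp_reverse' are hypothetical names, not the built-in functions.

```c
#include <stddef.h>
#include <string.h>

/* Same parameter shape as the `TCCMP' type is assumed to have:
   two key regions with their sizes, plus the opaque `cmpop' pointer. */
int cmp_lexical(const char *aptr, int asiz, const char *bptr, int bsiz, void *op){
  (void)op;                              /* unused in this comparator */
  int min = asiz < bsiz ? asiz : bsiz;
  int rv = memcmp(aptr, bptr, (size_t)min);
  if(rv != 0) return rv;
  return asiz - bsiz;                    /* a key that is a prefix sorts first */
}

/* Hypothetical custom order: descending lexical order. */
int cmp_reverse(const char *aptr, int asiz, const char *bptr, int bsiz, void *op){
  return cmp_lexical(bptr, bsiz, aptr, asiz, op);
}
```

A function of this shape, if it matches the real type, would be the value passed as `cmp', with `cmpop' handed back unchanged as the last argument on every call.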
The function `tcndbdel' is used in order to delete an on-memory tree database object.
void tcndbdel(TCNDB *ndb);
- `ndb' specifies the on-memory tree database object.
The function `tcndbput' is used in order to store a record into an on-memory tree database object.
void tcndbput(TCNDB *ndb, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `ndb' specifies the on-memory tree database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value.
- If a record with the same key exists in the database, it is overwritten.
The function `tcndbput2' is used in order to store a string record into an on-memory tree database object.
void tcndbput2(TCNDB *ndb, const char *kstr, const char *vstr);
- `ndb' specifies the on-memory tree database object.
- `kstr' specifies the string of the key.
- `vstr' specifies the string of the value.
- If a record with the same key exists in the database, it is overwritten.
The function `tcndbputkeep' is used in order to store a new record into an on-memory tree database object.
bool tcndbputkeep(TCNDB *ndb, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `ndb' specifies the on-memory tree database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, this function has no effect.
The function `tcndbputkeep2' is used in order to store a new string record into an on-memory tree database object.
bool tcndbputkeep2(TCNDB *ndb, const char *kstr, const char *vstr);
- `ndb' specifies the on-memory tree database object.
- `kstr' specifies the string of the key.
- `vstr' specifies the string of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, this function has no effect.
The function `tcndbputcat' is used in order to concatenate a value at the end of the existing record in an on-memory tree database.
void tcndbputcat(TCNDB *ndb, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `ndb' specifies the on-memory tree database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value.
- If there is no corresponding record, a new record is created.
The function `tcndbputcat2' is used in order to concatenate a string at the end of the existing record in an on-memory tree database.
void tcndbputcat2(TCNDB *ndb, const char *kstr, const char *vstr);
- `ndb' specifies the on-memory tree database object.
- `kstr' specifies the string of the key.
- `vstr' specifies the string of the value.
- If there is no corresponding record, a new record is created.
The function `tcndbout' is used in order to remove a record of an on-memory tree database object.
bool tcndbout(TCNDB *ndb, const void *kbuf, int ksiz);
- `ndb' specifies the on-memory tree database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- If successful, the return value is true. False is returned when no record corresponds to the specified key.
The function `tcndbout2' is used in order to remove a string record of an on-memory tree database object.
bool tcndbout2(TCNDB *ndb, const char *kstr);
- `ndb' specifies the on-memory tree database object.
- `kstr' specifies the string of the key.
- If successful, the return value is true. False is returned when no record corresponds to the specified key.
The function `tcndbget' is used in order to retrieve a record in an on-memory tree database object.
void *tcndbget(TCNDB *ndb, const void *kbuf, int ksiz, int *sp);
- `ndb' specifies the on-memory tree database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the region of the value of the corresponding record. `NULL' is returned when no record corresponds.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcndbget2' is used in order to retrieve a string record in an on-memory tree database object.
char *tcndbget2(TCNDB *ndb, const char *kstr);
- `ndb' specifies the on-memory tree database object.
- `kstr' specifies the string of the key.
- If successful, the return value is the string of the value of the corresponding record. `NULL' is returned when no record corresponds.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcndbvsiz' is used in order to get the size of the value of a record in an on-memory tree database object.
int tcndbvsiz(TCNDB *ndb, const void *kbuf, int ksiz);
- `ndb' specifies the on-memory tree database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- If successful, the return value is the size of the value of the corresponding record, else, it is -1.
The function `tcndbvsiz2' is used in order to get the size of the value of a string record in an on-memory tree database object.
int tcndbvsiz2(TCNDB *ndb, const char *kstr);
- `ndb' specifies the on-memory tree database object.
- `kstr' specifies the string of the key.
- If successful, the return value is the size of the value of the corresponding record, else, it is -1.
The function `tcndbiterinit' is used in order to initialize the iterator of an on-memory tree database object.
void tcndbiterinit(TCNDB *ndb);
- `ndb' specifies the on-memory tree database object.
- The iterator is used in order to access the key of every record stored in the on-memory database.
The function `tcndbiterinit2' is used in order to initialize the iterator of an on-memory tree database object in front of a key.
void tcndbiterinit2(TCNDB *ndb, const void *kbuf, int ksiz);
- `ndb' specifies the on-memory tree database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- The iterator is set to the first record corresponding to the key, or to the next substitute if no completely matching record exists.
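The "matching record or next substitute" rule is the usual lower-bound search over ordered keys. The following standalone sketch shows the idea on a sorted array of strings; `first_at_or_after' is a hypothetical helper and says nothing about how the library stores its tree.

```c
#include <stddef.h>
#include <string.h>

/* Return the index of the first key that is not less than `key',
   or `n' when every key is smaller. `keys' must be sorted ascending. */
size_t first_at_or_after(const char *const *keys, size_t n, const char *key){
  size_t lo = 0, hi = n;
  while(lo < hi){
    size_t mid = lo + (hi - lo) / 2;
    if(strcmp(keys[mid], key) < 0){
      lo = mid + 1;      /* keys[mid] is too small: search the upper half */
    } else {
      hi = mid;          /* keys[mid] qualifies: keep it as a candidate */
    }
  }
  return lo;
}
```

With the keys "apple", "banana", and "cherry", searching for "banana" yields index 1 (an exact match) and searching for "blueberry" yields index 2 (the next substitute).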
The function `tcndbiterinit3' is used in order to initialize the iterator of an on-memory tree database object in front of a key string.
void tcndbiterinit3(TCNDB *ndb, const char *kstr);
- `ndb' specifies the on-memory tree database object.
- `kstr' specifies the string of the key.
- The iterator is set to the first record corresponding to the key, or to the next substitute if no completely matching record exists.
The function `tcndbiternext' is used in order to get the next key of the iterator of an on-memory tree database object.
void *tcndbiternext(TCNDB *ndb, int *sp);
- `ndb' specifies the on-memory tree database object.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the region of the next key, else, it is `NULL'. `NULL' is returned when no record can be fetched from the iterator.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use. The order of iteration is assured to be ascending order of the keys.
The function `tcndbiternext2' is used in order to get the next key string of the iterator of an on-memory tree database object.
char *tcndbiternext2(TCNDB *ndb);
- `ndb' specifies the on-memory tree database object.
- If successful, the return value is the pointer to the region of the next key, else, it is `NULL'. `NULL' is returned when no record can be fetched from the iterator.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use. The order of iteration is assured to be ascending order of the keys.
The function `tcndbfwmkeys' is used in order to get forward matching keys in an on-memory tree database object.
TCLIST *tcndbfwmkeys(TCNDB *ndb, const void *pbuf, int psiz, int max);
- `ndb' specifies the on-memory tree database object.
- `pbuf' specifies the pointer to the region of the prefix.
- `psiz' specifies the size of the region of the prefix.
- `max' specifies the maximum number of keys to be fetched. If it is negative, no limit is specified.
- The return value is a list object of the corresponding keys. This function never fails; it returns an empty list even if no key corresponds.
- Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use.
The function `tcndbfwmkeys2' is used in order to get forward matching string keys in an on-memory tree database object.
TCLIST *tcndbfwmkeys2(TCNDB *ndb, const char *pstr, int max);
- `ndb' specifies the on-memory tree database object.
- `pstr' specifies the string of the prefix.
- `max' specifies the maximum number of keys to be fetched. If it is negative, no limit is specified.
- The return value is a list object of the corresponding keys. This function never fails; it returns an empty list even if no key corresponds.
- Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use.
The function `tcndbrnum' is used in order to get the number of records stored in an on-memory tree database object.
uint64_t tcndbrnum(TCNDB *ndb);
- `ndb' specifies the on-memory tree database object.
- The return value is the number of the records stored in the database.
The function `tcndbmsiz' is used in order to get the total size of memory used in an on-memory tree database object.
uint64_t tcndbmsiz(TCNDB *ndb);
- `ndb' specifies the on-memory tree database object.
- The return value is the total size of memory used in the database.
The function `tcndbaddint' is used in order to add an integer to a record in an on-memory tree database object.
int tcndbaddint(TCNDB *ndb, const void *kbuf, int ksiz, int num);
- `ndb' specifies the on-memory tree database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `num' specifies the additional value.
- The return value is the summation value.
- If the corresponding record exists, the value is treated as an integer and is added to. If no record corresponds, a new record of the additional value is stored.
The function `tcndbadddouble' is used in order to add a real number to a record in an on-memory tree database object.
double tcndbadddouble(TCNDB *ndb, const void *kbuf, int ksiz, double num);
- `ndb' specifies the on-memory tree database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `num' specifies the additional value.
- The return value is the summation value.
- If the corresponding record exists, the value is treated as a real number and is added to. If no record corresponds, a new record of the additional value is stored.
The function `tcndbvanish' is used in order to clear an on-memory tree database object.
void tcndbvanish(TCNDB *ndb);
- `ndb' specifies the on-memory tree database object.
- All records are removed.
The function `tcndbcutfringe' is used in order to remove fringe records of an on-memory tree database object.
void tcndbcutfringe(TCNDB *ndb, int num);
- `ndb' specifies the on-memory tree database object.
- `num' specifies the number of records to be removed.
Memory Pool API
The function `tcmpoolnew' is used in order to create a memory pool object.
TCMPOOL *tcmpoolnew(void);
- The return value is the new memory pool object.
The function `tcmpooldel' is used in order to delete a memory pool object.
void tcmpooldel(TCMPOOL *mpool);
- `mpool' specifies the memory pool object.
- Note that the deleted object and its derivatives cannot be used anymore.
The function `tcmpoolput' is used in order to relegate an arbitrary object to a memory pool object.
void tcmpoolput(TCMPOOL *mpool, void *ptr, void (*del)(void *));
- `mpool' specifies the memory pool object.
- `ptr' specifies the pointer to the object to be relegated.
- `del' specifies the pointer to the function to delete the object.
- This function assures that the specified object is deleted when the memory pool object is deleted.
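The idea of relegating an object is that the pool remembers the pointer together with its deleter and calls the deleter when the pool itself is deleted. The sketch below is a miniature stand-in written for illustration only: `Pool', `pool_new', `pool_put', and `pool_del' are hypothetical names, and the newest-first release order is a choice of this sketch rather than documented behavior.

```c
#include <stdlib.h>

/* Hypothetical miniature pool: remembers (pointer, deleter) pairs. */
typedef struct {
  void *ptr;
  void (*del)(void *);
} PoolEntry;

typedef struct {
  PoolEntry *entries;
  size_t num;
  size_t cap;
} Pool;

Pool *pool_new(void){
  Pool *pool = malloc(sizeof(*pool));
  if(!pool) abort();
  pool->entries = NULL;
  pool->num = 0;
  pool->cap = 0;
  return pool;
}

/* Hand `ptr' over to the pool; `del' is called on it when the pool dies. */
void pool_put(Pool *pool, void *ptr, void (*del)(void *)){
  if(pool->num == pool->cap){
    size_t ncap = pool->cap ? pool->cap * 2 : 8;
    PoolEntry *ne = realloc(pool->entries, ncap * sizeof(*ne));
    if(!ne) abort();
    pool->entries = ne;
    pool->cap = ncap;
  }
  pool->entries[pool->num].ptr = ptr;
  pool->entries[pool->num].del = del;
  pool->num++;
}

/* Delete every registered object (newest first), then the pool itself.
   Returns how many objects were released. */
size_t pool_del(Pool *pool){
  size_t released = 0;
  for(size_t i = pool->num; i > 0; i--){
    pool->entries[i - 1].del(pool->entries[i - 1].ptr);
    released++;
  }
  free(pool->entries);
  free(pool);
  return released;
}
```

The caller never frees relegated objects directly; a single `pool_del' call at the end of a scope releases everything that was put into the pool.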
The function `tcmpoolputptr' is used in order to relegate an allocated region to a memory pool object.
void tcmpoolputptr(TCMPOOL *mpool, void *ptr);
- `mpool' specifies the memory pool object.
- `ptr' specifies the pointer to the region to be relegated.
- This function assures that the specified region is released when the memory pool object is deleted.
The function `tcmpoolputxstr' is used in order to relegate an extensible string object to a memory pool object.
void tcmpoolputxstr(TCMPOOL *mpool, TCXSTR *xstr);
- `mpool' specifies the memory pool object.
- `xstr' specifies the extensible string object.
- This function assures that the specified object is deleted when the memory pool object is deleted.
The function `tcmpoolputlist' is used in order to relegate a list object to a memory pool object.
void tcmpoolputlist(TCMPOOL *mpool, TCLIST *list);
- `mpool' specifies the memory pool object.
- `list' specifies the list object.
- This function assures that the specified object is deleted when the memory pool object is deleted.
The function `tcmpoolputmap' is used in order to relegate a map object to a memory pool object.
void tcmpoolputmap(TCMPOOL *mpool, TCMAP *map);
- `mpool' specifies the memory pool object.
- `map' specifies the map object.
- This function assures that the specified object is deleted when the memory pool object is deleted.
The function `tcmpoolputtree' is used in order to relegate a tree object to a memory pool object.
void tcmpoolputtree(TCMPOOL *mpool, TCTREE *tree);
- `mpool' specifies the memory pool object.
- `tree' specifies the tree object.
- This function assures that the specified object is deleted when the memory pool object is deleted.
The function `tcmpoolmalloc' is used in order to allocate a region relegated to a memory pool object.
void *tcmpoolmalloc(TCMPOOL *mpool, size_t size);
- `mpool' specifies the memory pool object.
- `size' specifies the size of the region.
- The return value is the pointer to the allocated region under the memory pool.
The function `tcmpoolxstrnew' is used in order to create an extensible string object relegated to a memory pool object.
TCXSTR *tcmpoolxstrnew(TCMPOOL *mpool);
- `mpool' specifies the memory pool object.
- The return value is the new extensible string object under the memory pool.
The function `tcmpoollistnew' is used in order to create a list object relegated to a memory pool object.
TCLIST *tcmpoollistnew(TCMPOOL *mpool);
- `mpool' specifies the memory pool object.
- The return value is the new list object under the memory pool.
The function `tcmpoolmapnew' is used in order to create a map object relegated to a memory pool object.
TCMAP *tcmpoolmapnew(TCMPOOL *mpool);
- `mpool' specifies the memory pool object.
- The return value is the new map object under the memory pool.
The function `tcmpooltreenew' is used in order to create a tree object relegated to a memory pool object.
TCTREE *tcmpooltreenew(TCMPOOL *mpool);
- `mpool' specifies the memory pool object.
- The return value is the new tree object under the memory pool.
The function `tcmpoolglobal' is used in order to get the global memory pool object.
TCMPOOL *tcmpoolglobal(void);
- The return value is the global memory pool object.
- The global memory pool object is a singleton and assured to be deleted when the process is terminating normally.
Miscellaneous Utilities API
The function `tclmax' is used in order to get the larger value of two integers.
long tclmax(long a, long b);
- `a' specifies an integer.
- `b' specifies the other integer.
- The return value is the larger value of the two.
The function `tclmin' is used in order to get the lesser value of two integers.
long tclmin(long a, long b);
- `a' specifies an integer.
- `b' specifies the other integer.
- The return value is the lesser value of the two.
The function `tclrand' is used in order to get a random number as a long integer based on uniform distribution.
unsigned long tclrand(void);
- The return value is the random number between 0 and `ULONG_MAX'.
- This function uses the random number source device and generates a real random number if possible.
The function `tcdrand' is used in order to get a random number as a double-precision floating-point number based on uniform distribution.
double tcdrand(void);
- The return value is the random number equal to or greater than 0, and less than 1.0.
- This function uses the random number source device and generates a real random number if possible.
The function `tcdrandnd' is used in order to get a random number as a double-precision floating-point number based on normal distribution.
double tcdrandnd(double avg, double sd);
- `avg' specifies the average.
- `sd' specifies the standard deviation.
- The return value is the random number.
- This function uses the random number source device and generates a real random number if possible.
The function `tcstricmp' is used in order to compare two strings with case insensitive evaluation.
int tcstricmp(const char *astr, const char *bstr);
- `astr' specifies a string.
- `bstr' specifies the other string.
- The return value is positive if the former is greater, negative if the latter is greater, and 0 if both are equivalent.
The function `tcstrfwm' is used in order to check whether a string begins with a key.
bool tcstrfwm(const char *str, const char *key);
- `str' specifies the target string.
- `key' specifies the forward matching key string.
- The return value is true if the target string begins with the key, else, it is false.
The function `tcstrifwm' is used in order to check whether a string begins with a key with case insensitive evaluation.
bool tcstrifwm(const char *str, const char *key);
- `str' specifies the target string.
- `key' specifies the forward matching key string.
- The return value is true if the target string begins with the key, else, it is false.
The function `tcstrbwm' is used in order to check whether a string ends with a key.
bool tcstrbwm(const char *str, const char *key);
- `str' specifies the target string.
- `key' specifies the backward matching key string.
- The return value is true if the target string ends with the key, else, it is false.
The function `tcstribwm' is used in order to check whether a string ends with a key with case insensitive evaluation.
bool tcstribwm(const char *str, const char *key);
- `str' specifies the target string.
- `key' specifies the backward matching key string.
- The return value is true if the target string ends with the key, else, it is false.
The function `tcstrdist' is used in order to calculate the edit distance of two strings.
int tcstrdist(const char *astr, const char *bstr);
- `astr' specifies a string.
- `bstr' specifies the other string.
- The return value is the edit distance which is known as the Levenshtein distance. The cost is calculated by byte.
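The Levenshtein distance counts the minimum number of single-element insertions, deletions, and substitutions needed to turn one string into the other. The standalone sketch below computes it byte by byte with the classic dynamic-programming recurrence; `edit_distance' is a hypothetical name and the code is not taken from the library.

```c
#include <stdlib.h>
#include <string.h>

/* Byte-wise Levenshtein distance with two rolling rows.
   Returns -1 if memory cannot be allocated. */
int edit_distance(const char *a, const char *b){
  size_t alen = strlen(a), blen = strlen(b);
  int *prev = malloc((blen + 1) * sizeof(*prev));
  int *cur = malloc((blen + 1) * sizeof(*cur));
  if(!prev || !cur){
    free(prev);
    free(cur);
    return -1;
  }
  for(size_t j = 0; j <= blen; j++) prev[j] = (int)j;
  for(size_t i = 1; i <= alen; i++){
    cur[0] = (int)i;
    for(size_t j = 1; j <= blen; j++){
      int sub = prev[j - 1] + (a[i - 1] != b[j - 1]);   /* substitute or match */
      int del = prev[j] + 1;                            /* delete from `a' */
      int ins = cur[j - 1] + 1;                         /* insert into `a' */
      int best = sub < del ? sub : del;
      cur[j] = best < ins ? best : ins;
    }
    int *tmp = prev;     /* the finished row becomes the previous row */
    prev = cur;
    cur = tmp;
  }
  int dist = prev[blen];
  free(prev);
  free(cur);
  return dist;
}
```

Because the cost is counted per byte, a multi-byte UTF-8 character weighs more than one: comparing "\xC3\xA9" (e with an acute accent, two bytes) with "e" gives 2 in this sketch, whereas a per-character count would give 1.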
The function `tcstrdistutf' is used in order to calculate the edit distance of two UTF-8 strings.
int tcstrdistutf(const char *astr, const char *bstr);
- `astr' specifies a string.
- `bstr' specifies the other string.
- The return value is the edit distance which is known as the Levenshtein distance. The cost is calculated by Unicode character.
The function `tcstrtoupper' is used in order to convert the letters of a string into upper case.
char *tcstrtoupper(char *str);
- `str' specifies the string to be converted.
- The return value is the string itself.
The function `tcstrtolower' is used in order to convert the letters of a string into lower case.
char *tcstrtolower(char *str);
- `str' specifies the string to be converted.
- The return value is the string itself.
The function `tcstrtrim' is used in order to cut space characters at the head and tail of a string.
char *tcstrtrim(char *str);
- `str' specifies the string to be converted.
- The return value is the string itself.
The function `tcstrsqzspc' is used in order to squeeze space characters in a string and trim it.
char *tcstrsqzspc(char *str);
- `str' specifies the string to be converted.
- The return value is the string itself.
The function `tcstrsubchr' is used in order to substitute characters in a string.
char *tcstrsubchr(char *str, const char *rstr, const char *sstr);
- `str' specifies the string to be converted.
- `rstr' specifies the string containing characters to be replaced.
- `sstr' specifies the string containing characters to be substituted.
- If the substitute string is shorter than the string of characters to be replaced, the characters without a counterpart are removed.
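The positional rule (the n-th character of `rstr' becomes the n-th character of `sstr', or disappears when `sstr' is too short) can be sketched as follows. This is an independent illustration of the described behavior; `subst_chars' is a hypothetical name.

```c
#include <string.h>

/* Replace each character of `str' found in `rstr' with the character at the
   same position in `sstr'; drop it when `sstr' has no character there.
   Works in place and returns `str'. */
char *subst_chars(char *str, const char *rstr, const char *sstr){
  size_t slen = strlen(sstr);
  char *wp = str;
  for(const char *rp = str; *rp != '\0'; rp++){
    const char *hit = strchr(rstr, *rp);
    if(!hit){
      *wp++ = *rp;                          /* not listed: keep as-is */
    } else {
      size_t idx = (size_t)(hit - rstr);
      if(idx < slen) *wp++ = sstr[idx];     /* listed with a substitute */
      /* listed without a substitute: the character is removed */
    }
  }
  *wp = '\0';
  return str;
}
```

For example, applying the sketch to "hello world" with "lo" as the characters to be replaced and "L" as the substitutes turns every "l" into "L" and removes every "o", giving "heLL wrLd".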
The function `tcstrcntutf' is used in order to count the number of characters in a string of UTF-8.
int tcstrcntutf(const char *str);
- `str' specifies the string of UTF-8.
- The return value is the number of characters in the string.
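One common way to count characters in valid UTF-8 is to count every byte that is not a continuation byte. The sketch below takes that approach; `utf8_count' is a hypothetical name, and whether the library counts in exactly this way is not stated here.

```c
/* Count code points by counting bytes that are not continuation bytes
   (continuation bytes have the bit pattern 10xxxxxx). Assumes valid UTF-8. */
int utf8_count(const char *str){
  int cnt = 0;
  for(const unsigned char *p = (const unsigned char *)str; *p != '\0'; p++){
    if((*p & 0xC0) != 0x80) cnt++;
  }
  return cnt;
}
```

A string of three ASCII letters counts as 3, while a string of two three-byte characters (six bytes) counts as 2.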
The function `tcstrcututf' is used in order to cut a string of UTF-8 at the specified number of characters.
char *tcstrcututf(char *str, int num);
- `str' specifies the string of UTF-8.
- `num' specifies the number of characters to be kept.
- The return value is the string itself.
The function `tcstrutftoucs' is used in order to convert a UTF-8 string into a UCS-2 array.
void tcstrutftoucs(const char *str, uint16_t *ary, int *np);
- `str' specifies the UTF-8 string.
- `ary' specifies the pointer to the region into which the result UCS-2 codes are written. The size of the buffer should be sufficient.
- `np' specifies the pointer to a variable into which the number of elements of the result array is assigned.
The function `tcstrucstoutf' is used in order to convert a UCS-2 array into a UTF-8 string.
void tcstrucstoutf(const uint16_t *ary, int num, char *str);
- `ary' specifies the array of UCS-2 codes.
- `num' specifies the number of elements of the array.
- `str' specifies the pointer to the region into which the result UTF-8 string is written. The size of the buffer should be sufficient.
The function `tcstrsplit' is used in order to create a list object by splitting a string.
TCLIST *tcstrsplit(const char *str, const char *delims);
- `str' specifies the source string.
- `delims' specifies a string containing delimiting characters.
- The return value is a list object of the split elements.
- If two delimiters are successive, it is assumed that an empty element is between the two. Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use.
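The rule about successive delimiters means that splitting never skips a field. The standalone sketch below applies the same rule with plain C strings instead of a list object; `split_fields' is a hypothetical helper.

```c
#include <stdlib.h>
#include <string.h>

/* Split `str' at any character of `delims'. Up to `cap' fields are stored in
   `out' as newly allocated strings (the caller frees them); the return value
   is the total number of fields, counting empty ones. */
size_t split_fields(const char *str, const char *delims, char **out, size_t cap){
  size_t num = 0;
  for(;;){
    size_t len = strcspn(str, delims);   /* length up to the next delimiter */
    if(num < cap){
      char *field = malloc(len + 1);
      if(!field) abort();
      memcpy(field, str, len);
      field[len] = '\0';
      out[num] = field;
    }
    num++;
    if(str[len] == '\0') break;          /* reached the end of the input */
    str += len + 1;                      /* skip the delimiter itself */
  }
  return num;
}
```

Splitting "a,,b" at "," with this sketch yields three fields: "a", an empty string, and "b".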
The function `tcstrjoin' is used in order to create a string by joining all elements of a list object.
char *tcstrjoin(TCLIST *list, char delim);
- `list' specifies a list object.
- `delim' specifies a delimiting character.
- The return value is the result string.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcatoi' is used in order to convert a string with a metric prefix to an integer.
int64_t tcatoi(const char *str);
- `str' specifies a string which can be trailed by a binary metric prefix. "K", "M", "G", "T", "P", and "E" are supported. They are case-insensitive.
- The return value is the integer. If the string does not contain numeric expression, 0 is returned. If the integer overflows the domain, `INT64_MAX' or `INT64_MIN' is returned according to the sign.
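A binary metric prefix multiplies the number by a power of 1024, so "K" means 2^10 and "E" means 2^60. The sketch below is a simplified stand-in built on `strtoll': `parse_with_prefix' is a hypothetical name, and details such as how the real function treats fractions or surrounding spaces are not covered.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a decimal integer optionally followed by a binary prefix
   (K = 2^10, M = 2^20, ... E = 2^60). Saturates on overflow. */
int64_t parse_with_prefix(const char *str){
  char *end;
  errno = 0;
  long long num = strtoll(str, &end, 10);
  if(end == str) return 0;                        /* no numeric expression */
  if(errno == ERANGE) return num < 0 ? INT64_MIN : INT64_MAX;
  int shift = 0;
  switch(toupper((unsigned char)*end)){
    case 'K': shift = 10; break;
    case 'M': shift = 20; break;
    case 'G': shift = 30; break;
    case 'T': shift = 40; break;
    case 'P': shift = 50; break;
    case 'E': shift = 60; break;
    default: break;                               /* no prefix: multiply by 1 */
  }
  int64_t unit = (int64_t)1 << shift;
  if(num > INT64_MAX / unit) return INT64_MAX;    /* would overflow upward */
  if(num < INT64_MIN / unit) return INT64_MIN;    /* would overflow downward */
  return (int64_t)num * unit;
}
```

With this sketch, "1k" gives 1024 and "2M" gives 2097152, while "16E" would be 2^64 and is therefore clamped to `INT64_MAX'.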
The function `tcregexmatch' is used in order to check whether a string matches a regular expression.
bool tcregexmatch(const char *str, const char *regex);
- `str' specifies the target string.
- `regex' specifies the regular expression string. If it begins with `*', the trailing substring is used as a case-insensitive regular expression.
- The return value is true if the match succeeds, else, it is false.
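The leading `*' convention can be reproduced with the POSIX regular expression functions. The following is a sketch under assumptions: `regex_match' is a hypothetical name, and the use of extended syntax (`REG_EXTENDED') is a choice made here, not something the description above states.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* POSIX-regex sketch of the "leading `*' means case-insensitive" convention.
   Extended syntax (REG_EXTENDED) is an assumption of this sketch. */
bool regex_match(const char *str, const char *regex){
  int flags = REG_EXTENDED | REG_NOSUB;
  if(regex[0] == '*'){
    flags |= REG_ICASE;
    regex++;                          /* the rest is the actual expression */
  }
  regex_t rbuf;
  if(regcomp(&rbuf, regex, flags) != 0) return false;   /* invalid expression */
  bool hit = regexec(&rbuf, str, 0, NULL, 0) == 0;
  regfree(&rbuf);
  return hit;
}
```

In this sketch "Hello World" matches "*hello" but not "hello", because only the first expression is compiled as case-insensitive.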
The function `tcregexreplace' is used in order to replace each substring matching a regular expression string.
char *tcregexreplace(const char *str, const char *regex, const char *alt);
- `str' specifies the target string.
- `regex' specifies the regular expression string for substrings. If it begins with `*', the trailing substring is used as a case-insensitive regular expression.
- `alt' specifies the alternative string with which each substring is replaced. Each `&' in the string is replaced with the matched substring. Each `\' in the string escapes the following character. Special escapes "\1" through "\9" referring to the corresponding matching sub-expressions in the regular expression string are supported.
- The return value is a new converted string. Even if the regular expression is invalid, a copy of the original string is returned.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcmd5hash' is used in order to get the MD5 hash value of a record.
void tcmd5hash(const void *ptr, int size, char *buf);
- `ptr' specifies the pointer to the region of the record.
- `size' specifies the size of the region.
- `buf' specifies the pointer to the region into which the result string is written. The size of the buffer should be equal to or more than 48 bytes.
The function `tcchidxnew' is used in order to create a consistent hashing object.
TCCHIDX *tcchidxnew(int range);
- `range' specifies the number of nodes. It should be more than 0. The range of hash values is from 0 to less than the specified number.
- The return value is the new consistent hashing object.
- Consistent hashing is useful because the addition or removal of one node does not significantly change the mapping of keys to nodes.
The function `tcchidxdel' is used in order to delete a consistent hashing object.
void tcchidxdel(TCCHIDX *chidx);
- `chidx' specifies the consistent hashing object.
The function `tcchidxhash' is used in order to get the consistent hashing value of a record.
int tcchidxhash(TCCHIDX *chidx, const void *ptr, int size);
- `chidx' specifies the consistent hashing object.
- `ptr' specifies the pointer to the region of the record.
- `size' specifies the size of the region.
- The return value is the hash value of the record.
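The property that adding one node barely changes the mapping can be seen in a small ring with virtual points. The sketch below is an independent illustration and not the library's algorithm: `Ring', `ring_new', `ring_del', `ring_hash', the value of `VNODES', and both hash functions are choices made for this example.

```c
#include <stdint.h>
#include <stdlib.h>

#define VNODES 64   /* virtual points per node; an arbitrary choice */

typedef struct { uint32_t hash; int node; } RingPoint;
typedef struct { RingPoint *points; int num; } Ring;

/* Invertible 32-bit mixer (the MurmurHash3 finalizer). */
static uint32_t mix32(uint32_t h){
  h ^= h >> 16; h *= 0x85ebca6bU;
  h ^= h >> 13; h *= 0xc2b2ae35U;
  h ^= h >> 16;
  return h;
}

static int point_cmp(const void *a, const void *b){
  uint32_t ha = ((const RingPoint *)a)->hash, hb = ((const RingPoint *)b)->hash;
  return (ha > hb) - (ha < hb);
}

/* Build a ring for nodes 0 .. range-1. */
Ring *ring_new(int range){
  Ring *ring = malloc(sizeof(*ring));
  if(!ring) abort();
  ring->num = range * VNODES;
  ring->points = malloc((size_t)ring->num * sizeof(*ring->points));
  if(!ring->points) abort();
  for(int i = 0; i < ring->num; i++){
    ring->points[i].hash = mix32((uint32_t)i);   /* depends only on (node, replica) */
    ring->points[i].node = i / VNODES;
  }
  qsort(ring->points, (size_t)ring->num, sizeof(*ring->points), point_cmp);
  return ring;
}

void ring_del(Ring *ring){
  free(ring->points);
  free(ring);
}

/* Map a record to a node: hash it (FNV-1a), then take the first ring point
   at or after that hash, wrapping around to the start. */
int ring_hash(const Ring *ring, const void *ptr, int size){
  const unsigned char *p = ptr;
  uint32_t h = 2166136261U;
  for(int i = 0; i < size; i++) h = (h ^ p[i]) * 16777619U;
  int lo = 0, hi = ring->num;
  while(lo < hi){
    int mid = lo + (hi - lo) / 2;
    if(ring->points[mid].hash < h) lo = mid + 1; else hi = mid;
  }
  return ring->points[lo == ring->num ? 0 : lo].node;
}
```

Because the points of the existing nodes stay where they are when a node is added, a record either keeps its node or moves to the new node. It never moves between two old nodes.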
The function `tctime' is used in order to get the time of day in seconds.
double tctime(void);
- The return value is the time of day in seconds. The accuracy is in microseconds.
The function `tccalendar' is used in order to get the Gregorian calendar of a time.
void tccalendar(int64_t t, int jl, int *yearp, int *monp, int *dayp, int *hourp, int *minp, int *secp);
- `t' specifies the source time in seconds from the epoch. If it is `INT64_MAX', the current time is specified.
- `jl' specifies the jet lag of a location in seconds, that is, its time difference from UTC. If it is `INT_MAX', the local jet lag is specified.
- `yearp' specifies the pointer to a variable to which the year is assigned. If it is `NULL', it is not used.
- `monp' specifies the pointer to a variable to which the month is assigned. If it is `NULL', it is not used. 1 means January and 12 means December.
- `dayp' specifies the pointer to a variable to which the day of the month is assigned. If it is `NULL', it is not used.
- `hourp' specifies the pointer to a variable to which the hour is assigned. If it is `NULL', it is not used.
- `minp' specifies the pointer to a variable to which the minute is assigned. If it is `NULL', it is not used.
- `secp' specifies the pointer to a variable to which the second is assigned. If it is `NULL', it is not used.
The function `tcdatestrwww' is used in order to format a date as a string in W3CDTF.
void tcdatestrwww(int64_t t, int jl, char *buf);
- `t' specifies the source time in seconds from the epoch. If it is `INT64_MAX', the current time is specified.
- `jl' specifies the jet lag of a location in seconds. If it is `INT_MAX', the local jet lag is specified.
- `buf' specifies the pointer to the region into which the result string is written. The size of the buffer should be equal to or more than 48 bytes.
- W3CDTF represents a date as "YYYY-MM-DDThh:mm:ssTZD".
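The W3CDTF layout can be produced with the standard C time functions by shifting the time by the offset and then breaking it down as UTC. The sketch below is a standalone illustration: `format_w3cdtf' is a hypothetical name, the extra `size' parameter is not part of the signature above, and writing "Z" for a zero offset is a choice of this sketch.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Format `t' (seconds from the epoch) as W3CDTF for a zone that is `jl'
   seconds ahead of UTC. Writing "Z" for a zero offset is a choice of this
   sketch. `buf' should have room for at least 26 bytes. */
void format_w3cdtf(int64_t t, int jl, char *buf, size_t size){
  time_t shifted = (time_t)(t + jl);
  struct tm *tp = gmtime(&shifted);      /* break down the shifted time as UTC */
  if(!tp){
    if(size > 0) buf[0] = '\0';
    return;
  }
  char zone[16];
  if(jl == 0){
    snprintf(zone, sizeof(zone), "Z");
  } else {
    int mins = abs(jl) / 60;
    snprintf(zone, sizeof(zone), "%c%02d:%02d", jl < 0 ? '-' : '+', mins / 60, mins % 60);
  }
  snprintf(buf, size, "%04d-%02d-%02dT%02d:%02d:%02d%s",
           tp->tm_year + 1900, tp->tm_mon + 1, tp->tm_mday,
           tp->tm_hour, tp->tm_min, tp->tm_sec, zone);
}
```

For the time 1230986493 with an offset of 32400 seconds (9 hours), the sketch writes "2009-01-03T21:41:33+09:00".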
The function `tcdatestrhttp' is used in order to format a date as a string in RFC 1123 format.
void tcdatestrhttp(int64_t t, int jl, char *buf);
- `t' specifies the source time in seconds from the epoch. If it is `INT64_MAX', the current time is specified.
- `jl' specifies the jet lag of a location in seconds. If it is `INT_MAX', the local jet lag is specified.
- `buf' specifies the pointer to the region into which the result string is written. The size of the buffer should be equal to or more than 48 bytes.
- RFC 1123 format represents a date as "Wdy, DD Mon YYYY hh:mm:ss TZD".
The function `tcstrmktime' is used in order to get the time value of a date string.
int64_t tcstrmktime(const char *str);
- `str' specifies the date string in decimal, hexadecimal, W3CDTF, or RFC 822 (1123) format. A decimal value can be followed by "s" for seconds, "m" for minutes, "h" for hours, or "d" for days.
- The return value is the time value of the date or `INT64_MAX' if the format is invalid.
The function `tcjetlag' is used in order to get the jet lag of the local time.
int tcjetlag(void);
- The return value is the jet lag of the local time in seconds.
The function `tcdayofweek' is used in order to get the day of week of a date.
int tcdayofweek(int year, int mon, int day);
- `year' specifies the year of a date.
- `mon' specifies the month of the date.
- `day' specifies the day of the date.
- The return value is the day of week of the date. 0 means Sunday and 6 means Saturday.
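The day of the week of a Gregorian date can be computed with integer arithmetic alone. The sketch below uses Sakamoto's table-based method, which is a well-known technique and not necessarily the one used by the library; `day_of_week' is a hypothetical name that follows the same numbering (0 for Sunday through 6 for Saturday).

```c
/* Day of the week for a Gregorian date: 0 = Sunday ... 6 = Saturday.
   `mon' is 1-12. Uses Sakamoto's table-based method. */
int day_of_week(int year, int mon, int day){
  static const int offsets[] = { 0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4 };
  if(mon < 3) year -= 1;     /* treat Jan and Feb as part of the previous year */
  return (year + year / 4 - year / 100 + year / 400 + offsets[mon - 1] + day) % 7;
}
```

For January 3, 2009 the sketch returns 6 (Saturday), and for January 1, 1970 it returns 4 (Thursday).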
Filesystem Utilities API
The function `tcrealpath' is used in order to get the canonicalized absolute path of a file.
char *tcrealpath(const char *path);
- `path' specifies the path of the file.
- The return value is the canonicalized absolute path of a file, or `NULL' if the path is invalid.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcreadfile' is used in order to read whole data of a file.
void *tcreadfile(const char *path, int limit, int *sp);
- `path' specifies the path of the file. If it is `NULL', the standard input is specified.
- `limit' specifies the maximum size of data to read. If it is not more than 0, no limit is applied.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned. If it is `NULL', it is not used.
- The return value is the pointer to the allocated region of the read data, or `NULL' if the file could not be opened.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcreadfilelines' is used in order to read every line of a file.
TCLIST *tcreadfilelines(const char *path);
- `path' specifies the path of the file. If it is `NULL', the standard input is specified.
- The return value is a list object of all lines if successful, else it is `NULL'.
- Line separators are cut out. Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use.
The function `tcwritefile' is used in order to write data into a file.
bool tcwritefile(const char *path, const void *ptr, int size);
- `path' specifies the path of the file. If it is `NULL', the standard output is specified.
- `ptr' specifies the pointer to the data region.
- `size' specifies the size of the region.
- If successful, the return value is true, else, it is false.
The function `tccopyfile' is used in order to copy a file.
bool tccopyfile(const char *src, const char *dest);
- `src' specifies the path of the source file.
- `dest' specifies the path of the destination file.
- The return value is true if successful, else, it is false.
- If the destination file exists, it is overwritten.
The function `tcreaddir' is used in order to read names of files in a directory.
TCLIST *tcreaddir(const char *path);
- `path' specifies the path of the directory.
- The return value is a list object of names if successful, else it is `NULL'.
- Links to the directory itself and to the parent directory are ignored.
- Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use.
The function `tcglobpat' is used in order to expand a pattern into a list of matched paths.
TCLIST *tcglobpat(const char *pattern);
- `pattern' specifies the matching pattern.
- The return value is a list object of matched paths. If no path is matched, an empty list is returned.
- Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use.
The function `tcremovelink' is used in order to remove a file, or a directory and its contents recursively.
bool tcremovelink(const char *path);
- `path' specifies the path of the link.
- If successful, the return value is true, else, it is false. False is returned when the link does not exist or the permission is denied.
The function `tcwrite' is used in order to write data into a file.
bool tcwrite(int fd, const void *buf, size_t size);
- `fd' specifies the file descriptor.
- `buf' specifies the buffer to be written.
- `size' specifies the size of the buffer.
- The return value is true if successful, else, it is false.
The function `tcread' is used in order to read data from a file.
bool tcread(int fd, void *buf, size_t size);
- `fd' specifies the file descriptor.
- `buf' specifies the buffer to store into.
- `size' specifies the size of the buffer.
- The return value is true if successful, else, it is false.
The function `tclock' is used in order to lock a file.
bool tclock(int fd, bool ex, bool nb);
- `fd' specifies the file descriptor.
- `ex' specifies whether an exclusive lock or a shared lock is performed.
- `nb' specifies whether to request with non-blocking.
- The return value is true if successful, else, it is false.
The function `tcunlock' is used in order to unlock a file.
bool tcunlock(int fd);
- `fd' specifies the file descriptor.
- The return value is true if successful, else, it is false.
The function `tcsystem' is used in order to execute a shell command.
int tcsystem(const char **args, int anum);
- `args' specifies an array of the command name and its arguments.
- `anum' specifies the number of elements of the array.
- The return value is the exit code of the command or `INT_MAX' on failure.
- The command name and the arguments are quoted and meta characters are escaped.
API of Encoding Utilities
The function `tcurlencode' is used in order to encode a serial object with URL encoding.
char *tcurlencode(const char *ptr, int size);
- `ptr' specifies the pointer to the region.
- `size' specifies the size of the region.
- The return value is the result string.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcurldecode' is used in order to decode a string encoded with URL encoding.
char *tcurldecode(const char *str, int *sp);
- `str' specifies the encoded string.
- `sp' specifies the pointer to a variable into which the size of the region of the return value is assigned.
- The return value is the pointer to the region of the result.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcurlbreak' is used in order to break up a URL into elements.
TCMAP *tcurlbreak(const char *str);
- `str' specifies the URL string.
- The return value is the map object whose keys are the name of elements. The key "self" specifies the URL itself. The key "scheme" specifies the scheme. The key "host" specifies the host of the server. The key "port" specifies the port number of the server. The key "authority" specifies the authority information. The key "path" specifies the path of the resource. The key "file" specifies the file name without the directory section. The key "query" specifies the query string. The key "fragment" specifies the fragment string.
- Supported schemes are HTTP, HTTPS, FTP, and FILE. Both absolute and relative URLs are supported. Because the object of the return value is created with the function `tcmapnew', it should be deleted with the function `tcmapdel' when it is no longer in use.
The function `tcurlresolve' is used in order to resolve a relative URL with an absolute URL.
char *tcurlresolve(const char *base, const char *target);
- `base' specifies the absolute URL of the base location.
- `target' specifies the URL to be resolved.
- The return value is the resolved URL. If the target URL is relative, a new URL of relative location from the base location is returned. Else, a copy of the target URL is returned.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcbaseencode' is used in order to encode a serial object with Base64 encoding.
char *tcbaseencode(const char *ptr, int size);
- `ptr' specifies the pointer to the region.
- `size' specifies the size of the region.
- The return value is the result string.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcbasedecode' is used in order to decode a string encoded with Base64 encoding.
char *tcbasedecode(const char *str, int *sp);
- `str' specifies the encoded string.
- `sp' specifies the pointer to a variable into which the size of the region of the return value is assigned.
- The return value is the pointer to the region of the result.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcquoteencode' is used in order to encode a serial object with Quoted-printable encoding.
char *tcquoteencode(const char *ptr, int size);
- `ptr' specifies the pointer to the region.
- `size' specifies the size of the region.
- The return value is the result string.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcquotedecode' is used in order to decode a string encoded with Quoted-printable encoding.
char *tcquotedecode(const char *str, int *sp);
- `str' specifies the encoded string.
- `sp' specifies the pointer to a variable into which the size of the region of the return value is assigned.
- The return value is the pointer to the region of the result.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcmimeencode' is used in order to encode a string with MIME encoding.
char *tcmimeencode(const char *str, const char *encname, bool base);
- `str' specifies the string.
- `encname' specifies the string of the name of the character encoding.
- `base' specifies whether to use Base64 encoding. If it is false, Quoted-printable is used.
- The return value is the result string.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcmimedecode' is used in order to decode a string encoded with MIME encoding.
char *tcmimedecode(const char *str, char *enp);
- `str' specifies the encoded string.
- `enp' specifies the pointer to the region into which the name of encoding is written. If it is `NULL', it is not used. The size of the buffer should be equal to or more than 32 bytes.
- The return value is the result string.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcmimebreak' is used in order to split a string of MIME into headers and the body.
char *tcmimebreak(const char *ptr, int size, TCMAP *headers, int *sp);
- `ptr' specifies the pointer to the region of MIME data.
- `size' specifies the size of the region.
- `headers' specifies a map object to store headers. If it is `NULL', it is not used. Each key of the map is an uncapitalized header name.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- The return value is the pointer to the region of the body data.
- If the content type is defined, the header map has the key "TYPE" specifying the type. If the character encoding is defined, the key "CHARSET" specifies the encoding name. If the boundary string of multipart is defined, the key "BOUNDARY" specifies the string. If the content disposition is defined, the key "DISPOSITION" specifies the disposition. If the file name is defined, the key "FILENAME" specifies the name. If the attribute name is defined, the key "NAME" specifies the name. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcmimeparts' is used in order to split multipart data of MIME into its parts.
TCLIST *tcmimeparts(const char *ptr, int size, const char *boundary);
- `ptr' specifies the pointer to the region of multipart data of MIME.
- `size' specifies the size of the region.
- `boundary' specifies the boundary string.
- The return value is a list object. Each element of the list is the data of a part.
- Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use.
The function `tchexencode' is used in order to encode a serial object with hexadecimal encoding.
char *tchexencode(const char *ptr, int size);
- `ptr' specifies the pointer to the region.
- `size' specifies the size of the region.
- The return value is the result string.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tchexdecode' is used in order to decode a string encoded with hexadecimal encoding.
char *tchexdecode(const char *str, int *sp);
- `str' specifies the encoded string.
- `sp' specifies the pointer to a variable into which the size of the region of the return value is assigned.
- The return value is the pointer to the region of the result.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcpackencode' is used in order to compress a serial object with Packbits encoding.
char *tcpackencode(const char *ptr, int size, int *sp);
- `ptr' specifies the pointer to the region.
- `size' specifies the size of the region.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the result object, else, it is `NULL'.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcpackdecode' is used in order to decompress a serial object compressed with Packbits encoding.
char *tcpackdecode(const char *ptr, int size, int *sp);
- `ptr' specifies the pointer to the region.
- `size' specifies the size of the region.
- `sp' specifies the pointer to a variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the result object, else, it is `NULL'.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcbsencode' is used in order to compress a serial object with TCBS encoding.
char *tcbsencode(const char *ptr, int size, int *sp);
- `ptr' specifies the pointer to the region.
- `size' specifies the size of the region.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the result object, else, it is `NULL'.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcbsdecode' is used in order to decompress a serial object compressed with TCBS encoding.
char *tcbsdecode(const char *ptr, int size, int *sp);
- `ptr' specifies the pointer to the region.
- `size' specifies the size of the region.
- `sp' specifies the pointer to a variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the result object, else, it is `NULL'.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcdeflate' is used in order to compress a serial object with Deflate encoding.
char *tcdeflate(const char *ptr, int size, int *sp);
- `ptr' specifies the pointer to the region.
- `size' specifies the size of the region.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the result object, else, it is `NULL'.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcinflate' is used in order to decompress a serial object compressed with Deflate encoding.
char *tcinflate(const char *ptr, int size, int *sp);
- `ptr' specifies the pointer to the region.
- `size' specifies the size of the region.
- `sp' specifies the pointer to a variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the result object, else, it is `NULL'.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcgzipencode' is used in order to compress a serial object with GZIP encoding.
char *tcgzipencode(const char *ptr, int size, int *sp);
- `ptr' specifies the pointer to the region.
- `size' specifies the size of the region.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the result object, else, it is `NULL'.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcgzipdecode' is used in order to decompress a serial object compressed with GZIP encoding.
char *tcgzipdecode(const char *ptr, int size, int *sp);
- `ptr' specifies the pointer to the region.
- `size' specifies the size of the region.
- `sp' specifies the pointer to a variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the result object, else, it is `NULL'.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcgetcrc' is used in order to get the CRC32 checksum of a serial object.
unsigned int tcgetcrc(const char *ptr, int size);
- `ptr' specifies the pointer to the region.
- `size' specifies the size of the region.
- The return value is the CRC32 checksum of the object.
The function `tcbzipencode' is used in order to compress a serial object with BZIP2 encoding.
char *tcbzipencode(const char *ptr, int size, int *sp);
- `ptr' specifies the pointer to the region.
- `size' specifies the size of the region.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the result object, else, it is `NULL'.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcbzipdecode' is used in order to decompress a serial object compressed with BZIP2 encoding.
char *tcbzipdecode(const char *ptr, int size, int *sp);
- `ptr' specifies the pointer to the region.
- `size' specifies the size of the region.
- `sp' specifies the pointer to a variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the result object, else, it is `NULL'.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcberencode' is used in order to encode an array of nonnegative integers with BER encoding.
char *tcberencode(const unsigned int *ary, int anum, int *sp);
- `ary' specifies the pointer to the array of nonnegative integers.
- `anum' specifies the size of the array.
- `sp' specifies the pointer to a variable into which the size of the region of the return value is assigned.
- The return value is the pointer to the region of the result.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcberdecode' is used in order to decode a serial object encoded with BER encoding.
unsigned int *tcberdecode(const char *ptr, int size, int *np);
- `ptr' specifies the pointer to the region.
- `size' specifies the size of the region.
- `np' specifies the pointer to a variable into which the number of elements of the return value is assigned.
- The return value is the pointer to the array of the result.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcxmlescape' is used in order to escape meta characters in a string with the entity references of XML.
char *tcxmlescape(const char *str);
- `str' specifies the string.
- The return value is the pointer to the escaped string.
- This function escapes only `&', `<', `>', and `"'. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcxmlunescape' is used in order to unescape entity references in a string of XML.
char *tcxmlunescape(const char *str);
- `str' specifies the string.
- The return value is the unescaped string.
- This function restores only `&amp;', `&lt;', `&gt;', and `&quot;'. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcxmlbreak' is used in order to split an XML string into tags and text sections.
TCLIST *tcxmlbreak(const char *str);
- `str' specifies the XML string.
- The return value is the list object whose elements are strings of tags or text sections.
- Because this function does not validate its input, it can also handle HTML and SGML. Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use.
The function `tcxmlattrs' is used in order to get the map of attributes of an XML tag.
TCMAP *tcxmlattrs(const char *str);
- `str' specifies the pointer to the region of a tag string.
- The return value is the map object containing attribute names and their values which are unescaped. You can get the name of the tag with the key of an empty string.
- Because the object of the return value is created with the function `tcmapnew', it should be deleted with the function `tcmapdel' when it is no longer in use.
Example Code
The following code is an example of using the extensible string, the array list, the hash map, and the ordered tree.
#include <tcutil.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

int main(int argc, char **argv){

  { /* example of using an extensible string object */
    TCXSTR *xstr;
    /* create the object */
    xstr = tcxstrnew();
    /* concatenate strings */
    tcxstrcat2(xstr, "hop");
    tcxstrcat2(xstr, "step");
    tcxstrcat2(xstr, "jump");
    /* print the size and the content */
    printf("%d:%s\n", tcxstrsize(xstr), (const char *)tcxstrptr(xstr));
    /* delete the object */
    tcxstrdel(xstr);
  }

  { /* example of using a list object */
    TCLIST *list;
    int i;
    /* create the object */
    list = tclistnew();
    /* add strings to the tail */
    tclistpush2(list, "hop");
    tclistpush2(list, "step");
    tclistpush2(list, "jump");
    /* print all elements */
    for(i = 0; i < tclistnum(list); i++){
      printf("%d:%s\n", i, tclistval2(list, i));
    }
    /* delete the object */
    tclistdel(list);
  }

  { /* example of using a map object */
    TCMAP *map;
    const char *key;
    /* create the object */
    map = tcmapnew();
    /* add records */
    tcmapput2(map, "foo", "hop");
    tcmapput2(map, "bar", "step");
    tcmapput2(map, "baz", "jump");
    /* print all records */
    tcmapiterinit(map);
    while((key = tcmapiternext2(map)) != NULL){
      printf("%s:%s\n", key, tcmapget2(map, key));
    }
    /* delete the object */
    tcmapdel(map);
  }

  { /* example of using a tree object */
    TCTREE *tree;
    const char *key;
    /* create the object */
    tree = tctreenew();
    /* add records */
    tctreeput2(tree, "foo", "hop");
    tctreeput2(tree, "bar", "step");
    tctreeput2(tree, "baz", "jump");
    /* print all records */
    tctreeiterinit(tree);
    while((key = tctreeiternext2(tree)) != NULL){
      printf("%s:%s\n", key, tctreeget2(tree, key));
    }
    /* delete the object */
    tctreedel(tree);
  }

  return 0;
}
CLI
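To make the utility API easy to use, the commands `tcutest', `tcumttest', and `tcucodec' are provided as command line interfaces.
The command `tcutest' is a tool for facility tests and performance tests of the utility API. It is used in the following formats. `rnum' specifies the number of iterations, `anum' specifies the initial capacity of the array, and `bnum' specifies the number of buckets.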
tcutest xstr rnum
- Perform a test of concatenating strings to an extensible string.
tcutest list [-rd] rnum [anum]
- Perform a test of adding elements to an array list.
tcutest map [-rd] [-tr] [-rnd] [-dk|-dc|-dai|-dad] rnum [bnum]
- Perform a test of adding records to a hash map.
tcutest tree [-rd] [-tr] [-rnd] [-dk|-dc|-dai|-dad] rnum
- Perform a test of adding records to an ordered tree.
tcutest mdb [-rd] [-tr] [-rnd] [-dk|-dc|-dai|-dad] rnum [bnum]
- Perform a test of adding records to an on-memory hash database.
tcutest ndb [-rd] [-tr] [-rnd] [-dk|-dc|-dai|-dad] rnum
- Perform a test of adding records to an on-memory tree database.
tcutest misc rnum
- Perform miscellaneous other tests.
tcutest wicked rnum
- Perform a test that selects and runs various updating operations of the array list and the hash map at random.
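The options have the following functions.
-rd : also perform the reading test.
-tr : also perform the iterator test.
-rnd : select keys at random.
-dk : use the function `tcxxxputkeep' instead of the function `tcxxxput'.
-dc : use the function `tcxxxputcat' instead of the function `tcxxxput'.
-dai : use the function `tcxxxaddint' instead of the function `tcxxxput'.
-dad : use the function `tcxxxadddouble' instead of the function `tcxxxput'.
This command exits with 0 if processing finishes normally, and with another value if an error occurs.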
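The command `tcumttest' is a tool for multi-threaded facility tests of the on-memory hash database API and the on-memory tree database API. It is used in the following formats. `tnum' specifies the number of threads, `rnum' specifies the number of iterations, and `bnum' specifies the number of buckets.
tcumttest combo [-rnd] tnum rnum [bnum]
- Store, retrieve, and remove records in turn.
tcumttest typical [-nc] [-rr num] tnum rnum [bnum]
- Select and run typical operations at random.
The options have the following functions.
-rnd : select keys at random.
-nc : omit the comparison test.
-rr num : specify the ratio of reading operations as a percentage.
This command exits with 0 if processing finishes normally, and with another value if an error occurs.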
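The command `tcucodec' is a tool for using the encoding and decoding features provided by the utility API. It is used in the following formats. `file' specifies the input file. If it is omitted, the standard input is read.
tcucodec url [-d] [-br] [-rs base] [file]
- Perform URL encoding and its decoding.
tcucodec base [-d] [file]
- Perform Base64 encoding and its decoding.
tcucodec quote [-d] [file]
- Perform Quoted-printable encoding and its decoding.
tcucodec mime [-d] [-en name] [-q] [-on] [-hd] [-bd] [-part num] [file]
- Perform MIME encoding and its decoding.
tcucodec hex [-d] [file]
- Perform hexadecimal encoding and its decoding.
tcucodec pack [-d] [-bwt] [file]
- Perform Packbits compression and its decompression.
tcucodec tcbs [-d] [file]
- Perform TCBS compression and its decompression.
tcucodec zlib [-d] [-gz] [file]
- Perform ZLIB compression and its decompression.
tcucodec bzip [-d] [file]
- Perform BZIP2 compression and its decompression.
tcucodec xml [-d] [-br] [file]
- Process XML. By default, meta characters are escaped.
tcucodec ucs [-d] [file]
- Convert a UTF-8 string into a UCS-2 array.
tcucodec hash [-crc] [-ch num] [file]
- Calculate a hash value. By default, the MD5 function is used.
tcucodec date [-ds str] [-jl num] [-wf] [-rf]
- Convert the format of a time. By default, the current UNIX time is printed.
tcucodec conf [-v|-i|-l|-p]
- Print configuration information.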
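The options have the following functions.
-d : decode (unescape) instead of encoding (escaping).
-br : break up a URL or XML into its elements.
-rs base : specify the base URL and resolve the relative URL.
-en name : specify the character encoding of the input. The default is UTF-8.
-q : use Quoted-printable encoding. The default is Base64.
-on : when decoding, output the name of the character encoding instead of the result.
-bd : parse MIME data and output the body.
-hd : parse MIME data and output the headers.
-part num : parse MIME data and output the specified part of the multipart data.
-bwt : use BWT as preprocessing.
-gz : use the GZIP format.
-crc : use the CRC32 function.
-ch num : use the consistent hashing function.
-ds str : specify the time.
-jl num : specify the jet lag (time difference).
-wf : output in W3CDTF format.
-rf : output in RFC 1123 format.
-v : show the version number of Tokyo Cabinet.
-i : show the options for including the headers of Tokyo Cabinet.
-l : show the options for linking the library of Tokyo Cabinet.
-p : show the directory containing the commands of Tokyo Cabinet.
This command exits with 0 if processing finishes normally, and with another value if an error occurs.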
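Hash Database API
A hash database is a database in which a hash table is recorded in a single file. The hash database API is the API that handles it. See `tchdb.h' for the complete specification of the API.
Description
To use the hash database API, include `tcutil.h', `tchdb.h', and the related standard header files. Usually, the following lines are written near the top of a source file.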
#include <tcutil.h>
#include <tchdb.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
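A pointer to the `TCHDB' type is used as the object for handling a hash database. A hash database object is created with the function `tchdbnew' and deleted with the function `tchdbdel'. Always delete an object once you have finished using it. Otherwise, a memory leak occurs.
Before records can be stored or retrieved, the hash database object must be connected to a database file. The function `tchdbopen' opens a database file and connects to it, and the function `tchdbclose' disconnects from it and closes the file. Always close a database file that you have opened. Otherwise, the database file may be corrupted or stored data may be lost.
API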
The function `tchdberrmsg' is used in order to get the message string corresponding to an error code.
const char *tchdberrmsg(int ecode);
- `ecode' specifies the error code.
- The return value is the message string of the error code.
The function `tchdbnew' is used in order to create a hash database object.
TCHDB *tchdbnew(void);
- The return value is the new hash database object.
The function `tchdbdel' is used in order to delete a hash database object.
void tchdbdel(TCHDB *hdb);
- `hdb' specifies the hash database object.
- If the database is not closed, it is closed implicitly. Note that the deleted object and its derivatives can not be used anymore.
The function `tchdbecode' is used in order to get the last error code of a hash database object.
int tchdbecode(TCHDB *hdb);
- `hdb' specifies the hash database object.
- The return value is the last error code.
- The following error codes are defined: `TCESUCCESS' for success, `TCETHREAD' for threading error, `TCEINVALID' for invalid operation, `TCENOFILE' for file not found, `TCENOPERM' for no permission, `TCEMETA' for invalid meta data, `TCERHEAD' for invalid record header, `TCEOPEN' for open error, `TCECLOSE' for close error, `TCETRUNC' for trunc error, `TCESYNC' for sync error, `TCESTAT' for stat error, `TCESEEK' for seek error, `TCEREAD' for read error, `TCEWRITE' for write error, `TCEMMAP' for mmap error, `TCELOCK' for lock error, `TCEUNLINK' for unlink error, `TCERENAME' for rename error, `TCEMKDIR' for mkdir error, `TCERMDIR' for rmdir error, `TCEKEEP' for existing record, `TCENOREC' for no record found, and `TCEMISC' for miscellaneous error.
The function `tchdbsetmutex' is used in order to set mutual exclusion control of a hash database object for threading.
bool tchdbsetmutex(TCHDB *hdb);
- `hdb' specifies the hash database object which is not opened.
- If successful, the return value is true, else, it is false.
- Note that the mutual exclusion control is needed if the object is shared by multiple threads, and that this function should be called before the database is opened.
The function `tchdbtune' is used in order to set the tuning parameters of a hash database object.
bool tchdbtune(TCHDB *hdb, int64_t bnum, int8_t apow, int8_t fpow, uint8_t opts);
- `hdb' specifies the hash database object which is not opened.
- `bnum' specifies the number of elements of the bucket array. If it is not more than 0, the default value is specified. The default value is 131071. The suggested size of the bucket array is from about 0.5 to 4 times the number of all records to be stored.
- `apow' specifies the size of record alignment by power of 2. If it is negative, the default value is specified. The default value is 4 standing for 2^4=16.
- `fpow' specifies the maximum number of elements of the free block pool by power of 2. If it is negative, the default value is specified. The default value is 10 standing for 2^10=1024.
- `opts' specifies options by bitwise or: `HDBTLARGE' specifies that the size of the database can be larger than 2GB by using 64-bit bucket array, `HDBTDEFLATE' specifies that each record is compressed with Deflate encoding, `HDBTBZIP' specifies that each record is compressed with BZIP2 encoding, `HDBTTCBS' specifies that each record is compressed with TCBS encoding.
- If successful, the return value is true, else, it is false.
- Note that the tuning parameters should be set before the database is opened.
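The defaulting rules above can be sketched in plain C. The helpers below are hypothetical and are not part of Tokyo Cabinet; they only model the documented defaults (131071 buckets, 2^4 alignment, 2^10 pool elements) and pick a factor of 2 from the suggested 0.5 to 4 range as an arbitrary example.

```c
#include <stdint.h>

/* Illustrative model of the defaults documented for tchdbtune.
   These functions are hypothetical and do not call the library. */

/* Bucket count that would take effect: values <= 0 select the default. */
int64_t effective_bnum(int64_t bnum){
  return bnum > 0 ? bnum : 131071;
}

/* Record alignment in bytes: a negative apow selects the default 4 (2^4 = 16).
   Assumes apow is small enough that the shift fits in 64 bits. */
int64_t alignment_bytes(int8_t apow){
  if(apow < 0) apow = 4;
  return (int64_t)1 << apow;
}

/* Free block pool capacity: a negative fpow selects the default 10 (2^10 = 1024). */
int64_t free_pool_capacity(int8_t fpow){
  if(fpow < 0) fpow = 10;
  return (int64_t)1 << fpow;
}

/* A bucket count inside the suggested range of 0.5 to 4 times the number
   of records; the factor 2 is an arbitrary choice for this sketch. */
int64_t suggested_bnum(int64_t expected_records){
  return expected_records * 2;
}
```

For one million expected records, `suggested_bnum' yields 2000000, which could then be passed as `bnum' to `tchdbtune'.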
The function `tchdbsetcache' is used in order to set the caching parameters of a hash database object.
bool tchdbsetcache(TCHDB *hdb, int32_t rcnum);
- `hdb' specifies the hash database object which is not opened.
- `rcnum' specifies the maximum number of records to be cached. If it is not more than 0, the record cache is disabled. It is disabled by default.
- If successful, the return value is true, else, it is false.
- Note that the caching parameters should be set before the database is opened.
The function `tchdbsetxmsiz' is used in order to set the size of the extra mapped memory of a hash database object.
bool tchdbsetxmsiz(TCHDB *hdb, int64_t xmsiz);
- `hdb' specifies the hash database object which is not opened.
- `xmsiz' specifies the size of the extra mapped memory. If it is not more than 0, the extra mapped memory is disabled. The default size is 67108864.
- If successful, the return value is true, else, it is false.
- Note that the mapping parameters should be set before the database is opened.
The function `tchdbopen' is used in order to open a database file and connect a hash database object.
bool tchdbopen(TCHDB *hdb, const char *path, int omode);
- `hdb' specifies the hash database object which is not opened.
- `path' specifies the path of the database file.
- `omode' specifies the connection mode: `HDBOWRITER' as a writer, `HDBOREADER' as a reader. If the mode is `HDBOWRITER', the following may be added by bitwise or: `HDBOCREAT', which means it creates a new database if not exist, `HDBOTRUNC', which means it creates a new database regardless if one exists, `HDBOTSYNC', which means every transaction synchronizes updated contents with the device. Both of `HDBOREADER' and `HDBOWRITER' can be added to by bitwise or: `HDBONOLCK', which means it opens the database file without file locking, or `HDBOLCKNB', which means locking is performed without blocking.
- If successful, the return value is true, else, it is false.
The function `tchdbclose' is used in order to close a hash database object.
bool tchdbclose(TCHDB *hdb);
- `hdb' specifies the hash database object.
- If successful, the return value is true, else, it is false.
- Update of a database is assured to be written when the database is closed. If a writer opens a database but does not close it appropriately, the database will be broken.
The function `tchdbput' is used in order to store a record into a hash database object.
bool tchdbput(TCHDB *hdb, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `hdb' specifies the hash database object connected as a writer.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, it is overwritten.
The function `tchdbput2' is used in order to store a string record into a hash database object.
bool tchdbput2(TCHDB *hdb, const char *kstr, const char *vstr);
- `hdb' specifies the hash database object connected as a writer.
- `kstr' specifies the string of the key.
- `vstr' specifies the string of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, it is overwritten.
The function `tchdbputkeep' is used in order to store a new record into a hash database object.
bool tchdbputkeep(TCHDB *hdb, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `hdb' specifies the hash database object connected as a writer.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, this function has no effect.
The function `tchdbputkeep2' is used in order to store a new string record into a hash database object.
bool tchdbputkeep2(TCHDB *hdb, const char *kstr, const char *vstr);
- `hdb' specifies the hash database object connected as a writer.
- `kstr' specifies the string of the key.
- `vstr' specifies the string of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, this function has no effect.
The function `tchdbputcat' is used in order to concatenate a value at the end of the existing record in a hash database object.
bool tchdbputcat(TCHDB *hdb, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `hdb' specifies the hash database object connected as a writer.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value.
- If successful, the return value is true, else, it is false.
- If there is no corresponding record, a new record is created.
The function `tchdbputcat2' is used in order to concatenate a string value at the end of the existing record in a hash database object.
bool tchdbputcat2(TCHDB *hdb, const char *kstr, const char *vstr);
- `hdb' specifies the hash database object connected as a writer.
- `kstr' specifies the string of the key.
- `vstr' specifies the string of the value.
- If successful, the return value is true, else, it is false.
- If there is no corresponding record, a new record is created.
The function `tchdbputasync' is used in order to store a record into a hash database object in asynchronous fashion.
bool tchdbputasync(TCHDB *hdb, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `hdb' specifies the hash database object connected as a writer.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, it is overwritten. Records passed to this function are accumulated in the inner buffer and written to the file in one batch.
The function `tchdbputasync2' is used in order to store a string record into a hash database object in asynchronous fashion.
bool tchdbputasync2(TCHDB *hdb, const char *kstr, const char *vstr);
- `hdb' specifies the hash database object connected as a writer.
- `kstr' specifies the string of the key.
- `vstr' specifies the string of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, it is overwritten. Records passed to this function are accumulated in the inner buffer and written to the file in one batch.
The function `tchdbout' is used in order to remove a record of a hash database object.
bool tchdbout(TCHDB *hdb, const void *kbuf, int ksiz);
- `hdb' specifies the hash database object connected as a writer.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- If successful, the return value is true, else, it is false.
The function `tchdbout2' is used in order to remove a string record of a hash database object.
bool tchdbout2(TCHDB *hdb, const char *kstr);
- `hdb' specifies the hash database object connected as a writer.
- `kstr' specifies the string of the key.
- If successful, the return value is true, else, it is false.
The function `tchdbget' is used in order to retrieve a record in a hash database object.
void *tchdbget(TCHDB *hdb, const void *kbuf, int ksiz, int *sp);
- `hdb' specifies the hash database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the region of the value of the corresponding record. `NULL' is returned if no record corresponds.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tchdbget2' is used in order to retrieve a string record in a hash database object.
char *tchdbget2(TCHDB *hdb, const char *kstr);
- `hdb' specifies the hash database object.
- `kstr' specifies the string of the key.
- If successful, the return value is the string of the value of the corresponding record. `NULL' is returned if no record corresponds.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tchdbget3' is used in order to retrieve a record in a hash database object and write the value into a buffer.
int tchdbget3(TCHDB *hdb, const void *kbuf, int ksiz, void *vbuf, int max);
- `hdb' specifies the hash database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the buffer into which the value of the corresponding record is written.
- `max' specifies the size of the buffer.
- If successful, the return value is the size of the written data, else, it is -1. -1 is returned if no record corresponds to the specified key.
- Note that an additional zero code is not appended at the end of the region of the writing buffer.
The function `tchdbvsiz' is used in order to get the size of the value of a record in a hash database object.
int tchdbvsiz(TCHDB *hdb, const void *kbuf, int ksiz);
- `hdb' specifies the hash database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- If successful, the return value is the size of the value of the corresponding record, else, it is -1.
The function `tchdbvsiz2' is used in order to get the size of the value of a string record in a hash database object.
int tchdbvsiz2(TCHDB *hdb, const char *kstr);
- `hdb' specifies the hash database object.
- `kstr' specifies the string of the key.
- If successful, the return value is the size of the value of the corresponding record, else, it is -1.
The function `tchdbiterinit' is used in order to initialize the iterator of a hash database object.
bool tchdbiterinit(TCHDB *hdb);
- `hdb' specifies the hash database object.
- If successful, the return value is true, else, it is false.
- The iterator is used in order to access the key of every record stored in a database.
The function `tchdbiternext' is used in order to get the next key of the iterator of a hash database object.
void *tchdbiternext(TCHDB *hdb, int *sp);
- `hdb' specifies the hash database object.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the region of the next key, else, it is `NULL'. `NULL' is returned when no record remains to be fetched from the iterator.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use. Every record can be accessed by calling this function repeatedly. Records whose keys have already been fetched may be updated or removed during the iteration. However, the traversal is not assured if other updates to the database occur during the iteration. Besides, the order of this traversal is arbitrary, so it is not assured that it matches the order in which the records were stored.
The function `tchdbiternext2' is used in order to get the next key string of the iterator of a hash database object.
char *tchdbiternext2(TCHDB *hdb);
- `hdb' specifies the hash database object.
- If successful, the return value is the string of the next key, else, it is `NULL'. `NULL' is returned when no record remains to be fetched from the iterator.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use. Every record can be accessed by calling this function repeatedly. However, the traversal is not assured if the database is updated during the iteration. Besides, the order of this traversal is arbitrary, so it is not assured that it matches the order in which the records were stored.
The function `tchdbiternext3' is used in order to get the next extensible objects of the iterator of a hash database object.
bool tchdbiternext3(TCHDB *hdb, TCXSTR *kxstr, TCXSTR *vxstr);
- `hdb' specifies the hash database object.
- `kxstr' specifies the object into which the next key is written.
- `vxstr' specifies the object into which the next value is written.
- If successful, the return value is true, else, it is false. False is returned when no record remains to be fetched from the iterator.
The function `tchdbfwmkeys' is used in order to get forward matching keys in a hash database object.
TCLIST *tchdbfwmkeys(TCHDB *hdb, const void *pbuf, int psiz, int max);
- `hdb' specifies the hash database object.
- `pbuf' specifies the pointer to the region of the prefix.
- `psiz' specifies the size of the region of the prefix.
- `max' specifies the maximum number of keys to be fetched. If it is negative, no limit is specified.
- The return value is a list object of the corresponding keys. This function never fails and returns an empty list even if no key corresponds.
- Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use. Note that this function may be very slow because every key in the database is scanned.
The function `tchdbfwmkeys2' is used in order to get forward matching string keys in a hash database object.
TCLIST *tchdbfwmkeys2(TCHDB *hdb, const char *pstr, int max);
- `hdb' specifies the hash database object.
- `pstr' specifies the string of the prefix.
- `max' specifies the maximum number of keys to be fetched. If it is negative, no limit is specified.
- The return value is a list object of the corresponding keys. This function never fails and returns an empty list even if no key corresponds.
- Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use. Note that this function may be very slow because every key in the database is scanned.
The function `tchdbaddint' is used in order to add an integer to a record in a hash database object.
int tchdbaddint(TCHDB *hdb, const void *kbuf, int ksiz, int num);
- `hdb' specifies the hash database object connected as a writer.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `num' specifies the additional value.
- If successful, the return value is the summation value, else, it is `INT_MIN'.
- If the corresponding record exists, the value is treated as an integer and is added to. If no record corresponds, a new record of the additional value is stored.
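Two details of `tchdbaddint' are easy to miss: failure is signalled in-band by `INT_MIN', and a counter read back with `tchdbget' is a binary region rather than a decimal string. The sketch below is plain C and does not call the library. Both helpers are hypothetical, and the layout of the stored value (exactly one native `int') is an assumption that this section does not state.

```c
#include <limits.h>
#include <string.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of Tokyo Cabinet: decode a counter value
   fetched with tchdbget, ASSUMING the record holds exactly one native int. */
bool decode_int_record(const void *vbuf, int vsiz, int *out){
  if(vbuf == NULL || vsiz != (int)sizeof(int)) return false;
  /* memcpy avoids an unaligned read from the fetched region. */
  memcpy(out, vbuf, sizeof(int));
  return true;
}

/* tchdbaddint reports failure as INT_MIN, so a genuine sum of INT_MIN
   cannot be told apart from an error. */
bool addint_failed(int result){
  return result == INT_MIN;
}
```

On failure, `tchdbecode' can be consulted to find out what went wrong.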
The function `tchdbadddouble' is used in order to add a real number to a record in a hash database object.
double tchdbadddouble(TCHDB *hdb, const void *kbuf, int ksiz, double num);
- `hdb' specifies the hash database object connected as a writer.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `num' specifies the additional value.
- If successful, the return value is the summation value, else, it is `NAN'.
- If the corresponding record exists, the value is treated as a real number and is added to. If no record corresponds, a new record of the additional value is stored.
The function `tchdbsync' is used in order to synchronize updated contents of a hash database object with the file and the device.
bool tchdbsync(TCHDB *hdb);
- `hdb' specifies the hash database object connected as a writer.
- If successful, the return value is true, else, it is false.
- This function is useful when another process connects to the same database file.
The function `tchdboptimize' is used in order to optimize the file of a hash database object.
bool tchdboptimize(TCHDB *hdb, int64_t bnum, int8_t apow, int8_t fpow, uint8_t opts);
- `hdb' specifies the hash database object connected as a writer.
- `bnum' specifies the number of elements of the bucket array. If it is not more than 0, the default value is specified. The default value is two times of the number of records.
- `apow' specifies the size of record alignment by power of 2. If it is negative, the current setting is not changed.
- `fpow' specifies the maximum number of elements of the free block pool by power of 2. If it is negative, the current setting is not changed.
- `opts' specifies options by bitwise or: `HDBTLARGE' specifies that the size of the database can be larger than 2GB by using 64-bit bucket array, `HDBTDEFLATE' specifies that each record is compressed with Deflate encoding, `HDBTTCBS' specifies that each record is compressed with TCBS encoding. If it is `UINT8_MAX', the current setting is not changed.
- If successful, the return value is true, else, it is false.
- This function is useful to reduce the size of the database file with data fragmentation by successive updating.
The function `tchdbvanish' is used in order to remove all records of a hash database object.
bool tchdbvanish(TCHDB *hdb);
- `hdb' specifies the hash database object connected as a writer.
- If successful, the return value is true, else, it is false.
The function `tchdbcopy' is used in order to copy the database file of a hash database object.
bool tchdbcopy(TCHDB *hdb, const char *path);
- `hdb' specifies the hash database object.
- `path' specifies the path of the destination file. If it begins with `@', the trailing substring is executed as a command line.
- If successful, the return value is true, else, it is false. False is returned if the executed command returns non-zero code.
- The database file is assured to be kept synchronized and not modified while the copying or executing operation is in progress. So, this function is useful to create a backup file of the database file.
The function `tchdbtranbegin' is used in order to begin the transaction of a hash database object.
bool tchdbtranbegin(TCHDB *hdb);
- `hdb' specifies the hash database object connected as a writer.
- If successful, the return value is true, else, it is false.
- The database is locked by the thread during the transaction so that only one transaction can be active with a database object at a time. Thus, the serializable isolation level is assumed if every database operation is performed in the transaction. All updated regions are kept track of by write ahead logging during the transaction. If the database is closed during a transaction, the transaction is aborted implicitly.
The function `tchdbtrancommit' is used in order to commit the transaction of a hash database object.
bool tchdbtrancommit(TCHDB *hdb);
- `hdb' specifies the hash database object connected as a writer.
- If successful, the return value is true, else, it is false.
- Update in the transaction is fixed when it is committed successfully.
The function `tchdbtranabort' is used in order to abort the transaction of a hash database object.
bool tchdbtranabort(TCHDB *hdb);
- `hdb' specifies the hash database object connected as a writer.
- If successful, the return value is true, else, it is false.
- Update in the transaction is discarded when it is aborted. The state of the database is rolled back to what it was before the transaction.
The function `tchdbpath' is used in order to get the file path of a hash database object.
const char *tchdbpath(TCHDB *hdb);
- `hdb' specifies the hash database object.
- The return value is the path of the database file or `NULL' if the object does not connect to any database file.
The function `tchdbrnum' is used in order to get the number of records of a hash database object.
uint64_t tchdbrnum(TCHDB *hdb);
- `hdb' specifies the hash database object.
- The return value is the number of records or 0 if the object does not connect to any database file.
The function `tchdbfsiz' is used in order to get the size of the database file of a hash database object.
uint64_t tchdbfsiz(TCHDB *hdb);
- `hdb' specifies the hash database object.
- The return value is the size of the database file or 0 if the object does not connect to any database file.
Example Code
The following code is an example of using a hash database.
#include <tcutil.h>
#include <tchdb.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
int main(int argc, char **argv){
  TCHDB *hdb;
  int ecode;
  char *key, *value;
  /* create the object */
  hdb = tchdbnew();
  /* open the database */
  if(!tchdbopen(hdb, "casket.hdb", HDBOWRITER | HDBOCREAT)){
    ecode = tchdbecode(hdb);
    fprintf(stderr, "open error: %s\n", tchdberrmsg(ecode));
  }
  /* store records */
  if(!tchdbput2(hdb, "foo", "hop") ||
     !tchdbput2(hdb, "bar", "step") ||
     !tchdbput2(hdb, "baz", "jump")){
    ecode = tchdbecode(hdb);
    fprintf(stderr, "put error: %s\n", tchdberrmsg(ecode));
  }
  /* retrieve a record */
  value = tchdbget2(hdb, "foo");
  if(value){
    printf("%s\n", value);
    free(value);
  } else {
    ecode = tchdbecode(hdb);
    fprintf(stderr, "get error: %s\n", tchdberrmsg(ecode));
  }
  /* traverse the records */
  tchdbiterinit(hdb);
  while((key = tchdbiternext2(hdb)) != NULL){
    value = tchdbget2(hdb, key);
    if(value){
      printf("%s:%s\n", key, value);
      free(value);
    }
    free(key);
  }
  /* close the database */
  if(!tchdbclose(hdb)){
    ecode = tchdbecode(hdb);
    fprintf(stderr, "close error: %s\n", tchdberrmsg(ecode));
  }
  /* delete the object */
  tchdbdel(hdb);
  return 0;
}
CLI
To make the hash database API easy to use, the commands `tchtest', `tchmttest', and `tchmgr' are provided as command line interfaces.
The command `tchtest' is a utility for facility tests and performance tests of the hash database API. It is used in the following formats. `path' specifies the path of the database file, `rnum' specifies the number of iterations, `bnum' specifies the number of buckets, `apow' specifies the power of the alignment, and `fpow' specifies the power of the free block pool.
tchtest write [-mt] [-tl] [-td|-tb|-tt|-tx] [-rc num] [-xm num] [-nl|-nb] [-as] [-rnd] path rnum [bnum [apow [fpow]]]
- Store records whose keys and values are 8 bytes long and change as `00000001', `00000002', and so on.
tchtest read [-mt] [-rc num] [-xm num] [-nl|-nb] [-wb] [-rnd] path
- Retrieve all records of the database generated above.
tchtest remove [-mt] [-rc num] [-xm num] [-nl|-nb] [-rnd] path
- Remove all records of the database generated above.
tchtest rcat [-mt] [-tl] [-td|-tb|-tt|-tx] [-rc num] [-xm num] [-nl|-nb] [-pn num] [-dai|-dad|-rl] path rnum [bnum [apow [fpow]]]
- Store records with partly duplicated keys, using the concatenate mode.
tchtest misc [-mt] [-tl] [-td|-tb|-tt|-tx] [-nl|-nb] path rnum
- Perform a combination test of various operations.
tchtest wicked [-mt] [-tl] [-td|-tb|-tt|-tx] [-nl|-nb] path rnum
- Perform update operations selected at random.
The options have the following functions.
-mt
: call the function `tchdbsetmutex'.
-tl
: enable the option `HDBTLARGE'.
-td
: enable the option `HDBTDEFLATE'.
-tb
: enable the option `HDBTBZIP'.
-tt
: enable the option `HDBTTCBS'.
-tx
: enable the option `HDBTEXCODEC'.
-rc num
: specify the maximum number of records to be cached.
-xm num
: specify the size of the extra mapped memory.
-nl
: enable the option `HDBONOLCK'.
-nb
: enable the option `HDBOLCKNB'.
-as
: use the function `tchdbputasync' instead of `tchdbput'.
-rnd
: select keys at random.
-wb
: use the function `tchdbget3' instead of `tchdbget'.
-pn num
: specify the number of patterns.
-dai
: use the function `tchdbaddint' instead of `tchdbputcat'.
-dad
: use the function `tchdbadddouble' instead of `tchdbputcat'.
-rl
: set the length of values at random.
This command returns 0 on success and another value on failure.
The command `tchmttest' is a utility for facility tests of the hash database API in multi-thread situations. It is used in the following formats. `path' specifies the path of the database file, `tnum' specifies the number of threads, `rnum' specifies the number of iterations, `bnum' specifies the number of buckets, `apow' specifies the power of the alignment, and `fpow' specifies the power of the free block pool.
tchmttest write [-tl] [-td|-tb|-tt|-tx] [-rc num] [-xm num] [-nl|-nb] [-as] [-rnd] path tnum rnum [bnum [apow [fpow]]]
- Store records whose keys and values are 8 bytes long and change as `00000001', `00000002', and so on.
tchmttest read [-rc num] [-xm num] [-nl|-nb] [-wb] [-rnd] path tnum
- Retrieve all records of the database generated above.
tchmttest remove [-rc num] [-xm num] [-nl|-nb] [-rnd] path tnum
- Remove all records of the database generated above.
tchmttest wicked [-tl] [-td|-tb|-tt|-tx] [-nl|-nb] [-nc] path tnum rnum
- Perform update operations selected at random.
tchmttest typical [-tl] [-td|-tb|-tt|-tx] [-rc num] [-xm num] [-nl|-nb] [-nc] [-rr num] path tnum rnum [bnum [apow [fpow]]]
- Perform typical operations selected at random.
The options have the following functions.
-tl
: enable the option `HDBTLARGE'.
-td
: enable the option `HDBTDEFLATE'.
-tb
: enable the option `HDBTBZIP'.
-tt
: enable the option `HDBTTCBS'.
-tx
: enable the option `HDBTEXCODEC'.
-rc num
: specify the maximum number of records to be cached.
-xm num
: specify the size of the extra mapped memory.
-nl
: enable the option `HDBONOLCK'.
-nb
: enable the option `HDBOLCKNB'.
-as
: use the function `tchdbputasync' instead of `tchdbput'.
-rnd
: select keys at random.
-wb
: use the function `tchdbget3' instead of `tchdbget'.
-nc
: omit the comparison test.
-rr num
: specify the ratio of reading operations as a percentage.
This command returns 0 on success and another value on failure.
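The command `tchmgr' is a utility for testing and debugging the hash database API and its applications. It is used in the following formats. `path' specifies the path of the database file, `bnum' specifies the number of buckets, `apow' specifies the power of the alignment, `fpow' specifies the power of the free block pool, `key' specifies the key of a record, `value' specifies the value of a record, and `file' specifies the input file.
tchmgr create [-tl] [-td|-tb|-tt|-tx] path [bnum [apow [fpow]]]
- Create a database file.
tchmgr inform [-nl|-nb] path
- Print miscellaneous information about the database.
tchmgr put [-nl|-nb] [-sx] [-dk|-dc] path key value
- Store a record.
tchmgr out [-nl|-nb] [-sx] path key
- Remove a record.
tchmgr get [-nl|-nb] [-sx] [-px] [-pz] path key
- Print the value of a record to the standard output.
tchmgr list [-nl|-nb] [-m num] [-pv] [-px] [-fm str] path
- Print the keys of all records to the standard output, separated by line feeds.
tchmgr optimize [-tl] [-td|-tb|-tt|-tx] [-tz] [-nl|-nb] path [bnum [apow [fpow]]]
- Optimize the database.
tchmgr importtsv [-nl|-nb] [-sc] path [file]
- Store records by treating each line of a TSV file as a key and a value.
tchmgr version
- Print the version information of Tokyo Cabinet to the standard output.
The options have the following functions.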
-tl
: enable the option `HDBTLARGE'.
-td
: enable the option `HDBTDEFLATE'.
-tb
: enable the option `HDBTBZIP'.
-tt
: enable the option `HDBTTCBS'.
-tx
: enable the option `HDBTEXCODEC'.
-nl
: enable the option `HDBONOLCK'.
-nb
: enable the option `HDBOLCKNB'.
-sx
: give the input as a hexadecimal string.
-dk
: use the function `tchdbputkeep' instead of `tchdbput'.
-dc
: use the function `tchdbputcat' instead of `tchdbput'.
-px
: print the output as a hexadecimal string.
-pz
: do not append a line feed at the end of the output.
-m num
: specify the maximum number of items in the output.
-pv
: print the values of records as well.
-fm str
: specify the prefix of keys.
-tz
: enable the option `UINT8_MAX'.
-sc
: normalize keys to lower case.
This command returns 0 on success and another value on failure.
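B+ Tree Database API
A B+ tree database is a database in which a B+ tree is recorded in a single file. It is handled with the B+ tree database API. See `tcbdb.h' for the complete specification of the API.
Overview
To use the B+ tree database API, include `tcutil.h', `tcbdb.h', and the related standard header files. Usually, the following lines are placed near the top of a source file.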
#include <tcutil.h>
#include <tcbdb.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
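A pointer to the `TCBDB' type is used as the object for handling a B+ tree database. A B+ tree database object is created with the function `tcbdbnew' and deleted with the function `tcbdbdel'. Be sure to delete every object when it is no longer in use. Otherwise, a memory leak occurs.
Before records are stored or retrieved, the B+ tree database object has to be connected to a database file. The function `tcbdbopen' is used to open a database file and connect to it, and the function `tcbdbclose' is used to disconnect and close the file. Be sure to close every database file that has been opened. Otherwise, the database file may be broken or stored data may be lost.
API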
The function `tcbdberrmsg' is used in order to get the message string corresponding to an error code.
const char *tcbdberrmsg(int ecode);
- `ecode' specifies the error code.
- The return value is the message string of the error code.
The function `tcbdbnew' is used in order to create a B+ tree database object.
TCBDB *tcbdbnew(void);
- The return value is the new B+ tree database object.
The function `tcbdbdel' is used in order to delete a B+ tree database object.
void tcbdbdel(TCBDB *bdb);
- `bdb' specifies the B+ tree database object.
- If the database is not closed, it is closed implicitly. Note that the deleted object and its derivatives can not be used anymore.
The function `tcbdbecode' is used in order to get the last happened error code of a B+ tree database object.
int tcbdbecode(TCBDB *bdb);
- `bdb' specifies the B+ tree database object.
- The return value is the last happened error code.
- The following error codes are defined: `TCESUCCESS' for success, `TCETHREAD' for threading error, `TCEINVALID' for invalid operation, `TCENOFILE' for file not found, `TCENOPERM' for no permission, `TCEMETA' for invalid meta data, `TCERHEAD' for invalid record header, `TCEOPEN' for open error, `TCECLOSE' for close error, `TCETRUNC' for trunc error, `TCESYNC' for sync error, `TCESTAT' for stat error, `TCESEEK' for seek error, `TCEREAD' for read error, `TCEWRITE' for write error, `TCEMMAP' for mmap error, `TCELOCK' for lock error, `TCEUNLINK' for unlink error, `TCERENAME' for rename error, `TCEMKDIR' for mkdir error, `TCERMDIR' for rmdir error, `TCEKEEP' for existing record, `TCENOREC' for no record found, and `TCEMISC' for miscellaneous error.
The function `tcbdbsetmutex' is used in order to set mutual exclusion control of a B+ tree database object for threading.
bool tcbdbsetmutex(TCBDB *bdb);
- `bdb' specifies the B+ tree database object which is not opened.
- If successful, the return value is true, else, it is false.
- Note that mutual exclusion control is needed if the object is shared by plural threads, and this function should be called before the database is opened.
The function `tcbdbsetcmpfunc' is used in order to set the custom comparison function of a B+ tree database object.
bool tcbdbsetcmpfunc(TCBDB *bdb, TCCMP cmp, void *cmpop);
- `bdb' specifies the B+ tree database object which is not opened.
- `cmp' specifies the pointer to the custom comparison function.
- `cmpop' specifies an arbitrary pointer to be given as a parameter of the comparison function. If it is not needed, `NULL' can be specified.
- If successful, the return value is true, else, it is false.
- The default comparison function compares keys of two records by lexical order. The functions `tccmplexical' (default), `tccmpdecimal', `tccmpint32', and `tccmpint64' are built-in. Note that the comparison function should be set before the database is opened. Moreover, user-defined comparison functions should be set every time the database is opened.
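A user-defined comparison function can be sketched as follows. The parameter list (two key regions with their sizes, plus the optional pointer given as `cmpop') is an assumption about the `TCCMP' type based on `tcutil.h', and the function name `cmp_caseless' is hypothetical. The code is plain C and does not call the library.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical comparison function: order keys lexically while ignoring
   ASCII case. The parameter list is assumed to match the TCCMP type. */
int cmp_caseless(const char *aptr, int asiz, const char *bptr, int bsiz, void *op){
  (void)op;  /* the optional pointer is not needed here */
  int min = asiz < bsiz ? asiz : bsiz;
  for(int i = 0; i < min; i++){
    int a = tolower((unsigned char)aptr[i]);
    int b = tolower((unsigned char)bptr[i]);
    if(a != b) return a - b;
  }
  /* With a common prefix, the shorter key sorts first. */
  return asiz - bsiz;
}
```

Such a function would be registered with `tcbdbsetcmpfunc(bdb, cmp_caseless, NULL)' before every call to `tcbdbopen', since a user-defined comparison function has to be set again each time the database is opened.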
The function `tcbdbtune' is used in order to set the tuning parameters of a B+ tree database object.
bool tcbdbtune(TCBDB *bdb, int32_t lmemb, int32_t nmemb, int64_t bnum, int8_t apow, int8_t fpow, uint8_t opts);
- `bdb' specifies the B+ tree database object which is not opened.
- `lmemb' specifies the number of members in each leaf page. If it is not more than 0, the default value is specified. The default value is 128.
- `nmemb' specifies the number of members in each non-leaf page. If it is not more than 0, the default value is specified. The default value is 256.
- `bnum' specifies the number of elements of the bucket array. If it is not more than 0, the default value is specified. The default value is 32749. Suggested size of the bucket array is about from 1 to 4 times of the number of all pages to be stored.
- `apow' specifies the size of record alignment by power of 2. If it is negative, the default value is specified. The default value is 8 standing for 2^8=256.
- `fpow' specifies the maximum number of elements of the free block pool by power of 2. If it is negative, the default value is specified. The default value is 10 standing for 2^10=1024.
- `opts' specifies options by bitwise or: `BDBTLARGE' specifies that the size of the database can be larger than 2GB by using 64-bit bucket array, `BDBTDEFLATE' specifies that each page is compressed with Deflate encoding, `BDBTBZIP' specifies that each page is compressed with BZIP2 encoding, `BDBTTCBS' specifies that each page is compressed with TCBS encoding.
- If successful, the return value is true, else, it is false.
- Note that the tuning parameters should be set before the database is opened.
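The arithmetic behind `apow', `fpow', and `bnum' can be sketched as follows. Both helper names are hypothetical, and the page estimate is only a rough lower bound: it counts leaf pages as if every page were full and ignores non-leaf pages.

```c
#include <assert.h>
#include <stdint.h>

/* Value denoted by a power-of-2 parameter such as `apow' or `fpow'. */
int64_t pow2_value(int8_t pow){
  assert(pow >= 0 && pow < 62);
  return (int64_t)1 << pow;
}

/* Hypothetical helper: estimate the number of leaf pages needed for
   `rnum' records with `lmemb' members per leaf page, then multiply by
   `factor' (1 to 4, following the suggestion for `bnum'). */
int64_t suggest_bnum(int64_t rnum, int32_t lmemb, int factor){
  assert(rnum >= 0 && lmemb > 0 && factor >= 1 && factor <= 4);
  int64_t pages = (rnum + lmemb - 1) / lmemb;  /* ceiling division */
  return pages * factor;
}
```

For one million records with the default `lmemb' of 128 and a factor of 2, this gives 7813 pages and a bucket number of 15626, which could then be passed as `bnum' before the database is opened.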
The function `tcbdbsetcache' is used in order to set the caching parameters of a B+ tree database object.
bool tcbdbsetcache(TCBDB *bdb, int32_t lcnum, int32_t ncnum);
- `bdb' specifies the B+ tree database object which is not opened.
- `lcnum' specifies the maximum number of leaf nodes to be cached. If it is not more than 0, the default value is specified. The default value is 1024.
- `ncnum' specifies the maximum number of non-leaf nodes to be cached. If it is not more than 0, the default value is specified. The default value is 512.
- If successful, the return value is true, else, it is false.
- Note that the caching parameters should be set before the database is opened.
The function `tcbdbsetxmsiz' is used in order to set the size of the extra mapped memory of a B+ tree database object.
bool tcbdbsetxmsiz(TCBDB *bdb, int64_t xmsiz);
- `bdb' specifies the B+ tree database object which is not opened.
- `xmsiz' specifies the size of the extra mapped memory. If it is not more than 0, the extra mapped memory is disabled. It is disabled by default.
- If successful, the return value is true, else, it is false.
- Note that the mapping parameters should be set before the database is opened.
The function `tcbdbopen' is used in order to open a database file and connect a B+ tree database object.
bool tcbdbopen(TCBDB *bdb, const char *path, int omode);
- `bdb' specifies the B+ tree database object which is not opened.
- `path' specifies the path of the database file.
- `omode' specifies the connection mode: `BDBOWRITER' as a writer, `BDBOREADER' as a reader. If the mode is `BDBOWRITER', the following may be added by bitwise or: `BDBOCREAT', which means it creates a new database if one does not exist, `BDBOTRUNC', which means it creates a new database regardless of whether one exists, `BDBOTSYNC', which means every transaction synchronizes updated contents with the device. Both `BDBOREADER' and `BDBOWRITER' can be combined by bitwise or with `BDBONOLCK', which means it opens the database file without file locking, or `BDBOLCKNB', which means locking is performed without blocking.
- If successful, the return value is true, else, it is false.
The function `tcbdbclose' is used in order to close a B+ tree database object.
bool tcbdbclose(TCBDB *bdb);
- `bdb' specifies the B+ tree database object.
- If successful, the return value is true, else, it is false.
- Updates to a database are assured to be written when the database is closed. If a writer opens a database but does not close it appropriately, the database will be broken.
The function `tcbdbput' is used in order to store a record into a B+ tree database object.
bool tcbdbput(TCBDB *bdb, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `bdb' specifies the B+ tree database object connected as a writer.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, it is overwritten.
The function `tcbdbput2' is used in order to store a string record into a B+ tree database object.
bool tcbdbput2(TCBDB *bdb, const char *kstr, const char *vstr);
- `bdb' specifies the B+ tree database object connected as a writer.
- `kstr' specifies the string of the key.
- `vstr' specifies the string of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, it is overwritten.
The function `tcbdbputkeep' is used in order to store a new record into a B+ tree database object.
bool tcbdbputkeep(TCBDB *bdb, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `bdb' specifies the B+ tree database object connected as a writer.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, this function has no effect.
The function `tcbdbputkeep2' is used in order to store a new string record into a B+ tree database object.
bool tcbdbputkeep2(TCBDB *bdb, const char *kstr, const char *vstr);
- `bdb' specifies the B+ tree database object connected as a writer.
- `kstr' specifies the string of the key.
- `vstr' specifies the string of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, this function has no effect.
The function `tcbdbputcat' is used in order to concatenate a value at the end of the existing record in a B+ tree database object.
bool tcbdbputcat(TCBDB *bdb, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `bdb' specifies the B+ tree database object connected as a writer.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value.
- If successful, the return value is true, else, it is false.
- If there is no corresponding record, a new record is created.
The function `tcbdbputcat2' is used in order to concatenate a string value at the end of the existing record in a B+ tree database object.
bool tcbdbputcat2(TCBDB *bdb, const char *kstr, const char *vstr);
- `bdb' specifies the B+ tree database object connected as a writer.
- `kstr' specifies the string of the key.
- `vstr' specifies the string of the value.
- If successful, the return value is true, else, it is false.
- If there is no corresponding record, a new record is created.
The function `tcbdbputdup' is used in order to store a record into a B+ tree database object while allowing duplicate keys.
bool tcbdbputdup(TCBDB *bdb, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `bdb' specifies the B+ tree database object connected as a writer.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, the new record is placed after the existing one.
The function `tcbdbputdup2' is used in order to store a string record into a B+ tree database object while allowing duplicate keys.
bool tcbdbputdup2(TCBDB *bdb, const char *kstr, const char *vstr);
- `bdb' specifies the B+ tree database object connected as a writer.
- `kstr' specifies the string of the key.
- `vstr' specifies the string of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, the new record is placed after the existing one.
The function `tcbdbputdup3' is used in order to store records into a B+ tree database object while allowing duplicate keys.
bool tcbdbputdup3(TCBDB *bdb, const void *kbuf, int ksiz, const TCLIST *vals);
- `bdb' specifies the B+ tree database object connected as a writer.
- `kbuf' specifies the pointer to the region of the common key.
- `ksiz' specifies the size of the region of the common key.
- `vals' specifies a list object containing values.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, the new records are placed after the existing one.
The function `tcbdbout' is used in order to remove a record of a B+ tree database object.
bool tcbdbout(TCBDB *bdb, const void *kbuf, int ksiz);
- `bdb' specifies the B+ tree database object connected as a writer.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- If successful, the return value is true, else, it is false.
- If the key of duplicated records is specified, the first one is selected.
The function `tcbdbout2' is used in order to remove a string record of a B+ tree database object.
bool tcbdbout2(TCBDB *bdb, const char *kstr);
- `bdb' specifies the B+ tree database object connected as a writer.
- `kstr' specifies the string of the key.
- If successful, the return value is true, else, it is false.
- If the key of duplicated records is specified, the first one is selected.
The function `tcbdbout3' is used in order to remove records of a B+ tree database object.
bool tcbdbout3(TCBDB *bdb, const void *kbuf, int ksiz);
- `bdb' specifies the B+ tree database object connected as a writer.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- If successful, the return value is true, else, it is false.
- If the key of duplicated records is specified, all of them are removed.
The function `tcbdbget' is used in order to retrieve a record in a B+ tree database object.
void *tcbdbget(TCBDB *bdb, const void *kbuf, int ksiz, int *sp);
- `bdb' specifies the B+ tree database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the region of the value of the corresponding record. `NULL' is returned if no record corresponds.
- If the key of duplicated records is specified, the first one is selected. Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcbdbget2' is used in order to retrieve a string record in a B+ tree database object.
char *tcbdbget2(TCBDB *bdb, const char *kstr);
- `bdb' specifies the B+ tree database object.
- `kstr' specifies the string of the key.
- If successful, the return value is the string of the value of the corresponding record. `NULL' is returned if no record corresponds.
- If the key of duplicated records is specified, the first one is selected. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcbdbget3' is used in order to retrieve a record in a B+ tree database object as a volatile buffer.
const void *tcbdbget3(TCBDB *bdb, const void *kbuf, int ksiz, int *sp);
- `bdb' specifies the B+ tree database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the region of the value of the corresponding record. `NULL' is returned if no record corresponds.
- If the key of duplicated records is specified, the first one is selected. Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is volatile and may be overwritten by another operation on the database, the data should be copied into another non-volatile buffer immediately.
The function `tcbdbget4' is used in order to retrieve records in a B+ tree database object.
TCLIST *tcbdbget4(TCBDB *bdb, const void *kbuf, int ksiz);
- `bdb' specifies the B+ tree database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- If successful, the return value is a list object of the values of the corresponding records. `NULL' is returned if no record corresponds.
- Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use.
The function `tcbdbvnum' is used in order to get the number of records corresponding to a key in a B+ tree database object.
int tcbdbvnum(TCBDB *bdb, const void *kbuf, int ksiz);
- `bdb' specifies the B+ tree database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- If successful, the return value is the number of the corresponding records, else, it is 0.
The function `tcbdbvnum2' is used in order to get the number of records corresponding to a string key in a B+ tree database object.
int tcbdbvnum2(TCBDB *bdb, const char *kstr);
- `bdb' specifies the B+ tree database object.
- `kstr' specifies the string of the key.
- If successful, the return value is the number of the corresponding records, else, it is 0.
The function `tcbdbvsiz' is used in order to get the size of the value of a record in a B+ tree database object.
int tcbdbvsiz(TCBDB *bdb, const void *kbuf, int ksiz);
- `bdb' specifies the B+ tree database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- If successful, the return value is the size of the value of the corresponding record, else, it is -1.
- If the key of duplicated records is specified, the first one is selected.
The function `tcbdbvsiz2' is used in order to get the size of the value of a string record in a B+ tree database object.
int tcbdbvsiz2(TCBDB *bdb, const char *kstr);
- `bdb' specifies the B+ tree database object.
- `kstr' specifies the string of the key.
- If successful, the return value is the size of the value of the corresponding record, else, it is -1.
- If the key of duplicated records is specified, the first one is selected.
The function `tcbdbrange' is used in order to get keys of ranged records in a B+ tree database object.
TCLIST *tcbdbrange(TCBDB *bdb, const void *bkbuf, int bksiz, bool binc, const void *ekbuf, int eksiz, bool einc, int max);
- `bdb' specifies the B+ tree database object.
- `bkbuf' specifies the pointer to the region of the key of the beginning border. If it is `NULL', the first record is specified.
- `bksiz' specifies the size of the region of the beginning key.
- `binc' specifies whether the beginning border is inclusive or not.
- `ekbuf' specifies the pointer to the region of the key of the ending border. If it is `NULL', the last record is specified.
- `eksiz' specifies the size of the region of the ending key.
- `einc' specifies whether the ending border is inclusive or not.
- `max' specifies the maximum number of keys to be fetched. If it is negative, no limit is specified.
- The return value is a list object of the keys of the corresponding records. This function never fails; it returns an empty list even if no record corresponds.
- Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use.
The function `tcbdbrange2' is used in order to get string keys of ranged records in a B+ tree database object.
TCLIST *tcbdbrange2(TCBDB *bdb, const char *bkstr, bool binc, const char *ekstr, bool einc, int max);
- `bdb' specifies the B+ tree database object.
- `bkstr' specifies the string of the key of the beginning border. If it is `NULL', the first record is specified.
- `binc' specifies whether the beginning border is inclusive or not.
- `ekstr' specifies the string of the key of the ending border. If it is `NULL', the last record is specified.
- `einc' specifies whether the ending border is inclusive or not.
- `max' specifies the maximum number of keys to be fetched. If it is negative, no limit is specified.
- The return value is a list object of the keys of the corresponding records. This function never fails; it returns an empty list even if no record corresponds.
- Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use.
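The effect of the `binc' and `einc' flags can be modeled with a small stand-alone predicate. This is only a sketch of the border rule under the default lexical ordering, not library code; the function name `in_range' is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Model of the border test: true if `key' lies between `bk' and `ek'
   in lexical order. A NULL border means that side is unbounded.
   `binc' and `einc' decide whether a key equal to a border is kept. */
bool in_range(const char *key, const char *bk, bool binc,
              const char *ek, bool einc){
  assert(key != NULL);
  if(bk){
    int c = strcmp(key, bk);
    if(c < 0 || (c == 0 && !binc)) return false;
  }
  if(ek){
    int c = strcmp(key, ek);
    if(c > 0 || (c == 0 && !einc)) return false;
  }
  return true;
}
```

With the borders "bar" (inclusive) and "baz" (exclusive), the key "bar" is accepted and the key "baz" is rejected.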
The function `tcbdbfwmkeys' is used in order to get forward matching keys in a B+ tree database object.
TCLIST *tcbdbfwmkeys(TCBDB *bdb, const void *pbuf, int psiz, int max);
- `bdb' specifies the B+ tree database object.
- `pbuf' specifies the pointer to the region of the prefix.
- `psiz' specifies the size of the region of the prefix.
- `max' specifies the maximum number of keys to be fetched. If it is negative, no limit is specified.
- The return value is a list object of the corresponding keys. This function never fails; it returns an empty list even if no key corresponds.
- Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use.
The function `tcbdbfwmkeys2' is used in order to get forward matching string keys in a B+ tree database object.
TCLIST *tcbdbfwmkeys2(TCBDB *bdb, const char *pstr, int max);
- `bdb' specifies the B+ tree database object.
- `pstr' specifies the string of the prefix.
- `max' specifies the maximum number of keys to be fetched. If it is negative, no limit is specified.
- The return value is a list object of the corresponding keys. This function never fails; it returns an empty list even if no key corresponds.
- Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use.
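Forward matching together with the `max' limit can be modeled over a plain array of keys. The sketch below is a stand-alone illustration of the rule, not the library's implementation; `count_fwm' is a hypothetical name.

```c
#include <assert.h>
#include <string.h>

/* Model: count the keys in `keys' that begin with `prefix', stopping
   once `max' matches are found. A negative `max' means no limit. */
int count_fwm(const char **keys, int knum, const char *prefix, int max){
  assert(keys != NULL && prefix != NULL && knum >= 0);
  size_t psiz = strlen(prefix);
  int hits = 0;
  for(int i = 0; i < knum; i++){
    if(max >= 0 && hits >= max) break;
    if(strncmp(keys[i], prefix, psiz) == 0) hits++;
  }
  return hits;
}
```

Given the keys "bar", "baz", and "foo", the prefix "ba" matches two keys without a limit and one key when `max' is 1.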
The function `tcbdbaddint' is used in order to add an integer to a record in a B+ tree database object.
int tcbdbaddint(TCBDB *bdb, const void *kbuf, int ksiz, int num);
- `bdb' specifies the B+ tree database object connected as a writer.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `num' specifies the additional value.
- If successful, the return value is the summation value, else, it is `INT_MIN'.
- If the corresponding record exists, the value is treated as an integer and is added to. If no record corresponds, a new record of the additional value is stored.
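The add-integer rule can be modeled on a single value buffer. The sketch assumes that the stored value is the raw bytes of a native `int' and that a value of any other size is reported as a failure; neither detail is stated in this section, and `add_int_model' is a hypothetical name.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Model of the add-integer rule on one value buffer.
   `vbuf' holds the current value (capacity of at least sizeof(int))
   and `*vsiz' its size; a size of 0 stands for "no record".
   Assumption: the value is the raw bytes of a native int.
   Returns the sum, or INT_MIN if the existing value is not int-sized. */
int add_int_model(void *vbuf, int *vsiz, int num){
  assert(vbuf != NULL && vsiz != NULL);
  int cur = 0;
  if(*vsiz != 0){
    if(*vsiz != (int)sizeof(int)) return INT_MIN;
    memcpy(&cur, vbuf, sizeof(cur));  /* treat the value as an integer */
  }
  cur += num;
  memcpy(vbuf, &cur, sizeof(cur));    /* store the sum back */
  *vsiz = (int)sizeof(cur);
  return cur;
}
```

Under this assumption, a value read back later is a binary integer of `sizeof(int)' bytes and not a decimal string.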
The function `tcbdbadddouble' is used in order to add a real number to a record in a B+ tree database object.
double tcbdbadddouble(TCBDB *bdb, const void *kbuf, int ksiz, double num);
- `bdb' specifies the B+ tree database object connected as a writer.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `num' specifies the additional value.
- If successful, the return value is the summation value, else, it is `NAN'.
- If the corresponding record exists, the value is treated as a real number and is added to. If no record corresponds, a new record of the additional value is stored.
The function `tcbdbsync' is used in order to synchronize updated contents of a B+ tree database object with the file and the device.
bool tcbdbsync(TCBDB *bdb);
- `bdb' specifies the B+ tree database object connected as a writer.
- If successful, the return value is true, else, it is false.
- This function is useful when another process connects to the same database file.
The function `tcbdboptimize' is used in order to optimize the file of a B+ tree database object.
bool tcbdboptimize(TCBDB *bdb, int32_t lmemb, int32_t nmemb, int64_t bnum, int8_t apow, int8_t fpow, uint8_t opts);
- `bdb' specifies the B+ tree database object connected as a writer.
- `lmemb' specifies the number of members in each leaf page. If it is not more than 0, the current setting is not changed.
- `nmemb' specifies the number of members in each non-leaf page. If it is not more than 0, the current setting is not changed.
- `bnum' specifies the number of elements of the bucket array. If it is not more than 0, the default value is specified. The default value is twice the number of pages.
- `apow' specifies the size of record alignment by power of 2. If it is negative, the current setting is not changed.
- `fpow' specifies the maximum number of elements of the free block pool by power of 2. If it is negative, the current setting is not changed.
- `opts' specifies options by bitwise or: `BDBTLARGE' specifies that the size of the database can be larger than 2GB by using 64-bit bucket array, `BDBTDEFLATE' specifies that each page is compressed with Deflate encoding, `BDBTTCBS' specifies that each page is compressed with TCBS encoding. If it is `UINT8_MAX', the current setting is not changed.
- If successful, the return value is true, else, it is false.
- This function is useful for reducing the size of a database file whose data has become fragmented by successive updates.
The function `tcbdbvanish' is used in order to remove all records of a B+ tree database object.
bool tcbdbvanish(TCBDB *bdb);
- `bdb' specifies the B+ tree database object connected as a writer.
- If successful, the return value is true, else, it is false.
The function `tcbdbcopy' is used in order to copy the database file of a B+ tree database object.
bool tcbdbcopy(TCBDB *bdb, const char *path);
- `bdb' specifies the B+ tree database object.
- `path' specifies the path of the destination file. If it begins with `@', the trailing substring is executed as a command line.
- If successful, the return value is true, else, it is false. False is returned if the executed command returns non-zero code.
- The database file is assured to be kept synchronized and not modified while the copying or executing operation is in progress. So, this function is useful to create a backup file of the database file.
The function `tcbdbtranbegin' is used in order to begin the transaction of a B+ tree database object.
bool tcbdbtranbegin(TCBDB *bdb);
- `bdb' specifies the B+ tree database object connected as a writer.
- If successful, the return value is true, else, it is false.
- The database is locked by the thread during the transaction so that only one transaction can be active on a database object at a time. Thus, the serializable isolation level is assumed if every database operation is performed in the transaction. Because all pages are cached in memory during the transaction, the number of records that can be referred to is limited by the memory capacity. If the database is closed during a transaction, the transaction is aborted implicitly.
The function `tcbdbtrancommit' is used in order to commit the transaction of a B+ tree database object.
bool tcbdbtrancommit(TCBDB *bdb);
- `bdb' specifies the B+ tree database object connected as a writer.
- If successful, the return value is true, else, it is false.
- Updates in the transaction are fixed when it is committed successfully.
The function `tcbdbtranabort' is used in order to abort the transaction of a B+ tree database object.
bool tcbdbtranabort(TCBDB *bdb);
- `bdb' specifies the B+ tree database object connected as a writer.
- If successful, the return value is true, else, it is false.
- Updates in the transaction are discarded when it is aborted. The state of the database is rolled back to what it was before the transaction.
The function `tcbdbpath' is used in order to get the file path of a B+ tree database object.
const char *tcbdbpath(TCBDB *bdb);
- `bdb' specifies the B+ tree database object.
- The return value is the path of the database file or `NULL' if the object does not connect to any database file.
The function `tcbdbrnum' is used in order to get the number of records of a B+ tree database object.
uint64_t tcbdbrnum(TCBDB *bdb);
- `bdb' specifies the B+ tree database object.
- The return value is the number of records or 0 if the object does not connect to any database file.
The function `tcbdbfsiz' is used in order to get the size of the database file of a B+ tree database object.
uint64_t tcbdbfsiz(TCBDB *bdb);
- `bdb' specifies the B+ tree database object.
- The return value is the size of the database file or 0 if the object does not connect to any database file.
The function `tcbdbcurnew' is used in order to create a cursor object.
BDBCUR *tcbdbcurnew(TCBDB *bdb);
- `bdb' specifies the B+ tree database object.
- The return value is the new cursor object.
- Note that the cursor is available only after initialization with the `tcbdbcurfirst' or the `tcbdbcurjump' functions and so on. Moreover, the position of the cursor will be undefined when the database is updated after the initialization of the cursor.
The function `tcbdbcurdel' is used in order to delete a cursor object.
void tcbdbcurdel(BDBCUR *cur);
- `cur' specifies the cursor object.
The function `tcbdbcurfirst' is used in order to move a cursor object to the first record.
bool tcbdbcurfirst(BDBCUR *cur);
- `cur' specifies the cursor object.
- If successful, the return value is true, else, it is false. False is returned if there is no record in the database.
The function `tcbdbcurlast' is used in order to move a cursor object to the last record.
bool tcbdbcurlast(BDBCUR *cur);
- `cur' specifies the cursor object.
- If successful, the return value is true, else, it is false. False is returned if there is no record in the database.
The function `tcbdbcurjump' is used in order to move a cursor object to the front of records corresponding to a key.
bool tcbdbcurjump(BDBCUR *cur, const void *kbuf, int ksiz);
- `cur' specifies the cursor object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- If successful, the return value is true, else, it is false. False is returned if there is no record corresponding to the condition.
- The cursor is set to the first record corresponding to the key or the next substitute if a completely matching record does not exist.
The function `tcbdbcurjump2' is used in order to move a cursor object to the front of records corresponding to a key string.
bool tcbdbcurjump2(BDBCUR *cur, const char *kstr);
- `cur' specifies the cursor object.
- `kstr' specifies the string of the key.
- If successful, the return value is true, else, it is false. False is returned if there is no record corresponding to the condition.
- The cursor is set to the first record corresponding to the key or the next substitute if a completely matching record does not exist.
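The jump rule (the first record with the key, or else the next substitute) amounts to a lower-bound search over the sorted keys. The following stand-alone sketch models it over a sorted array under the default lexical ordering; it is an illustration and not the library's implementation, and `jump_index' is a hypothetical name.

```c
#include <assert.h>
#include <string.h>

/* Model of the jump rule over a sorted key array: return the index of
   the first key that is not less than `kstr', or -1 if there is none
   (the case in which the jump would report failure). */
int jump_index(const char **keys, int knum, const char *kstr){
  assert(keys != NULL && kstr != NULL && knum >= 0);
  int lo = 0, hi = knum;
  while(lo < hi){
    int mid = lo + (hi - lo) / 2;
    if(strcmp(keys[mid], kstr) < 0){
      lo = mid + 1;   /* keys[mid] is smaller: look to the right */
    } else {
      hi = mid;       /* keys[mid] is a candidate: look to the left */
    }
  }
  return lo < knum ? lo : -1;
}
```

For the keys "bar", "baz", "baz", "foo", jumping to "baz" lands on the first of the two duplicates, jumping to "bb" lands on "foo", and jumping to "zzz" fails.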
The function `tcbdbcurprev' is used in order to move a cursor object to the previous record.
bool tcbdbcurprev(BDBCUR *cur);
- `cur' specifies the cursor object.
- If successful, the return value is true, else, it is false. False is returned if there is no previous record.
The function `tcbdbcurnext' is used in order to move a cursor object to the next record.
bool tcbdbcurnext(BDBCUR *cur);
- `cur' specifies the cursor object.
- If successful, the return value is true, else, it is false. False is returned if there is no next record.
The function `tcbdbcurput' is used in order to insert a record around a cursor object.
bool tcbdbcurput(BDBCUR *cur, const void *vbuf, int vsiz, int cpmode);
- `cur' specifies the cursor object of writer connection.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value.
- `cpmode' specifies detail adjustment: `BDBCPCURRENT', which means that the value of the current record is overwritten, `BDBCPBEFORE', which means that the new record is inserted before the current record, `BDBCPAFTER', which means that the new record is inserted after the current record.
- If successful, the return value is true, else, it is false. False is returned when the cursor is at an invalid position.
- After insertion, the cursor is moved to the inserted record.
The function `tcbdbcurput2' is used in order to insert a string record around a cursor object.
bool tcbdbcurput2(BDBCUR *cur, const char *vstr, int cpmode);
- `cur' specifies the cursor object of writer connection.
- `vstr' specifies the string of the value.
- `cpmode' specifies detail adjustment: `BDBCPCURRENT', which means that the value of the current record is overwritten, `BDBCPBEFORE', which means that the new record is inserted before the current record, `BDBCPAFTER', which means that the new record is inserted after the current record.
- If successful, the return value is true, else, it is false. False is returned when the cursor is at an invalid position.
- After insertion, the cursor is moved to the inserted record.
The function `tcbdbcurout' is used in order to remove the record where a cursor object is.
bool tcbdbcurout(BDBCUR *cur);
- `cur' specifies the cursor object of writer connection.
- If successful, the return value is true, else, it is false. False is returned when the cursor is at an invalid position.
- After deletion, the cursor is moved to the next record if possible.
The function `tcbdbcurkey' is used in order to get the key of the record where the cursor object is.
char *tcbdbcurkey(BDBCUR *cur, int *sp);
- `cur' specifies the cursor object.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the region of the key, else, it is `NULL'. `NULL' is returned when the cursor is at an invalid position.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcbdbcurkey2' is used in order to get the key string of the record where the cursor object is.
char *tcbdbcurkey2(BDBCUR *cur);
- `cur' specifies the cursor object.
- If successful, the return value is the string of the key, else, it is `NULL'. `NULL' is returned when the cursor is at an invalid position.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcbdbcurkey3' is used in order to get the key of the record where the cursor object is, as a volatile buffer.
const char *tcbdbcurkey3(BDBCUR *cur, int *sp);
- `cur' specifies the cursor object.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the region of the key, else, it is `NULL'. `NULL' is returned when the cursor is at an invalid position.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is volatile and may be overwritten by another operation on the database, the data should be copied into another non-volatile buffer immediately.
The function `tcbdbcurval' is used in order to get the value of the record where the cursor object is.
char *tcbdbcurval(BDBCUR *cur, int *sp);
- `cur' specifies the cursor object.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the region of the value, else, it is `NULL'. `NULL' is returned when the cursor is at an invalid position.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcbdbcurval2' is used in order to get the value string of the record where the cursor object is.
char *tcbdbcurval2(BDBCUR *cur);
- `cur' specifies the cursor object.
- If successful, the return value is the string of the value, else, it is `NULL'. `NULL' is returned when the cursor is at an invalid position.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcbdbcurval3' is used in order to get the value of the record where the cursor object is, as a volatile buffer.
const char *tcbdbcurval3(BDBCUR *cur, int *sp);
- `cur' specifies the cursor object.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the region of the value, else, it is `NULL'. `NULL' is returned when the cursor is at an invalid position.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is volatile and may be overwritten by another operation on the database, the data should be copied into another non-volatile buffer immediately.
The function `tcbdbcurrec' is used in order to get the key and the value of the record where the cursor object is.
bool tcbdbcurrec(BDBCUR *cur, TCXSTR *kxstr, TCXSTR *vxstr);
- `cur' specifies the cursor object.
- `kxstr' specifies the object into which the key is written.
- `vxstr' specifies the object into which the value is written.
- If successful, the return value is true, else, it is false. False is returned when the cursor is at an invalid position.
Example Code
The following code is an example of using the B+ tree database.
#include <tcutil.h>
#include <tcbdb.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
int main(int argc, char **argv){
  TCBDB *bdb;
  BDBCUR *cur;
  int ecode;
  char *key, *value;
  /* create the object */
  bdb = tcbdbnew();
  /* open the database */
  if(!tcbdbopen(bdb, "casket.bdb", BDBOWRITER | BDBOCREAT)){
    ecode = tcbdbecode(bdb);
    fprintf(stderr, "open error: %s\n", tcbdberrmsg(ecode));
  }
  /* store records */
  if(!tcbdbput2(bdb, "foo", "hop") ||
     !tcbdbput2(bdb, "bar", "step") ||
     !tcbdbput2(bdb, "baz", "jump")){
    ecode = tcbdbecode(bdb);
    fprintf(stderr, "put error: %s\n", tcbdberrmsg(ecode));
  }
  /* retrieve a record */
  value = tcbdbget2(bdb, "foo");
  if(value){
    printf("%s\n", value);
    free(value);
  } else {
    ecode = tcbdbecode(bdb);
    fprintf(stderr, "get error: %s\n", tcbdberrmsg(ecode));
  }
  /* traverse records with a cursor */
  cur = tcbdbcurnew(bdb);
  tcbdbcurfirst(cur);
  while((key = tcbdbcurkey2(cur)) != NULL){
    value = tcbdbcurval2(cur);
    if(value){
      printf("%s:%s\n", key, value);
      free(value);
    }
    free(key);
    tcbdbcurnext(cur);
  }
  tcbdbcurdel(cur);
  /* close the database */
  if(!tcbdbclose(bdb)){
    ecode = tcbdbecode(bdb);
    fprintf(stderr, "close error: %s\n", tcbdberrmsg(ecode));
  }
  /* delete the object */
  tcbdbdel(bdb);
  return 0;
}
CLI
To make the B+ tree database API easy to use, the command line interfaces `tcbtest', `tcbmttest', and `tcbmgr' are provided.
The command `tcbtest' is a tool for functional tests and performance tests of the B+ tree database API. It is used in the following format. `path' specifies the path of the database file, `rnum' specifies the number of iterations, `lmemb' specifies the number of members in each leaf, `nmemb' specifies the number of members in each non-leaf, `bnum' specifies the number of buckets, `apow' specifies the power of the alignment, and `fpow' specifies the power of the free block pool.
tcbtest write [-mt] [-cd|-ci|-cj] [-tl] [-td|-tb|-tt|-tx] [-lc num] [-nc num] [-xm num] [-ls num] [-ca num] [-nl|-nb] [-rnd] path rnum [lmemb [nmemb [bnum [apow [fpow]]]]]
- Successively add records whose 8-byte keys and values change like `00000001', `00000002', and so on.
tcbtest read [-mt] [-cd|-ci|-cj] [-lc num] [-nc num] [-xm num] [-nl|-nb] [-wb] [-rnd] path
- Retrieve all records of the database generated above.
tcbtest remove [-mt] [-cd|-ci|-cj] [-lc num] [-nc num] [-xm num] [-nl|-nb] [-rnd] path
- Remove all records of the database generated above.
tcbtest rcat [-mt] [-cd|-ci|-cj] [-tl] [-td|-tb|-tt|-tx] [-lc num] [-nc num] [-xm num] [-ls num] [-ca num] [-nl|-nb] [-pn num] [-dai|-dad|-rl] path rnum [lmemb [nmemb [bnum [apow [fpow]]]]]
- Add records whose keys are partly duplicated, using the concatenate mode.
tcbtest queue [-mt] [-cd|-ci|-cj] [-tl] [-td|-tb|-tt|-tx] [-lc num] [-nc num] [-xm num] [-ls num] [-ca num] [-nl|-nb] path rnum [lmemb [nmemb [bnum [apow [fpow]]]]]
- Perform queueing and dequeueing.
tcbtest misc [-mt] [-tl] [-td|-tb|-tt|-tx] [-nl|-nb] path rnum
- Perform a combination test of various operations.
tcbtest wicked [-mt] [-tl] [-td|-tb|-tt|-tx] [-nl|-nb] path rnum
- Perform update operations selected at random.
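The 8-byte keys used by the `write' subcommand, such as `00000001' and `00000002', can be produced by zero-padded decimal formatting. The sketch below assumes that convention; `format_test_key' is a hypothetical helper and not a function of `tcbtest' or of the library.

```c
#include <stdio.h>

/* Hypothetical helper: format record number `i' as an 8-digit,
   zero-padded decimal string such as "00000001".
   `buf' needs room for at least 9 bytes (8 digits plus the terminator).
   Returns the number of characters written, excluding the terminator. */
int format_test_key(char *buf, size_t bufsiz, int i){
  return snprintf(buf, bufsiz, "%08d", i);
}
```

Keys formatted this way all have the same length, so byte-wise lexical order and numeric order agree for record numbers of up to eight digits.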
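Each option has the following function.
-mt
: Call the function `tcbdbsetmutex'.
-cd
: Use the comparison function `tccmpdecimal'.
-ci
: Use the comparison function `tccmpint32'.
-cj
: Use the comparison function `tccmpint64'.
-tl
: Enable the option `BDBTLARGE'.
-td
: Enable the option `BDBTDEFLATE'.
-tb
: Enable the option `BDBTBZIP'.
-tt
: Enable the option `BDBTTCBS'.
-tx
: Enable the option `BDBTEXCODEC'.
-lc num
: Specify the maximum number of cached leaf nodes.
-nc num
: Specify the maximum number of cached non-leaf nodes.
-xm num
: Specify the size of the extra mapped memory.
-ls num
: Specify the maximum size of each leaf node.
-ca num
: Specify the maximum number of records to be held.
-nl
: Enable the option `BDBNOLCK'.
-nb
: Enable the option `BDBLCKNB'.
-rnd
: Select keys at random.
-wb
: Use the function `tcbdbget3' instead of the function `tcbdbget'.
-pn num
: Specify the number of patterns.
-dai
: Use the function `tcbdbaddint' instead of the function `tcbdbputcat'.
-dad
: Use the function `tcbdbadddouble' instead of the function `tcbdbputcat'.
-rl
: Give values a random length.
This command exits with 0 if processing completes normally, or with a non-zero value if an error occurs.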
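The command `tcbmttest' is a tool for functional tests of the B+ tree database API under multiple threads. It is used in the following format. `path' specifies the path of the database file, `tnum' specifies the number of threads, `rnum' specifies the number of iterations, `lmemb' specifies the number of members in each leaf, `nmemb' specifies the number of members in each non-leaf, `bnum' specifies the number of buckets, `apow' specifies the power of the alignment, and `fpow' specifies the power of the free block pool.
tcbmttest write [-tl] [-td|-tb|-tt|-tx] [-nl|-nb] [-rnd] path tnum rnum [lmemb [nmemb [bnum [apow [fpow]]]]]
- Successively add records whose 8-byte keys and values change like `00000001', `00000002', and so on.
tcbmttest read [-nl|-nb] [-wb] [-rnd] path tnum
- Retrieve all records of the database generated above.
tcbmttest remove [-nl|-nb] [-rnd] path tnum
- Remove all records of the database generated above.
tcbmttest wicked [-tl] [-td|-tb|-tt|-tx] [-nl|-nb] [-nc] path tnum rnum
- Perform update operations selected at random.
tcbmttest typical [-tl] [-td|-tb|-tt|-tx] [-nl|-nb] [-nc] [-rr num] path tnum rnum [lmemb [nmemb [bnum [apow [fpow]]]]]
- Perform typical operations selected at random.
Each option has the following function.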
-tl
: Enable the option `BDBTLARGE'.
-td
: Enable the option `BDBTDEFLATE'.
-tb
: Enable the option `BDBTBZIP'.
-tt
: Enable the option `BDBTTCBS'.
-tx
: Enable the option `BDBTEXCODEC'.
-nl
: Enable the option `BDBNOLCK'.
-nb
: Enable the option `BDBLCKNB'.
-rnd
: Select keys at random.
-wb
: Use the function `tcbdbget3' instead of the function `tcbdbget'.
-nc
: Do not perform the comparison test.
-rr num
: Specify the ratio of read operations as a percentage.
This command exits with 0 if processing completes normally, or with a non-zero value if an error occurs.
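The command `tcbmgr' is a tool for testing and debugging the B+ tree database API and its applications. It is used in the following format. `path' specifies the path of the database file, `lmemb' specifies the number of members in each leaf, `nmemb' specifies the number of members in each non-leaf, `bnum' specifies the number of buckets, `apow' specifies the power of the alignment, `fpow' specifies the power of the free block pool, `key' specifies the key of a record, `value' specifies the value of a record, and `file' specifies the input file.
tcbmgr create [-cd|-ci|-cj] [-tl] [-td|-tb|-tt|-tx] path [lmemb [nmemb [bnum [apow [fpow]]]]]
- Create a database file.
tcbmgr inform [-nl|-nb] path
- Print miscellaneous information about the database.
tcbmgr put [-cd|-ci|-cj] [-nl|-nb] [-sx] [-dk|-dc] path key value
- Add a record.
tcbmgr out [-cd|-ci|-cj] [-nl|-nb] [-sx] path key
- Remove a record.
tcbmgr get [-cd|-ci|-cj] [-nl|-nb] [-sx] [-px] [-pz] path key
- Retrieve the value of a record and print it to the standard output.
tcbmgr list [-cd|-ci|-cj] [-nl|-nb] [-m num] [-bk] [-pv] [-px] [-j str] [-rb bkey ekey] [-fm str] path
- Print the keys of all records, separated by line feeds, to the standard output.
tcbmgr optimize [-cd|-ci|-cj] [-tl] [-td|-tb|-tt|-tx] [-tz] [-nl|-nb] path [lmemb [nmemb [bnum [apow [fpow]]]]]
- Optimize the database.
tcbmgr importtsv [-nl|-nb] [-sc] path [file]
- Store records, treating each line of a TSV file as a key and a value.
tcbmgr version
- Print the version information of Tokyo Cabinet to the standard output.
Each option has the following function.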
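-cd
: Use the comparison function `tccmpdecimal'.
-ci
: Use the comparison function `tccmpint32'.
-cj
: Use the comparison function `tccmpint64'.
-tl
: Enable the option `BDBTLARGE'.
-td
: Enable the option `BDBTDEFLATE'.
-tb
: Enable the option `BDBTBZIP'.
-tt
: Enable the option `BDBTTCBS'.
-tx
: Enable the option `BDBTEXCODEC'.
-nl
: Enable the option `BDBNOLCK'.
-nb
: Enable the option `BDBLCKNB'.
-sx
: Give the input as a hexadecimal string.
-dk
: Use the function `tcbdbputkeep' instead of the function `tcbdbput'.
-dc
: Use the function `tcbdbputcat' instead of the function `tcbdbput'.
-px
: Print the output as a hexadecimal string.
-pz
: Do not append a line feed at the end of the output.
-m num
: Specify the maximum number of outputs.
-bk
: Scan in reverse order.
-pv
: Print the values of records as well.
-j str
: Jump the cursor to the specified position.
-rb bkey ekey
: Restrict processing to the specified range.
-fm str
: Specify the prefix of keys.
-tz
: Enable the option `UINT8_MAX'.
-sc
: Normalize keys to lower case.
This command exits with 0 if processing completes normally, or with a non-zero value if an error occurs.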
The Fixed-length Database API
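The fixed-length database is a database that records an array of fixed-length elements in a single file. The fixed-length database API is the set of routines that handles it. See `tcfdb.h' for the complete specification of the API.
Overview
To use the fixed-length database API, include `tcutil.h', `tcfdb.h', and the related standard header files. Usually, the following lines are placed near the beginning of a source file.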
#include <tcutil.h>
#include <tcfdb.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
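A pointer to the `TCFDB' type is used as the object for handling a fixed-length database. A fixed-length database object is created with the function `tcfdbnew' and deleted with the function `tcfdbdel'. Be sure to delete every object once it is no longer in use. Otherwise, a memory leak occurs.
Before records can be stored or retrieved, the fixed-length database object must be connected to a database file. The function `tcfdbopen' opens a database file and connects to it, and the function `tcfdbclose' disconnects from it and closes the file. Be sure to close every database file that has been opened. Otherwise, the database file may be corrupted or stored data may be lost.
API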
The function `tcfdberrmsg' is used in order to get the message string corresponding to an error code.
const char *tcfdberrmsg(int ecode);
- `ecode' specifies the error code.
- The return value is the message string of the error code.
The function `tcfdbnew' is used in order to create a fixed-length database object.
TCFDB *tcfdbnew(void);
- The return value is the new fixed-length database object.
The function `tcfdbdel' is used in order to delete a fixed-length database object.
void tcfdbdel(TCFDB *fdb);
- `fdb' specifies the fixed-length database object.
- If the database is not closed, it is closed implicitly. Note that the deleted object and its derivatives can not be used anymore.
The function `tcfdbecode' is used in order to get the last happened error code of a fixed-length database object.
int tcfdbecode(TCFDB *fdb);
- `fdb' specifies the fixed-length database object.
- The return value is the last happened error code.
- The following error codes are defined: `TCESUCCESS' for success, `TCETHREAD' for threading error, `TCEINVALID' for invalid operation, `TCENOFILE' for file not found, `TCENOPERM' for no permission, `TCEMETA' for invalid meta data, `TCERHEAD' for invalid record header, `TCEOPEN' for open error, `TCECLOSE' for close error, `TCETRUNC' for trunc error, `TCESYNC' for sync error, `TCESTAT' for stat error, `TCESEEK' for seek error, `TCEREAD' for read error, `TCEWRITE' for write error, `TCEMMAP' for mmap error, `TCELOCK' for lock error, `TCEUNLINK' for unlink error, `TCERENAME' for rename error, `TCEMKDIR' for mkdir error, `TCERMDIR' for rmdir error, `TCEKEEP' for existing record, `TCENOREC' for no record found, and `TCEMISC' for miscellaneous error.
The function `tcfdbsetmutex' is used in order to set mutual exclusion control of a fixed-length database object for threading.
bool tcfdbsetmutex(TCFDB *fdb);
- `fdb' specifies the fixed-length database object which is not opened.
- If successful, the return value is true, else, it is false.
- Note that the mutual exclusion control is needed if the object is shared by multiple threads, and this function should be called before the database is opened.
The function `tcfdbtune' is used in order to set the tuning parameters of a fixed-length database object.
bool tcfdbtune(TCFDB *fdb, int32_t width, int64_t limsiz);
- `fdb' specifies the fixed-length database object which is not opened.
- `width' specifies the width of the value of each record. If it is not more than 0, the default value is specified. The default value is 255.
- `limsiz' specifies the limit size of the database file. If it is not more than 0, the default value is specified. The default value is 268435456.
- If successful, the return value is true, else, it is false.
- Note that the tuning parameters should be set before the database is opened.
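The default substitution described above can be sketched as follows. This only models the rule "not more than 0 means the default" with the values quoted in the text (255 bytes, and 268435456 bytes, which is 256 MiB); the type `ModelTuning' and the function `model_resolve_tuning' are hypothetical, and any further adjustment the library may apply is not modeled.

```c
#include <stdint.h>

/* Defaults quoted in the description of `tcfdbtune'. */
#define MODEL_DEFAULT_WIDTH  255
#define MODEL_DEFAULT_LIMSIZ 268435456LL

/* Resolved tuning parameters (hypothetical type for this sketch). */
typedef struct {
  int32_t width;   /* maximum size of each value */
  int64_t limsiz;  /* limit size of the database file */
} ModelTuning;

/* Replace arguments that are not more than 0 with the documented defaults. */
ModelTuning model_resolve_tuning(int32_t width, int64_t limsiz){
  ModelTuning t;
  t.width = (width > 0) ? width : MODEL_DEFAULT_WIDTH;
  t.limsiz = (limsiz > 0) ? limsiz : MODEL_DEFAULT_LIMSIZ;
  return t;
}
```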
The function `tcfdbopen' is used in order to open a database file and connect a fixed-length database object.
bool tcfdbopen(TCFDB *fdb, const char *path, int omode);
- `fdb' specifies the fixed-length database object which is not opened.
- `path' specifies the path of the database file.
- `omode' specifies the connection mode: `FDBOWRITER' as a writer, `FDBOREADER' as a reader. If the mode is `FDBOWRITER', the following may be added by bitwise or: `FDBOCREAT', which means it creates a new database if one does not exist, `FDBOTRUNC', which means it creates a new database regardless of whether one exists. Both `FDBOREADER' and `FDBOWRITER' can be combined by bitwise or with `FDBONOLCK', which means it opens the database file without file locking, or `FDBOLCKNB', which means locking is performed without blocking.
- If successful, the return value is true, else, it is false.
The function `tcfdbclose' is used in order to close a fixed-length database object.
bool tcfdbclose(TCFDB *fdb);
- `fdb' specifies the fixed-length database object.
- If successful, the return value is true, else, it is false.
- Update of a database is assured to be written when the database is closed. If a writer opens a database but does not close it appropriately, the database will be broken.
The function `tcfdbput' is used in order to store a record into a fixed-length database object.
bool tcfdbput(TCFDB *fdb, int64_t id, const void *vbuf, int vsiz);
- `fdb' specifies the fixed-length database object connected as a writer.
- `id' specifies the ID number. It should be more than 0. If it is `FDBIDMIN', the minimum ID number of existing records is specified. If it is `FDBIDPREV', the number less by one than the minimum ID number of existing records is specified. If it is `FDBIDMAX', the maximum ID number of existing records is specified. If it is `FDBIDNEXT', the number greater by one than the maximum ID number of existing records is specified.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value. If the size of the value is greater than the width tuning parameter of the database, the size is cut down to the width.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, it is overwritten.
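How the special ID selectors and the width limit interact can be illustrated with a small model. This is a sketch under stated assumptions, not the library's implementation: the `MODEL_ID*' constants merely stand in for `FDBIDMIN', `FDBIDPREV', `FDBIDMAX', and `FDBIDNEXT' (their numeric values here are assumptions), and an empty database is assumed to report 0 for both the minimum and the maximum ID.

```c
#include <stdint.h>

/* Placeholder selectors; the numeric values are assumptions of this sketch. */
enum { MODEL_IDMIN = -1, MODEL_IDPREV = -2, MODEL_IDMAX = -3, MODEL_IDNEXT = -4 };

/* Resolve `id' against the smallest and largest existing IDs.
   Returns the concrete ID, or 0 if the result is not more than 0. */
int64_t model_resolve_id(int64_t id, int64_t min, int64_t max){
  int64_t r;
  switch(id){
    case MODEL_IDMIN:  r = min;     break;
    case MODEL_IDPREV: r = min - 1; break;
    case MODEL_IDMAX:  r = max;     break;
    case MODEL_IDNEXT: r = max + 1; break;
    default:           r = id;      break;
  }
  return (r > 0) ? r : 0;
}

/* Cut the value size down to the width tuning parameter. */
int model_clamp_vsiz(int vsiz, int width){
  return (vsiz > width) ? width : vsiz;
}
```

With existing IDs ranging from 3 to 10, the "next" selector resolves to 11 in this model, which is what makes it convenient for appending records.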
The function `tcfdbput2' is used in order to store a record with a decimal key into a fixed-length database object.
bool tcfdbput2(TCFDB *fdb, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `fdb' specifies the fixed-length database object connected as a writer.
- `kbuf' specifies the pointer to the region of the decimal key. It should be more than 0. If it is "min", the minimum ID number of existing records is specified. If it is "prev", the number less by one than the minimum ID number of existing records is specified. If it is "max", the maximum ID number of existing records is specified. If it is "next", the number greater by one than the maximum ID number of existing records is specified.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value. If the size of the value is greater than the width tuning parameter of the database, the size is cut down to the width.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, it is overwritten.
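A decimal key is either one of the words "min", "prev", "max", "next" or a string of decimal digits. The following sketch shows one plausible way to map such a key to an ID number. The function `model_key_to_id' and the `KEY_*' constants are hypothetical, and details such as rejecting trailing non-digit characters are assumptions of this sketch rather than documented behavior.

```c
#include <stdint.h>
#include <string.h>

/* Placeholder results; the numeric values are assumptions of this sketch. */
enum { KEY_INVALID = 0, KEY_MIN = -1, KEY_PREV = -2, KEY_MAX = -3, KEY_NEXT = -4 };

/* Convert a decimal key region into an ID number or a selector.
   Returns KEY_INVALID for an empty region, a non-digit character,
   a value of 0, or a value that does not fit into int64_t. */
int64_t model_key_to_id(const char *kbuf, int ksiz){
  if(!kbuf || ksiz < 1) return KEY_INVALID;
  if(ksiz == 3 && memcmp(kbuf, "min", 3) == 0) return KEY_MIN;
  if(ksiz == 4 && memcmp(kbuf, "prev", 4) == 0) return KEY_PREV;
  if(ksiz == 3 && memcmp(kbuf, "max", 3) == 0) return KEY_MAX;
  if(ksiz == 4 && memcmp(kbuf, "next", 4) == 0) return KEY_NEXT;
  int64_t id = 0;
  for(int i = 0; i < ksiz; i++){
    if(kbuf[i] < '0' || kbuf[i] > '9') return KEY_INVALID;
    int digit = kbuf[i] - '0';
    /* overflow guard: id * 10 + digit must not exceed INT64_MAX */
    if(id > (INT64_MAX - digit) / 10) return KEY_INVALID;
    id = id * 10 + digit;
  }
  return id;
}
```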
The function `tcfdbput3' is used in order to store a string record with a decimal key into a fixed-length database object.
bool tcfdbput3(TCFDB *fdb, const char *kstr, const void *vstr);
- `fdb' specifies the fixed-length database object connected as a writer.
- `kstr' specifies the string of the decimal key. It should be more than 0. If it is "min", the minimum ID number of existing records is specified. If it is "prev", the number less by one than the minimum ID number of existing records is specified. If it is "max", the maximum ID number of existing records is specified. If it is "next", the number greater by one than the maximum ID number of existing records is specified.
- `vstr' specifies the string of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, it is overwritten.
The function `tcfdbputkeep' is used in order to store a new record into a fixed-length database object.
bool tcfdbputkeep(TCFDB *fdb, int64_t id, const void *vbuf, int vsiz);
- `fdb' specifies the fixed-length database object connected as a writer.
- `id' specifies the ID number. It should be more than 0. If it is `FDBIDMIN', the minimum ID number of existing records is specified. If it is `FDBIDPREV', the number less by one than the minimum ID number of existing records is specified. If it is `FDBIDMAX', the maximum ID number of existing records is specified. If it is `FDBIDNEXT', the number greater by one than the maximum ID number of existing records is specified.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value. If the size of the value is greater than the width tuning parameter of the database, the size is cut down to the width.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, this function has no effect.
The function `tcfdbputkeep2' is used in order to store a new record with a decimal key into a fixed-length database object.
bool tcfdbputkeep2(TCFDB *fdb, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `fdb' specifies the fixed-length database object connected as a writer.
- `kbuf' specifies the pointer to the region of the decimal key. It should be more than 0. If it is "min", the minimum ID number of existing records is specified. If it is "prev", the number less by one than the minimum ID number of existing records is specified. If it is "max", the maximum ID number of existing records is specified. If it is "next", the number greater by one than the maximum ID number of existing records is specified.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value. If the size of the value is greater than the width tuning parameter of the database, the size is cut down to the width.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, this function has no effect.
The function `tcfdbputkeep3' is used in order to store a new string record with a decimal key into a fixed-length database object.
bool tcfdbputkeep3(TCFDB *fdb, const char *kstr, const void *vstr);
- `fdb' specifies the fixed-length database object connected as a writer.
- `kstr' specifies the string of the decimal key. It should be more than 0. If it is "min", the minimum ID number of existing records is specified. If it is "prev", the number less by one than the minimum ID number of existing records is specified. If it is "max", the maximum ID number of existing records is specified. If it is "next", the number greater by one than the maximum ID number of existing records is specified.
- `vstr' specifies the string of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, this function has no effect.
The function `tcfdbputcat' is used in order to concatenate a value at the end of the existing record in a fixed-length database object.
bool tcfdbputcat(TCFDB *fdb, int64_t id, const void *vbuf, int vsiz);
- `fdb' specifies the fixed-length database object connected as a writer.
- `id' specifies the ID number. It should be more than 0. If it is `FDBIDMIN', the minimum ID number of existing records is specified. If it is `FDBIDPREV', the number less by one than the minimum ID number of existing records is specified. If it is `FDBIDMAX', the maximum ID number of existing records is specified. If it is `FDBIDNEXT', the number greater by one than the maximum ID number of existing records is specified.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value. If the size of the value is greater than the width tuning parameter of the database, the size is cut down to the width.
- If successful, the return value is true, else, it is false.
- If there is no corresponding record, a new record is created.
The function `tcfdbputcat2' is used in order to concatenate a value with a decimal key in a fixed-length database object.
bool tcfdbputcat2(TCFDB *fdb, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `fdb' specifies the fixed-length database object connected as a writer.
- `kbuf' specifies the pointer to the region of the decimal key. It should be more than 0. If it is "min", the minimum ID number of existing records is specified. If it is "prev", the number less by one than the minimum ID number of existing records is specified. If it is "max", the maximum ID number of existing records is specified. If it is "next", the number greater by one than the maximum ID number of existing records is specified.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value. If the size of the value is greater than the width tuning parameter of the database, the size is cut down to the width.
- If successful, the return value is true, else, it is false.
- If there is no corresponding record, a new record is created.
The function `tcfdbputcat3' is used in order to concatenate a string value with a decimal key in a fixed-length database object.
bool tcfdbputcat3(TCFDB *fdb, const char *kstr, const void *vstr);
- `fdb' specifies the fixed-length database object connected as a writer.
- `kstr' specifies the string of the decimal key. It should be more than 0. If it is "min", the minimum ID number of existing records is specified. If it is "prev", the number less by one than the minimum ID number of existing records is specified. If it is "max", the maximum ID number of existing records is specified. If it is "next", the number greater by one than the maximum ID number of existing records is specified.
- `vstr' specifies the string of the value.
- If successful, the return value is true, else, it is false.
- If there is no corresponding record, a new record is created.
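The three store modes differ only in how they treat an existing record. The model below illustrates this on a single in-memory slot. It is a sketch, not the library's code: `Slot', `SLOT_WIDTH', and the `slot_*' functions are hypothetical, returning false from the keep mode when a record exists is an assumption, and so is cutting a concatenated value down to the width.

```c
#include <stdbool.h>
#include <string.h>

#define SLOT_WIDTH 8   /* stands in for the width tuning parameter */

/* One fixed-width record slot (hypothetical type for this sketch). */
typedef struct {
  bool used;              /* whether a record exists */
  int  size;              /* current size of the value */
  char data[SLOT_WIDTH];  /* value bytes */
} Slot;

/* Overwrite mode: replace any existing value, cutting it down to the width. */
bool slot_put(Slot *s, const void *vbuf, int vsiz){
  if(vsiz < 0) return false;
  if(vsiz > SLOT_WIDTH) vsiz = SLOT_WIDTH;
  memcpy(s->data, vbuf, (size_t)vsiz);
  s->size = vsiz;
  s->used = true;
  return true;
}

/* Keep mode: store only if no record exists; otherwise leave it untouched. */
bool slot_putkeep(Slot *s, const void *vbuf, int vsiz){
  if(s->used) return false;
  return slot_put(s, vbuf, vsiz);
}

/* Concatenate mode: append to the existing value, or create a new record. */
bool slot_putcat(Slot *s, const void *vbuf, int vsiz){
  if(!s->used) return slot_put(s, vbuf, vsiz);
  if(vsiz < 0) return false;
  int room = SLOT_WIDTH - s->size;
  if(vsiz > room) vsiz = room;
  memcpy(s->data + s->size, vbuf, (size_t)vsiz);
  s->size += vsiz;
  return true;
}
```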
The function `tcfdbout' is used in order to remove a record of a fixed-length database object.
bool tcfdbout(TCFDB *fdb, int64_t id);
- `fdb' specifies the fixed-length database object connected as a writer.
- `id' specifies the ID number. It should be more than 0. If it is `FDBIDMIN', the minimum ID number of existing records is specified. If it is `FDBIDMAX', the maximum ID number of existing records is specified.
- If successful, the return value is true, else, it is false.
The function `tcfdbout2' is used in order to remove a record with a decimal key of a fixed-length database object.
bool tcfdbout2(TCFDB *fdb, const void *kbuf, int ksiz);
- `fdb' specifies the fixed-length database object connected as a writer.
- `kbuf' specifies the pointer to the region of the decimal key. It should be more than 0. If it is "min", the minimum ID number of existing records is specified. If it is "max", the maximum ID number of existing records is specified.
- `ksiz' specifies the size of the region of the key.
- If successful, the return value is true, else, it is false.
The function `tcfdbout3' is used in order to remove a string record with a decimal key of a fixed-length database object.
bool tcfdbout3(TCFDB *fdb, const char *kstr);
- `fdb' specifies the fixed-length database object connected as a writer.
- `kstr' specifies the string of the decimal key. It should be more than 0. If it is "min", the minimum ID number of existing records is specified. If it is "max", the maximum ID number of existing records is specified.
- If successful, the return value is true, else, it is false.
The function `tcfdbget' is used in order to retrieve a record in a fixed-length database object.
void *tcfdbget(TCFDB *fdb, int64_t id, int *sp);
- `fdb' specifies the fixed-length database object.
- `id' specifies the ID number. It should be more than 0. If it is `FDBIDMIN', the minimum ID number of existing records is specified. If it is `FDBIDMAX', the maximum ID number of existing records is specified.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the region of the value of the corresponding record. `NULL' is returned if no record corresponds.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcfdbget2' is used in order to retrieve a record with a decimal key in a fixed-length database object.
void *tcfdbget2(TCFDB *fdb, const void *kbuf, int ksiz, int *sp);
- `fdb' specifies the fixed-length database object.
- `kbuf' specifies the pointer to the region of the decimal key. It should be more than 0. If it is "min", the minimum ID number of existing records is specified. If it is "max", the maximum ID number of existing records is specified.
- `ksiz' specifies the size of the region of the key.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the region of the value of the corresponding record. `NULL' is returned if no record corresponds.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcfdbget3' is used in order to retrieve a string record with a decimal key in a fixed-length database object.
char *tcfdbget3(TCFDB *fdb, const char *kstr);
- `fdb' specifies the fixed-length database object.
- `kstr' specifies the string of the decimal key. It should be more than 0. If it is "min", the minimum ID number of existing records is specified. If it is "max", the maximum ID number of existing records is specified.
- If successful, the return value is the string of the value of the corresponding record. `NULL' is returned if no record corresponds.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcfdbget4' is used in order to retrieve a record in a fixed-length database object and write the value into a buffer.
int tcfdbget4(TCFDB *fdb, int64_t id, void *vbuf, int max);
- `fdb' specifies the fixed-length database object.
- `id' specifies the ID number. It should be more than 0. If it is `FDBIDMIN', the minimum ID number of existing records is specified. If it is `FDBIDMAX', the maximum ID number of existing records is specified.
- `vbuf' specifies the pointer to the buffer into which the value of the corresponding record is written.
- `max' specifies the size of the buffer.
- If successful, the return value is the size of the written data, else, it is -1. -1 is returned if no record corresponds to the specified key.
- Note that an additional zero code is not appended at the end of the region of the writing buffer.
The function `tcfdbvsiz' is used in order to get the size of the value of a record in a fixed-length database object.
int tcfdbvsiz(TCFDB *fdb, int64_t id);
- `fdb' specifies the fixed-length database object.
- `id' specifies the ID number. It should be more than 0. If it is `FDBIDMIN', the minimum ID number of existing records is specified. If it is `FDBIDMAX', the maximum ID number of existing records is specified.
- If successful, the return value is the size of the value of the corresponding record, else, it is -1.
The function `tcfdbvsiz2' is used in order to get the size of the value with a decimal key in a fixed-length database object.
int tcfdbvsiz2(TCFDB *fdb, const void *kbuf, int ksiz);
- `fdb' specifies the fixed-length database object.
- `kbuf' specifies the pointer to the region of the decimal key. It should be more than 0. If it is "min", the minimum ID number of existing records is specified. If it is "max", the maximum ID number of existing records is specified.
- `ksiz' specifies the size of the region of the key.
- If successful, the return value is the size of the value of the corresponding record, else, it is -1.
The function `tcfdbvsiz3' is used in order to get the size of the string value with a decimal key in a fixed-length database object.
int tcfdbvsiz3(TCFDB *fdb, const char *kstr);
- `fdb' specifies the fixed-length database object.
- `kstr' specifies the string of the decimal key. It should be more than 0. If it is "min", the minimum ID number of existing records is specified. If it is "max", the maximum ID number of existing records is specified.
- If successful, the return value is the size of the value of the corresponding record, else, it is -1.
The function `tcfdbiterinit' is used in order to initialize the iterator of a fixed-length database object.
bool tcfdbiterinit(TCFDB *fdb);
- `fdb' specifies the fixed-length database object.
- If successful, the return value is true, else, it is false.
- The iterator is used in order to access the key of every record stored in a database.
The function `tcfdbiternext' is used in order to get the next ID number of the iterator of a fixed-length database object.
uint64_t tcfdbiternext(TCFDB *fdb);
- `fdb' specifies the fixed-length database object.
- If successful, the return value is the next ID number of the iterator, else, it is 0. 0 is returned when no record remains to be fetched from the iterator.
- It is possible to access every record by calling this function repeatedly. It is allowed to update or remove records whose keys have been fetched during the iteration. This traversal visits records in ascending order of the ID number.
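The iteration contract above (ascending ID numbers, 0 as the end marker, and removal of already fetched records being harmless) can be modeled in a few lines. This is a self-contained sketch; `ModelDb', `model_iterinit', and `model_iternext' are hypothetical names that only mimic the documented behavior of `tcfdbiterinit' and `tcfdbiternext'.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_CAPACITY 16

/* Tiny stand-in for a database: a presence flag per ID (hypothetical type). */
typedef struct {
  bool present[MODEL_CAPACITY + 1];  /* index 0 is unused: IDs start at 1 */
  uint64_t iter;                     /* next ID to examine; 0 = not initialized */
} ModelDb;

/* Start the traversal at the smallest possible ID. */
void model_iterinit(ModelDb *db){
  db->iter = 1;
}

/* Return the next existing ID in ascending order, or 0 when none remains. */
uint64_t model_iternext(ModelDb *db){
  if(db->iter == 0) return 0;
  while(db->iter <= MODEL_CAPACITY){
    uint64_t id = db->iter++;
    if(db->present[id]) return id;
  }
  return 0;
}
```

A caller would typically loop with `while((id = model_iternext(&db)) > 0)', and the same loop shape applies to `tcfdbiternext'.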
The function `tcfdbiternext2' is used in order to get the next decimal key of the iterator of a fixed-length database object.
void *tcfdbiternext2(TCFDB *fdb, int *sp);
- `fdb' specifies the fixed-length database object.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the region of the next decimal key, else, it is `NULL'. `NULL' is returned when no record remains to be fetched from the iterator.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use. It is possible to access every record by calling this function repeatedly. It is allowed to update or remove records whose keys have been fetched during the iteration. This traversal visits records in ascending order of the ID number.
The function `tcfdbiternext3' is used in order to get the next decimal key string of the iterator of a fixed-length database object.
char *tcfdbiternext3(TCFDB *fdb);
- `fdb' specifies the fixed-length database object.
- If successful, the return value is the string of the next decimal key, else, it is `NULL'. `NULL' is returned when no record remains to be fetched from the iterator.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use. It is possible to access every record by calling this function repeatedly. It is allowed to update or remove records whose keys have been fetched during the iteration. This traversal visits records in ascending order of the ID number.
The function `tcfdbrange' is used in order to get range matching ID numbers in a fixed-length database object.
uint64_t *tcfdbrange(TCFDB *fdb, int64_t lower, int64_t upper, int max, int *np);
- `fdb' specifies the fixed-length database object.
- `lower' specifies the lower limit of the range. If it is `FDBIDMIN', the minimum ID is specified.
- `upper' specifies the upper limit of the range. If it is `FDBIDMAX', the maximum ID is specified.
- `max' specifies the maximum number of keys to be fetched. If it is negative, no limit is specified.
- `np' specifies the pointer to the variable into which the number of elements of the return value is assigned.
- If successful, the return value is the pointer to an array of ID numbers of the corresponding records. `NULL' is returned on failure. If no key corresponds, the call still succeeds and returns an empty array.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcfdbrange2' is used in order to get range matching decimal keys in a fixed-length database object.
TCLIST *tcfdbrange2(TCFDB *fdb, const void *lbuf, int lsiz, const void *ubuf, int usiz, int max);
- `fdb' specifies the fixed-length database object.
- `lbuf' specifies the pointer to the region of the lower key. If it is "min", the minimum ID number of existing records is specified.
- `lsiz' specifies the size of the region of the lower key.
- `ubuf' specifies the pointer to the region of the upper key. If it is "max", the maximum ID number of existing records is specified.
- `usiz' specifies the size of the region of the upper key.
- `max' specifies the maximum number of keys to be fetched. If it is negative, no limit is specified.
- The return value is a list object of the corresponding decimal keys. This function never fails; it returns an empty list even if no key corresponds.
- Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use. Note that this function may be very slow because every key in the database is scanned.
The function `tcfdbrange3' is used in order to get range matching decimal keys with strings in a fixed-length database object.
TCLIST *tcfdbrange3(TCFDB *fdb, const char *lstr, const char *ustr, int max);
- `fdb' specifies the fixed-length database object.
- `lstr' specifies the string of the lower key. If it is "min", the minimum ID number of existing records is specified.
- `ustr' specifies the string of the upper key. If it is "max", the maximum ID number of existing records is specified.
- `max' specifies the maximum number of keys to be fetched. If it is negative, no limit is specified.
- The return value is a list object of the corresponding decimal keys. This function never fails; it returns an empty list even if no key corresponds.
- Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use. Note that this function may be very slow because every key in the database is scanned.
The function `tcfdbrange4' is used in order to get keys with an interval notation in a fixed-length database object.
TCLIST *tcfdbrange4(TCFDB *fdb, const void *ibuf, int isiz, int max);
- `fdb' specifies the fixed-length database object.
- `ibuf' specifies the pointer to the region of the interval notation.
- `isiz' specifies the size of the region of the interval notation.
- `max' specifies the maximum number of keys to be fetched. If it is negative, no limit is specified.
- The return value is a list object of the corresponding decimal keys. This function never fails; it returns an empty list even if no key corresponds.
- Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use. Note that this function may be very slow because every key in the database is scanned.
The function `tcfdbrange5' is used in order to get keys with an interval notation string in a fixed-length database object.
TCLIST *tcfdbrange5(TCFDB *fdb, const void *istr, int max);
- `fdb' specifies the fixed-length database object.
- `istr' specifies the pointer to the region of the interval notation string.
- `max' specifies the maximum number of keys to be fetched. If it is negative, no limit is specified.
- The return value is a list object of the corresponding decimal keys. This function never fails; it returns an empty list even if no key corresponds.
- Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use. Note that this function may be very slow because every key in the database is scanned.
The function `tcfdbaddint' is used in order to add an integer to a record in a fixed-length database object.
int tcfdbaddint(TCFDB *fdb, int64_t id, int num);
- `fdb' specifies the fixed-length database object connected as a writer.
- `id' specifies the ID number. It should be more than 0. If it is `FDBIDMIN', the minimum ID number of existing records is specified. If it is `FDBIDPREV', the number less by one than the minimum ID number of existing records is specified. If it is `FDBIDMAX', the maximum ID number of existing records is specified. If it is `FDBIDNEXT', the number greater by one than the maximum ID number of existing records is specified.
- `num' specifies the additional value.
- If successful, the return value is the summation value, else, it is `INT_MIN'.
- If the corresponding record exists, the value is treated as an integer and is added to. If no record corresponds, a new record of the additional value is stored.
The function `tcfdbadddouble' is used in order to add a real number to a record in a fixed-length database object.
double tcfdbadddouble(TCFDB *fdb, int64_t id, double num);
- `fdb' specifies the fixed-length database object connected as a writer.
- `id' specifies the ID number. It should be more than 0. If it is `FDBIDMIN', the minimum ID number of existing records is specified. If it is `FDBIDPREV', the number less by one than the minimum ID number of existing records is specified. If it is `FDBIDMAX', the maximum ID number of existing records is specified. If it is `FDBIDNEXT', the number greater by one than the maximum ID number of existing records is specified.
- `num' specifies the additional value.
- If successful, the return value is the summation value, else, it is `NAN'.
- If the corresponding record exists, the value is treated as a real number and is added to. If no record corresponds, a new record of the additional value is stored.
The function `tcfdbsync' is used in order to synchronize updated contents of a fixed-length database object with the file and the device.
bool tcfdbsync(TCFDB *fdb);
- `fdb' specifies the fixed-length database object connected as a writer.
- If successful, the return value is true, else, it is false.
- This function is useful when another process connects to the same database file.
The function `tcfdboptimize' is used in order to optimize the file of a fixed-length database object.
bool tcfdboptimize(TCFDB *fdb, int32_t width, int64_t limsiz);
- `fdb' specifies the fixed-length database object connected as a writer.
- `width' specifies the width of the value of each record. If it is not more than 0, the current setting is not changed.
- `limsiz' specifies the limit size of the database file. If it is not more than 0, the current setting is not changed.
- If successful, the return value is true, else, it is false.
The function `tcfdbvanish' is used in order to remove all records of a fixed-length database object.
bool tcfdbvanish(TCFDB *fdb);
- `fdb' specifies the fixed-length database object connected as a writer.
- If successful, the return value is true, else, it is false.
The function `tcfdbcopy' is used in order to copy the database file of a fixed-length database object.
bool tcfdbcopy(TCFDB *fdb, const char *path);
- `fdb' specifies the fixed-length database object.
- `path' specifies the path of the destination file. If it begins with `@', the trailing substring is executed as a command line.
- If successful, the return value is true, else, it is false. False is returned if the executed command returns non-zero code.
- The database file is assured to be kept synchronized and not modified while the copying or executing operation is in progress. So, this function is useful to create a backup file of the database file.
The function `tcfdbpath' is used in order to get the file path of a fixed-length database object.
const char *tcfdbpath(TCFDB *fdb);
- `fdb' specifies the fixed-length database object.
- The return value is the path of the database file or `NULL' if the object does not connect to any database file.
The function `tcfdbrnum' is used in order to get the number of records of a fixed-length database object.
uint64_t tcfdbrnum(TCFDB *fdb);
- `fdb' specifies the fixed-length database object.
- The return value is the number of records or 0 if the object does not connect to any database file.
The function `tcfdbfsiz' is used in order to get the size of the database file of a fixed-length database object.
uint64_t tcfdbfsiz(TCFDB *fdb);
- `fdb' specifies the fixed-length database object.
- The return value is the size of the database file or 0 if the object does not connect to any database file.
Example Code
The following code is an example that uses a fixed-length database.
#include <tcutil.h>
#include <tcfdb.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

int main(int argc, char **argv){
  TCFDB *fdb;
  int ecode;
  char *key, *value;

  /* create the object */
  fdb = tcfdbnew();

  /* open the database */
  if(!tcfdbopen(fdb, "casket.fdb", FDBOWRITER | FDBOCREAT)){
    ecode = tcfdbecode(fdb);
    fprintf(stderr, "open error: %s\n", tcfdberrmsg(ecode));
  }

  /* store records */
  if(!tcfdbput3(fdb, "1", "one") ||
     !tcfdbput3(fdb, "12", "twelve") ||
     !tcfdbput3(fdb, "144", "one forty four")){
    ecode = tcfdbecode(fdb);
    fprintf(stderr, "put error: %s\n", tcfdberrmsg(ecode));
  }

  /* retrieve records */
  value = tcfdbget3(fdb, "1");
  if(value){
    printf("%s\n", value);
    free(value);
  } else {
    ecode = tcfdbecode(fdb);
    fprintf(stderr, "get error: %s\n", tcfdberrmsg(ecode));
  }

  /* traverse records */
  tcfdbiterinit(fdb);
  while((key = tcfdbiternext3(fdb)) != NULL){
    value = tcfdbget3(fdb, key);
    if(value){
      printf("%s:%s\n", key, value);
      free(value);
    }
    free(key);
  }

  /* close the database */
  if(!tcfdbclose(fdb)){
    ecode = tcfdbecode(fdb);
    fprintf(stderr, "close error: %s\n", tcfdberrmsg(ecode));
  }

  /* delete the object */
  tcfdbdel(fdb);

  return 0;
}
CLI
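To use the fixed-length database API easily, the commands `tcftest', `tcfmttest', and `tcfmgr' are provided as command line interfaces.
The command `tcftest' is a utility for functional and performance testing of the fixed-length database API. It is used in the following formats. `path' specifies the path of a database file, `rnum' specifies the number of iterations, `width' specifies the width of the value of each record, and `limsiz' specifies the limit size of the database file.
tcftest write [-mt] [-nl|-nb] [-rnd] path rnum [width [limsiz]]
- Store records whose 8-byte keys and values change as `00000001', `00000002', and so on.
tcftest read [-mt] [-nl|-nb] [-wb] [-rnd] path
- Retrieve all records of the database generated above.
tcftest remove [-mt] [-nl|-nb] [-rnd] path
- Remove all records of the database generated above.
tcftest rcat [-mt] [-nl|-nb] [-pn num] [-dai|-dad|-rl] path rnum [width [limsiz]]
- Store records with partly duplicated keys, using concatenate mode.
tcftest misc [-mt] [-nl|-nb] path rnum
- Perform a combination test of various operations.
tcftest wicked [-mt] [-nl|-nb] path rnum
- Perform updating operations selected at random.
The options have the following functions.
- `-mt' : call the function `tcfdbsetmutex'.
- `-nl' : enable the option `FDBNOLCK'.
- `-nb' : enable the option `FDBLCKNB'.
- `-rnd' : select keys at random.
- `-wb' : use the function `tcfdbget3' instead of `tcfdbget'.
- `-pn num' : specify the number of patterns.
- `-dai' : use the function `tcfdbaddint' instead of `tcfdbputcat'.
- `-dad' : use the function `tcfdbadddouble' instead of `tcfdbputcat'.
- `-rl' : set the length of values at random.
This command returns 0 on success and another value on failure.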
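The command `tcfmttest' is a utility for functional testing of the fixed-length database API in a multi-thread environment. It is used in the following formats. `path' specifies the path of a database file, `tnum' specifies the number of threads, `rnum' specifies the number of iterations, `width' specifies the width of the value of each record, and `limsiz' specifies the limit size of the database file.
tcfmttest write [-nl|-nb] [-rnd] path tnum rnum [width [limsiz]]
- Store records whose 8-byte keys and values change as `00000001', `00000002', and so on.
tcfmttest read [-nl|-nb] [-wb] [-rnd] path tnum
- Retrieve all records of the database generated above.
tcfmttest remove [-nl|-nb] [-rnd] path tnum
- Remove all records of the database generated above.
tcfmttest wicked [-nl|-nb] [-nc] path tnum rnum
- Perform updating operations selected at random.
tcfmttest typical [-nl|-nb] [-nc] [-rr num] path tnum rnum [width [limsiz]]
- Perform typical operations selected at random.
The options have the following functions.
- `-nl' : enable the option `FDBNOLCK'.
- `-nb' : enable the option `FDBLCKNB'.
- `-rnd' : select keys at random.
- `-wb' : use the function `tcfdbget3' instead of `tcfdbget'.
- `-nc' : omit the comparison test.
- `-rr num' : specify the ratio of reading operations as a percentage.
This command returns 0 on success and another value on failure.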
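The command `tcfmgr' is a utility for testing and debugging the fixed-length database API and its applications. It is used in the following formats. `path' specifies the path of a database file, `width' specifies the width of the value of each record, `limsiz' specifies the limit size of the database file, `key' specifies the key of a record, `value' specifies the value of a record, and `file' specifies the input file.
tcfmgr create path [width [limsiz]]
- Create a database file.
tcfmgr inform [-nl|-nb] path
- Print miscellaneous information about the database.
tcfmgr put [-nl|-nb] [-sx] [-dk|-dc] path key value
- Store a record.
tcfmgr out [-nl|-nb] [-sx] path key
- Remove a record.
tcfmgr get [-nl|-nb] [-sx] [-px] [-pz] path key
- Print the value of a record to the standard output.
tcfmgr list [-nl|-nb] [-m num] [-pv] [-px] [-rb lkey ukey] [-ri str] path
- Print the keys of all records to the standard output, separated by line feeds.
tcfmgr optimize [-tz] [-nl|-nb] path [width [limsiz]]
- Optimize the database.
tcfmgr importtsv [-nl|-nb] [-sc] path [file]
- Store records, treating each line of a TSV file as a key and a value.
tcfmgr version
- Print the version information of Tokyo Cabinet to the standard output.
The options have the following functions.
- `-nl' : enable the option `FDBNOLCK'.
- `-nb' : enable the option `FDBLCKNB'.
- `-sx' : give the input as a hexadecimal string.
- `-dk' : use the function `tcfdbputkeep' instead of `tcfdbput'.
- `-dc' : use the function `tcfdbputcat' instead of `tcfdbput'.
- `-px' : print the output as a hexadecimal string.
- `-pz' : do not append a line feed at the end of the output.
- `-m num' : specify the maximum number of outputs.
- `-pv' : print the values of records also.
- `-rb lkey ukey' : specify the range of keys to process.
- `-ri str' : specify the range to process in interval notation.
- `-sc' : normalize keys to lower case.
This command returns 0 on success and another value on failure.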
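Abstract Database API
The abstract database is a database that abstracts the on-memory hash database, the on-memory tree database, the hash database, the B+ tree database, and the fixed-length database behind the same API. The abstract database API is the API for handling it. See `tcadb.h' for the complete specification of the API.
Overview
To use the abstract database API, include `tcutil.h', `tcadb.h', and the related standard header files. Usually, the following lines are placed near the top of a source file.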
#include <tcutil.h>
#include <tcadb.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
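A pointer to the `TCADB' type is used as the object for handling an abstract database. An abstract database object is created with the function `tcadbnew' and deleted with the function `tcadbdel'. Be sure to delete every object you create when it is no longer in use. Otherwise, a memory leak occurs.
Before storing or retrieving records, the abstract database object must be connected to a concrete database. The function `tcadbopen' opens a concrete database and connects to it, and the function `tcadbclose' disconnects from it and closes the file. Be sure to close every concrete database you open. Otherwise, the concrete database may be broken or stored data may be lost.
API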
The function `tcadbnew' is used in order to create an abstract database object.
TCADB *tcadbnew(void);
- The return value is the new abstract database object.
The function `tcadbdel' is used in order to delete an abstract database object.
void tcadbdel(TCADB *adb);
- `adb' specifies the abstract database object.
The function `tcadbopen' is used in order to open an abstract database.
bool tcadbopen(TCADB *adb, const char *name);
- `adb' specifies the abstract database object.
- `name' specifies the name of the database. If it is "*", the database will be an on-memory hash database. If it is "+", the database will be an on-memory tree database. If its suffix is ".tch", the database will be a hash database. If its suffix is ".tcb", the database will be a B+ tree database. If its suffix is ".tcf", the database will be a fixed-length database. Otherwise, this function fails. Tuning parameters can trail the name, separated by "#". Each parameter is composed of the name and the value, separated by "=". On-memory hash database supports "bnum", "capnum", and "capsiz". On-memory tree database supports "capnum" and "capsiz". Hash database supports "mode", "bnum", "apow", "fpow", "opts", "rcnum", and "xmsiz". B+ tree database supports "mode", "lmemb", "nmemb", "bnum", "apow", "fpow", "opts", "lcnum", "ncnum", and "xmsiz". Fixed-length database supports "mode", "width", and "limsiz". "capnum" specifies the maximum number of records. "capsiz" specifies the maximum size of memory to use. Records exceeding the capacity are removed in the order in which they were stored. "mode" can contain "w" for writer, "r" for reader, "c" for creating, "t" for truncating, "e" for no locking, and "f" for non-blocking lock. The default mode is equivalent to "wc". "opts" can contain "l" for the large option, "d" for the Deflate option, "b" for the BZIP2 option, and "t" for the TCBS option. For example, "casket.tch#bnum=1000000#opts=ld" means that the name of the database file is "casket.tch", the bucket number is 1000000, and the options are large and Deflate.
- If successful, the return value is true, else, it is false.
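To make the naming rule of `tcadbopen' concrete, here is a small, self-contained sketch that splits such a name into the file name and its tuning parameters. It does not call Tokyo Cabinet and is not meant to show how the library parses names internally; the type `DBNAME', the constant `MAXPARAMS', and the function `parse_dbname' are hypothetical names made up for this illustration.

```c
#include <stdio.h>
#include <string.h>

#define MAXPARAMS 16             /* hypothetical limit for this sketch */

typedef struct {                 /* hypothetical result type */
  char path[256];                /* the part before the first "#" */
  int nparams;                   /* number of "name=value" pairs found */
  char keys[MAXPARAMS][32];
  char values[MAXPARAMS][64];
} DBNAME;

/* Splits `name' at each "#" and each parameter at its first "=".
   Returns the number of parameters, or -1 if `name' is too long. */
int parse_dbname(const char *name, DBNAME *out){
  char buf[512];
  if(strlen(name) >= sizeof(buf)) return -1;
  strcpy(buf, name);
  char *rest = strchr(buf, '#');
  if(rest) *rest++ = '\0';
  snprintf(out->path, sizeof(out->path), "%s", buf);
  out->nparams = 0;
  while(rest && out->nparams < MAXPARAMS){
    char *next = strchr(rest, '#');
    if(next) *next++ = '\0';
    char *eq = strchr(rest, '=');
    if(eq){                      /* parameters without "=" are skipped */
      *eq++ = '\0';
      snprintf(out->keys[out->nparams], sizeof(out->keys[0]), "%s", rest);
      snprintf(out->values[out->nparams], sizeof(out->values[0]), "%s", eq);
      out->nparams++;
    }
    rest = next;
  }
  return out->nparams;
}
```

For the name "casket.tch#bnum=1000000#opts=ld", this yields the path "casket.tch" and the two pairs "bnum" = "1000000" and "opts" = "ld", which matches the reading given in the description of `name'.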
The function `tcadbclose' is used in order to close an abstract database object.
bool tcadbclose(TCADB *adb);
- `adb' specifies the abstract database object.
- If successful, the return value is true, else, it is false.
- Update of a database is assured to be written when the database is closed. If a writer opens a database but does not close it appropriately, the database will be broken.
The function `tcadbput' is used in order to store a record into an abstract database object.
bool tcadbput(TCADB *adb, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `adb' specifies the abstract database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, it is overwritten.
The function `tcadbput2' is used in order to store a string record into an abstract database object.
bool tcadbput2(TCADB *adb, const char *kstr, const char *vstr);
- `adb' specifies the abstract database object.
- `kstr' specifies the string of the key.
- `vstr' specifies the string of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, it is overwritten.
The function `tcadbputkeep' is used in order to store a new record into an abstract database object.
bool tcadbputkeep(TCADB *adb, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `adb' specifies the abstract database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, this function has no effect.
The function `tcadbputkeep2' is used in order to store a new string record into an abstract database object.
bool tcadbputkeep2(TCADB *adb, const char *kstr, const char *vstr);
- `adb' specifies the abstract database object.
- `kstr' specifies the string of the key.
- `vstr' specifies the string of the value.
- If successful, the return value is true, else, it is false.
- If a record with the same key exists in the database, this function has no effect.
The function `tcadbputcat' is used in order to concatenate a value at the end of the existing record in an abstract database object.
bool tcadbputcat(TCADB *adb, const void *kbuf, int ksiz, const void *vbuf, int vsiz);
- `adb' specifies the abstract database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `vbuf' specifies the pointer to the region of the value.
- `vsiz' specifies the size of the region of the value.
- If successful, the return value is true, else, it is false.
- If there is no corresponding record, a new record is created.
The function `tcadbputcat2' is used in order to concatenate a string value at the end of the existing record in an abstract database object.
bool tcadbputcat2(TCADB *adb, const char *kstr, const char *vstr);
- `adb' specifies the abstract database object.
- `kstr' specifies the string of the key.
- `vstr' specifies the string of the value.
- If successful, the return value is true, else, it is false.
- If there is no corresponding record, a new record is created.
The function `tcadbout' is used in order to remove a record of an abstract database object.
bool tcadbout(TCADB *adb, const void *kbuf, int ksiz);
- `adb' specifies the abstract database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- If successful, the return value is true, else, it is false.
The function `tcadbout2' is used in order to remove a string record of an abstract database object.
bool tcadbout2(TCADB *adb, const char *kstr);
- `adb' specifies the abstract database object.
- `kstr' specifies the string of the key.
- If successful, the return value is true, else, it is false.
The function `tcadbget' is used in order to retrieve a record in an abstract database object.
void *tcadbget(TCADB *adb, const void *kbuf, int ksiz, int *sp);
- `adb' specifies the abstract database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the region of the value of the corresponding record. `NULL' is returned if no record corresponds.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcadbget2' is used in order to retrieve a string record in an abstract database object.
char *tcadbget2(TCADB *adb, const char *kstr);
- `adb' specifies the abstract database object.
- `kstr' specifies the string of the key.
- If successful, the return value is the string of the value of the corresponding record. `NULL' is returned if no record corresponds.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
The function `tcadbvsiz' is used in order to get the size of the value of a record in an abstract database object.
int tcadbvsiz(TCADB *adb, const void *kbuf, int ksiz);
- `adb' specifies the abstract database object.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- If successful, the return value is the size of the value of the corresponding record, else, it is -1.
The function `tcadbvsiz2' is used in order to get the size of the value of a string record in an abstract database object.
int tcadbvsiz2(TCADB *adb, const char *kstr);
- `adb' specifies the abstract database object.
- `kstr' specifies the string of the key.
- If successful, the return value is the size of the value of the corresponding record, else, it is -1.
The function `tcadbiterinit' is used in order to initialize the iterator of an abstract database object.
bool tcadbiterinit(TCADB *adb);
- `adb' specifies the abstract database object.
- If successful, the return value is true, else, it is false.
- The iterator is used in order to access the key of every record stored in a database.
The function `tcadbiternext' is used in order to get the next key of the iterator of an abstract database object.
void *tcadbiternext(TCADB *adb, int *sp);
- `adb' specifies the abstract database object.
- `sp' specifies the pointer to the variable into which the size of the region of the return value is assigned.
- If successful, the return value is the pointer to the region of the next key, else, it is `NULL'. `NULL' is returned when no record is left to be fetched from the iterator.
- Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use. Every record can be accessed by calling this function repeatedly. Records whose keys have been fetched may be updated or removed during the iteration. However, visiting every record is not assured if the database is updated during the iteration. Besides, the order of this traversal is arbitrary, so it is not assured that it matches the order in which records were stored.
The function `tcadbiternext2' is used in order to get the next key string of the iterator of an abstract database object.
char *tcadbiternext2(TCADB *adb);
- `adb' specifies the abstract database object.
- If successful, the return value is the string of the next key, else, it is `NULL'. `NULL' is returned when no record is left to be fetched from the iterator.
- Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use. Every record can be accessed by calling this function repeatedly. However, visiting every record is not assured if the database is updated during the iteration. Besides, the order of this traversal is arbitrary, so it is not assured that it matches the order in which records were stored.
The function `tcadbfwmkeys' is used in order to get forward matching keys in an abstract database object.
TCLIST *tcadbfwmkeys(TCADB *adb, const void *pbuf, int psiz, int max);
- `adb' specifies the abstract database object.
- `pbuf' specifies the pointer to the region of the prefix.
- `psiz' specifies the size of the region of the prefix.
- `max' specifies the maximum number of keys to be fetched. If it is negative, no limit is specified.
- The return value is a list object of the corresponding keys. This function never fails; it returns an empty list even if no key corresponds.
- Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use. Note that this function may be very slow because every key in the database is scanned.
The function `tcadbfwmkeys2' is used in order to get forward matching string keys in an abstract database object.
TCLIST *tcadbfwmkeys2(TCADB *adb, const char *pstr, int max);
- `adb' specifies the abstract database object.
- `pstr' specifies the string of the prefix.
- `max' specifies the maximum number of keys to be fetched. If it is negative, no limit is specified.
- The return value is a list object of the corresponding keys. This function never fails; it returns an empty list even if no key corresponds.
- Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use. Note that this function may be very slow because every key in the database is scanned.
The function `tcadbaddint' is used in order to add an integer to a record in an abstract database object.
int tcadbaddint(TCADB *adb, const void *kbuf, int ksiz, int num);
- `adb' specifies the abstract database object connected as a writer.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `num' specifies the additional value.
- If successful, the return value is the summation value, else, it is `INT_MIN'.
- If the corresponding record exists, the value is treated as an integer and is added to. If no record corresponds, a new record of the additional value is stored.
The function `tcadbadddouble' is used in order to add a real number to a record in an abstract database object.
double tcadbadddouble(TCADB *adb, const void *kbuf, int ksiz, double num);
- `adb' specifies the abstract database object connected as a writer.
- `kbuf' specifies the pointer to the region of the key.
- `ksiz' specifies the size of the region of the key.
- `num' specifies the additional value.
- If successful, the return value is the summation value, else, it is `NAN'.
- If the corresponding record exists, the value is treated as a real number and is added to. If no record corresponds, a new record of the additional value is stored.
The function `tcadbsync' is used in order to synchronize updated contents of an abstract database object with the file and the device.
bool tcadbsync(TCADB *adb);
- `adb' specifies the abstract database object.
- If successful, the return value is true, else, it is false.
The function `tcadbvanish' is used in order to remove all records of an abstract database object.
bool tcadbvanish(TCADB *adb);
- `adb' specifies the abstract database object.
- If successful, the return value is true, else, it is false.
The function `tcadbcopy' is used in order to copy the database file of an abstract database object.
bool tcadbcopy(TCADB *adb, const char *path);
- `adb' specifies the abstract database object.
- `path' specifies the path of the destination file. If it begins with `@', the trailing substring is executed as a command line.
- If successful, the return value is true, else, it is false. False is returned if the executed command returns non-zero code.
- The database file is assured to be kept synchronized and not modified while the copying or executing operation is in progress. So, this function is useful to create a backup file of the database file.
The function `tcadbrnum' is used in order to get the number of records of an abstract database object.
uint64_t tcadbrnum(TCADB *adb);
- `adb' specifies the abstract database object.
- The return value is the number of records or 0 if the object does not connect to any database instance.
The function `tcadbsize' is used in order to get the size of the database of an abstract database object.
uint64_t tcadbsize(TCADB *adb);
- `adb' specifies the abstract database object.
- The return value is the size of the database or 0 if the object does not connect to any database instance.
The function `tcadbmisc' is used in order to call a versatile function for miscellaneous operations of an abstract database object.
TCLIST *tcadbmisc(TCADB *adb, const char *name, const TCLIST *args);
- `adb' specifies the abstract database object.
- `name' specifies the name of the function.
- `args' specifies a list object containing arguments.
- If successful, the return value is a list object of the result. `NULL' is returned on failure.
- All databases support "putlist", "outlist", and "getlist". "putlist" is to store records. It receives keys and values one after the other, and returns an empty list. "outlist" is to remove records. It receives keys, and returns an empty list. "getlist" is to retrieve records. It receives keys, and returns keys and values of corresponding records one after the other. Because the object of the return value is created with the function `tclistnew', it should be deleted with the function `tclistdel' when it is no longer in use.
Example Code
The following code is an example that uses an abstract database.
#include <tcutil.h>
#include <tcadb.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

int main(int argc, char **argv){
  TCADB *adb;
  char *key, *value;

  /* create the object */
  adb = tcadbnew();

  /* open the database */
  if(!tcadbopen(adb, "casket.tch")){
    fprintf(stderr, "open error\n");
  }

  /* store records */
  if(!tcadbput2(adb, "foo", "hop") ||
     !tcadbput2(adb, "bar", "step") ||
     !tcadbput2(adb, "baz", "jump")){
    fprintf(stderr, "put error\n");
  }

  /* retrieve records */
  value = tcadbget2(adb, "foo");
  if(value){
    printf("%s\n", value);
    free(value);
  } else {
    fprintf(stderr, "get error\n");
  }

  /* traverse records */
  tcadbiterinit(adb);
  while((key = tcadbiternext2(adb)) != NULL){
    value = tcadbget2(adb, key);
    if(value){
      printf("%s:%s\n", key, value);
      free(value);
    }
    free(key);
  }

  /* close the database */
  if(!tcadbclose(adb)){
    fprintf(stderr, "close error\n");
  }

  /* delete the object */
  tcadbdel(adb);

  return 0;
}
CLI
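To use the abstract database API easily, the commands `tcatest' and `tcamgr' are provided as command line interfaces.
The command `tcatest' is a utility for functional and performance testing of the abstract database API. It is used in the following formats. `name' specifies the name of a database, `rnum' specifies the number of iterations, and `tnum' specifies the number of transactions.
tcatest write name rnum
- Store records whose 8-byte keys and values change as `00000001', `00000002', and so on.
tcatest read name
- Retrieve all records of the database generated above.
tcatest remove name
- Remove all records of the database generated above.
tcatest rcat name rnum
- Store records with partly duplicated keys, using concatenate mode.
tcatest misc name rnum
- Perform a combination test of various operations.
tcatest wicked name rnum
- Perform updating operations selected at random.
tcatest compare name tnum rnum
- Perform a comparison test of the various databases.
This command returns 0 on success and another value on failure.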
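The command `tcamgr' is a utility for testing and debugging the abstract database API and its applications. It is used in the following format. `name' specifies the name of the database, `key' specifies the key of a record, `value' specifies the value of a record, `func' specifies the name of a function, and `arg' specifies the arguments of the function.
tcamgr create name
- Create a database.
tcamgr inform name
- Print miscellaneous information about the database.
tcamgr put [-sx] [-dk|-dc] name key value
- Store a record.
tcamgr out [-sx] name key
- Remove a record.
tcamgr get [-sx] [-px] [-pz] name key
- Print the value of a record to the standard output.
tcamgr list [-m num] [-pv] [-px] [-fm str] name
- Print the keys of all records, separated by line feeds, to the standard output.
tcamgr misc [-sx] [-px] name func [arg...]
- Call the versatile function for miscellaneous operations.
tcamgr version
- Print the version information of Tokyo Cabinet to the standard output.
The options have the following meanings.
-sx
: Read the input as a hexadecimal string.
-dk
: Use the function `tcadbputkeep' instead of `tcadbput'.
-dc
: Use the function `tcadbputcat' instead of `tcadbput'.
-px
: Write the output as a hexadecimal string.
-pz
: Do not append a line feed to the end of the output.
-m num
: Specify the maximum number of items to output.
-pv
: Print the values of records as well.
-fm str
: Specify the prefix of keys.
This command returns 0 on success and another value on failure.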
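CGI
To use the abstract database API easily, the script `tcawmgr.cgi' is provided as a Common Gateway Interface.
The CGI script `tcawmgr.cgi' is a utility for browsing and editing the contents of an abstract database through a Web interface. The target database must be placed in the current directory of the CGI script under the name "casket.tch", "casket.tcb", or "casket.tcf". Its permissions must also allow the user that runs the CGI script to read and write it. Once the CGI script is installed in a public directory of the Web server, you can start using it by accessing the assigned URL with a Web browser.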
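Tips and Tricks
This section introduces tips and handy tricks for using Tokyo Cabinet.
Utility API
In high-level languages such as C++, Perl, Ruby, and Java, data structures such as lists and maps are almost always available through the standard library. C has no equivalent. You could use non-standard libraries such as GNOME GLib or Apache APR, but Tokyo Cabinet also comes with feature-rich, high-performance utilities. TCXSTR corresponds to string in the STL (the C++ Standard Template Library), TCLIST corresponds to list, and TCMAP and TCTREE correspond to map and set. Utilities for string processing and various encodings are also provided. Once you master them, programming in C can be as intuitive as in C++ or other high-level languages.
The most convenient feature of TCXSTR is `tcxstrcat'. It is especially useful for buffering, because you can keep appending data to the end. The memory region is expanded internally as needed, so the application need not worry about memory management, and performance is quite good.
TCLIST is a list implemented with an array. It can be used as a stack (store with `tclistpush' and fetch with `tclistpop') or as a queue (store with `tclistpush' and fetch with `tclistshift'). Memory management is of course handled internally, and performance is quite good.
TCMAP is a map (associative array) implemented with a hash table. It stores arbitrary values associated with arbitrary keys. You can think of it as an on-memory version of the hash database. A notable feature is that the iterator of TCMAP retrieves records in the order they were stored, and any record can be moved to the head or the tail, so it can also serve as a cache with LRU eviction. Memory management is of course handled internally, and performance is quite good.
TCTREE is a map (associative array) implemented with an ordered tree. It stores arbitrary values associated with arbitrary keys. You can think of it as an on-memory version of the B+ tree database. A notable feature is that the iterator of TCTREE retrieves records in ascending order of the comparison function and can be moved to an arbitrary position, so it supports forward-matching searches of strings and range searches of numbers. Memory management is of course handled internally, and performance is quite good.
The functions of TCXSTR, TCLIST, TCMAP, and TCTREE are reentrant, but when an object is shared among threads the application must perform mutual exclusion. For the hash map and the ordered tree, however, TCMDB and TCNDB are provided as implementations that perform mutual exclusion internally.
There is also TCMPOOL. It is an implementation of a so-called memory pool, which saves effort by managing memory in bulk. For example, a region allocated with `malloc' must always be released with `free' or it leaks, but a region allocated with `tcmpoolmalloc' need not be released explicitly. It is released when the memory pool itself is deleted. In other words, the application only has to mind the lifetime of the memory pool, not the lifetime of each object. A memory pool can also create TCXSTR, TCLIST, TCMAP, and TCTREE objects, and can register arbitrary objects together with their destructors. Typical usage looks like this.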
TCMPOOL *mpool;
int i, j;
char *buf;
for(i = 0; i < 100; i++){
  mpool = tcmpoolnew();
  for(j = 0; j < 100; j++){
    buf = tcmpoolmalloc(mpool, 10);  // create an object in the memory pool
    ...                              // no need to release it individually
  }
  tcmpooldel(mpool);                 // everything is released here at once
}
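Tuning of the Hash Database
Tuning changes the performance of database operations dramatically, so it is essential for any serious use case. It is done with the function `tchdbtune'. This function takes the "bucket number", the "alignment power", the "free block pool power", and the "options".
The most important setting is the bucket number. It should be several times (two to four times is recommended) the final number of records to be stored. The default is 131071, so set it first if you will store 100000 or more records. For example, if you expect about one million records, a bucket number of two to four million is appropriate. Each element of the bucket array is 4 bytes (32 bits), so a bucket number of two million increases the file size by 8MB and requires 8MB of memory, which is no big deal for a 21st-century computer. When in doubt, make the bucket number larger.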
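The sizing rule above can be sketched as plain arithmetic. The helper names `suggest_bnum' and `bucket_array_bytes' are hypothetical and are not part of the Tokyo Cabinet API. The sketch only assumes what the text states: a factor of two to four, and 4 bytes per bucket element (8 bytes with the `HDBTLARGE' option).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: suggested bucket number, that is, the expected
   final record count multiplied by a factor between 2 and 4. */
int64_t suggest_bnum(int64_t expected_records, int factor){
  assert(expected_records >= 0 && factor >= 2 && factor <= 4);
  return expected_records * factor;
}

/* Hypothetical helper: size in bytes of the bucket array. Each element
   takes 4 bytes normally and 8 bytes when the large option is set. */
int64_t bucket_array_bytes(int64_t bnum, int large){
  assert(bnum >= 0);
  return bnum * (large ? 8 : 4);
}
```

For one million records and a factor of 2, this gives two million buckets and a bucket array of 8,000,000 bytes, which matches the 8MB figure in the text.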
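Alignment is a mechanism that aligns the start position of each record. The start address is aligned to a multiple of 1 shifted left by the specified alignment power. The default is 4. For example, with an alignment power of 8, start positions are aligned to multiples of 1<<8, that is, 256. Alignment has three advantages. First, aligning start addresses leaves padding (gaps) between records. If a change in record size fits within the padding, the record need not be moved on update. Second, records can be read and written in units of file system blocks, which makes I/O at the OS level more efficient. Third, start addresses can be stored as quotients of the alignment, which widens the range that a 4-byte bucket can express. Without alignment, only database files of up to 2GB (1<<31) can be handled, but with an alignment of 256, for example, files of up to 2GB*256, or 512GB, can be handled.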
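The alignment arithmetic described above can be written out as follows. This is a minimal sketch with hypothetical function names. It illustrates the calculation only and is not how the library itself is implemented.

```c
#include <assert.h>
#include <stdint.h>

/* Alignment in bytes for a given alignment power: 1 << apow. */
uint64_t alignment_of(unsigned apow){
  assert(apow < 32);
  return (uint64_t)1 << apow;
}

/* Round an offset up to the next multiple of the alignment. */
uint64_t align_up(uint64_t off, unsigned apow){
  uint64_t a = alignment_of(apow);
  return (off + a - 1) & ~(a - 1);
}

/* Padding left behind a record of `size' bytes that starts at an
   aligned position. */
uint64_t padding_after(uint64_t size, unsigned apow){
  return align_up(size, apow) - size;
}

/* Largest file size reachable through a 4-byte bucket element:
   2GB (1 << 31) multiplied by the alignment. */
uint64_t max_file_size(unsigned apow){
  assert(apow < 32);
  return ((uint64_t)1 << 31) << apow;
}
```

With an alignment power of 8, a 1000-byte record occupies 1024 bytes including 24 bytes of padding, and `max_file_size(8)' evaluates to 549755813888 bytes, that is, 512GB.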
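A free block is an unused region in the file produced by updates. The free block pool is the mechanism that manages and reuses such regions. The capacity of the free block pool is 1 shifted left by the specified free block pool power. The default is 10. There is almost never a need to change this setting.
The options are a set of flags that specify how records are stored. They are specified as the bitwise OR of `HDBTLARGE', `HDBTDEFLATE', `HDBTBZIP', `HDBTTCBS', and `HDBTEXCODEC'. `HDBTLARGE' makes each element of the bucket array 8 bytes (64 bits). The bucket array doubles in size, but the upper limit of the database size rises to 8EB. `HDBTDEFLATE' compresses each record with the Deflate algorithm before storing it. It is advantageous for large records (roughly 256 bytes or more). `HDBTBZIP' compresses each record with the BZIP2 algorithm. It is slower than Deflate but achieves a better compression ratio. `HDBTTCBS' compresses each record with BWT, MTF, and Elias gamma coding. It is advantageous for small records (less than 256 bytes). `HDBTEXCODEC' is the option for using an external compression and decompression algorithm. The concrete algorithm is specified with the hidden API function `tchdbsetcodecfunc'.
Tuning parameters must be set before the database is created. They are recorded in the database as metadata, so they need not be specified after creation. Note that the tuning of an existing database cannot be changed (except by optimizing it). To create a database with a bucket number of 1000000, an alignment power of 12 (4096), the default free block pool power, and the options `HDBTLARGE' and `HDBTDEFLATE', the code looks like this.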
TCHDB *hdb;
hdb = tchdbnew();
tchdbtune(hdb, 1000000, 12, -1, HDBTLARGE | HDBTDEFLATE);
tchdbopen(hdb, "casket.hdb", HDBOWRITER | HDBOCREAT);
...
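The hash database has a cache mechanism. It keeps records that have been retrieved in memory, which improves performance when the same records are retrieved repeatedly. When a cached record is updated it is removed from the cache, so the cache has little effect when updates are more frequent than retrievals. In addition, enabling the cache adds the overhead of managing it, so processing actually becomes slower unless the hit ratio is above a certain level. Therefore, use the cache only when the hit ratio is quite high (that is, when the same records are referred to many times). The cache of the hash database is disabled by default; enable it with the function `tchdbsetcache'. Cache parameters must be set before connecting to the database, and must be set again on every connection.
The hash database has a mechanism called extra mapped memory for performing file I/O through mmap. Separately from the bucket array, which is mapped with mmap by default, it maps the record region into memory with mmap. File I/O through mmap is faster than I/O with pread and pwrite, and also performs better under concurrency. On the other hand, a region of the size specified as extra mapped memory is reserved in the virtual memory space as soon as the database is opened, and swapping occurs if that size exceeds the available physical memory. By default 64MB of extra mapped memory is used. If the expected database file is larger than that but smaller than physical memory, specify an extra mapped memory size slightly larger than the database size. Set the size with the function `tchdbsetxmsiz'. The parameters of extra mapped memory must be set before connecting to the database, and must be set again on every connection.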
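Tuning of the B+ Tree Database
Tuning changes performance dramatically for the B+ tree database as well. Tune it properly for any serious use case. Tuning is done with the function `tcbdbtune'. This function takes the "number of members in each leaf", the "number of members in each non-leaf", the "bucket number", the "alignment power", the "free block pool power", and the "options".
A leaf, or leaf page, is a terminal node of the B+ tree and is the storage unit that holds a list of keys and values of multiple records. The number of members in each leaf specifies how many records one leaf holds. The default is 128. If records are mostly stored or searched in the order of the comparison function, a larger value improves performance; if they are mostly stored or searched regardless of that order, a smaller value is better. A non-leaf, or non-leaf page, is a node other than a terminal node of the B+ tree and is the storage unit that holds only the keys of multiple records. Non-leaves are far fewer than leaves and have little effect on performance. There is almost never a need to change the number of members in each non-leaf from the default.
The bucket number and the other parameters are passed as they are to the hash database underlying the B+ tree database. Each page of the B+ tree is stored as a record of the hash database, which is where parameters such as the bucket number take effect. Therefore, the bucket number specified here is best set to several times the final number of records in the B+ tree database divided by the number of members in each leaf. That said, there is rarely a need to change parameters such as the bucket number for the B+ tree database.
As a tuning example, consider storing one million records with keys of 8 bytes and values of 32 bytes on average. The overhead of each record, such as its header, is about 5 bytes. Assume the block size of the file system is 4096 bytes. Then the number of records that fit in one block is 4096/(8+32+5), or about 90. Further, suppose the Deflate compression option is enabled and the compression ratio is about 50%. Then about 180 records are expected to fit in one block. The size of each leaf should preferably be two or three blocks, so 360, twice 180, is the ideal number of members in each leaf. The bucket number then works out to 1000000/360, or 2777, so there is no need to change it from the default of 32749. The alignment power is set to log2(4096), or 12, to match the block size of the file system. Reflecting these settings in code gives the following.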
TCBDB *bdb;
bdb = tcbdbnew();
tcbdbtune(bdb, 360, -1, -1, 12, -1, BDBTDEFLATE);
tcbdbopen(bdb, "casket.bdb", BDBOWRITER | BDBOCREAT);
...
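The worked example for the B+ tree parameters can be reproduced with a few lines of integer arithmetic. The function names below are hypothetical, and the 50% compression ratio is the assumption made in the text, not a measured value. Exact integer arithmetic gives 91, 182, and 364 where the text rounds to about 90, 180, and 360.

```c
#include <assert.h>
#include <stdint.h>

/* Records that fit in one block, given the average key size, value size,
   and per-record overhead, plus an assumed compression ratio in percent
   (100 means no compression, 50 means the data shrinks to half). */
int records_per_block(int block_size, int ksiz, int vsiz, int overhead,
                      int ratio_percent){
  assert(block_size > 0 && ksiz + vsiz + overhead > 0);
  assert(ratio_percent > 0 && ratio_percent <= 100);
  int raw = block_size / (ksiz + vsiz + overhead);
  return raw * 100 / ratio_percent;
}

/* Number of members in each leaf so that one leaf spans the given
   number of blocks. */
int leaf_members(int recs_per_block, int blocks_per_leaf){
  assert(recs_per_block > 0 && blocks_per_leaf > 0);
  return recs_per_block * blocks_per_leaf;
}

/* Expected number of leaf pages for a given record count. */
int64_t expected_leaves(int64_t rnum, int lmemb){
  assert(rnum >= 0 && lmemb > 0);
  return rnum / lmemb;
}

/* Alignment power matching a block size: the smallest p with
   (1 << p) >= block_size. */
int apow_for_block(int block_size){
  assert(block_size > 0);
  int p = 0;
  while((1 << p) < block_size) p++;
  return p;
}
```

The expected number of leaves (2777 for one million records and 360 members per leaf) is what should be compared against the bucket number, and `apow_for_block(4096)' yields the alignment power of 12 used in the code above.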
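The B+ tree database also has a cache mechanism. It keeps the pages being processed in memory, which improves performance when the same pages are read and written repeatedly. A cached page stays in memory even when it is updated, so both retrievals and updates become faster. The cache of the B+ tree database is rather small by default; if you want more speed at the cost of more memory, set it with the function `tcbdbsetcache'. Cache parameters must be set before connecting to the database, and must be set again on every connection.
The B+ tree database can also use extra mapped memory. However, it is disabled by default because the cache of the B+ tree already acts as a buffer for file I/O. Enable extra mapped memory with the function `tcbdbsetxmsiz' only when you want to pursue throughput regardless of memory efficiency. The parameters of extra mapped memory must be set before connecting to the database, and must be set again on every connection.
Multi-Thread Support
Every function in the Tokyo Cabinet API is reentrant, so operations can run fully in parallel as long as the data passed as arguments is separate for each thread. However, a database object has internal state, so when one database object is shared among threads, mutual exclusion is required around updating operations. This is not difficult. Just call the function `tchdbsetmutex' or `tcbdbsetmutex' on the shared database object right after creating it. After that, subsequent operations perform mutual exclusion with appropriate locks internally. When multiple threads each access a separate database object, mutual exclusion is unnecessary, and operations run faster without it.
Mutual exclusion between threads is performed with read-write locks. Operations such as `open', `close', `put', and `out' take a write lock (exclusive lock), and operations such as `get', `curkey', and `curval' take a read lock (shared lock). The unit of locking is the record in the hash database and the whole database in the B+ tree database. Any number of threads can read under the same lock at the same time, but other threads are blocked while a write is in progress. The mutual exclusion setting must be made before connecting to the database, and must be made again on every connection. The code looks like this.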
TCHDB *hdb;
hdb = tchdbnew();
tchdbsetmutex(hdb);
tchdbopen(hdb, "casket.hdb", HDBOWRITER);
...
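Transaction
The hash database and the B+ tree database have a transaction mechanism. A series of operations performed after a transaction begins can be made permanent by committing it or discarded by aborting it. Even if the application crashes during a transaction, only the operations in the transaction are discarded and the consistency of the database is preserved. Transactions are used as in the following code.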
tchdbtranbegin(hdb);
do_something();
if(is_all_ok){
tchdbtrancommit(hdb);
} else {
tchdbtranabort(hdb);
}
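Only one thread can run a transaction at a time; other threads are blocked meanwhile. Therefore, if the database is referred to only within transactions, the isolation level is serializable. However, while one thread is in a transaction, other threads can still refer to the database without running a transaction. In that case the isolation level is read uncommitted. Choose according to the situation.
The transaction mechanism is implemented with write ahead logging on a file in the hash database and with shadow paging in memory in the B+ tree database. These techniques together with locking ensure the ACID properties (atomicity, consistency, isolation, durability) at the level of the database.
If you do not trust even the durability of the file system (that is, if you want to raise the chance of surviving a sudden power loss), specify the `HDBOTSYNC' or `BDBOTSYNC' option when opening the database. Then updated contents are synchronized with the disk by fsync before and after every transaction, although this is extremely slow. Even so, transactions cannot help once the disk itself is broken, so apply backup and redundancy techniques to important databases.
Cursor
The B+ tree database has a cursor mechanism. A cursor can jump to the position of a specified key, and from there it can be moved forward or backward one record at a time while referring to or updating records. For example, to perform a forward-matching search of strings, jump the cursor with the prefix as the key, step forward while checking each key, and stop at the first key that does not match the prefix. To retrieve the records whose keys begin with "tokyo", the code would look like this.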
cur = tcbdbcurnew(bdb);
tcbdbcurjump2(cur, "tokyo");
while((key = tcbdbcurkey2(cur)) != NULL){
  if(!tcstrfwm(key, "tokyo")){
    free(key);
    break;
  }
  if((val = tcbdbcurval2(cur)) != NULL){
    do_something(key, val);
    free(val);
  }
  free(key);
  tcbdbcurnext(cur);
}
tcbdbcurdel(cur);
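If another thread updates the same database after a cursor has jumped, the position of the cursor may shift. Specifically, if a record is inserted before the cursor on the leaf where the cursor is, the cursor shifts by one toward the smaller side. If a record before the cursor on that leaf is removed, the cursor shifts by one toward the larger side. Therefore, non-critical operations such as searches need no special care, but when a cursor is used for updates, either use a transaction so that the cursor does not shift during processing, or perform mutual exclusion on the application side. Note that the functions `tcbdbrange' and `tcbdbfwmkeys' are provided to perform range searches, the typical search operation, atomically.
Backup
A database file can be backed up with commands such as cp, tar, and cpio, just like an ordinary file. However, if a process connected as a writer is updating the database, the source file may be in an intermediate state and the copy may be inconsistent. Therefore, make sure the database is not being updated before backing it up.
This procedure is not practical when a resident process such as a daemon stays connected to the database. In that case, the resident process can drive the backup itself. Calling the function `tchdbcopy' or `tcbdbcopy' synchronizes updated contents with the database file and copies the file while it is in that state.
The backup functions can also invoke an arbitrary command. If a command name beginning with "@" is specified instead of the destination file name, that command is called. The first argument of the command is the database name and the second is the current UNIX time in microseconds. For example, prepare a shell script like the following and have it called.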
#! /bin/sh
srcpath="$1"
destpath="$1.$2"
rm -f "$destpath"
cp -f "$srcpath" "$destpath"
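Updates to the database are blocked while the backup command is running, so take care if copying takes a long time. If you want a hot backup without downtime, use the snapshot feature of the file system (LVM) instead of a plain file copy with "cp" or the like.
Comparison of the Hash Database and the B+ Tree Database
You may know that you want to store key/value pairs but be unsure whether to use the hash database or the B+ tree database. If exact matching is the only search condition you need, try the hash database. If you want to refer to records in order, try the B+ tree database. If records need only be kept in memory and not written to a file, try the hash map of the utility API.
The hash database is the usual choice when the search condition is exact matching, but the B+ tree can perform exact-match searches too. For a large database that does not fit in the I/O cache of the file system, it is important to choose the database type with the performance characteristics of each in mind.
The most important point for performance is that the cache of the hash database works per record while the cache of the B+ tree database works per page. In the B+ tree database, all records are arranged in ascending order of their keys, and records close in order are grouped into pages. Caching and I/O are performed per page. Therefore, referring to records that are close in order hits the cache and completes without I/O, which is efficient. It follows that when storing many records, sorting them in ascending key order before storing them minimizes the number of I/O operations and gives the best time and space efficiency. This requires the application layer to have its own cache, but that should not be impossible for those who seek the best. This is exactly the secret behind the fast indexing of the full-text search system Hyper Estraier.
Conversely, when the order of access to the database cannot be controlled, the hash database has the advantage over the B+ tree database. When data does not fit in the cache, the hash database uses less memory and needs less computation to retrieve each record. Also, when records are loaded in bulk while building a hash database, the asynchronous mode can achieve update performance exceeding that of the B+ tree database, because new records are written at the end of the file, which allows a cache specialized for the tail of the file.
Abstract Database
If you want to decide at run time whether to use the hash database or the B+ tree database, use the abstract database API. It is a common interface to the hash database API and the B+ tree database API, and the concrete type of database is specified by the database name given when opening it with the function `tcadbopen'. The name of a hash database has the suffix ".tch" and the name of a B+ tree database has the suffix ".tcb". Tuning parameters are specified after the name in the form "name=value", separated by "#". An example is "casket.tch#bnum=1000000#apow=10". Numeric values may be followed by binary prefixes such as "k", "m", and "g". If the suffix of the database name is ".tcf", the database is a fixed-length database. It is the most efficient choice for managing fixed-length data keyed by sequential ID numbers.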
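The naming convention above lends itself to building the database name at run time. The following is a minimal sketch using only the standard C library. `build_adb_name' and `adb_kind' are hypothetical helpers written for illustration, not functions of the abstract database API, and only the `bnum' and `apow' parameters from the example in the text are handled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build a name such as "casket.tch#bnum=1000000#apow=10".
   Returns the length of the name, or -1 if the buffer is too small. */
int build_adb_name(char *buf, size_t size, const char *path,
                   long long bnum, int apow){
  assert(buf != NULL && path != NULL);
  int len = snprintf(buf, size, "%s#bnum=%lld#apow=%d", path, bnum, apow);
  return (len < 0 || (size_t)len >= size) ? -1 : len;
}

/* Tell which kind of database a name selects, judging from the suffix
   of the path part that precedes any "#" parameters. */
const char *adb_kind(const char *name){
  assert(name != NULL);
  size_t len = strcspn(name, "#");  /* length of the path part */
  if(len < 4) return "unknown";
  const char *suffix = name + len - 4;
  if(strncmp(suffix, ".tch", 4) == 0) return "hash";
  if(strncmp(suffix, ".tcb", 4) == 0) return "btree";
  if(strncmp(suffix, ".tcf", 4) == 0) return "fixed";
  return "unknown";
}
```

The resulting string would then be passed as the second argument of `tcadbopen'. Keeping the tuning parameters in the name means the type and tuning of the database can come from a configuration file without recompiling the application.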
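The abstract database API can also be used as an on-memory hash database or an on-memory tree database. The database name "*" gives an on-memory hash database and "+" gives an on-memory tree database. To use them as a cache, specify something like "*#capsiz=100m". The capacity of the cache is limited to 100MB, and when it is exceeded, records are removed automatically starting from those stored earliest. As for choosing between the two, use the former for performance and the latter for memory efficiency or forward-matching searches.
Remote Interface
When a database is shared among various applications, or when a Web application or the like runs multiple processes in parallel, the file locking of Tokyo Cabinet may get in the way. Tokyo Cabinet alone is also not enough when the database must be accessed from multiple machines.
These problems are solved by running a server dedicated to managing the database as a separate process and having application processes connect to it through network sockets. Such a database server and the library for connecting to it are provided as a separate package, "Tokyo Tyrant". The server of Tokyo Tyrant handles the abstract database, so every type of Tokyo Cabinet database can be operated through the remote interface.
Bindings for Languages Other than C
The author of Tokyo Cabinet develops and maintains the language bindings for Perl, Ruby, Java, and Lua. Bindings for other languages are hoped to be provided by third parties. At present, Tokyo Cabinet seems to be usable at least from implementations of Python, PHP, Scheme, Lisp, and Erlang as well.
For the convenience of users, symbol names and usage of the API should be kept as similar as possible in languages other than C. For this purpose, `tokyocabinet.idl' is provided. It defines the common interface in IDL, so follow it as closely as possible when designing a new language binding. For features not defined in the IDL, follow the conventions of each language as closely as possible. The same applies to the structure of the package, such as installation procedures and documentation.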
File Format
This section describes the specification of the database file formats.
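File Format of the Hash Database
The content of a database file managed by the hash database is divided into four parts: the header section, the bucket section, the free block pool section, and the record section. Numbers recorded in the file are stored as fixed-length numbers or variable-length numbers. The former are serialized in little-endian order into a region of a fixed size. The latter are serialized into a variable-length region in the delta encoding of base 128.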
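To make the two number formats concrete, here is a minimal sketch of a little-endian fixed-length encoding and a generic base-128 variable-length encoding. The function names are hypothetical. The fixed-length part follows directly from the description above. For the variable-length part, the convention of marking continuation with the top bit of each byte is an assumption chosen for illustration, and the exact byte layout written by Tokyo Cabinet may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit fixed-length number in little-endian order. */
void put_le32(unsigned char *buf, uint32_t num){
  assert(buf != NULL);
  for(int i = 0; i < 4; i++) buf[i] = (unsigned char)(num >> (8 * i));
}

/* Load a 32-bit fixed-length number stored in little-endian order. */
uint32_t get_le32(const unsigned char *buf){
  assert(buf != NULL);
  uint32_t num = 0;
  for(int i = 0; i < 4; i++) num |= (uint32_t)buf[i] << (8 * i);
  return num;
}

/* Store a number as base-128 digits, lowest digit first. The top bit of
   a byte is set when more digits follow (assumed convention).
   Returns the number of bytes written, at most 10. */
size_t put_vnum(unsigned char *buf, uint64_t num){
  assert(buf != NULL);
  size_t len = 0;
  do {
    unsigned char digit = (unsigned char)(num & 0x7f);
    num >>= 7;
    buf[len++] = num ? (unsigned char)(digit | 0x80) : digit;
  } while(num);
  return len;
}

/* Load a number written by put_vnum. Returns the number of bytes read. */
size_t get_vnum(const unsigned char *buf, uint64_t *num){
  assert(buf != NULL && num != NULL);
  size_t len = 0;
  int shift = 0;
  *num = 0;
  for(;;){
    assert(shift < 64);
    unsigned char b = buf[len++];
    *num |= (uint64_t)(b & 0x7f) << shift;
    if(!(b & 0x80)) break;
    shift += 7;
  }
  return len;
}
```

The point of the variable-length form is that small numbers, which dominate in sizes and offset differences, take a single byte, while larger ones grow by one byte per 7 bits.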
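The header section takes a fixed length of 256 bytes from the beginning of the file and holds the following information.
Name | Offset | Length | Description
magic number | 0 | 32 | identification of the database file. Begins with "ToKyO CaBiNeT"
database type | 32 | 1 | hash table (0x01) or B+ tree (0x02)
additional flags | 33 | 1 | logical OR of left open (1<<0) and fatal error (1<<1)
alignment power | 34 | 1 | the alignment as a power of 2
free block pool power | 35 | 1 | the number of elements of the free block pool as a power of 2
options | 36 | 1 | logical OR of large mode (1<<0), Deflate mode (1<<1), BZIP2 mode (1<<2), TCBS mode (1<<3), and external codec mode (1<<4)
bucket number | 40 | 8 | the number of elements of the bucket array
record number | 48 | 8 | the number of records stored
file size | 56 | 8 | the size of the database file
first record | 64 | 8 | the offset of the first record
opaque region | 128 | 128 | a region that users can use freely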
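The header layout above can be read with a small parser like the following. This is a sketch under the stated layout only: the struct `HashHeader' and the function `parse_hash_header' are hypothetical names, the caller is assumed to have already read the first 256 bytes of the file into a buffer, and multi-byte fields are decoded as little-endian fixed-length numbers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fields of the header section, named after the table above. */
typedef struct {
  uint8_t type;    /* database type, offset 32 */
  uint8_t flags;   /* additional flags, offset 33 */
  uint8_t apow;    /* alignment power, offset 34 */
  uint8_t fpow;    /* free block pool power, offset 35 */
  uint8_t opts;    /* options, offset 36 */
  uint64_t bnum;   /* bucket number, offset 40 */
  uint64_t rnum;   /* record number, offset 48 */
  uint64_t fsiz;   /* file size, offset 56 */
  uint64_t frec;   /* offset of the first record, offset 64 */
} HashHeader;

/* Decode an 8-byte little-endian number. */
static uint64_t read_le64(const unsigned char *p){
  uint64_t num = 0;
  for(int i = 0; i < 8; i++) num |= (uint64_t)p[i] << (8 * i);
  return num;
}

/* Parse a 256-byte header buffer. Returns 1 on success, or 0 if the
   buffer does not begin with the magic string. */
int parse_hash_header(const unsigned char *buf, HashHeader *hd){
  static const char magic[] = "ToKyO CaBiNeT";
  assert(buf != NULL && hd != NULL);
  if(memcmp(buf, magic, sizeof(magic) - 1) != 0) return 0;
  hd->type = buf[32];
  hd->flags = buf[33];
  hd->apow = buf[34];
  hd->fpow = buf[35];
  hd->opts = buf[36];
  hd->bnum = read_le64(buf + 40);
  hd->rnum = read_le64(buf + 48);
  hd->fsiz = read_le64(buf + 56);
  hd->frec = read_le64(buf + 64);
  return 1;
}
```

A tool built on this could, for instance, print the bucket number and record number of a file to check whether the ratio recommended in the tuning section still holds.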
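The bucket section follows the header section, with a size determined by the number of elements of the bucket array. Each element holds the offset of the first element of a hash chain. Each element is a fixed-length number of 4 bytes in the normal mode and 8 bytes in the large mode. Offsets are stored as quotients of the alignment.
The free block pool section follows the bucket section, with a size determined by the number of elements of the free block pool. Each element holds the offset and the length of an unused region. The offset is converted to a quotient of the alignment and then stored as the difference from the value of the previous element. The offset and the size are treated as variable-length numbers.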
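The record section follows the free block pool section and extends to the end of the file. It holds elements carrying the following information for each record. The region of each record always begins at an aligned position.
Name | Offset | Length | Description
magic number | 0 | 1 | used for identification and consistency checks. Fixed to 0xC8
hash value | 1 | 1 | hash value used to choose the direction in the chain
left chain | 2 | 4 | alignment quotient of the offset of the left chain destination
right chain | 6 | 4 | alignment quotient of the offset of the right chain destination
padding size | 10 | 2 | the size of the padding
key size | 12 | variable | the size of the key
value size | variable | variable | the size of the value
key | variable | variable | the data of the key
value | variable | variable | the data of the value
padding | variable | variable | padding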
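However, a region that has become a free block holds an element with the following information.
Name | Offset | Length | Description
magic number | 0 | 1 | used for identification and consistency checks. Fixed to 0xB0
block size | 1 | 4 | the size of the block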
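The transaction log is recorded in a file whose name is the database name followed by ".wal". The first 8 bytes of the file hold the size of the database file at the beginning of the transaction, followed by the concatenation of the following elements, each carrying the difference information of an update operation.
Name | Offset | Length | Description
offset | 0 | 8 | the offset of the beginning of the updated region
size | 8 | 4 | the size of the updated region
data | 12 | variable | the data of the region before the update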
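File Format of the B+ Tree Database
All data handled by the B+ tree database is stored in a hash database. The stored data is classified into metadata and logical pages. Logical pages are classified into leaf nodes and non-leaf nodes. The formats of fixed-length and variable-length numbers are the same as in the hash database.
The metadata is placed in the opaque region of the header of the hash database and holds the following information.
Name | Offset | Length | Description
comparison function flag | 0 | 1 | 0x0 if the comparison function is tccmplexical (the default), 0x1 for tccmpdecimal, 0x2 for tccmpint32, 0x3 for tccmpint64, and 0xff otherwise
reserved region | 1 | 7 | not used at present
records per leaf | 8 | 4 | the maximum number of records in each leaf node
indexes per non-leaf | 12 | 4 | the maximum number of indexes in each non-leaf node
root node ID | 16 | 8 | the page ID of the root node of the B+ tree
first leaf ID | 24 | 8 | the ID of the first leaf node
last leaf ID | 32 | 8 | the ID of the last leaf node
leaf number | 40 | 8 | the number of leaf nodes
non-leaf number | 48 | 8 | the number of non-leaf nodes
record number | 56 | 8 | the number of records stored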
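A leaf node holds a list of records, and a non-leaf node holds a sparse index that refers to pages. A record is a logical unit of user data. Logical records with the same key are physically merged into a single record. A physical record is serialized in the following format.
Name | Offset | Length | Description
key size | 0 | variable | the size of the key
value size | variable | variable | the size of the first value
duplication number | variable | variable | the number of values with the duplicated key
key | variable | variable | the data of the key
value | variable | variable | the data of the first value
duplicated records | variable | variable | a list of value sizes and value data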
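A leaf node is the physical unit for storing a set of records. Leaf nodes are identified by ID numbers assigned incrementally from 1. A leaf node is stored in the hash database as a record whose key is the ID number expressed as a hexadecimal string and whose value is as follows. Records are always kept sorted in ascending order of their keys.
Name | Offset | Length | Description
previous leaf | 0 | variable | the ID of the previous leaf node
next leaf | variable | variable | the ID of the next leaf node
record list | variable | variable | the serialized records of the page, concatenated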
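An index is the logical unit of a pointer used to search for a page. An index is serialized in the following format.
Name | Offset | Length | Description
page ID | 0 | variable | the ID of the referred page
key size | variable | variable | the size of the key
key | variable | variable | the data of the key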
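A non-leaf node is the physical unit for storing a set of indexes. Non-leaf nodes are identified by ID numbers assigned incrementally from 281474976710657. A non-leaf node is stored in the hash database as a record whose key is "#" followed by the hexadecimal string of the ID number minus 281474976710657, and whose value is as follows. Indexes are always kept sorted in ascending order.
Name | Offset | Length | Description
heir ID | 0 | variable | the ID of the first child node
index list | variable | variable | the serialized indexes of the page, concatenated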
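The rule for deriving the hash database key of a page from its ID can be sketched as follows. The function name `page_key' is hypothetical. Writing the hexadecimal digits in lowercase and without leading zeros is an assumption, since the text only says that the key is a hexadecimal string.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* The first ID of non-leaf nodes, as stated above. It equals 2^48 + 1. */
#define NODE_ID_BASE UINT64_C(281474976710657)

/* Write the hash database key of a page into buf. Leaf nodes use the
   hexadecimal form of the ID. Non-leaf nodes use "#" followed by the
   hexadecimal form of the ID minus NODE_ID_BASE.
   Returns the key length, or -1 if the buffer is too small. */
int page_key(char *buf, size_t size, uint64_t id){
  assert(buf != NULL && id >= 1);
  int len;
  if(id < NODE_ID_BASE){
    len = snprintf(buf, size, "%" PRIx64, id);
  } else {
    len = snprintf(buf, size, "#%" PRIx64, id - NODE_ID_BASE);
  }
  return (len < 0 || (size_t)len >= size) ? -1 : len;
}
```

Because the two ID ranges never overlap and non-leaf keys carry the "#" prefix, leaf pages and non-leaf pages can share one hash database without their keys colliding.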
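File Format of the Fixed-length Database
The content of a database file managed by the fixed-length database is divided into two parts: the header section and the record section. Numbers recorded in the file are stored as fixed-length numbers in little-endian order.
The header section takes a fixed length of 256 bytes from the beginning of the file and holds the following information.
Name | Offset | Length | Description
magic number | 0 | 32 | identification of the database file. Begins with "ToKyO CaBiNeT"
database type | 32 | 1 | 0x03
additional flags | 33 | 1 | logical OR of left open (1<<0) and fatal error (1<<1)
record number | 48 | 8 | the number of records stored
file size | 56 | 8 | the size of the database file
record width | 64 | 8 | the width of the value of each record
limit size | 72 | 8 | the limit size of the database file
least ID | 80 | 8 | the current minimum of the record IDs
greatest ID | 88 | 8 | the current maximum of the record IDs
opaque region | 128 | 128 | a region that users can use freely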
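The record section follows the header section and extends to the end of the file. It holds elements carrying the following information for each record. The region for the value size takes 1 byte if the record width is 255 or less, 2 bytes if it is 65535 or less, and 4 bytes otherwise. The record length is the sum of the region for the value size and the record width. The region of each record begins at the position given by the record ID minus 1, multiplied by the record length, plus 256.
Name | Offset | Length | Description
value size | 0 | variable | the size of the value
value | variable | variable | the data of the value
padding | variable | variable | padding. When the value size is 0, the first byte is a boolean that indicates whether the record exists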
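The position of a record in a fixed-length database follows directly from the rules above. The following sketch uses hypothetical function names and simply transcribes the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Size of the header section in bytes. */
#define FDB_HEADER_SIZE 256

/* Bytes needed for the value size field: 1 if the record width is 255
   or less, 2 if it is 65535 or less, and 4 otherwise. */
unsigned value_size_width(uint64_t width){
  if(width <= 255) return 1;
  if(width <= 65535) return 2;
  return 4;
}

/* Record length: the value size field plus the record width. */
uint64_t record_length(uint64_t width){
  return value_size_width(width) + width;
}

/* Offset in the file where the record with the given ID begins. */
uint64_t record_offset(uint64_t id, uint64_t width){
  assert(id >= 1);
  return (id - 1) * record_length(width) + FDB_HEADER_SIZE;
}
```

With a record width of 255, each record takes 256 bytes, so the record with ID 1 begins at offset 256 and the record with ID 3 begins at offset 768. Since the offset is a pure function of the ID, no index structure is needed to locate a record.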
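Note
Database files are not sparse, so they can be copied and otherwise handled like ordinary files. Neither hash database files nor B+ tree database files depend on the byte order of the environment, so a database file moved to an environment with a different byte order can be used as it is.
If possible, when distributing a hash database file over a network, set the MIME type to `application/x-tokyocabinet-hash' and the file name suffix to `.tch'. When distributing a B+ tree database file over a network, set the MIME type to `application/x-tokyocabinet-btree' and the file name suffix to `.tcb'.
To make the `file' command recognize the magic data of database files, append the following lines to the `magic' file.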
# Tokyo Cabinet magic data
0 string ToKyO\ CaBiNeT\n Tokyo Cabinet
>14 string x \b (%s)
>32 byte 0 \b, Hash
>32 byte 1 \b, B+ tree
>32 byte 2 \b, Fixed-length
>33 byte &1 \b, [open]
>33 byte &2 \b, [fatal]
>34 byte x \b, apow=%d
>35 byte x \b, fpow=%d
>36 byte &1 \b, [large]
>36 byte &2 \b, [deflate]
>36 byte &4 \b, [tcbs]
>40 lelong x \b, bnum=%d
>48 lelong x \b, rnum=%d
>56 lelong x \b, fsiz=%d
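Frequently Asked Questions
- Q. : Does Tokyo Cabinet support SQL?
- A. : No. Tokyo Cabinet is not an RDBMS (relational database management system). If you want an embedded RDBMS, use SQLite or the like.
- Q. : How does it differ from Berkeley DB?
- A. : Tokyo Cabinet is superior in both time efficiency and space efficiency.
- Q. : Is there good sample code for applications?
- A. : Refer to the source code of the commands of each API. `tchmgr.c' and `tcbmgr.c' are the simplest.
- Q. : My database got broken. Why?
- A. : In most cases, your application did not close the database properly. Whether it is a daemon process or a CGI script, an application must always close the database when it exits. Note also that a CGI process may be killed by SIGPIPE or SIGTERM.
- Q. : How can I make the database harder to break?
- A. : Use transactions. They keep the database from breaking as long as the disk and the file system are not broken.
- Q. : How can I repair a broken database?
- A. : Open the database file with the no-lock option (HDBONOLCK or BDBONOLCK) and run the optimize function (tchdboptimize or tcbdboptimize). To repair from the command line, run "tchmgr optimize -nl casket" or "tcbmgr optimize -nl casket".
- Q. : What system settings bring out the best performance?
- A. : If possible, install RAM at least as large as the database. Then enlarge the I/O buffer and configure the system to flush dirty buffers less frequently. The choice of file system also matters. On Linux, EXT2 is usually the fastest, although EXT3 in writeback mode is sometimes faster. ReiserFS is also quite fast. The other modes of EXT3 are quite slow. For other file systems, experiment for yourself.
- Q. : I get an error when handling a file larger than 2GB. Why?
- A. : On a 32-bit file system, a file larger than 2GB cannot be created without an explicit setting such as LFS. When a 64-bit file system such as XFS or ReiserFS is used on a 32-bit OS, files larger than 2GB can be handled, but Tokyo Cabinet must be built with the `--enable-off64' setting. No special setting is required in a pure 64-bit environment. Also make sure that ulimit or quota does not restrict the file size.
- Q. : Will QDBM no longer be maintained?
- A. : Maintenance continues. No new features are planned, but bugs will be fixed if found.
- Q. : Can it be used on Windows?
- A. : Unfortunately not, and there is no plan to support it for now.
- Q. : Would you change the license to the BSD license or the MIT license?
- A. : No. I see no particular benefit in doing so.
- Q. : Where does the name "Tokyo Cabinet" come from?
- A. : "Tokyo" because it is the city where the author lives, and "cabinet" because it stores things. Calling it "TC" for short is a good idea. Writing it as 「東京キャビネット」 or 「とうきょうきゃびねっと」 is also fine. It has nothing to do with Tokyo Disneyland, Tokyo Love Story, or Tokyo Performance Doll. Writing it joined as "TokyoCabinet" is not recommended except in identifiers.
- Q. : What is your relationship with Chiba prefecture?
- A. : None in particular. I am from Saitama prefecture. I do like peanuts.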
License
Tokyo Cabinet is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License or any later version.
Tokyo Cabinet is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with Tokyo Cabinet (see the file `COPYING'); if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
Tokyo Cabinet was written by Mikio Hirabayashi. You can contact the author by e-mail to `mikio@users.sourceforge.net'.