Boost.MultiArray の添字演算子について

Boost.MultiArray が添え字の境界チェックをしているのかよく分からなかったので、調べてみた。

リファレンスには次のように書かれている。

By default, the array access methods operator() and operator[] perform range checking. If a supplied index is out of the range defined for an array, an assertion will abort the program. To disable range checking (for performance reasons in production releases), define the BOOST_DISABLE_ASSERTS preprocessor macro prior to including multi_array.hpp in an application.

Boost.MultiArray Reference Manual

和訳

明示的な指定がない場合、配列にアクセスするための () および [] 演算子は範囲の確認をおこないます。指定されたインデックスが配列の境界を超える場合、アサーションが発行されプログラムは終了します。（製品のリリース時など、性能上の観点から）境界チェックを無効にしたい場合は、アプリケーションが multi_array.hpp をインクルードする前に BOOST_DISABLE_ASSERTS マクロを定義してください。

境界チェックには BOOST_ASSERT が使われている。これは <boost/assert.hpp> で定義されていて、

BOOST_DISABLE_ASSERTS が定義されている場合：
テストを行わない
NDEBUG と BOOST_ENABLE_ASSERT_DEBUG_HANDLER が同時に定義されている場合：
テストを行わない
BOOST_ENABLE_ASSERT_HANDLER が定義されているか、
NDEBUG が定義されておらず BOOST_ENABLE_ASSERT_DEBUG_HANDLER のみ定義されている場合：
ユーザーが定義した boost::assertion_failed ハンドラを呼び出す
それ以外の場合：
<assert.h> を読み込み標準の assert を呼び出す。

ややこしいのだが、境界チェックを外すためには BOOST_DISABLE_ASSERTS が必要とのこと。 NDEBUG は関係なさそうだ。

実装を確認する

<boost/multi_array.hpp> の実装を調べた結果も載せておく。

boost::multi_array の継承関係は以下のようになっている。（名前空間名は省略）

+ multi_array<T, NumDims, Allocator>
  + multi_array_ref<T, NumDims>
    + const_multi_array_ref<T, NumDims, TPtr = T*>
      + multi_array_impl_base<T, NumDims>
        + value_accessor_generator<T, NumDims>::type
          = choose_value_accessor_one<T>::type
            = value_accessor_one<T>
              + multi_array_base
          = choose_value_accessor_n<T, NumDims>::type
            = value_accessor_n<T, NimDims>
              + multi_array_base

添え字演算子 [] の実装をしているのは value_accessor_one と value_accessor_n である。

boost/multi_array/base.hpp:L108 近辺で BOOST_ASSERT が使われている。テンプレートで実装されているため、次元に関するループはコンパイル時に展開される。また、boost::multi_array では各次元の添字の始点を設定できるため、この値をメモリから読み込む処理が必要になる。

処理の疑似コードは以下の通り。

// <boost/multi_array/base.hpp> より引用・改変

template<size_t NumDims>
const T& get_element(const T* data,
                     const ptrdiff_t* indices,
                     const size_t* extents,
                     const ptrdiff_t* strides,
                     const ptrdiff_t* index_bases)
{
#if !defined(BOOST_DISABLE_ASSERTS)
    assert(0 <= (indices[0] - index_bases[0]));
    assert(static_cast<size_t>(indices[0] - index_bases[0]) < extents[0]);
#endif
    data += (indices[0] - index_bases[0]) * strides[0];

#if !defined(BOOST_DISABLE_ASSERTS)
    assert(0 <= (indices[1] - index_bases[1]));
    assert(static_cast<size_t>(indices[1] - index_bases[1]) < extents[1]);
#endif
    data += (indices[1] - index_bases[1]) * strides[1];
    ...
    data += (indices[NumDims - 1] - index_bases[NumDims - 1])
            * strides[NumDims - 1];

    return *data;
}

また、関数呼び出し演算子 () の実体は multi_array_impl_base::access_element である。こちらはなんと NDEBUG または BOOST_DISABLE_ASSERTS のどちらかが定義されていれば境界チェックをスキップしてしまう。 NDEBUG のみ指定された環境では、添字演算子 [] のみ境界チェックをするということになり、std::vector と逆の動きをする点に注意。

こちらは次元に関して for でループを回している。処理の疑似コードは以下の通り。

// <boost/multi_array/base.hpp> より引用・改変

template<size_t NumDims>
const T& get_element(const T* data,
                     const ptrdiff_t* indices,
                     const size_t* extents,
                     const ptrdiff_t* strides,
                     const ptrdiff_t* index_bases)
{
#if !defined(NDEBUG) && !defined(BOOST_DISABLE_ASSERTS)
    for (size_t i = 0; i < NumDims; ++i)
    {
        assert(0 <= (indices[i] - index_bases[i]));
        assert(static_cast<size_t>(indices[i] - index_bases[i]) < extents[i]);
    }
#endif
    for (size_t i = 0; i < NumDims; ++i)
    {
        data += (indices[i] - index_bases[i]) * strides[i];
    }
    return *data;
}

結論

アサーションがあると分岐ペナルティが発生するため、BOOST_DISABLE_ASSERTS を設定することが望ましい。
次元方向のループ展開はあまりパフォーマンスに影響しないので、添字演算子 [] と関数呼び出し演算子 () の選択はお好みで。
最速を目指す場合はメモリアドレスを手動で計算し、添字の始点の取り扱いを省略する。

実装を確認する

結論

参考文献