Architecture Review Meeting

Name: Architecture Review Meeting
Start: 2016-05-18T15:00:00-05:00
End: 2016-05-18T16:00:00-05:00
Location: Fermilab

Wednesday 18 May 2016, 15:00 → 16:00 US/Central

WH7XE "Bullpen" (Fermilab)

Gianluca Petrillo (Fermilab)

Description

Audio connection will be opened with ReadyTalk:
Conference code: 8867778
Phone numbers: https://www.readytalk.com/account-administration/international-numbers

+1 (866) 740-1260 (U.S.A., toll-free)
+1 (303) 248-0285 (U.S.A., toll)

Note: Some notes from the meeting participants have not yet been integrated into this summary.

Date: May 18, 2016
Participants: James Amundson, Herbert Greenlee, James Kowalkowski, Robert Kutschke, Marc Paterno, Gianluca Petrillo, Brian Rebel, Erica Snider, Saba Sehrish

Scope of the meeting

The discussion concerns the use of structures to represent specific types of data in LArSoft.

Members of a broader community have expressed interest, which could lead to a wider forum. This discussion keeps that in mind but focuses on LArSoft only. Its outcome may be of use in that wider forum.

Areas for recommendation

Three areas were proposed that could each yield a specific recommendation. Robert Kutschke suggested adding a fourth.

1. geometry representation: 2D and 3D vectors, transformations
2. physics-related vector quantities (mostly vectors in Minkowski space)
3. linear algebra: vectors, matrices, operations with them
4. libraries facilitating multi-threading and SIMD operations

It was agreed that, while the fourth item is out of the scope of the present recommendation, it needs to be kept in mind.

Merging the first two areas was rejected on the grounds that they are distinct enough that accommodating both with a single library would risk unnecessarily degrading one or both of them. The risk is considerable: the participants could identify only two libraries that explicitly support area 2, both developed by the physics community. This does not preclude a scenario in which the two areas are eventually served by the same library. It is also conceivable to bridge from a 4D vector in a Cartesian metric to one in a Minkowski metric by providing the missing features as specific functions.
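The bridging idea in the last sentence can be illustrated with a minimal sketch. It assumes a generic 4D container (here std::array, with the time-like component stored last) and adds the Minkowski-specific operations as free functions. The names Vector4, cartesianDot, minkowskiDot and invariantMass are hypothetical and do not belong to any of the candidate libraries. The (-, -, -, +) signature and the negative return value for space-like vectors are conventions chosen for this example only.

```cpp
#include <array>
#include <cmath>

// Hypothetical generic 4D vector stored as (x, y, z, t): a plain Cartesian container.
using Vector4 = std::array<double, 4>;

// Ordinary Euclidean (Cartesian) inner product, as a geometry library would provide.
inline double cartesianDot(Vector4 const& a, Vector4 const& b) {
  return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

// Minkowski inner product with signature (-, -, -, +), added as a free function.
inline double minkowskiDot(Vector4 const& a, Vector4 const& b) {
  return a[3] * b[3] - (a[0] * b[0] + a[1] * b[1] + a[2] * b[2]);
}

// Invariant mass of a four-momentum (px, py, pz, E).
// For space-like vectors (negative squared mass) this sketch returns -sqrt(-m2).
inline double invariantMass(Vector4 const& p) {
  double const m2 = minkowskiDot(p, p);
  return (m2 >= 0.0) ? std::sqrt(m2) : -std::sqrt(-m2);
}
```

With this layout the same object can be handed to Cartesian code unchanged, while physics code calls the Minkowski functions explicitly.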

Area 3, linear algebra, should be weighted toward small data structures, as most applications do not go beyond rank 2 (matrices) and dimension 5. Exceptions can still be handled as such, by using a dedicated library for a specific case when the benefit is overwhelming.

List of candidate libraries

An open list of candidate libraries was presented as a starting point for discussion. The possibility of an entirely custom, newly developed library was not considered. It was instead suggested that custom interfaces of small, maintainable size could be developed to fill a usability gap, should such a gap appear in an otherwise excellent library.

Most of the items in the list were quickly dismissed. For instance:

PETSc is designed for a different area
GNU Scientific Library is a failure in terms of both interface and performance
BLAS-based libraries are generally not competitive in terms of performance
Intel Math Kernel Library has a license incompatible with LArSoft

Elemental was not known to the participants but, being based on BLAS, it has the same limitations.

Overall, a few libraries were selected as viable candidates:

CLHEP (not for area 3)
ROOT in the GenVector and SMatrix incarnations
Eigen
Armadillo

The conclusions of an investigation by the ATLAS experiment with a purpose similar to this one were presented at CHEP 2013. That presentation is three years old and contains some outdated information. CLHEP, Intel Math Kernel Library, ROOT and Eigen were compared on representative synthetic benchmarks. Custom implementations of matrix operations, some with explicit vectorisation, were also included. The conclusions are reflected in the choice of ATLAS to replace its CLHEP code with Eigen and its mathematical function library implementation.

Relevant requirements

An open list of features was proposed from which to pick requirements. The discussion selected the following as requirements:

license compatible with LArSoft usage
portable across all LArSoft-supported platforms (Scientific Linux Fermi 6/7 and Darwin 13/14 to date)
actively maintained
allowing object serialisation via ROOT I/O

One specific characterisation of portability is that the binary distribution based on Scientific Linux Fermi 6 should work on all the compatible Linux systems.

Another set of features was considered relevant:

memory overhead
performance, in terms of resource usage (memory, processing time)
ability to convert the structures to a different format with little overhead

Memory overhead is understood as memory used to store redundant data or metadata. Two examples were given. The ROOT TVector3 class, deriving from TObject, carries inherited data members that are not needed to define the content of a 3D vector, including the pointer to the virtual table that enables TObject polymorphism. The C++ standard std::vector allocates its content dynamically, which in common implementations adds three pointers plus a header on the heap. A further example, not discussed at the meeting, is the small-matrix optimisation used by ROOT TMatrix, which always contains 25 elements to avoid dynamic allocation: when the size of the data is known in advance, this approach is always non-optimal.
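The kind of overhead described above can be made visible with sizeof. The following is only a sketch: PolymorphicBase is a hypothetical stand-in for a TObject-like base class (it is not ROOT code, and its two bookkeeping members are an assumption for illustration), and all resulting sizes depend on the compiler and platform.

```cpp
#include <cstddef>
#include <vector>

// Plain 3D vector: exactly three coordinates and nothing else.
struct PlainVector3 {
  double x, y, z;
};

// Hypothetical polymorphic base standing in for a TObject-like class:
// a virtual destructor (hence a virtual table pointer) plus bookkeeping members.
struct PolymorphicBase {
  virtual ~PolymorphicBase() = default;
  unsigned int id = 0;
  unsigned int bits = 0;
};

// The same three coordinates, now carrying the inherited members too.
struct PolymorphicVector3 : PolymorphicBase {
  double x = 0.0, y = 0.0, z = 0.0;
};

// Bytes strictly needed to describe a 3D vector in double precision.
constexpr std::size_t payload = 3 * sizeof(double);

// Bytes spent on anything that is not the payload.
constexpr std::size_t plainOverhead = sizeof(PlainVector3) - payload;
constexpr std::size_t polymorphicOverhead = sizeof(PolymorphicVector3) - payload;

// Size of the std::vector object itself (commonly three pointers);
// the elements and the allocator's heap header come on top of this.
constexpr std::size_t dynamicHandleSize = sizeof(std::vector<double>);
```

Printing these constants on each supported platform gives a quick, if rough, picture of the per-object cost of each design.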

A point was made that memory overhead is often traded for execution speed. For example, storing only the upper triangular part of a symmetric matrix roughly halves its memory footprint, but it might degrade the speed of its operations.
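A minimal sketch of that trade-off follows, assuming a row-major packing of the upper triangle. PackedSymmetricMatrix is a hypothetical class written for this illustration, not the storage scheme of any candidate library. For N = 5 it stores 15 elements instead of 25, at the price of an index computation and a possible swap on every access.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical symmetric N x N matrix storing only its upper triangle.
template <std::size_t N>
class PackedSymmetricMatrix {
public:
  // Number of stored elements: N + (N - 1) + ... + 1.
  static constexpr std::size_t storedElements = N * (N + 1) / 2;

  double& operator()(std::size_t i, std::size_t j) { return data_[index(i, j)]; }
  double operator()(std::size_t i, std::size_t j) const { return data_[index(i, j)]; }

  // Maps (i, j) to a position in the packed storage.
  // This extra arithmetic is the speed cost paid for the smaller footprint.
  static std::size_t index(std::size_t i, std::size_t j) {
    assert(i < N && j < N);
    if (i > j) {  // the lower triangle mirrors the upper one
      std::size_t const tmp = i;
      i = j;
      j = tmp;
    }
    // Rows 0 .. i-1 hold N, N-1, ..., N-i+1 elements: i * (2N - i + 1) / 2 in total.
    return i * (2 * N - i + 1) / 2 + (j - i);
  }

private:
  std::array<double, storedElements> data_{};
};
```

Writing m(1, 3) and reading m(3, 1) touch the same stored element, so symmetry is enforced by construction.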

The conversion of data to a different format is often a necessity when using libraries that do not support our original format. A proposed example is the Fourier transform from the FFTW library, which expects its data to be stored in a contiguous area of double precision real numbers. The directness of such a conversion will typically depend on the internal data representation: the more convoluted that is, the more likely a conversion by copy becomes necessary.
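The difference between a direct hand-over and a conversion by copy can be sketched as follows. The function sumContiguous is a hypothetical stand-in for an external routine that requires contiguous double precision input (it is not an FFTW call), and Sample is an invented data layout used only to show a case where a copy cannot be avoided.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an external routine that requires its input
// as a contiguous block of double precision numbers.
double sumContiguous(double const* data, std::size_t n) {
  double total = 0.0;
  for (std::size_t i = 0; i < n; ++i) total += data[i];
  return total;
}

// Invented data layout whose values are neither double nor contiguous:
// each value is a float interleaved with a channel number.
struct Sample {
  float value;
  int channel;
};

// Conversion by copy: the only option when the internal representation
// does not match what the external routine expects.
std::vector<double> toContiguous(std::vector<Sample> const& samples) {
  std::vector<double> out;
  out.reserve(samples.size());
  for (Sample const& s : samples) out.push_back(static_cast<double>(s.value));
  return out;
}
```

A std::vector<double> can be passed directly as data() and size() with no copy, while a std::vector<Sample> must first go through toContiguous, paying both time and memory for the temporary.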

The ability to use ROOT-serialised classes in an environment other than art (ROOT interactive console, python) is not considered endangered, as techniques to overcome the issues are known and acceptably convenient.

The language of implementation is a moot point given the current selection of candidates, all in C++. Whether the libraries are header-only or not is considered irrelevant. Support for sparse data, although not irrelevant to LArSoft use cases, should not affect the decision.

A further proposed criterion was a judgement of how easy it is to write code with the library. Proper education of the community should mitigate the issue. Nevertheless, history shows that this has seldom happened in LArSoft. Resources should be actively devoted to an education effort proportional to the learning difficulty. Moderate usability barriers can be overcome with additional custom interfaces. This too is a balance between performance improvement, maintenance of the interface, and steepening of the learning curve.

Next steps for an informed decision

The election of a library from the surviving candidates must be informed by tuned performance benchmarks. The suggested path is to:

identify some representative use cases
isolate the components relevant for the present decision and their use pattern
design synthetic benchmarks exercising those components and emulating those patterns
implement the benchmarks with the different candidate libraries, and compare performance
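The last two steps could take a shape like the sketch below. It is only an illustration under stated assumptions: Matrix5, multiply, BenchmarkResult and runBenchmark are hypothetical names, the 5 x 5 product is a placeholder for whatever operation the profiling identifies, and a real benchmark would need warm-up runs, repeated measurements and care about compiler optimisations.

```cpp
#include <array>
#include <chrono>
#include <cstddef>

// Placeholder workload: product of two 5 x 5 matrices in plain std::array storage.
using Matrix5 = std::array<std::array<double, 5>, 5>;

Matrix5 multiply(Matrix5 const& a, Matrix5 const& b) {
  Matrix5 c{};  // value-initialised: all elements start at zero
  for (std::size_t i = 0; i < 5; ++i)
    for (std::size_t j = 0; j < 5; ++j)
      for (std::size_t k = 0; k < 5; ++k)
        c[i][j] += a[i][k] * b[k][j];
  return c;
}

struct BenchmarkResult {
  double secondsPerCall;  // average wall-clock time of one call
  double checksum;        // accumulated results, so the work cannot be discarded
};

// Times `repetitions` calls of `op`, which must return a double.
// Each candidate library would supply its own `op` for the same workload.
template <typename Op>
BenchmarkResult runBenchmark(Op op, std::size_t repetitions) {
  using Clock = std::chrono::steady_clock;
  double checksum = 0.0;
  auto const start = Clock::now();
  for (std::size_t i = 0; i < repetitions; ++i) checksum += op();
  std::chrono::duration<double> const elapsed = Clock::now() - start;
  return {elapsed.count() / static_cast<double>(repetitions), checksum};
}
```

Calling runBenchmark with one callable per candidate library, all performing the same operation on the same input, yields directly comparable per-call timings. The checksum doubles as a check that all implementations compute the same result.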

Representative cases were quickly identified at the meeting. The MicroBooNE detector electronics response and the subsequent reconstruction are good use cases. The detector simulation is potentially dominated by Geant4, over which we have no leverage, and it was therefore discarded.

To isolate the relevant components, a profiling procedure was suggested that counts the calls to relevant objects (vectors, matrices, etc.). The call stack can point to the originating code, from which the usage pattern can be read.
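The counting idea can be illustrated with hand-placed instrumentation. This swaps in a simpler technique than the one suggested: a real profiler would obtain the originating code from the call stack, whereas in this sketch the call site is a label passed explicitly by the caller. CallCounter and its member functions are hypothetical names.

```cpp
#include <map>
#include <string>

// Hypothetical registry counting how often an operation of interest
// is invoked from each labelled call site.
class CallCounter {
public:
  // Records one call originating from `site` (a label chosen by the caller).
  void record(std::string const& site) { ++counts_[site]; }

  // Number of calls recorded for `site`; zero if the site was never seen.
  unsigned long count(std::string const& site) const {
    auto const it = counts_.find(site);
    return (it == counts_.end()) ? 0UL : it->second;
  }

  // Number of distinct call sites seen so far.
  std::size_t sites() const { return counts_.size(); }

private:
  std::map<std::string, unsigned long> counts_;
};
```

Sorting the recorded sites by count points to the code whose usage pattern is worth reproducing in a synthetic benchmark.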
