Performance improvements and refactoring of Factor class
Created by: jpoeppel
This PR represents work in progress to harvest some low hanging performance fruits in the Factor class (might be extended to include the FactorTree algorithm as well).
So far, the biggest improvement (of nearly a speedup of 2) was achieved by providing a custom copy method to factors, as copy.deepcopy was very slow previously.
Apart from that the multiplication method was changed in order to reduce the amount of copying objects and numpy arrays around even further by trying to use more views into arrays where possible.
Another potential improvement that has been started but is not yet used, is the idea of inplace multiplication/division. Currently a multiplication/division will always create a new factor, requiring additional copies. However, the algorithms currently used (bucket/variable-elimination or factorTree) usually do not require the old factors anymore, meaning that we could save additional copies with inplace operations. Exploring the benefits in this will be the next step. Since I would like to keep the __mul__ operation to allow Factor * Factor notation, we'll have to decide if we want to move the actual work to a private function _multiplication which allows to specify inplace/new factor via optional parameters or if the algorithms are changed to use f1.__mul__(f2, inplace) where required.
On master the examples/performanceTest.py yields these results: Factor Tree took: 4.2510831356 Factor Tree took: 4.17080688477 Factor Tree took: 4.16728687286 Bucket took: 3.3462331295 Bucket took: 3.34677505493 Bucket took: 3.34675097466
On this branch this becomes: Factor Tree took: 2.27126693726 Factor Tree took: 2.18012094498 Factor Tree took: 2.18277311325 Bucket took: 1.22670102119 Bucket took: 1.22501397133 Bucket took: 1.29871082306
With the changes mentioned above. (The "worse" performance of the FactorTree compared to the Bucket-Elimination is due to the fact that the FactorTree times incorporate the creation of the factor trees in each run, as only getting the marginals of a factor tree does not require multiplication)