I have posted here several modified versions of libc functions, optimised to run faster on modern processors (see the Algorithms and Technologies category).
All these functions are implemented in the DataparkSearch Engine. Their performance, especially relative to the standard implementations on a given platform, depends on the microprocessor and on the compiler optimisation level used. I have therefore added a performance test to the configure stage that selects only those implementations which run faster on the target platform. This way DataparkSearch uses the fastest available variant of each function on every platform.
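The selection idea can be illustrated with a small Python sketch. It is not the DataparkSearch configure test itself: the names `pick_faster`, `naive_len` and the candidate labels are hypothetical, and the built-in `len` merely stands in for a "standard" implementation. Each candidate is timed several times and the one with the smallest best time is kept.

```python
import timeit


def pick_faster(candidates, arg, number=50, repeat=5):
    """Return the name of the candidate with the smallest best-of-`repeat` time.

    `candidates` maps a label to a one-argument callable (hypothetical API).
    """
    best_name, best_time = None, float("inf")
    for name, func in candidates.items():
        # Taking the minimum over several repeats damps scheduler noise.
        elapsed = min(
            timeit.repeat(lambda f=func: f(arg), number=number, repeat=repeat)
        )
        if elapsed < best_time:
            best_name, best_time = name, elapsed
    return best_name


def naive_len(s):
    """Deliberately slow stand-in for an alternative implementation."""
    n = 0
    for _ in s:
        n += 1
    return n


candidates = {"standard": len, "alternative": naive_len}
chosen = pick_faster(candidates, "x" * 10_000)
print("selected implementation:", chosen)
```

A real configure-time check would time the compiled functions on the target machine with the same compiler flags as the final build, since that is exactly what the result depends on.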
Also, I've spun off these function implementations as a separate library, libdp, which can be installed on your PC separately and used to speed up any dynamically linked application (via the LD_PRELOAD environment variable). E.g.:
LD_PRELOAD=/usr/local/dpsearch/lib/libdp-4.so perl programm.pl
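The same preload can be set up from a script. The following is a minimal sketch; the helper names `preload_env` and `run_with_preload` are hypothetical, and the library path is simply the one from the command above. It prepends the library to any existing LD_PRELOAD value (the dynamic linker accepts a colon-separated list) and passes the resulting environment to the child process only.

```python
import os
import subprocess


def preload_env(lib_path, base_env=None):
    """Return a copy of the environment with `lib_path` first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # Keep anything that was already preloaded, after our library.
    env["LD_PRELOAD"] = f"{lib_path}:{existing}" if existing else lib_path
    return env


def run_with_preload(cmd, lib_path):
    """Run `cmd` (a list of arguments) with the library preloaded."""
    return subprocess.run(cmd, env=preload_env(lib_path), check=False)


# Example call, left commented out because it needs perl and the library:
# run_with_preload(["perl", "programm.pl"],
#                  "/usr/local/dpsearch/lib/libdp-4.so")
```

Setting the variable per process this way keeps the rest of the session on the standard libc, which makes before/after comparisons easier.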
A PerlBench performance comparison was run for the Perl interpreter on an Intel Celeron M processor under Ubuntu Linux 10.04. Configuration A is the system Perl invoked as /usr/bin/perl, Configuration B is the same interpreter invoked as /usr/bin/perl5.10.1, and Configuration C is the system Perl invoked with libdp preloaded via the LD_PRELOAD environment variable. Performance differences of 1-2% can be treated as random fluctuations, and the corresponding tests as running at the same speed.
And here is an rjsh-pybench performance comparison for the system Python interpreter running on the same laptop without (benchmark p0) and with (benchmark p1) libdp preloaded.
PYBENCH 1.4
Benchmark: p1 (rounds=20, warp=1)
Comparing with: p0 (rounds=20, warp=1)

Tests:                           min run      cmp run      avg run      diff
----------------------------------------------------------------------------
BuiltinFunctionCalls:          169.50 ms    159.50 ms    173.80 ms    +6.27%
BuiltinMethodLookup:           179.00 ms    179.00 ms    190.12 ms    -0.00%
CompareFloats:                 289.00 ms    289.00 ms    300.25 ms    +0.00%
CompareFloatsIntegers:         239.00 ms    219.50 ms    245.63 ms    +8.88%
CompareIntegers:               219.00 ms    218.50 ms    228.90 ms    +0.23%
CompareInternedStrings:        217.00 ms    217.00 ms    226.40 ms    +0.00%
CompareLongs:                  209.00 ms    208.50 ms    213.35 ms    +0.24%
CompareStrings:                277.50 ms    277.50 ms    285.00 ms    +0.00%
CompareUnicode:                218.50 ms    218.50 ms    223.83 ms    +0.00%
ConcatStrings:                 139.50 ms    159.50 ms    156.62 ms   -12.54%
ConcatUnicode:                 169.50 ms    189.50 ms    185.28 ms   -10.55%
CreateInstances:               189.00 ms    189.00 ms    193.50 ms    +0.00%
CreateNewInstances:           1117.00 ms   1127.00 ms   1141.15 ms    -0.89%
CreateStringsWithConcat:       198.50 ms    208.50 ms    201.85 ms    -4.80%
CreateUnicodeWithConcat:       179.50 ms    209.50 ms    196.13 ms   -14.32%
DictCreation:                  179.00 ms    179.00 ms    186.83 ms    +0.00%
DictWithFloatKeys:             199.00 ms    199.00 ms    206.60 ms    -0.00%
DictWithIntegerKeys:           218.00 ms    218.00 ms    224.75 ms    +0.00%
DictWithStringKeys:            317.00 ms    317.00 ms    326.80 ms    +0.00%
ForLoops:                      179.50 ms    189.50 ms    195.80 ms    -5.28%
IfThenElse:                    258.00 ms    258.00 ms    271.63 ms    +0.00%
ListSlicing:                   289.00 ms    289.00 ms    297.30 ms    +0.00%
NestedForLoops:                449.50 ms    449.50 ms    467.98 ms    +0.00%
NormalClassAttribute:          248.00 ms    249.00 ms    254.55 ms    -0.40%
NormalInstanceAttribute:       248.50 ms    248.50 ms    253.50 ms    +0.00%
PythonFunctionCalls:           289.00 ms    289.00 ms    294.23 ms    -0.00%
PythonMethodCalls:             279.50 ms    279.50 ms    291.72 ms    +0.00%
Recursion:                     179.50 ms    179.50 ms    192.52 ms    +0.00%
SecondImport:                  868.00 ms    888.50 ms    884.02 ms    -2.31%
SecondPackageImport:           119.50 ms    119.50 ms    127.45 ms    +0.00%
SecondSubmoduleImport:         129.50 ms    129.50 ms    138.88 ms    +0.00%
SimpleComplexArithmetic:       189.50 ms    199.00 ms    203.45 ms    -4.77%
SimpleDictManipulation:        308.50 ms    308.50 ms    322.00 ms    +0.00%
SimpleFloatArithmetic:         348.00 ms    348.50 ms    358.18 ms    -0.14%
SimpleIntFloatArithmetic:      268.00 ms    268.00 ms    277.42 ms    +0.00%
SimpleIntegerArithmetic:       188.50 ms    188.50 ms    195.37 ms    +0.00%
SimpleListManipulation:        268.50 ms    258.50 ms    275.75 ms    +3.87%
SimpleLongArithmetic:          189.50 ms    179.00 ms    198.60 ms    +5.87%
SmallLists:                    148.50 ms    148.50 ms    153.90 ms    -0.00%
SmallTuples:                   179.00 ms    169.50 ms    183.95 ms    +5.60%
SpecialClassAttribute:         199.00 ms    199.00 ms    202.32 ms    +0.00%
SpecialInstanceAttribute:      279.00 ms    279.00 ms    285.30 ms    -0.00%
StringMappings:                259.00 ms    259.00 ms    266.23 ms    -0.00%
StringPredicates:              235.50 ms    235.50 ms    246.42 ms    -0.00%
StringSlicing:                 128.50 ms    139.00 ms    138.25 ms    -7.55%
TryExcept:                     198.00 ms    198.50 ms    202.45 ms    -0.25%
TryRaiseExcept:                129.00 ms    129.00 ms    135.25 ms    +0.00%
TupleSlicing:                  289.50 ms    299.00 ms    305.12 ms    -3.18%
UnicodeMappings:               229.00 ms    239.00 ms    247.88 ms    -4.18%
UnicodePredicates:             244.50 ms    244.50 ms    253.53 ms    -0.00%
UnicodeProperties:             186.50 ms    186.00 ms    194.32 ms    +0.27%
UnicodeSlicing:                109.50 ms    129.00 ms    122.97 ms   -15.12%
----------------------------------------------------------------------------
Notional minimum round time: 13036.00 ms  13156.50 ms                 -0.92%
Small differences of 1-5%, and those below 1%, can be considered fluctuations as well.
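For reading the pybench figures: the diff column is consistent with the relative difference between "min run" (p1, with libdp) and "cmp run" (p0, without), so a negative value means the run with libdp was faster. The sketch below recomputes a few rows under that reading. The function names are hypothetical, and the 1% noise threshold is an assumption chosen only for illustration.

```python
def pct_diff(min_run_ms, cmp_run_ms):
    """Relative difference of this run against the comparison run, in percent."""
    return (min_run_ms / cmp_run_ms - 1.0) * 100.0


def is_noise(diff_pct, threshold_pct=1.0):
    """Treat differences smaller than the (assumed) threshold as fluctuation."""
    return abs(diff_pct) < threshold_pct


# (min run, cmp run) pairs in milliseconds, copied from the pybench output.
rows = {
    "BuiltinFunctionCalls": (169.50, 159.50),
    "ConcatStrings": (139.50, 159.50),
    "UnicodeSlicing": (109.50, 129.00),
    "Notional minimum round time": (13036.00, 13156.50),
}

for name, (min_run, cmp_run) in rows.items():
    d = pct_diff(min_run, cmp_run)
    print(f"{name}: {d:+.2f}%  noise={is_noise(d)}")
```

Under that threshold the overall round time difference falls within noise, while the string concatenation and slicing tests show differences well outside it.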